The invisible gatekeeper (how to develop creativity and culture in Wales)

Over on IWA’s blog, Colin Thomas writes about bypassing gatekeepers in his write-up of the Creativity in Hard Times event. I like his themes, but I think Thomas could go further. I’d like to identify an “invisible gatekeeper”, so here’s a post in response.

Lately I’ve done a lot of thinking and a bunch of posts in Welsh about copyright, licensing and content here. As it’s not my first language I’m hitting limits on how expressive I can be at the moment. I think the ideal language to discuss Welsh language culture is Welsh itself so I hope to rectify that pretty soon. But on with the blog post.

I recently went to a small premiere of the first episode of Pen Talar, the S4C TV series (it’s partly what inspired PenTalarPedia which I co-developed).

At the event Arwel Ellis Owen said a few words and mentioned the Tynged yr Iaith radio speech by Saunders Lewis which features in Pen Talar. For what was to become a pivotal moment in Wales’ history it’s now astonishing that only one person had the foresight to record the audio of Lewis’ speech. His name was Dafydd Alun Jones and the audio would have been lost to history if he hadn’t taken the initiative. As I understand it this was an unofficial, unlicensed recording done at home. I don’t think it was part of his original plan but this enabled the official LP release to happen later. Whatever your politics, I hope you’ll agree it was a pioneering thing for Jones to do.

So I asked myself, “with regard to content how can we be like Dafydd Alun Jones in 2010? What should we be doing?”

Even now you can still be a pioneer by recording audio – and now video. But the revolutionary and exciting changes I want to discuss are in copying and distribution. And the most effective distribution we now know is the web – in other words, uploading something to make it available online. Tynged yr Iaith is now on YouTube and can be embedded on any web page or blog next to any comment you’d like to make. It’s just one piece of cultural produce from Cymru of course.

When I started university in 1999, a friend showed me Napster which was software to enable peer-to-peer music file sharing. It gradually became clear that this would change the nature of the game for content creators, owners and distributors (although I might not have expressed it quite that way at the time). Today there are many people, young and old, who realise that unlicensed copying can be a legitimate practice – it’s just waiting for official, more sensible, licensing. Decades ago it happened with various rights around music like the performance right for songs, in response to unlicensed uses. It really should now happen for other works including TV programmes and films – especially those which are unavailable or out of print and therefore, regrettably, approaching limited use or even uselessness in this digital age. Unlike text, you’re not even officially allowed to lift a segment of a film to use under fair dealing as a “quote”.

YouTube’s own mechanisms for royalty collection are still being debated and sometimes negotiated by lawyers, and by many accounts they are flawed. But these are minor details. Similar discussions happen around Spotify, the licensed music streaming service, which actually uses peer-to-peer sharing in the background to distribute the music and lower costs.

Dafydd Alun Jones (as I understand it) did not write to the BBC expecting to wait for a letter of permission to come back. In doing so he could have missed the programme.

Today we are missing the programme in Wales, not only figuratively but also literally.

Many, many things lie decaying in archives. They don’t make a penny for anyone and they need to be released somehow. Fritz Lang’s film Metropolis was made available recently in an extended director’s cut because a reel containing lost scenes was found in Argentina. That was lucky in a way. It’s a warning for us and shows us what we need to emulate – to the power of a hundred – with Welsh culture. We can’t rely on a tiny number of decaying copies somewhere. Never mind old things which have gone into the public domain, I actually think we are missing wider availability and business opportunities by not copying the cultural treasures of TODAY. By copying we increase not only the long term value of a work but its value today. But there are more ways to maximise this value.

The Newport parody video which Colin Thomas mentions is a good example of remix, a major practice which the web enables. This is just a continuation of 20th century sampling and of the folk cultures which predate it. Remixing can be another legitimate practice which is waiting for a sensible licence (and sometimes tantalisingly close to huge profitability). Jay-Z, who raps on the original New York track, routinely releases acapella versions of his songs. He’s not stupid. This “openness” leads to more creativity, more underground kudos, more free promotion and more highly paid gigs for Jay-Z. Remember when Danger Mouse unofficially sampled a whole Jay-Z album and combined it with The Beatles on his own Grey Album? He didn’t ask for permission and it led to a legal letter from EMI’s lawyers. But people forget that detail because, not long after, Danger Mouse signed a recording contract and became a massively successful EMI artist (as a key member of Gorillaz, Gnarls Barkley and so on).

The comparative paucity of Welsh language content means we need to use every single trick we can find to make it go further. There is amazing creative potential in Wales lying unused and waiting to be enjoyed – and financially exploited.

Often what’s true for things in English and other big languages is so much more true, utterly true, in Welsh. Lawrence Lessig talks about how creativity is being strangled by the law, which is a perfect example. Lessig’s metaphor of strangulation is sometimes too painful for me to think about in a Welsh language context.

There’s an economic race to the bottom going on now. We need to remove as many restrictions on creativity as possible – the two that come to mind here are the excessively long copyright terms and constrictive “all rights reserved” licensing.

No matter how famous you are, if you create stuff (music, programmes, films, art, blog posts) then you need to make it your business to research what Creative Commons means. I recommend that family of licences because they are the most popular, are now the de facto standard and will ensure the widest interoperability of different works from around the globe.

Creative Commons is a sensible move in a digital era because the digital era is synonymous with copying. Aside from open licences, content owners and record companies will have to change and where the money comes from may change. But our problem in Wales is not piracy – it’s obscurity. (Thanks to Tim O’Reilly for that insight.)

People are already mashing up S4C and it’s leading to, as you’d expect, mixed results. But in order for really cool things to happen in abundance, people need to be explicitly encouraged.

So what is the invisible gatekeeper? It’s the way we approach copyright. It potentially affects all creative people in Wales.

In my ideal world I would like to see all S4C programmes released under a permissive licence such as Creative Commons to allow and explicitly encourage adaptations. All of these would need to be credited as derivative works. I realise there are a whole bunch of things that need to be done before that can happen, such as negotiations with various unions and production companies. These are important but like YouTube’s spats over percentage points with PRS they are details.

The BBC have toyed with open licensing (Creative Archive and R&DTV come to mind). It’s one way they could take on Murdoch.

We pay for S4C, we pay for the BBC, it’s time we looked to maximise the value of these things. What could happen if creators were given more resources and more freedom? It’s an interesting thought experiment. Let’s make it a real experiment.

S4C’s licence or adoption of Creative Commons (in my ideal world) would probably have to allow only non-commercial re-uses. If S4C liked the result, they could then work out a separate deal with the remixers/co-creators. A hypothetical example could be (say) a Galician language production of Pen Talar On Ice. I’m only half joking about this. It could be performed, recorded and copied non-commercially. If anyone wanted to use it commercially they would have to come to an agreement to pay S4C and, presumably, Fiction Factory who are the production company behind Pen Talar. This is all standard practice when something openly licensed leads to use under a separate commercial licence.

Colin Thomas says:

There seems to me to be alarmingly little realisation of what convergence will mean for the future of Welsh media. Only Y Lolfa, it seems, is producing books in Welsh that can be read on a Kindle or e-book…

Book publishers will realise the benefits of ebook distribution, I don’t doubt it. There’s money there.

But while we’re on it, someone needs to release Thomas’ TV programme The Dragon Has Two Tongues digitally for those who weren’t around (including me). In the digital age, we need a link to the programme because misty-eyed reminiscence is not enough. Stick it up on YouTube if you must.

In summary here are three major benefits – and corresponding threats – which I’ve expanded on through this blog over the last few weeks. You can apply them to creative output from Wales but particularly things in the Welsh language:

  • Benefit: wide availability now (threats: restrictive licences and for older stuff, copyright term)
  • Benefit: wide availability tomorrow (threat: lack of copies)
  • Benefit: creativity, re-use, remix and adaptation (threats: restrictive licences and for older stuff, copyright term)

Bonus: read the story of Bernie Andrews for another example of archival heroism.

UPDATE 19/10/2010: Just found some extracts from The Dragon Has Two Tongues on YouTube, courtesy of someone who’s transferred them from wobbly VHS.

UPDATE 15/12/2010: Just re-read this and realised this sentence is too limited: “And the most effective distribution we now know is the web – in other words, uploading something to make it available online.” I didn’t include online methods of distribution that use the Internet but run outside the web, e.g. BitTorrent, FTP and so on. It also doesn’t include peer-to-peer copying that happens on other networks, e.g. intranets, LANs and other media, e.g. a memory stick. Most of the principles are the same though. Fecundity is good.

Hacio’r Iaith – what it is, why it is and what happened (monster post!)

A group of us did a free, open event in Aberystwyth on 30th January 2010 called Hacio’r Iaith. It was fun. I learned things. It was based on the BarCamp format. You can use the format to have a conference on any subject and many people do. Some people call it an unconference.

The reasons we organised an offline event should be obvious. A chance to shake hands and consume body aroma content, the only remaining experiences not yet available online.

Around 40 people came. That number seemed about right for a one-day event; I didn’t even get a chance to talk to everyone properly.

One of the main aims was to get people together to talk about shared interests, so on that basis it was almost bound to be a success after the second or third person said they’d come along. When you know people will get talking there is no need for anxiety, even if the wifi access goes down (it was fine actually), the food doesn’t arrive (it did and was splendid – thanks chefs and sponsors!) or the firewall doesn’t allow FTP access (unfortunately it didn’t, but that was a mere glitch and chance to learn something).

Keywords will be in bold here because this is getting long…

The offline component of the event is finished. For a few reasons it’s a pity you can’t access big chunks of the event now. You really had to be there maaan. Saying all that, it’s still open to an extent because we purposefully made it a hybrid of offline and online. Several web-based backchannels existed before and during the meet-up: the event wiki, the group blog, Twitter messages, videos on YouTube and photos/images on Flickr.

These backchannels persist afterwards, which increases the value of doing the event for years to come. That goes for potentially everyone on the web (especially now that Google Translate can get you the gist of the Welsh in several other languages).

These are some of the benefits of the social web. These benefits are seldom discussed by the mainstream media, incidentally!

I want other people to see all this stuff if they search for related things. I know there are other people who attended who want it to have an influence. On that note, not every problem is a problem of information. (That’s the second Neil Postman link in this post. Consider that chin thoroughly stroked.) But some problems are related to information. For instance, taking abundant information and converting it into something useful is something we can step up. It’s something that could benefit Wales, where I live and most of the attendees live.

I’d like to see more BarCamps, unconferences and so on happening in Wales. Incidentally that’s part of the reason why I’ve chosen to write this in English, to give the non-Welsh speaking people in Wales some access to the proceedings. And other people around the world who might be interested.

As far as I know, Hacio’r Iaith on Saturday was the first BarCamp-style event to be conducted in Cymraeg, the Welsh language. The subject matter? Web and technology as it relates to the Welsh language. Those things – language and subject matter – don’t necessarily follow. Naturally people discuss their language in their own language. But a group could organise a BarCamp about any subject and do it in the Welsh language. Absolutely any subject.

For nearly everyone who attended it’s their number one language for everything they do daily and has been for as long as they can remember.

I can only talk about the sessions I attended. Everything is from my perspective!

The first session was about tools for Welsh learners, including a website and series of online lessons called Say Something In Welsh built with phpBB, an iPhone application called Learn Welsh and some ideas for mobile app “flashcards” suggested by a tutor. We talked about the conflicting difficulties of making apps available to all mobile users, even if they are web-based apps running on mobile. I asked Aran from Say Something In Welsh a question about open content and search engines. The site is a private “walled garden” for a number of reasons related to maintaining a community of learners, but it’s free to register to join. (UPDATE: See Aran’s comment below for more about this.)

I then stayed for the Metastwnsh podcast recording and live web stream. Metastwnsh is a web and technology blog with several contributors. There was some discussion of gadgets and some jokes. My favourite part was a discussion of how the language choice of our online posts and conversations can differ from our offline choice. In particular, Twitter was cited as an example of a tool which first language Welsh speakers sometimes opt to use in English, for many reasons – some understandable. It was suggested that perhaps in some cases they file it under an “English language part of their brain”, alluding to the possibility that bilingual people associate some spaces or platforms with specific languages. So the effect of the platform is not necessarily “neutral”, or doesn’t remain that way. (I’ve been building a list of Welsh speakers on Twitter, including learners. Every person who is on the list can see the list and access all the other members of the list. It’s a way of strengthening the network and thereby, potentially, the impulse to post in the Welsh language should people wish to do so. Linguistic diversity leads to other forms of diversity and improves the internet as a whole in my opinion.)

I popped next door to catch the very end of a presentation about Llen Natur, a website about wildlife and nature. It has a dictionary of species, maps and photos.

Free lunch was not something I had insisted on, especially as it increases the admin for such events. But Rhodri ap Dyfrig was convinced it was possible and fixed up catering and covered it with money from some of the sponsors. For me it was a valuable part of the event, meeting some very talented people I’d only previously known online.

It was my turn next – purely because I’d volunteered to speak, as had everyone. So the title was “FyWordPressCyntaf.com – does dim angen profiad o flaen llaw” (which translates as MyFirstWordPress.com – no previous experience necessary). I wanted to talk about WordPress as a blogging and general site CMS, downloadable from wordpress.org with no coding necessary. It gave me the chance to talk about free software (unambiguously rendered as meddalwedd rydd in Welsh, free software as in freedom) with a bit about how localised code and themes are available for Welsh (but, as I also added, we can always do with more). Unlike the audience, Welsh isn’t my first language so I had a job explaining some of the concepts. I achieved my main objective though, which was to get a bare bones installation of WordPress running to show how quick and easy it can be.

In hindsight it was a little ambitious to shoehorn the mash-up/hack session into the event plan. On the day I ended up putting my talk in the hack session, which came just to mean practical session. Even WordCamp, which I attended last year, was spread over two days – allowing space for team building, pre-planning and the hack session on the second day. At Hacio’r Iaith, I think the initiative and creativity of the attendees to do the hacks could have been there, as well as the capability. But in a day already packed with presentations and to some an unfamiliar format, it became too much to expect. Next time some more practical stuff would be good. I do think a dedicated hack event could work.

We had a quick discussion about making online how-to videos and what subjects to cover. There is plenty of room for how-to videos in Welsh, especially showing non-geeks and normal people how to get the best use of software and the web. The ideas we generated are available to take.

Finally I went to a session on the game Civilization IV and its unofficial Welsh translation, using game mods. Welsh translation of open source games like OpenTTD also came up. I’m not a big gamer but it gave me some ideas…

Video by Sioned Edwards

Where is my mind? (Books, blogs and networks)

One of my new year’s resolutions is to read more books.

Like old books, unfashionable novels and books which challenge my assumptions.

The benefits of books are clearer, now that we also consume digital text and hypertext. I’m not talking about how the smell of the paper is wonderful or anything like that. It’s about the relationship between the author and the reader. The author can write with the assurance that you’re on board. It’s possible for him or her to explore the diverse ideas that make up a theme, with a high degree of subtlety. These are the joys and rewards of commitment.

This renewed interest in books is going to require time from somewhere. I’ve always loved books but lately I’ve been distracted by the glow of the screen. So for me, this means reducing the amount of time I spend in my feed reader. This trade-off between book reading time and blog reading time is purely one which I have constructed for my own purposes. I try never to complain about not having time to pursue my interests. I make time for the things I value.

Blogs and books are totally different media, clearly. They are not in opposition. They can complement each other. Web log culture, relatively young, should be learning more from books. Not only the facts on the pages and not only the histories they present, but how to explore a theme.

I love blogging dearly. I love reading blogs and I am excited about the potential of blogging. I’ll continue to encourage others to blog about subjects they care about – in languages they care about. There are not enough blogs.

Part of the attraction of blogging, for me, is being able to put a page on the web quickly. But for the art of blogging to develop, that is only part of it. It has to be about the blog over time.

Let’s look at reading. When I show people a feed reader for the first time (almost invariably Google Reader), they often recoil in horror at the thought of another inbox – and who can blame them? Some of this stuff is time-limited and should just flow past, not accumulate (Dave Winer highlights the “inbox” shortcoming of Google Reader).

But my favourite blogs are the ones where I DO want to read everything.

I’m not looking at any proper research here, but I wonder if feed readers are declining. That’s a pity. Whether or not that’s true, they certainly need a boost. Good feed readers help the art of blogging.

If people aren’t using feed readers then it follows that they are peck-pecking haphazardly at links to individual posts received via Twitter, Facebook, email, search results and so on. I’ve done it. This is what people presumably mean when they refer to the “death of RSS”. As a technology, RSS is no more dead than HTML of course and to claim otherwise would be silly. But people seem happy to peck and let others throw the odd link to a snippet or giblet their way. Either that or they are “subscribing” to their favourite blogs by repeated visits in the web browser, rather than with feeds. Or, of course, they are not reading blogs at all.

Right now, in early 2010, as well as a devaluing of feed readers it feels as if other forces are converging to unbundle blogs. Rather than whole bundles, they are viewed as loose collections of individual posts. Attention spans and loyalty to specific blogs could be at an all-time low. This is akin to books losing their spines and pages fluttering away on a breeze. Gone is the continuity. Each post now has to fight for your attention. Granted, the edges of a blog are always more fluid than those of a book.

But following a particular blogger over a period of time is part of what makes the medium good (and fun).

The popular blogs exert an influence on expectations and practice. Some of the most popular and influential blogs are banner ad-supported. These blogs have an intrinsic problem of course – they need to pull the maximum number of eyeballs. This results in tabloidisation, Gawkerization or Techcrunching, if you will. How embarrassing. Most likely this does not align with our own interests for reading a blog, certainly not our long-term interests. Typically we need truth, insight, fairness and all the good stuff.

Instead, every single post has to hustle for attention. One sign is that crafted blog post titles become more important than they need to be. In the text, you can sense the desperation to create a Digg firework which will shoot to the top. You know what I mean.

A common hustle is to present any given story as some kind of conflict or controversy. If you’re interested, read a recent Giles Bowkett post where he simultaneously mimics this and criticises it. The title of the post is Blogs are Godless Communist Bullshit – and the urge to click that title is strong, for reasons he explains.

This is not an exclusively online phenomenon; it’s also discernible in mainstream media. But it’s exaggerated and accelerated in its online form. How? Inbound links and SEO rapidly solidify the attention flows. This leads to more popularity. And Google search is merely a popularity filter. It filters what comes to your attention on the basis of popularity, along keyword lines. That’s very useful but not always in our long-term interests.

Everything that is wrong with the most popular blogs (and news sites, for that matter) can be traced back to this lust for eyeballs. Baseless gossip, sexism, lies, slander, unpleasantness, bullying, you name it. Bad science. Churnalism. Lazy writing and endless lists. The set-up creates the wrong motivations for these bloggers. They influence other bloggers with their woeful example. All but the strong are infested by mediocrity. Stay strong.

Blogs don’t tend to identify their own shortcomings. Techcrunch, for instance, won’t tell you that it does not deal with useful startup or business news that falls outside the venture capital system. “Everything on TechCrunch revolves around the venture capital system”, as another Giles Bowkett gem suggests.

More and better blogs will dissipate some of the influence of the crap. I think a good feed reader which doesn’t frighten normal people would help too. Maybe we could then cultivate our attention spans and intolerance of cheap firework tactics.

I wonder about the concept of a “blogosphere” and the limits to its explaining power. The blogosphere is a subset of the web. In a sense, the web is a network of pages and people. In another sense it is a network of ideas.

Networks have become very interesting in the last few years.

Networks of people make up societies.

Networks of machines make up the world wide web.

Networks of neurons make up brains.

It’s fun to get reductionist and attempt to draw parallels here. For example, Kevin Kelly is fond of saying that the internet is ONE HUGE MIND. It’s a web of machines and people. So we’re just nodes in the network. His enthusiasm is scary and funny. He also has a notion that human beings are the sex organs of technology. At a restaurant he might be the one to inform you that the beef tongue on your plate is getting ready to taste you in revenge. Like me, he’s a theist and a Christian so I obviously find that side interesting.

The blogosphere that I am conscious of is what I read and what’s in my feed reader, a subset of the whole blogosphere. Maybe we are dealing with a number of smaller, only sometimes overlapping blogospheres. How small and how overlapping? The flows of influence are hard to measure. You can look directly at outbound links but it’s harder to see contextual density. Which bloggers watch the same television programmes and which ones read each other?

My own blog is influenced by patterns in things I read, including hundreds of blogs I’ve read that you can’t see. They reinforce pathways in my brain.

By the way, this is why a regular subscription to a daily newspaper can be destructive, when people choose poorly. OK, I’ll name one: the Daily Mail. It tends to appeal to people’s innate selfishness, the same selfishness which is in all of us. Daily Mail writers know their market very well and taken regularly and uncritically the paper can amplify this selfishness. I think it will handle the unbundling of news very deftly too: the online headlines are some of the most sensational around.

Bringing this full circle, the best opponents to these negative media are healthy networks. See above.

So I’ll carry on blogging and attempting to grow the good network by telling people how fantastic WordPress is. But I’m also taking control of my own mental sphere and stirring some books into it, sometimes deliberately choosing things outside my immediate interests. Some excellent books throughout history have never been mentioned or discussed in a single blog post yet. I’ll link to them and dig them where I can.

Vote for Twitter to be translated into Welsh

At the moment Twitter’s web interface is only available in four languages – English, Japanese, French and Spanish. Also on the way now are Italian and German.

So Twitter Inc have decided to increase support for the world’s languages, which is an excellent move. They’ll be asking users to collaborate on translating the interface, which again is good. The language community, made up of fluent users and some professional translators, knows best. Then everyone wins.

Twitter Inc haven’t said exactly how they’ll choose the next languages. But we can ask for Welsh. Here’s how.

  1. Go to http://twitter.com/translate
  2. Click the link “Sign up with your username and language”.
  3. Type your Twitter username.
  4. Select “Welsh” from the list.

It doesn’t matter if you’re a Welsh speaker or not. Welsh can belong to everyone!

I’m calling it a “vote”. You might as well use your vote for a language you’d like to see supported, even if you’re not a speaker.

Let’s not wait months and months for Welsh to get support – we can ask now. If they receive a high number of requests, it may spur them into offering Welsh.

Facebook made a similar move a while back. The whole thing was a game, with scores and a leaderboard for contributions. This resulted in a very rapid translation, completed in around three or four weeks as I recall. Twitter will be even quicker; I think we’ll do it in mere days.

In fact, Welsh was among the first languages to be supported by Facebook. This was mainly because there was a lot of demand expressed noisily, via a group.

“The squeaky hinge gets the grease.”

Google Translate is now instant. But still fun (and dangerous).

Google Translate has already accelerated my Welsh learning. It helps to decipher a daunting piece of text.

Now Google Translate is instant. They changed the interface slightly and it flashes up the equivalent translation as you type. Boy.

In other words you get the same flawed “translations”, now even faster!

Try it for Welsh to English.

Example phrases:
Dw i’n cyfieithu.
Defnyddia yn ofalus.
Gwlad beirdd a chantorion, enwogion o fri

I wish there were a proper online Welsh-to-English dictionary that did instant look-ups. It would take some of the friction out of reading difficult books. Just leave the laptop open, type a difficult word and get the meaning NOW.

Having to click is too slow a method because it breaks the flow of the book. Reaching for a dictionary is even worse. The look-up needs to be as near to the speed of thought as possible.
🙂
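
For the coders, here is roughly what I mean by an instant look-up. This is only a minimal sketch in Python, not a real dictionary: the four-word glossary and the function name `lookup_as_you_type` are made up for illustration. The idea is that every keystroke narrows a sorted word list, so nothing needs clicking.

```python
from bisect import bisect_left

# Hypothetical mini-glossary for illustration only; a real dictionary
# would hold many thousands of headwords.
GLOSSARY = {
    "cyfieithu": "to translate",
    "defnyddio": "to use",
    "gofalus": "careful",
    "gwlad": "country, land",
}
HEADWORDS = sorted(GLOSSARY)


def lookup_as_you_type(prefix, limit=5):
    """Return up to `limit` (headword, meaning) pairs that start with `prefix`."""
    prefix = prefix.lower()
    if not prefix:
        return []
    # Binary search jumps straight to the first candidate headword.
    start = bisect_left(HEADWORDS, prefix)
    matches = []
    for word in HEADWORDS[start:]:
        if len(matches) == limit or not word.startswith(prefix):
            break
        matches.append((word, GLOSSARY[word]))
    return matches
```

Calling `lookup_as_you_type("gw")` after each keystroke would return the `gwlad` entry straight away. A real tool would also need to cope with mutations (the sketch above would miss "ofalus" because it only knows "gofalus"), which I have left out to keep it short.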

I say “proper dictionary” because Google still gets words wrong y’see. It’s based on statistical translation and uses the “most likely” translation based on a corpus of text equivalents in both languages. It also seems to have a limited vocabulary.
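
To make the “most likely” idea concrete, here is a toy sketch in Python. It is emphatically not how Google does it internally (I don’t know the details); the word pairs and counts are invented, and `build_table` and `most_likely` are my own hypothetical names. It just counts which English word appears most often against a Welsh word in some aligned text and picks the winner.

```python
from collections import Counter, defaultdict

# Invented aligned word pairs standing in for a parallel corpus.
ALIGNED_PAIRS = [
    ("glas", "blue"),
    ("glas", "blue"),
    ("glas", "green"),
    ("gwlad", "country"),
]


def build_table(pairs):
    """Count how often each target word was seen for each source word."""
    table = defaultdict(Counter)
    for source, target in pairs:
        table[source][target] += 1
    return table


def most_likely(table, source):
    """Return the most frequently seen translation, or None if never seen."""
    candidates = table.get(source)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]
```

With these made-up counts, “glas” always comes out as “blue”, even in a sentence about grass where “green” would be right. And a word that never appeared in the corpus returns nothing at all, which would explain the limited vocabulary.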

And a reminder…

Don’t use it for roadsigns! But you can use it to check the gist of a professional’s translation…

Sneak peek at data.gov.uk

Are you a coder in the UK? Do you fancy tinkering around with government data for the potential good of the public?

Here’s an early Christmas present.

Visit data.gov.uk and it will bounce you on to a Google Group. Request to join the group and introduce yourself. You might get access to the developer preview of the site – like I just did!

This might be old news to some as it seems to have gone live on 30th September according to the site blog. I guess I assumed it was for “special” people, so that’s a lesson learned. If you see an interesting door, knock on it.

I’m still finding my way around. There are 113 datasets including census data, ASBOs, air quality, crime, fear of crime, work, health, motoring, (un)employment, police and the list goes on.

I think at least some of this data has already been available in different places. But having it linked to from one place is a good idea.

Pick a subject and there are always people out there who are cleverer than you. That’s another lesson, or rather, reminder I get from the web. Whatever you think about the UK government, inviting people to build things with a resource like this is at least a way of acting on that lesson.

The open invite encompasses not only a central website, but actually having exportable data formats and clear conditions for data re-use. I am not a lawyer but the terms and conditions seem fairly clear. Generally I don’t like the phrase “Crown Copyright” if it’s something that was gathered using public money, but it’s apparently there to ensure attribution and accurate use of the data. I thought copyright couldn’t be applied to numerical data? Perhaps someone out there can explain.

The preview is for load testing purposes, according to the welcome email:

Since the appointment of Sir Tim Berners-Lee and Professor Nigel Shadbolt in June we have been working on how to pull together a single point of access for government data. We’ve talked to a range of people, and looked at what others have done, and over the summer we have built a first version, with a combination of open source and re-using of existing facilities such as CKAN.

We would now like your help over the next weeks and months to make it better – and more useful for you as developers. We would like feedback on how the site should work, what developer support facilities and tools there would be useful to you, and what further data should be freed up for re-use.

We’d also be interested in your ideas on how the data can be used – and if you can build some more great applications with what is available now this will help the drive to free up more over the months ahead.

At this preview stage, we are manually approving membership requests so that we check how the load on our server scales as we ramp up our service.

I’m currently looking at census data for Welsh language ability, collected during the 2001 census. It might lead to ideas for our Hacio’r Iaith event.

Hacking or faking a wiki history for good purposes

I want to utterly hack the wiki format because I don’t think it’s been fully explored.

I’d like wiki software into which I can manually insert fake edits. I’d like to write the history in arbitrary order and set the dates myself. (Usually the dates are automatic.) I love the history!

Why? The history is a really useful way of representing the progression of a document.

Here’s one application. Lots of documents change and it might be useful to show their development in this fashion. In the UK, bills are discussed in parliament, they are edited then they sometimes become acts which are the basis of law. Very few normal people actually follow the process. A wiki-style history might help their understanding.

There’s a similar process at the Welsh Assembly and we certainly need help understanding what happens there.

There are also famous documents like the USA constitution which might be fun or historically interesting to represent in a wiki fashion. Imagine being able to see prohibition as a literal 18th amendment to the wiki and it being repealed by the 21st amendment.

As well as the democracy stuff, there might be journalistic applications of something like this. Representing important documents in different time-based ways.

This idea strikes me as somewhat “obvious”. (It was inspired by a comment in a video interview with Matt Mullenweg about open source.) Has it been done before?

I might have a go. There are many open source wiki software systems; MediaWiki or DokuWiki, for instance, could be adapted to do this. There are also document comparison programs, so maybe I just need to do it as a set of documents which can be compared.
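
To make the idea a bit more concrete, here is a minimal sketch in Python of a page history that accepts edits in any order, with dates set by hand. It is not based on MediaWiki or DokuWiki; the names Revision and BackdatedHistory are made up for illustration, and the page text is placeholder wording rather than anything from a real document. The comparison step uses difflib from the standard library.

```python
import bisect
import difflib
from dataclasses import dataclass, field
from datetime import date


@dataclass(order=True)
class Revision:
    when: date                                    # set by hand, not by the clock
    text: str = field(compare=False)              # full page text at that date
    summary: str = field(compare=False, default="")


class BackdatedHistory:
    """A page history that accepts revisions in any order (hypothetical)."""

    def __init__(self):
        self.revisions = []                       # always kept sorted by date

    def insert(self, when, text, summary=""):
        # insort places the revision by its date, whenever it is entered
        bisect.insort(self.revisions, Revision(when, text, summary))

    def diff(self, index):
        """Line diff between the revision at index and the one before it."""
        old = self.revisions[index - 1].text.splitlines() if index > 0 else []
        new = self.revisions[index].text.splitlines()
        return list(difflib.unified_diff(old, new, lineterm=""))


history = BackdatedHistory()
# Entered newest first on purpose. Placeholder text, not the real wording.
history.insert(date(1933, 12, 5), "Main text", "21st amendment repeals the 18th")
history.insert(date(1919, 1, 16), "Main text\nProhibition clause", "18th amendment")

for revision in history.revisions:
    print(revision.when, revision.summary)
for line in history.diff(1):
    print(line)
```

Storing the full text of each revision keeps the sketch simple; the diffs are worked out on demand, which is roughly the "set of documents which can be compared" approach.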

Maybe this intersects with what Google Wave can do, I haven’t tried it yet.

I use Google Docs every day now and it’s obvious that that has borrowed heavily from wikis. I’d struggle to go back to emailing attachments back and forth.

Google have two products called wiki – SearchWiki and Sidewiki – and neither of them is really a wiki! But Google Docs are proper wikis. If you haven’t tried Google Docs, try it.

I’m thinking of other documents that change over time, which could be wikified. Like chessboards and images of your dog’s face.

My own face is a wiki edited by time. My body is a wiki, edited by beer and curry.

The evolving blog: things that resemble blogging

This loosely follows on from the previous post about Twitter being a variant of blogging. Incidentally, normal service on this blog may be resumed at some point or possibly never. Anyway.

Sometimes I think almost EVERY form of publishing in social media can be considered a form of blogging. Is everything here blogging?

On Flickr, for example, you upload images which have dates and tags. YouTube and other video sharing sites let you upload video, again with dates and tags. There are subscription options in these too – you add people on Flickr and you subscribe to channels in YouTube. There are variants on other video sites. These “content services” also have feeds of course. They don’t look exactly like blogs but I’m saying the default view you get is incidental to this concept of them being about blogging. Of course, the default display of a blog is incidental. You could take feeds or content from any blog or set of blogs and display them in aggregate in a multitude of ways. The point is, all are about time-based publishing which is essentially all a blog is.
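
The "time-based publishing" point can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the Item fields and the sample entries are invented, and no real service's API or feed format is used. The idea is that once everything is a dated item, items from any service can be shown in one aggregate, newest first.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Item:
    published: datetime     # every item has a date
    service: str            # where it was published (made-up labels)
    title: str
    tags: tuple = ()        # optional tags


def aggregate(*feeds):
    """Merge any number of feeds into one stream, newest first."""
    return sorted(
        (item for feed in feeds for item in feed),
        key=lambda item: item.published,
        reverse=True,
    )


# Invented sample entries standing in for three different services.
photos = [Item(datetime(2009, 11, 1, 9, 0), "photos", "Castle at dusk", ("cymru",))]
videos = [Item(datetime(2009, 11, 3, 18, 30), "video", "Gig footage", ("music",))]
posts = [Item(datetime(2009, 11, 2, 12, 0), "blog", "Notes on wikis")]

timeline = aggregate(photos, videos, posts)
for item in timeline:
    print(item.published.date(), item.service, item.title)
```

How the merged list is displayed is a separate choice, which is the sense in which the default view is incidental.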

Facebook is like a huge group blog. The newest thing is at the top. Posting a status or whatever is obviously like doing a blog post, but almost everything else you do is subscription. Clicking Like for something is subscription. Writing a comment on a post is a form of subscription. Becoming a fan of a page is subscription. Responding to an event is subscription. And of course, adding a friend is a subscription. It can only be two-way, symmetrical. I tell people Facebook is weirder than blogging and Twitter because of the privacy stuff. There’s a grey area between private and public, but let’s forget about those aspects for now. Facebook is a huge group blog. The things that are slightly annoying on Facebook are the non-bloggy things, mainly the private inboxes. There’s your inbox for requests and your inbox for direct messages. Another thing, if you don’t respond to an event you are automatically subscribed to receive direct messages about that event. That’s annoying because automatic subscription to anything is not bloggy.

Stretching this even further – and this is highly provisional now – maybe a wiki page can be considered a form of blog. The time-based element is most apparent if you look at the history page. This page shows all the edits that have taken place. It looks like a blog, except that instead of different posts it’s the same post being refined over time by multiple authors. And of course there’s a feed of this history too.

Or, the other way around, maybe a blog can be considered a history for its AUTHOR. The author is a biological wiki changing over time! Changes are occurring in the author’s mind and each post is a snapshot in time. So each blog post is a wiki edit. Or at least an indication of one. (If you comment on my blog, I will read it and you will edit me slightly. And the potential future of the blog will change. Have fun.)

Starting an open content service like Twitter, YouTube or Facebook looks like so much fun. I would do it differently to those guys, natch. If I were starting such a service I would look at blogging in detail for which features I could borrow. This often happens subconsciously as people have absorbed the customs and features of blogging. Maybe I could start by adapting an old UNIX command.

I’m abstracting features of software here. When I studied Computer Science, I went to a lecture about “computing in the real world” delivered by a software consultant. He said that he’d been asked to work with a prison for their database of inmates. Should they pay to develop an expensive new database system for the prison, from scratch? In a stroke of inspiration, he suggested they just adapt an existing hotel booking system. A prison is a hotel, except if you’re staying you can’t decide when you’re going to leave. On an abstract level, that’s the only functional difference. Inmates are guests.

That observation has always stuck with me and I’ve always tried to look at problems in a similar way.

Of course, not everything is blogging. Now go and eat your tea.

The evolving blog: Twitter as microblogging

Veteran blogger Meg Pickard wrote an insightful post last month about how the adoption of Twitter has mirrored that of blogging before it.

Twitter the company never describe their service as “microblogging”. That’s a smart move from the viewpoint of marketing the service to people who might have preconceived ideas about blogging. But mainly, it probably helps each user and the communities represented to be unconstrained and perhaps more creative in the way they actually use it as a medium.

Twitter feels like blogging at reduced friction. Each tweet (blog post) is tiny and you can type it quickly, on the go. They are also quicker to read than macroblog posts.

So Twitter could be fairly accurately described as microblogging. Some of the Twitter observations Pickard makes are accelerated in comparison to blogging.

People write more posts (tweets) than on a long-form macroblog – in my experience. The “half-life” of conversations is reduced. There’s probably a whole bunch of research someone could do on that if they wanted. (And I’m not talking about the paper where they dismissed 40% of Twitter as “babble”. I think that totally missed the point.)

So I wanted to expand on Pickard’s post and draw more connections between blogging and Twitter, between macroblogging and microblogging if you will. Some of this will apply to Identi.ca and other microblogging services. But I think Twitter’s larger user base makes it a bigger playground for this stuff.

The post
Let’s start with the obvious. A tweet is a blog post. Your tweets are organised by time, with newest at the top. Apart from that you can write anything you like. Same, same.

Following
Following is subscribing. Again, there’s less friction on Twitter because it happens in fewer clicks.

The client
Your Twitter client is your feed reader. The default web client is just a web-based feed reader. You get everyone you’re following aggregated together. But it can also be set to a single blog (a single person’s Twitter timeline).

URL and feeds
Your blog has an HTML version and it also has an RSS or Atom feed. Twitter feels like it has feeds but they’re invisible, they’re simulated by API calls. What I mean is, when you click Follow you’re not made aware of what happened in the background, it’s a black box. Whereas when reading blogs there is a URL to a feed which you subscribe to. (Although every Twitter account has a bona fide RSS feed as well.) Also, because Twitter and other services have emphasised real time there are efforts to make blog feeds real time. Twitter, in turn, is influencing technologies that were established before.

Replies
Replies on Twitter are like blog pingbacks. They notify @someone that you made a response to their post. But unlike blogs, the “pingback” of a Twitter reply is not visible to onlookers reading the original tweet.

Tags and categories
The counterpart of blog post metadata – tags and categories – is the Twitter hashtag, which was deliberately introduced by a user and then popularised. The Hashtags website is what Technorati is for macroblogs (or rather used to be).

Retweet
Retweets, usually written as “RT @someone” or “via @someone”, are ostensibly about acknowledging a source. They’re a somewhat strange byproduct of Twitter’s lack of a quick way to link to, and read, another tweet. For programmers, it’s analogous to passing by value instead of passing by reference. They’re not native to Twitter at the time of writing.
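
For anyone who wants the by-value versus by-reference analogy spelled out, here is a small Python sketch. It is a toy model and not how Twitter stores anything: the tweets dictionary and the three functions are invented for illustration. A manual "RT" copies the text (by value), while a reference only remembers which tweet it points at.

```python
# Toy store of tweets keyed by ID (invented data, not a real API).
tweets = {1: {"author": "someone", "text": "hello wiki world"}}


def retweet_by_value(tweet_id):
    """Copy the original text into a brand new tweet, as a manual RT does."""
    original = tweets[tweet_id]
    return {"author": "me", "text": f"RT @{original['author']} {original['text']}"}


def retweet_by_reference(tweet_id):
    """Store only a pointer to the original tweet."""
    return {"author": "me", "retweet_of": tweet_id}


def render(tweet):
    """Show a tweet, following the pointer if it is a reference."""
    if "retweet_of" in tweet:
        original = tweets.get(tweet["retweet_of"])
        return render(original) if original else "[deleted]"
    return f"@{tweet['author']}: {tweet['text']}"


copied = retweet_by_value(1)
linked = retweet_by_reference(1)

del tweets[1]             # the original author deletes their tweet

print(render(copied))     # the copy still carries the old text
print(render(linked))     # the reference has nothing left to point at
```

The copy survives on its own but loses its connection to the source; the reference stays connected but depends on the original still being there.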

Suggested user list
When someone joins Twitter now, the site suggests accounts for them to follow. This helps new users to get started and see how it’s being used. But it also offers a huge boost and arguably an unfair advantage to the companies and individuals represented. It’s an editorial decision made by Twitter staff, one of the very few such decisions on a service which is mostly neutral – which to some “feels” wrong. There’s no equivalent in the blogosphere, which is sustained by a network and not hosted by a single provider. If Twitter the company want to be seen as fair, maybe they should behave like the blogosphere.

Blogrolls
In the early years of blogging, a blogger would have a “blogroll”, which is a list of links to their favourite blogs. These seem to have faded in importance and usage as blogging has become more popular. But during the growth of the new medium, they were useful for people navigating the blogosphere and finding other bloggers to subscribe to. Blogrolls were also about giving kudos and link juice. The earliest form of blogroll I have noticed on Twitter is the #followfriday tag, where people suggest accounts worth following.

Twitter list feature (new!)
The new Twitter list feature is a bit like a blogroll. It can be seen as a public endorsement of certain accounts and also a way of giving kudos. You can have up to 20 different lists, e.g. colleagues, bands, journalists, people in my hometown – which is similar to blogrolls that have categories. With Twitter, the emphasis seems to be on usefulness to the compiler of the lists, with the openness and kudos as byproducts. Like blogrolls, the lists help to grow the network by helping people navigate. Twitter lists can also be likened to OPML files, which are bundles of links to RSS feeds. In other words, an OPML file is a blogroll in a file.

Besides, Twitter has always had lists. Each account has a grand list of all the people it follows, and it’s public. So the list of people you’re following is a blogroll. Albeit massive and context-blobby.
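
To show what "a blogroll in a file" looks like, here is a rough Python sketch that reads a small OPML document and groups the feed links by category, much like named Twitter lists. The sample OPML, the category names and the example.com URLs are all made up, and the blogroll function is my own helper, not part of any library. The parsing uses xml.etree.ElementTree from the standard library.

```python
import xml.etree.ElementTree as ET

# A made-up OPML file: categories containing links to RSS feeds.
OPML_TEXT = """<opml version="2.0">
  <head><title>Hypothetical blogroll</title></head>
  <body>
    <outline text="Colleagues">
      <outline type="rss" text="Blog A" xmlUrl="https://example.com/a/feed"/>
      <outline type="rss" text="Blog B" xmlUrl="https://example.com/b/feed"/>
    </outline>
    <outline text="Bands">
      <outline type="rss" text="Blog C" xmlUrl="https://example.com/c/feed"/>
    </outline>
  </body>
</opml>"""


def blogroll(opml_text):
    """Return {category name: [feed URLs]} from an OPML document."""
    root = ET.fromstring(opml_text)
    result = {}
    for category in root.find("body"):
        # Collect every outline under this category that points at a feed.
        feeds = [
            outline.get("xmlUrl")
            for outline in category.iter("outline")
            if outline.get("xmlUrl")
        ]
        result[category.get("text")] = feeds
    return result


for name, feeds in blogroll(OPML_TEXT).items():
    print(name, feeds)
```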

I think I’ve talked about Twitter as microblogging in enough detail now.