Sneak peek at data.gov.uk

Are you a coder in the UK? Do you fancy tinkering around with government data for the potential good of the public?

Here’s an early Christmas present.

Visit data.gov.uk and it will bounce you on to a Google Group. Request to join the group and introduce yourself. You might get access to the developer preview of the site – like I just did!

This might be old news to some, as it seems to have gone live on 30th September according to the site blog. I guess I assumed it was for “special” people, so that’s a lesson learned. If you see an interesting door, knock on it.

I’m still finding my way around. There are 113 datasets covering census data, ASBOs, air quality, crime, fear of crime, work, health, motoring, (un)employment, police, and the list goes on.

I think at least some of this data has already been available in different places. But having it linked to from one place is a good idea.

Pick a subject and there are always people out there who are cleverer than you. That’s another lesson, or rather, reminder I get from the web. Whatever you think about the UK government, inviting people to build things with a resource like this is at least a way of acting on that lesson.

The open invite encompasses not only a central website, but actually having exportable data formats and clear conditions for data re-use. I am not a lawyer but the terms and conditions seem fairly clear. Generally I don’t like the phrase “Crown Copyright” if it’s something that was gathered using public money, but it’s apparently there to ensure attribution and accurate use of the data. I thought copyright couldn’t be applied to numerical data? Perhaps someone out there can explain.

The preview is for load testing purposes, according to the welcome email:

Since the appointment of Sir Tim Berners-Lee and Professor Nigel Shadbolt in June we have been working on how to pull together a single point of access for government data. We’ve talked to a range of people, and looked at what others have done, and over the summer we have built a first version, with a combination of open source and re-using of existing facilities such as CKAN.

We would now like your help over the next weeks and months to make it better – and more useful for you as developers. We would like feedback on how the site should work, what developer support facilities and tools there would be useful to you, and what further data should be freed up for re-use.

We’d also be interested in your ideas on how the data can be used – and if you can build some more great applications with what is available now this will help the drive to free up more over the months ahead.

At this preview stage, we are manually approving membership requests so that we check how the load on our server scales as we ramp up our service.

I’m currently looking at census data for Welsh language ability, collected during the 2001 census. It might lead to ideas for our Hacio’r Iaith event.

Do you care about Wales? Can you code? Fancy helping TheyWorkForYou then?

Below is some full background to this, but in summary TheyWorkForYou are looking for volunteer coders interested in working on Welsh Assembly data. If that’s you, please join the new discussion list and let’s figure out how to do it.

If you don’t know TheyWorkForYou then take some time to familiarise yourself. It’s a well established site taking parliamentary data and presenting it in a queryable form. It’s free, loaded with information and very useful indeed.

The whole thing is maintained by mySociety who are world class at this sort of thing.

Have a play and see what you can glean about your MP or issue of choice. The search function lets you subscribe by email (or better still, RSS feed) so that you are notified whenever something you care about is discussed.

This is all very well for the UK parliament but the Wales section of TheyWorkForYou is currently looking very bare, containing only the following text.

We need you!
It’d be fantastic if TheyWorkForYou also covered the Welsh Assembly, as we do with the Northern Ireland Assembly and the Scottish Parliament, but we don’t currently have the time or resources ourselves — in fact, both those assemblies were mainly done by volunteers.

If you’re interested in volunteering to help out, please get in touch!

So yes, Wales is the only constituent part of the UK which doesn’t have its parliamentary data available on TheyWorkForYou.

There is nothing preventing us, it’s purely because nobody’s stepped up and done it.

As a quick explanation of the work that needs to be done: Welsh Assembly proceedings and transcripts are already available on the web from the official site. But they’re effectively raw dumps of speeches and other data. It’s almost impossible to get useful insights about members’ voting records. The site provides no option to subscribe to notifications that a phrase was used. Apart from a very basic and clunky site search function, all the insights are locked in. You could do a human-powered research trawl through the records, but that starts to get a bit unwieldy for normal people. It feels like the preserve of experts and not really like proper democracy.

Most of TheyWorkForYou’s engine is already built. In the words of Matthew Somerville at mySociety, the work now is to “parse the official report of the Assembly into structured machine-readable data to feed into TheyWorkForYou, along with member information for the Assembly. This will need programming skills, I’m afraid.”
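To give a rough feel for what that parsing work might involve, here is a minimal Python sketch. It is not how TheyWorkForYou or mySociety actually do it. It assumes a made-up transcript layout in which each speech starts with "Name: text" and any following lines continue that speech; the real Assembly record will almost certainly look different. The names `Speech`, `parse_transcript`, `find_phrase` and the sample text are all hypothetical, invented for illustration.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import List

# ASSUMPTION: a speech starts with "Speaker Name: text". This layout is
# invented for the example and is not the real Assembly record format.
SPEAKER_LINE = re.compile(r"^(?P<speaker>[A-Z][\w .'-]{1,60}):\s*(?P<text>.*)$")


@dataclass
class Speech:
    order: int      # position of the speech within the transcript
    speaker: str    # name exactly as it appears before the colon
    text: str       # the words spoken, joined into one string


def parse_transcript(raw: str) -> List[Speech]:
    """Turn a raw text dump into a list of structured Speech records."""
    speeches: List[Speech] = []
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # blank lines carry no information here
        match = SPEAKER_LINE.match(line)
        if match:
            # A new speaker line opens a new speech.
            speeches.append(
                Speech(len(speeches) + 1, match["speaker"].strip(), match["text"])
            )
        elif speeches:
            # Any other line continues the most recent speech.
            speeches[-1].text = (speeches[-1].text + " " + line).strip()
    return speeches


def find_phrase(speeches: List[Speech], phrase: str) -> List[Speech]:
    """Return the speeches that mention the phrase, ignoring case."""
    needle = phrase.lower()
    return [s for s in speeches if needle in s.text.lower()]


# Invented sample text, not taken from any real proceedings.
SAMPLE = """\
Member A: I would like to ask about funding for Welsh language education.
It has been raised several times before.

Member B: The budget will be published next month.
"""

if __name__ == "__main__":
    records = parse_transcript(SAMPLE)
    # Machine-readable output that another system could load.
    print(json.dumps([asdict(s) for s in records], indent=2))
    for hit in find_phrase(records, "welsh language"):
        print(hit.speaker, "mentioned the phrase in speech", hit.order)
```

Once speeches are structured like this, things such as phrase alerts become simple filters over the records. The hard part in practice is likely to be the messy edge cases: a continuation line that happens to contain a colon would fool this sketch, for example.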

So if you know anything about data structures or programming, why not apply that knowledge for the good of everyone? Join the discussion list for now as we’ll be figuring out how to tackle it.

Any given Assembly Member who does his or her job properly would surely encourage the kind of scrutiny that TheyWorkForYou could bring. You might be wondering why nobody at the Welsh Assembly has added the necessary features to allow their data to be queried. Let’s give them the benefit of the doubt this time. Maybe they don’t always intend to obfuscate and hide this stuff. It’s just they’re not up to speed with any better ways of doing it. You might be able to help them! And the people of Wales!

I’m not naive enough to think that all problems can immediately be solved by opening up this information. Neither will it be enough to get every voter running to the polls once the information is available. All manner of things can go wrong in the democratic process. But if your thing is data, there is a clear problem there and maybe that’s the part of the scene you can help with.

In Wales we have a good selection of knowledgeable, principled and often witty political bloggers. I’m not one. But I can help resource the conversation in the party political domain by opening up the possibility of insights from the data. It will be a step towards better accountability among our representatives. Let’s hope it does clear a pathway to some possible solutions.

If you’re not a coder, you could make a donation to mySociety or spread the word.