Data Everywhere

The Guardian has a great feature called their ‘Data Blog’:

Everyday we work with datasets from around the world. We have had to check this data and make sure it’s the best we can get, from the most credible sources. But then it lives for the moment of the paper’s publication and afterward disappears into a hard drive, rarely to emerge again before updating a year later. So, together with its companion site, the Data Store – a directory of all the stats we post – we are opening up that data for everyone. Whenever we come across something interesting or relevant or useful, we’ll post it up here and let you know what we’re planning to do with it.

So, for a number of their stories they’ll post the underlying data that goes with them. They’ve got a storehouse of data to dig into and play around with.

One of their latest entries is about burglaries in the U.K. and dispelling the myth that hard economic times will automatically lead to higher crime rates. In fact, rates have gone down in many places, but the why isn’t quite clear. Some would argue this proves that our policy of mass incarceration works, but I’m not so sure of that. I’ve also seen theories that more unemployed people mean more people hanging out at home during peak burglary hours.

But that’s not really the point of this post.  They’ve put up the data from the Home Office for you to look at.

A few points I’d like to make:

  1. I think the fact that no one has stolen a ‘wheely bin /dustbin’ since at least 2003 is grounds for removing it from the list of items to track.  How’d it get on the list in the first place?  Was there a rash of wheely bin thieves?
  2. Comparable crime data in the U.S. is in something called the Uniform Crime Reports. Every state is required to compile certain crime data and send it up to the federal authorities. Every state decides how they’re going to release their data. Pennsylvania does a decent (not perfect but pretty darn good) job of allowing the public to create their own queries and explore the data as they want. New Jersey, on the other hand, is total shit, putting their data in .pdf reports and forcing people who want to use the data to do a LOT of manual work. As an aside, I’ve heard it theorized that this isn’t a problem of technical issues or technophobia about using the internets on the part of some governments but rather an intentional strategy to create obstacles to transparency. I have no evidence to back that up, and I think it may be giving too much credit to the people who make such decisions, but I remain open to being convinced.

I don’t know if there’s a comparable site which provides data in a similar way in the U.S. ( is great but ‘only’ covers the executive branch of the federal government) but you can tool around the America section of the DataBlog. It’s not as robust or (apparently) as frequently updated as the UK part but it’s better than nothing.
