The Copy Editor

I'm Jojo Pasion Malig. I'm the usual suspect behind the night desk of the Philippines' leading news website. I like making interactive data eye candy. Mild prescriptivist.

lifeandcode:

I’m reading Becky Hogge’s Open Data Study, a review of open data and government transparency efforts.  

First, in many cases, the data does not exist in any electronic form, and in some cases record-keeping systems may not exist at all.  Nathaniel Heller:

My simplest example for this would be years ago, talking with the government in Senegal and trying to plan an intervention based on electronic property records and…the Senegalese government was at first very enthusiastic.  And then we started talking about the physical challenge of it and what we ended up discovering is that before we built an electronic property records system we actually had to build a property records system.  It wasn’t clear that data existed in paper form and that to build that sort of government data transparency system we needed, in many cases we would have to do the basic data collection.

From Ethan Zuckerman:  

I think it would be great to start mapping what datasets exist within governments, but I’m going to stand by my skepticism: I think a lot of the data that you want…it’s not clear that those records are getting digitized or digitized in any meaningful way.

In some cases the data does exist, but requires a lot of work to put into a form that you can distribute on the Web.  I really admire Mzalendo, a site where activists in Kenya try to get their hands on any government records they can find, cutting and pasting and even retyping.  Here’s Ory Okolloh:  

All the work we do is manual, so we have to literally cut and paste information if we can find it.  It’s gotten a lot better from when we started.  Now things like the Hansard [transcripts of parliamentary debates] are on the website pretty much in soft copy and up to date.  So it’s improved but it’s still either in a PDF or Word document that we can’t crawl or extract information from.

[I’m reading a lot of papers and reports on open data and civic data.  You can check out my reading list here. Civic Data/Open Data Reading List — LW]

The PDFs. OMFG. The PDFs.
