Ralf Bendrath: "Machine-Readable Government" from 1987 to 2008

In a brainstorming session on future research issues in our section today, I mentioned the term "machine-readable government", which met with a lot of interest. I did some quick research on where the term came from, with some interesting results:

German hackers in the 1980s

Surprise: The term seems to date back as far as 1987. The first media mention I could find was in 1988, in an article in the German weekly magazine Der Spiegel about the mailbox and hacker communities in Germany. The term "maschinenlesbare Regierung" was attributed to Chaos Computer Club co-founder Klaus Schleisiek, but it seems to have been a common concept among the first generation of German hackers, as Daniel Kulla's book about CCC founding father Wau Holland tells us.

It is unclear to me if there was more detailed conceptual thinking about this, or if it was just an ironic catch-phrase.

Later, the term was used again in the context of the introduction of Freedom of Information Acts in Germany; see, for example, this 2003 CCC congress lecture by CCC co-founder Gerriet Hellwig.

Barack Obama / Lawrence Lessig in the U.S.

More recently, the term has been used to describe some ideas of the Barack Obama campaign in the United States. Obama has quite progressive plans for a more transparent government and for the use of open standards to achieve it; see his "technology and innovation" concept paper.

Obama does not say "machine-readable government", but the idea is roughly the same:

"Making government data available online in universally accessible formats to allow citizens to make use of that data to comment, derive value, and take action in their own communities. Greater access to environmental data, for example, will help citizens learn about pollution in their communities, provide information about local conditions back to government and empower people to protect themselves."

Larry Lessig's interpretation and endorsement of this does not use the term "machine-readable government" either, but was interpreted as such by a number of bloggers. Lessig says about Obama's ideas:

"the big part of this is a commitment to making data about the government (as well as government data) publicly available in standard machine readable formats. The promise isn't just the naive promise that government websites will work better and reveal more. It is the really powerful promise to feed the data necessary for the Sunlights and the Maplights of the world to make government work better. Atomize (or RSS-ify) government data (votes, contributions, Members of Congress's calendars) and you enable the rest of us to make clear the economy of influence that is Washington."

This interpretation is, of course, closely related to Lessig's current interest in, and work on, a more transparent and less corrupt government. Last year he also announced a first practical project in the field of legal texts and decisions:

"Legal Commons (beta): Taking inspiration from the liberator and manumitter of government documents and legal cases, Carl Malamud, Creative Commons will enter into a joint venture with public.resource.org to collect and make available machine readable copies of government documents and law. Carl and I have committed to freeing all federal case law by the end of 2008. Importantly, this effort will not set up competing systems to the emerging ecology of great free law services (Cornell's LII, or Columbia's Altlaw.org). We instead will help gather and make available the resources those services use to provide their amazing service. So look for a tarball of all federal cases by the end of 2008, in parsable and usable plain text."

What's next?

Of course, freeing government information on public spending, on environmental or health data, or on government and parliament decision-making (voting records, contacts with lobbyists etc.) is great, and making this available in machine-readable standardized form is even better. But as we have learned from Creative Commons: "machine-readable" does not automatically translate into "human-readable" or "citizen-readable".

I see two upcoming challenges in this field:

1. Developing tools that make this information digestible for ordinary citizens. This should be fairly easy for plain environmental data, with queries like "compare air pollution over time in all states and tell me if there is a relation to power plants nearby". But social and relational data, such as data on the political process, is much harder to capture in standardized forms. A contact with a lobbyist, for example, can mean a whole range of things. It will be tough to come up with the semantics for this in the first place.

2. Even if this proves possible, interpreting such complex datasets is not easy. This is a challenge for the activists and political groups that will want to build tools around this data, and for others who will build mash-ups from those tools. I certainly see the danger of mistaking correlation for causality here, as well as other ways of blaming the wrong person or factor. In general, I am not sure whether this will lead to better political debates and decisions in the long term. One can also imagine a future in which political opponents only throw statistics at each other, and in which the discourse over values and social visions gets even more marginalized.
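
To make both points a bit more concrete, here is a minimal sketch in Python of the kind of query mentioned in the first point. Everything in it is an assumption for illustration: the column names (state, power_plants, pm25) and all numbers are invented, and no real government dataset or API is being used. It reads a small CSV table and computes a plain Pearson correlation between the number of power plants and an air pollution reading per state.

```python
import csv
import io
from math import sqrt

# Hypothetical sample data, invented purely for illustration.
# Column names and values are assumptions, not a real dataset.
SAMPLE_CSV = """\
state,power_plants,pm25
A,2,8.1
B,5,10.4
C,9,15.2
D,12,17.5
E,20,26.3
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per state."""
    return list(csv.DictReader(io.StringIO(text)))


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rows = load_rows(SAMPLE_CSV)
plants = [float(row["power_plants"]) for row in rows]
pollution = [float(row["pm25"]) for row in rows]

coefficient = pearson(plants, pollution)
print(f"Pearson correlation: {coefficient:.2f}")
```

The easy part is exactly what the first point predicts: once the data is in a standardized format, a few lines are enough to get a number. The hard part is the second point. A high coefficient here says nothing about whether the power plants cause the pollution; population density, traffic or industry could drive both, and none of that is in the table.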

That said, of course I totally agree that more transparency of government is better than less. And if machines can help us aggregate and digest the information, we should really give it a try.

PS: If anybody knows more comprehensive literature around these ideas, please let me know!

Update: The broader term for this (which is also much more common in the English-speaking world) is "open government". This also includes citizen wikis on government and parliament people and activities, as well as similar approaches where the data is not necessarily in standardized (i.e. machine-readable and digestible) formats.

Some sources on this:

Ethan Zuckerman: Towards the principles of open government data
O'Reilly Radar on the Open Government Summit
David Robinson, Harlan Yu, William Zeller and Edward W. Felten: Government Data and the Invisible Hand (pre-print)
Ed Mayo and Tom Steinberg: The Power of Information: An independent review
Ellen Miller: Case Study. Why Transparency is a Good Thing
Jerry Brito: Hack, Mash, & Peer: Crowdsourcing Government Transparency
The Sunlight Foundation has a long list of available government data from the U.S., including links to APIs and XML-formatted data.

Thanks to Markus Beckedahl for the helpful hints and links.

Ralf Bendrath

Tuesday, June 03, 2008