The State Decoded Turns Laws Inside Out

    by Waldo Jaquith
    May 16, 2012

    Semantically, legal codes are smooth, shapeless balls of text. They’re programmatically inaccessible, useless to software — and most people. There’s simply nothing on which to get a purchase. As qualitative data, they’re inaccessible to quantitative analysis.


    This is the problem that the State Decoded project seeks to solve.


    The State Decoded’s job is to turn legal codes inside out, bringing their substructures to the surface to make them easier to understand. By reducing laws to their smallest possible units, indexing them by every possible metric, and exposing all of those internal structures, it’s possible to give people and software alike something to get a hold on.

    Place names, people’s names, organization names, bill numbers, dates, glossary terms, cross references, and statistically unlikely phrases are all lurking just below the surface, waiting to be gathered and cataloged. Innumerable external sources of data can be used to infer more about laws, including citations in court opinions, citations in scholarly publications, citations in legislation, citations in blog entries, website traffic patterns, legislative tags, legislator voting histories, lobbying records, campaign finance data, and a great deal more.
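
    To give a feel for what gathering a couple of these might look like, here is a minimal Python sketch that pulls cross references and dates out of a sentence of legal text. It assumes Virginia-style section numbers such as "§ 1.2-300" and dates written out in the "July 1, 2012" form. The patterns, the function name, and the sample sentence are all made up for illustration; this is not the State Decoded's actual parser, and real citation formats vary from state to state.

```python
import re

# Hypothetical pattern for section cross references such as "§ 1.2-300"
# or "§ 1.2-301.1". Real codes use many more citation styles than this.
SECTION_REF = re.compile(r"§+\s*(\d+(?:\.\d+)*-\d+(?:\.\d+)*)")

# Dates written as "Month D, YYYY". Other date styles are ignored.
DATE = re.compile(
    r"\b(?:January|February|March|April|May|June|July|August|"
    r"September|October|November|December) \d{1,2}, \d{4}\b"
)


def extract_structures(text):
    """Return the cross references and dates found in one section of text."""
    return {
        "cross_references": SECTION_REF.findall(text),
        "dates": DATE.findall(text),
    }


# Invented sample sentence, not quoted from any real code.
sample = (
    "As used in § 1.2-300, a permit issued under § 1.2-301.1 "
    "before July 1, 2012, remains valid."
)

if __name__ == "__main__":
    print(extract_structures(sample))
```

    Regular expressions only go so far. Place names, people's names, and organization names would need something closer to named-entity recognition, but the shape of the output, a small index of what each section mentions, would be much the same.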


    None of this has much to do with making state laws prettier. But it’s the part where the State Decoded starts to get fun.


    The project’s motto might be “state codes, for humans,” but it would be more honest to call it “state codes, for robots.” It’s the API that’s going to make this project valuable, because it’s the API where all of this fascinating data will be shared in its entirety (and also in bulk downloads because — let’s face it — sometimes APIs are more trouble than they’re worth).

    What will people do with this? I have no idea. That’s the beauty of it. There are people much smarter than I who will grasp the fascinating applications and analyses that can be created with these data. Perhaps they’ll find that legislators in different political parties tend to pass bills that affect distinctly different titles of the code. Or that the SMOG readability grade of amendments to the code has gradually been increasing.

    Maybe that legislation amending a law tends to follow a spike in scholarly citations of that law. Who knows? By providing all of these data points in one place, it will be possible for people to crunch the numbers themselves and find out what secrets lie within them.
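
    The SMOG question above is the sort of thing that takes only a few lines once the text is available. Here is a rough Python sketch using the standard SMOG formula, 1.0430 × sqrt(polysyllables × 30 / sentences) + 3.1291. The syllable counter is a crude vowel-group heuristic and the sentence splitter is naive, both assumptions made for brevity, so treat the results as approximate.

```python
import math
import re


def count_syllables(word):
    """Crude heuristic: count runs of consecutive vowels (minimum of one)."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def smog_grade(text):
    """Approximate SMOG grade for a block of text."""
    # Naive sentence split on ., ! and ?; section numbers containing
    # periods (such as "1.2-300") would be miscounted as sentence breaks.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        raise ValueError("text contains no sentences")
    words = re.findall(r"[A-Za-z]+", text)
    # A polysyllable is a word of three or more syllables.
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return 1.0430 * math.sqrt(polysyllables * 30 / len(sentences)) + 3.1291
```

    Run over each version of a section as it is amended, a function like this would yield a readability score per year, which is all one would need to see whether the numbers are creeping upward. SMOG was designed for samples of about 30 sentences, so scores for very short sections are less reliable.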

    The API for Virginia is in alpha testing now. If you’re interested in putting it to work, send an e-mail saying so to join the alpha test. This is where things get good.

    Image courtesy of Flickr user jimmywayne.
