I spent a rapid-fire 23 hours in St. Louis this weekend at the NICAR 12 conference. For those who don’t know, NICAR stands for “National Institute for Computer-Assisted Reporting,” and, as the slightly antiquated name might suggest, it was founded long before the commercial Internet, back in 1989. Traditionally, the organization (which is run by IRE, Investigative Reporters and Editors) has been about helping reporters use computers to comb through data, but over the years it has become the de facto organization and conference for news apps developers. And this year, it felt like the journo-coders in attendance took it to another level.
There was an enormous amount of information thrown around at NICAR 12, and Chrys Wu from Hacks/Hackers did an incredible job capturing much of it. A few projects really stood out as exemplifying some of the best that the developer community within journalism can do.
The PANDA project officially launched into beta in St. Louis and threw a “provisioning party” to help people get their data-spelunking appliance up and running. The tool, which allows for collaborative searching and sharing of data, promises to unlock data across a newsroom, but it has a ton of applicability for anyone with a bunch of data they want to be able to search across. PANDA was built by Christopher Groskopf, Brian Boyer (who donned a panda suit for the occasion), Joe Germuska, and Ryan Pitts. They’re looking for beta testers and collaborators, so check out the demo or grab PANDA on GitHub now.
For sheer blow-my-mind value, it didn’t get bigger than Overview, which makes the process of digging through giant piles of documents significantly easier. Creator Jonathan Stray showed Overview off throughout the conference and helped walk people through the install process to get them up and running. The project, which is super powerful, is still in early stages — Stray calls it a prototype — but he’s already used it to comb through 4,500 pages of reports filed by U.S. security contractors in Iraq. As it gets built out, it’s going to be an amazing tool for many. Stray even offers a great step-by-step for installing Overview on your machine.
Tabletop.js is one of those things that you can’t quite believe doesn’t already exist. It’s a simple tool that allows you to painlessly use a public Google Spreadsheet as the backend for web content. I spent the train ride home from St. Louis playing with it, and it does exactly what it promises on the box. It’s such a simple tool, but it has all kinds of powerful possibilities. It was built by Jonathan Soma at Balance Media with guidance from John Keefe of WNYC. GitHub’s got the goods.
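To make the "spreadsheet as backend" idea concrete, here is a rough Python analogue. It is not Tabletop.js and does not use its API. It assumes the sheet's contents are already in hand as CSV text (the `SHEET_CSV` sample, its column names, and both helper functions are made up for illustration), and it turns each row into a dictionary that a page template could consume.

```python
import csv
import html
import io

# Hypothetical CSV text, shaped like rows an editor might keep in a spreadsheet.
SHEET_CSV = """item,count
apples,3
pears,5
"""


def rows_from_csv(text):
    """Turn CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def render_list(rows):
    """Build a small HTML list from the rows, escaping each cell value."""
    items = "".join(
        f"<li>{html.escape(row['item'])}: {html.escape(row['count'])}</li>"
        for row in rows
    )
    return f"<ul>{items}</ul>"


if __name__ == "__main__":
    print(render_list(rows_from_csv(SHEET_CSV)))
```

The appeal is the same one the tool promises: whoever maintains the spreadsheet edits rows, and the page picks up the change without anyone touching a database.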
The LA Times Datadesk team gave a presentation about why they turn many of their Django applications into flat HTML files before deployment. By not relying on the server to generate pages that don’t need to be built dynamically for every user, the Datadesk team saves a ton of headaches (not to mention money) by serving all sorts of web apps as straightforward HTML pages. Django-Bakery, their code for making this happen, is now up on GitHub.
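The "baking" idea can be sketched in a few lines of plain Python. This is not Django-Bakery's API or the Datadesk team's code. It is a minimal illustration that assumes a list of records standing in for database rows (the `STORIES` data, the `PAGE` template, and the `bake` function are all hypothetical) and renders each one once into a static `index.html` file.

```python
import html
import tempfile
from pathlib import Path
from string import Template

# Hypothetical records standing in for rows a Django view would load.
STORIES = [
    {"slug": "budget-tracker", "title": "Budget Tracker", "body": "Where the money goes."},
    {"slug": "school-scores", "title": "School Scores", "body": "Results by campus."},
]

# Hypothetical page template; a real app would use its own templates.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$body</p></body></html>"
)


def bake(records, build_dir):
    """Render every record once and write it to <build_dir>/<slug>/index.html."""
    build_dir = Path(build_dir)
    written = []
    for record in records:
        # Escape values so record text cannot break the markup.
        safe = {key: html.escape(value) for key, value in record.items()}
        page_dir = build_dir / record["slug"]
        page_dir.mkdir(parents=True, exist_ok=True)
        out_path = page_dir / "index.html"
        out_path.write_text(PAGE.substitute(safe), encoding="utf-8")
        written.append(out_path)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for path in bake(STORIES, tmp):
            print(path.relative_to(tmp))
```

Once the files exist, any static file host can serve them, which is where the savings described in the talk come from: the rendering cost is paid once at build time instead of on every request.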
Node Web Scraping
I missed this talk, but when I asked on Twitter for recommendations of great things from NICAR, Al Shaw’s talk on using Node.js for scraping web pages got the most recommendations, and for good reason. His straightforward presentation steps through the process and makes Node.js look like a data scraper’s dream come true.
Campaign Finance API
Finally, although it shipped a couple of days after NICAR wrapped up, it’s worth pointing out the amazing work by both Derek Willis at the New York Times and the team at ProPublica in bringing the NYT Campaign Finance API up to near real-time speed. This kind of work is vital this election season, and it’s truly inspiring to see collaboration between two incredible news orgs. The full documentation of the API is on the NYT Developer Network.
It’s not even March yet, and the amount of awesome coming out of the journalism code community is already overwhelming. Let’s keep it going.
A version of this post first appeared here.