Barely two decades into the digital age, we take online media for granted. So much is so easy and convenient — at our fingertips — that we can forget technology can only do so much. Then we come up with a great idea that leaves us with the challenge of how to successfully push the limits.
This is what has confronted Gotham Gazette as we move into the final stages of creating our Councilpedia site. Councilpedia, a Knight News Challenge winner that I’ve blogged about here previously, will explore more fully the links between money and politics in New York City.
Councilpedia will enable visitors to the site to share what they know about politicians and their donors. It will be powered by MediaWiki, so people can flag something, noting, for example, that one contributor to a candidate owns land she hopes to get rezoned for a Walmart. Gotham Gazette staff will then confirm or delete the comment.
Filtering Data
The core of Councilpedia is information already on Gotham Gazette, information from City Council (on earmarks, for example) and, above all, the massive records from the city Campaign Finance Board on giving and spending. The sheer magnitude of all this data has posed an array of problems.
The city data, while thorough and accessible, is inscrutable to most New Yorkers: a list of largely meaningless names. To make it easier to search and understand, we set out to code the data to indicate large donors, donors from the city, unions, the real estate industry and so on. With some candidates having thousands of contributors, this presented a massive task. Fortunately, we had some conscientious interns this summer who, between their other reporting responsibilities, dutifully researched and coded line after line of information under the supervision of our city government editor, Courtney Gross.
Readers will be able to examine this data in a number of ways. They can view by candidate. They can find out who else the contributor helped fund. They can look at intermediaries and determine whose money they bundled and then who it went to. And so on.
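For readers curious how views like these could be derived from one flat list of contributions, here is a minimal Python sketch. It is not Councilpedia's actual code: the field names (`candidate`, `contributor`, `intermediary`, `amount`), the sample records and the function names are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample records; names, fields and amounts are illustrative only.
CONTRIBUTIONS = [
    {"candidate": "Candidate A", "contributor": "Donor 1", "intermediary": "Bundler X", "amount": 250},
    {"candidate": "Candidate A", "contributor": "Donor 2", "intermediary": None, "amount": 100},
    {"candidate": "Candidate B", "contributor": "Donor 1", "intermediary": "Bundler X", "amount": 500},
    {"candidate": "Candidate B", "contributor": "Donor 3", "intermediary": "Bundler X", "amount": 75},
]


def contributors_by_candidate(records):
    """View by candidate: total given by each contributor to each candidate."""
    totals = defaultdict(lambda: defaultdict(int))
    for record in records:
        totals[record["candidate"]][record["contributor"]] += record["amount"]
    return {candidate: dict(donors) for candidate, donors in totals.items()}


def candidates_funded_by(records, contributor):
    """Who else did this contributor help fund? Returns a sorted list of candidates."""
    return sorted({r["candidate"] for r in records if r["contributor"] == contributor})


def bundled_by(records, intermediary):
    """Whose money did an intermediary bundle, and which candidate received it?"""
    bundles = defaultdict(list)
    for record in records:
        if record["intermediary"] == intermediary:
            bundles[record["candidate"]].append((record["contributor"], record["amount"]))
    return dict(bundles)
```

Calling `candidates_funded_by(CONTRIBUTIONS, "Donor 1")` on the sample data lists both hypothetical candidates, which is the kind of cross-reference a reader would follow from one donor's page to another candidate's.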
For the wiki, though, this mountain of information has been a bit much. When technical manager William JaVon Rice began uploading the data into spreadsheets he had created, the process took 36 hours and produced some 31,000 pages, a sure indication no one would ever attempt this in print. The system balked. It overwrote pages, for example, which forced Rice to check every candidate’s list of contributors, often hundreds long, to determine which pages had been overwritten and then undo each overwrite.
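One way to catch this kind of problem before a long upload is to check whether any two rows would land on the same page title. The Python sketch below assumes, purely for illustration, that each row becomes one page named from the candidate and the contributor. We have not established that this is what caused the overwrites, and the naming scheme and function names are hypothetical.

```python
from collections import defaultdict


def page_title(row):
    """Hypothetical naming scheme: one page per candidate/contributor pair."""
    return f"{row['candidate']}/{row['contributor']}"


def find_title_collisions(rows):
    """Return {title: [row indexes]} for every title claimed by more than one row."""
    rows_by_title = defaultdict(list)
    for index, row in enumerate(rows):
        rows_by_title[page_title(row)].append(index)
    return {title: indexes for title, indexes in rows_by_title.items() if len(indexes) > 1}


# Illustrative rows: the same donor appears twice for one candidate.
SAMPLE_ROWS = [
    {"candidate": "Candidate A", "contributor": "Donor 1"},
    {"candidate": "Candidate A", "contributor": "Donor 2"},
    {"candidate": "Candidate A", "contributor": "Donor 1"},
]

COLLISIONS = find_title_collisions(SAMPLE_ROWS)
```

Any title that shows up in `COLLISIONS` could then be merged or renamed by hand before the upload starts, rather than hunted down afterward.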
Pushing The Limits of MediaWiki
We’re still planning to have this ready to show you in the next several weeks. And we think you’ll be impressed. Not to boast, but the reporters, campaign finance aficionados and followers of city government who viewed our test felt that way.
But we do see a number of issues looming ahead. Councilpedia is intended as a living, breathing site, meaning data will continue to accumulate as officials collect more money, award more earmarks, pass more bills, and so on. The updating poses a challenge for a small non-profit like Gotham Gazette.
The magnitude of the new information — added to the volumes we already have — is likely to push the limits of MediaWiki even further.
With this in mind, we’re looking for ways to automate more of the process. And we hope someone (any takers out there?) will make MediaWiki more robust or create an alternative.
As always, we appreciate your ideas, so feel free to share them in the comments below. And stay tuned for Councilpedia.