Last Friday, we closed out our eighth iteration of PANDA Project development and published our second alpha. We’ve added a login/registration system, dataset search, complex query support and a variety of other improvements. You can try out the new release now by visiting our test site here.
The PANDA project aims to make basic data analysis quick and easy for news organizations, and make data sharing simple. We’ve incorporated much of the feedback we got in response to the first release, though some significant features, such as support for alternative file formats, have been intentionally put off while we focus on building core functionality. We will have these in place before we release our first beta at the National Institute for Computer-Assisted Reporting conference in February.
As always, you can report bugs on our GitHub issue tracker or email your comments to me directly.
Building a complete API
PANDA is built on a robust API so that it can be extended without modifying the core application. This is a tried-and-true way to design web applications, but it’s also a permanent hedge against the project becoming bloated or obsolete. By making PANDA extensible, we encourage other developers to add features that they need but that may not fit our vision of what belongs in the core offering (user-facing content, for example). Ideally, this will lead to a community of expert users who sustain the project after our grant is finished.
Over the next month, I’ll be adding the trickiest, but most exciting, part of the API: a mechanism for programmatically adding data to the system. Once this is complete, developers will be able to write scripts that insert new data into their PANDA instance and make it immediately searchable for reporters. Here are some example use cases:
- Create a script to periodically fetch the City of Chicago crimes database and use the API to push new events directly into PANDA.
- Create a script to parse the Federal Election Commission’s campaign contributions RSS feeds and automatically update a PANDA dataset with the latest.
- Integrate support for creating PANDA datasets directly from other applications, such as Google Refine (or even Excel!).
- Write a ScraperWiki scraper to extract data from a semi-structured list, like this one, and then create a script to fetch data from ScraperWiki’s API and write it into PANDA.
This last use case is particularly exciting. One item on our roadmap is to investigate how we can integrate directly with ScraperWiki. This is speculative at the moment, but it has the potential to make the API useful even to novice developers who might not be entirely comfortable writing shell scripts or cron jobs.
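To make the script-based use cases above a little more concrete, here is a rough Python sketch of what pushing rows into a PANDA instance might look like. The import API does not exist yet, so everything specific here is an assumption made for illustration: the URL layout, the `objects` payload key, the `ApiKey` authorization header, and the names `PANDA_URL`, `DATASET_ID`, `build_request`, and `push_rows`. Only the general shape (batch the rows, POST them as JSON) is the point.

```python
import json
import urllib.request

# Hypothetical values: the real API may look nothing like this.
PANDA_URL = "http://localhost:8000"   # assumed address of a PANDA instance
DATASET_ID = "chicago-crimes"         # assumed dataset identifier


def chunk(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def build_request(base_url, dataset_id, rows, api_key):
    """Build (but do not send) a POST request carrying one batch of rows."""
    url = f"{base_url}/api/datasets/{dataset_id}/data/"  # assumed URL layout
    body = json.dumps({"objects": rows}).encode("utf-8")  # assumed payload shape
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",  # assumed auth scheme
        },
    )


def push_rows(base_url, dataset_id, rows, api_key,
              batch_size=500, send=urllib.request.urlopen):
    """Send rows in batches and return the number of requests made.

    `send` is injectable so the function can be exercised without a server.
    """
    requests_made = 0
    for batch in chunk(rows, batch_size):
        request = build_request(base_url, dataset_id, batch, api_key)
        with send(request) as response:
            response.read()  # drain the response; a real script would check it
        requests_made += 1
    return requests_made
```

A script like this could be run from cron after each fetch of the source data, calling `push_rows(PANDA_URL, DATASET_ID, new_rows, api_key)` with whatever rows are new since the last run. Sending in batches keeps each request small, which seems like a sensible default for large datasets, though the right batch size would depend on how the finished API behaves.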
I’m really excited to be building out this feature. If you’ve got ideas for how you might use it, or use cases you want to make sure we support, let me know!
Image courtesy of Flickr user woychuk.