How Data Can Become an Evergreen Source for Newsrooms

October 7, 2011

Newsrooms don’t fear too much news. They fear not enough news. With news on demand 24/7, the stream of information that journalists work with is becoming the commodity they rely on, which is why “evergreen” stories are becoming a staple of the modern newsroom. What they need now are evergreen news sources.

So how can data be an evergreen news source? Traditionally, data was hard to work with. It had to be collected, cleaned, and organized, and once the effort had been made to produce something consumable, it was left to stagnate and rot over time. With ScraperWiki, we’ve structured our site so that incoming data on the web renews your database and the infrastructure organizing your data flow does not rot.

For use in the newsroom, however, the output needs to be streamed. Here are a couple of things you can do:

Data Stress to RSS

An RSS feed of case updates at the U.K. Information Tribunal as scraped by ScraperWiki.

Our Web API now offers RSS as an output format. For example, a ScraperWiki user made a scraper that gets alcohol licensing applications for Islington in London. She wanted an RSS feed so she could keep track of new applications in Google Reader. Now all she needs to do is go to the Web API explorer, choose “rss2” for the format, and enter a SQL statement into the query box. That way, she gets only what she wants into her reader without having to change the database.
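As a rough illustration of what the Web API explorer produces, here is a minimal Python sketch that builds such a feed URL. The endpoint address, the parameter names other than the “rss2” format, the scraper name, the table name and the column names are all assumptions made for the example, not details taken from the scraper described above.

```python
from urllib.parse import urlencode

# Assumed endpoint: the article does not give the address, so treat this
# as a placeholder and copy the real one from the Web API explorer.
API_BASE = "https://api.scraperwiki.com/api/1.0/datastore/sqlite"


def build_rss_url(scraper_name, sql, base=API_BASE):
    """Return a URL asking for the result of a SQL query as an RSS 2.0 feed."""
    # "name" and "query" are assumed parameter names; "rss2" is the
    # format value mentioned in the article.
    params = {"format": "rss2", "name": scraper_name, "query": sql}
    return base + "?" + urlencode(params)


# Hypothetical scraper name, table and columns, chosen for illustration.
sql = (
    "select premises as title, url as link, "
    "details as description, received as date "
    "from swdata order by received desc limit 20"
)
feed_url = build_rss_url("islington_licensing_applications", sql)
print(feed_url)
```

The point of the design is that the filtering lives in the SQL statement, so the feed can be narrowed or reordered by editing the query while the scraped database stays untouched.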

The Early Data Bird Catches The Story

One of our savvy users then used ifttt to turn an RSS feed into a Twitter feed. For food safety inspections in Walsall, follow @EatSafeWalsall. In fact, we have a couple of accounts tweeting out scraped data. For ministers’, permanent secretaries’ and special advisers’ meetings, gifts and hospitality at No.10 Downing Street, follow @Scrape_No10. For Edinburgh planning applications, follow @PlanningAppMap. For complaints made against judges in the U.K., follow @OJCstatements.

Because you can get data in the way you want, you can push data out the way you want and also keep the integrity of the original database. The sources of data for these accounts are very different, and the output scripts need to reflect the timing of the data release. However, all this work means sentences can be formed and hashtags attached. So if those hashtags start trending, you’ve got a story lead.
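To make the idea of forming sentences and attaching hashtags concrete, here is a small Python sketch that turns the items of an RSS 2.0 feed into tweet-sized lines. It is not how ifttt or the accounts above work. The sample feed, the hashtag and the function name are made up for the example, and the 140-character limit is Twitter’s limit at the time of writing.

```python
import xml.etree.ElementTree as ET

TWEET_LIMIT = 140  # Twitter's character limit at the time of writing


def items_to_tweets(rss_xml, hashtag, limit=TWEET_LIMIT):
    """Turn each RSS <item> into 'title link #hashtag', trimming long titles."""
    root = ET.fromstring(rss_xml)
    tweets = []
    for item in root.iter("item"):
        title = (item.findtext("title") or "").strip()
        link = (item.findtext("link") or "").strip()
        suffix = " ".join(part for part in (link, hashtag) if part)
        room = limit - len(suffix) - 1  # one space between title and suffix
        if len(title) > room:
            # Cut the title and mark the cut with three dots.
            title = title[: max(room - 3, 0)].rstrip() + "..."
        tweets.append((title + " " + suffix).strip())
    return tweets


# Made-up sample feed, for illustration only.
sample = """<rss version="2.0"><channel><title>Inspections</title>
<item><title>Example Cafe rated 5</title><link>http://example.com/1</link></item>
<item><title>Sample Diner rated 2</title><link>http://example.com/2</link></item>
</channel></rss>"""

for tweet in items_to_tweets(sample, "#foodsafety"):
    print(tweet)
```

Keeping the link and hashtag intact and trimming only the title means every tweet still points back to the source record, which is what a reporter following the feed needs.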

A New Breed of Data Reporter

I’ve been experimenting with data output from ScraperWiki. In fact, I’ve been talking to it. In preparation for our U.S. tour, I’ve created a new member of the virtual newsroom. So here’s a little something I made earlier:

It’s not what you can do for your data, it’s what your data can do for you!

If you’d like to be a host or sponsor for a scraping event, email nicola[at]scraperwiki.com.
