How Data.World Seeks to Transform Data Journalism

What the world needs now is arguments based on facts.

That’s the philosophy behind data.world, an Austin-based startup that bills itself as “the social network for data people.” It’s a platform that allows users to upload, share and analyze data sets in a collaborative environment.

Data set categories run the gamut – they include health, education, government and weather. The Featured Datasets on the website include information about podcasts published between 2006 and 2017, a ranking of ESPN’s 100 most famous athletes and credit card fraud data.

Data Journalism

While the site is used by government agencies, non-profits and NGOs, corporations and even individuals, there’s a special use case for journalists – data provides a way to elevate a conversation about a news event or issue to focus on the facts.

Ian Greenleigh, data.world’s Head of Marketing, pointed out that many Pulitzer winners from the last few years were data journalism projects.

“News organizations and journalists have always been in the trust-building business,” he said. “But lately trust in the media has hit a historic low among Americans. We think we can be a small part of that solution.”

In a survey conducted earlier this year – the results of which are accessible on the data.world platform – 78 percent of respondents said they would have more trust in online news if they could see the data behind a story.

Data.world’s goal is to help journalists find stories faster within data sets, add depth and relevance to stories so they aren’t just anecdotal and help organizations build trust with their readers.

The Platform

Jon Loyens, the Chief Product Officer and Co-Founder of data.world, described the site as a knowledge management platform wrapped around a data science community.

“We want to build the first real network of data and people working together to advance knowledge,” he said.

The platform is built to encourage collaboration among users and ensure that context around the data is provided. Users can download data sets, annotate the data, track activity, comment on data sets and share them on Facebook and Twitter. Data can be shared via URL or exported, which helps make newsroom collaboration easy, according to Loyens and Greenleigh.

The news feed shows what other users are working on and what new data sets were recently added to the site.

“You might be able to connect with people you may not have thought can be collaborators and contributors to the things you’re working on,” Loyens said.

For now, data.world is free to use. Eventually the company plans to charge a small fee per user for specific uses, such as private data sets.

The company wouldn’t say how many users are registered, but Loyens said the number is in the tens of thousands.

Trust in the Data

Because anyone can upload a data set, dirty or manipulated data could, in theory, be submitted to the site. But much of the data currently submitted is from trusted data providers – the Census Bureau among them. And the team behind data.world believes the democratization and transparency of the platform help keep users accountable because data is vetted by the community. The open platform only serves to increase the quality and integrity of the data, they said.

Users are encouraged to post the source of data sets and, once the data is in the platform, other users can see how the data has changed.

“That’s why it’s important to build the community and the social network,” Loyens said. “It allows for the ability to dive deep and follow the linkages to check the quality of the data.”

He pointed out that while Wikipedia has content that might not be trustworthy, the site’s community typically corrects errors quickly.

FOIA Predictor

The team behind data.world has also developed a web application that predicts whether a user’s FOIA request would be successfully fulfilled. Users can paste the text of a FOIA request into the application and select the government agency from which they would like to seek documents, and an algorithm will predict the success of the request by comparing it to previous requests from the website MuckRock. It evaluates factors such as word count, average sentence length and the success rate of the agency.
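
To make those factors concrete, here is a minimal Python sketch of how the three inputs named above could be computed from a request. It is an illustration only, not data.world's actual code: the function name `extract_features`, the sentence-splitting rule and the sample success rate of 0.4 are all assumptions, and the article does not describe how the predictor combines these values.

```python
import re


def extract_features(request_text, agency_success_rate):
    """Compute the three kinds of factors mentioned in the article.

    Hypothetical helper: the real predictor's feature code is not public
    in this article, so names and heuristics here are assumptions.
    """
    # Rough sentence split on ., ! or ? (a heuristic, not real NLP).
    sentences = [s for s in re.split(r"[.!?]+", request_text) if s.strip()]
    # Whitespace-delimited tokens stand in for words.
    words = request_text.split()
    word_count = len(words)
    avg_sentence_length = word_count / len(sentences) if sentences else 0.0
    return {
        "word_count": word_count,
        "avg_sentence_length": avg_sentence_length,
        # Share of the agency's past requests that were fulfilled (assumed input).
        "agency_success_rate": agency_success_rate,
    }


# Made-up request text and agency success rate, for illustration only.
features = extract_features(
    "I request all emails from 2016. Please waive fees.", 0.4
)
print(features)
```

A model trained on past requests, such as those collected by MuckRock, would take values like these as inputs. Which model the team uses is not stated in the article.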

Bianca Fortis is the associate editor at MediaShift, a founding member of the Transborder Media storytelling collective and a social media consultant. Follow her on Twitter @biancafortis.

Bianca Fortis is an independent journalist and social media consultant based in New York City. Her work has been published in newspapers throughout the country. She was a recipient of the 2011 Scripps Howard Foundation’s Semester in Washington Fellowship and won the 2013 I.F. Stone Award for Emerging Journalists through the Nation Institute.
