How Data.World Seeks to Transform Data Journalism

    by Bianca Fortis
    June 27, 2017

    What the world needs now is arguments based on facts.

    That’s the philosophy behind data.world, an Austin-based startup that bills itself as “the social network for data people.” It’s a platform that allows users to upload, share and analyze data sets in a collaborative environment.

    Data set categories run the gamut – they include health, education, government and weather. The Featured Datasets on the website include information about podcasts published between 2006 and 2017, a ranking of ESPN’s 100 most famous athletes and credit card fraud data.


    Data Journalism

    While the site is used by government agencies, non-profits and NGOs, corporations and even individuals, there’s a special use case for journalists – data provides a way to elevate a conversation about a news event or issue to focus on the facts.

    Ian Greenleigh, data.world’s Head of Marketing, pointed out that many Pulitzer winners from the last few years were data journalism projects.


    “News organizations and journalists have always been in the trust-building business,” he said. “But lately trust in the media has hit a historic low among Americans. We think we can be a small part of that solution.”

    In a survey conducted earlier this year – the results of which are accessible on the data.world platform – 78 percent of respondents said they would have more trust in online news if they could see the data behind a story.

    Data.world’s goal is to help journalists find stories faster within data sets, add depth and relevance to stories so they aren’t just anecdotal and help organizations build trust with their readers.

    The Platform

    Jon Loyens, the Chief Product Officer and Co-Founder of data.world, described the site as a knowledge management platform wrapped around a data science community.

    “We want to build the first real network of data and people working together to advance knowledge,” he said.

    The platform is built to encourage collaboration among users and to ensure that context around the data is provided. Users can download data sets, annotate them, track activity, comment on them and share them on Facebook and Twitter. Data can be shared via URL or exported, which helps make newsroom collaboration easy, according to Loyens and Greenleigh.

    The news feed shows what other users are working on and what new data sets were recently added to the site.

    “You might be able to connect with people you may not have thought can be collaborators and contributors to the things you’re working on,” Loyens said.

    For now, data.world is free to use. Eventually the company will start charging a small fee per user for specific uses, such as private data sets.

    They wouldn’t say how many users are registered, but Loyens said the number is in the tens of thousands.

    Trust in the Data

    Because anyone can upload a data set, dirty or manipulated data could, in theory, be submitted to the site. But much of the data currently submitted is from trusted data providers – the Census Bureau among them. And the team behind data.world believes the democratization and transparency of the platform help keep users accountable because data is vetted by the community. The open platform only serves to increase the quality and integrity of the data, they said.

    Users are encouraged to post the source of data sets and, once the data is in the platform, other users can see how the data has changed.

    “That’s why it’s important to build the community and the social network,” Loyens said. “It allows for the ability to dive deep and follow the linkages to check the quality of the data.”

    He pointed out that while Wikipedia has content that might not be trustworthy, the site’s community typically corrects errors quickly.

    FOIA Predictor

    The team behind data.world has also developed a web application that predicts whether a user’s FOIA request would be successfully fulfilled. Users can paste the text of a FOIA request into the tool and select the government agency from which they would like to seek documents, and an algorithm will predict the success of the request by comparing it to previous requests from the website MuckRock. It evaluates factors such as word count, average sentence length and the success rate of the agency.
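
    For readers curious what those factors look like in practice, here is a minimal sketch of how the three signals named above could be computed from a request. It is not data.world’s code: the function name, the RequestFeatures class and the shape of the past_requests records are assumptions made for illustration, and the article does not say how the real tool weighs these factors to reach a prediction.

```python
import re
from dataclasses import dataclass


@dataclass
class RequestFeatures:
    """The three factors named in the article (container is hypothetical)."""
    word_count: int
    avg_sentence_length: float  # words per sentence
    agency_success_rate: float  # share of past requests that succeeded


def extract_features(text, agency, past_requests):
    """Compute illustrative features for one FOIA request.

    past_requests is an assumed format: a list of
    (agency_name, was_successful) tuples from historical requests.
    """
    # Count words as runs of letters, digits or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text)

    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    avg_len = len(words) / len(sentences) if sentences else 0.0

    # Success rate for the chosen agency; 0.0 if there is no history.
    outcomes = [ok for name, ok in past_requests if name == agency]
    rate = sum(outcomes) / len(outcomes) if outcomes else 0.0

    return RequestFeatures(len(words), avg_len, rate)


# Made-up history and request text, for demonstration only.
history = [("EPA", True), ("EPA", False), ("FBI", True)]
request = "I request all emails from 2016. Please waive fees."
features = extract_features(request, "EPA", history)
```

    A real predictor would feed features like these into a model trained on past outcomes; this sketch stops at the feature step because that is all the description above supports.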

    Bianca Fortis is the associate editor at MediaShift, a founding member of the Transborder Media storytelling collective and a social media consultant. Follow her on Twitter @biancafortis.
