Databases as Entry Points to Investigative Stories

    by Paul Grabowicz
    February 18, 2008

    If you want to know what the future of investigative reporting might look like online, check out what the Las Vegas Sun has done with its special section on Flight Delays.

    It’s an interactive map and database on plane delays at McCarran Airport. You can check a particular flight, look at patterns in delays to other airports and find out how long it takes to go through security checkpoints at different gates at different times of the day.

    And there’s a video of interviews with people at the airport, along with time-lapse videos showing planes arriving at the airport and the bustle in the baggage claim area.


    And oh by the way, the page also links to an in-depth story the newspaper did analyzing the problem with flight delays and what was causing it.

    Which is why I think this may show the future of investigative reporting – featuring a map or database that people can play with to get very personalized information: What delays am I facing at security checkpoints at my gate? What’s the likelihood my flight is going to be delayed? How big a problem is the airport I’m planning to fly to?

    And then using people’s engagement with the information to draw them to the stories that provide the background and context for understanding the data they’ve just explored.


    This idea is of particular interest to me, because it fits in with a project I’m currently doing on video games, in which we’re trying to use game play to help inform a community about its history and heritage.

    Interactive, customizable databases accomplish much the same thing, allowing people to “play” with the data. On the Las Vegas site I found myself clicking on the links on the map and sifting through the flights database just out of curiosity and for fun, even though I rarely fly to that airport.

    And as a former investigative reporter, I’m especially concerned with how digital technology can be used to do better investigative stories. Making use of the data that lies behind most investigative projects is one way to give people a personal stake in the information contained in those stories.

    Much of the attention of news organizations right now is on using the Internet for breaking news and 24/7 coverage at continuous news desks and doing quick podcasts and blog posts. But what journalists really have to offer are the skills, time and resources needed to do in-depth reporting.

    Taking long investigative projects written for newspapers or magazines or as TV/radio documentaries and then shoveling them online, perhaps dressed up with a little multimedia, is only jamming old media forms into a new media pipe. But understanding how to present data in an appealing way, and making that data accessible so people can mess around with it and create their own “stories,” is taking advantage of what digital has to offer.

    Several of the other Knight Challenge Grant winners are already doing great work in this area. Rich Gordon’s initiative at Northwestern to get computer programmers to come to journalism schools will help create the tools needed to build audiences for quality journalism. Rich also has written about the importance of databases in his article on Data as journalism, journalism as data.

    Adrian Holovaty has launched the EveryBlock site that aggregates local data from a variety of databases. Imagine the stories that could grow out of this data if the public and journalists worked together on analyzing it.

    Adrian has spoken before about the need to get reporters to focus on the raw data they gather for a story, and how that might be put online. I’d only go a step further and try to get reporters to think about how online databases might be the gateway into their stories, rather than the other way around.

    The folks at the Gotham Gazette have been exploring how news organizations can make use of online games to cover complex stories and how to present data in a visually more effective way.

    And Ingeborg Endter and Chris Csikszentmihályi at MIT’s Center for Future Civic Media suggested some good resources for thinking about how to better present data online:

    – IBM’s Many Eyes project

    – Ben Fry’s book on Visualizing Data and some of his projects when he was at the MIT Media Lab.

    I’ve also started collecting links on the delicious social bookmarking site to some online databases and map mashups.

    If anyone has more ideas on this, or suggestions about other resources, I’d love to hear them.

    Tagged: databases investigative reporting journalism

    7 responses to “Databases as Entry Points to Investigative Stories”

    1. Jeremy Rue says:

      The Las Vegas Sun runs their own IT and they run their own servers. Most newsrooms outsource hosting and sometimes even their Web designers/programmers.

      I really do think there needs to be a paradigm shift on how newsrooms are structured. How can you do “database” stuff when the system admin won’t give access to the SQL databases on servers? (this was the situation I ran into at the Oakland Tribune).

      I think the newsroom has to be in control of the Web site just as it is in control of the print product. That’s the only way these types of projects will start to come to fruition.

    2. Dan Schultz says:



      Giving access to databases is like giving the keys to the nuclear reactor room;) I can’t blame your system administrator for not feeling comfortable giving open access to the guts of your digital operations. You don’t want just anyone snooping around pressing buttons, inserting content, creating tables, editing tables, deleting tables, etc.

      What might be nicer is interfaces (programs) that allow people to interact with data in a controlled environment, i.e. nobody can blow up the nuclear reactor; input is validated by the program, which acts as a liaison between the database and the untrained human.
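      A minimal sketch of that kind of controlled interface, in Python with the standard sqlite3 module, might look something like the following. The flights table, its sample rows, and the function names build_demo_db and average_delay are all hypothetical, made up purely for illustration; a real newsroom setup would differ. The idea is only that the program checks the input, runs one fixed query, and never lets the user write to the database.

```python
import re
import sqlite3

# Hypothetical schema and sample rows, invented for illustration only.
def build_demo_db():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights (airline TEXT, flight_no TEXT, delay_minutes INTEGER)"
    )
    conn.executemany(
        "INSERT INTO flights VALUES (?, ?, ?)",
        [("AA", "100", 12), ("AA", "200", 48), ("WN", "300", 5)],
    )
    conn.commit()
    # From here on, this connection refuses statements that change data.
    conn.execute("PRAGMA query_only = ON")
    return conn

# Accept only short alphanumeric airline codes.
AIRLINE_CODE = re.compile(r"[A-Z0-9]{2,3}")

def average_delay(conn, airline):
    """Return the average delay in minutes for one airline, or None if no rows match."""
    code = airline.strip().upper()
    if not AIRLINE_CODE.fullmatch(code):
        raise ValueError(f"not a valid airline code: {airline!r}")
    # The user's value is passed as a parameter, never pasted into the SQL text.
    row = conn.execute(
        "SELECT AVG(delay_minutes) FROM flights WHERE airline = ?", (code,)
    ).fetchone()
    return row[0]

if __name__ == "__main__":
    conn = build_demo_db()
    print(average_delay(conn, "aa"))
```

      Someone using this can ask about any airline they like, but anything that does not look like an airline code is rejected before it reaches the database, and the connection itself will not accept inserts, updates or deletes.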

      Also, be careful about saying who you want to run what. Putting the newsroom in *control* of the web site assumes they know how to use it (and that they know how NOT to use it). I’m not saying it should be detached to the point that it is outsourced and there isn’t direct and constant communication, but you have to remember that expertise in journalism does not imply expertise in the complexities of proper development process/database administration/digital interface design/etc.

      This probably deserves a post dedicated to it one of these days… I find myself saying that a lot lately!

    3. One of the disconnects in many newsrooms (such as the one at the Mercury News) is that the main newsroom and the online newsroom have been physically separate for so long. So Dan has a good point about control: I think most folks in the main newsroom have little clue about what makes online tick, whether it’s the main site, blogs, or multimedia.

      This problem is in large part due to a series of bad, bad management decisions. When I came to the Mercury News in 1999, I sat next to the producers who ran our site. Then Knight Ridder pulled all the online staffs out of all its newsrooms to create a new business unit. For the past seven years, this policy has been slowly reversed, and last summer the online folks returned to our building — though on the opposite side rather than being integrated into the newsroom.

      The result of all this is that most folks in the newsroom have little or no concept of how the online folks work, what they do, and how best to take advantage of those kinds of tools.

      To remedy this, we proposed putting the online team in the heart of the main newsroom, and hiring more developers and programmers to help us develop more data-driven products like the kind Paul discusses above. Who knows if either will ever happen. But I hope so, because making better use of our data is one of many great ways to help us re-connect with readers and produce great journalism.

    4. I would never have agreed to join Gotham Gazette as a Technical Director if I didn’t sit between two reporters and participate in all of our editorial meetings.

      Dan, a sysadmin who can’t give the editorial staff a database without putting the organization’s infrastructure at risk is a sysadmin who isn’t asking enough questions. I’d say he’s a sysadmin who has no idea what he’s doing, but most of us only figure out what we’re doing by asking questions.

      If news organizations treat IT staff like mechanics, they’ll get websites that run like cars. I’m going to take this analogy way, way too far, but if you take a car to the shop they’ll fix what you asked them to fix (or not, which is a lot like web developers, too; we’re a mixed lot, we techies). They might tell you you should let them do more work if they’ve developed a taste for a particular kind of engine-souping-up, they might show you a neat thing you can use, they might sell you something to make you look stupid (a bumper bra, maybe, or a club), but they aren’t going to help you rethink your local transportation infrastructure. Some people want to fix cars, no doubt about it. Some people want to fix computers. But if you want the kind of innovation that Paul found at the Vegas Sun, you have to have technical staff who are part of thinking about the news, and you have to have editorial staff who are willing to understand abstract concepts like databases.

      That said, I’m not convinced that this flight delay finder is quite the breakthrough in news reporting that I’m looking for. The TSA wait time data is a neat mashup, but I’m not sure it is breaking news.

      I’d like to see them figure out how to illustrate the causes of delays; that is the part that is interesting enough to make me think about more than how grateful I am that I don’t expect to board another airplane until June. Are there a few airlines that slow averages down? Do some airports have more trouble with weather? Do delays bottleneck things more at one airport or another?

      I, too, keep a collection of innovative uses of maps and mapping as well as of design in advocacy, for what it is worth.

    5. Dan Schultz says:

      I don’t mean to belabor the point, but giving someone a database that they can use is very different from giving someone free rein over a database that they can use.

      When I see someone say “the sysadmin won’t give us access to our database” that means to me that he did give them a database (i.e. what you are talking about isn’t the same as what I’m talking about)

      It also means to me that the sysadmin won’t let people go in and do anything they want to the database (they don’t have direct “access”, which as you know means the ability to do everything they want to do).

      Sure, maybe this was a sysadmin who was giving absolutely nothing – but my point is that I don’t want anyone reading that comment to think that their system administrators are being unreasonable when they refuse to let just anyone do anything to their databases.

      I have a feeling we don’t disagree too much here, it’s just a matter of how we interpret the phrase “give access” — I assumed it meant “let us play with the guts” while you take it to mean “let us use the thing at all”

      Looking back I’m betting your interpretation is closer to correct, but I felt it was important enough to draw out the difference. Sysadmins can’t give everyone read/write access on demand! To do it securely would involve training or software development. If neither is available it is probably easiest (and wisest) for him to simply say no.

    6. Dan Schultz says:

      Correction when rereading — “It also means to me that the sysadmin won’t let people go in and do anything they want to the database (they don’t have direct “access”, which as you know means the ability to do everything they want to do).”

      should read — “It also means to me that the sysadmin won’t let people go in and do anything they want to the database (they don’t have direct “access”, which as I took it could have meant the ability to do everything they want to do).” ;););)

    7. Fair enough Dan.

      The particular breakdown at the Oakland Tribune is probably complicated in its own particular ways, but from a distance it reflects something I see all of the time: it is incredibly rare for programmers to be included in the editorial staff. Database driven interactive features can’t happen unless reporters and programmers can collaborate.

      That doesn’t mean that the sysadmin should just hand over their whole infrastructure, it means that organizations need to be committed to fostering collaboration between editorial content producers and programmers.
