The push for open data has prompted the release of countless datasets. But the raw data alone can’t adequately serve the public. After all, it’s hard to imagine the average person investing the time — or possessing the skills — to make meaningful sense of a massive spreadsheet.
That’s where the media comes in. Whenever possible, journalists should provide a way for every reader to explore the data and gain insights. Forget linking to massive spreadsheets, and keep in mind that even static graphs have their limitations. Instead, present your data in a rich, intuitive and interactive format. Allow readers to go beyond the questions you asked in your piece to ask and answer their own.
Why You Should Help Readers Explore The Data
Why should journalists go through the trouble of fostering data exploration? After all, it may seem counter to the journalist’s traditional role of explaining complex issues through a tightly edited set of facts and perspectives.
The answer is simple, according to Alberto Cairo, a journalism professor at the University of Miami: “That’s what journalists do: make relevant information available to people who need it, or to communities who need it.”
Data, especially big data, can mean different things to different people. A graph that benefits the majority may not be significant to the minority, and vice versa. And not everyone has the tools or the skills to go looking for answers on their own.
Given the chance, the reader will find what Cairo calls “the me factor.” The reader learns how he or she fits into the bigger picture. And as a result, the story becomes more relatable. (Chad Skelton shared his insights on this idea in my previous piece.)
Readers find value in this form of self-discovery, according to Cairo, who explores this topic in his upcoming book, “The Truthful Art.” Consider the fact that the New York Times’ most popular piece in all of 2014 wasn’t a news story but rather an interactive quiz on regional U.S. dialects. The Times also recently updated its 2007 rent vs. own calculator due to its lasting popularity.
So how can you empower readers to dig into the data without any data skills? Here are a few tips to help you get started.
1. Visualize The Data
One way to make data accessible is to visualize it, says Ramon Martinez, health metrics adviser with the Pan American Health Organization. As part of his job, Martinez sees large datasets on health-related topics like the child obesity rate and each country’s life expectancy.
Thanks to his data skills, Martinez can parse the insights these datasets contain. But he knows that those same insights are inaccessible to most.
“If we put the dataset to the public, only a few people can benefit from that,” he said.
To help bridge the gap, Martinez visualizes the data.
To Martinez, truly open data means anyone can understand the numbers and benefit from the insights contained within. Accessible insights can inform research, policies, and decisions, said Martinez, but the practice has yet to catch on.
“Some people are analyzing data, but they’re not putting the end results in a way that’s accessible to the public,” he said. “We have to provide to the end user, the reader, the ability to explore the data and find their own story. That is the whole thing.”
2. Let Readers Interact With The Data
One way to help readers find their own story in the data is to build interactivity into data visualizations. Filters that narrow and broaden the scope of analysis enable readers to choose their own path.
Take, for instance, the Federal Emergency Management Agency’s disaster data visualization tool. Users can click on their home state or county and see a list of threats based on historical data. Users can also filter the data by hazard type, location, and year.
FEMA’s tool helps each user focus on the data that matters most. The user can ask personally relevant questions, spot trends and outliers, and plan for what may come.
“This gives you a way to take a lot of data at your home state to see how many times it’s been declared, and some people are surprised by how many times they have been declared or how infrequently they have been declared,” FEMA’s Craig Fugate wrote on the agency’s blog.
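For newsroom developers, the filtering idea described above can be sketched in a few lines of Python. This is only an illustration, not how FEMA's tool is built: the `Declaration` record, the `filter_declarations` helper, and the sample rows are all hypothetical and invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Declaration:
    """One disaster declaration (hypothetical record layout)."""
    state: str
    hazard: str
    year: int


# Made-up sample rows for illustration only; this is not real FEMA data.
SAMPLE = [
    Declaration("WA", "Flood", 2009),
    Declaration("WA", "Severe Storm", 2012),
    Declaration("TX", "Hurricane", 2008),
    Declaration("TX", "Flood", 2015),
    Declaration("WA", "Flood", 2015),
]


def filter_declarations(records, state=None, hazard=None, year=None):
    """Return the records that match every filter that is set.

    A filter left as None is ignored, so the reader can narrow or
    broaden the view one choice at a time.
    """
    return [
        r for r in records
        if (state is None or r.state == state)
        and (hazard is None or r.hazard == hazard)
        and (year is None or r.year == year)
    ]


# A reader in Washington asking about floods in their own state.
wa_floods = filter_declarations(SAMPLE, state="WA", hazard="Flood")
print(len(wa_floods), "flood declarations in WA in the sample")
```

Each optional argument corresponds to one control a reader might click, which is what lets them choose their own path through the data.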
3. Provide Context For The Data
Like any other fact or figure, data in context is much more valuable than data alone. Whether in the form of additional reporting or additional data, added context helps readers understand what the numbers mean beyond the chart.
Adam McCann’s visualization of the future of the U.S. Supreme Court shows each justice’s tenure as well as the projected retirement date. But it doesn’t stop there. The reader can see past presidents’ nominations as well as their political affiliations. They can also see past justices’ tenure length and retirement age. These additional details put the current state of the court in historical context.
Context can also exist in the form of additional data. Shine Pulikathara’s visualization of 50 years of crime statistics makes it easy for the reader to see how a given state compares to others in various categories of crime and whether that state’s trend follows the norm.
And this seemingly ordinary visualization shows something beyond the obvious. Instead of simply showing the unemployment rates we so often hear, author Joe Mako shows each state’s difference from the national average over time. In this case, the presentation provides context by adding larger meaning to each state’s unemployment rate.
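The arithmetic behind that kind of chart is simple: subtract the national rate from the state rate for each year. The Python sketch below shows the idea with invented numbers. The function name and the figures are assumptions made for illustration, not Mako's method or real unemployment data.

```python
def difference_from_national(state_rates, national_rates):
    """Return {year: state rate minus national rate}, in percentage points.

    Years missing from either series are skipped rather than guessed.
    """
    return {
        year: round(state_rates[year] - national_rates[year], 1)
        for year in state_rates
        if year in national_rates
    }


# Hypothetical rates (percent), invented for this example.
national = {2010: 9.6, 2011: 8.9, 2012: 8.1}
state = {2010: 10.1, 2011: 8.9, 2012: 7.4}

gaps = difference_from_national(state, national)
for year, gap in sorted(gaps.items()):
    # A positive gap means the state is above the national average.
    print(year, f"{gap:+.1f}")
```

Plotting the gap instead of the raw rate shows at a glance whether a state is doing better or worse than the country as a whole in any given year.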
4. Share The Data, Your Source And Your Methodology
Visualizing your data doesn’t mean you can forget about the raw data. In fact, there are three equally important things you must share, according to Cairo: your raw data, your source, and your methodology. (ProPublica, for its investigation of surgical complications, explained the methodology in three forms: a short post, an FAQ and a technical whitepaper.) The goal is to be transparent and allow others to build on your findings.
Here’s an example of this iterative approach at work. Martinez first published a dataset on child obesity, which he also visualized this way:
Then John Keltz took that data and used it to show how the U.S. compares to the rest of the world.
In another example, Russell Spangler visualized the Human Development Index, a measure of life expectancy, education and income indices, by country. The “more info” icon under the header leads to the data sources.
Using the same data, Kelly Martin made a very different visualization, one that offers another perspective and makes country comparisons easier.
‘A Perfect Way To Be Useful To Our Readers’
Whenever possible, help readers understand the data, and empower them to conduct their own analysis. Provide the tools and the context that let them go beyond your questions to ask their own. Inform the public by helping decipher the fast-growing volume of data around us.
“We journalists have talked a lot about being useful to our readers,” Cairo said. “This is a perfect way to be useful to our readers.”
Martha Kang is the editorial manager of Tableau Software, where she helps chronicle today’s big data revolution. A lifelong storyteller, she’s currently focused on telling data-driven stories that help us better understand our world, and ultimately, ourselves. Prior to joining Tableau, Martha worked as a journalist, first in TV news, then in new media. She most recently served as the online managing editor of KPLU, an NPR affiliate in Seattle. There, she oversaw a number of projects, including the launch of Quirksee.org, a vertical site that featured two of her own award-winning stories, as well as a five-part, data-driven series on Washington state’s idiosyncratic tax system. Martha has also worked at KOMO News, Northwest Cable News, and WLS-TV. In 2013, she was chosen as a Kiplinger fellow of public affairs journalism by Ohio State University.