It wasn’t until halfway through my journalism degree that I realized I wasn’t going to be a traditional reporter. I wasn’t even going to be a multimedia reporter. I was going to be a programmer/journalist. Putting a slash in your title makes you more important.
I haven’t been able to track down the first use of the phrase, but the earliest reference I could find using a Google News timeline search was in a 2006 interview with Adrian Holovaty, creator of Chicagocrime.org and EveryBlock. No surprise there. That interview was about a year before my revelation.
We’re all used to seeing journalists work with words and photos, and in the last few years even video, Flash and more. But how do you tell a story with code? When did simple reporters start becoming programmer/journalists? The history of reporters and computers has been a long and winding road.
The History of CAR
With a little extrapolation, programming in journalism can be traced back to the 1960s and ’70s. Most big newspapers had mainframe computers, and government data was being transferred from analog (paper) to electronic form. This was the beginning of what we now know under a slew of labels including Computer-Assisted Reporting (CAR), precision journalism or database journalism.
The earliest example of CAR is arguable, and seems to depend on how broad a view you take.
Scott Maier, an associate professor in the School of Journalism and Communication at the University of Oregon, wrote an article called “The Digital Watchdog’s First Byte: Journalism’s First Computer Analysis of Public Records,” published in American Journalism in 2000. He attributes the first computer analysis of public records to Miami Herald reporter Clarence Jones, who investigated corruption in the Dade County judicial system in 1968.
Maier also states that “the first use of computers for news analysis occurred November 4, 1952, when CBS television made use of the Remington Rand UNIVAC to predict the outcome of the presidential contest between Dwight Eisenhower and Adlai Stevenson. Defying pollster expectations, the computer accurately predicted a landslide for Eisenhower.”
Philip Meyer pioneered the use of computerized survey data by the print media in 1967. After riots in Detroit, the Detroit Free Press used a mainframe computer to show that people who had attended college were just as likely to have rioted as high school dropouts.
Elliot Jaspin is another big name in the CAR world. He was working at the Providence Journal in the mid-’80s when the first IBM PCs were delivered to the paper.
Jaspin wanted to find a way to use the computer as part of the reporting process. The perfect opportunity arose in the form of a federal corruption investigation into the city of Providence. Although the city’s paper records had been subpoenaed, Jaspin was able to get all the financial records in electronic form, on 9-track tapes.
When Jaspin began to talk about his methods at journalism conferences, he realized that this new form of reporting would not spread unless the cost of analysis was lowered. He got a fellowship at Columbia University and built Nine Track Express, a piece of software that downloaded data from 9-track tapes onto a PC and formatted it for database programs.
Rise of Programmer/Journalists
As the technology became cheaper and easier to use, more newsrooms began to accept it as just another part of the reporting process. With the advent of the Internet and the ability to put more information online for the audience, newspapers began to publish their databases online.
Mindy McAdams, who worked as a copy editor and content developer at the Washington Post between 1993 and 1995, described a database that the Post published online in the mid-’90s, before it even had a website. The software was designed to serve up pages of text with graphics embedded, with no capability for searching or embedding a database. To make the database available, staff had to import it into Word and build macros to clean it up and import it into the system.
Around the same time, the Philadelphia Inquirer produced a massive story about police misclassifying crimes. The paper published the police database online, and people in the community let reporters know what was wrong. It was an early example of providing tools for the public to do its own data analysis (or “crowdsourcing”), moving away from the hidden, static analysis done by reporters.
In 2002, Adrian Holovaty was publishing database projects at the Atlanta Journal-Constitution. He figured out how to make these databases searchable in a number of ways, modeling them after sites like Amazon and the Internet Movie Database, although most of them have since been taken down.
Holovaty explained to me how he saw the relationship between CAR and journalist/programmers:
My sense is that before the journalist/programmer idea became a thing, the same type of work was done by CAR specialists. CAR people historically have done great work/analysis that results in a newspaper article, which is just a static thing that doesn’t live or breathe beyond the initial publication date. So the ‘new’ thing is to get developers involved directly on the news staff.
I see a very clear progression from CAR to the programmer/journalist trend via the web. CAR is meant to be invisible. You analyze a database as part of the reporting process, but you don’t want to clog up a story with too many numbers. The ability to add details online has changed this process. Data has become a part of the story. And that’s the key connection between CAR and programming in journalism: data.
Matthew Waite, the developer behind PolitiFact, told me his evolution from computer-assisted reporter to programmer/journalist was “the natural evolution of someone who just keeps going with CAR skills.”
Now that scripting languages like Python, Ruby, PHP and Perl have become more robust, “the best programmer/journalists can extract context and meaning from data online,” Waite said.
Waite believes he is still in line with the movement that Meyer started, as far as pushing technology and technical literacy in journalism.
It’s unclear where this new form of journalism is heading. Is this the future of journalism or, like multimedia reporting, merely a part of the equation? I’ll be writing once a month on this topic to try to answer some of these questions and address issues and innovations in the progression of programming in journalism.
I’d like to thank Scott Maier, Sarah Cohen, Philip Meyer, Mindy McAdams, Michael Skoler, Matthew Waite, James Hamilton, Everette Dennis, Elliot Jaspin, Derek Willis, Dan Pacheco, Aron Pilhofer and Adrian Holovaty for their assistance on this piece.
Megan Taylor is a web journalist whose work focuses on combining traditional and computer-assisted information-gathering with multimedia production to create news packages online. Megan tells stories in English, HTML/CSS, ActionScript, PHP, photos, video and audio, and blogs at her personal site.
UNIVAC computer photo by Bernt Rostad via Flickr.
I agree with you that CAR journalists are also programmers, from SQL on up the tech ladder of difficulty. And I’m glad you have included my colleague, Matt Waite, in your piece as one of the most recent examples of how ingenuity and cutting-edge technical savvy can really bring out the stories in data. I just want to remind student and pro journalists alike that knowing your way around a spreadsheet can still yield awesome beat-level stories on a daily or weekly basis for print, online and broadcast platforms. There’s enough out there for journalists of all levels of tech expertise to keep on being watchdogs of public records, whether they’re initially obtained in electronic form or not.
Deb Wolfe, freelance technology training editor and IRE member based in St. Petersburg, Fla.