This post originally appeared on Medium.
News organizations are increasingly experimenting with robot journalism, using computer programs to transform data into news stories, or news stories into multimedia presentations.
Most uses of robot journalism have been for fairly formulaic situations — company earnings reports, stock market summaries, earthquake alerts and youth sports stories. But inevitably, news companies will be testing automatic news writing on more challenging subjects.
What are the ethics of robot journalism? When editors consider using automated news writing, what issues of accuracy, quality and transparency arise?
Here’s my proposal for a checklist of what editors should ask. More input is welcome. (Disclosure: My company, The Associated Press, transmits automatically written stories generated by Automated Insights, a company that provides robot journalism technology and that AP previously invested in.)
● How accurate is the underlying data? Does the data consist of publicly announced numbers from companies, a stock exchange or government? If so, it’s probably safe for automatic crunching – with regular checks to make sure the data is being properly transmitted. However, not all data comes from such authoritative sources. If scores are being sent in by dads from their kids’ soccer games, how will you ensure the data is reliable?
● Do you have the rights to the data? Does the data provider have the legal right to send it to you? Do you have the further right to modify and publish it, and if so, on what platforms?
● Does the automation use the same phrasing for every story? Facts tell different stories. Your automation should be able to use different approaches, depending on what the data says.
● Will you disclose what you’re doing? At the least you should disclose to your readers that the story was automatically produced. Better: Provide a link that identifies the source of the data and the company that provides the automation, and that explains how the process works.
● Does the style of the automated reports match your style? Spellings, general writing style and capitalization should match the rest of your content. If they don’t, readers may be suspicious of copy that doesn’t feel like the rest of your journalism.
● Can you defend how the story was “written”? If people question the facts in a story or how they were organized, can you give an explanation (or get a quick answer from your data and automation providers)? “The computer did it” isn’t much of an explanation. If you try to encourage kid soccer players by having your software highlight goals and play down mistakes, are you prepared to disclose that?
● Who’s watching the machine? Problems with underlying data or automation software can create errors that quickly metastasize, potentially creating thousands of erroneous stories. Test the automated product thoroughly before anything is published. Even when publication begins, have a human editor check every story before it goes out. Once the product proves itself, stories can go out automatically with spot-checking by human editors.
● Are you considering automation that creates multimedia presentations? Some automated systems create video or photo displays to accompany text stories. If you use one, can you ensure that the system is accessing only imagery that you have a legal right to use? How will you make sure it doesn’t grab imagery that’s satirical, hateful or not in line with your standards of taste?
● Are you using software that reduces long articles to bullet points? Test it extensively to make sure it’s truly finding the important points. And find out if the software you’re considering requires the original article to be in a certain format – say, the inverted pyramid. Text written in other ways may yield poor results. (At AP, we tried dropping the Book of Genesis into an automatic summary program we were testing; the bullet points created by the program left out the Garden of Eden.)
● Are you ready for the next frontier? Automated journalism is likely to push its way eventually even into political and analytical stories. As it does, it will become ever more controversial. A politician may demand to know why he doesn’t make the lead more often; it’s not hard to imagine political activists – or parties to a legal case – demanding the source code behind automated coverage. Try to consider all the what-ifs in advance.
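For editors who want to see how a few of these questions look in practice, here is a minimal Python sketch of three items on the checklist: checking the data, varying the phrasing with what the data says, and deciding when a human must look at a story. It is an illustration only, not how AP or Automated Insights does it. Every name in it (GameResult, TRUSTED_SOURCES, the 30-goal threshold, the source labels, the disclosure wording) is a hypothetical assumption made for this example.

```python
import random
from dataclasses import dataclass

# Hypothetical labels: which data sources we treat as authoritative.
TRUSTED_SOURCES = {"league_office"}

# Hypothetical disclosure wording appended to every automated story.
DISCLOSURE = "This story was generated automatically from submitted scores."


@dataclass
class GameResult:
    home: str
    away: str
    home_score: int
    away_score: int
    source: str  # e.g. "league_office" or "parent_submission"


def validate(result: GameResult) -> list[str]:
    """Return a list of data problems; an empty list means none were found."""
    problems = []
    if result.home_score < 0 or result.away_score < 0:
        problems.append("negative score")
    if max(result.home_score, result.away_score) > 30:  # assumed sanity limit
        problems.append("implausibly high score")
    if result.source not in TRUSTED_SOURCES:
        problems.append("unverified source")
    return problems


def write_story(result: GameResult) -> str:
    """Choose phrasing based on what the data says, then add the disclosure."""
    margin = abs(result.home_score - result.away_score)
    if margin == 0:
        body = (f"{result.home} and {result.away} played to a "
                f"{result.home_score}-{result.away_score} draw.")
    else:
        home_won = result.home_score > result.away_score
        winner, loser = ((result.home, result.away) if home_won
                         else (result.away, result.home))
        high = max(result.home_score, result.away_score)
        low = min(result.home_score, result.away_score)
        if margin >= 4:
            verb = "rolled past"
        elif margin == 1:
            verb = "edged"
        else:
            verb = "beat"
        body = f"{winner} {verb} {loser} {high}-{low}."
    return f"{body} {DISCLOSURE}"


def needs_human_review(problems: list[str], trial_period: bool,
                       sample_rate: float, rng: random.Random) -> bool:
    """Review everything during the trial and anything with data problems;
    afterwards, spot-check a random share of the remaining stories."""
    if trial_period or problems:
        return True
    return rng.random() < sample_rate
```

In this sketch a score sent in by a parent is not rejected, but it is always routed to an editor, and the spot-check rate is a setting a newsroom would choose for itself. The phrasing rules are written out in plain sight, which is the point of the "Can you defend how the story was written?" question: an editor can read them and explain them.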
The best protection as you move further into robot news writing is a constant focus on testing, and on making sure editors understand how the software really works. Plus a recognition that many things are still best done by humans.
Parts of this checklist are based on a presentation of mine at the 2014 annual conference of the Global Editors Network.
Thomas Kent is the leader of the ONA “Make your own ethics code” project. He is the standards editor of The Associated Press and teaches international reporting at Columbia University’s School of International and Public Affairs. He tweets at @tjrkent.