Algorithms are valuable in society: they allow us to access our money through ATMs, they control traffic lights, and they can even help determine how best to prioritize environmental remediation projects. But algorithms are increasingly being used to make important decisions, and left unchecked they can have unintended consequences, say two data science experts.
That’s because, although algorithms are designed with the goal of objectivity in mind, human bias can still be injected into them.
“As we automate everything, we need to make sure we build in the same kind of due process that we have for other types of decisions,” Julia Angwin, a senior reporter at ProPublica, said at a panel discussion hosted by New York University’s Arthur L. Carter Journalism Institute.
When Algorithms Fail
The best-known algorithm is likely Facebook’s: the mysterious code that social media editors long to crack and that determines what we’ll see in our news feeds.
Not only does Facebook’s algorithm contribute to the filter bubble, it is also programmed to incentivize users to stay on the site for as long as possible, because its priority is to create profit, not to educate or inform people, said Cathy O’Neil, the author of Weapons of Math Destruction and the writer behind the blog MathBabe.org.
The panelists pointed out that Google has its problems too.
Just after the 2016 presidential election, one of the top Google results was a link to a fake news website that incorrectly said Donald Trump had won the popular vote.
O’Neil set out to further test how Google’s machine learning algorithm would process a conspiratorial question. When she searched “Did the Holocaust really happen?”, four of the top six results were Holocaust-denial websites, results that reinforced the conspiracy rather than disproving it and illustrated how machine learning can be vulnerable to manipulation.
“Google was built on the premise of truth,” O’Neil said. “Now that people on the internet love lying, Google is screwed.”
She said the use of algorithms will likely become far more prevalent in the future.
“If we have not vetted any of them to make sure that they are legal or fair or even meaningful, then where are we going to be?” O’Neil said.
Angwin was part of a team of ProPublica reporters who spent a year investigating software used around the United States to predict whether people who have been arrested and convicted are likely to re-offend in the future, essentially the plot of Minority Report come to life. The algorithm assigns each individual a risk score, which is handed to the judge to use as a factor when sentencing that individual.
The ProPublica team found that the algorithm was accurate only about 60 percent of the time, and that it was biased against black offenders. In the roughly 40 percent of cases where it was wrong, it was twice as likely to label black offenders high-risk when they weren’t, and twice as likely to label white offenders low-risk when they weren’t. The algorithm doesn’t consider external factors such as the disparate rates of policing in minority communities, which leads to the biased outcomes, Angwin said. The underlying issue is how such algorithms should be optimized, and for what.
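To see what this kind of error-rate comparison looks like in practice, here is a minimal sketch in Python. The data and column names (`group`, `predicted_high_risk`, `reoffended`) are entirely hypothetical stand-ins; the real analysis used thousands of records from the actual risk-assessment tool and is far more involved.

```python
import pandas as pd

# Hypothetical data: each row is one person scored by the tool.
# "predicted_high_risk" is the tool's label; "reoffended" is what actually happened later.
df = pd.DataFrame({
    "group":               ["black"] * 4 + ["white"] * 4,
    "predicted_high_risk": [True, True, False, True, False, False, True, False],
    "reoffended":          [False, True, False, False, True, False, True, True],
})

for group, sub in df.groupby("group"):
    # False positive rate: labeled high-risk among those who did NOT re-offend.
    did_not_reoffend = sub[~sub["reoffended"]]
    fpr = did_not_reoffend["predicted_high_risk"].mean()

    # False negative rate: labeled low-risk among those who DID re-offend.
    did_reoffend = sub[sub["reoffended"]]
    fnr = (~did_reoffend["predicted_high_risk"]).mean()

    print(f"{group}: false positive rate = {fpr:.2f}, false negative rate = {fnr:.2f}")
```

Comparing those two rates across groups, rather than looking at overall accuracy alone, is what surfaces the kind of disparity Angwin describes.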
“It’s real people’s lives every single day that are being affected by this,” Angwin said. “That’s where you realize – you go into jail, got picked up for something minor, and they give you a high score, your life is going to change.”
O’Neil cited an example of an algorithm designed for teacher assessments that used data that was not statistically robust. At times there is little scientific basis to these so-called scientific objects, she said.
“We don’t know whether the data itself has integrity,” O’Neil said. “If we can’t trust the data, we definitely can’t trust the risk scores that are derived from the data.”
Yet another problem is that the algorithms themselves are often kept secret, which makes independent auditing even more difficult. Angwin pointed out that even when you can see an algorithm, it looks like a mathematical equation, so it’s still unclear whether it was designed fairly.
How To Cover Algorithms
O’Neil said that while it’s the obligation of data scientists to build ethical models, it’s the obligation of journalists to cover these issues and talk to the public.
Even for journalists who know little about data science or mathematics, the data that is fed into an algorithm can be telling.
An example: the inputs to commonly used risk-sentencing software include questions about whether the individual’s family members, friends, or even neighbors have ever been to prison, raising questions about the constitutionality of the outputs.
“Anyone can look at inputs and say, ‘I don’t know, is that even something that would be admissible if it wasn’t part of a number?’” Angwin said.
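To make that concrete, here is a minimal sketch, in Python, of the kind of input audit being described. The scoring function and its input fields are invented stand-ins for whatever proprietary model is under review; the point is only that holding everything else constant and toggling one input reveals how much weight that input carries.

```python
def risk_score(person: dict) -> float:
    """Invented stand-in for a proprietary risk model the auditor cannot inspect.
    In a real audit this would be a vendor tool or an API call."""
    score = 2.0
    if person["prior_arrests"] > 2:
        score += 1.5
    if person["family_member_incarcerated"]:
        score += 2.0
    return score

# Hold everything constant and toggle a single input to see how much it moves the score.
baseline = {"prior_arrests": 1, "family_member_incarcerated": False}
variant = {**baseline, "family_member_incarcerated": True}

print("baseline score:", risk_score(baseline))   # 2.0
print("variant score:", risk_score(variant))     # 4.0
print("contribution of that one input:", risk_score(variant) - risk_score(baseline))
```

An auditor, or a reporter, does not need to see the model’s internals to ask whether an input like that one belongs in a sentencing decision at all.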
Bianca Fortis is the associate editor at MediaShift, a founding member of the Transborder Media storytelling collective and a social media consultant. Follow her on Twitter @biancafortis.