AI

Resources – on automated systems and bias

Last updated: 13/12/2017

If you are a data scientist, a software developer, or work in the social and human sciences with an interest in digital humanities, then you’re no stranger to the ongoing discussions of how algorithms embed and perpetuate human biases. Ethical considerations and critical engagement are urgently needed.

I have been keenly following these discussions for a while, and this post is an attempt to put together the articles, books, book reviews, videos, interviews, Twitter threads and so on that I’ve come across, in one place, so they can be used as resources.

This list is by no means exhaustive, and as we become more and more aware of the catastrophic consequences of these technologies, more pieces, articles and journal papers are being written about them on a daily basis. I plan to update this site regularly. Also, if you think there is relevant material that I have not included, please leave it as a comment and I will add it.

Books

Weapons of math destruction: how big data increases inequality and threatens democracy by Cathy O’Neil. A great number of the articles on the list below are written by O’Neil. She is also active on Twitter, regularly posting links and interesting critical insights on everything to do with mathematical models and bias. Here is my own review of O’Neil’s book, itself with plenty of relevant links, and here is another excellent review of O’Neil’s book.

Algorithms of Oppression: How Search Engines Reinforce Racism by Safiya Umoja Noble – below is an excerpt from Noble’s book:

Run a Google search for “black girls”—what will you find? “Big Booty” and other sexually explicit terms are likely to come up as top search terms. But, if you type in “white girls,” the results are radically different. The suggested porn sites and un-moderated discussions about “why black women are so sassy” or “why black women are so angry” presents a disturbing portrait of black womanhood in modern society.
In Algorithms of Oppression, Safiya Umoja Noble challenges the idea that search engines like Google offer an equal playing field for all forms of ideas, identities, and activities. Data discrimination is a real social problem; Noble argues that the combination of private interests in promoting certain sites, along with the monopoly status of a relatively small number of Internet search engines, leads to a biased set of search algorithms that privilege whiteness and discriminate against people of color, specifically women of color.


Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian and Tom Griffiths. This book is concerned with the workings of the human mind and how computer science can help human decision making. Here is a post by Artem Kaznatcheev on Computational Kindness which might give you a glimpse of some of the issues the book covers. Here is a long interview with Brian Christian and Tom Griffiths, and a TED Talk with Tom Griffiths on The Computer Science of Human Decision Making.

The Black Box Society: The Secret Algorithms That Control Money and Information by Frank Pasquale. You can read the introduction and conclusion chapters of his book here, and here is a good review of Pasquale’s book. You can follow his Twitter stream here.

Technically Wrong: Sexist Apps, Biased Algorithms, and Other Threats of Toxic Tech by Sara Wachter-Boettcher

Here is a synopsis: A revealing look at how tech industry bias and blind spots get baked into digital products—and harm us all.

Buying groceries, tracking our health, finding a date: whatever we want to do, odds are that we can now do it online. But few of us ask why all these digital products are designed the way they are. It’s time we change that. Many of the services we rely on are full of oversights, biases, and downright ethical nightmares: Chatbots that harass women. Signup forms that fail anyone who’s not straight. Social media sites that send peppy messages about dead relatives. Algorithms that put more black people behind bars.

Sara Wachter-Boettcher takes an unflinching look at the values, processes, and assumptions that lead to these and other problems. Technically Wrong demystifies the tech industry, leaving those of us on the other side of the screen better prepared to make informed choices about the services we use—and demand more from the companies behind them.

Paula Boddington, Oxford academic and author of Towards a Code of Ethics for Artificial Intelligence, recommends the five best books on Ethics for Artificial Intelligence. Here is the full interview with Nigel Warburton, published on December 1, 2017.

TED Talks, podcasts, and interviews 

The era of blind faith in big data must end TED Talk by Cathy O’Neil, April 2017

Machine intelligence makes human morals more important November 11, 2017. In this TED Talk, Zeynep Tufekci emphasizes the importance of human values and ethics in the age of machine intelligence and algorithmic decision making.

We’re building an artificial intelligence-powered dystopia, one click at a time, another thought provoking TED Talk from techno-sociologist Zeynep Tufekci.

How I’m fighting bias in algorithms TED Talk – MIT researcher Joy Buolamwini, November 2016

Data is the new gold, who are the new thieves? TED Talk – Tijmen Schep 2016

O’Neil’s interview with the Politics Weekly podcast (starts 30 mins in) July 5, 2017. O’Neil calls for public awareness of how algorithms are used, often without our knowledge, in job interviews, for example, and explains why we should question and interrogate these algorithms, which are often presented to us as authoritative.

A short interview with Frank Pasquale on his book Black Box Society May 12, 2016. Pasquale emphasizes the opaqueness of algorithms and argues why we should demand transparency.

A two-minute video, a prototype example of algorithms being used in recruitment. A working example of the kind of dangerous AI used for recruiting that experts such as O’Neil constantly warn against. This post provides a critical analysis of why such endeavors are futile and dangerous. Here’s another related video on how facial recognition technology will go mainstream in 2018. In fact, such technology has already gone mainstream in China. Here is a short video in which a BBC reporter experimented with the world’s largest surveillance system.

Tom Chatfield on Critical Thinking October 2, 2017. In this philosophically themed podcast, Chatfield discusses issues such as “how new digital realities interact with old human biases” with Dave Edmonds.

When algorithms discriminate: Robotics, AI and ethics November 18, 2017. Stephen Roberts, professor of computer science at the University of Oxford, discusses the threats and promises of artificial intelligence and machine learning with Al Jazeera.

Here is a series of talks, from the ABC Boyer Lectures, hosted by Professor Genevieve Bell. The series is called Fast, Smart and Connected: What is it to be Human, and Australian, in a Digital World? The issues discussed include “How to build our digital future.”

Websites

Social Cooling is a term that refers to the gradual, long-term negative side effects of living in a digital society where our digital activities are tracked and recorded. Awareness of potentially being scored by algorithms leads to gradual behaviour change: self-censorship and self-surveillance. Here is a piece on what looks like social cooling in action. The website itself has plenty of resources that can aid critical thinking and touches on big philosophical, economic and societal questions in relation to data and privacy.

www.socialcooling.com

For those interested in critical thinking, data and models, Calling Bullshit offers various resources and tools for spotting and calling bullshit. This website, developed for a course entitled ‘Calling Bullshit’, is a great place to explore and learn about all things “data reasoning for the digital age”.

Another important website worth a mention here is the Algorithmic Justice League, where you can report algorithmic bias, participate in testing software for inclusive training sets, or simply donate and contribute to raising awareness about existing bias in coded systems. With a somewhat similar aim is the Data Harm Record website – a running record of harms that have been caused by uses of big data.

fast.ai is a project that aims to increase diversity in the field of deep learning and make deep learning accessible and inclusive to all. Critical Algorithm Studies: a Reading List is a great website with links to plenty of critical literature on algorithms as social concerns. Here is the Social Media Collective Reading List, where you’ll find further material on Digital Divide/Digital Inclusion and Metaphors of Data.

The AI Now Institute at New York University is an interdisciplinary research center dedicated to understanding the social implications of artificial intelligence. Data & Society is a research institute focused on the social and cultural issues arising from data-centric technological developments.  FAT/ML is a website on Fairness, Accountability, and Transparency in Machine Learning with plenty of resources and events, run by a community of researchers.

ConceptNet Numberbatch 17.04: better, less-stereotyped word vectors This is not a website but a blog post. I am putting it here with the other websites because the author offers some solutions for reducing bias when building algorithms for natural language understanding, beyond simply stating that such algorithms are biased.
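
To make the underlying issue concrete, here is a toy sketch of the kind of association test used to detect such bias in word embeddings. The vectors below are invented three-dimensional toys, not ConceptNet’s; debiasing efforts like the one described in that post aim to shrink these similarity gaps for words that should be gender-neutral.

```python
# Toy illustration of a word-vector bias probe, using invented 3-d vectors
# rather than real embeddings: does an occupation word sit closer to "he"
# than to "she" in the vector space?
import numpy as np

vectors = {
    "he":         np.array([0.9, 0.1, 0.0]),
    "she":        np.array([0.1, 0.9, 0.0]),
    "programmer": np.array([0.8, 0.2, 0.1]),  # deliberately skewed towards "he"
    "nurse":      np.array([0.2, 0.8, 0.1]),  # deliberately skewed towards "she"
}

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

for word in ("programmer", "nurse"):
    gap = cosine(vectors[word], vectors["he"]) - cosine(vectors[word], vectors["she"])
    print(f"{word}: similarity to 'he' minus similarity to 'she' = {gap:+.2f}")
```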

Auditing Algorithms – a useful website for those teaching or interested in accountability in automated systems. The site includes film festivals, videos, etc.

Biology/genetics  – (Digital phrenology?) 

It is difficult to draw a line and put certain articles under the category of social, biological, political, or other as they seem to be somehow all interlinked. Nonetheless, I think the following articles can loosely be described as dealing with biological/genetics material. Towards the end of this post, I have also thematized some articles under the category of ‘political’.

In a recent preprint paper, “Deep Neural Networks Can Detect Sexual Orientation From Faces” (here are the Guardian and the Economist reports), Yilun Wang and Michal Kosinski claimed that their deep neural network can be trained to discern individuals’ sexual orientations from their photographs. The paper has attracted, and continues to attract, massive attention and has generated numerous responses, outrage and discussion. Here is an in-depth analysis from Calling Bullshit, here a detailed technical assessment, and here a comprehensive and eloquent response from Greggor Mattson. Here is another response, another one here from a data scientist’s perspective, and another recent response from O’Neil here. If you only want to read one response, I highly recommend Mattson’s. There have been plenty of discussions and threads on Twitter – here and here are a couple of examples. It is worth noting that Kosinski, one of the authors of the above paper, is listed as one of the advisers for a company called Faception, an Israeli security firm that promises clients it can deploy “facial personality profiling” to catch pedophiles and terrorists, among others.

When advanced technologies in genetics and face recognition are applied with the assumption that “technology is neutral”, the consequences are often catastrophic and dangerous. These two pieces, Sci-fi crime drama with a strong black lead and Traces of Crime: How New York’s DNA Techniques Became Tainted, provide some in-depth analysis of such consequences.

Physiognomy’s New Clothes is a comprehensive and eloquent piece and well worth your time. Physiognomy, the practice of using people’s outer appearance to infer inner character, is now as discredited and discarded as phrenology. However, this piece illustrates how the practice is alive and well in the era of big data and machine learning. Here is more on the Wu and Zhang paper that the Physiognomy’s New Clothes authors cover in the above piece.

General articles on various automated systems and bias, discrimination, unfairness, ethical concerns, etc., listed in order of publication date, starting with the most recent.

Artificial Intelligence Seeks An Ethical Conscience December 7, 2017

Australian media watchdog to investigate Google and Facebook December 5, 2017

Why Autocomplete Is Only Funny for Those Who Can Afford It by Safiya Umoja Noble: December 4, 2017

Predictive algorithm under wraps December 3, 2017

Artificial intelligence doesn’t have to be evil. We just have to teach it to be good November 30, 2017

Frank Pasquale testifies (video, written testimony) before the United States House of Representatives Committee on Energy and Commerce Subcommittee on Digital Commerce and Consumer Protection in relation to “Algorithms: How Companies’ Decisions About Data and Content Impact Consumers”. Here is more written testimony on Algorithmic Transparency from the Electronic Privacy Information Center – November 29, 2017

U.S. House Hearing on Algorithms & Big Data: 5 Takeaways for Schools November 29, 2017

Facebook to temporarily block advertisers from excluding audiences by race November 29, 2017

Why We Had to Buy Racist, Sexist, Xenophobic, Ableist, and Otherwise Awful Facebook Ads November 27, 2017

Facebook hasn’t done enough to tell customers they were duped by Russian propaganda November 25, 2017

Facebook (still) letting housing advertisers exclude users by race November 21, 2017

Tim Berners-Lee on the future of the web: ‘The system is failing’ November 16, 2017

Ray Dalio has an unbelievable algorithm November 15, 2017

How One Woman’s Digital Life Was Weaponized Against Her November 14, 2017

Maybe Facebook Is Broken. How can you stop people from sharing biased and misleading stuff? November 7, 2017

Bringing A.R.T. to A.I. November 6, 2017

Computer says no: why making AIs fair, accountable and transparent is crucial November 5, 2017

Why we need a 21st-century Martin Luther to challenge the church of tech October 29, 2017

Facebook must face local data protection regulations, EU court opinion finds October 25, 2017

Key GDPR Guidance on Behavioral Advertising, Profiling and Automated Decision-Making October 24, 2017

It’s time for more transparency in A.I. October 24, 2017

Federal judge unseals New York crime lab’s software for analyzing DNA evidence October 20, 2017

AI Experts Want to End ‘Black Box’ Algorithms in Government October 18, 2017

Estonia Proposes Bill of Rights and Responsibilities for Robots October 17, 2017

Asking the Right Questions About AI October 12, 2017

Google’s AI chief says forget Elon Musk’s killer robots, and worry about bias in AI systems instead October 3, 2017

Researchers Are Upset That Twitter Is Dismissing Their Work On Election Interference October 3, 2017

Facebook’s Ad Scandal Isn’t a ‘Fail,’ It’s a Feature September 23, 2017

BBC News – Facebook can’t hide behind algorithms September 22, 2017

Data power could make 1984 ‘look like a Teddy bear’s picnic’ September 21, 2017

Machines Taught by Photos Learn a Sexist View of Women September 21, 2017

AI Research Is in Desperate Need of an Ethical Watchdog September 18, 2017

Getting serious about research ethics: AI and machine learning September 18, 2017

Machines are getting schooled on fairness September 16, 2017

Facebook and Google, show us your ad data Understanding how they influence us is crucial to the future of our democracy. September 13, 2017

Understanding Bias in Algorithmic Design Human judgement lies behind every data-driven decision. Left unexamined, value-laden software can have unintended discriminatory effects. September 6, 2017

Report: Britain’s Cops Have Big Data But Not Big Analysis September 6, 2017

Turns out algorithms are racist August 31, 2017

AI programmes are learning to exclude some African American voices August 16, 2017

FaceApp Is Very Excited About Its New Line of Ultra-Racist Filters August 8, 2017

Rise of the racist robots – how AI is learning all our worst impulses August 8, 2017

Artificial intelligence ethics the same as other new technology  July 29, 2017

Technology is biased too. How do we fix it? July 20, 2017

How can we stop algorithms telling lies? July 16, 2017

Lack of ethics education for computer programmers shocks expert July 2, 2017

Facebook’s secret censorship rules protect white men from hate speech but not black children June 28, 2017

We need to shine more light on algorithms so they can help reduce bias, not perpetuate it June 12, 2017

How to Call B.S. on Big Data: A Practical Guide June 3, 2017

Pitfalls of artificial intelligence decision-making highlighted in Idaho ACLU case  June 2, 2017

The bigot in the machine: Tackling big data’s inherent biases June 1, 2017

Secret algorithms threaten the rule of law June 1, 2017

Algorithms aren’t racist. Your skin is just too dark. May 29, 2017

‘A white mask worked better’: why algorithms are not colour blind May 28, 2017

On Facebook May 7, 2017

AI & Machine Learning Black Boxes: The Need for Transparency and Accountability: April 25, 2017

FaceApp sorry for ‘racist’ filter that lightens skin to make users ‘hot’ April 25, 2017

Robots are racist and sexist. Just like the people who created them April 20, 2017

How artificial intelligence learns to be racist April 17, 2017

Courts are using AI to sentence criminals. That must stop now. April 17, 2017

An AI stereotype catcher April 14, 2017

AI picks up racial and gender biases when learning from what humans write April 13, 2017

AI programs exhibit racial and gender biases, research reveals April 13, 2017

AI learns gender and racial biases from language April 13, 2017

Will the future be full of biased robots? March 31, 2017

Algorithms can be pretty crude toward women March 24, 2017

Algorithms learn from us, and we can be better teachers March 13, 2017

Data-driven crime prediction fails to erase human bias March 8, 2017

Big data, big problems – interview with Cathy O’Neil March 1, 2017

How to Keep Your AI From Turning Into a Racist Monster February 13, 2017

Code-Dependent: Pros and Cons of the Algorithm Age February 6, 2017

We put too much trust in algorithms and it’s hurting our most vulnerable December 29, 2016

Be Healthy or Else: How Corporations Became Obsessed with Fitness Tracking December 27, 2016

Discrimination by algorithm: scientists devise test to detect AI bias December 19, 2016

A simplified political history of Big Data December 16, 2016

Hiring Algorithms Are Not Neutral December 9, 2016

How Algorithms Can Bring Down Minorities’ Credit Scores December 2, 2016

Put Away Your Machine Learning Hammer, Criminality Is Not A Nail November 29, 2016

The Foundations of Algorithmic Bias November 7, 2016

Unregulated Use of Facial Recognition Software Could Curb 1st Amendment Rights October 30, 2016

Should we trust predictive policing software to cut crime? October 27, 2016

Google researchers aim to prevent AIs from discriminating October 7, 2016

How algorithms rule our working lives September 1, 2016

White House plan to use data to shrink prison populations could be a racist dumpster fire July 1, 2016

Is criminality predictable? Should it be? June 30, 2016

Artificial Intelligence’s White Guy Problem June 25, 2016

In Wisconsin, a Backlash Against Using Data to Foretell Defendants’ Futures June 22, 2016

Algorithmic risk-assessment: hiding racism behind “empirical” black boxes May 24, 2016


There’s software used across the country to predict future criminals. And it’s biased against blacks. May 23, 2016. The company that sells this program (Northpointe) has responded to the criticisms here, asserting that the software, which predicts the likelihood that a person will commit future crimes, is equally fair to black and white defendants. Following that response, Jeff Larson and Julia Angwin wrote a Technical Response to Northpointe in which they re-examined the data. They argue that they have considered the company’s criticisms and stand by their conclusions.
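
The disagreement largely turns on which notion of fairness is being measured: ProPublica compared error rates (in particular, false positive rates) between black and white defendants, while Northpointe pointed to the scores being equally predictive across groups. A minimal sketch of the error-rate comparison, on entirely made-up records rather than the COMPAS data, might look like this:

```python
# Hypothetical illustration of an error-rate audit: compare the false positive
# rate of a "high risk" label across two groups. All records below are invented.

def false_positive_rate(records):
    """Share of people who did NOT reoffend but were labelled high risk."""
    non_reoffenders = [r for r in records if not r["reoffended"]]
    flagged = [r for r in non_reoffenders if r["high_risk"]]
    return len(flagged) / len(non_reoffenders)

records = [
    {"group": "A", "high_risk": True,  "reoffended": False},
    {"group": "A", "high_risk": True,  "reoffended": False},
    {"group": "A", "high_risk": False, "reoffended": False},
    {"group": "A", "high_risk": True,  "reoffended": True},
    {"group": "B", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": False, "reoffended": False},
    {"group": "B", "high_risk": True,  "reoffended": False},
    {"group": "B", "high_risk": True,  "reoffended": True},
]

for group in ("A", "B"):
    subset = [r for r in records if r["group"] == group]
    print(group, round(false_positive_rate(subset), 2))
```

In general, a score cannot satisfy both criteria at once when the groups’ base rates differ – the impossibility result formalised in the Chouldechova paper listed in the scholarly section below.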

Python Meets Plato: Why Stanford Should Require Computer Science Students to Study Ethics May 16, 2016

The Real Bias Built In at Facebook May 19, 2016

Twitter taught Microsoft’s friendly AI chatbot to be a racist asshole in less than a day March 24, 2016

The Iron Cage in binary code: How Facebook shapes your life chances – Sociological Images: December 30, 2015

As World Crowds In, Cities Become Digital Laboratories December 11, 2015

Google Photos Tags Two African-Americans As Gorillas Through Facial Recognition Software July 1, 2015

How big data is unfair September 26, 2014

Facebook reveals news feed experiment to control emotions June 30, 2014

The Hidden Biases in Big Data by Kate Crawford April 1, 2013

Politics

Algorithmic processes and politics might seem far removed from each other. However, if anything, the recent political climate is indicative of how algorithms can serve as computational tools for political agendas. Here and here are exemplar Twitter threads that highlight particular Twitter accounts used as tools for a political agenda. The articles below are, in some way or another, related to algorithms in the political arena.

Inside the world of Brazil’s social media cyborgs December 13, 2017

How Rodrigo Duterte turned Facebook into a weapon, with a little help from Facebook December 7, 2017

More than a Million Pro-Repeal Net Neutrality Comments were Likely Faked November 23, 2017

Extreme Vetting by Algorithm November 20, 2017

How a half-educated tech elite delivered us into evil November 19, 2017

Do Facebook and Google have control of their algorithms anymore? A sobering assessment and a warning November 14, 2017

‘Way too little, way too late’: Facebook’s factcheckers say effort is failing November 13, 2017

How to Fool Americans on Twitter November 6, 2017

Russia funded Facebook and Twitter investments through Kushner associate November 5, 2017

Opinion | Silicon Valley Can’t Destroy Democracy Without Our Help November 2, 2017

When Data Science Destabilizes Democracy and Facilitates Genocide November 2, 2017

Facebook estimates 126 million people were served content from Russia-linked pages October 31, 2017

Russian content on Facebook, Google and Twitter reached far more users than companies first disclosed, congressional testimony says October 30, 2017

‘Downright Orwellian’: journalists decry Facebook experiment’s impact on democracy October 25, 2017

A Suspected Network of 13,000 Twitter Bots Pumped Out Pro-Brexit Messages In The Run-Up To The EU Vote October 20, 2017

How People Inside Facebook Are Reacting To The Company’s Election Crisis October 20, 2017

Facebook treats its ethical failures like software bugs, and that’s why they keep happening October 20, 2017

Tech Giants, Once Seen as Saviors, Are Now Viewed as Threats October 12, 2017

Russia Probe Now Investigating Cambridge Analytica, Trump’s ‘Psychographic’ Data Gurus October 10, 2017

Google uncovers Russian-bought ads on YouTube, Gmail and other platforms October 9, 2017

Facebook cut references to Russia from a report in April about election influence October 5, 2017

Russian Facebook ads: 70 million people may have seen them October 4, 2017

Google and Facebook Have Failed Us – The Atlantic October 2, 2017

Facebook and Google promote politicized fake news about Las Vegas shooter October 2, 2017

Social media companies must respond to the sinister reality behind fake news October 1, 2017

Zuckerberg’s Preposterous Defense of Facebook September 29, 2017

“Fake news” tweets targeted to swing states in election, researchers find September 28, 2017

As Google Fights Fake News, Voices on the Margins Raise Alarm September 26, 2017

Facebook blocked an ad for a march against white supremacy: September 25, 2017

Hillary Clinton says Kenya’s annulled election was a “project” of a controversial US data firm September 19, 2017

Facebook enabled advertisers to reach “Jew haters” September 14, 2017

Facebook and Google, show us your ad data Understanding how they influence us is crucial to the future of our democracy. September 13, 2017

RT, Sputnik and Russia’s New Theory of War September 13, 2017

American politics needs new rules for the Facebook era  September 12, 2017

Russia’s Facebook Fake News Could Have Reached 70 Million Americans September 8, 2017

Forum Q&A: Philip Howard on Computational Propaganda’s Challenge to Democracy July 25, 2017. “Computational propaganda, or the use of algorithms and automated social media accounts to influence politics and the flow of information, is an emerging challenge to democracy in the digital age. Using automated social media accounts called bots (or, when networked, botnets), a wide array of actors including authoritarian governments and terrorist organizations are able to manipulate public opinion by amplifying or repressing different forms of political content, disinformation, and hate speech.”

WhatsApp and Facebook are driving Kenya’s fake news cycle July 24, 2017

GOP Data Firm Accidentally Leaks Personal Details of Nearly 200 Million American Voters June 19, 2017

Voter profiling in the 2017 Kenyan election June 6, 2017

The great British Brexit robbery: how our democracy was hijacked May 7, 2017

Confronting a Nightmare for Democracy May 4, 2017

30 million Facebook users had their data harvested by Trump campaign affiliate March 30, 2017

Robert Mercer: the big data billionaire waging war on mainstream media February 26, 2017

Revealed: how US billionaire helped to back Brexit February 26, 2017

The Truth About The Trump Data Team That People Are Freaking Out About February 16, 2017

The Data That Turned the World Upside Down Jan 28, 2017

Inside the Trump Bunker, With Days to Go. Win or lose, the Republican candidate and his inner circle have built a direct marketing operation that could power a TV network—or finish off the GOP. October 27, 2016

Facebook wants you to vote on Tuesday. Here’s how it messed with your feed in 2012. October 31, 2014

For a more scholarly read 

Barocas, S., & Selbst, A. D. (2016). Big data’s disparate impact.

Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).

Caliskan-Islam, A., Bryson, J. J., & Narayanan, A. (2016). Semantics derived automatically from language corpora necessarily contain human biases. arXiv preprint arXiv:1608.07187.

Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. arXiv preprint arXiv:1703.00056. (PDF)

Datta, A., Sen, S., & Zick, Y. (2016, May). Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In Security and Privacy (SP), 2016 IEEE Symposium on (pp. 598-617). IEEE. (PDF)

Datta, A., Tschantz, M. C., & Datta, A. (2015). Automated experiments on ad privacy settings. Proceedings on Privacy Enhancing Technologies, 2015(1), 92-112.

Friedman, B., & Nissenbaum, H. (1996). Bias in computer systems. ACM Transactions on Information Systems (TOIS), 14(3), 330-347.

Monahan, J., & Skeem, J. L. (2016). Risk assessment in criminal sentencing. Annual Review of Clinical Psychology, 12, 489-513.

Munoz, C., Smith, M., & Patil, D. (2016). Big data: A report on algorithmic systems, opportunity, and civil rights. Executive Office of the President. The White House.

Yeung, K. (2017). Algorithmic Regulation: A Critical Interrogation.

Zafar, M. B., Valera, I., Gomez Rodriguez, M., & Gummadi, K. P. (2017, April). Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web (pp. 1171-1180). International World Wide Web Conferences Steering Committee.


The dark side of Big Data – how mathematical models increase inequality. My review of O’Neil’s book ‘WMD’

We live in the age of algorithms. Where the internet is, algorithms are. The apps on our phones are the products of algorithms. The GPS system can bring us from point A to point B thanks to algorithms. More and more decisions affecting our daily lives are handed over to automation. Whether we are applying for college, seeking jobs, or taking out loans, mathematical models are increasingly involved in the decision making. They pervade schools, the courts, the workplace, and even the voting process. We are continually ranked, categorised, and scored by hundreds of models, on the basis of our revealed preferences and patterns: as shoppers and couch potatoes, as patients and loan applicants. And very little of this do we see – even in applications we happily sign up for.

More and more decisions are handed over to algorithms, which in theory should mean that human biases and prejudices are eliminated. Algorithms are, after all, “neutral” and “objective”. They apply the same rules to everybody regardless of race, gender, ethnicity or ability. However, this couldn’t be further from the truth. In fact, mathematical models can be, and in some cases have been, tools that further inequality and unfairness. O’Neil calls these kinds of models Weapons of Math Destruction (WMDs). These models are biased and unfair as they encode poisonous prejudices, learning from past records just how to be unfair. They punish racial and ethnic minorities, low-wage workers, and women, as if they were designed expressly to keep them down. As the world of data continues to expand, with each of us producing ever-growing streams of updates about our lives, so do prejudice and unfairness.

Mathematical models have revolutionised the world, and efficiency is their hallmark; they aren’t just tools that create and distribute bias, unfairness and inequality. In fact, models by their nature are neither good nor bad, neither fair nor unfair, neither moral nor immoral – they are simply tools. The sports domain is a good example of mathematical models as a force for good. For some of the world’s most competitive baseball teams today, competitive advantages and wins depend on mathematical models. Managers make decisions that sometimes involve moving players across the field based on analysis of historical data and the current situation, calculating the positioning associated with the highest probability of success.

There are crucial differences, however, between models such as those used by baseball managers and WMDs. While the former are transparent and constantly updated with feedback, the latter are opaque, inscrutable black boxes. Furthermore, while the baseball analytics engines manage individuals, each one potentially worth millions of dollars, companies hiring minimum-wage workers are, by contrast, managing herds. Their objective is to optimise profit, so they slash expenses by replacing human resources professionals with machines that filter large populations into manageable groups. Unlike the baseball models, these companies have little reason – say, plummeting productivity – to tweak their filtering models. O’Neil’s primary focus in the book is on models that are opaque and inscrutable, used within powerful institutions and industries, and that create and widen inequalities – WMDs, “the dark side of Big Data”.

The book contains crucial insights (or haunting warnings, depending on how you choose to approach it) into the catastrophic directions in which mathematical models used in the social sphere are heading. And it couldn’t come from a more credible and experienced source: a Harvard-trained mathematician who went on to work as a quant for D. E. Shaw, a leading hedge fund, and then as a data scientist, among other things.

One of the most persistent themes of O’Neil’s book is that the central objectives of a given model are crucial. In fact, objectives determine whether a model becomes a tool that helps the vulnerable or one that is used to punish them. WMDs’ objectives are often to optimise efficiency and profit, not justice. This, of course, is the nature of capitalism. And WMDs’ efficiency comes at the cost of fairness: they become biased, unfair, and dangerous. The destructive loop goes around and around, and in the process the models become more and more unfair.

Legal traditions lean strongly towards fairness … WMDs, by contrast, tend to favour efficiency. By their very nature, they feed on data that can be measured and counted. But fairness is squishy and hard to quantify. It is a concept. And computers, for all their advances in language and logic, still struggle mightily with concepts. They “understand” beauty only as a word associated with the Grand Canyon, ocean sunsets, and grooming tips in Vogue magazine. They try in vain to measure “friendship” by counting likes and connections on Facebook. And the concept of fairness utterly escapes them. Programmers don’t know how to code for it, and few of their bosses ask them too. So fairness isn’t calculated into WMDs and the result is massive, industrial production of unfairness. If you think of a WMD as a factory, unfairness is the black stuff belching out of the smoke stacks. It’s an emission, a toxic one. [94-5]

The prison system is a startling example of WMDs being used to further reinforce structural inequalities and prejudices. In the US, for example, those imprisoned are disproportionately poor and of colour. Being a black male in the US makes you nearly seven times more likely to be imprisoned than if you were a white male. Are such convictions fair? Many different lines of evidence suggest otherwise. Black people are arrested more often, judged guilty more often, treated more harshly by correctional officers, and serve longer sentences than white people who have committed the same crime. The black imprisonment rate for drug offences, for example, is 5.8 times higher than it is for whites, despite a roughly comparable prevalence of drug use.

Prison systems, which are awash in data, hardly carry out important research, such as asking why non-white prisoners from poor neighbourhoods are more likely to commit crimes, or what alternative ways of looking at the same data might reveal. Instead, they use data to justify the workings of the system and further punish those who are already at a disadvantage. Questioning the workings of the system, or asking how the prison system could be improved, is almost never considered. If, for example, building trust were the objective, an arrest may well become the last resort, not the first. Trust, like fairness, O’Neil explains, is hard to quantify and presents a great challenge to modellers, even when the intention is there to consider such a concept as part of the objective.

Sadly, it’s far simpler to keep counting arrests, to build models that assume we’re birds of a feather and treat us as such… Innocent people surrounded by criminals get treated badly. And criminals surrounded by law-abiding public get a pass. And because of the strong correlation between poverty and reported crime, the poor continue to get caught up in these digital dragnets. The rest of us barely have to think about them. [104]

Insofar as these models rely on barely tested insights, they are in a sense not that different from phrenology – digital phrenology. The practice of using outer appearance to infer inner character, which in the past justified slavery and genocide, has been outlawed and is considered pseudoscience today. Scientific racism has entered a new era, with the appearance of justified “objectivity” provided by machine-learned models embedding human biases. “Scientific” criminological approaches now claim to “produce evidence for the validity of automated face-induced inference on criminality”. However, what these machine-learned “criminal judgements” pick up on, more than anything, is systematic unfairness.

A model that profiles us by our circumstances helps create the environment that justifies its assumptions. The streams of data we produce serve as insights into our lives and behaviours. Instead of testing whether these insights stand up to scientific scrutiny, the data we produce are used to justify the modellers’ assumptions and to reinforce pre-existing prejudice. And the feedback loop goes on.
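
To see how such a loop can run away with an initial assumption, here is a toy simulation with entirely invented numbers and an arbitrary prioritisation rule (nothing here reproduces any real policing model): two neighbourhoods have identical true offence rates, but scrutiny is allocated according to last round’s recorded offences, and offences only enter the data where scrutiny is directed.

```python
# Toy simulation of a self-reinforcing feedback loop (all numbers invented).
TRUE_OFFENCES = 100                 # identical underlying rate in both areas
recorded = {"A": 60.0, "B": 40.0}   # an initial, unjustified disparity

def scrutiny_share(counts):
    """Allocate attention disproportionately to the 'worse-looking' area."""
    weights = {n: c ** 2 for n, c in counts.items()}  # arbitrary sharpening rule
    total = sum(weights.values())
    return {n: w / total for n, w in weights.items()}

for step in range(1, 6):
    share = scrutiny_share(recorded)
    # Only observed offences enter next round's data.
    recorded = {n: TRUE_OFFENCES * share[n] for n in recorded}
    print(step, {n: round(v) for n, v in recorded.items()})
```

Within a few rounds virtually all the attention, and therefore all the recorded offences, end up in the neighbourhood that happened to start with the higher count; the model never sees the evidence that would correct its assumption.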

When I consider the sloppy and self-serving ways that companies use data, I am often reminded of phrenology… Phrenology was a model that relied on pseudoscientific nonsense to make authoritative pronouncements, and for decades it went untested. Big Data can fall into the same trap. [121-2]

Hoffman, in 1896, published a 330-page report in which he used exhaustive statistics to support a claim as pseudoscientific and dangerous as phrenology. He made the case that the lives of black Americans were so precarious that the entire race was uninsurable. Not only were Hoffman’s statistics deeply flawed; like the creators of many of the WMDs O’Neil discusses throughout the book, he also confused correlation with causation. The voluminous data he gathered served only to confirm his thesis: that race is a powerful predictor of life expectancy. Furthermore, Hoffman failed to separate the “Black” population into different geographical, social or economic cohorts, blindly assuming that the whole “Black” population was a homogeneous group.

This cruel industry has now been outlawed. Nonetheless, the unfair and discriminatory practices remain and are still practised, but in a far subtler form: they are now coded into the latest generations of WMDs and obfuscated under complex mathematics. Like Hoffman, the creators of these new models confuse correlation with causation, and they punish the struggling classes and racial and ethnic minorities. And they back up their analysis with reams of statistics, which give them the studied air of “objective science”.

What is even more frightening is that, as oceans of behavioural data continue to feed straight into artificial intelligence systems, this will, for the most part, unfortunately remain a black box to the human eye. We will rarely learn about the classes we have been categorised into, or why we were put there, and these opaque models are often as much a black box to those who design them. In any case, many companies go out of their way to hide the results of their models, and even their existence.

In the era of machine intelligence, most of the variables will remain a mystery... automatic programs will increasingly determine how we are treated by other machines, the ones that choose the ads we see, set prices for us, line us up for a dermatologist appointment, or map our routes. They will be highly efficient, seemingly arbitrary, and utterly unaccountable. No one will understand their logic or be able to explain it. If we don’t wrest back a measure of control, these future WMDs will feel mysterious and powerful. They’ll have their way with us, and we’ll barely know it is happening. [173]

In the current insurance system (at least as far as the US is concerned), the auto insurers’ tracking systems, which provide insurers with more information and enable them to create more powerful predictions, are opt-in. Only those willing to be tracked have to turn on their black boxes. Those who do turn them on get rewarded with discounts, while the rest subsidise those discounts with higher rates. Insurers who squeeze the most intelligence out of this information, turning it into profits, will come out on top. This, unfortunately, undermines the whole idea of the collectivisation of risk on which insurance systems are based. The more insurers benefit from such data, the more of it they demand, gradually making trackers the norm. Consumers who want to withhold all but the essential information from their insurers will pay a premium. Privacy, increasingly, will come at a cost. A recently approved US bill illustrates just that: it would expand the reach of “wellness programs” to include genetic screening of employees and their dependents and increase the financial penalties for those who choose not to participate.

Being poor in a world of WMDs is getting more and more dangerous and expensive. Even privacy is increasingly becoming a luxury that only the wealthy can afford. In a world which O’Neil calls a ‘data economy’, where artificial intelligence systems are hungry for our data, we are left with few options but to produce and share as much data about our lives as possible. We are, in the process, implicitly or explicitly, coerced into self-monitoring and self-discipline as we continually attempt to conform to the ideal bodies and “normal” health statuses dictated by the organisations and institutions that handle and manage, say, our health insurance. Raley (2013) refers to this as dataveillance: a form of continuous surveillance through the use of (meta)data. The ever-growing flow of data, including data pouring in from the Internet of Things – the Fitbits, Apple Watches, and other sensors that relay updates on how our bodies are functioning – continues to contribute towards this “dataveillance”.

One might argue that helping people deal with their weight and health issues isn’t such a bad thing, and that would be a reasonable argument. However, the key question here, as O’Neil points out, is whether this is an offer or a command. Using flawed statistics like the BMI, which O’Neil calls “mathematical snake oil”, corporations dictate what ideal health and the ideal body look like. They infringe on our freedom as they mould our health and body ideals. They punish those they don’t like to look at and reward those who fit their ideals. Such exploitations are disguised as scientific and are legitimised through the use of seemingly scientific numerical scores such as the BMI. The BMI – a person’s weight in kilograms divided by the square of their height in metres – is only a crude numerical approximation of physical fitness. And since the “average” man underpins its statistical scores, it is more likely to conclude that women are “overweight” – after all, we are not “average” men. Even worse, black women, who often have higher BMIs, pay the heaviest penalties.
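
For readers who have not met the score itself, here is a minimal sketch of how BMI is computed and bucketed, using the commonly cited WHO cut-offs. A single scalar of this kind carries no information about body composition, muscle mass or individual context, which is part of why it is such a crude basis for the corporate wellness scoring described above.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by height in metres, squared."""
    return weight_kg / height_m ** 2

def who_category(value: float) -> str:
    """Bucket a BMI value using the commonly cited WHO cut-offs."""
    if value < 18.5:
        return "underweight"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"

score = bmi(70, 1.75)  # e.g. 70 kg and 1.75 m
print(round(score, 1), who_category(score))  # -> 22.9 normal
```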

The control of great amounts of data and the race to build powerful algorithms is a fight for political power. O’Neil’s breathtakingly critical look at corporations like Facebook, Apple, Google, and Amazon illustrates this. Although these powerful corporations are usually focused on making money, their profits are tightly linked to government policies, which makes the issue essentially a political one.

These corporations hold significant power and a great amount of information on humanity, and with that, the means to steer us in any way they choose. The activity of a single Facebook algorithm on Election Day could not only change the balance of Congress but also potentially decide the presidency. When you scroll through your Facebook updates, what appears on your screen is anything but neutral: your newsfeed is censored. Facebook’s algorithms decide whether you see bombed Palestinians or mourning Israelis, a policeman rescuing a baby or battling a protester. One might argue that television news has always done the same and that this is nothing new. CNN, for example, chooses to cover a certain story from a certain perspective, in a certain way. The crucial difference is that, with CNN, the editorial decision is clear and on the record. People can debate whether that decision is the right one. Facebook, on the other hand, as O’Neil puts it, is more like the “Wizard of Oz”: we do not see the human beings involved. With its enormous power, Facebook can affect what we learn, how we feel, and whether we vote – and we are barely aware of any of it. What we know about Facebook, like other internet giants, comes mostly from the tiny proportion of their research that they choose to publish.

In a society where money buys influence, these WMD victims are nearly voiceless. Most are disenfranchised politically. The poor are hit the hardest and all too often blamed for their poverty, their bad schools, and the crime that afflicts their neighbourhoods. They, for the most part, lack economic power, access to lawyers, or well-funded political organisations to fight their battles. From bringing down minorities’ credit scores to sexism in the workplace, WMDs serve as tools. The result is widespread damage that all too often passes for inevitability.

Again, it is easy to point out that injustice, whether based on bias or greed, has been with us forever, and that WMDs are no worse than the human nastiness of the recent past. As with the examples above, the difference is transparency and accountability. Human decision making has one chief virtue: it can evolve. As we learn and adapt, we change. Automated systems, especially those O’Neil classifies as WMDs, by contrast, stay stuck in time until engineers dive in to change them.

If the Big Data college application model had established itself in the early 1960s, we still wouldn’t have many women going to college, because it would have been trained largely on successful men. [204]

Rest assured, the book is not all doom and gloom, nor does it argue that all mathematical models are biased and unfair. In fact, O’Neil provides plenty of examples of models used for good and models that have the potential to be great.

Whether a model becomes a tool to help the vulnerable or a weapon to inflict injustice, as O’Neil time and again emphasises, comes down to its central objectives. Mathematical models can sift through data to locate people who are likely to face challenges, whether from crime, poverty, or education. The objectives adopted dictate whether such intelligence is used to reject and punish those who are already vulnerable, or to reach out to them with the resources they need. So long as the objectives remain maximising profit, excluding as many applicants as possible, or locking up as many offenders as possible, these models serve as weapons that further inequality and unfairness. Change the objective from leeching off people to reaching out to them, and a WMD is disarmed – and can even become a force for good. The process begins with the modellers themselves. Like doctors, data scientists should pledge a Hippocratic Oath, one that focuses on the possible misuse and misinterpretation of their models. Additionally, organisations such as the Algorithmic Justice League, which aim to increase awareness of algorithmic bias, provide space for individuals to report such biases.

Opaqueness is a common feature of WMDs. People have been dismissed from work, sent to prison, or denied loans because of their algorithmic scores, with no explanation as to how or why. The more aware we are of this opaqueness, the better chance we have of demanding transparency and accountability, and that begins with making ourselves aware of the work of experts like O’Neil. This is not a book that only those working in data science, machine learning or other related fields need to read, but one that everyone needs to read. If you are a modeller, this book should encourage you to zoom out, to consider the individuals behind the figures that your algorithms manipulate, and to think about the big questions, such as the objectives behind your code. Almost everyone, to a greater or lesser extent, is part of the growing ‘data economy’. The more awareness there is of the dark side of these machines, the better equipped we are to ask questions and to demand answers from those behind the machines that decide our fate.