Algorithms

Why model?

I came across this little paper in the Introduction to Dynamical Systems and Chaos online course from the Santa Fe Institute. It was provided as supplementary reading in the ‘Modelling’ section. The paper lays out some of the most enduring misconceptions about building models.

“The modeling enterprise extends as far back as Archimedes; and so does its misunderstanding.” Epstein (2008)

So, why model? What are models? And who are modellers?

Prior to reading this paper, my short answers to these questions would have been in accordance with the widely held misconceptions that:

We model to explain and/or predict. Models are formal representations (often mathematical) of phenomena or processes. And a modeller is someone who builds these explicit, formal mathematical models. However, Epstein explains:

“Anyone who ventures a projection, or imagines how a social dynamic—an epidemic, war, or migration—would unfold is running some model.”

I like the idea that we all run implicit models all the time. In the social and political sphere, where it is extremely difficult to operationalize and specify variables, this perspective gives implicit modelling, such as drawing dynamical analogies, its due importance.

The paper lays out 16 reasons other than prediction for building models. The idea that prediction and explanation aren’t the only modelling goals was a revelation to me, given that I’ve had a love-hate relationship with modelling in the past. I am attracted to models, especially those with a dynamical systems inclination, but the overall tendency towards prediction as a goal often frustrates me. Just to clarify, prediction is a fine goal; my objections arise when 1) we’re deluded into thinking that models give us the tools to predict specific individual behaviours and 2) we assume we can model a phenomenon, especially human behaviour, without first understanding it.

ML

xkcd: Machine Learning

Let me elaborate further in the context of automated predictive systems that are currently trending (at least within my academic circle) and often preoccupy my thinking. Claims to predict “criminal” and “risky” behaviour are examples from last week’s headlines: UK police want Artificial Intelligence (AI) to predict criminal behaviour before it happens, and Predictim, a commercial data analytics firm, claims its AI can flag “risky” babysitters. Unfortunately, these are not outrageous exceptions but the general direction in which things in the digital surveillance sphere seem to be heading.

Behaviours such as “criminal” or “risky” are complex adaptive behaviours, the result of countless ongoing factors that we can never fully specify in the first place. This makes it impossible to predict criminal behaviour with certainty. Juarrero reminds us why it is impossible to predict human behaviour with precision:

“When we are dealing with complex adaptive systems, surprises are unavoidable. Because of their sensitivity to initial conditions – due, in turn, to their contextual and temporal embeddedness – complex adaptive systems are characterized by unusual twists and novel turns. Since we will never be able to specify any dynamical system’s initial conditions to the requisite (infinite) degree, a fortiori we will never be able to capture all the details and circumstances of anyone’s life and background. Given this limitation, we must always keep in mind that reconstructing specific instances of behavior will always be, at best, an interpretation and not a deduction – a much more fallible type of explanation than we had previously hoped was available. Interpretations of human action are always tentative. Absolute certainty about either what the agent just did, or what he or she will do – specifically – a year from now, is therefore impossible.” (Juarrero 1999, p. 225)
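To make the point about sensitivity to initial conditions concrete, here is a toy illustration (and only that; it models no social system): the logistic map, a textbook chaotic system from the dynamical systems literature. Two trajectories that start an unmeasurably small distance apart become completely uncorrelated within a few dozen steps, which is the precise sense in which we can never specify initial conditions “to the requisite (infinite) degree”.

```python
# Toy illustration only: the logistic map, a classic chaotic system.
# Two trajectories starting 1e-9 apart diverge completely within ~40 steps,
# which is the sense in which unmeasurably small differences in initial
# conditions ruin long-range prediction.

def logistic_map(x0, r=4.0, steps=50):
    """Iterate x_{n+1} = r * x_n * (1 - x_n) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

a = logistic_map(0.300000000)
b = logistic_map(0.300000001)  # differs only in the ninth decimal place

for n in (0, 10, 20, 30, 40, 50):
    print(f"step {n:2d}: |a - b| = {abs(a[n] - b[n]):.6f}")
```

The gap is invisible at step 0 and on the order of the whole interval by step 40; no amount of extra decimal places postpones this for long.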

These claims to predict “criminal” or “risky” behaviour are more than a mere misunderstanding of human nature or simple illusions about what AI tools are capable of doing. As these tools are implemented in the social world, they have grave consequences for people’s lives. When claiming to predict someone’s potential criminality, errors are inevitable, and the stakes are high when we get things wrong. Unsurprisingly, it is often society’s most vulnerable, those who are already disenfranchised, who pay the highest price. Indeed, such models are used to further punish and disenfranchise those who fall prey to them.

A slightly different but interrelated issue with modelling to predict is that the drive to predict and explain often ignores the value of describing and observing in order to gain deep understanding. Sure, describing to understand and explaining and predicting aren’t mutually exclusive. In practice, however, we seem to have blindly adopted prediction and generalization as the primary goals of science, and studying to describe and understand is undervalued as a result. What is the point of describing? you might ask. I think it is fundamental to understand any phenomenon or process as deeply and comprehensively as possible before we attempt to explain or predict it, and description is key to gaining such understanding.

I’ll leave you with an insightful Geertz (1973) passage from The Interpretation of Cultures:

“… I have never been impressed with claims that structural linguistics, computer engineering or some other advanced form of thought is going to enable us to understand men without knowing them.”

How to prepare a talk on AI

How would you give a talk on Artificial Intelligence (AI) to 120 students between the ages of 16 and 18, not all of whom are necessarily interested in, or have a background in, science? How would you define AI? What would you include (and exclude)? What is the best way to structure it? Well, surely, there are many valid answers to these questions. It was the first time that Elayne Ruane, a colleague and fellow PhD researcher, and I had attempted to give an 80-minute talk to such a big crowd of students. We didn’t find much in terms of guidance or advice on how to interact with the students or how to frame the AI discourse in a manner suitable for students who are about to embark on their college journey. We wanted to convey the excitement, hope and potential the field holds while also portraying a realistic image of its current state. Hopefully, sharing our general approach will be helpful to anybody who finds themselves in a similar situation.

Mind, what worked for us might not work in different contexts, mindsets, or situations, or for a different topic. AI is one of the most over-hyped and misunderstood areas of research in the minds of the general public. Furthermore, AI has become associated with a certain stereotypical archetype in the media – a white male genius computer geek. How one introduces the field, and the kind of work and influential figures one includes, plays a subtle but important role in challenging these misconceptions and stereotypes. Specifically, when addressing a crowd of young people in the midst of deciding what areas of study they will pursue at university, how you present the field of AI can send implicit signals about who is welcome. For us, this is everyone.

Initial Discussion

We began our talk with a brief discussion of what a computer science degree, as one of the routes to AI research, entails (within the context of our own department at University College Dublin) and the kinds of careers it can lead to, while raising the point that there isn’t one path to follow. We then briefly talked about exemplar AI projects taking place within our own School. We kept this part of the talk very interactive by frequently polling the group with a show of hands, which was important in keeping the students engaged.

What is AI?

We discussed the general definition of AI – the common view that artificial intelligence refers to a machine that simulates human intelligence. What it means to ‘simulate’, and what counts as ‘human intelligence’, are contested and of course far from settled. However, we felt it was important to keep it simple for the purpose of this talk. ‘Machines that simulate human intelligence and exhibit human behaviour’ often comes down to abilities such as learning, problem solving, reasoning, language processing and the like.

Unlike disciplines such as physics or biology, Artificial Intelligence is not a clearly defined and well-contained discipline but rather a very broad and cross-disciplinary endeavour. It draws from mathematics, engineering, biology, neuroscience, linguistics, philosophy, and many more. Although the most direct route to studying AI is through computer science (certainly within the context of UCD), one can also get to AI through other routes. Besides, AI can be synthesized with almost any field of enquiry, including neuroscience, music and art. Christie’s recent sale of AI-generated art is a good example.

AI is a wide umbrella term with sub-fields including robotics, natural language processing, computer vision, machine learning and deep learning, speech recognition, machine translation and more. We tried to use examples of these relevant to the students, including Google Translate, Amazon’s Alexa, PS4 games, Minecraft, facial recognition tools, and robots. We showed them the famous video of Boston Dynamics’ robot, Spot, dancing to Uptown Funk, which was a huge hit.

The History of AI

Ada Lovelace portrait

AI is often thought of as a recent development or, worse, as something futuristic that will only happen in the far future. We tend to forget that dreams, aspirations and fascinations with AI go back to antiquity. In this regard, René Descartes’s simulacrum and the Mechanical Turk are good examples. Descartes was fond of automata and had a walking, talking clockwork named after his daughter Francine. The machine apparently simulated his daughter, who died of scarlet fever at the age of 5. Similarly, the 18th-century Hungarian author and inventor Wolfgang von Kempelen created the Mechanical Turk, a (fake) chess-playing and speaking machine, to impress the Empress Maria Theresa of Austria.

We can list an endless number of scholars who contributed to the development of AI as it is conceived today. The main towering figures we included were:

  • The ninth-century Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, who gave us one of the earliest mathematical algorithms. The word “algorithm” comes from a mispronunciation of his name.
  • The English mathematician Ada Lovelace, who is often regarded as the first computer programmer.
  • Alan Turing who is regarded as the father of theoretical computer science and whom most students seemed to be already aware of.
  • And, more recently, perhaps the scholars most influential in shaping the way we currently understand AI: Marvin Minsky, John McCarthy, and Margaret Masterman.

Fun Game

We tried to make our talk as interactive as possible. We had questions and discussion points throughout. Towards the end, we had a game where students had to guess whether the AI being described on each slide was ‘sci-fi’ or ‘real’. Here are the main examples. Have a go yourself. 🙂

Sci-fi or real

  • Self-aware robots

Self aware robot

  • Face recognition AI which rates people’s ‘trustworthiness’

Trustworthiness AI

  • A bedside light that notifies you of your retweets

Light notifying RT

  • Automated confession systems

eConfession

Common Misconceptions

If there is anything the AI narrative is not short of, it’s hype and misconception. Subtle clarifications both illustrate the actual current state of the field and highlight the challenges that come with it. As such, our concluding remarks highlighted the misconceptions surrounding AI and the ethical concerns that necessarily arise with any technological advancement. The major misconceptions we mentioned are:

  1. AI is a distant reality. In fact, it is far from it. AI is deeply embedded in the infrastructure of everyday life. It is invisible and ubiquitous.
  2. AI equals robots or ‘self-driving’ cars. As should be obvious by now, robotics is simply one dimension of AI.
  3. AI is neutral and can’t be biased. This again is far from reality. As AI integrates deeper into the educational, medical, legal, and other social spheres, ethical questions inevitably arise. Questions of ethics, fairness, and responsibility are inherently questions of AI.

That concludes the content of the talk.

General advice:

  1. Keep it open and flexible. Create opportunities to hear from them. This allows you to get an idea of their awareness and knowledge (which can then help you calibrate on the fly in terms of technical detail) while also keeping them engaged.
  2. Pictures, more pictures, and videos are a great way to open up discussion. We showed a video of Google Assistant making a phone call, which really captured their attention and got them talking. This also brought forth some ethical discussion.
  3. Prepare for plenty of questions along the lines of “Is AI going to take over?” and “How scared and worried should we be?”. It’s important to acknowledge how AI advances can be misused, but also to point out how much of what is reported about AI is overblown hype, which feeds these unnecessary and unrealistic fears when much of the development in AI remains premature. On the other hand, remember, we were talking to young science students about to embark on college. We still wanted to encourage them and wanted them to feel the dreams, excitement and hope that drove AI, at least in the 1950s and 60s, as well as the promising potential AI presents in medicine, robotics and more.

 

Further reading

 

 

The AI side of cognitive science is concerned with first world problems

I recently had the opportunity to attend a multidisciplinary conference where cognitive scientists, philosophers, psychologists, artificial intelligence (AI) researchers, neuroscientists and physicists came together to discuss the self. The conference was, generally speaking, well organized, and most of the talks were interesting. Its theme was the openness of the self: contrary to the traditional essentialist view of the self as fixed, fully autonomous and self-contained, the consensus among attendees was that the self is not a static, discrete entity that exists independently of others but is dynamic, changing, co-dependent, and intertwined with others. This intertwinement furthermore extends to the social and political forces that play crucial roles in constituting who we are. In this vein, any discussion of self and technology needs to acknowledge the entanglement of social and political factors and the necessity for diverse input and perspectives.

AI is a very broad field of enquiry which includes, to mention but a few, facial recognition technologies, search engines (such as Google), online assistants (such as Siri), and algorithms used in almost every sphere of society (medical, financial, judicial, and so on). Unfortunately, the view of AI that seems to dominate public as well as academic discourse is a narrow, one-dimensional one in which the concern revolves around artificially intelligent “autonomous” entities. This view is, unsurprisingly, often promoted by a one-dimensional group of people: white, middle-class and male. Questions outside “the creation of an artificial self” rarely enter the equation. Social, political, and economic factors rarely feature in cognitive science and interdisciplinary formulations of selfhood and technology, as if any technological development emerged in a social, political and economic vacuum. The conference I attended was no different.

This was apparent during the theme-based group discussions at the conference, where one group discussed issues regarding self and technology. The discussion was led by researchers in embodied AI and robotics. The questions revolved around the possibility of creating an artificial self, robots, and whether AI can be sentient and, if so, how we might know it. As usual, the preoccupation with abstract concerns and theoretical construction took centre stage, to the detriment of political and social issues. Attempts to direct some attention towards the social and political issues were dismissed as irrelevant.

It is easy to see the appeal of getting preoccupied with these abstract philosophical questions. After all, we immediately think of “I, Robot”-type robots when we think of AI, and of “self-driving” cars when we think of ethical questions in AI.


A 1980s Turk reconstruction

The fascination with, and preoccupation with, autonomous and discrete machines is not unique to current pop culture. The French philosopher René Descartes had a walking, talking clockwork named after his daughter Francine. The machine apparently simulated his daughter, who died of scarlet fever at the age of five. The 18th-century Hungarian author and inventor Wolfgang von Kempelen created the Mechanical Turk, a (fake) chess-playing and speaking machine, to impress the Empress Maria Theresa of Austria.

It is not surprising that our perception of AI is dominated by such issues, given the influential role Sci-Fi pop culture plays in shaping that perception. The same culture feeds on overhype and exaggeration of the state of AI. Researchers themselves are often just as responsible for miscommunication and misunderstanding about the state of the art of the field. And the more hyped a piece of work is, the more attention it is given – look no further than the narrative surrounding Sophia, an excessively anthropomorphized and overhyped machine.

Having said that, the problem goes further than misleading coverage and overhype. The overhype, and the narrow, one-dimensional view of AI as concerned with questions of the artificial self and “self-driving” cars, detract from the more nuanced and more pressing issues in AI that impact the very poor, the disenfranchised, and the socially and economically disadvantaged. For example, in the current data economy, insurance systems reward and offer discounts to those who are willing to be tracked and to provide as much information as possible about their activities and behaviours. Consumers who want to withhold all but the essential information from their insurers will pay a premium. Privacy, increasingly, will come at a premium cost that only the privileged can afford.

An implicit assumption that AI is some sort of autonomous, discrete entity separate from humans, rather than a disruptive force woven through society and the economy, underlies this narrow, one-dimensional view of AI and the preoccupation with the creation of an artificial self. Sure, if your idea of AI revolves around sentient robots, that might bear some truth. This implicit assumption seems to me a hangover from Cartesian dichotomous thinking, one that persists even among scholars within the embodied and enactive tradition who think their perspectives account for complex reality. This AI-versus-humans thinking is misleading and unhelpful, to say the least.

AI systems are ubiquitous, a fact that becomes apparent once you abandon the narrow, one-dimensional view of AI. AI algorithms are inextricably intertwined with our social, legal, health and educational systems, not separate, independent entities as we like to imagine when we think of AI. The apps that power your smartphone and the automated systems that contribute to decisions about whether you get a loan, whether you are hired, or how much your car insurance premium will cost are all AI. And they are AI with real impact, especially on society’s most vulnerable.

Yet most people working on AI (both in academia and in Silicon Valley) are unwilling to get their hands dirty with the social, economic or political aspects and impacts of AI. The field seems, to a great extent, to be constituted of those who are socially, economically and racially privileged, for whom these issues bear no personal consequences. The AI side of cognitive science is no different, with its concern for first-world problems. Any discussion of a person or even a society is devoid of gender, class, race, ability and so on. When scholars in these fields speak of “we”, they are barely inclusive of those outside the status quo, which is mostly the white, male, Western, middle-class, educated person. If your model of self is such, how could you, and why would you, be concerned about the class, economic, race and gender issues that emerge from unethical applications of AI? After all, you are unlikely to be affected. Not only is this model of self unrepresentative of society, there is barely any awareness of the issue as a problem in the first place. The problem is invisible due to privilege, which renders diversity and inclusivity of perspectives seemingly irrelevant.

This is not by any means a generalization about everyone within AI scholarship. There are, of course, plenty of people who acknowledge political and social forces as part of the issues to be concerned about within the discussion of AI. Unsurprisingly, much of the important work in this regard is done by people of colour and women, who unfortunately remain a minority. The field as a whole would do well to make sure that it is inclusive of such voices, and to value their input instead of dismissing it.

Situating China’s Social Credit System in history and context

If you have been following developments in the digital humanities, it is very likely that you’ve come across the news that China is implementing a Social Credit System (SCS). Although the SCS is portrayed as a single integrated system that quantifies all behaviour into credit scores, it is in fact an ecology of fragmented initiatives with many different stakeholders. Broadly speaking, it consists of scoring systems developed by the private sector and by governmental bodies. From the government’s perspective, the SCS is an attempt to promote “trustworthiness” and transparency in the economy, to combat a perceived lack of trust in the marketplace, and, more generally, to harmonize social conduct.

Citizens’ “trustworthiness” is rated based on their social behaviour: their crime records, what they say on social media, what they buy, the scores of their friends, and so on. This can have positive or negative implications for an individual’s job, visa and loan applications. As a commitment to radical transparency is a central driving force behind the SCS, information on subjects’ trustworthiness is made publicly available, and in some circumstances is even actively broadcast. Individual citizens and businesses alike are ranked, and the records are publicly open.

SCS civilized families

Rongcheng’s “civilized families” are displayed on public noticeboards like these. (Simina Mistreanu)

The SCS is to become mandatory by 2020 and is currently being implemented in some form or another across parts of China. Socioeconomically deprived areas appear to be priority targets. Rongcheng, in the eastern province of Shandong, where the SCS has been rolled out for some time now, is, according to government officials, one of the best examples of the system working as intended.

From a general systems science perspective, the SCS is a self-organizing system that operates through incentive and punishment mechanisms. People with low ratings will, for example, have slower internet speeds, restricted access to restaurants, and the right to travel revoked.

“Higher scores have already become a status symbol, with almost 100,000 people bragging about their scores on Weibo (the Chinese equivalent of Twitter) within months of launch. A citizen’s score can even affect their odds of getting a date, or a marriage partner, because the higher their Sesame rating, the more prominent their dating profile is on Baihe.” (Creemers, 2018)

The SCS has been described as an insidious digital panopticon and a dystopian nightmare in which individuals’ every move is monitored and ranked through data generated from all sorts of activities and interactions, online or otherwise, via digital technologies (facial recognition tools, biometric information). Many draw parallels between the SCS and the dystopian science fiction Black Mirror episode “Nosedive”, where people rate each other based on their interactions.

Black Mirror rating

Many ethical and human rights issues, as well as the complete eradication of the idea of privacy, have been raised, and the negative consequences of such a dystopian system are indisputable.

With the realization that ‘digital reputations’ could limit opportunities comes the tendency to self-censor and the tendency to be risk-averse. We are unlikely to hit “like” on a Facebook post that protests some government policy knowing that it could impact our ‘digital reputations’. Consequently, people gradually change their behaviour to align with what the system requires, to get better scores. In the process those behaviours and norms defined as “acceptable” by the government are reinforced.
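As a thought experiment, the self-reinforcing dynamic described above can be sketched in a few lines of code. This is a deliberately crude toy, not a model of the actual SCS: the scoring rule, penalties and all numbers are invented purely to show how reward-and-punishment scoring nudges a population towards conformity.

```python
# Toy sketch of a score-driven feedback loop. The scoring rule and all numbers
# are invented for illustration; this does not describe the actual SCS.
import random

random.seed(0)

N_AGENTS, ROUNDS = 1000, 30
agents = [{"score": 0.0, "p_speak": 0.5} for _ in range(N_AGENTS)]

for t in range(ROUNDS):
    for a in agents:
        if random.random() < a["p_speak"]:   # agent voices a "disapproved" opinion
            a["score"] -= 1.0                # and is punished: the score drops
        else:
            a["score"] += 0.2                # conforming is rewarded
        if a["score"] < 0:                   # a low score brings penalties,
            a["p_speak"] *= 0.9              # so the agent self-censors a little more
    avg = sum(a["p_speak"] for a in agents) / N_AGENTS
    if t % 10 == 0 or t == ROUNDS - 1:
        print(f"round {t:2d}: average willingness to speak = {avg:.2f}")
```

Even with these arbitrary numbers, the average willingness to speak drops steadily round after round; nobody has to be ordered to stay quiet for the chill to set in.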

Nonetheless, among the misconceptions surrounding the SCS is the assumption that using individuals’ digital traces to directly or indirectly influence their behaviour is something that only happens in non-Western totalitarian states. In fact, credit scoring practices are not unfamiliar in Western societies. Facebook, for instance, seems to be developing its own system for rating users’ trustworthiness.

It is also worth mentioning Facebook’s emotion tracking patent (where the aim is to monitor individuals’ typing speed in order to predict emotions and adapt messages in response), which was granted in May 2017, and the currently filed socioeconomic classifier (which might enable Facebook to rank its users according to different social classes), among its series of patents. These developments, in combination with others, such as Facebook’s ability to flag individuals through its facial recognition technology without their consent, in some sense constitute a surveillance society. Facebook’s ability to rank and categorize people into a variety of socioeconomic categories has possible impacts on individuals’ opportunities depending on their class, gender, race and sexual orientation. Whether it’s the type of job ads one is excluded from viewing (due to one’s gender, class or age) or exclusion from certain housing ads, Facebook’s ranking and categorizing systems often impact the under-privileged and those who fail to conform to the status quo.

Health insurance

Marshall Allen, July 2018, ProPublica

Along with social media platforms, health insurers and schools can also be mentioned as examples that share features of the SCS. Like the SCS, these Western industries and institutions track and surveil people through digital technologies, including facial recognition tools and biometric information.

We are rated, ranked and categorized using data extracted from us. Similar to the SCS, such ranking and rating often has “real” life consequences, whether in the form of how much we pay for our insurance, what ads are pushed on us, or how we behave in school yards. The difference between the Chinese SCS and the Western tech industry is that, while the former is clear and upfront about it, the latter is much more invisible. In fact, such tech giants go out of their way to hide what they are doing.

Rating systems, whether those of the SCS or those deployed by the Western tech industry, create unwanted incentives and increase the pressure on individuals to conform to the status quo. This creates and contributes to a risk-averse society.

“When doctors in New York were given scores this had unexpected results. Doctors that tried to help advanced cancer patients had a higher mortality rate, which translated into a lower score. Doctors that didn’t try to help were rewarded with high scores, even though their patients died prematurely.” Tijmen Schep

Situating the SCS in history and context

The history and context that are crucial to the development of the current SCS are often missing from how the SCS is framed, at least within Western media.

“[social systems] must be viewed whole cloth as open dynamical systems embedded in a physical, historical, and social fabric” (Juarrero, 1999, p. 201)

As far as China’s political tradition goes, morality and authority are inextricably linked. Enforcing moral standards and monitoring and disciplining the conduct of local officials and individual citizens is seen as the role of the state. “Governing the country by virtue” is equated with “governing the country by law”. Unlike the Western legal system, where the rights, responsibilities and entitlements of private actors and public bodies are relatively easily categorized, such categories are much more blurred within the Chinese legal system. Individual citizens, government officials, communities and businesses are all expected to contribute to overall social and economic harmony and development.

“Chinese political tradition has, for centuries, conceived of society as an organic whole, where harmony can be achieved if all its members conduct themselves as appropriate to their position in public and civil structures. … Critical in this process were ideas about systems theory, derived from natural science and applied in the social context. Influenced by Western scholarship on cybernetics and systems theory, scholars such as Qian Xuesen and Song Jian worked closely with government to develop a conceptual framework for the adoption of systems engineering techniques in governance. Particular regard was given to the role of information flows, not just towards and within government, but also as part of cybernetic feedback loops to create self-correcting responses in society.” (Creemers, 2018, p. 7)

Historically, the Chinese government has experimented with various forms of social control, and mechanisms for maintaining social order through self-policing go all the way back to the Song Dynasty.

“An 11th-century emperor instituted a grid system where groups of five to 25 households kept tabs on each other and were empowered to arrest delinquents” (Mistreanu, 2018). The current SCS, then, is an extension of such historical traditions; the difference now is the addition of digital technologies.

From the Chinese authorities’ perspective, the SCS epitomizes a self-correcting feedback loop in which “trustworthiness” and social morality are fostered through incentives and punishments.

This is by no means to argue that the SCS is any less of a digital panopticon. However, by highlighting the history and context so often missing from the SCS narrative, we can paint a more complex and nuanced picture of the system (as opposed to the often alarming pieces that are stripped of context and history). Furthermore, while we are preoccupied with stories of how China is becoming one giant surveillance prison, we miss the indirect and evasive practices that are happening within our own “civilized” Western systems.

 

Bibliography

Creemers, R. (2018). China’s Social Credit System: An Evolving Practice of Control.
Juarrero, A. (1999). Dynamics in Action: Intentional Behavior as a Complex System. Cambridge, MA: MIT Press.

 

 

Resources – on automated systems and bias

Last updated: 07/05/2019

If you are a data scientist, a software developer, or in the social and human sciences with interest in digital humanities, then you’re no stranger to the ongoing discussions on how algorithms embed and perpetuate human biases. Ethical considerations and critical engagement are urgently needed.

I have been keenly following these discussions for a while, and this post is an attempt to put together the articles, books, book reviews, videos, interviews, Twitter threads and so on that I’ve come across, in one place, so they can be used as resources.

This list is by no means exhaustive, and as we become more and more aware of the catastrophic consequences of these technologies, more and more pieces, articles and journal papers are being written about them on a daily basis. I plan to update this site regularly. Also, if you think there is relevant material that I have not included, please leave it as a comment and I will add it.

Link for books here

Link for a more scholarly read here

TED Talks, podcasts, and interviews 

The era of blind faith in big data must end TED Talk by Cathy O’Neil, April, 2017

Machine intelligence makes human morals more important November 11, 2017. In this TED Talk, Zeynep Tufekci emphasizes the importance of human values and ethics in the age of machine intelligence and algorithmic decision making.

We’re building an artificial intelligence-powered dystopia, one click at a time, another thought provoking TED Talk from techno-sociologist Zeynep Tufekci.

How I’m fighting bias in algorithms TED Talk – MIT Researcher Joy Buolamwini, November 2016

AI, Ain’t I A Woman? Joy Buolamwini

Data is the new gold, who are the new thieves? TED Talk – Tijmen Schep 2016

O’Neil’s interview with the Politics Weekly podcast (starts 30 mins in) July 5, 2017. O’Neil calls for public awareness of how algorithms are used, often without our knowledge, in job interviews, for example, and explains why we should question and interrogate these algorithms, which are often presented to us as authoritative.

A short interview with Frank Pasquale on his book Black Box Society May 12, 2016. Pasquale emphasizes the opaqueness of algorithms and argues on why we should demand transparency.

A two-minute video, a prototype example of algorithms being used in recruitment. A working example of the kind of dangerous AI used for recruiting that experts such as O’Neil constantly warn against. This post provides a critical analysis of why such endeavours are futile and dangerous. Here’s another related video on how facial recognition technology will go mainstream in 2018. In fact, such technology has gone mainstream in China. Here is a short video where a BBC reporter experimented with the world’s largest surveillance system.

Tom Chatfield on Critical Thinking, October 2, 2017. In this philosophically themed podcast, Chatfield discusses issues such as “how new digital realities interact with old human biases” with Dave Edmonds.

When algorithms discriminate: Robotics, AI and ethics November 18, 2017. Stephen Roberts, professor of computer science at the University of Oxford, discusses the threats and promises of artificial intelligence and machine learning with Al Jazeera.

Here is a series of talks, from the ABC Boyer Lectures, hosted by Professor Genevieve Bell. The series is called Fast, Smart and Connected: What is it to be Human, and Australian, in a Digital World? The issues discussed include “How to build our digital future.”

You and AI – Just An Engineer: The Politics of AI (July, 2018). Kate Crawford, Distinguished Research Professor at New York University, a Principal Researcher at Microsoft Research New York, and co-founder and co-director of the AI Now Institute, discusses the biases built into machine learning, and what that means for the social implications of AI.

How will AI change your life? AI Now Institute founders Kate Crawford and Meredith Whittaker explain. (8 April, 2019)

Facebook: Last Week Tonight with John Oliver (HBO) an extremely funny and super critical look at Facebook.

Humans are biased, and our machines are learning from us — ergo our artificial intelligence and computer programming algorithms are biased too. Joanna Bryson explains how human bias is learned by taking a closer look at how AI bias is learned.

Websites

Social Cooling is a term that refers to the gradual, long-term negative side effects of living in a digital society where our digital activities are tracked and recorded. Awareness of potentially being scored by algorithms leads to gradual behaviour change: self-censorship and self-surveillance. Here is a piece on what looks like social cooling in action. The website itself has plenty of resources that can aid critical thinking and touches on big philosophical, economic and societal questions in relation to data and privacy.

www.socialcooling.com

For those interested in critical thinking, data and models, Calling Bullshit offers various resources and tools for spotting and calling bullshit. This website, developed for a course entitled ‘Calling Bullshit’, is a great place to explore and learn about all things “data reasoning for the digital age”.

Another important website worth a mention here is the Algorithmic Justice League, where you can report algorithmic bias, participate in testing software for inclusive training sets, or simply donate and contribute to raising awareness about existing bias in coded systems. More on AI face misclassification and accountability by Joy Buolamwini here. With a somewhat similar aim is the Data Harm Record website – a running record of harms that have been caused by uses of big data.

fast.ai is a project that aims to increase diversity in the field of deep learning and make deep learning accessible and inclusive to all. Critical Algorithm Studies: a Reading List is a great website with links to plenty of critical literature on algorithms as social concerns. Here is the Social Media Collective Reading List, where you’ll find further material on Digital Divide/Digital Inclusion and Metaphors of Data.

The AI Now Institute at New York University is an interdisciplinary research center dedicated to understanding the social implications of artificial intelligence. Data & Society is a research institute focused on the social and cultural issues arising from data-centric technological developments.  FAT/ML is a website on Fairness, Accountability, and Transparency in Machine Learning with plenty of resources and events, run by a community of researchers. Litigating Algorithms: Challenging Government Use of Algorithmic Decision Systems. An AI Now Institute Report.

ConceptNet Numberbatch 17.04: better, less-stereotyped word vectors. This is not a website but a blog post. I am putting it here with the other websites as the author offers some solutions for reducing bias when building algorithms for natural language understanding, beyond simply stating that such algorithms are biased.

Auditing Algorithms – a useful website for those teaching or interested in accountability in automated systems. The site includes film festivals, videos, etc.

The Ethics and Governance of Artificial Intelligence – a cross-disciplinary course that investigates the implications of emerging technologies, with an emphasis on the development and deployment of Artificial Intelligence. Here’s an Introduction to Data Ethics by the Markkula Center for Applied Ethics. See here for more recent details.

Google launches a new course to teach people about fairness in machine learning.

Biology/genetics  – (Digital phrenology?) 

It is difficult to draw a line and put certain articles under the category of “social”, “biological”, “political”, or other as the boundaries between these categories are blurred and most of the themes are somehow all interlinked. Nonetheless, I think the following articles can loosely be described as dealing with biological/genetics/personality material. Furthermore, towards the end of this post, I have also thematized some articles under the category of “political”.

In a recent preprint paper, “Deep Neural Networks Can Detect Sexual Orientation From Faces” (here are the Guardian and the Economist reports), Yilun Wang and Michal Kosinski claimed that their deep neural network can be trained to discern individuals’ sexual orientations from their photographs. The paper has attracted, and continues to attract, massive attention and has generated numerous responses, outrage and discussion. Here is an in-depth analysis from Calling Bullshit, here a detailed technical assessment, and here a comprehensive and eloquent response from Greggor Mattson. Here is another response, another one here from a data scientist’s perspective, and another recent response from O’Neil here. If you only want to read one response, I highly recommend reading Mattson’s. There have been plenty of discussions and threads on Twitter – here and here are a couple of examples. It is worth noting that Kosinski, one of the authors of the above paper, is listed as one of the advisers for a company called Faception, an Israeli security firm that promises clients it can deploy “facial personality profiling” to catch pedophiles and terrorists, among others.

Do algorithms reveal sexual orientation or just expose our stereotypes? by @blaiseaguera et al. is the latest (January 11, 2018) response to the above Wang and Kosinski “gaydar” paper. In this critical analysis, @blaiseaguera et al. argue that much of the ensuing scrutiny of Wang and Kosinski’s work has focused on ethics, implicitly assuming that the science is valid. However, on closer inspection, they find that the science doesn’t stand up to scrutiny either.

When advanced technologies in genetics and face recognition are applied under the assumption that “technology is neutral”, the consequences are often catastrophic and dangerous. These two pieces, Sci-fi crime drama with a strong black lead and Traces of Crime: How New York’s DNA Techniques Became Tainted, provide some in-depth analysis of this.

Physiognomy’s New Clothes is a comprehensive and eloquent piece and well worth your time. Physiognomy, the practice of using people’s outer appearance to infer inner character, is now discredited and discarded, like phrenology. However, this piece illustrates how the practice is alive and well in the era of big data and machine learning. Here is more on the Wu and Zhang paper that the Physiognomy’s New Clothes authors cover in the above piece. Further examples of digital phrenology can be found here and here.

General articles on various automated systems and bias, discrimination, unfairness, ethical concerns, etc., listed in order of publication dates starting from the latest.

Frank Pasquale testifies (video, written testimony) before the United States House of Representatives Committee on Energy and Commerce, Subcommittee on Digital Commerce and Consumer Protection, in relation to “Algorithms: How Companies’ Decisions About Data and Content Impact Consumers”. Here for more written testimony on Algorithmic Transparency from the Electronic Privacy Information Center – November 29, 2017.
Image Courtesy of ProPublica

There’s software used across the country to predict future criminals. And it’s biased against blacks. May 23, 2016. The company that sells this program (Northpointe) has responded to the criticisms here. Northpointe asserts that its software, which predicts the likelihood that a person will commit future crimes, is equally fair to black and white defendants. Following this response, Jeff Larson and Julia Angwin wrote a further piece (Technical Response to Northpointe) re-examining the data. They argue that they have considered the company’s criticisms and stand by their conclusions.

 

Politics

Algorithmic processes and politics might seem far removed from each other. However, if anything, the recent political climate is indicative of how algorithms can be computational tools for political agendas. Here and here are exemplar Twitter threads that highlight particular Twitter accounts used as tools for a political agenda. The articles below are, in some way or another, related to algorithms in the political arena.

Forum Q&A: Philip Howard on Computational Propaganda’s Challenge to Democracy July 25, 2017. “Computational propaganda, or the use of algorithms and automated social media accounts to influence politics and the flow of information, is an emerging challenge to democracy in the digital age. Using automated social media accounts called bots (or, when networked, botnets), a wide array of actors including authoritarian governments and terrorist organizations are able to manipulate public opinion by amplifying or repressing different forms of political content, disinformation, and hate speech.”

 

The dark side of Big Data – how mathematical models increase inequality. My review of O’Neil’s book ‘WMD’

We live in the age of algorithms. Where the internet is, algorithms are. The apps on our phones are the results of algorithms. The GPS system can bring us from point A to point B thanks to algorithms. More and more decisions affecting our daily lives are handed over to automation. Whether we are applying for college, seeking jobs, or taking out loans, mathematical models are increasingly involved in the decision-making. They pervade schools, the courts, the workplace, and even the voting process. We are continually ranked, categorized, and scored by hundreds of models on the basis of our revealed preferences and patterns: as shoppers and couch potatoes, as patients and loan applicants. And very little of this do we see, even in applications we happily sign up for.

More and more decisions being handed over to algorithms should, in theory, mean fewer human biases and prejudices. Algorithms are, after all, “neutral” and “objective”. They apply the same rules to everybody regardless of race, gender, ethnicity or ability. However, this couldn’t be further from the truth. In fact, mathematical models can be, and in some cases have been, tools that further inequality and unfairness. O’Neil calls these kinds of models Weapons of Math Destruction (WMDs). WMDs are biased, unfair and ubiquitous. They encode poisonous prejudices from past records and work against society’s most vulnerable, such as racial and ethnic minorities, low-wage workers, and women. It is as if these models were designed expressly to punish them and to keep them down. As the world of data continues to expand, each of us producing ever-growing streams of updates about our lives, so do prejudice and unfairness.

Mathematical models have revolutionized the world, and efficiency is their hallmark. And sure, they aren’t just tools that create and distribute bias, unfairness and inequality; models, by their nature, are neither good nor bad, neither fair nor unfair, neither moral nor immoral. They simply are tools. The sports domain is a good example of mathematical models as a force for good. For some of the world’s most competitive baseball teams today, competitive advantages and wins depend on mathematical models. Managers make decisions that sometimes involve moving players across the field based on analysis of historical data and the current situation, calculating the positioning associated with the highest probability of success, as the sketch below illustrates.
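A minimal sketch of that kind of calculation, under invented data: count how each fielding alignment has fared against a given batter, pick the alignment with the best observed out rate, and, crucially, fold every new result back into the counts. The alignment names, the batter label and the observations are all made up for illustration.

```python
# Minimal sketch: choose the fielding alignment with the best historical out rate
# for a batter, and keep updating the counts with feedback from each new play.
# The alignments, batter label and observations below are invented.
from collections import defaultdict

# outcomes[batter][alignment] = [outs_recorded, balls_in_play]
outcomes = defaultdict(lambda: defaultdict(lambda: [0, 0]))

def record(batter, alignment, was_out):
    """Feedback step: fold the observed result back into the model."""
    outs, total = outcomes[batter][alignment]
    outcomes[batter][alignment] = [outs + int(was_out), total + 1]

def best_alignment(batter, options=("standard", "shift_left", "shift_right")):
    """Pick the alignment with the highest observed out probability so far."""
    def out_rate(alignment):
        outs, total = outcomes[batter][alignment]
        return outs / total if total else 0.0
    return max(options, key=out_rate)

# A few toy observations for one batter.
for alignment, was_out in [("standard", False), ("shift_left", True),
                           ("shift_left", True), ("shift_right", False)]:
    record("batter_42", alignment, was_out)

print(best_alignment("batter_42"))  # -> "shift_left" on this toy data
```

The point of the sketch is the contrast O’Neil draws next: this little model is transparent, its inputs are public statistics, and it is corrected by feedback every single game.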

There are crucial differences, however, between models such as those used by baseball managers and WMDs. While the former are transparent and constantly updated with feedback, the latter are, by contrast, opaque and inscrutable black boxes. Furthermore, while the baseball analytics engines manage individuals, each one potentially worth millions of dollars, companies hiring minimum-wage workers are, by contrast, managing herds. Their objective is to optimize profit, so they slash expenses by replacing human resources professionals with automated systems that filter large populations into manageable groups. Unlike the baseball models, these companies have little reason – short of plummeting productivity – to tweak their filtering models. O’Neil’s primary focus in the book is models that are opaque and inscrutable, those used within powerful institutions and industries, which create and perpetuate inequalities: WMDs, “the dark side of Big Data”!

The book contains crucial insights (or haunting warnings, depending on how you choose to approach it) about the catastrophic directions in which mathematical models used in the social sphere are heading. And it couldn’t come from a more credible and experienced expert: a Harvard mathematician who went on to work as a quant for D. E. Shaw, a leading hedge fund, and as an experienced data scientist, among other things.

One of the most persistent themes of O’Neil’s book is that the central objectives of a given model are crucial. In fact, the objectives determine whether a model becomes a tool that helps the vulnerable or one that is used to punish them. WMDs’ objectives are often to optimize efficiency and profit, not justice. This, of course, is the nature of capitalism. And WMDs’ efficiency comes at the cost of fairness: they become biased, unfair, and dangerous. The destructive loop goes around and around, and in the process models become more and more unfair.

Legal traditions lean strongly towards fairness … WMDs, by contrast, tend to favour efficiency. By their very nature, they feed on data that can be measured and counted. But fairness is squishy and hard to quantify. It is a concept. And computers, for all their advances in language and logic, still struggle mightily with concepts. They “understand” beauty only as a word associated with the Grand Canyon, ocean sunsets, and grooming tips in Vogue magazine. They try in vain to measure “friendship” by counting likes and connections on Facebook. And the concept of fairness utterly escapes them. Programmers don’t know how to code for it, and few of their bosses ask them to. So fairness isn’t calculated into WMDs and the result is massive, industrial production of unfairness. If you think of a WMD as a factory, unfairness is the black stuff belching out of the smoke stacks. It’s an emission, a toxic one. [94-5]

The prison system is a startling example where WMDs are increasingly used to further reinforce structural inequalities and prejudices. In the US, for example, those imprisoned are disproportionately poor and of colour. Being a black male in the US makes you nearly seven times more likely to be imprisoned than if you were a white male. Are such convictions fair? Many different lines of evidence suggest otherwise. Black people are arrested more often, judged guilty more often, treated more harshly by correctional officers, and serve longer sentences than white people who have committed the same crime. The Black imprisonment rate for drug offenses, for example, is 5.8 times higher than that for whites, despite a roughly comparable prevalence of drug use.

Prison systems, which are awash in data, hardly carry out important research such as asking why non-white prisoners from poor neighbourhoods are more likely to commit crimes, or what alternative ways of looking at the same data might show. Instead, they use data to justify the workings of the system and to further punish those who are already at a disadvantage. Questioning the workings of the system, or asking how the prison system could be improved, is almost never considered. If, for example, building trust were the objective, an arrest might well become the last resort, not the first. Trust, like fairness, O’Neil explains, is hard to quantify and presents a great challenge to modellers even when the intention is there to consider such a concept as part of the objective.

Sadly, it’s far simpler to keep counting arrests, to build models that assume we’re birds of a feather and treat us such… Innocent people surrounded by criminals get treated badly. And criminals surrounded by law-abiding public get a pass. And because of the strong correlation between poverty and reported crime, the poor continue to get caught up in these digital dragnets. The rest of us barely have to think about them. [104]
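The “digital dragnet” loop O’Neil describes can be made concrete with a deliberately crude sketch. In the toy below, two neighbourhoods have the same underlying offence rate, but scrutiny follows the model’s “high-risk” flag, so more incidents get recorded in the flagged area, which the model then reads as confirmation. Every number here is invented purely for illustration.

```python
# Crude sketch of a self-justifying feedback loop. Both areas have the SAME
# underlying rate; only the recorded data differ, because recording follows
# where the model sends scrutiny. All numbers are invented.
import random

random.seed(1)

TRUE_RATE = 0.05                   # identical underlying rate in both areas
recorded = {"A": 12, "B": 10}      # a slightly uneven historical record to start

for year in range(10):
    flagged = max(recorded, key=recorded.get)     # area with the most recorded incidents
    patrols = {k: (700 if k == flagged else 300) for k in recorded}
    for area, n in patrols.items():
        # a patrol can only record what it is present to see
        recorded[area] += sum(random.random() < TRUE_RATE for _ in range(n))
    print(f"year {year}: recorded incidents = {recorded}")
```

After a decade the flagged area’s record dwarfs the other’s, even though nothing about the underlying behaviour ever differed; the data collected under the model’s own assumptions end up ‘confirming’ them.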

Insofar as these models rely on barely tested insights, they are in a sense not that different from phrenology: digital phrenology. Phrenology, the practice of using outer appearance to infer inner character, which in the past justified slavery and genocide, has been outlawed and is considered pseudoscience today. However, phrenology and scientific racism are entering a new era, cloaked in the apparent “objectivity” of machine-learned models. “Scientific” criminological approaches now claim to “produce evidence for the validity of automated face-induced inference on criminality”. However, what these machine-learned “criminal judgements” pick up on, more than anything, is the systematic unfairness and human bias embedded in historical data.

A model that profiles us by our circumstances helps create the very environment that justifies its assumptions. The streams of data we produce serve as insights into our lives and behaviours. Instead of testing whether these insights stand up to scientific scrutiny, the data we produce are used to justify the modellers’ assumptions and to reinforce pre-existing prejudice. And the feedback loop goes on.

When I consider the sloppy and self-serving ways that companies use data, I am often reminded of phrenology… Phrenology was a model that relied on pseudo-scientific nonsense to make authoritative pronouncements, and for decades it went untested. Big Data can fall into the same trap. [121-2]

In 1896, Frederick Hoffman published a 330-page report in which he used exhaustive statistics to support a claim as pseudo-scientific and dangerous as phrenology. He made the case that the lives of black Americans were so precarious that the entire race was uninsurable. However, not only were Hoffman’s statistics deeply flawed, like many of the WMDs O’Neil discusses throughout the book, he also confused correlation with causation. The voluminous data he gathered served only to confirm his thesis: that race is a powerful predictor of life expectancy. Furthermore, Hoffman failed to separate the “Black” population into different geographical, social or economic cohorts, blindly assuming that the whole “Black” population was a homogeneous group.

This cruel industry has since been outlawed. Nonetheless, the unfair and discriminatory practices remain; they are still practised, but in a far subtler form, now coded into the latest generations of WMDs and obfuscated under complex mathematics. Like Hoffman, the creators of these new models confuse correlation with causation, and they punish the struggling classes and racial and ethnic minorities. And they back up their analysis with reams of statistics, which give them the studied air of “objective science”.

What is even more frightening is that, as oceans of behavioural data continue to feed straight into artificial intelligence systems, this will, for the most part, remain a black box to the human eye. We will rarely learn about the classes we have been sorted into or why we were put there, and these opaque models are often as much a black box to those who design them. In any case, many companies go out of their way to hide the results of their models, and even their existence.

In the era of machine intelligence, most of the variables will remain a mystery... automatic programs will increasingly determine how we are treated by other machines, the ones that choose the ads we see, set prices for us, line us up for a dermatologist appointment, or map our routes. They will be highly efficient, seemingly arbitrary, and utterly unaccountable. No one will understand their logic or be able to explain it. If we don’t wrest back a measure of control, these future WMDs will feel mysterious and powerful. They’ll have their way with us, and we’ll barely know it is happening. [173]

In the current insurance system (at least as far as the US is concerned), the auto insurers’ tracking systems, which provide insurers with more information and enable them to create more powerful predictions, are opt-in. Only those willing to be tracked have to turn on their black boxes. Those who do turn them on get rewarded with discounts, while the rest subsidize those discounts with higher rates. Insurers who squeeze the most intelligence out of this information, turning it into profits, will come out on top. This, unfortunately, undermines the whole idea of the collectivization of risk on which insurance systems are based. The more insurers benefit from such data, the more of it they demand, gradually making trackers the norm. Consumers who want to withhold all but the essential information from their insurers will pay a premium. Privacy, increasingly, will come at a premium cost. A recently approved US bill illustrates just that: it would expand the reach of “Wellness Programs” to include genetic screening of employees and their dependents and increase the financial penalties for those who choose not to participate.

Being poor in a world of WMDs is getting more and more dangerous and expensive. Even privacy is increasingly becoming a luxury that only the wealthy can afford. In what O’Neil calls the ‘data economy’, where artificial intelligence systems are hungry for our data, we are left with few options but to produce and share as much data about our lives as possible. In the process, we are (implicitly or explicitly) coerced into self-monitoring and self-discipline. We are pressured into conforming to the ideal bodies and “normal” health statuses dictated by the organizations and institutions that manage our social relations, such as our health insurers. Raley (2013) refers to this as dataveillance: a form of continuous surveillance through the use of (meta)data. The ever-growing flow of data, including data pouring in from the Internet of Things (the Fitbits, Apple Watches, and other sensors that relay updates on how our bodies are functioning), continues to feed this “dataveillance”.

One might argue that helping people deal with their weight and health issues isn’t such a bad thing, and that would be a reasonable argument. The key question here, however, as O’Neil points out, is whether this is an offer or a command. Using flawed statistics like the BMI, which O’Neil calls “mathematical snake oil”, corporations dictate what ideal health and the ideal body look like. They infringe on our freedom as they mould our health and body ideals, punishing those they don’t like to look at and rewarding those who fit their ideals. Such exploitation is disguised as science and legitimized through seemingly scientific numerical scores such as the BMI. The BMI, a person’s weight in kilograms divided by the square of their height in metres (kg/m²), is only a crude numerical proxy for physical fitness. And since the “average” man underpins its statistical scores, it is more likely to conclude that women are “overweight”; after all, we are not “average” men. Even worse, black women, who often have higher BMIs, pay the heaviest penalties.
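
To see how little the score actually encodes, here is a minimal Python sketch of the BMI arithmetic, binned with the standard WHO cut-offs; the example person is invented.

```python
# Minimal sketch of the BMI arithmetic described above: weight (kg) divided by
# height (m) squared, then binned into the conventional categories. The point
# is how little information the number carries -- it says nothing about muscle
# mass, body composition, or health, yet it is routinely used as if it did.

def bmi(weight_kg: float, height_m: float) -> float:
    return weight_kg / height_m ** 2

def category(score: float) -> str:
    # Standard WHO cut-offs: one set of thresholds applied to everyone,
    # regardless of sex, build, or ethnicity.
    if score < 18.5:
        return "underweight"
    if score < 25:
        return "normal"
    if score < 30:
        return "overweight"
    return "obese"

# A muscular 90 kg, 1.80 m athlete lands in the same "overweight" bin as
# someone of the same height and weight with very little muscle at all.
score = bmi(90, 1.80)
print(f"BMI = {score:.1f} -> {category(score)}")  # BMI = 27.8 -> overweight
```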

The control of vast amounts of data and the race to build powerful algorithms is a fight for political power. O’Neil’s breathtakingly critical look at corporations like Facebook, Apple, Google, and Amazon illustrates this. Although these powerful corporations are usually focused on making money, their profits are tightly linked to government policies, which makes the issue an essentially political one.

These corporations hold significant power and a great amount of information about humanity, and with that, the means to steer us in whatever direction they choose. The activity of a single Facebook algorithm on Election Day could not only change the balance of Congress but potentially decide the presidency. When you scroll through your Facebook updates, what appears on your screen is anything but neutral: your newsfeed is censored. Facebook’s algorithms decide whether you see bombed Palestinians or mourning Israelis, a policeman rescuing a baby or battling a protester. One might argue that television news has always done the same and that this is nothing new. CNN, for example, chooses to cover a certain story from a certain perspective, in a certain way. The crucial difference is that, with CNN, the editorial decision is on the record. We can point to the individuals responsible and accountable for any given decision, and the public can debate whether that decision was the right one. Facebook, on the other hand, as O’Neil puts it, is more like the “Wizard of Oz”: we do not see the human beings involved. With its enormous power, Facebook can affect what we learn, how we feel, and whether we vote, and we are barely aware of any of it. What we know about Facebook, like the other internet giants, comes mostly from the tiny proportion of their research that they choose to publish.

In a society where money buys influence, the victims of WMDs are nearly voiceless. Most are politically disenfranchised. The poor are hit the hardest and all too often blamed for their poverty, their bad schools, and the crime that afflicts their neighbourhoods. They largely lack the economic power, access to lawyers, and well-funded political organizations needed to fight their battles. From bringing down minorities’ credit scores to entrenching sexism in the workplace, WMDs serve as tools of exclusion. The result is widespread damage that all too often passes for inevitability.

Again, it is easy to point out that injustice, whether born of bias or greed, has been with us forever, and that WMDs are no worse than the human nastiness of the recent past. As with the examples above, the difference lies in transparency and accountability. Human decision making has one chief virtue: it can evolve. As we learn and adapt, we change. Automated systems, especially those O’Neil classifies as WMDs, by contrast, stay stuck in time until engineers dive in to change them.

If Big Data college application model had established itself in the early 1960s, we still wouldn’t have many women going to college, because it would have been trained largely on successful men. [204]
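
As a purely hypothetical illustration of that point, the Python sketch below “trains” a toy admissions model on invented, 1960s-style data. It has nothing to do with any real admissions system, but it shows how a model fitted to a biased history simply replays that history until someone deliberately retrains it.

```python
# Hypothetical sketch of the training-bias problem in the quote above: a toy
# "admissions model" that merely learns historical admission rates per group
# will faithfully reproduce whatever bias the historical record contains.
# All data are invented for illustration.
from collections import defaultdict

# Invented 1960s-style training data: (gender, admitted)
history = (
    [("male", True)] * 95 + [("male", False)] * 5
    + [("female", True)] * 1 + [("female", False)] * 99
)

def fit(records):
    admits, totals = defaultdict(int), defaultdict(int)
    for gender, admitted in records:
        totals[gender] += 1
        admits[gender] += admitted
    return {g: admits[g] / totals[g] for g in totals}

model = fit(history)
print(model)  # {'male': 0.95, 'female': 0.01}
# New applicants scored with this "model" inherit the old bias wholesale,
# and the system stays stuck there until engineers change it.
```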

Rest assured, the book is not all doom and gloom, nor does it claim that all mathematical models are biased and unfair. In fact, O’Neil provides plenty of examples of models used for good and of models that have the potential to be.

Whether a model becomes a tool to help the vulnerable or a weapon to inflict injustice, as O’Neil emphasizes time and time again, comes down to its central objectives. Mathematical models can sift through data to locate people who are likely to face challenges, whether through crime, poverty, or poor schooling. The objectives adopted dictate whether such intelligence is used to reject and punish those who are already vulnerable or to reach out to them with the resources they need. So long as the objective remains maximizing profit, excluding as many applicants as possible, or locking up as many offenders as possible, these models serve as weapons that deepen inequality and unfairness. Change that objective from leeching off people to reaching out to them, and a WMD is disarmed; it can even become a force for good. The process begins with the modellers themselves. Like doctors, data scientists should pledge a Hippocratic Oath, one that focuses on the possible misuse and misinterpretation of their models. Additionally, initiatives such as the Algorithmic Justice League, which aims to raise awareness of algorithmic bias, provide a space for individuals to report such biases.

Opacity is a common feature of WMDs. People have been dismissed from work, sent to prison, or denied loans on the basis of algorithmic scores, with no explanation of how or why. The more aware we are of this opacity, the better our chances of demanding transparency and accountability, and that begins by acquainting ourselves with the work of experts like O’Neil. This is not a book only for those working in data science, machine learning or related fields, but one that everyone needs to read. If you are a modeller, it should encourage you to zoom out, to ask whether there are individuals behind the data points your algorithms manipulate, and to confront the big questions, such as the objectives behind your code. Almost everyone, to a greater or lesser extent, is part of the growing ‘data economy’. The more awareness there is of the dark side of these machines, the better equipped we are to ask questions and to demand answers from those behind the machines that decide our fate.