Abeba Birhane is currently a PhD candidate in cognitive science at University College Dublin in the School of Computer Science. She studies the dynamic and reciprocal relationships between emerging technologies, personhood and society. Specifically, she explores how ubiquitous technologies which are interwoven into our personal, social, political, and economic spheres are shaping what it means to be a person. In doing so, she leans on theoretical frameworks from traditions such as embodied cognitive science, dialogism, complexity science, critical data studies and philosophy of technology.
The second annual CyFyAfrica 2019, The Conference on Technology, Innovation, and Society, took place in Tangier, Morocco, 7 – 9 June 2019. It was a vibrant, diverse and dynamic gathering attended by policy makers, UN delegates, ministers, governments, diplomats, media, tech corporations, and academics from over 65 nations, mostly African and Asian countries. The conference unapologetically stated that its central aim is to bring the continent’s voices to the table in the global discourse. The president’s opening message emphasised that Africa’s youth need to be put front and centre of African concerns as the continent increasingly relies on technology for its social, educational, health, economic and financial issues. At the heart of the conference was the need to provide a platform for the voices of young people across the continent. And this was rightly so. It needs no argument that Africans across the continent need to play a central role in determining crucial technological questions and answers, not only for their continent but also far beyond.
In the race to make the continent teched-up, there are numerous cautionary tales that the continent needs to learn from. Otherwise we run the risk of repeating them and the cost of doing so is too high. To that effect, this piece outlines three major lessons that those involved in designing, implementing, importing, regulating, and communicating technology need to be aware of.
The continent stands to benefit from various technological and Artificial Intelligence (AI) developments. Ethiopian farmers, for example, can benefit from crowdsourced data to forecast and produce better crop yields. The use of data can help improve services within the health care and education sectors. The country’s huge gender inequalities, which plague every social, political and economic sphere, can be brought to the fore through data. Data that exposes gender disparities in key positions, for example, underscores the need to change societal structures in order to allow women to serve in those positions. Such data also brings general awareness of inequalities, which is central for progressive change.
Having said that, this is not what I want to discuss here. There already exist countless die-hard technology worshipers, both within and outside the continent, some only too happy to blindly adopt anything “data-driven” and “AI” without a second thought for the possible unintended consequences. Wherever technological innovation is discussed, what we constantly find is advocates of technology and attempts to digitise every aspect of life, often at any cost.
In fact, if most of the views put forward by various ministers, tech developers, policy makers and academics at the CyFyAfrica 2019 conference are anything to go by, we have plenty of such tech evangelists – blindly accepting ethically suspect and dangerous practices and applications under the banner of “innovative”, “disruptive” and “game changing” with little, if any, criticism and scepticism. Therefore, given that we have enough tech-worshipers holding the technological future of the continent in their hands, it is important to point out the cautions that need to be taken and the lessons that need to be learned from other parts of the world as the continent pushes ahead in the technological race.
Just like Silicon Valley enterprises, the African equivalents of tech start-ups and “innovations” can be found in every possible sphere of life in any corner of the continent, from Addis Ababa to Nairobi to Abuja to Cape Town. These innovations span areas such as banking, finance, health care and education, and even include “AI for good” initiatives, from companies and individuals both within and outside the continent. Understandably, companies, individuals and initiatives want to solve society’s problems, and data and AI seem to provide quick solutions. As a result, the attempt to fix complex social problems with technology is rife. And this is exactly where problems arise.
In the race over which start-up will build the next smart home system or state-of-the-art mobile banking application, we lose sight of the people behind each data point. The emphasis is on “data” as something that is up for grabs, something that uncontestedly belongs to tech companies, governments, and the industry sector, completely erasing the individual people behind each data point. This erasure makes it easy to “manipulate behaviour” or “nudge” users, often towards profitable outcomes for the companies. The rights of the individual, the long-term social impacts of AI systems and the unintended consequences for the most vulnerable are pushed aside, if they ever enter the discussion at all. Be it small start-ups or more established companies that design and implement AI tools, at the top of their agenda is the collection of more data and more efficient AI systems, not the welfare of individual people or communities. Rather, whether explicitly laid out or not, the central point is to analyse, infer, and deduce “users’” weaknesses and deficiencies and how these can be used to the benefit of commercial firms. Products, ads, and other commodities can then be pushed to individual “users” as if they exist as objects to be manipulated and nudged towards certain behaviours deemed “correct” or “good” by these companies and developers.
The result is AI systems that alter the social fabric, reinforce societal stereotypes and further disadvantage those already at the bottom of the social hierarchy, while we insist that these systems are politically neutral under the guise of “AI” and “data-driven”. UN delegates addressing online terrorism and counterterrorism measures while exclusively discussing Islamic terrorist groups, despite white supremacist terrorist groups carrying out more attacks than any other groups in recent years, illustrate how socially held stereotypes are reinforced and wielded in the AI tools that are being developed.
Although it is hardly ever made explicit, many of the ethical principles underlying AI rest firmly within utilitarian thinking. Even when knowledge of unfairness and discrimination against certain groups and individuals as a result of algorithmic decision-making is brought to the fore, solutions that benefit the majority are sought. For instance, women have been systematically excluded from entering the tech industry, minorities forced into inhumane treatment, and systematic biases have been embedded in predictive policing systems, to mention but a few. However, although society’s most vulnerable are disproportionately impacted by the digitisation of various services, proposed solutions to mitigate unfairness hardly consider such groups as a crucial piece of the solution.
Machine bias and unfairness are issues that the rest of the tech world is grappling with. As technological solutions are increasingly devised and applied to social, economic and political issues, so the problems that arise with the digitisation and automation of everyday life multiply. The current attempts to develop “ethical AI” and “ethical guidelines”, both within the Western tech industry and the academic sphere, illustrate awareness of these problems and attempts to mitigate them. Key global players in technology, Microsoft and Google’s DeepMind from the industry sector and Harvard and MIT from the academic sphere, are primary examples that illustrate the recognition of the possible catastrophic consequences of AI for society. As a result, ethics boards and curricula on ethics and AI are being developed.
These approaches to developing, implementing and teaching responsible and ethical AI take multiple forms, perspectives and directions, and emphasise various aspects. This multiplicity of views and perspectives is not a weakness but rather a desirable strength which is necessary for accommodating a healthy, context-dependent remedy. Insisting on one single framework for the various ethical, social and economic issues that arise in various contexts and cultures with the integration of AI is not only unattainable but also amounts to advocating a one-size-fits-all dictatorship rather than a guideline.
Nonetheless, given the countless technology related disasters and cautionary tales that the global tech-community is waking up to, there are numerous crucial lessons that African developers, start-ups and policy makers can learn from. The African continent need not go through its own disastrous cautionary tales to discover the dark side of digitisation and technologization of every aspect of life.
AI is not magic, and anything that makes it come across as such needs to be disposed of
AI is a buzzword that gets thrown around so carelessly that it has increasingly become vacuous. What AI refers to is notoriously contested and the term is impossible to define conclusively – and it will remain that way due to the various ways various disciplines define and use it. Artificial intelligence can refer to anything from highly overhyped and deceitful robots, to Facebook’s machine learning algorithms that dictate what you see on your News Feed, to your “smart” fridge and everything in between. “Smart”, like AI, has increasingly come to mean devices that are connected to other devices and servers, with little to no attention paid to how such hyperconnectivity at the same time creates surveillance systems that deprive individuals of their privacy.
Over-hyped and exaggerated representation of the current state of the field poses a major challenge. Both researchers within the field and the media contribute to this over-hype. The public is often made to believe that we have reached AGI (Artificial General Intelligence), or that we are at risk of killer robots taking over the world, or that Facebook’s algorithms have created their own language, forcing Facebook to shut down its project, when none of this is in fact correct. The robot known as Sophia is another example of AI over-hype and misrepresentation, one that shows the disastrous consequences of the lack of critical appraisal. This robot, which is best described as a machine with some face recognition capabilities and a rudimentary chatbot engine, is falsely described as semi-sentient by its maker. In a nation where women are treated as second-class citizens, Saudi Arabia granted this machine citizenship, treating the female-gendered machine better than its own female citizens. Similarly, neither the Ethiopian government nor the media attempted to pause and reflect on how the robot’s stay in Addis Ababa should be covered. Instead, the over-hype and deception were amplified as the robot was treated as some God-like entity.
Leading scholars of the field such as Mitchell emphasise that we are far from “superintelligence”. The current state of AI is marked by crucial limitations such as the lack of common-sense understanding, which is a crucial element of human understanding. Similarly, Bigham emphasises that in most of the discussion regarding “autonomous” systems (be it robots or speech recognition algorithms), a heavy load of the work is done by humans, often cheap labour – a fact that is put aside as it doesn’t sit well with the AI over-hype narrative.
Over-hype is not only a problem that portrays an unrealistic image of the field, but also one that distracts attention away from the real danger of AI, which is much more invisible, nuanced and gradual than “killer robots”. The simplification and extraction of human experience for capitalist ends, which is then presented as behaviour-based “personalisation”, is a banal-seeming practice on the surface but one that needs more attention and scrutiny. Similarly, algorithmic predictive models of behaviour that infer habits, behaviours and emotions need to be of concern, as most of their inferences reflect strongly held biases and unfairness rather than getting at any in-depth causes or explanations.
The continent would do well to adopt a dose of critical appraisal when presenting, developing and reporting AI. This requires challenging the mindset that portrays AI as having God-like power, and instead seeing AI as a tool that we create, control and are responsible for, not as something that exists and develops independently of those who create it. And like any other tool, AI is one that embeds and reflects our inconsistencies, limitations, biases, and political and emotional desires, just like a mirror that reflects how society operates – unjust and prejudiced against some individuals and groups.
Technology is never either neutral or objective – it is like a mirror that reflects societal bias, unfairness and injustice
AI tools deployed in various spheres are often presented as objective and value free. In fact, some automated systems in domains such as hiring and policing are put forward with the explicit claim that these tools eliminate human bias. Automated systems, after all, apply the same rules to everybody. Such a claim is in fact one of the single most erroneous and harmful misconceptions as far as automated systems are concerned. As the Harvard-trained mathematician Cathy O’Neil explains, “algorithms are opinions embedded in code”. This widespread misconception further prevents individuals from asking questions and demanding explanations. How we see the world and how we choose to represent the world is reflected in the algorithmic models of the world that we build. The tools we build necessarily embed, reflect and perpetuate socially and culturally held stereotypes and unquestioned assumptions. Any classification, clustering or discrimination of human behaviours and characteristics that our AI systems produce reflects socially and culturally held stereotypes, not an objective truth.
UN delegates working on online counterterrorism measures but explicitly focusing on Islamic groups, despite over 60 percent of mass shootings in the USA in 2019 being carried out by white nationalist extremists, illustrate a worrying example of how stereotypically held views drive what we perceive as a problem and, furthermore, the type of technology we develop.
A robust body of research, as well as countless reports of individual personal experience, illustrates that various applications of algorithmic decision-making result in biased and discriminatory outcomes. These discriminatory outcomes affect individuals and groups who are already on society’s margins, those who are viewed as deviants and outliers – people who refuse to conform to the status quo. Given that the most vulnerable are affected by technology the most, it is important that their voices are central in the design and implementation of any technology that is used on or around them. Their voices need to be prioritised at every step of the way, including in the designing, developing and implementing of any technology, as well as in policy making.
As Africa grapples between catching up with the latest technological developments and protecting against the consequential harm that technology causes, policy makers, governments and firms that develop and apply various tech to the social sphere need to think long and hard about what kind of society we want and what kind of society technology drives. Protecting and respecting the rights, freedoms and privacy of the very youth that the leaders want to put front and centre should be prioritised. This can only happen with guidelines and safeguards for individual rights and freedom in place.
Invasion of privacy and the erosion of humane treatment of the human
AI technologies are gradually being integrated into decision-making processes in every sphere of life, including insurance, banking, health and education services. Various start-ups are emerging from all corners of the continent at an exponential rate to develop the next “cutting edge” app, tool or system; to collect as much data as possible and then infer and deduce “users’” various behaviours and habits. However, there seems to be little, if any, attention paid to the fact that the digitisation and automation of such spheres necessarily bring their own, often not immediately visible, problems. In the race to come up with the next new “nudge” mechanism that could be used in insurance or banking, the competition for mining the most data seems to be the central agenda. These firms take it for granted that such “data”, which is out there for grabs, automatically belongs to them. The discourse around “data mining” and the “data rich continent” shows the extent to which the individual behind each data point remains non-existent. This removal of the individual (an individual with fears, emotions, dreams and hopes) behind each data point is symptomatic of how little attention is given to privacy concerns. This discourse of “mining” people for data is reminiscent of the coloniser attitude that declares humans to be raw material free for the taking.
Data is necessarily always about something and never about an abstract entity. The collection, analysis and manipulation of data potentially entail monitoring, tracking and surveilling people. This necessarily impacts them directly or indirectly, whether through a change in their insurance premiums or a refusal of services.
AI technologies that aid decision making in the social sphere are, for the most part, developed and implemented by the private sector and various start-ups, whose primary aim is to maximise profit. Protecting individual privacy rights and cultivating a fair society is therefore at the bottom of their agenda, especially if such practice gets in the way of “mining”, freely manipulating behaviour and pushing products onto customers. This means that, as we hand over decision making regarding social issues to automated systems developed by profit-driven corporations, not only are we allowing our social concerns to be dictated by corporate incentives (profit), but we are also handing over moral questions to the corporate world. “Digital nudges”, behaviour modifications developed to suit commercial interests, are a prime example. As “nudging” mechanisms become the norm for “correcting” individuals’ behaviour, eating habits or exercise routines, the corporations, private firms and engineers developing automated systems are bestowed with the power to decide what the “correct” behaviour, eating or exercise habit is. Questions such as who is deciding what the “correct” behaviour is, and for what purpose, are often completely ignored. In the process, individuals who do not fit our stereotypical image of what a “fit body”, “good health” and a “good eating habit” are end up being punished, cast out and pushed further to the margins.
The use of technology within the social sphere often, intentionally or accidentally, focuses on punitive practices, whether it is to predict who will commit the next crime or who will fail to pay their mortgage. Constructive and rehabilitative questions, such as why people commit crimes in the first place or what can be done to rehabilitate and support those who have come out of prison, are almost never asked. Technological developments built and applied with the aim of bringing security and order necessarily bring cruel, discriminatory and inhumane practices to some. The cruel treatment of the Uighurs in China and the unfair disadvantaging of the poor are examples in this regard.
The question of technologisation and digitalisation of the continent is also a question of what kind of society we want to live in. African youth solving their own problems means deciding what we want to amplify and show the rest of the world. It also means not importing the latest state-of-the-art machine learning systems or any other AI tools without questioning what the underlying purpose is, who benefits, and who might be disadvantaged by the application of such tools. Moreover, African youth playing in the AI field means creating programs and databases that serve various local communities and not blindly importing Western AI systems founded upon individualistic and capitalist drives. In a continent where much of the narrative is hindered by negative images such as migration, drought, and poverty, using AI to solve our problems ourselves means using AI in a way that reflects how we want to understand who we are and how we want to be understood and perceived: a continent where community values triumph and nobody is left behind.
I gave a talk under the above title at the NewCrafts Paris 2019 conference and was asked for the bibliography underlying the content of my talk, so here it is. I have included the abstract below to provide some context. I might also write a blog post sometime in the future, so watch this space. 🙂
Abstract: Imagine a world where we are able to predict people’s behaviour with precision. A world, for example, where we can tell whether someone is going to commit a crime before they do. A lot of our problems would just disappear. The quest for absolute certainty has been at the top of Western science’s agenda. In a similar fashion, current technological developments tend to strive for generalizability and predictability. We value certainty, stability and uniformity, whereas most of reality, instead of being orderly and stable, is seething with change, disorder and process. People, far from being predictable and predetermined, are complex, social and dynamical beings that inherently exist in a web of relations. This talk discusses how absolute certainty is not only an unattainable goal so far as understanding people and the social world is concerned but also a dangerous state to aspire to.
In Defence of Uncertainty – Bibliography
- Amazon scraps secret AI recruiting tool that showed bias against women
- Angwin, J.; Larson, J.; Mattu, S.; and Ajunwa, I., Friedler, S., Scheidegger, C. E., & Venkatasubramanian, S. (2016). Hiring by algorithm: predicting and preventing disparate impact. Available at SSRN.
- Bakhtin, M. M. (2010). The Dialogic Imagination: Four Essays. University of Texas Press.
- Barad, K. (2007). Meeting the universe halfway: Quantum physics and the entanglement of matter and meaning. Duke University Press.
- Baumer, E. P., & Silberman, M. (2011, May). When the implication is not to design (technology). In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems (pp. 2271-2274). ACM.
- Birhane, A. (2017). Descartes was wrong: “a person is a person through other persons”. Aeon Ideas.
- Descartes, R. (2013). René Descartes: Meditations on first philosophy: With selections from the objections and replies. Cambridge University Press.
- Ferryman, K., & Pitcan, M. (2018). Fairness in Precision Medicine. Data & Society.
- Foucault, M. (2002). The order of things: An archaeology of the human sciences. Psychology Press.
- Gonen, H., & Goldberg, Y. (2019). Lipstick on a Pig: Debiasing Methods Cover up Systematic Gender Biases in Word Embeddings But do not Remove Them. arXiv preprint arXiv:1903.03862.
- Google ‘genuinely sorry’ after app labels dark-skinned people as ‘gorillas’ | CBC News
- Holquist, M. (2003). Dialogism: Bakhtin and his world. Routledge.
- Introna, L. D., & Nissenbaum, H. (2000). Shaping the Web: Why the politics of search engines matters. The Information Society, 16(3), 169-185.
- Juarrero, A. (1999). Dynamics in action: Intentional behavior as a complex system (p. 127143). Cambridge, MA: MIT Press.
- Marková, I. (2016). The dialogical mind: Common sense and ethics. Cambridge University Press.
- Maturana, H. R., & Poerksen, B. (2004). From being to doing. The origins of the biology of cognition.
- Mbiti, J. S. (1990). African religions & philosophy. Heinemann.
- Morson, G. S., & Emerson, C. (Eds.). (1989). Rethinking Bakhtin: extensions and challenges. Northwestern University Press.
- New Study Uses Machine Learning to Predict Sexual Orientation
- O’Neil, C. (2016) Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy.
- Online Etymology Dictionary – Root word of science
- Prigogine, I., & Stengers, I. (1985). Order out of Chaos.
- Richardson, R., Schultz, J., & Crawford, K. (2019). Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice. New York University Law Review Online, Forthcoming.
- They’re watching, and they know a crime is about to take place before it happens
- Von Foerster, H. (2007). Understanding understanding: Essays on cybernetics and cognition. Springer Science & Business Media.
- Von Foerster, H., & Poerksen, B. (2002). Understanding systems: Conversations on epistemology and ethics (Vol. 17). Springer.
- Wang, Y., & Kosinski, M. (2018). Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology, 114(2), 246.
- Wilson, B., Hoffman, J., & Morgenstern, J. (2019). Predictive Inequity in Object Detection. arXiv preprint arXiv:1902.11097.
There is no area untouched by data science and computer science. From medicine, to the criminal justice system, to banking & insurance, to social welfare, data driven solutions and automated systems are proposed and developed for various social problems. The fact that computer science is intersecting with various social, cultural and political spheres means leaving the realm of the “purely technical” and dealing with human culture, values, meaning, and questions of morality; questions that need more than technical “solutions”, if they can be solved at all. Critical engagement and ethics are, therefore, imperative to the growing field of computer science.
And the need for ethical and critical engagement is becoming more apparent as not a day goes by without a headline about some catastrophic consequences of careless practices, be it a discriminatory automated hiring system or implementation of a facial recognition system that undermines privacy. With the realization that the subjects of enquiry of data science delve deep into the social, cultural and political, has come the attempt to integrate (well, to at least include) ethics as part of computer science modules (e.g., at Harvard and MIT). The central players in the tech world (e.g., DeepMind) also seem to be moving in the direction of taking ethics seriously.
However, even when the need to integrate ethics and critical thinking into computer science is acknowledged, there are no established frameworks, standards or consensus on how it ought to be done, which is not surprising given that the practice is in its early stages. Oftentimes, the idea of ethics is seen as something that can be added to existing systems (as opposed to a critical and analytical approach that goes beyond and questions current underlying assumptions), or as some sort of checklist (as opposed to an aspiration that students need to adopt as part of normal professional data science practice beyond satisfying module requirements) with a list of dos and don’ts that can be consulted… and ta da! You have ethics!
In this blog, we share our approach to integrating critical thinking and ethics into data science as part of the Data Science in Practice module in UCD, CS. The central aspiration of this class is to stimulate students to think critically, to question taken-for-granted assumptions and to open various questions for discussion. Central to this is the idea of viewing critical thinking and ethics as an important aspect of data scientific practice rather than a list of dos and don’ts that can be taught in class, and of seeing irresponsible and unethical outcomes and practices as things that affect us as individual citizens and shape society for the worse.
The class does not teach some set of ethical foundations that need to be followed, or ethical and unethical ways of doing data science. Rather, we present various ethical issues as open questions for discussion; the class is given current tools and automated systems (PredPol, for example) and is asked to point out possible issues. The class, therefore, is extremely interactive throughout.
The structure of the module (Data Science in Practice) is that students work in pairs on data science projects of their choice. Depending on the type of question the students choose to tackle, some projects require extensive critical and ethical reflection, while others less so. Nonetheless, all the projects are required to include an “Ethical Considerations” section in their final report. This section ideally reflects possible ethical issues that they came across while working on their chosen project and the ways they mitigated such issues, as well as issues that could be anticipated as emerging from the work but that could be out of their control.
At the start of the module we have a general Critical Thinking and Data Ethics three-hour long lecture. The content is described below for those interested. Given that it is a data science module, the first half of the session thematically raises data related ethical questions and critical reflection while during the second half, the theme is ethics and AI, specifically, automated systems.
There are countless ways to approach this, a vast amount of material to include and many ways to design, frame and direct the conversation. Our specific way is simply one of them. It fits the module, the department, and the students’ backgrounds and aligns with the module aims and expected outcomes. These factors are likely to differ in other institutes and modules. If you find this helpful, that’s great. If not, we hope that this blogpost provides you with some food for thought.
The central content is thematized in two parts as follows. Read along if you are interested in the details. You can also email Abeba.email@example.com if you would like the slides.
Looking back: hidden history of data
Unquestioned assumptions and (mis)understandings of the nature of data: a critical look
- Data reflect objective truth:
- Data exist in a historical, societal, and cultural vacuum:
- The data scientist is often invisible in data science:
Correlation vs causation
Bias: automated systems
Data for Good
The first challenge is to establish why critical thinking and data ethics are important for data scientists. This is the second year that this module is running, and one of the lessons learned from last year is that not everybody is on board from the get-go with the need for critical thinking and data ethics in data science. Therefore, although it might seem obvious, it is important to try to get everyone on board before jumping in. This is essential for a productive discussion. The students are likely to engage and have an interest if they first and foremost see why it is important. Examples of previous and current data science related disasters (Cambridge Analytica and Facebook, for example), the fact that other major computer science departments are doing it, and the fact that “Ethical Considerations” need to be included in the students’ final reports serve to get them on board.
Looking back: hidden history of data
With the convincing out of the way, a brief revisit of the dark history of data and the sciences provides a vivid and gruesome example of the use and abuse of the most vulnerable members of society in the name of data for medical advancements. Nazi-era medical experiments serve as primary examples. Between 1933 and 1945, German anatomists across 31 departments accepted the bodies of thousands of people killed by the Hitler regime. These bodies were dissected in order to study anatomy. The (in)famous Hermann Stieve (1886 – 1952) obtained his research “material”, i.e. people the Nazis sentenced to death for minor crimes such as looting, from Plötzensee Prison. Stieve’s medical practices are among the most ethically harrowing. However, he is also seen as “a great anatomist who revolutionized gynaecology through his clinical-anatomical research.”
The question remains: how should we view research that is scientifically valuable but morally disturbing? This question elicits a great deal of discussion in class.
Ethical red flags and horrifying consequences are much more visible and relatively immediate in medical anatomy research, whereas with data and data-driven automated decision making, the effects and consequences are much more nuanced and invisible.
At this point, another open question is posed to the class: What makes identifying and mitigating ethical red flags in data science much more difficult and nuanced than in the medical sciences?
Unquestioned assumptions and (mis)understandings of the nature of data: a critical look
Data reflect objective truth:
The default thinking within data and computer science tends to assume that data are automatically objective. This persistent misconception, that data are an objective form of information that we simply find “out there”, observe and record, obscures the fact that data can be as subjective as the humans finding and recording them. Far from reflecting objective truth, data are political, messy, often incomplete, sometimes fake, and full of complex human meanings. Gitelman’s (2013) book “Raw Data is an Oxymoron” is an excellent resource in this regard. “Data is anything but ‘raw’… we shouldn’t think of data as a natural resource but as a cultural one that needs to be generated, protected, and interpreted.”
Crawford also concisely summarizes the problem with the view of data as reflection of objective truth:
“Data and data sets are not objective; they are creations of human design. We give numbers their voice, draw inferences from them, and define their meaning through our interpretations. Hidden biases in both the collection and analysis stages present considerable risks, and are as important to the big-data equation as the numbers themselves.”
At almost every step of the data science process, the data scientist makes assumptions and takes decisions based on them. Consequently, any results that emerge are fundamentally shaped by these assumptions, which might be reasonable or might not. This means that data scientists must be transparent about their assumptions. The problem is that, oftentimes, data scientists are not clear about their assumptions, or do not think about them at all.
Data exist in a historical, societal, and cultural vacuum:
Far from reflecting objective truth, data often reflect historical inequalities, norms and cultural practices. Our code then picks up these inequalities and norms, takes them as the “ground truth”, and amplifies them. As a result, women getting paid less than men, for example, comes to be taken as the norm and gets amplified by algorithmic systems. Amazon’s recently discontinued hiring AI is a primary example. The training data for Amazon’s hiring algorithm was historical data: CVs submitted to the company over the previous 10 years. In this case, previous success is taken as an indication of future success. In the process, CVs that didn’t fit the criteria for past success (those of women) were eliminated from consideration. This type of predictive system works under the assumption that the future looks like the past. However, this is problematic, as people and societies change over time. Algorithmic decision making like Amazon’s creates an algorithm-driven determinism in which people are deprived of the chance to be the exception to the rule.
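To make the mechanism concrete, here is a minimal sketch in Python. It is not Amazon’s system, whose details are not public. The records in `HISTORY`, the functions `fit_hire_rates` and `score`, and the use of a single gender field are all invented for illustration. The “model” simply learns past hire rates per group and reuses them as predictions, which is the “future looks like the past” assumption in its barest form.

```python
from collections import defaultdict

# Hypothetical historical records: (gender, years_of_experience, was_hired).
# The data are invented; past decisions here favoured men at equal experience.
HISTORY = [
    ("m", 5, True), ("m", 3, True), ("m", 2, False), ("m", 6, True),
    ("f", 5, False), ("f", 6, True), ("f", 3, False), ("f", 4, False),
]


def fit_hire_rates(records):
    """Learn the past hire rate for each group. This is the whole 'model'."""
    hired = defaultdict(int)
    total = defaultdict(int)
    for gender, _experience, was_hired in records:
        total[gender] += 1
        hired[gender] += int(was_hired)
    return {group: hired[group] / total[group] for group in total}


def score(model, gender):
    """Score a new applicant by the past success rate of people 'like' them."""
    return model[gender]


model = fit_hire_rates(HISTORY)
print(score(model, "m"), score(model, "f"))
```

Two new applicants with identical experience would receive different scores here purely because of how earlier applicants in their group were treated. Nothing about the individual can make them the exception to the rule.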
The data scientist is often invisible in data science:
Due to the “view from nowhere” approach that most of data science operates under, the data scientist is almost always absent from data science. With the spotlight on the data scientist, we can examine how there are always people making important choices, including which data to include and, by default, which to exclude; how data are weighed, analysed, manipulated and reported; and how good “good enough” is for an algorithm’s performance.
And people are biased. We form stereotypes and rely on them as shortcuts in our day-to-day lives. For example, a CV with a white-sounding name will receive a different (more positive) response than the same CV with a black-sounding name. Women’s chances of being hired by symphony orchestras increase by between 30% and 55% in blind auditions (Goldin & Rouse, 1997). We can only see things from our own point of view, which is entangled in our history and our cultural and social norms. Putting the spotlight on the data scientist allows us to acknowledge the personal motivations, beliefs, values, and biases that directly or indirectly shape our scientific practices.
Correlation vs causation
The statement that “correlation is not causation” might seem obvious, but it is one that most of us need to be regularly reminded of. In this regard, Bradford Hill’s criteria of causation are a helpful framework to look at. Hill’s 9 principles, the minimal conditions needed to establish a causal relationship, were originally developed as a research tool in the medical sciences. However, they are equally applicable and relevant to data scientists. Hill’s 9 principles are: strength, consistency, specificity, temporality, dose effect, plausibility, coherence, experiment, and analogy. The more criteria that are met, the more likely the relationship is causal. xkcd.com provides witty and entertaining comics for each of Hill’s criteria for causation.
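A small simulation can make the point tangible. The sketch below uses the classic textbook illustration of ice cream sales and drownings, both driven by temperature. All numbers are invented, and the helper functions `pearson` and `residuals` are written here for illustration rather than taken from any library. Two variables that never influence each other still correlate strongly, and the correlation largely vanishes once the common cause is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """Remove the part of ys that a least-squares line on xs explains."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(xs, ys)]


rng = random.Random(0)  # fixed seed so the run is repeatable

# Temperature is the common cause; the two outcomes never touch each other.
temperature = [rng.uniform(10, 35) for _ in range(2000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 2) for t in temperature]

raw = pearson(ice_cream, drownings)
adjusted = pearson(residuals(ice_cream, temperature),
                   residuals(drownings, temperature))
print(f"raw correlation: {raw:.2f}, after adjusting for temperature: {adjusted:.2f}")
```

In real data the confounder is rarely known or measured this neatly, which is why a checklist such as Hill’s is useful: a strong association alone satisfies only one of the nine criteria.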
Ethical questions inevitably arise with all innovation. Unfortunately, they are often an afterthought rather than something anticipated and mitigated. As data scientists, the questions that we are trying to answer implicitly or explicitly intersect with the social, medical, political, psychological, and legal spheres. Ethical and responsible practices are not only personally expected but also legally required. To this end, awareness of and compliance with GDPR is crucial when collecting, storing, and processing personal data. Students working with personal data are directed to the university’s GDPR Road Map.
Having said that, GDPR can only serve as a framework and is not a final answer that provides a clear black and white solution. We cannot comprehend what our data will reveal in conjunction with other data. Furthermore, privacy is not something we can always negotiate person by person but rather something that we need to look at as a whole network. Look no further than the Strava debacle.
This is a murky and complex area and the idea is not to equip the students with the fine grained details of privacy or GDPR but rather to raise awareness.
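The point about data revealing more in conjunction with other data can be illustrated with a toy linkage example. The sketch below is hypothetical throughout: the records, the names, the field names and the `link` function are invented. It shows how a dataset with names removed can be re-identified by joining it to a public dataset on a few shared attributes.

```python
# A "de-identified" dataset: names removed, but some attributes remain.
ANONYMISED = [
    {"postcode": "D04", "birth_year": 1985, "sex": "f", "diagnosis": "asthma"},
    {"postcode": "D04", "birth_year": 1990, "sex": "m", "diagnosis": "diabetes"},
    {"postcode": "D08", "birth_year": 1985, "sex": "f", "diagnosis": "migraine"},
]

# A separate, public dataset that carries names and the same attributes.
PUBLIC = [
    {"name": "Alice Example", "postcode": "D04", "birth_year": 1985, "sex": "f"},
    {"name": "Bob Example", "postcode": "D04", "birth_year": 1990, "sex": "m"},
    {"name": "Carol Example", "postcode": "D08", "birth_year": 1985, "sex": "f"},
    {"name": "Dana Example", "postcode": "D08", "birth_year": 1985, "sex": "f"},
]

QUASI_IDENTIFIERS = ("postcode", "birth_year", "sex")


def link(anonymised, public, keys=QUASI_IDENTIFIERS):
    """Return (name, diagnosis) pairs where the shared attributes match exactly one person."""
    index = {}
    for person in public:
        index.setdefault(tuple(person[k] for k in keys), []).append(person["name"])

    matches = []
    for record in anonymised:
        candidates = index.get(tuple(record[k] for k in keys), [])
        if len(candidates) == 1:  # a unique match re-identifies the record
            matches.append((candidates[0], record["diagnosis"]))
    return matches


reidentified = link(ANONYMISED, PUBLIC)
print(reidentified)
```

Two of the three “anonymous” records are tied to a named person. The third is protected only because two people happen to share the same attributes, which is a matter of luck and not of design.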
Bias: automated systems
Can we solve problems stemming from human bias by turning decisions over to machines? In theory, as more and more decisions are handled by algorithms, human biases and prejudices should be eliminated. Algorithms are, after all, “neutral” and “objective”. They apply the same rules to everybody regardless of race, gender, ethnicity or ability. The reality, however, is far from this. As O’Neil points out, automated systems only give us the illusion of neutrality. Case after case has demonstrated that automated systems can, in fact, become tools that perpetuate and amplify inequalities and unfairness. Examples include recidivism algorithms and hiring algorithms.
Decisions delivered from automated systems may not be as grave and immediate if these systems are recommending what books we might like to buy next based on our previous purchase. However, the stakes are much higher when automated systems are diagnosing illness or holding sway over a person’s job application or prison sentence.
O’Neil makes a powerful argument that the objectives of a mathematical model determine whether the model becomes a force for good or a tool that wields and perpetuates existing and historical bias. Automated systems, which are often developed by commercial firms for profit, often strive to optimize for efficiency and profit, which comes at the cost of fairness. Take the US prison system, for example. Questions such as how the prison system can be improved are almost never considered. Instead the goal seems to be to lock as many people away as possible. Consequently, algorithmic systems within the prison system strive to flag and lock away people who are deemed likely to reoffend.
The stream of data we produce serves as a source of insights into our lives and behaviours. Instead of testing whether these insights and intuitions stand up to scientific scrutiny, the data we produce are used to justify the modellers’ intuitions and to reinforce pre-existing assumptions and prejudice. And the feedback loop goes on. Once again, associations are taken as evidence to justify pre-existing assumptions.
Algorithmic systems increasingly present new ways to sort, profile, exclude and discriminate within the complex social world. The opaque nature of these systems means that we don’t know things have gone wrong until a large number of people, often society’s most vulnerable, are affected. We therefore need to anticipate all possible consequences of our work before things go wrong. As we’ve seen in previous examples, algorithmic decision making is increasingly intersecting with the social sphere, blurring the boundaries between technology and society, public and private. For data scientists working to solve society’s problems, understanding these complex and fuzzy boundaries, and the cultural, historical and social context of our data and algorithmic tools, is crucial.
In domains such as medicine and psychology, where work has a direct or indirect impact on the individual person or society, ethical frameworks are often in place. Ethics is an integral part of medical training, for example. Physicians are held to specific ethical standards through the practice of swearing the Hippocratic Oath and through various medical ethics boards.
At this stage another question is put forward to the class: Given that data scientists, like physicians, work to solve society’s problems, influencing it in the process, should data scientists then be held to the same standard as physicians?
Data for Good
Most of the content of this lecture contains either cautionary tales or warnings, which at times might dishearten students. This year we have added a section on “Data for good” towards the end. This helps conclude the course with a bit of a positive note by illustrating how data science is being used for social good.
The Geena Davis Institute, in collaboration with Google.org, is using data to identify gender bias within the film industry. They analysed the 100 highest grossing (US domestic) live-action films from 2014, 2015, and 2016. The findings show that men are seen and heard nearly twice as often as women. Such work is crucial for raising awareness of the blind spots in media and encourages storytellers to be inclusive.
“Data helps us understand what it is we need to encourage creators to do. Small things can have a huge impact.” Geena Davis, Academy Award-winning actor, founder and chair of the Geena Davis Institute on Gender in Media
Similarly, the Troll Patrol project by Amnesty International and Element AI studied online abuse against women. They surveyed millions of tweets received by 778 journalists and politicians from the UK and US throughout 2017. They also commissioned online polling of women in 8 countries about their experiences of abuse on social media. Over 6,500 volunteers from around the world took part, analysing 288,000 tweets to create a labelled dataset of abusive or problematic content. The findings show that 7.1% of tweets sent to the women in the study were “problematic” or “abusive”. This amounts to 1.1 million tweets mentioning 778 women across the year, or one every 30 seconds. Furthermore, women of colour (black, Asian, Latinx and mixed-race women) were 34% more likely to be mentioned in abusive or problematic tweets than white women. Black women were disproportionately targeted, being 84% more likely than white women to be mentioned in abusive or problematic tweets.
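As a quick sanity check on the headline figure, the “one every 30 seconds” claim can be reproduced from the numbers quoted above. The snippet below is only back-of-the-envelope arithmetic. The implied total number of tweets is my own extrapolation from the 7.1% figure, not a number reported by the study.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 2017 was not a leap year

abusive_tweets = 1_100_000   # abusive or problematic tweets across the year
abusive_share = 0.071        # 7.1% of tweets sent to the women in the study

# Average gap between two abusive or problematic tweets, in seconds.
seconds_between = SECONDS_PER_YEAR / abusive_tweets

# Extrapolated total volume of tweets, assuming the 7.1% applies to all of them.
implied_total = abusive_tweets / abusive_share

print(f"one abusive tweet every {seconds_between:.1f} seconds")
print(f"implied total tweets: {implied_total:,.0f}")
```

The gap comes out slightly under 30 seconds, and the implied total is in the millions, which is consistent with the description of the study above.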
This sums up how we currently organize the content and the kind of tone we are aiming for. The ambition is a fundamental rethinking of taken-for-granted assumptions, and to think of ethical data science in a broader sense, as work that potentially affects society, rather than simply as “not using personal information for research”. Whether we succeed or not is a different matter that remains to be seen. Furthermore, this is a fast-moving field where new impacts of technology as well as new ways of thinking about ethics are continually emerging. Taking this into account, along with the fact that we are continually incorporating student feedback and what has (not) worked, next year’s content could possibly look slightly different.
Multidisciplinary = the fruit bowl (single disciplines brought together); Interdisciplinary = a fruit salad (combine disciplines together for one output); Transdisciplinary = the smoothie (disciplines transformed-new). (EU EnRRICH project)
A young fruit enthusiast wanted to make a fruit salad. Seeing that so many different fruit suppliers bring all sorts of fruit to her fruit bar, and many customers in return buy individual fruits, she thought she’d make something that each fruit supplier doesn’t produce by combining their supplies – a fruit salad. Besides, there seems to be a great deal of excitement over this new mixing of various fruits and everybody seems to want and encourage it.
Having sampled many different fruits over the years, the fruit salad maker decided it is a good use of her time and expertise to get into the fruit salad making business. She decided on mango, kiwi and pineapple as her fruits of choice that would make her signature fruit salad. They blend very well, they are grown locally, and they complement one another. When mixed, they not only produce an excellent taste, but they are also very appetizing to look at. Most mango, kiwi and pineapple lovers should be able to appreciate and enjoy them, the fruit salad maker thought and she started the process of combining her fruits.
“Not so fast”, came along the fruit gatekeepers. “We need to first see that your tastes for fruits, ability to make fruit salad, and knowledge of each fruit is sufficient before we allow you to open this fruit bar”. Well, it’s legally required that a fruit bar is certified after all. And on the positive side, this certificate would signify a much-needed validation and boost from the fruit community.
Not being able to open her fruit bar without the recognition required and the seal of approval, the fruit salad maker embarked on the process of fulfilling the necessary requirements to pass the necessary tests. She compiled a convincing argument for the need for fruit salads, her knowledge of three fruits, and most importantly for her personal skills and passion for mixing fruits. She demonstrated how her fruits of choice go well together, why they should be made into fruit salad and how much her customers would benefit from such combination.
She then produced the first plate of fruit salad and put it in front of the fruit gatekeepers. “I love the idea of fruit salads. We are all stuck in our special fruit echo chambers. We should all try fruit salads and appreciate those that actually make colourful fruit salads”, said the mango gatekeeper. He then tasted a big mouthful of the fruit salad before him. “It needs more mango”, he said. “I also recommend you study the history of mango production and the fine-grained detail of the biochemistry of mango to make your fruit salad better. I am afraid I can’t let you past my gate until then”, he added.
The kiwi gatekeeper, who also confessed how much he loves fruit salads, followed and had a mouthful of the fruit salad in front of him. Like his colleague the mango gatekeeper, the kiwi gatekeeper seemed to be solely concerned with the kiwi part of the fruit salad, not the whole combination. “Salt would really complement the kiwis, add a pinch to bring out the flavour more. In order for me to recognize that you have used kiwi in your fruit salad, you need a lot more kiwi in your fruit salad,” he commented. “Plus, I don’t recognize the breed of kiwi that you’re using. I will give you a list of good kiwis you need to use. Until the kiwi is right, I am afraid it is my duty to not let you pass my gate. Better luck next time,” he added.
Lastly, the pineapple gatekeeper scooped a spoonful of the fruit salad and tried it. “I also love the idea of fruit salads but I have to tell you that this is not how we slice pineapples over at the pineapple empire. We also marinate them in our special sauce. Your pineapples lack both. You really need to know your pineapple inside out if you are to call yourself a fruit salad maker at all. Plus, I see very little pineapple on this plate. So, get the special sauce from our empire and cut your pineapples our way. Only then can we give you our approval,” she exclaimed.
The fruit salad maker, unestablished and with much less power than the gatekeepers, felt disheartened. She tried to point out that each gatekeeper needs to look at the dish as a whole instead of focusing on each specific fruit. And, surely, the single fruit bars don’t go through as much scrutiny. Unfortunately, questioning the individual fruit experts didn’t do her any favours – they have been in their respective fruit business for much longer than she has and must surely know what they are doing. Who’s she to question their domain expertise?!
At times, what they were demanding seemed self-serving and mutually incommensurable. But then again, given that this was her first big attempt at making a fruit salad, she suffered from too much self-doubt to argue with their demands. Either way, if she is to get that business going, she needs each gatekeeper’s seal of approval. She went ahead and attempted to make the type of fruit salad that would satisfy each gatekeeper, with plenty of mango, huge helpings of ripe kiwi and custom sliced pineapples.
At the next round of testing, the fruit salad maker revised the plate in a manner that reflected the advice previously provided by the gatekeepers. Unfortunately, they unanimously agreed that the plate was overloaded with fruit, unhealthy and unattractive to look at. “All the excess fruit must be trimmed away,” they declared. “This is a health hazard and we cannot approve of such a dish. Think about how to make it neater, healthier and more attractive and come back to us with your improved fruit salad. We will then discuss the matter and perhaps let you through our gate,” they said.
After many attempts to satisfy each of the gatekeepers’ versions of a perfect fruit salad, the fruit salad maker is back to square one. She’s caught in a recursive loop. Each fruit connoisseur, expert on their own fruit, seems to underappreciate the taste and benefit of the fruit mix before them. Putting individual fruit experts together doesn’t necessarily make a fruit salad judge, after all.
Having gone through a number of time-consuming rounds of making fruit salads and the bureaucratic paperwork associated with them, the fruit salad maker wonders if the fruit salad making business is worthwhile at all. Single fruit dealings, the dominant mode of doing business, would have been simpler: not as rewarding, for sure, but certainly simpler. But the thing is, once you develop the palate for the unique taste of fruit salads, nothing else will do.
- Ananny, M., & Crawford, K. (2016). Seeing without knowing: Limitations of the transparency ideal and its application to algorithmic accountability. New Media & Society, 1461444816676645.
- Ajunwa, I., Crawford, K., & Schultz, J. (2017). Limitless worker surveillance. Cal. L. Rev., 105, 735.
- Barabas, C., Dinakar, K., Virza, J. I., & Zittrain, J. (2017). Interventions over Predictions: Reframing the Ethical Debate for Actuarial Risk Assessment. arXiv preprint arXiv:1712.08238.
- Barocas, S., & Selbst, A. D. (2016). Big data’s disparate impact.
- Bartlett, A., Lewis, J., Reyes-Galindo, L., & Stephens, N. (2018). The locus of legitimate interpretation in Big Data sciences: Lessons for computational social science from-omic biology and high-energy physics. Big Data & Society, 5(1), 2053951718768831.
- Bolukbasi, T., Chang, K. W., Zou, J. Y., Saligrama, V., & Kalai, A. T. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Advances in Neural Information Processing Systems (pp. 4349-4357).
- Caliskan-Islam, A., Bryson, J. J., & Narayanan, A. (2016). Semantics derived automatically from language corpora necessarily contain human biases. arXiv preprint arXiv:1608.07187.
- Chouldechova, A. (2017). Fair prediction with disparate impact: A study of bias in recidivism prediction instruments. arXiv preprint arXiv:1703.00056.
- Creemers, R. (2018). China’s Social Credit System: An Evolving Practice of Control.
- Datta, A., Sen, S., & Zick, Y. (2016, May). Algorithmic transparency via quantitative input influence: Theory and experiments with learning systems. In Security and Privacy (SP), 2016 IEEE Symposium on (pp. 598-617). IEEE.
- Datta, A., Tschantz, M. C., & Datta, A. (2015). Automated experiments on ad privacy settings. Proceedings on Privacy Enhancing Technologies, 2015(1), 92-112.
- Edwards, L., Martin, L., & Henderson, T. (2018). Employee Surveillance: The Road to Surveillance is Paved with Good Intentions.
- Friedman, B., & Nissenbaum, H. (1996). Bias in computer systems. ACM Transactions on Information Systems (TOIS), 14(3), 330-347
- Jawaheri, H. A., Sabah, M. A., Boshmaf, Y., & Erbad, A. (2018). When A Small Leak Sinks A Great Ship: Deanonymizing Tor Hidden Service Users Through Bitcoin Transactions Analysis. arXiv preprint arXiv:1801.07501.
- Keyes, O. (2018). The misgendering machines: Trans/HCI implications of automatic gender recognition. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), 88.
- Kleinberg, J., & Mullainathan, S. (2018). Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability. arXiv preprint arXiv:1809.04578.
- Kuehlkamp, A., Becker, B., & Bowyer, K. (2017, March). Gender-From-Iris or Gender-From-Mascara?. In Applications of Computer Vision (WACV), 2017 IEEE Winter Conference on (pp. 1151-1159). IEEE.
- Lau, J., Zimmerman, B., & Schaub, F. (2018). Alexa, Are You Listening?: Privacy Perceptions, Concerns and Privacy-seeking Behaviors with Smart Speakers. Proceedings of the ACM on Human-Computer Interaction, 2(CSCW), 102.
- Lipton, Z. C., & Steinhardt, J. (2018). Troubling Trends in Machine Learning Scholarship. arXiv preprint arXiv:1807.03341.
- Miller, T. (2017). Explanation in artificial intelligence: insights from the social sciences. arXiv preprint arXiv:1706.07269.
- Monahan, J., & Skeem, J. L. (2016). Risk assessment in criminal sentencing. Annual Review of Clinical Psychology, 12, 489-513.
- Munoz, C., Smith, M., & Patil, D. (2016). Big data: A report on algorithmic systems, opportunity, and civil rights. Executive Office of the President. The White House.
- Narayanan, A., Huey, J., & Felten, E. W. (2016). A precautionary approach to big data privacy. In Data protection on the move (pp. 357-385). Springer, Dordrecht.
- Raji, I. D., & Buolamwini, J. (2019). Actionable auditing: Investigating the impact of publicly naming biased performance results of commercial AI products. In AAAI/ACM Conf. on AI Ethics and Society.
- Richardson, R., Schultz, J., & Crawford, K. (2019). Dirty Data, Bad Predictions: How Civil Rights Violations Impact Police Data, Predictive Policing Systems, and Justice. New York University Law Review Online, Forthcoming.
- Sadowski, J. (2019). When data is capital: Datafication, accumulation, and extraction. Big Data & Society, 6(1), 2053951718820549.
- Silver, E., & Miller, L. L. (2002). A cautionary note on the use of actuarial risk assessment tools for social control. Crime & Delinquency, 48(1), 138-161.
- Stark, L. (2019). Facial recognition is the plutonium of AI. XRDS: Crossroads, The ACM Magazine for Students, 25(3), 50-55.
- Stark, L., & Hoffmann, A. L. (2018). Data Is The New What?: Popular Metaphors & Professional Ethics in Emerging Data Cultures. Journal of Cultural Analytics.
- Suresh, H., & Guttag, J. V. (2019). A Framework for Understanding Unintended Consequences of Machine Learning. arXiv preprint arXiv:1901.10002.
- van Wynsberghe, A., & Robbins, S. (2018). Critiquing the reasons for making artificial moral agents. Science and Engineering Ethics, 1-17.
- Weinberg, J. (2018). “Know Everything that Can Be Known About Everybody”: The Birth of the Credit Report.
- Wilson, B., Hoffman, J., & Morgenstern, J. (2019). Predictive Inequity in Object Detection. arXiv preprint arXiv:1902.11097.
- Yeung, K. (2017). Algorithmic Regulation: A Critical Interrogation.
- Zafar, M. B., Valera, I., Gomez Rodriguez, M., & Gummadi, K. P. (2017, April). Fairness beyond disparate treatment & disparate impact: Learning classification without disparate mistreatment. In Proceedings of the 26th International Conference on World Wide Web (pp. 1171-1180). International World Wide Web Conferences Steering Committee.
- Zhang, B. H., Lemoine, B., & Mitchell, M. (2018). Mitigating Unwanted Biases with Adversarial Learning. arXiv preprint arXiv:1801.07593.
This list is by no means exhaustive; it gathers work that is relevant to my own, and it is a list I revise and revisit regularly.
Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy by Cathy O’Neil. A great number of the articles on the list below are written by O’Neil. She is also active on Twitter, regularly posting links and interesting critical insights on everything to do with mathematical models and bias. Here is my own review of O’Neil’s book, with plenty of relevant links itself, and here is another excellent review of O’Neil’s book.
We Are Data: Algorithms and the Making of Our Digital Selves (2018) by John Cheney-Lippold.
Below are the first few paragraphs from a review by Daniel Zwi, a lawyer with an interest in human rights and technology. Here is also a link to my Twitter thread, where you can read excerpts that I tweeted as I read the book.
In 2013, a 41-year-old man named Mark Hemmings dialled 999 from his home in Stoke-on-Trent. He pleaded with the operator for an ambulance, telling them that ‘my stomach is in agony’, that ‘I’ve got lumps in my stomach’, that he was vomiting and sweating and felt light-headed. The operator asked a series of questions — ‘have you any diarrhoea or vomiting?’; ‘have you passed a bowel motion that looks black or tarry or red or maroon?’ — before informing him that he did not require an ambulance. Two days later Mr Hemmings was found unconscious on the floor of his flat. He died of gallstones shortly after reaching hospital.
This episode serves as the affective fulcrum of We Are Data: Algorithms and the Making of Our Digital Selves, John Cheney-Lippold’s inquiry into the manner in which algorithms interpret and influence our behaviour. It represents the moment at which the gravity of algorithmic regulation is brought home to the reader. And while it may seem odd to anchor a book about online power dynamics in a home telephone call (that most quaint of communication technologies), the exchange betokens the algorithmic relation par excellence. Mr Hemmings’s answers were used as data inputs, fed into a sausage machine of opaque logical steps (namely, the triaging rules that the operator was bound to apply), on the basis of which he was categorised as undeserving of immediate assistance.
The dispassionate, automated classification of individuals into categories is ubiquitous online. We either divulge our information voluntarily — when we fill out our age and gender on Facebook, for example — or it is hoovered up surreptitiously via cookies (small text files which sit on our computer and transmit information about our browsing activity to advertising networks). Our media preferences, purchases and interlocutors are noted down and used as inputs according to which we are ‘profiled’ — sorted into what Cheney-Lippold calls ‘measureable types’ such as ‘gay conservative’ or ‘white hippy’ — and served with targeted advertisements accordingly.
The challenges to humanity posed by the digital future, the first detailed examination of the unprecedented form of power called “surveillance capitalism,” and the quest by powerful corporations to predict and control our behavior. Shoshana Zuboff’s interdisciplinary breadth and depth enable her to come to grips with the social, political, business, and technological meaning of the changes taking place in our time. We are at a critical juncture in the confrontation between the vast power of giant high-tech companies and government, the hidden economic logic of surveillance capitalism, and the propaganda of machine supremacy that threaten to shape and control human life. Will the brazen new methods of social engineering and behavior modification threaten individual autonomy and democratic rights and introduce extreme new forms of social inequality? Or will the promise of the digital age be one of individual empowerment and democratization?
The Age of Surveillance Capitalism is neither a hand-wringing narrative of danger and decline nor a digital fairy tale. Rather, it offers a deeply reasoned and evocative examination of the contests over the next chapter of capitalism that will decide the meaning of information civilization in the twenty-first century. The stark issue at hand is whether we will be the masters of information and machines or its slaves.
Algorithms of oppression: How search engines reinforce – below is an excerpt from Noble’s book. You can also find another review of Algorithms of Oppression here. Run a Google search for “black girls”—what will you find? “Big Booty” and other sexually explicit terms are likely to come up as top search terms. But, if you type in “white girls,” the results are radically different. The suggested porn sites and un-moderated discussions about “why black women are so sassy” or “why black women are so angry” presents a disturbing portrait of black womanhood in modern society. In Algorithms of Oppression, Safiya Umoja Noble challenges the idea that search engines like Google offer an equal playing field for all forms of ideas, identities, and activities. Data discrimination is a real social problem; Noble argues that the combination of private interests in promoting certain sites, along with the monopoly status of a relatively small number of Internet search engines, leads to a biased set of search algorithms that privilege whiteness and discriminate against people of color, specifically women of color.
Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian and Tom Griffiths. This book is concerned with the workings of the human mind and how computer science can help human decision making. Here is a post by Artem Kaznatcheev on Computational Kindness which might give you a glimpse of some of the issues that the book covers. Here is a long interview with Brian Christian and Tom Griffiths and a TED Talk with Tom Griffiths on The Computer Science of Human Decision Making.
The Black Box Society: The Secret Algorithms That Control Money and Information by Frank Pasquale. You can read the introduction and conclusion chapters of his book here. And here is a good review of Pasquale’s book. You can follow his twitter stream here.
Here is a synopsis: A revealing look at how tech industry bias and blind spots get baked into digital products—and harm us all.
Buying groceries, tracking our health, finding a date: whatever we want to do, odds are that we can now do it online. But few of us ask why all these digital products are designed the way they are. It’s time we change that. Many of the services we rely on are full of oversights, biases, and downright ethical nightmares: Chatbots that harass women. Signup forms that fail anyone who’s not straight. Social media sites that send peppy messages about dead relatives. Algorithms that put more black people behind bars.
Sara Wachter-Boettcher takes an unflinching look at the values, processes, and assumptions that lead to these and other problems. Technically Wrong demystifies the tech industry, leaving those of us on the other side of the screen better prepared to make informed choices about the services we use—and demand more from the companies behind them.
Paula Boddington, Oxford academic and author of Towards a Code of Ethics for Artificial Intelligence, recommends the five best books on Ethics for Artificial Intelligence. Here is the full interview with Nigel Warburton, published on December 1, 2017.
“Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor” by Virginia Eubanks will be released on January 23, 2018. Here is an excerpt from danah boyd’s blog:
“Automating Inequality: How High-Tech Tools Profile, Police, and Punish the Poor” is a deeply researched accounting of how algorithmic tools are integrated into services for welfare, homelessness, and child protection. Eubanks goes deep with the people and families who are targets of these systems, telling their stories and experiences in rich detail. Further, drawing on interviews with social services clients and service providers alongside the information provided by technology vendors and government officials, Eubanks offers a clear portrait of just how algorithmic systems actually play out on the ground, despite all of the hope that goes into their implementation. Additionally, Berkman Klein discusses “Algorithms and their unintended consequences for the poor” with Eubanks here.
The Big Data Agenda: Data Ethics and Critical Data Studies by Annika Richterich. A PDF is available through the link here.
“This book highlights that the capacity for gathering, analysing, and utilising vast amounts of digital (user) data raises significant ethical issues. Annika Richterich provides a systematic contemporary overview of the field of critical data studies that reflects on practices of digital data collection and analysis. The book assesses in detail one big data research area: biomedical studies, focused on epidemiological surveillance. Specific case studies explore how big data have been used in academic work.
The Big Data Agenda concludes that the use of big data in research urgently needs to be considered from the vantage point of ethics and social justice. Drawing upon discourse ethics and critical data studies, Richterich argues that entanglements between big data research and technology/ internet corporations have emerged. In consequence, more opportunities for discussing and negotiating emerging research practices and their implications for societal values are needed.”
Re-Engineering Humanity by professor Evan Selinger and Brett Frischmann
Every day, new warnings emerge about artificial intelligence rebelling against us. All the while, a more immediate dilemma flies under the radar. Have forces been unleashed that are thrusting humanity down an ill-advised path, one that’s increasingly making us behave like simple machines? In this wide-reaching, interdisciplinary book, Brett Frischmann and Evan Selinger examine what’s happening to our lives as society embraces big data, predictive analytics, and smart environments.
Artificial Unintelligence: How Computers Misunderstand the World by Meredith Broussard. A guide to understanding the inner workings and outer limits of technology and why we should never assume that computers always get it right.
“In Artificial Unintelligence, Meredith Broussard argues that our collective enthusiasm for applying computer technology to every aspect of life has resulted in a tremendous amount of poorly designed systems. We are so eager to do everything digitally—hiring, driving, paying bills, even choosing romantic partners—that we have stopped demanding that our technology actually work. Broussard, a software developer and journalist, reminds us that there are fundamental limits to what we can (and should) do with technology. With this book, she offers a guide to understanding the inner workings and outer limits of technology—and issues a warning that we should never assume that computers always get things right.”
I had the privilege of participating in a science communication conference last week (12 December 2018). Some of the speakers beautifully and convincingly articulated the argument for the importance of academics communicating their work to non-academics, as well as to academics from other disciplines, and showed how to do it. Alan Alda’s talk, in particular, was deep, insightful and thought-provoking.
Alda’s “Communication is not something you add to science; it is the essence of science” captures his key message that communication is an essential part of doing science and not something separate and extra. There is very little dispute regarding the importance of sharing one’s work with the general public as well as scientists, and with academics outside one’s field. However, there is very little guidance as to how one ought to go about it. Alda’s talk during the SCI:COM conference in Dublin provided some of the most insightful advice by far that I have come across.
Alda suggests that we talk TO and not AT people. This seemingly obvious but powerful statement is a way of shifting the mindset from “giving a talk” or “delivering a lecture”, which treats knowledge as something that can simply be dispersed, to communication as a two-way shared activity.
Science communication is a reciprocal process that involves both the speaker and the audience. It is vital that the communicator pays attention to the person that they are communicating with. “It is up to you, the communicator, to ensure that the person is following and to bring them onboard.” And this requires understanding your audience. As Alda puts it: “the speaker needs to listen harder than the listener”.
Communication, Alda argues, is not about me figuring out the best message and spraying it at you; it is about building a reciprocal, dynamic relationship that changes both the speaker and the audience. Effective communication is understanding your audience and knowing how to connect with them. In order to do so, we don’t start with crafting the best message; we start with awareness of the audience.
Good science communication, Alda emphasises, requires reputation, which is intrinsically connected to trust. Speaking from a position of authority is different from speaking as an equal fellow human being. Your audience is more likely to trust you when you speak as a fellow human, and this requires humility, which brings me to the central point of this post.
I wholeheartedly agree with Alda’s approach to communication and also think that humility is a virtue that needs to be highly valued. However, whether humility is viewed as a virtue is dependent on societal stereotypes, hence my conflict with it. Humility doesn’t yield trust and reputation for everyone, and I speak from the perspective of a black woman in academia.
In academia, we often have an ideal representation or an image of what an ‘intellectual’ looks like. This is typically a white, middle-class, cis male. Society’s stereotypes mean that this group of people is automatically perceived as authoritative. Academia’s structure means that people who fit the stereotypical ‘intellectual’ are seen as unquestionable experts. And for the privileged who fit society’s ‘intellectual’, where coming across as authoritative is the default, humility and speaking to their audience as a fellow human gains them trust. On the other hand, academics who don’t fit society’s stereotypical ‘intellectual’ often have to work hard simply to prove that they are as capable as their white male counterparts. In an academic environment where looks, gender and race are part of ‘fitting in’ and being acknowledged as an intellectual, humility, which is an admirable trait in the white male, can for a black woman be a way of proving that you are not capable. When the default assumption is often that you might lack the capacities due to your race or gender, humility might seem like confirming people’s assumptions. Humility, in the sense of downplaying one’s skills and achievements, can be a self-imposed punishment for the black woman who already struggles to establish herself as an intellectual, one which underestimates her intellectual capacity. Humility, then, seems a luxury that the privileged can afford.
Having said that, I must emphasize that the problem is not humility itself but societal stereotypes and rigid academic structures. I still think humility is a trait we need to treasure, both in academia and outside. I just hope that we gradually challenge these stereotypes of what an expert intellectual looks like, which will then afford minorities the luxury of humility and not punish them for it.
I came across this little paper on the Introduction to Dynamical Systems and Chaos online course from Santa Fe. It was provided as a supplementary reading in the ‘Modelling’ section. The paper lays out some of the most enduring misconceptions about building models.
“The modeling enterprise extends as far back as Archimedes; and so does its misunderstanding.” Epstein (2008)
So, why model? What are models? And who are modellers?
Prior to reading this paper, my short answers to these questions would have been in accordance with the widely held misconceptions that:
We model to explain and/or predict. Models are formal representations (often mathematical) of phenomena or processes. And a modeller is someone who builds these explicit formal mathematical models. However, Epstein explains:
“Anyone who ventures a projection, or imagines how a social dynamic—an epidemic, war, or migration—would unfold is running some model.”
I like the idea that we all run some implicit models all the time. In the social and political sphere, where it is extremely difficult to operationalize and specify variables, this perspective gives implicit modelling, such as drawing dynamical analogies, its due importance.
The paper lays out 16 reasons other than prediction for building models. The idea that prediction and explanation aren’t the only modelling goals was a revelation to me, given that I’ve had a love-hate relationship with modelling in the past. I am attracted to models, especially those with a dynamical systems inclination, but the overall tendency towards prediction as a goal often frustrates me. Just to clarify, prediction is a fine goal, but my objections arise when we’re deluded into thinking that 1) models give us the tools to predict specific individual behaviours and 2) we can model a phenomenon, especially human behaviour, without first understanding it.
Let me elaborate further in the context of the automated predictive systems that are currently trending (at least, within my academic circle) and often preoccupy my thinking. Claims to predict “criminal” and “risky” behaviour are examples from last week’s headlines: UK police wants Artificial Intelligence (AI) to predict criminal behaviour before it happens and Predictim, a commercial data analytics firm, claims its AI can flag “risky” babysitters. Unfortunately, these are not the outrageous exceptions but the general direction where things in the digital surveillance sphere seem to be heading.
Behaviours such as “criminal” or “risky” are very complex adaptive behaviours which result from an infinite number of ongoing factors that we can never fully specify in the first place. This makes it impossible to predict criminal behaviour with certainty. Juarrero reminds us why it is impossible to predict human behaviour with precision:
“When we are dealing with complex adaptive systems, surprises are unavoidable. Because of their sensitivity to initial conditions – due, in turn, to their contextual and temporal embeddedness – complex adaptive systems are characterized by unusual twists and novel turns. Since we will never be able to specify any dynamical system’s initial conditions to the requisite (infinite) degree, a fortiori we will never be able to capture all the details and circumstances of anyone’s life and background. Given this limitation, we must always keep in mind that reconstructing specific instances of behavior will always be, at best, an interpretation and not a deduction – a much more fallible type of explanation than we had previously hoped was available. Interpretations of human action are always tentative. Absolute certainty about either what the agent just did, or what he or she will do – specifically – a year from now, is therefore impossible.” (Juarrero 1999, p. 225)
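Sensitivity to initial conditions can be made concrete with a toy example. The following is a minimal sketch, assuming the logistic map as a stand-in chaotic system; it is a textbook illustration and not a model taken from Juarrero or Epstein, and the function name `logistic_trajectory`, the starting value 0.2 and the perturbation of 1e-10 are all hypothetical choices made for illustration. Two trajectories that start almost identically end up bearing no resemblance to each other.

```python
def logistic_trajectory(x0, r=4.0, steps=80):
    """Iterate x -> r * x * (1 - x) and return every state visited, including x0.

    r = 4.0 puts the logistic map in its chaotic regime (an illustrative choice).
    """
    states = [x0]
    for _ in range(steps):
        x = states[-1]
        states.append(r * x * (1.0 - x))
    return states


# Two starting points that differ by one part in ten billion.
a = logistic_trajectory(0.2)
b = logistic_trajectory(0.2 + 1e-10)

# Absolute difference between the two trajectories at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]

print("gap at step 0:", gaps[0])
print("largest gap over the run:", max(gaps))
```

The initial gap is far smaller than anything we could hope to measure about a person's life, yet it grows until the two runs are effectively unrelated. This is only a one-variable toy system; the point of the quoted passage is that human behaviour involves vastly more variables, none of which can be specified to the required precision.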
These claims to predict “criminal” or “risky” behaviour are more than a mere misunderstanding of human nature or simple illusions about what AI tools are capable of doing. As these tools are being implemented into the social world, they have grave consequences for people’s lives. When claiming to predict someone’s potential criminality, errors are inevitable. The stakes are high when we get things wrong. Unsurprisingly, it is often society’s most vulnerable, those who are disfranchised, that pay a high price. Indeed, such models are used to further punish and disfranchise those that fall prey to them.
A slightly different but interrelated issue with modelling to predict is that the drive to predict and explain often ignores the value of describing and/or observing to gain deep understanding. Sure, describing to understand, and explaining and predicting, aren’t mutually exclusive. However, in reality, we seem to have blindly adopted prediction and generalization as the primary goals of science. Studying to describe and understand is, as a result, undervalued. What is the point of describing, you might ask? I think it is fundamental to understand any phenomenon or process as deeply and comprehensively as possible before we can attempt to explain or predict it, and description is key to gaining such understanding.
I’ll leave you with an insightful Geertz (1973) passage from The Interpretation of Cultures:
“… I have never been impressed with claims that structural linguistics, computer engineering or some other advanced form of thought is going to enable us to understand men without knowing them.”
How would you give a talk on Artificial Intelligence (AI) to 120 students between the ages of 16 and 18, not all of whom are necessarily interested in or have a background in science? How would you define AI? What would you include (and exclude)? What is the best way to structure it? Well, surely, there are many valid answers to these questions. It was the first time that Elayne Ruane, a colleague who is also a PhD researcher, and I attempted to give an 80-minute talk to a big crowd of such students. We didn’t find much in terms of guidance or advice on how to interact with the students or how to frame the AI discourse in a suitable manner for students who are about to embark on their college journey. We wanted to convey the excitement, hope and potential the field holds while also portraying a realistic image of its current state. Hopefully sharing our general approach might be helpful to anybody who finds themselves in a similar situation.
Mind, what worked for us might not work in different contexts, mindsets, situations, or for a different topic. AI is one of the most over-hyped and misunderstood areas of research in the minds of the general public. Furthermore, AI has been somewhat associated with a certain stereotypical archetype in the media: a white male genius computer geek. How one introduces the field, and the kind of work and influential figures one includes, play a subtle but important role in challenging these misconceptions and stereotypes. Specifically, when addressing a crowd of young people in the midst of deciding what areas of study they will pursue at university, how you present the field of AI can send implicit signals about who is welcome. For us, this is everyone.
We began our talk with a brief discussion of what a computer science degree, as one of the routes to AI research, entails (within the context of our own department at University College Dublin) and the kinds of careers that it can lead to, while raising the point that there isn’t one path to follow. We then briefly talked about exemplar AI projects that are taking place within our own School. We kept this part of the talk very interactive by frequently polling the group with a show of hands. This was important in keeping the students engaged.
What is AI?
We discussed the general definition of AI: the common view that artificial intelligence refers to a machine that simulates human intelligence. What ‘simulate’ and ‘human intelligence’ mean is contested and, of course, far from settled. However, we felt it was important to keep it simple for the purpose of this talk. ‘Machines that simulate human intelligence and exhibit human behaviour’ often comes down to abilities such as learning, problem solving, reasoning, language processing and the like.
Unlike other disciplines such as physics or biology, Artificial Intelligence is not a clearly defined and well-contained discipline but rather a very broad and cross-disciplinary endeavour. It draws from mathematics, engineering, biology, neuroscience, linguistics, philosophy, and many more. Although the most direct route to studying AI is through computer science (certainly within the context of UCD), one can also get to AI through other routes. Besides, AI can be synthesized with any field of enquiry, including neuroscience, music and art. Christie’s recent AI-generated art is a good example.
AI is a wide umbrella term with sub-fields including robotics, natural language processing, computer vision, machine learning and deep learning, speech recognition, machine translation and more. We tried to use examples of these that were relevant to the students, including Google Translate, Amazon’s Alexa, PS4 games, Minecraft, facial recognition tools, and robots. We showed them the famous video of Boston Dynamics’ robot, Spot, dancing to Uptown Funk, which was a huge hit.
The History of AI
AI is often thought of as a recent development, or worse, as futuristic, something that will happen in the far future. We tend to forget that dreams, aspirations and fascinations with AI go back to antiquity. In this regard, René Descartes’s simulacrum and the Mechanical Turk are good examples. Descartes was fond of automata and had a walking and talking clockwork named after his daughter Francine. The machine apparently simulated his daughter, who died of scarlet fever at the age of 5. Similarly, the 18th-century Hungarian author and inventor Wolfgang von Kempelen created the Mechanical Turk, a (fake) chess-playing and speaking machine, to impress the Empress Maria Theresa of Austria.
We can list an endless number of scholars who contributed to the development of AI as it is conceived today. The main towering figures we included were:
- The ninth-century Persian mathematician Muḥammad ibn Mūsā al-Khwārizmī, who gave us one of the earliest mathematical algorithms. The word “algorithm” comes from the Latinised form of his name.
- The English mathematician Ada Lovelace, who is often regarded as the first computer programmer.
- Alan Turing, who is regarded as the father of theoretical computer science and of whom most students seemed to be already aware.
- And more recently, the scholars who have perhaps been most influential in shaping the way we currently understand AI: Marvin Minsky, John McCarthy, and Margaret Masterman.
We tried to make our talk as interactive as possible. We had questions and discussion points throughout. Towards the end, we had a game where students had to guess whether the AI being described on each slide was ‘sci-fi’ or ‘real’. Here are the main examples. Have a go yourself. 🙂
Sci-fi or real
- Self-aware robots
- Face recognition AI which rates people’s ‘trustworthiness’
- A bedside light that notifies you of your retweets
- Automated confession systems
If there is anything that the AI narrative is not short of, it’s hype and misconception. Clarifications, in a subtle way, help both to illustrate the actual current state of the field and to highlight the challenges that arise with it. As such, our concluding remarks highlighted the misconceptions surrounding AI and the ethical concerns that necessarily arise with any technological advancement. The major misconceptions we mentioned are:
- AI is a distant reality. The fact is far from it. AI is deeply embedded in the infrastructure of everyday life. It is invisible and ubiquitous.
- AI equals robots or ‘self-driving’ cars. As should be obvious by now, robotics is simply one dimension of AI.
- AI is neutral and can’t be biased. This again is far from reality. As AI integrates deeper into the educational, medical, legal, and other social spheres, ethical questions inevitably arise. Questions of ethics, fairness, and responsibility are inherently questions of AI.
That concludes the content of the talk.
- Keep it open and flexible. Create opportunities to hear from them. This allows you to get an idea of their awareness and knowledge (which can then help you calibrate on the fly in terms of technical detail) while also keeping them engaged.
- Pictures, more pictures, and videos are a great way to open up discussion. We showed a video of Google Assistant making a phone call, which really captured their attention and got them talking. This also brought forth some ethical discussion.
- Prepare for plenty of questions around “Is AI going to take over?” and “How scared and worried should we be?”. It’s important to highlight how AI advancements can be misused, but the trick is to show how much of what is reported on AI is overblown hype, which contributes to unnecessary and unrealistic fears of AI when in fact much of the development in AI is still at an early stage. On the other hand, remember that we were talking to young science students about to embark on college. We still wanted to encourage them and to let them feel the dreams, excitement and hopes that have been the driving force of AI, at least in the 50s and 60s, and the promising potential that AI presents in medicine, robotics and more.
- (2016) Discussing the limits of artificial intelligence
- David Chapman (2018) How should we evaluate progress in AI?
- François Chollet (2017) The impossibility of intelligence explosion
- Gary Marcus and Ernest Davis (2018) A.I. Is Harder Than You Think
- Melanie Mitchell (2018) Artificial Intelligence Hits the Barrier of Meaning
- Oscar Schwartz (2018) ‘The discourse is unhinged’: how the media gets AI alarmingly wrong