Hello!
Welcome to Hold the Code, Edition #16.
In this week's newsletter, we cover a recent Twitter #scandal, the limits of AI in assisting with COVID efforts, IBM's CodeNet project, and an essay about explainability and the AI black-box problem.
As Northwestern wraps up its spring quarter, we'd like to thank you for being part of the Hold The Code community for the past 16 weeks.
Without further ado...
#Biased
Twitter’s algorithm for automatically cropping images has been found to favor white people over black people and women over men. The announcement comes after reports from users that the algorithm seemed to favor showing some faces over others in image previews.
How it started
Last September, a university employee noticed that when he posted photos of himself and a colleague on Twitter, the automatic image cropping always displayed the white man. Other users confirmed the pattern, testing images of Barack Obama and Mitch McConnell as well as stock images of people of different races, with similar results.
Twitter responded by explaining that it had tested for these issues in the past, but admitted there was still work to be done. The algorithm was trained on eye-tracking data, which may be one of the complicated, contributing causes of this disparity.
Testing Twitter
In testing, the algorithm was found to have the following disparities (a rough sketch of how such a gap might be measured appears after the list):
An 8% difference in favoring women over men.
A 4% difference in favoring white people over black people.
A 7% difference in favoring white women over black women.
A 2% difference in favoring white men over black men.
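For readers curious about how numbers like these might be computed, here is a minimal sketch — not Twitter's actual methodology, and using made-up data: each trial pairs a face from one demographic group with a face from another, and the metric is the gap between how often each group's face is kept in the crop.

```python
# Minimal sketch (not Twitter's methodology): given paired crop trials, each pairing
# a face from group A with a face from group B, measure how often the cropper
# favors group A and report the gap between the two groups.

def crop_favoritism_gap(outcomes):
    """outcomes: list of 'A' or 'B' -- the group whose face the cropper kept."""
    favored_a = sum(1 for winner in outcomes if winner == "A")
    rate_a = favored_a / len(outcomes)   # share of trials favoring group A
    rate_b = 1 - rate_a                  # share of trials favoring group B
    return rate_a - rate_b               # e.g. 0.08 means an 8% gap

# Hypothetical example: 54 of 100 paired trials favored group A.
trials = ["A"] * 54 + ["B"] * 46
print(f"favoritism gap: {crop_favoritism_gap(trials):+.0%}")  # +8%
```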
Continuing the thread
Twitter has committed to fixing this issue and has already released a new update for mobile devices.
Rumman Chowdhury, the director of software engineering at Twitter, wrote: "One of our conclusions is that not everything on Twitter is a good candidate for an algorithm, and in this case, how to crop an image is a decision best made by people.”
Research Blues
AI has been used in many stages of the fight against Covid-19, especially when it comes to vaccine distribution. But recent Covid-19 studies have indicated that AI hasn’t been successful in predicting the disease itself.
Looking at research papers published from January 2020 to October 2020, 300 AI-based Covid-19 studies turned out not to be reliably reproducible; many lacked detail and rigor. For example, one study trained an AI on chest X-rays showing Covid-19 in adults but not in children, and the resulting model was more likely predicting whether an X-ray came from an adult or a child than whether the patient had Covid-19.
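Here is a toy sketch of that failure mode, using synthetic data and invented feature names rather than the actual study: when the training set ties the label to an irrelevant attribute (adult vs. child), a classifier happily learns the attribute instead of the disease, and falls apart once that shortcut no longer holds.

```python
# Toy illustration of the confounding problem described above, on synthetic data
# (not the actual study): in the training set, every Covid-positive "scan" comes
# from an adult and every negative one from a child, so an age cue alone
# predicts the label -- and the model learns that shortcut.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_scans(n, covid, adult):
    """Synthetic 'X-ray' features: a strong age cue and a weak disease cue."""
    age_cue = (1.0 if adult else -1.0) + rng.normal(0, 0.2, n)
    disease_cue = (0.3 if covid else -0.3) + rng.normal(0, 1.0, n)
    return np.column_stack([age_cue, disease_cue])

# Confounded training data: Covid-positive adults vs. Covid-negative children.
X_train = np.vstack([make_scans(500, covid=True, adult=True),
                     make_scans(500, covid=False, adult=False)])
y_train = np.concatenate([np.ones(500), np.zeros(500)])

# Test data where the confound is broken: positive children, negative adults.
X_test = np.vstack([make_scans(500, covid=True, adult=False),
                    make_scans(500, covid=False, adult=True)])
y_test = np.concatenate([np.ones(500), np.zeros(500)])

model = LogisticRegression().fit(X_train, y_train)
print("accuracy on confounded data:  ", model.score(X_train, y_train))
print("accuracy with confound broken:", model.score(X_test, y_test))  # far below chance
```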
Bias also became a problem in these studies: AI models with poor rigor but high reported performance were more likely to be published than comprehensive models that failed to outperform industry standards.
People want AI to be the next solution in healthcare, but the work needs to be done thoroughly and thoughtfully. AI can be the future, but rushing to get there isn’t the solution.
CodeNet CodesNot – At Least Without Humans
IBM's efforts to automate programming using AI highlight the irreplaceable nature of human programmers – at least for the time being.
IBM's CodeNet
CodeNet is an ambitious dataset curated by IBM researchers to train machine learning models that automate various programming tasks – such as translating between programming languages or making code-writing recommendations for programmers (think autocomplete, but for coding).
The grab for data
Machine learning teaches software to solve problems from past examples. A good machine learning model therefore relies on a quality dataset containing a large number of well-defined and annotated examples. IBM's CodeNet contains 14 million code samples totaling 500 million lines of code written in 55 different programming languages, including samples that vary in size, errors, execution time, etc. – factors that help the computer learn what makes good code.
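To make that curation step concrete, here is a hypothetical sketch of filtering a large corpus of code submissions down to a training set by language, correctness status, and size. The field names below are invented for illustration; they are not CodeNet's actual metadata schema.

```python
# Hypothetical sketch of curating code samples into a training set; the fields
# below are illustrative, not CodeNet's actual metadata schema.
from dataclasses import dataclass

@dataclass
class CodeSample:
    language: str      # e.g. "Python", "C++"
    status: str        # e.g. "Accepted", "Runtime Error", "Wrong Answer"
    lines: int         # size of the submission
    source: str        # the code itself

def build_training_set(samples, languages, max_lines=500):
    """Keep accepted solutions in the target languages that are small enough to learn from."""
    return [s for s in samples
            if s.language in languages
            and s.status == "Accepted"
            and s.lines <= max_lines]

corpus = [
    CodeSample("Python", "Accepted", 1, "print(sum(map(int, input().split())))"),
    CodeSample("C++", "Runtime Error", 120, "..."),
]
train = build_training_set(corpus, languages={"Python", "Java"})
print(len(train))  # 1
```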
We still very much need human programmers
Curating datasets for artificial intelligence takes Herculean, and very human, efforts.
The code examples making up the CodeNet dataset come from human programmers.
Tremendous software and data science efforts by IBM engineers are needed to process the examples into the quality dataset needed to train the computer.
We need human engineers to evaluate and ensure the quality of the resulting machine learning algorithm.
Artificial intelligence is not ready to replace programmers (at least for the time being). But it might change the kind of tasks that require the efforts and ingenuity of human programmers.
Without human programmers, CodeNet codes not.
Read the full article here.
Weekly Feature: "Explainability Won't Save AI"
A recent essay by Jessica Newman in Brookings's TechStream argues that addressing the “black-box problem” of AI through explainability alone will not be enough to build trust in AI models. According to Newman, without clear articulation from different communities, AI is still more likely to serve the interests of the powerful, regardless of how well we understand the models.
The black-box problem
In computing, a ‘black box’ is a device, system, or program that allows you to see the input and output, but gives no view of the processes and workings between. The AI black box, then, refers to the fact that with most AI-based tools, we don’t know how they do what they do.
We know the question or data the AI tool starts with (the input). We also know the answer it produces (the output).
But thanks to the AI black box problem, we have no idea how the tool turned the input into the output.
Which is fine, until it produces an unexpected, incorrect, or problematic answer.
Why does the black-box problem exist?
Artificial neural networks consist of layers of nodes, including hidden layers. Each node processes the input it receives and passes its output to the next layer of nodes. A deep learning model is a huge artificial neural network with many of these hidden layers, and it ‘learns’ on its own by recognizing patterns.
And this can get infinitely complicated. We can’t see what the nodes have ‘learned’. We don’t see the output between layers, only the conclusion.
So, we can’t know how the nodes are analyzing the data – i.e., we’re facing the AI black box.
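Here is a minimal sketch of what ‘hidden’ means in practice, using a tiny network with made-up weights (plain numpy, not a real trained model): a caller sees only the input and the final answer, while the intermediate layer activations are computed inside the function and never surfaced.

```python
# Minimal sketch of the "hidden" in hidden layers: a tiny feedforward network
# with made-up weights. The caller sees only the input and the final output;
# the intermediate activations exist but are never exposed or interpreted.
import numpy as np

rng = np.random.default_rng(42)
W1 = rng.normal(size=(4, 8))   # input layer -> hidden layer 1
W2 = rng.normal(size=(8, 8))   # hidden layer 1 -> hidden layer 2
W3 = rng.normal(size=(8, 1))   # hidden layer 2 -> output

def predict(x):
    h1 = np.tanh(x @ W1)                  # hidden layer 1: not returned, not human-readable
    h2 = np.tanh(h1 @ W2)                 # hidden layer 2: same story
    return 1 / (1 + np.exp(-(h2 @ W3)))   # only this number leaves the "box"

x = np.array([0.2, -1.3, 0.7, 0.0])
print(predict(x))  # we see the answer, not how the layers arrived at it
```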
What is explainability (XAI)?
Explainable AI (XAI) is broadly defined as “machine learning techniques that make it possible for human users to understand, appropriately trust, and effectively manage AI.” Around the world, explainability has been referenced as a guiding principle for AI development, including in Europe’s General Data Protection Regulation.
Explainable AI has also been a major research focus of the Defense Advanced Research Projects Agency (DARPA) since 2016.
Newman writes, “Explainability is seen as a central pillar of trustworthy AI because, in an ideal world, it provides understanding about how a model behaves and where its use is appropriate. The prevalence of bias and vulnerabilities in AI models means that trust is unwarranted without sufficient understanding of how a system works.”
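To make “understanding about how a model behaves” concrete, here is one common, general-purpose explainability technique – our illustration, not one drawn from Newman's essay: permutation importance, which shuffles each input feature in turn and measures how much the model's accuracy drops, hinting at what the model actually relies on.

```python
# One common explainability technique (a general illustration, not specific to
# Newman's essay): permutation importance shuffles each feature and measures
# how much the model's score drops when that feature is scrambled.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True, as_frame=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
result = permutation_importance(model, X_test, y_test, n_repeats=10, random_state=0)

# Rank features by how much shuffling them hurts accuracy on held-out data.
for name, score in sorted(zip(X.columns, result.importances_mean),
                          key=lambda pair: -pair[1])[:5]:
    print(f"{name}: {score:.3f}")
```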
So why isn’t XAI enough?
Newman argues that knowledge about how AI systems work will only benefit trust-building efforts if that knowledge is fairly distributed. Currently, she argues, an asymmetry of knowledge exists: regardless of how much explainability research companies conduct, users and other external stakeholders are typically afforded little if any insight into the behind-the-scenes workings of the AI systems that impact their lives and opportunities.
Newman ultimately calls upon users of AI to invest in explainability with diversity and inclusion in mind and to maintain clear objectives. In short, she thinks it’s not enough to just know about AI; we must also know what to do with that knowledge.
Read her full essay here.
Written by: Lorinna Chen, Molly Probble, and Lex Verb