AI for Everyone

By Celina Lee

Did you ever wonder why pictures of some people consistently come out clearer and more true-to-life than others? It's about skin tone, as you might have guessed. It’s also a bit about lighting, as I used to think. But this story is actually a story about training data.

The first colour film cameras were engineered in the United States and calibrated to fair-skinned women using reference cards that were popularly known as “Shirley” cards, named after the original female model who worked for Kodak.

In artificial intelligence (AI), statistical models are trained on and fitted to data to create an approximation of the reality represented by the data. In the case of Kodak, the training data was a particular type of person. The technology and chemicals used to produce the photos were adjusted to ensure that “Shirley” always looked true-to-life. 

Similarly, in AI, if the training data is skewed then the resulting models will also be skewed, or biased. As AI begins to enter into virtually every aspect of our lives, from medicine to dating to law enforcement, biased AI is a problem that people cannot afford to ignore.

Bias in AI

This Kodak example illustrates the problem of insufficient representation in the training data. AI models will not perform well for people who are poorly represented in the original training data. The AI simply won’t know what to do. For example, popular sentiment analysis tools have been largely trained on American English. These tools would miss the slang and local-language words found in any African country, even if the text were mostly in English.

But there is an even more problematic skew that often exists in training data: the one born out of our own prejudices. AI simply learns to mimic even the subtlest bias. There are already examples of AI pushing ads for more prestigious jobs to men or classifying women of color as less attractive than their white counterparts.

When it comes to the analysis of the training data, even as AI becomes more of a “black box” with neural nets that can take in massive amounts of data and variables, there is still human judgement in building the algorithms. A human decides what attributes, or variables, to feed the “machine”. Is it right to give gender to a model for identifying engineering talent? Or socio-economic status to a model for predicting criminal risk?

It is also a human who ultimately defines the cut-offs and thresholds that convert data outputs into decisions and actions, and these choices can also be driven by bias. This includes defining the tolerance level for a false positive vs. a false negative. For example, is it OK for an AI to falsely classify a person as high-risk and sentence them too harshly in the name of ensuring that no one else mistakenly gets let off?
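The trade-off between false positives and false negatives can be made concrete with a small sketch. The risk scores, outcomes and thresholds below are invented purely for illustration and do not come from any real system; the point is only that moving the cut-off shifts errors from one kind to the other, and that someone has to choose where it sits.

```python
# Hypothetical risk scores (0 to 1) and true outcomes (1 = reoffended).
# These numbers are made up for illustration only.
scores = [0.10, 0.25, 0.40, 0.55, 0.60, 0.75, 0.80, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]


def error_counts(scores, labels, threshold):
    """Count errors when a score at or above the threshold means 'high-risk'."""
    # False positive: flagged as high-risk, but the true outcome was 0.
    false_pos = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    # False negative: not flagged, but the true outcome was 1.
    false_neg = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return false_pos, false_neg


# The thresholds themselves are a human choice; these three are arbitrary.
for threshold in (0.3, 0.7, 0.9):
    fp, fn = error_counts(scores, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

With this toy data, a low cut-off wrongly flags two people and misses no one, while a high cut-off wrongly flags no one and misses three. Neither setting is "correct"; each reflects a judgement about which mistake is more tolerable.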

What can be done?

Ironically, if we can get it right, AI should be able to help us combat our own human biases. But on the other hand, getting AI wrong could be catastrophic. The stakes are understandably high. 

Academia and industry are actively investing in research on bias in AI: how to detect it, how to eliminate it, and how to estimate its impact on society. Microsoft and Facebook, among others, have dedicated research teams to study the issue. Some solutions are as simple as excluding sensitive attributes. Many solutions involve using AI to find bias in AI.
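One simple way to start looking for bias is to compare how often a model gives a favourable outcome to each group. The sketch below is a minimal illustration under stated assumptions: the groups, the decisions and the function name selection_rates are all hypothetical, and a gap in rates is a signal to investigate rather than proof of unfairness on its own.

```python
from collections import defaultdict

# Hypothetical model decisions: (group, 1 if shown the job ad, else 0).
# The data is invented for illustration only.
decisions = [
    ("men", 1), ("men", 1), ("men", 0), ("men", 1),
    ("women", 1), ("women", 0), ("women", 0), ("women", 0),
]


def selection_rates(decisions):
    """Return the share of favourable outcomes for each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in decisions:
        totals[group] += 1
        positives[group] += outcome
    return {group: positives[group] / totals[group] for group in totals}


rates = selection_rates(decisions)
# Difference between the most and least favoured groups.
gap = max(rates.values()) - min(rates.values())

for group, rate in rates.items():
    print(f"{group}: shown the ad {rate:.0%} of the time")
print(f"gap between groups: {gap:.0%}")
```

A check like this only works if the sensitive attribute is recorded somewhere, which is one reason simply excluding such attributes from the data does not settle the question.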

But the best solution is probably the most obvious one. The only way to combat bias in AI in the long run is to ensure that AI solutions are developed on a diverse and representative pool of data by an equally diverse and representative pool of data scientists.

Enter Zindi

At Ixio we are getting ready to launch Zindi, a new data science competition platform for Africa. Our vision is to mobilize a community of data scientists to solve Africa’s most pressing challenges through machine learning and AI. We work with organizations that have data and business problems, and we invite data scientists in Africa and beyond who have the tools and the curiosity to tackle the issues that affect the continent. The issues are diverse and range from public health and urban planning to customer service and marketing.

Zindi will provide the platform for the African market to contribute powerfully to the global vision for inclusive AI that serves everyone. Zindi launches on 9 September 2018. 

If you’d like us to keep you posted about Zindi, please sign up for updates on the website.

About Author

Celina is CEO of the Zindi data science competition platform. For the last 9 years, Celina has worked at the forefront of advanced uses of data for financial inclusion globally. Celina is from San Francisco, California, but has lived and worked around the world, including in Latin America and Asia, and now in Cape Town, South Africa.