IBM Open Sources Comprehensive Python Toolkit for Detecting & Fighting Bias (30 Metrics, 9 Algorithms)

Pranav Dar 19 Sep, 2018 • 2 min read

Overview

  • IBM has open-sourced a toolkit to deal with bias in datasets and algorithms
  • The toolkit contains 30 fairness metrics and 9 state-of-the-art algorithms
  • The Python code, detailed documentation, and installation instructions are available on GitHub

 

Introduction

Bias is a serious issue in machine learning models. Quite often we skim through the data in our eagerness to build the model, and then scratch our heads when the model doesn’t translate well to real-world situations. It’s a pervasive issue, and one that experts have been trying to mitigate for years.


With the seriousness of this challenge in mind, IBM has released a toolkit that contains a set of “fairness metrics” for datasets and models, explanations for these metrics, and algorithms that can deal with any bias that is unearthed. And the best part? It’s open source (and in Python)! The code, documentation, and installation instructions are all available on GitHub.
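To make that concrete, here is a minimal sketch of checking a dataset for bias with the toolkit, based on its documented Python API at the time of release (class and method names may differ across versions, and the toy data is purely illustrative):

# A minimal sketch of computing a fairness metric with AI Fairness 360.
# Assumes the package is installed (pip install aif360, per the repo's
# instructions); the tiny DataFrame here is purely illustrative.
import pandas as pd
from aif360.datasets import BinaryLabelDataset
from aif360.metrics import BinaryLabelDatasetMetric

# Toy data: 'sex' is the protected attribute (1 = privileged group),
# 'label' is the outcome (1 = favorable, e.g. loan approved).
df = pd.DataFrame({
    'sex':   [1, 1, 1, 1, 0, 0, 0, 0],
    'score': [0.9, 0.8, 0.6, 0.4, 0.7, 0.5, 0.3, 0.2],
    'label': [1, 1, 1, 0, 1, 0, 0, 0],
})

dataset = BinaryLabelDataset(
    df=df,
    label_names=['label'],
    protected_attribute_names=['sex'],
    favorable_label=1,
    unfavorable_label=0,
)

metric = BinaryLabelDatasetMetric(
    dataset,
    unprivileged_groups=[{'sex': 0}],
    privileged_groups=[{'sex': 1}],
)

# Disparate impact: ratio of favorable-outcome rates (1.0 is perfectly fair).
# Mean difference: difference of those rates (0.0 is perfectly fair).
print('Disparate impact:', metric.disparate_impact())
print('Mean difference: ', metric.mean_difference())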

The toolkit, officially labelled the ‘AI Fairness 360 Open Source Toolkit’, contains over 30 fairness metrics and 9 state-of-the-art bias mitigation algorithms, listed below (a usage sketch follows the list):

  • Optimized Preprocessing
  • Disparate Impact Remover
  • Equalized Odds Postprocessing
  • Reweighing
  • Reject Option Classification
  • Prejudice Remover Regularizer
  • Calibrated Equalized Odds Postprocessing
  • Learning Fair Representations
  • Adversarial Debiasing
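
To see one of these in action, here is a hedged continuation of the sketch above using Reweighing, following the pattern of the toolkit’s introductory examples (again, exact names may vary by version):

# Reweighing, one of the nine algorithms listed, is a preprocessing
# technique: it adjusts per-instance weights so that favorable outcomes
# are balanced across groups before a model is trained.
from aif360.algorithms.preprocessing import Reweighing

rw = Reweighing(
    unprivileged_groups=[{'sex': 0}],
    privileged_groups=[{'sex': 1}],
)
dataset_transf = rw.fit_transform(dataset)

# After reweighing, the weighted mean difference should be (near) zero.
metric_transf = BinaryLabelDatasetMetric(
    dataset_transf,
    unprivileged_groups=[{'sex': 0}],
    privileged_groups=[{'sex': 1}],
)
print('Mean difference after reweighing:', metric_transf.mean_difference())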

The project’s official site hosts multiple tutorials across different industry functions to give you a taste of how to use the toolkit. These include credit scoring, medical expenditure, and gender bias in facial recognition. What are you waiting for? Get started already!
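For a flavour of the credit scoring tutorial, here is a sketch along the lines of the repo’s example on the German Credit dataset (note that the raw data file must be downloaded separately into the package’s data folder, per the repo’s instructions, and details may have changed since):

# Sketch of the credit-scoring tutorial's opening steps: load the German
# Credit dataset with age as the protected attribute and measure bias.
from aif360.datasets import GermanDataset
from aif360.metrics import BinaryLabelDatasetMetric

dataset = GermanDataset(
    protected_attribute_names=['age'],        # treat age as the protected attribute
    privileged_classes=[lambda x: x >= 25],   # applicants aged 25+ are 'privileged'
    features_to_drop=['personal_status', 'sex'],
)

metric = BinaryLabelDatasetMetric(
    dataset,
    unprivileged_groups=[{'age': 0}],
    privileged_groups=[{'age': 1}],
)
print('Mean difference on German Credit:', metric.mean_difference())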

 

Our take on this

We need to remember that data isn’t just numbers on a spreadsheet; it is linked to human beings. Bias is an omnipresent issue. I cannot stress enough how important dealing with it is, especially when we’re running algorithms that will directly impact lives.

Can you imagine running a credit risk or loan default model and turning away the very people who most desperately need the money? They were perfectly eligible, but bias in the data, and subsequently in the model, shut them out. Unacceptable, right? Let’s keep that in mind the next time we work on a project, and reach for this toolkit if other known methods aren’t working.

 

Pranav Dar 19 Sep 2018

Senior Editor at Analytics Vidhya. Data visualization practitioner who loves reading and delving deeper into the data science and machine learning arts. Always looking for new ways to improve processes using ML and AI.
