What Are Explainability AI Techniques? Why Do We Need Them?

Sonia Singla 16 Aug, 2023 • 11 min read


When we talk about AI quality, what exactly do we mean? AI quality is a cornerstone of the value an organization gets from AI, and it often determines whether a project succeeds or fails. One survey estimates that AI applications could generate between $3.5 and $5.8 trillion in annual value across 19 industries. Accessible tools for data preparation, model training, and deployment have driven a remarkable surge in AI and machine learning (ML) adoption in business settings. This raises a pivotal question: how can AI contribute enduring value to an organization? This is where model explainability comes in. It is a key element of AI quality because it offers the means to understand how an AI system works.


Learning Objectives

  1. We will learn about LIME, drift, fairness, and model explainability.
  2. We will also learn how individual features affect the model, using the tools TruEra and WhatIfTools.

This article was published as a part of the Data Science Blogathon.

What is Model Explainability AI?

Can we trust a model without knowing whether its decision is correct, how it made that decision, and which features it relied on?

If the model gets a decision wrong, the business that acts on that decision can suffer. Apart from model explainability, fairness and stability are the other backbones of AI quality. Model explainability, put simply, is our view into the model: an understanding of how it works internally to produce a result or predict an outcome.

This raises a few questions. How do you explain your model? How do you conclude that your model is the best one? Let us dive into the details, starting with LIME. LIME is an explanation technique that can be applied to any machine learning model and helps us understand the model's predictions.

Top 15 Explainability AI Techniques

Here is the list of top 15 model explainability AI techniques:

  1. Feature Importance: Identifying the most influential features in a model’s predictions.
  2. LIME (Local Interpretable Model-agnostic Explanations): Creating simple interpretable models to explain complex ones.
  3. SHAP (SHapley Additive exPlanations): Providing local and global feature importance scores for a prediction.
  4. Partial Dependence Plots: Visualizing the relationship between a feature and the model’s predictions while keeping other features constant.
  5. Permutation Feature Importance: Assessing the importance of features by shuffling their values and measuring the impact on the model’s performance.
  6. Integrated Gradients: Calculating the contribution of each feature towards a prediction by integrating the gradient along the path from a baseline input.
  7. Attention Mechanisms: Highlighting the parts of an input that are most relevant for a model’s decision.
  8. Counterfactual Explanations: Providing alternative input scenarios that would have led to a different model prediction.
  9. Rule-based Explanations: Expressing a model’s decision using human-readable rules.
  10. Anchors: Determining the smallest condition that captures a model’s prediction for a specific instance.
  11. Decision Trees: Creating interpretable decision paths that explain model predictions.
  12. Gradient-based Sensitivity Analysis: Analyzing how changes in input features impact model predictions through gradient calculations.
  13. Global Surrogate Models: Training interpretable models that approximate the complex behavior of the original model.
  14. Local Surrogate Models: Creating simpler models that mimic the behavior of the primary model for individual predictions.
  15. Feature Interaction Analysis: Exploring how interactions between features influence model outcomes.
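As one illustration from the list above, permutation feature importance can be sketched in a few lines of plain Python. This is a minimal sketch under stated assumptions, not the implementation used by any particular library: `toy_model`, the synthetic data, and the helper names are all hypothetical.

```python
import random

def toy_model(row):
    # Hypothetical model: depends strongly on feature 0, weakly on feature 1,
    # and not at all on feature 2.
    return 3.0 * row[0] + 0.5 * row[1]

def mean_squared_error(model, rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, n_repeats=10, seed=0):
    """Average increase in error when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(model, rows, targets)
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild the rows with only column j shuffled
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(mean_squared_error(model, shuffled, targets) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Synthetic data: three uniform features, targets produced by the toy model
rng = random.Random(42)
rows = [[rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
targets = [toy_model(r) for r in rows]

importances = permutation_importance(toy_model, rows, targets)
print([round(v, 3) for v in importances])
```

Shuffling a feature the model ignores leaves the error unchanged, so its importance is zero, while shuffling an influential feature raises the error noticeably.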


LIME works by making changes to an input data sample and then checking how those changes affect the prediction.

The usual term for this is perturbation. In machine learning, perturbation means adding random variations to the input data to test a model's robustness and stability. Its purpose is to identify regions of the input space where the model is likely to make incorrect predictions and to measure how sensitive the model is to minute changes in the input.

LIME alters individual feature values and observes how the prediction responds, much as we humans want to know which features mattered most for a prediction.

We can also add noise to the input data and observe how the model's predictions change as the noise increases. This tells us whether the model is overfitting to the training data and whether it is basing predictions on irrelevant or misleading features.
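The perturbation idea can be sketched without any library. The example below is a simplified sensitivity check, not LIME itself: it perturbs one feature at a time with Gaussian noise and records the average change in the prediction. The model `toy_model`, its coefficients, and the sample instance are assumptions made for illustration.

```python
import random

def toy_model(features):
    # Hypothetical stand-in for a trained model: a fixed linear score.
    alcohol, acidity, noise_feature = features
    return 0.8 * alcohol - 2.0 * acidity + 0.0 * noise_feature

def perturbation_sensitivity(model, instance, scale=0.1, n_samples=500, seed=0):
    """Average absolute change in the prediction when one feature at a time
    is perturbed with Gaussian noise of standard deviation `scale`."""
    rng = random.Random(seed)
    base = model(instance)
    sensitivities = []
    for j in range(len(instance)):
        total = 0.0
        for _ in range(n_samples):
            perturbed = list(instance)
            perturbed[j] += rng.gauss(0.0, scale)
            total += abs(model(perturbed) - base)
        sensitivities.append(total / n_samples)
    return sensitivities

instance = [10.5, 0.4, 3.0]
sens = perturbation_sensitivity(toy_model, instance)
print([round(s, 4) for s in sens])
```

A feature the model does not use shows zero sensitivity. LIME goes one step further and fits a simple weighted model to many such perturbed samples around the instance.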

LIME Example

To understand how LIME works, we use a wine dataset stored in a CSV file. First, install lime using pip.

!pip install lime
import numpy as np
import pandas as pd

# The wine file uses ';' as its column separator
data = pd.read_csv('wine.csv', delimiter=';')

The data has no missing values, and quality is the target variable. Next, we split the data into train and test sets.

from sklearn.model_selection import train_test_split

# Features (X) and target (y)
X = data.drop('quality', axis=1)
y = data['quality']

# train/test hold the features; label_train/label_test hold the quality values
train, test, label_train, label_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

Now we will use PyCaret to compare candidate regression models.

from pycaret.regression import *
s = setup(data, target = 'quality')
best = compare_models()

Now we will use ExtraTreesRegressor as the model and compute its score.

from sklearn.ensemble import ExtraTreesRegressor

model = ExtraTreesRegressor(bootstrap=False, ccp_alpha=0.0, criterion='squared_error',
                    max_depth=None, max_features=1.0, max_leaf_nodes=None,
                    max_samples=None, min_impurity_decrease=0.0,
                    min_samples_leaf=1, min_samples_split=2,
                    min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,
                    oob_score=False, random_state=6666, verbose=0)

model.fit(train, label_train)
# For a regressor, score() returns R^2 on the held-out data
score = model.score(test, label_test)

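Our model scores about 0.58. Because the model is a regressor, this score is the coefficient of determination (R²) rather than a classification accuracy. Next we will use LIME, whose tabular explainer takes four arguments here. The first is the training data, which should be a NumPy array. The second is the mode; since ExtraTreesRegressor predicts a number and not class probabilities, the mode is regression. The third is the class names for the target variable, which apply only when a classifier is explained. The fourth is the training data column names.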

import lime
from lime import lime_tabular

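explain_lime = lime_tabular.LimeTabularExplainer(
    training_data=np.array(train),
    feature_names=list(train.columns),
    class_names=['bad', 'good', 'neutral'],  # only used in classification mode
    mode='regression'
)

We will pick one row of the test set to view the LIME results.

explain = explain_lime.explain_instance(
    data_row=test.iloc[1], predict_fn=model.predict
)
explain.show_in_notebook(show_table=True)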


The explanation shows the model's prediction for the selected wine.

It also separates the features that push the prediction toward good quality from those that push it away, which tells us which column values should be lower and which should be higher for better wine quality.


What is fairness, and why should we be concerned about it? Machine learning has become part of our lives. Self-driving cars may become common on the road within the next 5 to 10 years. Big companies like Amazon use machine learning to sort and display items according to your preferences, web platforms use it to recommend pages to netizens, LinkedIn uses it to surface job candidates, and US courts use the COMPAS algorithm to predict reoffending.

Even as AI and machine learning applications have grown, the data behind them is sometimes unfair and one-sided. Analyses of the COMPAS algorithm used in US courts have reported a higher false positive rate for Black defendants than for white defendants. In other cases, hiring systems have treated women and men unequally. To make sure no one gets unfair treatment, AI must be unbiased.

Why does bias arise in machine learning? Mostly because the data is provided by humans and carries human error.

Fairness Examples

For example, the training data may be outdated. Suppose job applicants are selected based on a manager's previous hiring decisions; the profiles of other candidates are then more likely to go unnoticed or be rejected even if their skills are strong. Likewise, a group that is underrepresented in the training data can cause problems, because the model sees too few examples of that group. Unimportant or sensitive features can also be a root cause of bias.

So how do we define fairness in AI quality? It is the practice of detecting and understanding unfairness in a model.

Fairness in AI

Fairness analysis recognizes the biases in your data and checks that your model makes accurate predictions for all groups. In machine learning, fairness checks can be applied during the development phase.
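One simple way to check whether predictions are equally accurate across groups is to compare an error metric, such as the false positive rate, per group. The sketch below uses made-up labels and two hypothetical groups, "A" and "B"; the helper names are illustrative and not part of any fairness library.

```python
def false_positive_rate(y_true, y_pred):
    # Share of actual negatives (0) that were predicted positive (1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return fp / (fp + tn) if (fp + tn) else float("nan")

def fpr_by_group(y_true, y_pred, groups):
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        result[g] = false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return result

# Made-up labels and predictions for two hypothetical groups
y_true = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]
y_pred = [1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0]
groups = ["A"] * 6 + ["B"] * 6

rates = fpr_by_group(y_true, y_pred, groups)
print(rates)  # {'A': 0.25, 'B': 0.5}
```

A large gap between the groups, as in this toy data, is the kind of signal that warrants a closer look at the training data and features.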

First, we will look at imbalanced datasets. What do we mean by an imbalanced dataset? In machine learning, a dataset is imbalanced when its classes occur in very different proportions. For example, in email classification, spam usually makes up a much smaller share of the messages than legitimate email, so the data is imbalanced.

There are many methods to balance data, such as undersampling, oversampling, and adding artificial data. Undersampling means randomly removing samples from the larger class until it matches the smaller class. Oversampling means adding duplicated samples to the smaller class.

Adding artificial data is the idea behind SMOTE (Synthetic Minority Over-sampling Technique). It is similar to oversampling, but instead of duplicating rows it creates new minority-class samples from existing ones using the k-nearest neighbors algorithm.
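The three balancing strategies can be sketched in plain Python. This is an illustrative sketch under stated assumptions rather than a production implementation: the synthetic two-feature data and the function names are hypothetical, and `smote_like` only mimics the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours.

```python
import random

def random_undersample(majority, minority, rng):
    # Keep only as many majority samples as there are minority samples
    return rng.sample(majority, len(minority)) + list(minority)

def random_oversample(majority, minority, rng):
    # Duplicate random minority samples until both classes have equal size
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority) + list(minority) + extra

def smote_like(minority, n_new, k, rng):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: dist2(base, p))[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # position along the segment between base and other
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic

rng = random.Random(0)
majority = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(20)]
minority = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(5)]

under = random_undersample(majority, minority, rng)
over = random_oversample(majority, minority, rng)
synthetic = smote_like(minority, n_new=15, k=3, rng=rng)
print(len(under), len(over), len(synthetic))
```

Because each synthetic point lies between two existing minority samples, the new data stays inside the region the minority class already occupies.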

Fairness Examples with Code

We will use TruEra and WhatIfTools to understand our data better.

import pandas as pd
import numpy as np
data = pd.read_csv('wine.csv', delimiter=';')
import witwidget

from witwidget.notebook.visualization import WitWidget, WitConfigBuilder

# Pass the rows as a list of examples together with the column names
config_builder = (WitConfigBuilder(data.values.tolist(), data.columns.values.tolist()))

WitWidget(config_builder, height=800)

With the data loaded, we can see a threshold of 0.5. A score above 0.5 counts as positive, which for our wine data means high quality.


Changing Color

Looking at the visualization, green dominates the other colors, which means that wines with a quality value of six are more common in the data than wines of any other quality.


Let’s change the x-axis to the alcohol feature in our data and the y-axis to the quality.

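from sklearn.ensemble import ExtraTreesRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

data.sort_values(by=['quality'], ascending=True, inplace=True)
data = pd.get_dummies(data)
y = data['quality']
X = data.drop(columns=['quality'])
train, test, label_train, label_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# model_extra is not defined in the original listing; here it is assumed to be
# the same kind of ExtraTreesRegressor as before, refitted on the new split
model_extra = ExtraTreesRegressor(n_estimators=100, n_jobs=-1, random_state=6666)
model_extra.fit(train, label_train)
print(f"RMSE = {mean_squared_error(label_test, model_extra.predict(test)) ** 0.5}")

# `tru` is a TruEra workspace object whose setup is not shown in this article
tru.get_explainer(base_data_split="test").plot_isp(feature='free sulfur dioxide')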

Taking free sulfur dioxide as an example, we can see that lower values of this feature influence the quality of the wine.


When the data changes over time, it is called drift. Such changes can affect the model, because the model relies on data from the past to predict the future.

For example, comparing data from before and after the COVID-19 pandemic reveals a sudden shift, which raises two questions. First, what caused the shift? Second, is my model still working well, or does it need changes?

In another example, sudden changes in weather conditions or in house sale prices can shift the data, causing the model to fail at predicting the future.

Example of Drift

Similarly, consider two pairs of images, one on the left and one on the right: can a model that decides cat or not cat, or a model trained on bikes from earlier days, still be used today?


To answer this question, we first need to know what data the model was trained on. For example, if the model was trained on images of domestic cats but not big cats, it can only answer cat or not cat. Similarly, for bikes it depends on the training data, for instance whether the model learned features such as having two wheels. Real-world examples that differ from the training data can make the model's predictions fail.


Now, why does such drift occur? It can be due to a gap in data collection over some period, to too little data for one outcome, such as credit card rejections compared with acceptances, or to data that is old and has not been updated.

In the relation y = f(X), X is the input variable, f is the mapping function (the model), and y is the output, the predicted value.

Ordinarily, the relation between input and output variables is assumed to be static. But because data changes over time, the function or model will fail if it is not updated to reflect the change.

Related terms such as covariate shift and data shift describe how the data, and the relationship between input and output, change over time.
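A common way to spot a shift in a single feature is to compare its distribution in the training data with its distribution in newer data. The sketch below computes the two-sample Kolmogorov-Smirnov statistic by hand; the synthetic samples and variable names are assumptions for illustration, and what counts as "too much" drift has to be chosen for each application.

```python
import random

def ks_statistic(reference, current):
    """Two-sample Kolmogorov-Smirnov statistic: the largest gap between the
    empirical CDFs of the two samples (0 = identical, 1 = fully separated)."""
    ref, cur = sorted(reference), sorted(current)
    i = j = 0
    max_gap = 0.0
    while i < len(ref) and j < len(cur):
        x = min(ref[i], cur[j])
        # Advance both pointers past every value <= x, then compare the CDFs
        while i < len(ref) and ref[i] <= x:
            i += 1
        while j < len(cur) and cur[j] <= x:
            j += 1
        max_gap = max(max_gap, abs(i / len(ref) - j / len(cur)))
    return max_gap

rng = random.Random(0)
train_feature = [rng.gauss(0.0, 1.0) for _ in range(1000)]  # data seen in training
same_dist = [rng.gauss(0.0, 1.0) for _ in range(1000)]      # new data, no drift
shifted = [rng.gauss(1.0, 1.0) for _ in range(1000)]        # new data, mean has moved

ks_same = ks_statistic(train_feature, same_dist)
ks_shifted = ks_statistic(train_feature, shifted)
print(round(ks_same, 3), round(ks_shifted, 3))
```

The statistic stays small when both samples come from the same distribution and grows clearly when the feature's mean has moved, which makes it a useful first alarm for drift.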

How Can We Lessen the Effect of Drift?

How can we lessen the effect of drift? The answer is simple in principle: once we understand the root cause, we can reduce its effect on the model.

Sometimes the cause of the drift is an unstable feature. We can replace its values with the mean or median, or we can remove the feature altogether.

We can also identify new features that are causing the drift. Sometimes the changes behind a drift-like situation have no real impact on the model, and then no changes are required.


To explain a model's output, SHAP (SHapley Additive exPlanations) uses Shapley values in practice.

It measures how much each feature contributes to the model's prediction and whether that contribution is positive or negative.
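To make the idea of a Shapley value concrete, the sketch below computes exact Shapley values for a tiny hypothetical model by enumerating every coalition of features. It illustrates the definition only: the `model`, `instance`, and single all-zero `baseline` are assumptions, and the shap library estimates these values far more efficiently and typically relies on background data rather than a single baseline.

```python
from itertools import combinations
from math import factorial

def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every coalition of features.
    value_fn(S) returns the model output when only the features in the
    frozenset S are 'present'."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

# Hypothetical model with one additive term and one interaction term
def model(x):
    return 2.0 * x[0] + 1.0 * x[1] * x[2]

instance = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]

def value_fn(present):
    # Absent features are replaced by their baseline value
    x = [instance[j] if j in present else baseline[j] for j in range(len(instance))]
    return model(x)

phi = shapley_values(value_fn, 3)
print([round(p, 6) for p in phi])  # [2.0, 3.0, 3.0]
```

The contributions add up to the difference between the prediction for the instance (8.0) and the prediction for the baseline (0.0), and the interaction between the last two features is shared equally between them.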

!pip install shap
import shap
from sklearn.ensemble import ExtraTreesRegressor

model = ExtraTreesRegressor(bootstrap=False, ccp_alpha=0.0, criterion='squared_error',
                    max_depth=None, max_features=1.0, max_leaf_nodes=None,
                    max_samples=None, min_impurity_decrease=0.0,
                    min_samples_leaf=1, min_samples_split=2,
                    min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=-1,
                    oob_score=False, random_state=6666, verbose=0)
model.fit(train, label_train)

# TreeExplainer computes SHAP values for tree-based models
explain = shap.TreeExplainer(model)
shap_values = explain.shap_values(X)
shap.initjs()  # loads the JavaScript needed to render the plot in a notebook
# Explain the prediction for the first row
shap.force_plot(explain.expected_value, shap_values[0, :], X.iloc[0, :])

Now let us interpret the chart. Features shown in red push the prediction higher, and features shown in blue push it lower. For our wine data, residual sugar, free sulfur dioxide, and citric acid raise the predicted quality, whereas alcohol and volatile acidity pull it toward lower quality.

To understand the effect of a single feature, we will plot another chart.

shap.dependence_plot('alcohol', shap_values, X)

It shows how the predicted wine quality changes as the value of alcohol changes.

We will plot a summary of all the features.

shap.summary_plot(shap_values, X)

Here we can see that low values of free sulfur dioxide and high values of alcohol are associated with good wine quality.


The quality of AI techniques plays a vital role in helping any business progress. In this article, we showed how drift and individual features affect a model.

The key takeaways of the article are as follows:

  1. We explained the importance of model explainability, fairness, and stability as parts of AI quality in machine learning.
  2. We also explained the role of drift and Shapley values in understanding and shaping a model.
  3. We tried to explain why drift occurs, using examples alongside the wine dataset.
  4. We used TruEra, WhatIfTools, and Shapley values to explain our wine data by showing which features make its quality higher or lower.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion. 

Frequently Asked Questions

Q1. What is an example of explainability in AI?

A. One example is providing insight into how a self-driving car’s decision-making process led to a particular action, which enhances trust and accountability.

Q2. What are explainability methods in machine learning?

A. Explainability in machine learning involves techniques like LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations) that reveal the rationale behind a model’s predictions.

Q3. What is the explainability approach?

A. The explainability approach focuses on making complex AI models more transparent and interpretable to users, enabling them to understand why certain decisions are made.

Q4. What are the different types of explainability?

A. There are several types of explainability methods, including post-hoc explainability (explaining after model training), intrinsic explainability (interpretable models), and local vs. global explainability (insights into individual predictions vs. model behavior overall).
