Priyanka Dalmia — Published On November 29, 2021 and Last Modified On September 18th, 2022

This article was published as a part of the Data Science Blogathon.

In this article, we will learn about model explainability and the different ways to interpret a machine learning model.

What is Model Explainability?

Model explainability refers to the ability to understand how a machine learning model arrives at its predictions. For example, if a healthcare model predicts whether a patient is suffering from a particular disease, medical practitioners need to know which parameters the model takes into account and whether the model contains any bias. So, once the model is deployed in the real world, the model developers must be able to explain it.

Why is Model Explainability required?

  1. Being able to interpret a model increases trust in it. This becomes all the more important in high-stakes domains such as healthcare, law, and credit lending. For example, if a model is predicting cancer, the healthcare providers should know which variables drive the prediction.

  2. Once we understand a model, we can detect if there is any bias present in the model. For example – If a healthcare model has been trained on the American population, it might not be suitable for Asian people.

  3. Model Explainability becomes important while debugging a model during the development phase.

  4. Model Explainability is critical for getting models vetted by regulatory authorities such as the Food and Drug Administration (FDA), National Regulatory Authority, etc. It also helps to determine if the models are suitable to be deployed in real life.

How to develop Model Understanding?

Here we have two options at our disposal:

Option 1: Build models that are inherently interpretable – Glass Box Models.

For example, in a linear regression model of the form y = b0 + b1*x, we know that when x increases by 1 unit, y changes by b1 units, keeping other factors constant.
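A minimal sketch of why a linear model is a glass box, assuming made-up toy data and a hypothetical helper named `fit_simple_linear`: the fitted slope b1 can be read directly as the change in the prediction for a one-unit increase in x.

```python
def fit_simple_linear(xs, ys):
    """Ordinary least squares for a model of the form y = b0 + b1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Toy data generated from y = 3 + 2x (illustrative values only).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [5.0, 7.0, 9.0, 11.0, 13.0]

b0, b1 = fit_simple_linear(xs, ys)


def predict(x):
    return b0 + b1 * x


# Raising x by one unit changes the prediction by exactly b1.
effect_of_one_unit = predict(11.0) - predict(10.0)
print("b0 =", b0, "b1 =", b1, "effect of +1 in x =", effect_of_one_unit)
```

No post-hoc tool is needed here: the explanation is the coefficient itself.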

Option 2: Post-hoc explanation of pre-built models – Black Box Models

For example – In a deep learning model, the model developers are not aware of how the input variables have combined to produce a particular output.

| Glass Box Models | Black Box Models |
| --- | --- |
| Simple | Complex |
| Interpretable | Not easily interpretable |
| Low accuracy | High accuracy |
| Examples – Linear Models, Decision Tree | Examples – Random Forest, Deep Learning |

Ways to interpret a Model

There are two ways to interpret the model – Global vs Local interpretation.

| Global Interpretation | Local Interpretation |
| --- | --- |
| Helps in understanding how a model makes decisions overall | Helps in understanding how the model makes a decision for a single instance |
| Using global interpretation, we can explain the complete behavior of the model | Using local interpretation, we can explain individual predictions |
| Global interpretation helps in understanding the suitability of the model for deployment | Local interpretation helps in understanding the behavior of the model in a local neighborhood |
| Example – Predicting the risk of disease in patients | Example – Understanding why a specific person has a high risk of a disease |

Local Interpretation

We will discuss the following methods of local interpretation:

  • LIME (Local Interpretable Model-agnostic Explanations)
  • SHAP (SHapley Additive exPlanations)


LIME (Local Interpretable Model-Agnostic Explanations)

LIME provides a local interpretation by modifying the feature values of a single data sample and observing the impact on the output. It builds a surrogate model from the generated samples and the model’s predictions on them. Any interpretable model, such as a linear model, can be used as the surrogate. Because LIME is model-agnostic, it can be used with any model.

Steps involved in LIME:

  1. It creates permuted (fake) samples of the given data.

  2. It calculates the distance between the permutations and the original observation. We can also specify the distance measure.

  3. Then, it makes predictions on the new data using the black-box model.

  4. It picks the “m” features that best describe the complex model’s outcome on the permuted data. Here, we can decide the number of features, i.e. the value of “m”, that we want to use.

  5. It fits a simple model to the permuted data using these “m” features, with the similarity scores as weights.

  6. The coefficients of the simple model are used to provide explanations for the complex model’s local behavior.
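The steps above can be sketched from scratch in plain Python. This is an illustration of the idea rather than the `lime` package's API: the `black_box` function, the Gaussian perturbation scale, the exponential kernel, and all helper names are assumptions made for the example, and the feature-selection step is skipped by keeping every feature.

```python
import math
import random


def black_box(x):
    """Hypothetical opaque model: nonlinear in x[0], linear in x[1]."""
    return x[0] ** 2 + 3.0 * x[1]


def solve_linear_system(a, b):
    """Solve a @ beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - known) / m[i][i]
    return beta


def lime_explain(model, instance, n_samples=500, scale=0.5,
                 kernel_width=1.0, seed=0):
    rng = random.Random(seed)
    # Step 1: create perturbed (fake) samples around the instance.
    samples = [[v + rng.gauss(0.0, scale) for v in instance]
               for _ in range(n_samples)]
    # Step 2: turn the distance to the instance into a similarity weight.
    weights = []
    for s in samples:
        dist_sq = sum((a - b) ** 2 for a, b in zip(s, instance))
        weights.append(math.exp(-dist_sq / kernel_width ** 2))
    # Step 3: query the black-box model on the perturbed samples.
    preds = [model(s) for s in samples]
    # Steps 4-5: keep all features (no selection in this sketch) and fit a
    # weighted linear surrogate by solving the weighted normal equations.
    rows = [[1.0] + s for s in samples]
    k = len(rows[0])
    xtwx = [[sum(w * r[i] * r[j] for w, r in zip(weights, rows))
             for j in range(k)] for i in range(k)]
    xtwy = [sum(w * r[i] * y for w, r, y in zip(weights, rows, preds))
            for i in range(k)]
    beta = solve_linear_system(xtwx, xtwy)
    # Step 6: the surrogate's coefficients are the local explanation.
    return beta[0], beta[1:]


instance = [1.0, 2.0]
intercept, coefs = lime_explain(black_box, instance)
print("local coefficients:", coefs)
```

Around the point (1, 2), the true local slopes of this toy function are 2 for the first feature and 3 for the second, so the surrogate's coefficients should land close to those values. The kernel width controls how local the explanation is: a smaller width gives nearby samples relatively more weight.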

SHAP (SHapley Additive exPlanations)

SHAP shows the impact of each feature by comparing the effect of its actual value against a baseline value. The baseline is the average of all the predictions. SHAP values allow us to express any prediction as the baseline plus the sum of the effects of each feature value.

A key disadvantage of SHAP is that the computing time is high. Shapley values can also be aggregated across instances to perform global interpretations.
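To make the additive property concrete, here is a brute-force sketch that computes exact Shapley values for a tiny hypothetical model. It does not use the `shap` package; the `model` function, the `background` rows, and the helper names are assumptions chosen for illustration. Missing features are filled in from the background rows, so the baseline is the average prediction over that background.

```python
from itertools import combinations
from math import factorial


def model(x):
    """Hypothetical model with an additive term and an interaction term."""
    return 2.0 * x[0] + x[1] * x[2]


# Illustrative background data used to represent "absent" features.
background = [
    [0.0, 1.0, 2.0],
    [1.0, 0.0, 1.0],
    [2.0, 2.0, 0.0],
    [1.0, 1.0, 1.0],
]


def coalition_value(model, instance, background, subset):
    """Average prediction when features in `subset` take the instance's
    values and the remaining features come from each background row."""
    total = 0.0
    for row in background:
        mixed = [instance[i] if i in subset else row[i]
                 for i in range(len(instance))]
        total += model(mixed)
    return total / len(background)


def shapley_values(model, instance, background):
    n = len(instance)
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        # Enumerate every coalition of the other features: 2^(n-1) of them.
        for size in range(n):
            for subset in combinations(others, size):
                weight = (factorial(size) * factorial(n - size - 1)
                          / factorial(n))
                with_i = coalition_value(model, instance, background,
                                         set(subset) | {i})
                without_i = coalition_value(model, instance, background,
                                            set(subset))
                phi[i] += weight * (with_i - without_i)
    return phi


instance = [3.0, 2.0, 4.0]
baseline = coalition_value(model, instance, background, set())
phi = shapley_values(model, instance, background)

print("baseline:", baseline)
print("feature effects:", phi)
print("baseline + effects:", baseline + sum(phi))
print("model prediction:", model(instance))
```

The last two printed numbers agree, which is the additive property. The nested loop over all coalitions also shows where the computing cost comes from: the number of coalitions doubles with every added feature, which is why practical tools rely on approximations or model-specific shortcuts.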

Global Interpretation

We will discuss the following methods of global interpretation:

  • PDP (Partial Dependence Plot)
  • ICE (Individual Conditional Expectation)


PDP (Partial Dependence Plot)

PDP explains the global behavior of a model by showing the marginal effect of one or two predictors on the predicted response.

It shows the relationship between the target variable and a feature variable. Such a relationship could be complex, monotonic, or even a simple linear one. The plot assumes that the feature of interest (whose partial dependence is being computed) is not highly correlated with the other features. If the features of the model are correlated, then PDP does not provide the correct interpretation. PDP is model-agnostic, so in principle it can be computed for complex models such as neural networks too, although this can be computationally expensive.
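A minimal sketch of how a partial dependence curve can be computed by hand, assuming a hypothetical fitted `model` and a tiny made-up dataset: for each grid value, the feature of interest is overwritten in every row and the predictions are averaged.

```python
def model(row):
    """Hypothetical fitted model: quadratic in feature 0, linear in feature 1."""
    return row[0] ** 2 + 0.5 * row[1]


# Illustrative dataset with two features per row.
data = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]


def partial_dependence(model, data, feature, grid):
    """Average prediction over the data with `feature` forced to each grid value."""
    curve = []
    for value in grid:
        preds = []
        for row in data:
            modified = list(row)
            modified[feature] = value  # overwrite only the feature of interest
            preds.append(model(modified))
        curve.append(sum(preds) / len(preds))
    return curve


grid = [0.0, 1.0, 2.0, 3.0]
pdp = partial_dependence(model, data, feature=0, grid=grid)
print(pdp)  # [12.5, 13.5, 16.5, 21.5]
```

The curve recovers the quadratic shape of feature 0, shifted by the average contribution of feature 1. Overwriting a feature independently of the others is also why correlated features are a problem: it can create combinations of values that never occur in real data.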

ICE (Individual Conditional Expectation)

ICE is an extension of PDP (a global method), but ICE plots are more intuitive to understand than PDP. Using ICE, we can uncover heterogeneous relationships. While PDP can explain up to two features at a time, ICE can explain only one feature at a time.

An ICE plot shows one curve per instance: the predicted outcome for that instance at different values of a feature while the values of the other features are kept constant. Averaging these curves gives the PDP.
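The following sketch, built on a hypothetical `model` with an interaction term and a made-up three-row dataset, shows how ICE curves can reveal a heterogeneous relationship that the averaged curve hides.

```python
def model(row):
    """Hypothetical model: the effect of feature 0 depends on feature 1."""
    return row[0] * row[1]


# Illustrative dataset: feature 1 is negative, zero, and positive.
data = [[1.0, -1.0], [2.0, 0.0], [3.0, 1.0]]


def ice_curves(model, data, feature, grid):
    """One prediction curve per instance, varying only `feature`."""
    curves = []
    for row in data:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature] = value  # other features keep their values
            curve.append(model(modified))
        curves.append(curve)
    return curves


grid = [0.0, 1.0, 2.0]
curves = ice_curves(model, data, feature=0, grid=grid)

# Averaging the ICE curves point by point gives the partial dependence curve.
pdp = [sum(curve[k] for curve in curves) / len(curves)
       for k in range(len(grid))]

for row, curve in zip(data, curves):
    print("instance", row, "->", curve)
print("average (PDP):", pdp)
```

In this toy case one curve falls, one stays flat, and one rises, while their average is flat. A PDP alone would suggest that feature 0 has no effect, whereas the ICE curves show that its effect depends on the instance.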

Hands-on learning model explainability methods

We will explore the different model interpretation methods using the famous “Pima Indians Diabetes Database”  to predict whether a patient has diabetes or not.

Dataset can be downloaded here.

Model Explainability

For full code visit Github


Machine learning models are often seen as black boxes. However, in this article, we have seen how such models can be explained and why it is important to do so. Further, we have discussed ways to interpret and explain a model. Explainable AI (XAI) is an emerging field, and we may be able to automate the interpretation of ML models in the near future.
