Explainable AI: Demystifying the Black Box Models

Last Updated : 25 Oct, 2023

10 min read

Introduction

In today’s data-driven world, machine learning is playing an increasingly prominent role in various industries. Explainable AI aims to make machine learning models more transparent to clients, patients, or loan applicants, helping build trust and social acceptance of these systems. Now, different models require different explanation methods, depending on the audience. This article discusses the importance and applications of Explainable AI (XAI) as well as the various approaches to demystify black box models.

Explainable AI: Demystifying the Black Box Models | DataHour by Dr. Farha Anjum Khan

Learning Objectives:

Recognize the importance and applications of Explainable AI (XAI) in enhancing transparency and trust in machine learning.
Differentiate between global and local approaches to XAI and understand their use cases.
Explore key XAI techniques such as LIME, SHapley Additive Explanations, and counterfactual explanations for model transparency and interoperability.

What is Explainable AI?
Types of Explainability
Importance of Explainable AI
Challenges and Considerations in Explainability
Model Development in Explainable AI
Different Approaches to Explainability
Advanced Approaches to Explainable AI
About the Author: Dr. Farha Anjum Khan

What is Explainable AI?

The use of machine learning in decision-making is now an integral part of every industry, company, and organization. These decisions impact a wide range of stakeholders, including business owners, managers, end-users, domain experts, regulators, and data scientists. It is therefore vital that we understand how these models make decisions.

Regulators have become increasingly interested in machine learning, particularly in highly regulated sectors where decisions carry significant consequences. Transparency is crucial in fields like finance, healthcare, crime, and insurance, where it’s important to know how and why a particular decision was made.

Explainable AI aims to make machine learning models more transparent to all of these stakeholders. It also assists data scientists in understanding their work better. However, XAI is not always necessary, and we must use it wisely. In some scenarios, overly complex explanations may lead to distrust or confusion.

Types of Explainability

Explainable AI comes in various flavors, each catering to different use cases and stakeholders. We need different methods to explain AI models to business owners, managers, users, domain experts, regulators, and data scientists. Depending on the stakeholders’ needs, we might require different explanation methods. The rise of regulators in the AI field, proposing requirements and regulations, has made explainability even more crucial in strictly regulated industries like finance, healthcare, and criminal justice. Machine learning models in these areas must be able to provide transparency to clients, patients, or loan applicants.

Explainable AI encompasses various methods and techniques, categorized into several types:

Ad-hoc/Intrinsic Methods: These methods restrict the complexity of the model before training. They involve selecting specific algorithms and limiting the number of features.
Post-hoc Methods: These methods are applied after a model is trained. They are more versatile in explaining black-box models.
Model-Specific vs. Model-Agnostic: Some methods are specific to certain types of black-box models, while others can be applied universally.
Local vs. Global Methods: Local methods provide explanations for individual instances or data points, while global methods offer insights into overall model behavior across the dataset.

Importance of Explainable AI

Explainable AI is crucial for building trust and acceptance of machine learning models. It helps end-users understand why a particular decision was made. It also fosters transparency, detects biases, improves robustness and reliability, aids data scientists in understanding their models, and helps in gaining trust in machine learning systems.

However, explainability is not always necessary and can sometimes lead to distrust, especially when complex and opaque processes are uncovered. It’s essential to choose the right level of explainability for the context.

Challenges and Considerations in Explainability

Explainable AI comes with its set of challenges. One of the major challenges is the need to balance between accuracy and transparency. Sometimes, achieving high accuracy with complete transparency is not feasible. Additionally, selecting the right explanation method for a specific model can be challenging. It’s crucial to avoid revealing sensitive information or creating explanations that can be exploited.

Model Development in Explainable AI

The quest for explainability often involves striking a balance between model accuracy and transparency. While models like decision trees are intrinsically explainable, they might sacrifice accuracy. On the other hand, highly accurate models, like deep neural networks, often lack transparency. Explainable AI aims to offer reasonably high accuracy while providing interpretability. It fills the gap between accuracy and transparency in a machine learning model.

Different Approaches to Explainability

Explainable AI can be approached in various forms. The approach taken influences the type of inferences gained and how the insights are explained. Therefore, the approach depends on the scale of the project and the use case. We will now look into some of these approaches such as global, local, counterfactual explanations, visualization, etc.

Global Approaches

Explainable AI can be approached from a global perspective. Global explanations focus on providing insights into the overall behavior of a model for an entire dataset. The important aspect here is to understand the general behavior of the model across different input instances. Decision trees and linear models often fall under this category, offering a high-level explanation of how features influence predictions.

Local Approaches

In contrast to global approaches, local explainability focuses on individual instances or data points. Local explanations aim to provide insight into why a specific prediction was made for a particular input. They offer a more detailed, case-specific view of model decisions. Techniques like LIME (Local Interpretable Model-agnostic Explanations) fall under this category, enabling us to understand why a model made a specific prediction for a particular input.

Counterfactual Explanations

Now, let’s explore the world of counterfactual explanations. Counterfactual explanations are all about understanding how changes and adjustments in input features can lead to different model outputs. It’s like asking, “What if I had different values for these features?”

Consider a scenario where someone’s loan application is rejected, and they want to know how to change their circumstances to get approval. Counterfactual explanations provide actionable insights. They show how altering specific features can lead to a desired outcome. This approach is incredibly valuable in decision-making processes and can help individuals understand how to achieve their goals.

Counterfactual explanations use case and scenario | XAI

However, the challenge with counterfactual explanations is that there can be multiple valid counterfactuals for a single instance. Selecting the most appropriate one can be tricky. Overall, counterfactual explanations provide a unique perspective on interpretability, helping us understand how changes in input features can impact the model’s predictions.

Visualization Approaches

Visualization is an effective way to provide explanations in Explainable AI. Two common visualization methods are Partial Dependence Plots (PDPs) and Individual Conditional Expectation (ICE) Plots.

Partial Dependence Plots

PDPs provide a visual representation of how one or two features affect the model’s predictions. They show the average marginal effect of a feature on the target variable. PDPs are easy to comprehend, making them ideal for explaining machine learning models to various stakeholders.

Individual Conditional Expectation Plots

Individual Conditional Expectation (ICE) plots offer a more detailed view of how feature variations affect individual instances. They uncover heterogeneous effects that are often missed by PDPs. ICE plots provide a distribution of model predictions for each instance, offering a nuanced understanding of a model’s behavior.

Global Surrogate Models

Global surrogate models provide an alternate approach to understanding complex black-box models. These models are built on top of the predictions made by black-box models. They offer a more interpretable and understandable representation of the model’s behavior. Decision trees are commonly used as surrogate models to explain the predictions of black-box models.

Decision tree surrogate after random forest

Feature Importance Analysis

Feature importance analysis is a critical part of Explainable AI, providing insights into which features have the most significant impact on model predictions. Permutation-based feature importance analysis is a technique to assess the importance of each feature. It measures the increase in a model’s prediction error when a specific feature is randomly shuffled. This allows us to understand the role of each feature in the model’s decision-making process.

Contrastive Explanation and Pertinent Negatives

Finally, we have a contrastive explanation, which offers a unique approach to model interpretability. It focuses on explaining decisions by highlighting what features are missing. Pertinent Negatives, a subset of contrastive explanations, help us understand the importance of absent features in classification.

Think of it this way: if someone has the symptoms of cough, cold, and fever but no sputum and chills, it could be pneumonia or the flu. But when sputum and chills are absent, it’s almost certainly the flu. Pertinent Negatives help us make more accurate diagnoses by considering the importance of missing features.

These explanations are particularly valuable when you want to understand the impact of the absence of certain features on model predictions. They provide a different perspective on interpretability and can help uncover the hidden factors that influence decisions.

Advanced Approaches to Explainable AI

By now, you’ve probably gained a good understanding of the various ways to approach explainable AI. Now, apart from those mentioned above, there are a few more advanced approaches, specific to certain tasks. Let’s have a look at some of them.

LIME: Demystifying Local Explanations

Local Interpretable Model-Agnostic Explanations (LIME) is a local approach that provides model-agnostic explanations for individual predictions. What makes it stand out is its ability to work with a wide range of data types, including tabular, text, and images, and it can handle both classification and regression tasks.

Local Interpretable Model-Agnostic Explanations | XAI

So, how does LIME work? Imagine you have a complex, “black box” model, and you want to understand why it made a specific prediction for a particular instance. LIME takes the instance you’re interested in, perturbs the data around it by generating new data points, and observes the predictions made by your model for these new samples. This process helps LIME approximate how the model behaves locally around the instance of interest.

Next, LIME creates an interpretable model using the new data. But here’s the twist: the interpretable model gives more weight to the data points that are closer to the instance you’re explaining. This means that the data points generated nearby are considered more heavily when building the interpretable model.

The result is a local explanation for the prediction of your “black box” model. This explanation is not only human-friendly but also model-agnostic, meaning it can be applied to any machine learning model. However, LIME does have its challenges. The definition of the neighborhood around an instance is not always clear, and the manual selection of features can be tricky. The sampling process can also be improved to make LIME even more effective.

SHapley Additive Explanations: The Power of Collaboration

Now, let’s dive into SHapley Additive Explanations. This approach is rooted in cooperative game theory. It is all about calculating the individual contribution or utility of features in a team, just like how game theory calculates the importance of each player in a cooperative game.

Consider a team of four people who won a prize in a chess game. They want to fairly distribute the prize among them based on their contributions. To calculate this, they explore various combinations of the players and estimate the utility for each player in every combination. This helps determine how much each player contributed to the overall success.

SHapley Additive Explanations apply the same principle to machine learning models. Instead of players, we have features, and instead of the prize, we have predictions. This approach calculates the marginal value of each feature on a specific instance and compares it to the average prediction across the dataset.

The beauty of SHapley values is that they provide an instance-based approach and work for both classification and regression problems. Whether you’re dealing with tabular, text, or image data, SHapley Additive Explanations can help you understand the importance of different features. However, one drawback is that it can be computationally intensive due to the vast number of combinations it explores. To mitigate this, you can use sampling on a subset of the data to speed up the process.

Layer-wise Relevance Propagation: Peeling Back the Layers

Layer-wise Relevance Propagation, or LRP, is a technique primarily used for image data. It helps us understand which pixels in an image have the most influence on a model’s classification decision. LRP operates by computing the relevance of each layer and neuron in a neural network. This process reveals the most influential pixels and helps us visualize what the model “sees” in an image.

Layer-wise Relevance Propagation | Explainable AI

Let’s consider a neural network that classifies images. LRP starts from the model’s output and works its way backward. It estimates the relevance of each neuron and layer in the network, checking the contribution of each neuron to the final classification. The result is a heatmap that highlights the pixels that had the most influence on the model’s decision.

One challenge with LRP-generated heatmaps is that they can be noisy and similar, making it difficult to pinpoint the exact influential pixels. However, LRP is a valuable tool for understanding how neural networks make image classification decisions.

Conclusion

Explainable AI (XAI) is a vital component in the realm of artificial intelligence and machine learning. It clarifies the intricate inner workings of AI models, ensuring transparency and trust. We explored various facets of XAI, from its importance to specific techniques like LIME, SHapley Additive Explanations, counterfactual explanations, LRP, and contrastive explanations. These methods empower users to understand, question, and fine-tune machine learning models for different contexts.

Key Takeaways:

Explainable AI offers diverse methods, such as global and local approaches, counterfactual explanations, and feature importance analysis, catering to various stakeholders and industries.
XAI enhances trust in machine learning systems through transparency, bias detection, and reliability improvement. Striking the right transparency-complexity balance is essential.
Techniques like LIME, SHapley Additive Explanations, LRP, and contrastive explanations provide insights into model behavior and foster interpretability, aiding AI system comprehension.

Frequently Asked Questions

Q1. Why is Explainable AI important?

Ans. Explainable AI fosters trust, detects biases, and enhances transparency in machine learning systems, promoting their acceptance and reliability.

Q2. What are global and local approaches in XAI?

Ans. Global approaches provide insights into an entire dataset, while local approaches focus on individual instances, offering case-specific explanations.

Q3. How do techniques like LIME and SHapley Additive Explanations work?

Ans. LIME perturbs data around an instance, creates an interpretable model, and provides local, model-agnostic explanations. SHapley Additive Explanations calculate feature importance based on cooperative game theory, aiding interpretability for both classification and regression problems.

About the Author: Dr. Farha Anjum Khan

Dr. Farha Anjum Khan is a seasoned data scientist and Lead Technical Architect at Continental. Dr. Khan has an extensive six-year career in data science. Her journey began with academic pursuits in experimental nuclear and particle physics, where she gained invaluable experience at prestigious institutions in Germany. Her passion for data, algorithms, and artificial intelligence led her to bridge the gap between technology and responsible AI, making her an expert in XAI.

DataHour Page: https://community.analyticsvidhya.com/c/datahour/explainable-ai-demystifying-the-black-box-models

LinkedIn: https://www.linkedin.com/in/dr-farha-anjum-khan-617b0813/

Artificial Intelligence Classification Data Visualization Machine Learning Regression

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Basics of Machine Learning

Machine Learning Lifecycle

Importance of Stats and EDA

Understanding Data

Probability

Exploring Continuous Variable

Exploring Categorical Variables

Missing Values and Outliers

Central Limit theorem

Bivariate Analysis Introduction

Continuous - Continuous Variables

Continuous Categorical

Categorical Categorical

Multivariate Analysis

Different tasks in Machine Learning

Build Your First Predictive Model

Evaluation Metrics

Preprocessing Data

Linear Models

KNN

Selecting the Right Model

Feature Selection Techniques

Decision Tree

Feature Engineering

Naive Bayes

Multiclass and Multilabel

Basics of Ensemble Techniques

Advance Ensemble Techniques

Hyperparameter Tuning

Support Vector Machine

Advance Dimensionality Reduction

Unsupervised Machine Learning Methods

Recommendation Engines

Improving ML models

Working with Large Datasets

Interpretability of Machine Learning Models

Interpretability of Machine Learning Models

Automated Machine Learning

Model Deployment

Deploying ML Models

Embedded Devices

Explainable AI: Demystifying the Black Box Models

Introduction

Table of Contents

What is Explainable AI?

Types of Explainability

Importance of Explainable AI

Challenges and Considerations in Explainability

Model Development in Explainable AI

Different Approaches to Explainability

Global Approaches

Local Approaches

Counterfactual Explanations

Visualization Approaches

Partial Dependence Plots

Individual Conditional Expectation Plots

Global Surrogate Models

Feature Importance Analysis

Contrastive Explanation and Pertinent Negatives

Advanced Approaches to Explainable AI

LIME: Demystifying Local Explanations

SHapley Additive Explanations: The Power of Collaboration

Layer-wise Relevance Propagation: Peeling Back the Layers

Conclusion

Frequently Asked Questions

About the Author: Dr. Farha Anjum Khan

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Congratulations, You Did It!

Analytics Vidhya (4)

brahmaid

csrftoken