XAI: Accuracy vs Interpretability for Credit-Related Models

guest_blog 05 Dec, 2022


Introduction

The global financial crisis of 2007 has had a long-lasting effect on the economies of many countries. In the ensuing financial and economic collapse, many people lost their jobs, savings, and much more. When too much risk is concentrated among very few players, it is considered a notable failure of the risk management framework. Since the crisis, terms like “bankruptcy,” “default,” and “financial distress” have rapidly gained popularity. Forecasting distress, default, or bankruptcy is now considered an important tool for making financial decisions and a main driver of credit risk management. The Great Depression of the 1930s, the Suez Crisis of 1956, the International Debt Crisis of 1982, and the 2007 recession have pushed the world to try to predict such downturns as early as possible.

AI Accuracy

Source: Image by pikisuperstar on Freepik

Predicting financial distress and bankruptcy is one application that has received much attention in recent years, thanks to the excitement surrounding Financial Technology (FinTech) and the continuing rapid progress in AI. Financial institutions can use credit scores to help them decide whether or not to grant a borrower credit, increasing the likelihood that risky borrowers will have their requests denied. In addition to the difficulties caused by data that is both noisy and highly imbalanced, there are now regulatory requirements that must be met, such as the General Data Protection Regulation (GDPR). Regulators expect models to be interpretable so that algorithmic findings are comprehensible and consistent. As the COVID-19 pandemic hurt the financial markets, a better understanding of lending is required to avoid a situation like 2007.

Balance: Interpretability and Accuracy

Credit scoring and bankruptcy prediction models, which predict whether an individual will repay the lender or whether a company will file for bankruptcy, must deliver both high accuracy and interpretability. Consider two friends of mine from the same university in the US, with approximately the same salary, the same age, and many other equivalent parameters, who each applied for a home loan. One got his loan approved, while the other was rejected. Don’t you think the rejected one deserves an answer? And what if a corporation has applied for a loan for expansion, but the bank has decided not to offer credit?

Interpretation

As the number of relevant independent variables has grown (for example, sentiment scores from financial statements and financial ratios), an obvious interpretation difficulty has emerged. In the past, a minimal number of independent variables and a basic model were sufficient for the model to be easily interpreted. As a result, research that aimed to pick the most significant variables and model insolvency with the chosen features and a simple statistical model gained much popularity. Machine learning algorithms offer another way to handle a feature set that is not limited in size.

Source: Image by Guo et al (2019) – Research Gate

Accuracy

The conventional feature-selection-based technique and the machine-learning-based approach each have advantages and disadvantages. Feature-selection-based approaches are readily interpretable since they use a small number of variables that have been selected as significant for predicting an event, which in the context of this article is bankruptcy or default. They often use a basic prediction model, such as a simple multivariate statistical function. However, their accuracy is typically lower than that of machine-learning-based models. In contrast, even though machine-learning-based approaches achieve greater accuracy, such models are too complex to be comprehended easily.

Explainable AI for Humans – XAI

Explainable AI

Source: Image by Caltech

Explainable artificial intelligence (XAI) attempts to simplify black-box models and make them more interpretable. It lets humans understand and trust the output of machine learning algorithms. It describes the AI-powered decision-making model’s accuracy, transparency, and results. The much-needed explainability and transparency help organizations approach AI differently. The success of machine learning has sparked a flood of AI applications. However, the machine’s inability to explain its actions to humans limits these systems’ efficacy. As AI advances, humans struggle to understand and retrace how an algorithm arrived at its results.

XAI

Source: Image by freepik on Freepik

When presented with an ML-based model, XAI techniques may be used to audit the model and put it through its paces to see whether it can consistently provide accurate results across different use cases related to credit scoring and distress prediction. These techniques, for instance, assess the model’s prediction rules for consistency with prior knowledge about the business problem, which may help reveal issues that could hamper the model’s accuracy when applied to out-of-sample data. XAI techniques may also uncover problems in the dataset that is used to train the model, such as an incomplete or biased representation of the population of interest, or training conditions that lead the model to learn faulty forecasting rules.
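One simple way to check prediction rules against prior knowledge is a monotonicity probe: vary one input while holding the others fixed and flag any range where the prediction moves against the direction domain experts expect. The sketch below is only an illustration under stated assumptions. `toy_default_risk` is a hypothetical stand-in for a trained black-box model, with a deliberately faulty rule built in, and the feature names and grids are made up.

```python
def toy_default_risk(applicant):
    """Hypothetical stand-in for a trained black-box model (not a real scorecard)."""
    risk = 0.6 - 0.05 * applicant["income_k"] + 0.3 * applicant["utilisation"]
    if applicant["income_k"] > 8:  # deliberately faulty rule, e.g. learned from a biased sample
        risk += 0.2
    return min(max(risk, 0.0), 1.0)


def monotonicity_violations(predict, base, feature, grid, expected_direction):
    """Sweep `feature` over `grid` with the other inputs fixed at `base`.

    expected_direction is +1 if the prediction should rise with the feature
    and -1 if it should fall. Returns the grid intervals that violate this.
    """
    violations = []
    previous_value, previous_pred = None, None
    for value in grid:
        pred = predict({**base, feature: value})
        if previous_pred is not None and (pred - previous_pred) * expected_direction < 0:
            violations.append((previous_value, value))
        previous_value, previous_pred = value, pred
    return violations


base = {"income_k": 4, "utilisation": 0.5}

# Prior knowledge: higher income should not increase default risk.
income_issues = monotonicity_violations(
    toy_default_risk, base, "income_k", [2, 4, 6, 8, 10, 12], expected_direction=-1
)

# Prior knowledge: higher credit utilisation should not decrease default risk.
utilisation_issues = monotonicity_violations(
    toy_default_risk, base, "utilisation", [0.1, 0.3, 0.5, 0.7, 0.9], expected_direction=+1
)

print("income violations:", income_issues)
print("utilisation violations:", utilisation_issues)
```

In this toy setup the probe flags the income range between 8 and 10, where the faulty rule kicks in, and reports nothing for utilisation. A real audit would repeat the sweep from many base applicants rather than one.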

Post-Covid, XAI has taken massive strides to address credit-related business problems:

Zest’s artificial intelligence-based technology allows lenders to tweak models for fairness by lowering the impact of discriminatory credit data without compromising performance.
FICO has publicly stated its intention to supplement human domain understanding with machine intelligence to enable the quick development of highly predictive and explainable credit risk scoring models.
Ocrolus frees lenders from biased datasets and helps them analyze financial data more efficiently. Ocrolus’ software analyzes bank statements, pay stubs, tax documents, mortgage forms, invoices, and other documents to determine loan eligibility for mortgage, business, and consumer lending, as well as for credit scoring and KYC.
Underwrite.ai used non-linear algorithmic modeling to estimate lending risk in areas with minimal or no credit bureau use.
Temenos launched transparent XAI models delivered as SaaS to help banks and credit unions speed up digital onboarding and loan processing and tackle post-Covid challenges in the lending industry.

Models: Modern vs. Conventional

Applying modern machine learning models to credit scoring and default prediction in the financial sector has generally resulted in greater predictive performance than classic models such as logistic regression. New digital technologies have boosted the adoption of machine learning models, enabling financial institutions to acquire and use bigger datasets comprising many features and observations for model training. Machine learning approaches, as opposed to conventional models such as the logit model, can empirically discover non-linearities in the relationship between the outcome variable and its predictors, as well as interaction effects among the predictors. As a result, when the training dataset is big enough, ML models are generally expected to outperform the classical statistical models used in credit assessment. In turn, higher accuracy in default forecasting may benefit lending institutions through fewer credit losses and regulatory capital-related savings.

Role of Regulators

Financial regulators have recently voiced strong support for having financial institutions use both traditional and machine learning models simultaneously, with humans assigned to handle any significant discrepancies. Most financial regulators do not limit where or how lending institutions can use black boxes. Still, Germany’s Federal Financial Supervisory Authority has recommended that institutions weigh the advantages of using a more complex model and record why they did not opt for more interpretable alternatives. In the words of many policymakers, there is no universal method for assessing costs and benefits. They suggest that banks consider the model’s justification, deployment context, and goal before implementing it. Patented work has been carried out by Equifax, which uses neural networks to analyze consumer credit risk and offers reason codes that help businesses meet regulatory obligations.

XAI

Source: Image by freepik on Freepik

Actionable Insights For The Customers

XAI techniques for credit-related models easily accommodate binary consumer decisions such as “lend” and “don’t lend.” The explanations they provide may not consider crucial facets of lending, such as the interest rate, the repayment schedule, credit limit changes, or the customer’s loan preferences. Even so, financial institutions using XAI would need to inform customers why their loan applications were denied and how to improve their credentials. Imagine the customer is advised to increase his salary or educational qualifications, or to pay his credit card bills on time for a few months, in order to get the loan approved. This is actionable information that a customer can use before reapplying for the loan.
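This kind of advice is often framed as a counterfactual explanation: the smallest change to an applicant’s profile that would flip the decision. The sketch below shows the idea with a brute-force search over one feature at a time. Everything in it is an assumption made for illustration: the logistic weights, the bias, the 0.5 approval threshold, the feature names, and the step sizes are hypothetical and do not come from any real lender.

```python
import math

# Hypothetical weights for a toy logistic credit model (illustrative only).
WEIGHTS = {
    "monthly_income_k": 0.8,
    "on_time_payment_months": 0.25,
    "credit_utilisation_pct": -0.04,
}
BIAS = -3.1
THRESHOLD = 0.5  # approve when the estimated repayment probability is at least 0.5


def approval_probability(applicant):
    z = BIAS + sum(WEIGHTS[f] * applicant[f] for f in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def single_feature_counterfactuals(applicant, actions):
    """For each actionable feature, step towards its limit until the decision flips.

    `actions` maps a feature name to (step, limit). Returns the first value of
    each feature that gets the application approved, if one exists in range.
    """
    suggestions = {}
    for feature, (step, limit) in actions.items():
        value = applicant[feature]
        while (value + step <= limit) if step > 0 else (value + step >= limit):
            value += step
            if approval_probability({**applicant, feature: value}) >= THRESHOLD:
                suggestions[feature] = value
                break
    return suggestions


applicant = {"monthly_income_k": 4.0, "on_time_payment_months": 4, "credit_utilisation_pct": 80}
actions = {
    "monthly_income_k": (0.5, 8.0),       # raise income in steps of 0.5k, up to 8k
    "on_time_payment_months": (1, 24),    # add months of on-time payments, up to 24
    "credit_utilisation_pct": (-5, 0),    # lower utilisation in 5-point steps, down to 0
}

rejected = approval_probability(applicant) < THRESHOLD
suggestions = single_feature_counterfactuals(applicant, actions)
print("rejected:", rejected)
print("suggestions:", suggestions)
```

For this toy applicant the search suggests reaching 13 months of on-time payments, bringing utilisation down to 25%, or raising monthly income to 7k. Searching one feature at a time keeps each suggestion easy to communicate, at the cost of missing cheaper combined changes.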

Success is Not Linear, or Is It?

Numerous financial institutions are now investigating inherently interpretable algorithms, such as linear regression, logistic regression, and explainable variants of gradient boosting and neural networks. There is a consistent desire to investigate new techniques for developing models that are transparent by design and require no post-hoc explanation. Explaining pre-built models in a post-hoc fashion is the other part of the story. Here, a deep neural network with, say, 200 layers, or any other black-box model, is passed to an explainer algorithm, which breaks the complex model into smaller, simpler pieces that are easier to comprehend. These simpler outputs also include a list of the features and parameters that are significant to the business problem. In both scenarios, striking a balance between high accuracy and interpretability is the need of the hour.

XAI

Source: Image by Ajitesh Kumar on Vitalflux

SHAP (SHapley Additive exPlanations) and LIME (Local Interpretable Model-agnostic Explanations) are widely used explainability approaches. SHAP uses Shapley values to score the influence of each feature on a prediction. A Shapley value averages a feature’s marginal contribution to the prediction over all possible combinations of the other input features. This thorough approach gives SHAP its consistency and local accuracy properties. LIME, on the other hand, fits a sparse linear model around each prediction to describe how the black-box model behaves locally.
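To make the Shapley idea concrete, here is a minimal sketch that computes exact Shapley values by brute force for a three-feature toy model. It is not the SHAP library, which relies on faster approximations and model-specific algorithms. The `score` function, the feature names, and the baseline applicant are hypothetical, and absent features are represented by their baseline values, which is one common convention among several.

```python
from itertools import combinations
from math import factorial


def score(applicant):
    """Hypothetical toy credit score with an interaction term (not a real model)."""
    s = 0.5 * applicant["income"] - 0.3 * applicant["debt"]
    s += 0.2 * applicant["income"] * applicant["on_time_payments"]
    return s


def shapley_values(model, instance, baseline):
    """Exact Shapley values: weighted average marginal contribution over all coalitions."""
    features = list(instance)
    n = len(features)

    def value(coalition):
        # Features in the coalition take the applicant's value; the rest stay at the baseline.
        mixed = {f: (instance[f] if f in coalition else baseline[f]) for f in features}
        return model(mixed)

    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):  # coalition sizes 0 .. n-1
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                coalition = set(subset)
                total += weight * (value(coalition | {f}) - value(coalition))
        phi[f] = total
    return phi


applicant = {"income": 4.0, "debt": 2.0, "on_time_payments": 1.0}
baseline = {"income": 2.0, "debt": 1.0, "on_time_payments": 0.0}

phi = shapley_values(score, applicant, baseline)
for name, contribution in phi.items():
    print(f"{name}: {contribution:+.2f}")

# Local accuracy: the contributions add up to the gap between the two scores.
gap = score(applicant) - score(baseline)
print(f"sum of contributions: {sum(phi.values()):.2f}, score gap: {gap:.2f}")
```

For this toy applicant the contributions are about +1.20 for income, -0.30 for debt, and +0.60 for on-time payments, which sum to the 1.50 gap between the applicant’s score and the baseline score. The brute-force loop visits every coalition, so its cost doubles with each added feature, which is why practical tools approximate.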

More importantly, which matters more: an accurate model, or a model that businesses and customers can easily understand? If the data is linearly separable, we can create a model that is both inherently interpretable and accurate. But if the data is complicated and the decision boundaries aren’t straight, we might have to rely on a more complex model to get the predictions right and then turn to post-hoc explanations.
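A tiny example illustrates the difference. In the sketch below, defaults are concentrated at both extremes of a single hypothetical risk factor, so no single cutoff on the raw value (the one-dimensional version of a straight decision boundary) can classify every case, while the same simple rule applied to a transformed feature can. The data, the factor, and the choice of transform are all assumptions made for illustration.

```python
# Hypothetical toy data: one risk factor x, defaults (label 1) at both extremes.
samples = [(x, 1 if x <= 3 or x >= 7 else 0) for x in range(1, 10)]


def accuracy(rule, data):
    return sum(rule(x) == y for x, y in data) / len(data)


def best_threshold_accuracy(data, transform=lambda x: x):
    """Best accuracy of any single-cutoff rule on transform(x), in either direction."""
    values = sorted({transform(x) for x, _ in data})
    cutoffs = [values[0] - 1] + values
    best = 0.0
    for c in cutoffs:
        above = accuracy(lambda x: int(transform(x) > c), data)
        # With binary labels, the flipped rule (default when at or below c) scores 1 - above.
        best = max(best, above, 1 - above)
    return best


# A straight boundary on the raw feature cannot capture the U-shaped pattern.
linear_acc = best_threshold_accuracy(samples)

# The same kind of rule on the distance from the midpoint separates the classes.
engineered_acc = best_threshold_accuracy(samples, transform=lambda x: abs(x - 5))

print(f"best cutoff on raw feature: {linear_acc:.2f}")
print(f"best cutoff on |x - 5|:     {engineered_acc:.2f}")
```

Here the best cutoff on the raw feature gets 6 of 9 cases right, while the cutoff on the transformed feature gets all 9. Flexible ML models learn such shapes on their own, which is where their accuracy advantage comes from. When the shape is known in advance, a hand-crafted feature can sometimes recover it while keeping the model simple.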

Conclusion

In an ideal world, explainability helps people understand how a model works and when to use it. Lending institutions, regulatory bodies, and governments must collaborate to develop AI guidelines that protect customers’ interests. If the different stakeholders don’t make the goals of explainability clear, AI will serve the interests of a few, as happened during the 2007 Global Financial Crisis. A well-designed XAI solution considers stakeholders’ needs and customizes the explanation accordingly. AI and machine learning algorithms have already revolutionized the operations of lending institutions. Some of the key takeaways about XAI are as follows:

Model accuracy/performance versus interpretability – XAI aims to make AI models more intuitive and comprehensible without sacrificing accuracy/performance.
XAI implementation and scaling – Inexplicability has long prevented lending institutions from fully utilizing AI. However, XAI has helped institutions not only with smooth onboarding but also with personalization.
Responsible AI – Transparency (how the credit model reached an outcome and which parameters it used), informativeness (giving humans new information to help them make decisions), and uncertainty estimation (quantifying how reliable a prediction is).