What is the first measure that comes to mind when you have to test a classification model? Do you always check accuracy? Is accuracy enough to test the model?

In this read, you will learn why accuracy is not always the best measure for testing a classification model, how you can draw more insight from your model, and get a sense of how you can improve it. I will also share one great trick for remembering the mighty confusion matrix, so that next time you won't find it confusing. So, without further ado, let's start!

Here is what we will cover:

- Confusion Matrix: a simple definition
- TP and TPR
- TN and TNR
- FP and FPR
- FN and FNR
- Accuracy
- Error Rate
- General Understanding
- Why is the confusion matrix better than accuracy?
- Precision and Recall
- F1 Score
- Trick to remember confusion matrix

As the name suggests, it is a matrix: a matrix of predicted and actual target values. This matrix compares the predicted target values with the actual target values. Note that it works on hard class labels and cannot process probability scores directly.

Let's understand it with the help of a small dataset:

| Data points | Actual class labels | Predicted class labels |
| --- | --- | --- |
| x1 | y1 | Y1 |
| x2 | y2 | Y2 |
| x3 | y3 | Y3 |
| . | . | . |
| xn | yn | Yn |

Here, x1, x2, x3, ..., xn are the independent data points; y1, y2, y3, ..., yn are the actual target values (actual class labels); and Y1, Y2, Y3, ..., Yn are the predicted class labels. Because the confusion matrix cannot process probability scores, all these class labels are hard labels; here they are binary.

Here, every class label is either 0 or 1 (0 represents the negative label and 1 the positive label). So, the confusion matrix for a binary classification model will be:

| | Predicted: Negative | Predicted: Positive |
| --- | --- | --- |
| Actual: Negative | TN | FP |
| Actual: Positive | FN | TP |

N = total negatives

P = total positives

This is how a confusion matrix looks for a binary classification model. Now let's understand TN, TP, FN, and FP further.
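To make the four cells concrete, here is a minimal sketch of counting them from two lists of binary labels. The function name and sample labels are illustrative, not from any library:

```python
# A minimal sketch: counting the four confusion-matrix cells for
# binary labels (0 = negative, 1 = positive).

def confusion_counts(actual, predicted):
    """Return (TN, FP, FN, TP) for two equal-length binary label lists."""
    tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    tn = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 0)
    fp = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    fn = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    return tn, fp, fn, tp

actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 0, 1, 1, 0]
print(confusion_counts(actual, predicted))  # (3, 1, 1, 3)
```

Libraries such as scikit-learn provide the same counts via `sklearn.metrics.confusion_matrix`, but the hand-rolled version above makes the definitions explicit.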

**True Positive (TP) –** Both the predicted value and the actual value are positive, i.e., the model correctly predicts the class label to be positive.

**True Positive Rate (TPR) –** The ratio of true positives to total positives, i.e.,

TPR = TP / P

TPR = TP / (FN+TP)

**True Negative (TN) –** Both the predicted value and the actual value are negative, i.e., the model correctly predicts the class label to be negative.

**True Negative Rate (TNR) –** The ratio of true negatives to total negatives, i.e.,

TNR = TN / N

TNR = TN / (TN+FP)

**False Positive (FP) –** The predicted value is positive, but the actual value is negative, i.e., the model falsely predicted these negative class labels to be positive.

**False Positive Rate (FPR) –** The ratio of false positives to total negatives, i.e.,

FPR = FP / N

FPR = FP / (TN+FP)

NOTE: A false positive (FP) is also called a **'type-1 error'**.

**False Negative (FN) –** The predicted value is negative, but the actual value is positive, i.e., the model falsely predicted these positive class labels to be negative.

**False Negative Rate (FNR) –** The ratio of false negatives to total positives, i.e.,

FNR = FN / P

FNR = FN / (FN+TP)

NOTE: A false negative (FN) is also called a **'type-2 error'**.

**Accuracy –** The ratio of correctly predicted class labels to all class labels. It tells us how often our model is correct.

Accuracy = (TN+TP) / (N+P)

**Error rate –** The ratio of incorrectly predicted class labels to all class labels. It tells us how often our model errs.

Error rate = (FN + FP) / (N+P)
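All of these rates follow directly from the four cell counts. A quick sketch, using illustrative counts:

```python
# Sketch: computing the rates above from the four confusion-matrix cells.
# The counts here are illustrative.
tp, tn, fp, fn = 70, 880, 30, 20

p = tp + fn            # total actual positives
n = tn + fp            # total actual negatives

tpr = tp / p           # true positive rate
tnr = tn / n           # true negative rate
fpr = fp / n           # false positive rate (type-1 error rate)
fnr = fn / p           # false negative rate (type-2 error rate)

accuracy   = (tn + tp) / (n + p)
error_rate = (fn + fp) / (n + p)

print(accuracy, error_rate)  # 0.95 0.05

# Sanity checks: complementary rates sum to 1, and so do
# accuracy and error rate.
assert abs(tpr + fnr - 1) < 1e-9
assert abs(tnr + fpr - 1) < 1e-9
assert abs(accuracy + error_rate - 1) < 1e-9
```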

If the model is sensible and not dumb, then TPR and TNR will be high while FPR and FNR will be low, i.e., the elements on the principal diagonal of the matrix will be large and the off-diagonal elements will be small.

So, by looking at the rates we have discussed above, we can conclude whether the model is dumb or sensible, regardless of whether the dataset is balanced or imbalanced.

Let's take a very simple example and understand this more deeply.

Suppose we have an imbalanced dataset that is 95% negative and 5% positive. So in this dataset, for any given 'x', the actual 'Y' will be negative 95% of the time.

Let's say after fitting the model our counts look like this:

TP = 70, TN = 880, FP = 30, FN = 20

So, our accuracy will be:

Accuracy = (880+70) / (70+880+30+20)

Accuracy = 950 / 1000 = 0.95

i.e., 95%

We will think that the model is 95% accurate, i.e., that it predicts correctly 95% of the time. But notice that a completely dumb model, one that simply predicts negative for every point, would also be right about 95% of the time on this dataset. Accuracy alone cannot distinguish such a dumb model from a sensible one.
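To see this concretely, here is a small sketch with synthetic labels: an always-negative "model" on a 95%-negative dataset still scores 95% accuracy, while its true positive rate exposes it as useless.

```python
# Synthetic, illustrative dataset: 95% negative (0), 5% positive (1).
actual = [0] * 950 + [1] * 50

# A "dumb" model that always predicts negative.
predicted = [0] * 1000

correct = sum(1 for a, p in zip(actual, predicted) if a == p)
accuracy = correct / len(actual)
print(accuracy)  # 0.95 -- looks great!

# But its TPR gives it away: it finds none of the positives.
tp = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
total_positives = sum(actual)
print(tp / total_positives)  # 0.0
```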

We can overcome this problem by using **precision** and **recall**, which are derived from the confusion matrix.

Now, I believe, you get my point: accuracy is not always the best metric to measure model performance.

**Precision:** Of all the points that the model predicted to be positive, what percentage of them are truly positive?

Precision = TP / (FP+TP)

**Recall:** Of all the actually positive points, what percentage of them are predicted to be positive?

Recall = TP / (FN+TP)

In most cases, we want both precision and recall to be high, but there is usually a trade-off: when precision is high, recall tends to be low, and vice versa. To balance the two, we have another metric called the F1 score.

**F1 Score:** The harmonic mean of precision and recall. It is maximum when precision equals recall.

F1 = 2 × (Precision × Recall) / (Precision + Recall)
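Putting precision, recall, and F1 together with the counts from the earlier example (TP = 70, FP = 30, FN = 20), a minimal sketch:

```python
# Precision, recall, and F1 from the example counts above.
tp, fp, fn = 70, 30, 20

precision = tp / (tp + fp)   # 0.7  (70 of 100 predicted positives are real)
recall    = tp / (tp + fn)   # ~0.778 (70 of 90 actual positives were found)

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 3), round(recall, 3), round(f1, 3))
```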

Here is our confusion matrix again, with the four cells left blank as (a), (b), (c), and (d):

| | Predicted: Positive | Predicted: Negative |
| --- | --- | --- |
| Actual: Positive | (a) | (b) |
| Actual: Negative | (c) | (d) |

Two steps to remember:

- What is the predicted label? (this gives the second letter)
- Are we correct? (this gives the first letter)

Let me elaborate. To fill in (a) in the matrix above, we first ask ourselves what the predicted label is, and write it in the second place: here it is positive, so we write P.

Then we ask the second question: are we correct? If yes, we write true (T); if no, we write false (F), in the first place. Here the actual label is also positive, so we are correct and write T, giving TP.

Similarly for (b): what is the predicted value? N. Are we correct? No, so F. So in place of (b) we can write FN.

Similarly, we can fill in (c) and (d) as FP and TN. The rates then follow directly. For example:

TPR = number of true positives / total number of positives

So, the number of true positive points is TP, and the total number of actually positive points is P = TP + FN.

i.e.,

TPR = TP / P

TPR = TP / (FN+TP)

Similarly, we can see that,

TNR = TN / N

TNR = TN / (TN+FP)

Using the same trick, we can write FPR and FNR formulae.

So now, I believe you understand the confusion matrix and the different formulae related to it: how and why it is used, and how it is better than accuracy alone. See, the confusion matrix is not so confusing anymore!

We will soon come up with another article with some more tricks and in-depth intuition.

Stay Well!

