The Bayes Theorem and Football

Joyan Last Updated : 28 Jul, 2022

6 min read

This article was published as a part of the Data Science Blogathon.

Introduction

Suppose you are at a company for a job interview, and your friend has gone in for the interview before you. You estimate your chances of selection to be low, but when you hear from your friend that the interview was pretty easy and he/she feels that they have a good chance at being selected, wouldn’t you re-estimate your chances as well? Well, that’s the essence of Bayes’ theorem.

The Bayes’ theorem is perhaps the most important in the context of probabilities. In simple terms, Bayes’ theorem helps us update our prior beliefs based on some new evidence. This is probably the gist of the Bayes theorem. I’m sure there are quite a few question marks on your mind but don’t worry, I have got you covered :).

Before getting into Bayes’ theorem, we’ll have to brush up on the concept of conditional probability. The traditional probability can be explained as the ratio of the number of events favorable to the total number of events. Utilizing this, one can derive the conditional probability formula, which is event A’s probability given that event B has occurred. Therefore, the above formula will be modified to,

Illustration using Marbles

Bag 1– 2 white, 3 black marbles

Bag 2– 4 white, 2 black marbles

Now, if I asked you to calculate the probability of getting a white marble from bag 1, you would easily be able to calculate that right? (2/5 = 40%) On the flip side, if I said that white marble is drawn, determine the probability that it came from bag 2. How would you calculate that?

Enter ‘Reverend Thomas Bayes’. He found that if you can estimate the values of P(A), P(B), and the conditional probability P(B|A), you can compute the conditional probability P(A|B). That’s why this theorem is also called the ‘probability of causes’.

Event A — Getting a white marble

Event E1 — Choosing bag 1

Event E2 — Choosing bag 2

P (choosing bag 1) = P(E1) = P(choosing bag 2) = P(E2) = 1/2

P(white marble, given that it’s from bag 1) = P(A|E1)= 2/5

P(white marble given that it’s from bag 2) = P(A|E2) = 4/6

To find, Probability of drawing a marble from bag 2, given that it’s white marble. With the help of Bayes’ theorem, we can find the P(E2|A).

After plugging in the values in the above formula, you should get your answer.

Illustration using Football

To understand it better, let’s take an out-of-the-textbook example.

Mohamed Salah is a world-class footballer who plays as a Liverpool FC forward. He is predominantly a left-footed player. Hence he takes a higher number of shots from his left foot.

Let us find out the chances that he took the shot from his right foot, given that the shot was a goal.

Note: xG (expected goal) is a measure of shot quality; what are the chances of scoring a goal from that position based on records.

Let the events be as follows,

A — scoring a goal

E1 — the shot was taken from the right foot

E2 — the shot was taken from the left foot

As shown below, Mo Salah had a total of 139 shots in the 2021–22 season. Therefore,

P(shot was taken from right) = P(E1) = 15/139 = 0.1079

P(shot was taken from left) = P(E2) = 117/139 = 0.8417

Now for the conditional probabilities, which will be equal to the xG/shot for that particular foot.

P(scoring a goal given shot was taken from the right foot) = P(A|E1) = 0.27

P(scoring a goal given shot was taken from the left foot) = P(A|E2) = 0.16

Now all that remains is to put the values in the formula mentioned in the first example, and boom, you have your answer.

So the chances that Mo Salah took the shot from his right foot, given that the shot was a goal, is 0.1779 ~ 18%. Therefore, there’s an 18% chance that the shot was taken from the right, given that it was a goal. Similarly, can you determine the probability that Mo Salah took the shot from his left foot, given that the outcome was a goal? Let me know the answer in the comments.

Bayes theorem in ML

After understanding Bayes’ theorem, let’s quickly try to understand its use in machine learning. The naive Bayes theorem is one of the fastest machine learning algorithms, and the best part is that it’s straightforward to interpret.

The model is used for classification, be it binary or multi-class classification. The end goal is to get the maximum posterior probability of a class, and that will be assigned as the predicted class.

As illustrated in the graphic above, one can calculate the posterior probability of each class given the features and select the maximum posterior probability, also called Maximum A Posteriori (MAP). The marginal probability is constant for each class and is only used for normalizing the result.

The Naive Bayes classifier is fast and can provide real-time results. It is widely used in text classification for sentiment analysis. There are different types of Naive Bayes models, depending on the data type.

Gaussian

normal distribution
continuous values

Multinomial:

multinomial distribution
multinomial categorical variables

Bernoulli:

binomial distribution
binary categorical variables

The Naive Bayes model assumes that the features are independent, meaning the data has no multicollinearity, which is not the case in real-world problems. Despite this, the model has worked well in various classification problems such as spam filtering.

Win Predictor Using Naive Bayes Model

Let’s implement the Naive Bayes model in python to better understand the model. We start by importing the data and the required libraries. The data is scraped from fotmob.com; you can check out the tutorial here.

The dataset did not have the results column. Hence I added that.

#Home — 1
#Away — 2
#Draw — 0

results = []
result = 0
for i in range(len(df)):
  if df.home_team_score[i] > df.away_team_score[i]:
     result = 1
     results.append(result)
  elif df.home_team_score[i] < df.away_team_score[i]:
     result = 2
     results.append(result)
  else:
     result = 0
     results.append(result)

df['result'] = results

Next, I replaced the team names with numbers.

df.replace({‘ATK Mohun Bagan FC’:1, ‘Bengaluru FC’:2, 
            ‘SC East Bengal’:3,‘Mumbai City FC’:4, ‘Hyderabad FC’:5, 
            ‘Odisha FC’:6,‘Northeast United FC’:7, ‘FC Goa’:8,

            ‘Jamshedpur’:9,'Chennaiyin FC’:10,‘Kerala Blasters FC’:11},inplace=True)

Training the model and predicting the results.

X = df.drop(['result'],axis=1)
y = df['result']

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

gnb = GaussianNB()
y_pred = gnb.fit(X_train, y_train).predict(X_test)

cf = confusion_matrix(y_test,y_pred)
sns.heatmap(cf, annot=True)

The model had a decent output considering there were just 110 data points in the data set. The accuracy can be increased through feature engineering and training it more data points.

Conclusion

The Bayes theorem is one of the simplest but beneficial probability theorems. It lays the foundation for various machine learning models, such as the Naive Bayes model. Although keep in mind the Naive Bayes model assumes no correlation between the features.

The science of probability is extremely captivating and especially for data enthusiasts. Predicting outcomes is the day-to-day job of a data scientist; therefore, understanding the machine learning models is crucial to optimize results.

To summarise the article:

Bayes theorem is a scientific approach to updating prior beliefs given new evidence. That is, if the probability of events A, B, and (B given A) is available, then you can compute P(A given B).
It is used in several sectors like finance, sports, and health to estimate the odds.
It lays the foundation for machine learning models such as the Naive Bayes theorem
The Naive Bayes model is mostly employed in spam filtering, sentiment analysis, and recommendation systems.

I hope this article helped you understand the theorem and its usage in machine learning. Let me know in the comments if you have any doubts regarding the theorem.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Joyan

Algorithm Beginner Probability Statistics

Free Courses

4.7

Generative AI - A Way of Life

Explore Generative AI for beginners: create text and images, use top AI tools, learn practical skills, and ethics.

4.5

Getting Started with Large Language Models

Master Large Language Models (LLMs) with this course, offering clear guidance in NLP and model training made simple.

4.6

Building LLM Applications using Prompt Engineering

This free course guides you on building LLM apps, mastering prompt engineering, and developing chatbots with enterprise data.

4.8

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Explore practical solutions, advanced retrieval strategies, and agentic RAG systems to improve context, relevance, and accuracy in AI-driven applications.

4.7

Microsoft Excel: Formulas & Functions

Master MS Excel for data analysis with key formulas, functions, and LookUp tools in this comprehensive course.

MUID

Used by Microsoft Clarity, to store and track visits across websites.

Expiry: 1 Year

Type: HTTP

_clck

Used by Microsoft Clarity, Persists the Clarity User ID and preferences, unique to that site, on the browser. This ensures that behavior in subsequent visits to the same site will be attributed to the same user ID.

Expiry: 1 Year

Type: HTTP

_clsk

Used by Microsoft Clarity, Connects multiple page views by a user into a single Clarity session recording.

Expiry: 1 Day

Type: HTTP

SRM_I

Collects user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Years

Type: HTTP

SM

Use to measure the use of the website for internal analytics

Expiry: 1 Years

Type: HTTP

CLID

The cookie is set by embedded Microsoft Clarity scripts. The purpose of this cookie is for heatmap and session recording.

Expiry: 1 Year

Type: HTTP

SRM_B

Collected user data is specifically adapted to the user or device. The user can also be followed outside of the loaded website, creating a picture of the visitor's behavior.

Expiry: 2 Months

Type: HTTP

_gid

This cookie is installed by Google Analytics. The cookie is used to store information of how visitors use a website and helps in creating an analytics report of how the website is doing. The data collected includes the number of visitors, the source where they have come from, and the pages visited in an anonymous form.

Expiry: 399 Days

Type: HTTP

_ga_#

Used by Google Analytics, to store and count pageviews.

Expiry: 399 Days

Type: HTTP

_gat_#

Used by Google Analytics to collect data on the number of times a user has visited the website as well as dates for the first and most recent visit.

Expiry: 1 Day

Type: HTTP

collect

Used to send data to Google Analytics about the visitor's device and behavior. Tracks the visitor across devices and marketing channels.

Expiry: Session

Type: PIXEL

AEC

cookies ensure that requests within a browsing session are made by the user, and not by other sites.

Expiry: 6 Months

Type: HTTP

G_ENABLED_IDPS

use the cookie when customers want to make a referral from their gmail contacts; it helps auth the gmail account.

Expiry: 2 Years

Type: HTTP

test_cookie

This cookie is set by DoubleClick (which is owned by Google) to determine if the website visitor's browser supports cookies.

Expiry: 1 Year

Type: HTTP

_we_us

this is used to send push notification using webengage.

Expiry: 1 Year

Type: HTTP

WebKlipperAuth

used by webenage to track auth of webenagage.

Expiry: Session

Type: HTTP

ln_or

Linkedin sets this cookie to registers statistical data on users' behavior on the website for internal analytics.

Expiry: 1 Day

Type: HTTP

JSESSIONID

Use to maintain an anonymous user session by the server.

Expiry: 1 Year

Type: HTTP

li_rm

Used as part of the LinkedIn Remember Me feature and is set when a user clicks Remember Me on the device to make it easier for him or her to sign in to that device.

Expiry: 1 Year

Type: HTTP

AnalyticsSyncHistory

Used to store information about the time a sync with the lms_analytics cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

lms_analytics

Used to store information about the time a sync with the AnalyticsSyncHistory cookie took place for users in the Designated Countries.

Expiry: 6 Months

Type: HTTP

liap

Cookie used for Sign-in with Linkedin and/or to allow for the Linkedin follow feature.

Expiry: 6 Months

Type: HTTP

visit

allow for the Linkedin follow feature.

Expiry: 1 Year

Type: HTTP

li_at

often used to identify you, including your name, interests, and previous activity.

Expiry: 2 Months

Type: HTTP

s_plt

Tracks the time that the previous page took to load

Expiry: Session

Type: HTTP

lang

Used to remember a user's language setting to ensure LinkedIn.com displays in the language selected by the user in their settings

Expiry: Session

Type: HTTP

s_tp

Tracks percent of page viewed

Expiry: Session

Type: HTTP

AMCV_14215E3D5995C57C0A495C55%40AdobeOrg

Indicates the start of a session for Adobe Experience Cloud

Expiry: Session

Type: HTTP

s_pltp

Provides page name value (URL) for use by Adobe Analytics

Expiry: Session

Type: HTTP

s_tslv

Used to retain and fetch time since last visit in Adobe Analytics

Expiry: 6 Months

Type: HTTP

li_theme

Remembers a user's display preference/theme setting

Expiry: 6 Months

Type: HTTP

li_theme_set

Remembers which users have updated their display / theme preferences

Expiry: 6 Months

Type: HTTP

Reading list

Basics of Machine Learning

Machine Learning Lifecycle

Importance of Stats and EDA

Understanding Data

Probability

Exploring Continuous Variable

Exploring Categorical Variables

Missing Values and Outliers

Central Limit theorem

Bivariate Analysis Introduction

Continuous - Continuous Variables

Continuous Categorical

Categorical Categorical

Multivariate Analysis

Different tasks in Machine Learning

Build Your First Predictive Model

Evaluation Metrics

Preprocessing Data

Linear Models

KNN

Selecting the Right Model

Feature Selection Techniques

Decision Tree

Feature Engineering

Naive Bayes

Multiclass and Multilabel

Basics of Ensemble Techniques

Advance Ensemble Techniques

Hyperparameter Tuning

Support Vector Machine

Advance Dimensionality Reduction

Unsupervised Machine Learning Methods

Recommendation Engines

Improving ML models

Working with Large Datasets

Interpretability of Machine Learning Models

Interpretability of Machine Learning Models

Automated Machine Learning

Model Deployment

Deploying ML Models

Embedded Devices

The Bayes Theorem and Football

Introduction

Illustration using Marbles

Illustration using Football

Bayes theorem in ML

Win Predictor Using Naive Bayes Model

Conclusion

Free Courses

Generative AI - A Way of Life

Getting Started with Large Language Models

Building LLM Applications using Prompt Engineering

Improving Real World RAG Systems: Key Challenges & Practical Solutions

Microsoft Excel: Formulas & Functions

Recommended Articles

Responses From Readers

Write for us

Congratulations, You Did It!

Analytics Vidhya (4)

brahmaid

csrftoken

Identityid

sessionid

Google (1)

g_state

Microsoft (7)

MUID

_clck

_clsk

SRM_I

SM

CLID

SRM_B

Google (7)

_gid

_ga_#

_gat_#

collect

AEC