Have you ever read about Bayes’ theorem and wondered why its proof is so mathematically dense? It can be confusing. Imagine instead a canvas of shapes and colours that shows Bayesian reasoning with no equations involved. In this article, we will demystify Bayes’ Theorem with intuitive shapes and areas, and see that conditional probability makes geometric sense. Bayes’ theorem is a fundamental concept in probability, yet the purely mathematical explanation leaves many people without any intuition for it. By the end of this article, you will understand Bayes’ Theorem and its proof visually. Let’s get started.
Before jumping into Bayes’ Theorem, let’s first understand what Conditional Probability is.
Conditional probability is how likely an event is to happen given that another event has already happened. In other words, knowing that one event has occurred changes the probability of the other.
The mathematical formula for conditional probability is:
P(A∣B) = P(A and B) / P(B)
Where,
P(A∣B) is the conditional probability of event A occurring given that event B has already occurred.
P(A and B) is the joint probability of both event A and event B occurring.
P(B) is the marginal probability of event B occurring.
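To make the three terms concrete, here is a minimal sketch in Python. The die-roll scenario and the names `outcomes`, `A`, `B`, and `prob` are assumptions chosen for illustration, not part of the original article.

```python
from fractions import Fraction

# Hypothetical example: one roll of a fair six-sided die.
outcomes = {1, 2, 3, 4, 5, 6}
A = {o for o in outcomes if o % 2 == 0}  # event A: the roll is even
B = {o for o in outcomes if o > 3}       # event B: the roll is greater than 3

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

p_a_and_b = prob(A & B)          # joint probability P(A and B)
p_b = prob(B)                    # marginal probability P(B)
p_a_given_b = p_a_and_b / p_b    # conditional probability P(A|B)

print(p_a_and_b, p_b, p_a_given_b)  # prints: 1/3 1/2 2/3
```

Once we know the roll is greater than 3, only the outcomes 4, 5, and 6 remain possible, and two of them are even, so P(A|B) is 2/3.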
Bayes’ Theorem, also known as Bayes’ Rule or Bayes’ Law, is used to determine the conditional probability of event A when event B has already occurred. In simple terms, it is a way to update your understanding of some event based on new information. It helps you calculate the probability of a cause (event A) given that you have already observed an effect (event B).
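Let’s take a simple example. Suppose you start out believing a restaurant is just average, and then you notice a long line outside it.
Bayes’ Theorem helps you update that belief: a long line makes it more probable that the restaurant is good, so you revise your initial “average” belief.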
Bayes’ Theorem is written as:
P(A|B) = P(B|A) · P(A) / P(B)
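The restaurant example can be sketched in Python as follows. All three input probabilities are hypothetical numbers chosen only for illustration, and the helper `bayes_posterior` is an assumed name. The sketch also assumes P(B) is obtained from the law of total probability, which the article does not spell out.

```python
from fractions import Fraction

def bayes_posterior(prior, likelihood, likelihood_if_not):
    """Return P(A|B) from P(A), P(B|A) and P(B|not A)."""
    # P(B) via the law of total probability over A and not-A.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence

# Hypothetical numbers, chosen only for illustration.
p_good = Fraction(3, 10)             # P(A): prior belief that the restaurant is good
p_line_given_good = Fraction(8, 10)  # P(B|A): chance of a long line if it is good
p_line_given_not = Fraction(2, 10)   # P(B|not A): chance of a long line otherwise

posterior = bayes_posterior(p_good, p_line_given_good, p_line_given_not)
print(posterior)  # prints: 12/19
```

With these assumed inputs, seeing the long line raises the belief that the restaurant is good from 3/10 to 12/19, roughly 0.63.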
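We have now covered all the prerequisites for understanding Bayes’ Theorem.
Let’s dive into the Bayes’ Theorem visualization:
Let’s break the visualization into parts to make it easier to follow: the circle stands for event B, a second shape stands for event A, their overlap stands for A and B occurring together, and the whole canvas stands for all possible outcomes.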
According to Bayes’ Theorem:
P(A|B) = P(B|A) · P(A) / P(B)
Here, P(A|B) is the overlap area divided by the circle area. So we have to prove that the other side of the equation is also equal to the overlap divided by the circle:
P(B|A) · P(A) / P(B) = P(A|B)
That is, Left Hand Side (LHS) = Right Hand Side (RHS), where the LHS is P(B|A) · P(A) / P(B) and the RHS is P(A|B).
Let’s substitute the shapes into the LHS. P(B|A) becomes the overlap divided by the shape for A, P(A) becomes the shape for A divided by the whole canvas, and P(B) becomes the circle divided by the whole canvas. Shapes that appear in both the numerator and the denominator cancel, just as common factors cancel in a fraction.
After cancelling the matching shapes, we are left with the overlap divided by the circle. This fraction is exactly P(A|B), the required RHS.
Hence, LHS = RHS, and Bayes’ Theorem is proved using shapes and Venn diagrams. This is the visual proof of Bayes’ Theorem.
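The cancellation can also be written out symbolically. This is a sketch of the same argument, where the labels overlap, shape A, circle B, and canvas are names assumed here for the areas in the picture (the block uses `\text` and `\dfrac` from the amsmath package):

```latex
\[
\frac{P(B \mid A)\, P(A)}{P(B)}
= \frac{\dfrac{\text{overlap}}{\text{shape}_A} \cdot \dfrac{\text{shape}_A}{\text{canvas}}}
       {\dfrac{\text{circle}_B}{\text{canvas}}}
= \frac{\text{overlap}}{\text{canvas}} \cdot \frac{\text{canvas}}{\text{circle}_B}
= \frac{\text{overlap}}{\text{circle}_B}
= P(A \mid B)
\]
```

The area of shape A cancels in the first step and the area of the canvas cancels in the second, which is why neither appears in the final fraction.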
Bayes’ Theorem is a fundamental concept in probability. Although the idea itself is simple, it is applied across a wide range of domains.
The proof of Bayes’ theorem is just about comparing parts of a whole. When you look at the overlapping shapes, you see how proportions tell the whole story. You can draw your own colorful circles and diamonds (or whatever shapes you like) to set up different scenarios and see Bayes’ theorem at work, not just in equations. Once you play with these visuals, intuition builds quickly, and you are ready to go deeper into Bayesian inference: priors, likelihoods, and updating beliefs. It all starts from simple overlapping areas. Visualizing an equation makes it easier to understand and apply.
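If you would rather experiment in code than on paper, the following Python sketch draws a circle and a diamond on a grid and measures areas by counting cells. The grid size, the shape positions, and the function names `in_circle` and `in_diamond` are all assumptions made for illustration; changing them should leave the final check intact as long as both shapes contain at least one cell.

```python
from fractions import Fraction

SIZE = 40  # the canvas is a SIZE x SIZE grid of equally likely cells

def in_circle(x, y):
    # Event B: cells within distance 12 of the point (24, 20).
    return (x - 24) ** 2 + (y - 20) ** 2 <= 12 ** 2

def in_diamond(x, y):
    # Event A: cells within Manhattan distance 12 of the point (14, 20).
    return abs(x - 14) + abs(y - 20) <= 12

cells = [(x, y) for x in range(SIZE) for y in range(SIZE)]

# Areas measured as cell counts.
canvas = len(cells)
diamond = sum(in_diamond(x, y) for x, y in cells)
circle = sum(in_circle(x, y) for x, y in cells)
overlap = sum(in_diamond(x, y) and in_circle(x, y) for x, y in cells)

p_a = Fraction(diamond, canvas)           # P(A): diamond / canvas
p_b = Fraction(circle, canvas)            # P(B): circle / canvas
p_b_given_a = Fraction(overlap, diamond)  # P(B|A): overlap / diamond
p_a_given_b = Fraction(overlap, circle)   # P(A|B): overlap / circle

# Bayes' theorem holds exactly for the counted areas.
assert p_a_given_b == p_b_given_a * p_a / p_b
print(p_a_given_b)
```

Exact fractions are used instead of floats so that the equality check is not affected by rounding.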
Read more: Bayes’ Theorem for Data Science
A. The joint event A and B (P(A ∧ B)) – the foundation of Bayes’ formula.
A. It’s the overlap area divided by the total circle (B) area.
A. Intersection is commutative – order doesn’t matter.
A. It gets complex with 3+ events, but mosaic plots or tree diagrams work well.
A. Visuals build stronger intuition and help avoid misinterpreting conditional probabilities.