Exploring The Different Types Of Probability Distribution Function!

Harika Bonthu 19 Jan, 2024 • 9 min read
Probability distribution function

Introduction:

In this article, we’ll look into Probability Distribution Functions (PDFs), Distribution Functions and their different types, and how they’re used. We’ll break down the formulas, understand their characteristics, and make it easy to grasp how they help with data. This guide is here to simplify probability and statistics for better decision-making.

What is Probability Distribution Function?

A Probability Distribution Function (PDF) is a mathematical way of showing how likely different outcomes are in a random event. It gives probabilities to each possible result, and when you add up all the probabilities, the total is always 1. The PDF helps us understand the chances of different outcomes in a random experiment.

Probability distribution function formula?

The Probability Distribution Function (PDF) is often denoted as f(x), and its formula varies based on the specific probability distribution. Here are a few examples for commonly used distributions:

Continuous Uniform Distribution:

f(x)=b−a1​for a≤x≤b

Normal Distribution (Gaussian Distribution):

f(x)=σ2π​1​⋅e−2σ2(x−μ)2​

Exponential Distribution:

f(x)=λe−λxfor x≥0

Binomial Distribution:

f(x)=(xn​)â‹…pxâ‹…(1−p)n−xfor x=0,1,2,…,n

Here,f(x) represents the probability density function,x is the variable, and the other symbols have specific meanings depending on the distribution. Note that these are just a few examples, and different probability distributions have different formulas.

What is a distribution function?

In statistical terms, a distribution function is a mathematical expression that describes the probability of different possible outcomes for an experiment.

It is denoted as Variable ~ Type (Characteristics)

Let us say we are running an experiment of tossing a fair coin. The possible events are Heads, Tails. And for instance, if we use X to denote the events, the probability distribution of X would take the value 0.5 for X=heads, and 0.5 for X=tails.

Data Types

At a higher level, we have Qualitative and Quantitative data. And in Quantitative data, we have Continuous and Discrete data types.

Continuous data is measured and can take any number of values in a given finite or infinite range. It can be represented in decimal format. And the random variable that holds continuous values is called the Continuous random variable.

Examples: A person’s height, Time, distance, etc.

Discrete data is counted and can take only a limited number of values. It makes no sense when written in decimal format. And the random variable that holds discrete data is called the Discrete random variable.

Example: The number of students in a class, number of workers in a company, etc.

Types of distribution functions:

Based on the types of data we deal with, we have two types of distribution functions.

For discrete data, we have discrete distributions; and for continuous data, we have continuous distributions.

Discrete distributions 

Continuous distributions

Uniform distributionNormal distribution
Binomial distributionStandard Normal distribution
Bernoulli distributionStudent’s T distribution
Poisson distributionChi-squared  distribution

Before deep-diving into the types of distributions, it is important to revise the fundamental concepts like Probability Density Function (PDF), Probability Mass Function (PMF), and Cumulative Density Function (CDF).

Probability Density Function (PDF): 

It is a statistical term that describes the probability distribution of a continuous random variable. The probability associate with a single value is always Zero. Below is the formula for PDF.

probability distribution function

The formula for PDF. Source

Probability Mass Function (PMF)

It is a statistical term that describes the probability distribution of a discrete random variable.

probability mass function
The formula for PMF. Source 

Cumulative Distribution Function (CDF)

It is another method to describe the distribution of a random variable (either continuous or discrete).

cumulative distribution function
The formula for CDF. Source

Discrete distributions

Let us start with the easiest one – Uniform distribution.

Discrete Uniform distribution (U)

It is denoted as X ~ U (a, b). And is read as X is a discrete random variable that follows uniform distribution ranging from a to b.

Uniform distribution is when all the possible events are equally likely. For example, consider an experiment of rolling a dice. We have six possible events X = {1, 2, 3, 4, 5, 6} each having a probability of P(X) = 1/6.

The PMF graph of the above experiment is:

Probability Mass Function,

The formula for PMF, CDF of Uniform distribution function are:

fromula of pmf, cmf

The Mean and Variance of Uniform distribution are:

Mean = (a+b)/2

Variance = (n2-1)/12

Binomial distribution (B):

It is denoted as X ~ B(n, p). And is read as X is a discrete random variable that follows Binomial distribution with parameters n, p.
Where n is the no. of trials, and p is the success probability for each trial.

Binomial distribution is a discrete probability distribution of the number of successes in ‘n’ independent experiments sequence. The two outcomes of a Binomial trial could be Success/Failure, Pass/Fail/, Win/Lose, etc.

Generally, the outcome success is denoted as 1, and the probability associated with it is p.

And Failure is denoted as 0, and the probability associate with it is q = 1-p.

The formula for PMF, CDF of Binomial distribution are:

binomial distribution pMF and CMF

Source

K in the above formula is the number of successes.

The mean and variance of a binomial distribution are given as:

Mean = np

Variance = npq

Now consider, we ran a Binomial experiment 10 times, and the probability of success = 0.25. Below are how PMF, CDF looks.

Binomial distribution | Probability Distribution Function

Image by Author

PMF, CDF for a Binomial experiment with Probability of success = Probability of failure look like below.

Probability Distribution Function |Binomial

Bernoulli distribution (Bern): 

It is denoted as X ~ Bern(p). And is read as X is a discrete random variable that follows Bernoulli distribution with parameter p.
Where p is the probability of the success.

Bernoulli can be represented as a Binomial experiment with a single trial.

X ~ Bern(p) —-> X ~ B(1, p)

The formula for PMF, CDF of Bernoulli distribution is:

The Mean and Variance of Bernoulli distribution are given as:

Mean = p

Variance = p(1-p) = pq

Example: Consider an example of tossing a fair. The two possible outcomes are Heads, Tails. The probability (p) associated with each of them is 1/2.

If we take an unfair coin, the probability associated with each of them need not be 1/2. Heads can have a probability of p = 0.8, then the probability of tail q = 1-p = 1-0.8 = 0.2

example | Probability Distribution Function

Bernoulli’s event suggests which outcome can be expected for a single trial. Whereas, a Binomial event suggests the no. of times a specific outcome can be expected. 

Poisson Distribution (Po):

It is denoted as X ~ Po(λ). And is read as X is a discrete random variable that follows Poisson Distribution with parameter Î».
Where Î» is the expected rate of occurrences.

Poisson Distribution is a discrete probability distribution function that expresses the probability of a given number of events occurring in a fixed time interval.

Examples: 

  • The number of diners at a restaurant on a given day.
  • Calls per hour at a call centre.

The formula for PMF, CDF of poison distribution are:

The Mean and Variance of Poisson distribution are given as:

Mean = Variance = Î»

A Poisson distribution with Î» = 5 look like below

example poison distribution

Continuous Distributions

Normal or Gaussian Distribution (N)

It is denoted as X ~ N (μ, Ïƒ2). And is read as X is a continuous random variable that follows a Normal distribution with parameters Î¼, Ïƒ2.
Where Î¼ is the mean, and Ïƒis the variance. Mean, Variance together talks about shape statistics.

A normal distribution is a continuous distribution that describes the probability of a continuous random variable that takes real values.

Examples: Heights of people, exam scores of students, IQ Scores, etc follows Normal distribution.

Properties of Normal distribution:

  • The random variable takes values from -∞ to +∞
  • The probability associate with any single value is Zero.
  • looks like a bell curve and is symmetric about x=μ. 50% of data lies on the left-hand side and 50% of the data lies on the right-hand side.
  • The area under the curve (AUC) = 1
  • All the measures of central tendency coincide i.e., mean = median = mode

A normal distribution with different means, standard deviations look like below:

normal distribution | Probability Distribution Function

Source

Normal distribution follows the 68-95-99.7 rule. This rule is also known as the empirical rule. According to it, 68% of data lies in the first standard deviation range, 95% of data lies in the second standard deviation range, and 99.7% of data lies in the third standard deviation range.

Probability Density. Probability distribution function

The formula for PDF, CDF of the normal distribution are:

The Mean and Variance of a Normal distribution are given as:

Mean = Î¼

Variance = Ïƒ2

Let’s assume we have a height distribution with mean = 25, standard deviation = 2.48. Below is how the graph looks like.

example normal disribution

Standard Normal Distribution or SND:

It is denoted as Z ~ N(0, 1). And is read as X is a continuous random variable that follows  Normal distribution with mean 0 and variance 1.

It is a transformation of Normal distribution in such a way that Mean = 0, and standard deviation 1.

Transformation is a way in which we alter every element of distribution to get a new distribution with similar characteristics.

All the properties of a Normal distribution will be satisfied by a Standard Normal distribution.

SND | Probability Distribution Function

Source

And in addition, there exists a table that summarizes the most commonly used values of a CDF of s Standard Normal Distribution. This table is known as a Z-score table.

The formula for standardisation is Z = (X-μ)/σ

The formula for PDF, CDF of Standard Normal distribution are given as:

formula PDF,CDF distribution function

The Mean and Variance of Standard Normal distribution are:

Mean = 0

Variance = 1

Normal distribution with mean 0 and variance 1 (SND) looks like below:

normal distribution with mean 0 and sd 1

Student’s T distribution or t-distribution (t)

It is denoted as X ~ t(k). And is read as X is a continuous random variable that follows Student’s T distribution with parameter k.
where k is the degrees of freedom. If the sample size is n, then k = n-1.

Student’s T distribution is a small sample size approximation of a normal distribution. As the degrees of freedom increase, t distribution tends to become Standard Normal distribution.

T distribution, probability distribution function

Source

The formula for PDF, CDF of t-distribution are:

(v in above formulae is degrees of freedom)

t-distribution can be used in Hypothesis testing (to test if there is any significant difference between two sample means), calculating confidence intervals with population standard deviation is unknown.

Like standard normal distribution, t-distribution also has a table of its own. This table is known as the t-table.

The mean and variance of Student’s T distribution are:

Mean = 0

variance = k/(k-2)

A student’s t-distribution with degrees of freedom = 25 looks like below:

example tdistribution | Probability Distribution Function

Chi-Square distribution

It is denoted as X~χ2(k). And is read as X is a continuous random variable that follows Chi-Square distribution with k degrees of freedom.

It is used in Hypothesis testing, computing confidence intervals, and for the goodness of fit.

It is a transformation of t-distribution. Finding the t-distribution to the power of 2 gives Chi-Square distribution and finding the square root of Chi-Square of distribution gives us t-distribution.

Chi-Square distribution has a chi-square table.

The formula for PDF, CDF of Chi-square distribution are:

chi square distribution

Source

The mean and variance of Chi-square distribution are:

Mean = k

Variance = 2k

A Chi-square distribution with degrees of freedom = 5 looks like below.

example chi sqaured , probabiltiy distribution function

Conclusion

In summary, Probability Distribution Functions (PDFs) and their formulas and Distribution Functions are key for understanding different data types. Explore Discrete Uniform, Binomial, Bernoulli, Poisson, and Continuous distributions to gain insights for effective data analysis and decision-making in various scenarios.

I hope this article is informative. Feel free to share it with your study buddies.

Frequently Asked Questions

Q1.What is normal CDF and PDF?

Normal CDF and PDF:
CDF is like climbing stairs, showing the chance of being below a point on a bell curve.
PDF is the likelihood at a specific point, showing how probable a value is close to that point.

Q2. What is the difference between probability distribution and function?

Probability Distribution vs. Function:
Distribution is a list of probabilities for outcomes.
Function is like a rule, connecting each outcome to its probability.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.

Harika Bonthu 19 Jan 2024

Frequently Asked Questions

Lorem ipsum dolor sit amet, consectetur adipiscing elit,

Responses From Readers

Clear

Machine Learning
Become a full stack data scientist

  • [tta_listen_btn class="listen"]