Complete Guide to Chebyshev’s Inequality and WLLN in Statistics for Data Science

Aashi Goyal 08 Jun, 2021


This article was published as a part of the Data Science Blogathon

Introduction

Chebyshev’s inequality and the weak law of large numbers (WLLN) are fundamental concepts in probability and statistics, used heavily by statisticians, machine learning engineers, and data scientists in predictive analysis.

In this article, we discuss these concepts in detail, along with their applications.

Chebyshev’s Inequality

In probability theory, Chebyshev’s inequality (also known as the Bienaymé-Chebyshev inequality) guarantees that, for a wide class of probability distributions, no more than a certain fraction of values can lie more than a certain distance from the mean.

Specifically, no more than 1/k² of the distribution’s values can be more than k standard deviations away from the mean (or equivalently, at least 1-1/k² of the distribution’s values are within k standard deviations of the mean).

Now, let’s formally define Chebyshev’s inequality:

Let X be a random variable with mean μ and finite variance σ². Then, for any real number k > 0,

P(| X-μ | < kσ) ≥ 1-1/k²

OR

P(| X-μ | ≥ kσ) ≤ 1/k²

This result, often called Chebyshev’s theorem, bounds how much of a distribution can lie outside a given number of standard deviations from the mean.

This inequality has great utility because it can be applied to any probability distribution in which the mean and variance are defined.
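As a quick sanity check, the bound can be compared against simulated data. The sketch below is only an illustration: the three distributions, the seed, and the helper name tail_fraction are arbitrary choices made here, not part of the original discussion.

```python
import random
import statistics

def tail_fraction(samples, k):
    """Fraction of samples at least k standard deviations from the sample mean."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    return sum(abs(x - mu) >= k * sigma for x in samples) / len(samples)

rng = random.Random(0)  # fixed seed, chosen arbitrarily for repeatability
n = 20_000
samples_by_name = {
    "uniform(0, 1)": [rng.random() for _ in range(n)],
    "exponential(rate=1)": [rng.expovariate(1.0) for _ in range(n)],
    "normal(0, 1)": [rng.gauss(0.0, 1.0) for _ in range(n)],
}

for name, samples in samples_by_name.items():
    for k in (1.5, 2, 3):
        observed = tail_fraction(samples, k)
        # Chebyshev: the observed tail fraction can never exceed 1/k^2
        print(f"{name:22s} k={k}: observed tail {observed:.4f} <= bound {1 / k**2:.4f}")
```

The observed tail fractions are usually far below 1/k². That is the price of a bound that assumes nothing about the distribution beyond its mean and variance.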

For example, it can be used to prove the weak law of large numbers, which we will discuss later in this article.

Applications of Chebyshev’s Inequality

Numerical Example-1:

Suppose the number of products made in a factory during a week is a random variable with a mean of 50. If the variance of a week’s production is 25, what can be said about the probability that the production will be between 40 and 60?

Solution:

Step-1: Mean(μ) = 50, Variance(σ²) = 25 ⇒ σ= 5

Step-2: Required probability: P(40 < X < 60)

P(40 < X < 60) = P(-10 < X-50 < 10) = P(| X-50 | < 10)

Step-3: Now, by using the Chebyshev’s theorem, we have P(| X-μ | < kσ) ≥ 1-1/k²

Find k by comparing with the general form: kσ = 10 ⇒ k(5) = 10 ⇒ k = 2

Step-4: Apply the Chebyshev’s Theorem to find the required probability:

P(| X-50 | < 10) ≥ 1-1/k² = 1-(1/4) = 3/4 = 0.75

Step-5: Present the results

Therefore, the lower bound of the probability that the productivity lies between 40 and 60 is equal to 0.75.
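The steps above can be wrapped in a small helper. This is a minimal sketch; the function name chebyshev_lower_bound and its signature are made up here for illustration, and it only handles intervals that are symmetric about the mean.

```python
import math

def chebyshev_lower_bound(mean, variance, low, high):
    """Lower bound on P(low < X < high) for an interval symmetric about the mean."""
    half_width = high - mean
    if not math.isclose(mean - low, half_width):
        raise ValueError("interval must be symmetric about the mean")
    k = half_width / math.sqrt(variance)  # number of standard deviations
    # For k <= 1 the formula gives a non-positive value, so clip at 0
    return max(0.0, 1.0 - 1.0 / k**2)

bound = chebyshev_lower_bound(mean=50, variance=25, low=40, high=60)
print(bound)  # 0.75
```

For k ≤ 1 the inequality says nothing useful, which is why the result is clipped at 0.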

Numerical Example-2:

A symmetric (fair) die is thrown 600 times. Find a lower bound for the probability of getting between 80 and 120 sixes.

Solution:

Step-1: A symmetric die is thrown 600 times, so the number of sixes X follows a Binomial distribution with n = 600 and p = 1/6.

Step-2: Using the binomial distribution, calculate the mean and variance of X with the formulas below:

Mean = np = 600*(1/6) = 100

Variance = npq = 600*(1/6)*(5/6) = 500/6

Step-3: Required probability: P(80 < X < 120)

P(80 < X < 120) = P(-20 < X-100 < 20) = P(| X-100 | < 20)

Step-4: Now, by using the Chebyshev’s theorem, we have P(| X-μ | < kσ) ≥1-1/k²

Find k by comparing with the general form: kσ = 20 ⇒ k(√(500/6)) = 20 ⇒ k = 20√(6/500), so k² = 2400/500

Step-5: Apply Chebyshev’s Theorem to find the required probability:

P(| X-100 | < 20) ≥ 1-1/k² = 1-500/2400 = 19/24 ≈ 0.79

Step-6: Present the results

Therefore, the lower bound of the probability of getting between 80 and 120 sixes is approximately 0.79.
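Because the distribution here is known, the Chebyshev bound can be compared with the exact binomial probability. The sketch below uses exact fractions from Python's standard library; the variable names are illustrative only.

```python
from fractions import Fraction
from math import comb

n, p = 600, Fraction(1, 6)
mean = n * p                 # 100
variance = n * p * (1 - p)   # 500/6
half_width = 20              # the event is |X - 100| < 20

# Chebyshev: P(|X - mean| < k*sigma) >= 1 - 1/k^2, where 1/k^2 = variance / half_width^2
chebyshev_bound = 1 - variance / half_width**2

# Exact probability of 81..119 sixes, matching the strict inequality 80 < X < 120
exact = sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(81, 120))

print("Chebyshev lower bound:", chebyshev_bound, "=", float(chebyshev_bound))
print("Exact binomial probability:", float(exact))
```

The exact probability is noticeably larger than the bound, which is expected: Chebyshev’s inequality has to hold for every distribution with this mean and variance, not just the binomial.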

Convergence in Probability

A sequence of random variables X₁, X₂, …, X_n, … is said to converge in probability to α if, for any ε > 0,

lim_{n→∞} P(| X_n - α | < ε) = 1

OR

lim_{n→∞} P(| X_n - α | ≥ ε) = 0

We then write X_n → α in probability as n → ∞.

Chebyshev’s theorem used in WLLN

Statement:

If X₁, X₂, …, X_n, … is a sequence of random variables such that the mean μ_n and standard deviation σ_n of X_n exist for all n, and if σ_n → 0 as n → ∞, then X_n - μ_n → 0 in probability as n → ∞.

Proof:

By Chebyshev’s inequality with kσ_n = ε, for any ε > 0,

P(| X_n-μ_n | ≥ ε) ≤ σ_n²/ε² → 0 as n → ∞.

Hence, X_n-μ_n → 0 in probability as n → ∞, provided σ_n → 0 as n → ∞.
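To see the statement in action, here is a small simulation. It assumes a hypothetical sequence X_n ~ Normal(μ_n = n, σ_n² = 1/n), chosen only because σ_n → 0; the seed and the value of ε are arbitrary as well.

```python
import random

def tail_frequency(mu_n, sigma_n, eps, trials, rng):
    """Empirical estimate of P(|X_n - mu_n| >= eps) for X_n ~ Normal(mu_n, sigma_n^2)."""
    hits = sum(abs(rng.gauss(mu_n, sigma_n) - mu_n) >= eps for _ in range(trials))
    return hits / trials

rng = random.Random(1)  # arbitrary fixed seed
eps = 0.5
results = []
for n in (1, 10, 100, 1000):
    mu_n, sigma_n = float(n), n ** -0.5        # assumed sequence with sigma_n -> 0
    freq = tail_frequency(mu_n, sigma_n, eps, trials=20_000, rng=rng)
    bound = min(1.0, sigma_n**2 / eps**2)      # Chebyshev bound, capped at 1
    results.append((n, freq, bound))
    print(f"n={n:5d}  empirical tail={freq:.4f}  Chebyshev bound={bound:.4f}")
```

Both columns shrink toward 0 as n grows, and the empirical frequency stays below the Chebyshev bound, which is what the proof predicts.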

Weak law of large numbers (WLLN)

Let X₁, X₂, …, X_n be a sequence of random variables with respective means μ₁, μ₂, …, μ_n, and let B_n = Var(X₁+X₂+…+X_n) < ∞. Then,

P(| {(X₁+X₂+…+X_n)/n} - {(μ₁+μ₂+…+μ_n)/n} | < ε) ≥ 1-j

for all n > n₀, where ε and j are arbitrarily small positive numbers, provided

lim_{n→∞} (B_n/n²) = 0

Remarks:

For the WLLN to hold, consider the following conditions:

1. E(X_i) exists for all i
2. B_n = Var(X₁+X₂+…+X_n) exists, and
3. lim_{n→∞} (B_n/n²) = 0

Condition-1 is necessary (i.e., without it, the law itself cannot be stated).

Conditions 2 and 3 are not necessary.

Condition-3 is a sufficient condition.

WLLN for IID Random Variables

If the variables X₁, X₂, …, X_n are independent and identically distributed (IID) with E(X_i) = μ and Var(X_i) = σ² for all i, then

B_n = Var(X₁+X₂+…+X_n) = Var(X₁)+Var(X₂)+…+Var(X_n) = nσ²

Hence, B_n/n² = nσ²/n² = σ²/n → 0 as n → ∞

Thus, the WLLN holds for an IID sequence with finite variance, and we get

x̄_n → μ in probability, i.e., x̄_n converges in probability to μ.
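A simulation with fair die rolls (μ = 3.5, σ² = 35/12) illustrates this. The sketch below estimates P(| x̄_n - μ | ≥ ε) and compares it with the Chebyshev bound σ²/(nε²); the choice of a die, ε = 0.25, the seed, and the helper name are all assumptions made for this illustration.

```python
import random

MU = 3.5           # mean of a fair six-sided die
SIGMA2 = 35 / 12   # variance of a fair six-sided die

def deviation_frequency(n, eps, trials, rng):
    """Empirical estimate of P(|sample mean of n rolls - MU| >= eps)."""
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.randint(1, 6) for _ in range(n)) / n
        hits += abs(sample_mean - MU) >= eps
    return hits / trials

rng = random.Random(7)  # arbitrary fixed seed
eps = 0.25
results = []
for n in (10, 100, 1000):
    freq = deviation_frequency(n, eps, trials=1000, rng=rng)
    bound = min(1.0, SIGMA2 / (n * eps**2))   # from B_n/n^2 = sigma^2/n
    results.append((n, freq, bound))
    print(f"n={n:5d}  empirical={freq:.3f}  Chebyshev bound={bound:.3f}")
```

As n grows, large deviations of the sample mean become rarer, and the bound σ²/(nε²) shrinks at rate 1/n.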

Conditions for the WLLN to Hold

Case-1: When the random variables in the sequence {X_n} are independent, the WLLN holds if

E(X_i) exists for all i
B_n = Var(X₁+X₂+…+X_n) exists, and
lim_{n→∞} (B_n/n²) = 0

If at least one of these conditions is not met, we apply a further test, Markov’s theorem.

Case-2: When the sequence of random variables {X_n} is IID, the existence of E(X_i) is enough for the WLLN to hold.

This result is known as Khinchin’s theorem.

Now, let’s see what Markov’s theorem states:

Markov’s theorem: The WLLN holds if, for some δ > 0,

E(|X_i|^(1+δ)) exists and is bounded for all i

NOTE:

Markov’s theorem provides only a sufficient condition for the WLLN to hold. If E(|X_i|^(1+δ)) is unbounded for every δ > 0, the theorem simply does not apply; this alone does not prove that the WLLN fails for the sequence of random variables {X_n}.

Some Important Results

Result-1: If the variables are uniformly bounded, then the condition

lim_{n→∞} (B_n/n²) = 0

is necessary as well as sufficient for the WLLN to hold.

Result-2: The necessary and sufficient condition for the sequence {X_n} to satisfy the WLLN is:

lim_{n→∞} E( Y_n² / (1+Y_n²) ) = 0

where Y_n = (S_n - E(S_n))/n and S_n = X₁+X₂+…+X_n

Applications of WLLN

Numerical Example:

Let X_i assume the values i and -i with equal probabilities. Show that the law of large numbers cannot be applied to the independent variables X₁, X₂, ….

Solution:

Step-1: Calculate the mean and variance of the random variables

E(X_i) = Σ x P(X_i = x) = i(1/2) + (-i)(1/2) = 0

Var(X_i) = E(X_i²) - (E(X_i))² = [i²/2+(-i)²/2] - (0)² = i²

Step-2: Calculate the limit of B_n/n² as n tends to infinity

Since X₁, X₂, … are independent variables,

B_n = Var(X₁+X₂+…+X_n) = Var(X₁)+Var(X₂)+…+Var(X_n) = 1²+2²+3²+…+n²

= n(n+1)(2n+1)/6

Therefore, B_n/n² = (n+1)(2n+1)/(6n), which tends to ∞ as n → ∞

Step-3: Interpret the result using the WLLN conditions

The condition B_n/n² → 0 is only sufficient (the variables here are not uniformly bounded), so its failure alone does not tell us whether the WLLN holds or not.

Step-4: Apply a further test, Markov’s theorem.

E(|X_i|^(1+δ)) = i^(1+δ)/2 + |-i|^(1+δ)/2 = i^(1+δ)

which is unbounded in i for every δ > 0

Step-5: Present the final results

Hence, Markov’s condition is not satisfied either, so none of the tests above allows the WLLN to be applied to the sequence {X_i} of independent random variables.
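A simulation gives some intuition for how this sequence behaves. The sketch below is an illustration, not a proof: it computes B_n/n² exactly and uses Monte Carlo to estimate P(| S_n/n | < ε) and the quantity E(Y_n²/(1+Y_n²)) from Result-2. The helper names, seed, ε = 1, and trial count are arbitrary choices made here.

```python
import random

def b_n_over_n2(n):
    """Exact B_n / n^2, where B_n = 1^2 + 2^2 + ... + n^2 = n(n+1)(2n+1)/6."""
    return (n + 1) * (2 * n + 1) / (6 * n)

def simulate(n, eps, trials, rng):
    """Monte Carlo estimates of P(|Y_n| < eps) and E[Y_n^2 / (1 + Y_n^2)], Y_n = S_n/n."""
    inside = 0
    ratio_sum = 0.0
    for _ in range(trials):
        # X_i takes the value +i or -i with probability 1/2 each, so E(S_n) = 0
        s_n = sum(i if rng.random() < 0.5 else -i for i in range(1, n + 1))
        y = s_n / n
        inside += abs(y) < eps
        ratio_sum += y * y / (1 + y * y)
    return inside / trials, ratio_sum / trials

rng = random.Random(42)  # arbitrary fixed seed
eps = 1.0
rows = []
for n in (10, 100, 1000):
    p_inside, ratio = simulate(n, eps, trials=2000, rng=rng)
    rows.append((n, b_n_over_n2(n), p_inside, ratio))
    print(f"n={n:5d}  B_n/n^2={b_n_over_n2(n):8.2f}  "
          f"P(|Y_n|<{eps})~{p_inside:.3f}  E[Y^2/(1+Y^2)]~{ratio:.3f}")
```

In runs of this sketch, the estimated probability of the sample mean staying near 0 shrinks as n grows, and the Result-2 quantity moves away from 0 rather than toward it. Both are consistent with the WLLN not holding for this sequence.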

This completes today’s discussion!

Endnotes

Thanks for reading!

I hope you enjoyed the article and increased your knowledge about Chebyshev’s Inequality and Weak Law of Large Numbers in Probability and Statistics.

About the Author

Aashi Goyal

Currently, I am pursuing my Bachelor of Technology (B.Tech) in Electronics and Communication Engineering from Guru Jambheshwar University(GJU), Hisar. I am very enthusiastic about Statistics, Machine Learning and Deep Learning.

The media shown in this article are not owned by Analytics Vidhya and are used at the Author’s discretion.