# CDF vs PDF: What’s the Difference?

Nitika Sharma 31 Oct, 2023 • 6 min read

## Introduction

The Cumulative Distribution Function (CDF) and the Probability Density Function (PDF) are two essential ideas in probability theory that frequently confuse students. Understanding the behavior, features, and distributions of random variables depends critically on these functions. Knowing the differences between the PDF and the CDF is crucial for analyzing and interpreting the probabilities linked to continuous and discrete random variables.

This article discusses the definitions of the PDF and the CDF, their distinct roles, and how they interact. To clarify their application and highlight their significance in diverse statistical applications, we will also work through a solved example.

## What is Probability Density Function (PDF)?

The PDF is a crucial tool for understanding the probabilities associated with continuous random variables. It is a curve, often a smooth one, that describes how probability is distributed over the possible values. The PDF does not give the probability of any specific individual value; instead, it describes the relative likelihood of the random variable taking on values within a small interval around a particular point.

To understand the concept of a PDF, imagine a continuous quantity such as the height of adult males. The PDF shows how likely the various height ranges are. It might indicate, for instance, that heights between 5’9″ and 5’10″ are more common than heights in an equally wide range far from the average.

The area under the PDF curve over a range represents the probability that the random variable will fall inside that range. The probability of any single exact value is zero, because the area over a single point has no width. What the height of the PDF at a point x conveys is density: the probability of landing in a very small interval around x is approximately f(x) multiplied by the width of that interval.
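To make the area interpretation concrete, here is a minimal Python sketch. It assumes, purely for illustration, that adult male height is normally distributed with a mean of 70 inches and a standard deviation of 3 inches; these parameters are hypothetical and not taken from any dataset. The helper `area_under_pdf` is an illustrative name that approximates the area with a midpoint Riemann sum.

```python
from statistics import NormalDist

# Hypothetical model: heights in inches, mean 70, standard deviation 3.
heights = NormalDist(mu=70, sigma=3)


def area_under_pdf(dist, lo, hi, steps=10_000):
    """Approximate the integral of dist.pdf over [lo, hi] with a midpoint sum."""
    width = (hi - lo) / steps
    return sum(dist.pdf(lo + (i + 0.5) * width) for i in range(steps)) * width


# Probability of a height between 5'9" (69 in) and 5'10" (70 in).
p_interval = area_under_pdf(heights, 69, 70)

# Shrinking the interval around 69 in drives the probability towards zero,
# even though the density at 69 in stays the same.
p_tiny = area_under_pdf(heights, 69, 69.000001)

print(f"P(69 <= X <= 70)        ~ {p_interval:.4f}")
print(f"density at 69 in        = {heights.pdf(69):.4f}")
print(f"P(69 <= X <= 69.000001) ~ {p_tiny:.2e}")
```

The density at 69 inches is a fixed positive number, yet the probability of an interval around it shrinks in proportion to the interval's width, which is why the probability of one exact height is zero.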

## What is Cumulative Distribution Function (CDF)?

The CDF is a complementary concept to the PDF and provides a cumulative view of the probabilities associated with a random variable. It gives the probability that the random variable is less than or equal to a particular value. Unlike the PDF, the CDF is defined for both discrete and continuous random variables: it is a step function that jumps at specific values in the discrete case and a continuous curve in the continuous case.

The CDF approaches 0 as x decreases toward negative infinity, never decreases, and approaches 1 as x grows. For discrete random variables, the CDF increases in steps whose sizes equal the probabilities of each possible outcome. For continuous random variables, it increases smoothly, reflecting the probability accumulated across intervals.

Using the male heights example from before, the CDF gives the probability of finding a male whose height is less than or equal to a certain value, such as 5’9″. By presenting cumulative probability, the CDF lets us answer questions like “What percentage of adult males are 5’9″ or shorter?”
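A question of this kind can be answered directly from a CDF. The sketch below uses Python's standard-library `statistics.NormalDist` and assumes, only for illustration, that heights are normally distributed with a mean of 70 inches and a standard deviation of 3 inches; the variable names are hypothetical.

```python
from statistics import NormalDist

# Hypothetical model: heights in inches, mean 70, standard deviation 3.
heights = NormalDist(mu=70, sigma=3)

# P(X <= 69): share of the modelled population at or below 5'9".
p_shorter = heights.cdf(69)

# P(69 < X <= 72): a difference of two CDF values gives an interval probability.
p_between = heights.cdf(72) - heights.cdf(69)

# P(X > 72): the complement of the CDF.
p_taller = 1 - heights.cdf(72)

# The inverse CDF answers the reverse question: below which height do 50% fall?
median_height = heights.inv_cdf(0.5)

print(f"P(X <= 69)     = {p_shorter:.4f}")
print(f"P(69 < X <= 72) = {p_between:.4f}")
print(f"P(X > 72)      = {p_taller:.4f}")
print(f"median height  = {median_height:.1f} in")
```

The three probabilities cover every possible height exactly once, so they sum to 1.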

## PDF vs CDF Understanding with Example

Understanding how the Probability Density Function and the Cumulative Distribution Function interact is essential for comprehending how random variables behave and how their distributions work. The two functions provide complementary insights into the probabilities of the random variable’s values.

A fair six-sided die, where each face has probability 1/6, makes a convenient running example. Let’s use it to explore the connection between the two functions in more depth.

### Calculating the CDF from the PDF

To find the CDF from the PDF, we integrate the PDF up to a given point. For a continuous random variable, the CDF at a point x, written F(x), equals the area under the PDF curve up to that point. Mathematically:

`F(x) = ∫[a, x] f(t) dt`

Here, x is the point on the distribution for which we want the cumulative probability, and a is the lower limit of the variable’s range (negative infinity if the range is unbounded below).

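A fair die is a discrete random variable, so its exact CDF comes from summing probabilities rather than integrating, as the next section describes. To illustrate the integral, consider the die’s continuous analogue: a random variable spread uniformly over the interval [1, 7], whose PDF is f(t) = 1/6 on that interval. Let’s calculate the CDF at x = 3:

```
F(3) = ∫[1, 3] f(t) dt

F(3) = ∫[1, 3] (1/6) dt

F(3) = [t/6] |[1, 3]

F(3) = (3/6) - (1/6)

F(3) = 2/6 = 1/3
```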

Similarly, we can calculate the CDF for other values of x using the same approach.
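The calculation above can be sketched in Python. The sketch assumes the density is the constant 1/6 on the interval [1, 7] and zero elsewhere, a continuous stand-in for the die, since that is the setting in which the integral from 1 to x is a valid CDF. The names `LOW`, `HIGH`, `DENSITY`, and `uniform_cdf` are illustrative.

```python
from fractions import Fraction

# Assumed support of the continuous stand-in for the die.
LOW, HIGH = 1, 7
DENSITY = Fraction(1, HIGH - LOW)  # constant density of 1/6 on [LOW, HIGH]


def uniform_cdf(x):
    """F(x): integral of the constant density from LOW to x, clamped to [0, 1]."""
    if x <= LOW:
        return Fraction(0)
    if x >= HIGH:
        return Fraction(1)
    # The integral of a constant c from LOW to x is c * (x - LOW).
    return DENSITY * (Fraction(x) - LOW)


for x in (1, 3, 5, 7):
    print(f"F({x}) = {uniform_cdf(x)}")
```

Exact fractions are used instead of floats so that the results match the hand calculation term for term.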

### Relating PDF and CDF for Discrete Random Variables

The relationship between the PMF (Probability Mass Function) and the CDF is more apparent for discrete random variables. The PMF provides the probabilities for each specific value of the discrete random variable while the CDF accumulates these probabilities.

The CDF at a particular value x is the sum of the probabilities of all possible values less than or equal to x. Mathematically, for discrete random variables:

`F(x) = P(X ≤ x) = Σ[all values ≤ x] P(X = value)`

By adding up the probabilities of all values up to x, we obtain the cumulative probability up to that point, and this process aligns with the CDF concept.
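For the fair six-sided die, this summation can be sketched in a few lines of Python. The names `pmf`, `cdf`, and `die_cdf` are illustrative, and exact fractions are used so the values are easy to check by hand.

```python
from fractions import Fraction
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]

# PMF of a fair die: every face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in faces}

# A running total of the PMF gives the CDF at each face.
cdf = dict(zip(faces, accumulate(pmf[face] for face in faces)))


def die_cdf(x):
    """P(X <= x) for any real x: sum the PMF over the faces not exceeding x."""
    return sum((p for face, p in pmf.items() if face <= x), Fraction(0))


print(cdf[3])        # 1/2
print(die_cdf(3.5))  # 1/2, the CDF stays flat between faces
print(die_cdf(0))    # 0, no face is less than or equal to 0
```

Because the CDF only changes at the six faces, it is a step function: `die_cdf(3)` and `die_cdf(3.5)` return the same value.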

## Understanding Difference Between CDF vs PDF

The CDF gives the probability that a random variable is less than or equal to a specific value ‘x,’ while the PDF gives the probability density at ‘x’: not a probability in itself, but a rate that yields a probability when integrated over an interval.

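The key properties of the PDF and CDF differ as follows: the PDF is non-negative, has a total area of 1, and yields probabilities only through the area over an interval; the CDF never decreases, runs from 0 to 1, and gives P(X ≤ x) directly.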
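One way to see that a density is not a probability is to note that a PDF value can exceed 1, while a CDF value never can. The following minimal sketch uses a hypothetical normal distribution with mean 0 and standard deviation 0.1, chosen only because it is narrow enough to make the effect visible.

```python
from statistics import NormalDist

# Hypothetical narrow distribution: mean 0, standard deviation 0.1.
narrow = NormalDist(mu=0, sigma=0.1)

# Height of the PDF at the peak: a density, which may be larger than 1.
density_at_peak = narrow.pdf(0)

# CDF at the same point: a genuine probability, P(X <= 0).
cdf_at_peak = narrow.cdf(0)

# Probability of a small interval around the peak, from two CDF values.
p_near_peak = narrow.cdf(0.05) - narrow.cdf(-0.05)

print(f"pdf(0)             = {density_at_peak:.3f}")
print(f"cdf(0)             = {cdf_at_peak:.3f}")
print(f"P(-0.05 < X <= 0.05) = {p_near_peak:.3f}")
```

The density at the peak is well above 1, yet every probability derived from it, whether read from the CDF or computed over an interval, stays between 0 and 1.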

## Conclusion

To sum up, understanding the distinction between CDF and PDF is essential in probability and statistics. Both play vital roles in analyzing random variables and their distributions. If you want to delve deeper into data science and enhance your statistical skills, consider enrolling in the Analytics Vidhya Blackbelt Program. This comprehensive program will equip you with the knowledge and expertise to excel in the world of data science. Don’t miss this opportunity to unlock your full potential and propel your career to new heights with Analytics Vidhya’s Blackbelt Program. Start your data science journey today!

Q1. What is the relationship between PDF and CDF?

A. The PDF and CDF are interrelated concepts in probability theory. The PDF gives the probability density of a continuous random variable at a specific value, while the CDF gives the cumulative probability of the random variable being less than or equal to a given value. The CDF is the integral of the PDF, and the PDF is the derivative of the CDF wherever that derivative exists.

Q2. What are CDF and PDF functions?

A. The CDF and PDF are important in probability and statistics for describing random variable behavior. The CDF shows the cumulative probability up to a specific value “x” (denoted as “F(x)”), while the PDF displays the probability distribution of a continuous random variable (represented as “f(x)”).

Q3. What is the difference between PDF and PMF?

A. PMF is for discrete random variables, giving probabilities for specific values. On the other hand, the PDF is for continuous random variables, showing the probability density over a range of values.

Q4. What is the difference between the probability distribution function and the probability density function?

A. The two terms are often used interchangeably for the function describing the probability distribution of a continuous random variable. However, “probability distribution function” is ambiguous: some authors use it for the cumulative distribution function, so “probability density function” is the safer term when the density is meant.
