Probability Distribution
  • Last Updated : 11 Feb, 2021

The probability distribution of a random variable shows how probability is distributed over the different values the random variable can take. When the values of the random variable are plotted on a graph against their probabilities, the probabilities trace out a shape. A probability distribution has several measurable properties, for example the expected value and the variance.

The outcome of a random variable is uncertain, and an observed outcome is known as a realization. A random variable is a function that maps the sample space into a space of real numbers, known as the state space. Random variables can be discrete or continuous.

Random Variables

A random variable is an important concept in probability and statistics. We need to understand it both intuitively and mathematically to gain a deeper understanding of the probability distributions that surround us in everyday life.

It is a function that associates a real number with each outcome of an experiment.

Sometimes we are interested not only in the probabilities of the events in an experiment but also in some number associated with the experiment. This is where random variables are needed.

Let’s take coin flips as an example. We’ll use H to mean ‘heads’ and T to mean ‘tails’.

Suppose we flip our coin 5 times and want to answer some questions.

1. What is the probability of getting exactly 3 heads?

2. What is the probability of getting less than 4 heads?

3. What is the probability of getting more than 1 head?

We can then write the probabilities we are looking for as:

· P(getting exactly 3 heads when we flip a coin 5 times)

· P(getting less than 4 heads when we flip a coin 5 times)

· P(getting more than 1 head when we flip a coin 5 times)

In a different scenario, suppose we are tossing two dice, and we are interested in knowing the probability of getting two numbers such that their sum is 6. 
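Both scenarios above can be checked by brute-force counting. The following is a minimal sketch, assuming a fair coin and fair dice so that every outcome is equally likely (the text does not state this explicitly for these examples); the helper names `prob` and `heads` are illustrative only.

```python
from fractions import Fraction
from itertools import product

def prob(event, outcomes):
    """Probability of an event as favourable outcomes / all outcomes.

    Assumes every outcome in `outcomes` is equally likely.
    """
    favourable = sum(1 for o in outcomes if event(o))
    return Fraction(favourable, len(outcomes))

def heads(seq):
    """Number of heads in a sequence of flips such as ('H', 'T', 'H')."""
    return seq.count("H")

# All 2**5 = 32 sequences of five coin flips.
flips = list(product("HT", repeat=5))

p_exactly_3 = prob(lambda s: heads(s) == 3, flips)
p_less_than_4 = prob(lambda s: heads(s) < 4, flips)
p_more_than_1 = prob(lambda s: heads(s) > 1, flips)

# All 6 * 6 = 36 ordered results of tossing two dice.
dice = list(product(range(1, 7), repeat=2))
p_sum_6 = prob(lambda d: sum(d) == 6, dice)

print(p_exactly_3, p_less_than_4, p_more_than_1, p_sum_6)
```

Counting like this only works for small experiments; the random variables and formulas introduced next avoid listing every outcome.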

In both of these cases, random variables come to our rescue. First, let’s define mathematically what a random variable is.


A random variable is an assignment of a value (a number) to every possible outcome. In more mathematical terms, it is a function from the sample space Ω to the real numbers. We can choose our random variable according to our needs.

A random variable is a real-valued function whose domain is the sample space of a random experiment.

To make this more intuitive, let us consider the experiment of tossing a coin two times in succession.

The sample space of the experiment is S = {HH, HT, TH, TT}. We can define a random variable that counts heads or tails according to our needs. Let X denote the number of heads obtained. For each outcome, its value is given below:

X(HH) = 2, X(HT) = 1, X(TH) = 1, X(TT) = 0.

More than one random variable can be defined in the same sample space. For example, let Y denote the number of heads minus the number of tails for each outcome of the above sample space S.

Y(HH) = 2, Y(HT) = 0, Y(TH) = 0, Y(TT) = –2

Thus, X and Y are two different random variables defined on the same sample space.
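To make the "function on the sample space" idea concrete, here is a small sketch that writes X and Y as ordinary Python functions and evaluates them on every outcome of the two-toss experiment. The string encoding of outcomes ("HH", "HT", ...) is just a convenient choice for illustration.

```python
from itertools import product

# Sample space of two successive coin tosses: HH, HT, TH, TT.
sample_space = ["".join(o) for o in product("HT", repeat=2)]

def X(outcome):
    """Number of heads in the outcome."""
    return outcome.count("H")

def Y(outcome):
    """Number of heads minus number of tails in the outcome."""
    return outcome.count("H") - outcome.count("T")

# Evaluate both random variables on every outcome.
x_values = {o: X(o) for o in sample_space}
y_values = {o: Y(o) for o in sample_space}

print(x_values)
print(y_values)
```

Note that HT and TH are different outcomes but are sent to the same value by both X and Y.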

Note: More than one outcome can map to the same value of a random variable.

Types of Random Variables in Probability distribution

  • Discrete Random Variables
  • Continuous Random Variables

Discrete Random Variables in Probability distribution

A discrete random variable can take only a finite or countably infinite number of distinct values. To understand this further, let’s see some examples of discrete random variables:

  1. X = sum of the outcomes when two dice are rolled. Here, X can only take the values {2, 3, 4, 5, 6, …, 10, 11, 12}.
  2. X = number of heads in 100 coin tosses. Here, X can take only integer values from 0 to 100.

Continuous Random Variable in Probability distribution

A continuous random variable can take any value in a continuous range, so it has infinitely many possible values. Let’s see an example based on a dart game.

Suppose we have a dart game in which the dart can land anywhere in the interval [-1, 1] on the x-axis. If we define our random variable X as the x-coordinate of the position of the dart, X can take any value in [-1, 1]. There are infinitely many possible values that X can take, such as 0.1, 0.001, 0.01 or 1.

Probability distribution of a random variable

Now the question is: how do we describe the behavior of a random variable?

Suppose that our random variable takes only finitely many values, x1, x2, x3, …, xn. That is, the range of X is the set of n values {x1, x2, x3, …, xn}.

Then the behavior of X is completely described by giving the probabilities of all the values of the random variable X:

Value of X    Probability
x1            Pr(X = x1)
x2            Pr(X = x2)
x3            Pr(X = x3)

The Probability Function of a discrete random variable X is the function p(x) satisfying

p(x) = Pr(X = x)

Let’s look at an example: 

Question: We draw two cards successively, with replacement, from a well-shuffled deck of 52 cards. Find the probability distribution of the number of aces.


Let’s define a random variable X as the number of aces drawn. Since we are drawing only two cards from the deck, X can take only three values: 0, 1 and 2. We also know that the cards are drawn with replacement, which means that the two draws can be considered independent experiments.

P(X = 0) = P(both cards are non-aces) 

               = P(non-ace) x P(non-ace) 

               = \frac{48}{52} \times \frac{48}{52} = \frac {144}{169}

P(X = 1) = P(exactly one of the cards is an ace)

               = P(non-ace and then ace) + P(ace and then non-ace)

               = P(non-ace) x P(ace) + P(ace) x P(non-ace)

               = \frac{48}{52} \times \frac{4}{52}  + \frac{4}{52} \times \frac{48}{52} = \frac{24}{169}

P(X = 2) = P(Both the cards are aces) 

               = P(ace) x P(ace)

               = \frac{4}{52} \times \frac{4}{52} = \frac{1}{169}

Now we have the probability of each value of the random variable. Since X is discrete, we can represent this distribution in a table:

X           0          1         2
P(X = x)    144/169    24/169    1/169
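The three probabilities can be reproduced with exact fractions. This is a small sketch of the same calculation; the variable names are illustrative.

```python
from fractions import Fraction

# A standard deck has 4 aces among 52 cards; draws are with replacement,
# so the two draws are independent and use the same probabilities.
p_ace = Fraction(4, 52)
p_non_ace = 1 - p_ace

distribution = {
    0: p_non_ace * p_non_ace,                  # non-ace, non-ace
    1: p_non_ace * p_ace + p_ace * p_non_ace,  # exactly one ace, either order
    2: p_ace * p_ace,                          # ace, ace
}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")

# The probabilities of all values of X must add up to one.
total = sum(distribution.values())
print("total =", total)
```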


Expectation (Mean) and Variance of a Random Variable

Suppose we are performing a probability experiment and have defined some random variable (R.V.) according to our needs, as we did in the previous examples. Each time the experiment is performed, our R.V. may take a different value. If we keep repeating the experiment a thousand times, or an infinite number of times, what will be the average value of the random variable?


The mean, expected value, or expectation of a random variable X is written as E(X) or \mu_{\textbf{X}} . If we observe N random values of X, then the mean of the N values will be approximately equal to E(X) for large N. 

For a random variable X which takes the values x1, x2, x3 … xn with probabilities p1, p2, p3 … pn, the expectation of X is defined as

E(X) = \sum_{i=1}^{n} x_{i}p_{i}

i.e., it is the weighted average of all the values X can take, each weighted by its probability.
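As a quick illustration of this weighted average, the sketch below computes the expectation of the number of aces from the two-card example earlier in the article. The function name `expectation` and the dictionary layout are assumptions made for this example.

```python
from fractions import Fraction

def expectation(distribution):
    """Weighted average of the values, each weighted by its probability.

    `distribution` maps each value x_i to its probability p_i.
    """
    return sum(x * p for x, p in distribution.items())

# Distribution of the number of aces in two draws with replacement.
aces = {
    0: Fraction(144, 169),
    1: Fraction(24, 169),
    2: Fraction(1, 169),
}

mean_aces = expectation(aces)
print(mean_aces)  # 0*144/169 + 1*24/169 + 2*1/169 = 26/169 = 2/13
```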

To see why the mean alone is not enough, consider the figure below.

[Figure: probability distributions of two random variables with almost the same mean; one is concentrated near a single value and the other is widely spread out]

In the figure, both random variables have almost the same mean, but does that make them equal? No. To fully describe the behavior of a random variable, we need something more.

We need to look at the dispersion of the probability distribution: one of the distributions is concentrated near a single value, while the other is very spread out. So we need a metric to measure this dispersion.


In statistics, the variance is a measure of the spread or scatter in data. Likewise, the variability or spread in the values of a random variable may be measured by its variance.

Let X be a random variable which takes the values x1, x2, x3 … xn with probabilities p1, p2, p3 … pn, and let \mu = E[X] be its expectation.

The variance of X, written Var(X), is defined as

Var(X) = E[(X - \mu)^{2}] = \sum_{i=1}^{n} (x_{i}-\mu)^{2}p_{i} = E[X^{2}] - (E[X])^{2}
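The last equality is not obvious at first sight. A short derivation, using only \mu = E[X] = \sum x_{i}p_{i} and the fact that the probabilities p_{i} sum to one, is sketched below:

```latex
\begin{aligned}
\operatorname{Var}(X) &= \sum_{i=1}^{n} (x_{i}-\mu)^{2} p_{i} \\
&= \sum_{i=1}^{n} x_{i}^{2} p_{i} - 2\mu \sum_{i=1}^{n} x_{i} p_{i} + \mu^{2} \sum_{i=1}^{n} p_{i} \\
&= E[X^{2}] - 2\mu \cdot \mu + \mu^{2} \cdot 1 \\
&= E[X^{2}] - (E[X])^{2}
\end{aligned}
```

The second form is usually easier to compute by hand, because it needs only E[X] and E[X^{2}].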

Let’s calculate the mean and variance of a random variable probability distribution through an example:

Question: Find the variance and mean of the number obtained on a throw of an unbiased die.


We know that the sample space of this experiment is {1,2,3,4,5,6} 

Let’s define our random variable X, which represents the number obtained on a throw. 

So, the probabilities of the values which our random variable can take are, 

P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = \frac{1}{6}

Therefore, the probability distribution of the random variable is,

X              1      2      3      4      5      6
Probability    1/6    1/6    1/6    1/6    1/6    1/6

E[X] = \sum p_{x_{i}}x_{i}

         = 1 \times \frac{1}{6} + 2 \times \frac{1}{6} + 3 \times \frac{1}{6} + 4 \times \frac{1}{6} + 5 \times \frac{1}{6} + 6 \times \frac{1}{6}

         = \frac{21}{6}

Also, E[X^{2}] = 1^{2} \times \frac{1}{6} + 2^{2}\times\frac{1}{6} + 3^{2}\times\frac{1}{6} + 4^{2}\times\frac{1}{6} + 5^{2}\times\frac{1}{6} + 6^{2}\times\frac{1}{6}

         = \frac{91}{6}

Thus, Var(X) = E[X^{2}] – (E[X])^{2}

                      = (\frac{91}{6}) - (\frac{21}{6})^{2} = \frac{91}{6} - \frac{441}{36} = \frac{35}{12}

Therefore, the mean is \frac{21}{6} = \frac{7}{2} and the variance is \frac{35}{12}
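The arithmetic above can be double-checked with exact fractions. This is a minimal sketch of the same computation for a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction

# Each face of an unbiased die has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

mean = sum(x * p for x, p in die.items())          # E[X]
mean_sq = sum(x**2 * p for x, p in die.items())    # E[X^2]
variance = mean_sq - mean**2                       # E[X^2] - (E[X])^2

print("E[X]   =", mean)
print("E[X^2] =", mean_sq)
print("Var(X) =", variance)
```

Fraction reduces results automatically, so the mean is reported as 7/2, which is the same number as 21/6.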

Different Types of Probability Distributions

We have seen what probability distributions are; now we will look at their different types. The type of a probability distribution is determined by the type of the random variable. There are two types of probability distributions:

  • Discrete probability distributions for discrete variables
  • Probability density functions for continuous variables

We will study two types of discrete probability distributions in detail; the others are out of scope at class 12.

Discrete Probability Distributions 

A discrete probability distribution describes a random variable that takes a set of distinct, separate values. For example, coin tosses and counts of events have discrete distributions, because there are no in-between values: a coin toss gives either heads or tails.

In a discrete probability distribution, each possible value has a non-zero probability. Moreover, the probabilities of all the values of the random variable must sum to one. For example, the probability of rolling a specific number on a fair die is 1/6, and the total probability of all six values equals one. When we roll a die, we get exactly one of these values.

Bernoulli trials and Binomial distributions

Many experiments have only two possible outcomes. For example, a tossed coin shows a ‘head’ or a ‘tail’, and a manufactured item can be ‘defective’ or ‘non-defective’. In these cases, we can call one of the outcomes “success” and the other “failure”. For instance, in the coin-tossing experiment, if the occurrence of a head is considered a success, then the occurrence of a tail is a failure.

Each time we toss a coin, roll a die or perform any other experiment, we call it a trial. In the coin-tossing experiment, the outcome of any trial is independent of the outcome of any other trial, and the probability of success or failure remains constant from trial to trial. Such independent trials, which have only two outcomes usually referred to as ‘success’ or ‘failure’, are called Bernoulli trials.


The trials of a random experiment are known as Bernoulli trials if they satisfy the following conditions:

  • There must be a finite number of trials.
  • All trials must be independent.
  • Every trial has exactly two outcomes: success or failure.
  • The probability of success remains the same in every trial.

Consider an experiment in which we throw a die. Throwing a die 50 times can be considered a case of 50 Bernoulli trials, where the result of each trial is either a success (say, getting an even number) or a failure (getting an odd number), and the probability of success (p) is the same for all 50 throws. The successive throws of the die are independent trials. If the die is fair and has the six numbers 1 to 6 written on its six faces, then p = 1/2 and q = 1 – p = 1/2 is the probability of failure.

Question: An urn contains 8 red balls and 10 black balls. Six balls are drawn from the urn successively. State whether or not the trials of drawing balls are Bernoulli trials when, after each draw, the ball drawn is:

  1. replaced
  2. not replaced in the urn.


  • The number of trials is finite. When the drawing is done with replacement, the probability of success (say, drawing a red ball) is p = 8/18, which is the same for all six trials. So, the draws with replacement are Bernoulli trials.
  • If the drawing is done without replacement, the probability of success (i.e., a red ball) in the first trial is 8/18; in the second trial it is 7/17 if the first ball drawn is red, or 8/17 if the first ball drawn is black, and so on. Clearly, the probability of success is not the same for all the trials. Therefore, the trials are not Bernoulli trials.
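The difference between the two cases can be seen by tracking the contents of the urn. The sketch below is illustrative only: `p_red` is a hypothetical helper that returns the probability of drawing a red ball from an urn with the given contents.

```python
from fractions import Fraction

RED, BLACK = 8, 10  # initial contents of the urn

def p_red(red, black):
    """Probability that the next ball drawn is red."""
    return Fraction(red, red + black)

# With replacement: the urn is restored after each draw,
# so the probability of red is the same on every trial.
with_replacement = [p_red(RED, BLACK) for _ in range(6)]

# Without replacement: the second-draw probability depends on the first draw.
first = p_red(RED, BLACK)
second_after_red = p_red(RED - 1, BLACK)    # one red ball has been removed
second_after_black = p_red(RED, BLACK - 1)  # one black ball has been removed

print(with_replacement)
print(first, second_after_red, second_after_black)
```

Since 8/18, 7/17 and 8/17 are all different, the success probability is not constant without replacement, which is exactly the condition that fails.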

Binomial Distribution

The binomial distribution describes a random variable that counts the number of successes in n successive independent Bernoulli trials. It arises in many situations, such as the number of heads in n coin flips.

Let P and Q denote success and failure in a Bernoulli trial. Suppose we are interested in the different ways in which we can have exactly 1 success in six trials.

Clearly, there are six cases, as listed below:

PQQQQQ, QPQQQQ, QQPQQQ, QQQPQQ, QQQQPQ, QQQQQP
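These arrangements can also be generated mechanically. The following sketch enumerates every sequence of P (success) and Q (failure) and keeps those with the required number of successes; the function name `arrangements` is an assumption made for this example.

```python
from itertools import product
from math import comb

def arrangements(n, k):
    """All sequences of n trials containing exactly k successes (P)."""
    return ["".join(s) for s in product("PQ", repeat=n) if s.count("P") == k]

one_success = arrangements(6, 1)
two_successes = arrangements(6, 2)

print(one_success)
# The number of arrangements with k successes matches the binomial coefficient.
print(len(one_success), comb(6, 1))
print(len(two_successes), comb(6, 2))
```

Enumeration examines all 2^n sequences, so it quickly becomes impractical as n grows, which motivates a closed-form formula.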


Likewise, 2 successes and 4 failures can occur in \frac{6!}{4! 2!} = 15 combinations. Listing so many combinations is tedious, so calculating the probabilities of 0, 1, 2, …, n successes this way can be long and time-consuming. To avoid such lengthy calculations and the listing of all possible cases, a formula is used for the probability of a given number of successes in n Bernoulli trials:

If Y is a binomial random variable, we write Y ∼ Bin(n, p), where n is the total number of trials, p is the probability of success in a given trial and q = 1 – p is the probability of failure. The probability of getting exactly x successes in the n trials is

P(Y = x) = ^{n}C_{x} q^{n-x} p^{x}, x = 0, 1, 2, …, n

The function P(Y = x) is known as the probability function of the binomial distribution.
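The probability function translates almost directly into code. This is a minimal sketch using exact fractions; `binomial_pmf` is a name chosen for this example, and the values n = 6 and p = 1/2 are picked only for illustration.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(Y = x) for Y ~ Bin(n, p): nCx * p^x * q^(n - x), with q = 1 - p."""
    q = 1 - p
    return comb(n, x) * p**x * q**(n - x)

# Illustrative values: six trials with success probability 1/2.
n, p = 6, Fraction(1, 2)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(f"P(Y = {x}) = {prob}")

# A probability function must sum to one over all possible values.
print("total =", sum(pmf))
```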

Question: A fair coin is tossed 10 times. Find the probability of:

  1. exactly six heads
  2. at least six heads


Each coin toss can be considered a Bernoulli trial. Let X be the number of heads in this experiment.

We already know that n = 10 and p = 1/2.

So, P(X = x) = ^{n}C_{x} p^{x} (1-p)^{n-x}, x = 0, 1, 2, 3, …, n

P(X = x) = ^{10}C_{x} p^{x} (1-p)^{10-x}

(i) When x = 6,

P(X = 6) = ^{10}C_{6} p^{6} (1-p)^{4}

               = \frac{10!}{6!4!}(\frac{1}{2})^{6}(\frac{1}{2})^{4}

               = \frac{7\times8\times9\times10}{2\times3\times4}\times\frac{1}{64}\times\frac{1}{16}

               = \frac{105}{512}

(ii) P(at least 6 heads) = P(X ≥ 6) = P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10)

= ^{10}C_{6} p^{6} (1-p)^{4} + ^{10}C_{7} p^{7} (1-p)^{3} + ^{10}C_{8} p^{8} (1-p)^{2} + ^{10}C_{9} p^{9} (1-p)^{1} + ^{10}C_{10} p^{10}

= \frac{10!}{6!4!}(\frac{1}{2})^{10} + \frac{10!}{7!3!}(\frac{1}{2})^{10} + \frac{10!}{8!2!}(\frac{1}{2})^{10} + \frac{10!}{9!1!}(\frac{1}{2})^{10} + \frac{10!}{10!}(\frac{1}{2})^{10}

= (\frac{10!}{6!4!} + \frac{10!}{7!3!}+ \frac{10!}{8!2!} + \frac{10!}{9!1!}+ \frac{10!}{10!})(\frac{1}{2})^{10}

= \frac{193}{512}
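Both answers can be verified with a few lines of code. The sketch below applies the binomial probability function with n = 10 and p = 1/2 from the question; the helper name `pmf` is illustrative.

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(1, 2)  # ten tosses of a fair coin

def pmf(x):
    """P(X = x): probability of exactly x heads in n tosses."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

p_exactly_six = pmf(6)
p_at_least_six = sum(pmf(x) for x in range(6, n + 1))

print("P(X = 6)  =", p_exactly_six)
print("P(X >= 6) =", p_at_least_six)
```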
