How to Calculate Conditional Probability in R?

In this article, we will discuss how to calculate conditional probability in the R programming language.

Conditional probability is the probability that one event occurs given that another event has occurred. If A and B are two events, the probability of event B conditioned on event A is written P(B|A); it is the probability of B given that A has already occurred.

Similarly, the probability of occurrence of Event A conditioned over the occurrence of Event B is given by P(A|B), which also represents the conditional probability of Event A given that Event B has already occurred.

The formula for conditional probability is

P(A|B) = P(A ∩ B) / P(B)

This is valid only when P(B) ≠ 0, i.e. when event B is not an impossible event.

Similarly,

P(B|A) = P(A ∩ B) / P(A)

This is valid only when P(A) ≠ 0, i.e. when event A is not an impossible event.
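
Since a conditional probability is the joint probability divided by the probability of the conditioning event, it can be sketched as a small R helper. The function name cond_prob and the numbers used here are purely illustrative and do not come from any package or from the examples in this article.

```r
# Hypothetical helper (not from any package): P(event | given),
# computed from the joint probability and the probability of the
# conditioning event.
cond_prob <- function(p_joint, p_given) {
  if (p_given == 0) {
    stop("The conditioning event has probability 0, so the conditional probability is undefined.")
  }
  p_joint / p_given
}

# Illustrative numbers only: P(A and B) = 0.1, P(B) = 0.4
p_a_given_b <- cond_prob(p_joint = 0.1, p_given = 0.4)
print(p_a_given_b)  # 0.25
```

The explicit check mirrors the requirement that the conditioning event must not be impossible.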

The below figure depicts the Venn diagram representation

Example 1: Computation of Conditional Probability

From a pack of 50 Pokémon cards, a card is drawn at random. These 50 cards have 5 equal sets of red, blue, green, yellow, and black cards respectively and each set has 2 water-type Pokémon with one water type being of high strength and the other one being of medium strength.

Let A be the event of drawing a high-strength water-type Pokémon card and B be the event of drawing a red card. What is the probability of drawing a high-strength water-type Pokémon card given that a red card has been drawn?

Solution Steps

Step 1: Probability of drawing a red card (event B).

P(B) = 10/50 (there are 10 red cards in the pack of 50 Pokémon cards).

Step 2: Probability of drawing a high-strength water-type Pokémon card (event A).

P(A) = 5/50 (there are 5 high-strength water-type Pokémon cards in the pack of 50 cards, one in each color set).

Step 3: Probability of drawing a card that is both red and a high-strength water type.

P(A ∩ B) = 1/50 (there is one red high-strength water-type Pokémon card in the pack of 50 cards).

Step 4: Since event B has already occurred, there are only 10 possible cases rather than 50. Among these 10 red Pokémon cards, exactly 1 is a high-strength water-type card.

Hence, P(A|B) = P(A ∩ B) / P(B) = (1/50) / (10/50) = 1/10.

This is the conditional probability of A given that B has already occurred.

Similarly,

P(B|A) = P(A ∩ B) / P(A) = (1/50) / (5/50) = 1/5

since exactly 1 of the 5 high-strength water-type Pokémon cards is red.
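
The same result can be checked by enumerating the pack in base R. This is only a sketch: the labels "water_high", "water_medium", and "other" are made-up names, and the example does not say what the 8 remaining cards in each color set are, so they are simply labelled "other" here.

```r
# Enumerate the 50-card pack described in the example.
colours <- c("red", "blue", "green", "yellow", "black")

# Assumption: within each colour set, one card is the high-strength water
# type, one is the medium-strength water type, and the other 8 are "other".
kind_per_colour <- c("water_high", "water_medium", rep("other", 8))

deck <- data.frame(
  colour = rep(colours, each = 10),
  kind   = rep(kind_per_colour, times = 5)
)

A <- deck$kind == "water_high"   # event A: high-strength water-type card
B <- deck$colour == "red"        # event B: red card

# Every card is equally likely, so the mean of a logical vector
# is the probability of the corresponding event.
p_A_given_B <- mean(A & B) / mean(B)
p_B_given_A <- mean(A & B) / mean(A)

print(p_A_given_B)  # 0.1
print(p_B_given_A)  # 0.2
```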

Example 2: Computation of Conditional Probability

A store owner has a list of 15 customers. He observes certain patterns in their purchases which are depicted in the table below.

Customer   Money spent   Frequency
1          High          Less
2          Low           More
3          High          More
4          High          Less
5          Low           Less
6          Low           More
7          High          More
8          Low           Less
9          Low           Less
10         High          More
11         Low           More
12         Low           Less
13         High          Less
14         High          More
15         High          Less

Based on the above table, he is interested in finding out:

  • What is the probability of a customer being a high spender given that they purchase less often?
  • What is the probability of a customer being a low spender given that they purchase more often?
  • What is the probability of a customer being a low spender given that they purchase less often?
  • What is the probability of a customer being a high spender given that they purchase more often?

Solution Steps

1. P(High Spend | Less Frequency)

P(Less Frequency) = 8/15 (from the table, the frequency is "Less" for 8 of the 15 customers)

P(High Spend ∩ Less Frequency) = 4/15 (from the table, 4 of the 15 customers have high spend and less frequency)

P(High Spend | Less Frequency) = P(High Spend ∩ Less Frequency) / P(Less Frequency) = (4/15) / (8/15) = 0.5

2. P(Low Spend | More Frequency)

P(More Frequency) = 7/15 (from the table, the frequency is "More" for 7 of the 15 customers)

P(Low Spend ∩ More Frequency) = 3/15 (from the table, 3 of the 15 customers have low spend and more frequency)

P(Low Spend | More Frequency) = P(Low Spend ∩ More Frequency) / P(More Frequency) = (3/15) / (7/15) = 0.4285714

Similarly,

3. P(Low Spend | Less Frequency) = (4/15) / (8/15) = 0.5

4. P(High Spend | More Frequency) = (4/15) / (7/15) = 0.5714286
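
As a cross-check on the hand calculations, all four conditional probabilities can be read from a single column-normalised two-way table in base R. This is a sketch that needs no extra packages; the variable names money_spent, frequency, counts, and cond are chosen here for illustration.

```r
# Data transcribed from the customer table (customers 1 to 15, in order)
money_spent <- c("High", "Low", "High", "High", "Low", "Low", "High", "Low",
                 "Low", "High", "Low", "Low", "High", "High", "High")
frequency   <- c("Less", "More", "More", "Less", "Less", "More", "More", "Less",
                 "Less", "More", "More", "Less", "Less", "More", "Less")

# Two-way table of counts: rows = money spent, columns = frequency
counts <- table(Money_Spent = money_spent, Frequency = frequency)

# margin = 2 divides each column by its column total, so each cell
# becomes P(Money_Spent | Frequency)
cond <- prop.table(counts, margin = 2)
print(cond)
```

Each column of cond sums to 1, and the entry in row "High", column "Less" corresponds to P(High Spend | Less Frequency).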

To compute these values in R, first install the “prob” and “tidyverse” packages and create a data frame from the table. Count the frequency of each unique combination in the data frame (shown as “n” in the output) and build a two-way table of the combinations. Then convert the data frame into a probability space, which assigns each row an individual probability (shown as “probs” in the output). Finally, compute each conditional probability required by the problem.

Below is the R Code used for computation

R




# Libraries: prob for probspace() and Prob(),
# tidyverse for %>% and count()
library(prob)
library(tidyverse)

Money_Spent <- c("High", "Low", "High", "High",
                 "Low", "Low", "High", "Low",
                 "Low", "High", "Low", "Low",
                 "High", "High", "High")
Frequency <- c("Less", "More", "More", "Less",
               "Less", "More", "More", "Less",
               "Less", "More", "More", "Less",
               "Less", "More", "Less")
Customer <- 1:15

# Customer data frame; data.frame() keeps Customer numeric, whereas
# as.data.frame(cbind(...)) would coerce every column to character
Customer_Data <- data.frame(Customer, Money_Spent, Frequency)

# Count each unique (Money_Spent, Frequency) combination
Customer_Data %>%
  count(Money_Spent, Frequency, sort = TRUE)

# Creating a two-way table (with row and column totals) from the data frame
Customer_Data_Table <- addmargins(table("Money_Spent" = Customer_Data$Money_Spent,
                                        "Frequency" = Customer_Data$Frequency))
# View table
Customer_Data_Table

# Convert to a probability space: adds a "probs" column
# giving each row equal probability
Customer_Data <- probspace(Customer_Data)
Customer_Data

# Probability of the customer being a high spender
# given that they are purchasing less often
Prob(Customer_Data, event = Money_Spent == "High", given = Frequency == "Less")

# Probability of the customer being a low spender
# given that they are purchasing more often
Prob(Customer_Data, event = Money_Spent == "Low", given = Frequency == "More")

# Probability of the customer being a low spender
# given that they are purchasing less often
Prob(Customer_Data, event = Money_Spent == "Low", given = Frequency == "Less")

# Probability of the customer being a high spender
# given that they are purchasing more often
Prob(Customer_Data, event = Money_Spent == "High", given = Frequency == "More")


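Output:

The four Prob() calls should return 0.5, 0.4285714, 0.5, and 0.5714286 respectively, matching the values computed by hand in the solution steps.
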
Last Updated : 28 Jan, 2024