Cumulative Frequency Graph in R

Last Updated : 19 Dec, 2023

In this article, we are going to plot a cumulative frequency graph using the R programming language.

What is Cumulative Frequency?

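Cumulative frequency is a running total of class frequencies: the frequency of the first class interval is added to the frequency of the second, that total is added to the frequency of the third, and so on. The cumulative frequency of a class is therefore the sum of its own frequency and the frequencies of all classes before it.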

What is a Cumulative Frequency Graph?

A graph that shows the cumulative frequency distribution of grouped data is called a cumulative frequency graph or an ogive. Plotting the data is one of the most effective ways to understand cumulative frequencies and draw conclusions from them, for example how many observations fall below a given value.

Functions Used for R Cumulative Frequency Graph

Here are some of the functions used to build a cumulative frequency graph in R.

seq() Method

The seq() method creates a vector of values running from the lower limit to the upper limit, spaced by the difference specified in the “by” parameter.

Syntax: seq(from, to, by)

Parameters : 

from – start of the sequence 

to – end of the sequence

by – increment value of the sequence
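
As a small illustration of seq(), the sketch below builds the same break points that are used later in this article. The variable name n_intervals is introduced here only for the example.

```r
# Break points from 0 to 6 in steps of 1
break_points <- seq(0, 6, by = 1)
print(break_points)
# [1] 0 1 2 3 4 5 6

# Seven break points bound six intervals
n_intervals <- length(break_points) - 1
print(n_intervals)
# [1] 6
```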

cut() Method

The cut() method in R divides the range of the specified vector of data points into intervals and labels each value with the interval it belongs to. The result is a factor whose levels are the intervals.

Syntax: cut(x, breaks) 

Parameters : 

x – The vector of data points.

breaks – The vector of break points.
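
The following sketch shows how cut() assigns values to intervals. It also uses the right argument of cut(), which is not listed above but appears in the code later in this article: by default intervals are closed on the right, and right = FALSE makes them closed on the left. The sample vector x is made up for the example.

```r
x <- c(1, 2, 2, 5)
breaks <- seq(0, 6, by = 1)

# Default (right = TRUE): intervals are closed on the right, e.g. (0,1]
default_bins <- cut(x, breaks)
print(as.character(default_bins))
# [1] "(0,1]" "(1,2]" "(1,2]" "(4,5]"

# right = FALSE: intervals are closed on the left, e.g. [1,2)
left_closed_bins <- cut(x, breaks, right = FALSE)
print(as.character(left_closed_bins))
# [1] "[1,2)" "[2,3)" "[2,3)" "[5,6)"
```

With right = FALSE, a value that sits exactly on a break point is counted in the interval that starts at that break point.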

table(x) Method

The table() method counts how many times each distinct value (or factor level) occurs in a vector. Applied to the output of cut(), it counts the data points that fall in each interval, which gives the frequency table. Intervals that contain no data points are still listed, with a count of 0.

Syntax: table(x)

Parameter : 

x – The vector of values to be converted.
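
A minimal sketch of table() on a small categorical vector; the grades data is hypothetical and only serves to show the shape of the result.

```r
grades <- c("B", "A", "C", "A", "B", "A")

# Count the occurrences of each distinct value
grade_counts <- table(grades)
print(grade_counts)
# grades
# A B C 
# 3 2 1 
```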

cumsum(x) Method

The cumulative frequencies can be generated using the cumsum() method for the specified vector. The cumulative frequency for the nth interval is the sum of the frequencies of the first n intervals, including the nth interval itself.

Syntax: cumsum(x)

Parameters : 

x – A vector of data points.
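
As a quick sketch, cumsum() applied to a vector of interval frequencies returns the running totals. The frequencies below match the frequency table built later in this article.

```r
freq <- c(0, 4, 3, 3, 1, 2)

# The nth element of the result is the sum of the first n elements of freq
cum_freq <- cumsum(freq)
print(cum_freq)
# [1]  0  4  7 10 11 13
```

The last running total equals sum(freq), the total number of observations.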

plot() Method

The plot of cumulative frequencies can then be created using the plot() method in R. The method takes the break points as the x-coordinates and their cumulative frequencies as the y-coordinates.

Syntax: plot(x-coordinates, y-coordinates, xlab, ylab)

Parameters : 

x-coordinates – The vector of x coordinates.

y-coordinates – The vector of y coordinates.

xlab – The labelling of x axis.

ylab – The labelling of y axis.

Creating a frequency table

A frequency table shows how many data points fall into each value or interval. Here we store the data points in a variable “data_points” and create seven break points (0 to 6) using the seq() method, which define six intervals. We then assign each data point to an interval with cut() and count the points per interval with table().

R




# declaring data points
data_points <- c(1, 2, 3, 5, 1, 1,
                 2, 4, 5, 1, 2, 3, 3)
# declaring the break points
break_points <- seq(0, 6, by = 1)
# transforming the data
data_transform <- cut(data_points, break_points,
                      right = FALSE)
# creating the frequency table
freq_table <- table(data_transform)
# printing the frequency table
print("Frequency Table")
print(freq_table)


Output:

[1] "Frequency Table" 
data_transform
[0,1) [1,2) [2,3) [3,4) [4,5) [5,6) 
    0     4     3     3     1     2 

The interval [1,2) includes 1 and excludes 2, and it contains 4 data points because the value 1 appears four times in the vector. Similarly, the value 3 appears three times, so the count for [3,4) is 3. No data point is smaller than 1, so the count for [0,1) is 0.
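
A simple sanity check, sketched below, is to confirm that the frequencies add up to the number of data points. The second part uses a made-up vector c(1, 7) to show what happens when a value lies outside the break points: cut() returns NA for it, and table() leaves NA values out of the counts by default.

```r
data_points <- c(1, 2, 3, 5, 1, 1, 2, 4, 5, 1, 2, 3, 3)
break_points <- seq(0, 6, by = 1)
freq_table <- table(cut(data_points, break_points, right = FALSE))

# Every data point should land in exactly one interval
print(sum(freq_table) == length(data_points))
# [1] TRUE

# A value outside the break points becomes NA and is dropped by table()
out_of_range <- table(cut(c(1, 7), break_points, right = FALSE))
print(sum(out_of_range))
# [1] 1
```

If the total is smaller than the number of data points, the break points probably do not cover the full range of the data.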

Plotting the cumulative frequency graph

Continuing from the code above, we first compute the cumulative frequencies from the frequency table using the cumsum() method, prepending 0 so that there is one value for each of the seven break points. We then plot the cumulative frequencies against the break points, labeling the x-axis as data points and the y-axis as cumulative frequency. The points can then be connected using the lines() method.

R




# calculating cumulative frequency
# (the leading 0 pairs with the first break point, below which there is no data)
cumulative_freq <- c(0, cumsum(freq_table))
print("Cumulative Frequency")
print(cumulative_freq)
# plotting the data
plot(break_points, cumulative_freq,
     xlab = "Data Points",
     ylab = "Cumulative Frequency")
# creating line graph
lines(break_points, cumulative_freq)


Output:

Cumulative Frequency Graph in R

The last cumulative frequency, plotted at the upper bound of the interval [5,6), is the sum of the frequencies of all six intervals. It equals the total number of data points, which is 13 here.
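
As an alternative sketch, the points and the connecting lines can be drawn in a single call by passing type = "o" to plot(), which removes the need for a separate lines() call. The block repeats the data preparation so that it runs on its own.

```r
data_points <- c(1, 2, 3, 5, 1, 1, 2, 4, 5, 1, 2, 3, 3)
break_points <- seq(0, 6, by = 1)
freq_table <- table(cut(data_points, break_points, right = FALSE))
cumulative_freq <- c(0, cumsum(freq_table))

# type = "o" overplots the points and the lines connecting them
plot(break_points, cumulative_freq,
     type = "o",
     xlab = "Data Points",
     ylab = "Cumulative Frequency")
```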

Create Multiple Cumulative Frequency Graphs in R

Here we plot several cumulative frequency curves on the same axes using the ggplot2 package.

R




# Install the package once if it is not already available
# install.packages("ggplot2")
# Load the necessary library
library(ggplot2)
 
# Create a sample data frame
data <- data.frame(
  values = c(5, 8, 12, 15, 20, 22, 25, 28, 32, 35),
  frequency = c(2, 3, 4, 5, 6, 7, 8, 9, 10, 12)
)
 
# Calculate cumulative frequency
data$cumulative_frequency <- cumsum(data$frequency)
 
# Create additional scenarios for cumulative frequency
data$cumulative_frequency_scenario2 <- cumsum(data$frequency) + 10
data$cumulative_frequency_scenario3 <- cumsum(data$frequency) - 5
 
# Combine all scenarios into one data frame
all_data <- rbind(
  transform(data, scenario = "Scenario 1"),
  transform(data, cumulative_frequency = data$cumulative_frequency_scenario2,
            scenario = "Scenario 2"),
  transform(data, cumulative_frequency = data$cumulative_frequency_scenario3,
            scenario = "Scenario 3")
)
 
# Create a cumulative frequency plot with step format and multiple scenarios
ggplot(all_data, aes(x = values, y = cumulative_frequency, group = scenario,
                     color = scenario)) +
  # linewidth replaces size for lines in ggplot2 3.4.0 and later; use size on older versions
  geom_step(linewidth = 1.5, direction = "hv") +
  geom_point(size = 3) +
  labs(title = "Cumulative Frequency Graph",
       x = "Values",
       y = "Cumulative Frequency",
       color = "Scenario") +
  theme_minimal() +
  scale_color_manual(values = c("Scenario 1" = "steelblue", "Scenario 2" = "red",
                                "Scenario 3" = "green"))


Output:

Cumulative Frequency Graph in R

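In this example, geom_step() with direction = "hv" draws each step as a horizontal segment followed by a vertical one. Scenario 2 and Scenario 3 are not separate data sets: they are the Scenario 1 cumulative frequencies shifted by +10 and -5, and are included only to show how several curves can be drawn and colored on one plot.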


