R – Pareto Chart

Last Updated : 08 Jun, 2023

A Pareto chart combines a bar chart and a line chart in a single plot. The bars show how often an event occurs in each category, sorted in decreasing order from left to right, and the overlaid line shows the cumulative percentage of occurrences. In R Pareto charts, the left vertical axis represents frequency while the right vertical axis represents cumulative percentage. The chart takes its name from the Pareto principle, which states that roughly 80% of effects come from 20% of causes.
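To make the two layers concrete, here is a minimal base-R sketch that builds such a chart by hand, without any package. The counts and category names are made up for illustration, and the helper name simple_pareto is hypothetical. It is not how any particular package draws its chart, only one way to combine sorted bars with a cumulative percentage line.

```r
# Hypothetical helper: draws a basic Pareto chart with base graphics
simple_pareto <- function(counts) {
  # Sort categories from most to least frequent
  sorted <- sort(counts, decreasing = TRUE)
  total <- sum(sorted)

  # Running total expressed as a percentage of the grand total
  cum_pct <- cumsum(sorted) / total * 100

  # Widen the right margin to leave room for the second axis
  old_par <- par(mar = c(5, 4, 4, 5))
  on.exit(par(old_par))

  # Bars: the left axis spans 0..total so that 100% lines up with the total
  mids <- barplot(sorted, ylim = c(0, total), ylab = "Frequency")

  # Line: cumulative percentage rescaled to the units of the left axis
  lines(mids, cum_pct / 100 * total, type = "b", pch = 19)

  # Right axis labelled in percent
  ticks <- seq(0, 100, by = 25)
  axis(side = 4, at = ticks / 100 * total, labels = paste0(ticks, "%"))
  mtext("Cumulative Percentage", side = 4, line = 3)

  # Return the numbers behind the chart without printing them
  invisible(data.frame(frequency = as.vector(sorted),
                       cum_percent = as.vector(cum_pct),
                       row.names = names(sorted)))
}

# Made-up example data
counts <- c(Scratch = 12, Dent = 30, Crack = 5, Stain = 3)
result <- simple_pareto(counts)
```

The returned data frame lists the categories in plotting order together with their cumulative percentages, which is handy for checking the line against the bars.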

Syntax: pareto.chart(x, ylab = "Frequency", ylab2 = "Cumulative Percentage", xlab, cumperc = seq(0, 100, by = 25), ylim, main, col = heat.colors(length(x)))

Parameters:

x: a vector of values. names(x) are used for labelling the bars.
ylab: a string specifying the label for the y-axis.
ylab2: a string specifying the label for the second y-axis on the right side.
xlab: a string specifying the label for the x-axis.
cumperc: a vector of percentage values to be used as tickmarks for the second y-axis on the right side.
ylim: a numeric vector specifying the limits for the y-axis.
main: a string specifying the main title to appear on the plot.
col: a value for the color, a vector of colors, or a palette for the bars. See the help for colors and palette.

Plotting Pareto Chart

The following steps are required to plot a Pareto chart in R:

Create a vector (defect <- c(Values…)) that holds the counts for the different categories.
Assign the category names to it as strings (names(defect) <- c(Values…)).
Plot the vector “defect” using pareto.chart().

Example 1:

R

# Install the qcc package if it is missing, then load it
if (!requireNamespace("qcc", quietly = TRUE)) {
  install.packages("qcc")
}
library(qcc)

# Complaint counts (bar heights)
defect <- c(27, 789, 9, 65, 12, 109, 30, 15, 45, 621)

# Category names (bar labels on the x-axis)
names(defect) <- c("Too noisy", "Overpriced", "Food not fresh",
                   "Food is tasteless", "Unfriendly staff",
                   "Wait time", "Not clean", "Food is too salty",
                   "No atmosphere", "Small portions")

pareto.chart(defect,
             xlab = "Categories",                 # x-axis label
             ylab = "Frequency",                  # left y-axis label
             col = heat.colors(length(defect)),   # bar colors
             cumperc = seq(0, 100, by = 20),      # tick marks on the right axis
             ylab2 = "Cumulative Percentage",     # right y-axis label
             main = "Complaints of different customers")  # chart title

Output:

[Figure: Pareto chart of the customer complaints]

library(qcc) loads the “qcc” package into the R environment.
The frequency of each complaint is stored in the vector “defect”.
The names of the complaints are attached to “defect” with names().
The Pareto chart is then generated using the pareto.chart() function.
The function receives the “defect” vector as the data to chart.
The “xlab” option specifies the label for the x-axis.
The “ylab” option specifies the label for the left-hand y-axis.
The “col” option specifies the colors of the bars.
The “cumperc” parameter defines the percentage tick marks on the right-hand y-axis.
The “ylab2” option specifies the label for the right-hand y-axis.
The “main” option specifies the title of the chart.

Example 2:

R

library(qcc)  # already loaded if Example 1 was run

# Defect counts (bar heights)
defect <- c(7000, 4000, 5200, 3000, 800)

# Category names (bar labels on the x-axis)
names(defect) <- c("Class A", "Class B", "Class C",
                   "Class D", "Class E")

pareto.chart(defect,
             xlab = "Categories",
             ylab = "Frequency",
             col = heat.colors(length(defect)),
             cumperc = seq(0, 100, by = 10),
             ylab2 = "Cumulative Percentage",
             main = "Defects")
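
The numbers behind the chart in Example 2 can be checked by hand with base R. This sketch only reproduces the arithmetic (sorting, percentages, running total) and does not need qcc. The 80% cutoff is an illustrative threshold borrowed from the Pareto principle, and the variable names are arbitrary.

```r
# Same data as Example 2
defect <- c(7000, 4000, 5200, 3000, 800)
names(defect) <- c("Class A", "Class B", "Class C",
                   "Class D", "Class E")

# Order categories from most to least frequent
sorted <- sort(defect, decreasing = TRUE)

# Share of each category and running total, in percent
pct <- sorted / sum(sorted) * 100
cum_pct <- cumsum(pct)

summary_table <- data.frame(frequency = sorted,
                            percent = pct,
                            cum_percent = cum_pct)
print(summary_table)

# Leading categories needed to reach the (assumed) 80% cutoff
vital_few <- names(cum_pct)[seq_len(which(cum_pct >= 80)[1])]
print(vital_few)
```

With these counts the total is 20000, so Class A, Class C and Class B contribute 35%, 26% and 20% respectively and together account for 81% of all defects.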