
How to Calculate Percentiles in R?

In this article, we will discuss how to calculate percentiles in the R programming language. 

Percentiles are measures of position: the p-th percentile is the value below which about p percent of the data lies. In R, the quantile() function computes them.



Syntax: quantile(data, probs)

Parameters:

  • data: numeric vector whose percentiles are to be calculated
  • probs: required percentile(s), given as probabilities between 0 and 1 (for example, 0.5 for the 50th percentile)
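
Beyond data and probs, quantile() accepts further arguments such as na.rm and names. The sketch below uses an illustrative vector with one missing value (not part of the original examples) to show how these two arguments behave.

```r
# Illustrative data with one missing value (an assumption for this sketch)
x <- c(2, 13, 5, NA, 36, 12, 50)

# With the default na.rm = FALSE, quantile() raises an error when NAs are present;
# tryCatch() turns that error into the string "error" so the script keeps running
res_error <- tryCatch(quantile(x, probs = 0.5), error = function(e) "error")

# na.rm = TRUE drops the NA before the percentile is computed
res_median <- quantile(x, probs = 0.5, na.rm = TRUE)

# names = FALSE returns a plain numeric value without the "50%" label
res_plain <- quantile(x, probs = 0.5, na.rm = TRUE, names = FALSE)

res_median
res_plain
```

Dropping missing values is only one way to handle them; whether it is appropriate depends on the data.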

Example 1: Calculate percentile

To calculate the percentile we simply pass the data and the value of the required percentile.




# Sample data
x <- c(2, 13, 5, 36, 12, 50)

# 50th percentile (the median)
res <- quantile(x, probs = 0.5)

res

Output:

 50%  
12.5

Example 2: Calculate percentiles of vector

We can calculate multiple percentiles at once. To do so, pass a vector of probabilities to the probs parameter instead of a single value.




# Sample data
x <- c(2, 13, 5, 36, 12, 50)

# 50th and 75th percentiles in one call
res <- quantile(x, probs = c(0.5, 0.75))

res

Output:

 50%   75%  
12.50 30.25 
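
The 75% value of 30.25 does not appear in the data because quantile() interpolates between neighbouring sorted values. As a rough sketch, the calculation below reproduces that number by hand, assuming the default method (type = 7) is in use; the variable names are illustrative.

```r
x <- c(2, 13, 5, 36, 12, 50)
p <- 0.75

xs <- sort(x)            # 2 5 12 13 36 50
n  <- length(xs)

# Fractional position of the percentile in the sorted data (type 7 rule)
h  <- (n - 1) * p + 1    # 4.75
lo <- floor(h)           # 4
hi <- ceiling(h)         # 5

# Linear interpolation between the 4th and 5th sorted values:
# 13 + 0.75 * (36 - 13)
manual <- xs[lo] + (h - lo) * (xs[hi] - xs[lo])

manual
```

Other values of the type argument use different rules, so they can give slightly different results on small samples.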

Example 3: Calculate percentile in dataframe

Sometimes we need the percentiles of a dataframe column. The process stays the same: pass the column (for example, df$x) in place of data, along with the percentile values to be calculated.




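# Sample dataframe with one numeric and one character column
df <- data.frame(x = c(2, 13, 5, 36, 12, 50),
                 y = c('a', 'b', 'c', 'c', 'c', 'b'))

# 35th and 70th percentiles of column x
res <- quantile(df$x, probs = c(0.35, 0.7))

res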

Output:

 35%   70%  
10.25 24.50 

Example 4: Percentiles of several or all columns

We can also compute percentiles for several dataframe columns at once, or for all numeric columns. For this we use the apply() function: we pass it a dataframe containing only numeric columns, a margin of 2 so that the function runs column by column, and the quantile function to apply to each column.

Syntax: apply(dataframe, MARGIN, function)




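# Sample dataframe with two numeric columns and one character column
df <- data.frame(x = c(2, 13, 5, 36, 12, 50),
                 y = c('a', 'b', 'c', 'c', 'c', 'b'),
                 z = c(2.1, 6, 3.8, 4.8, 2.2, 1.1))

# Keep only the numeric columns
sub_df <- df[, c('x', 'z')]

# MARGIN = 2 applies the function to each column
res <- apply(sub_df, 2, function(x) quantile(x, probs = 0.5))

res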

Output:

   x    z  
12.5  3.0 
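
When the numeric columns are not known in advance, they can be picked out programmatically instead of by name. The following is one possible sketch using sapply() and is.numeric(); the helper name num_cols is illustrative, and several percentiles are requested at once to show the shape of the result.

```r
df <- data.frame(x = c(2, 13, 5, 36, 12, 50),
                 y = c('a', 'b', 'c', 'c', 'c', 'b'),
                 z = c(2.1, 6, 3.8, 4.8, 2.2, 1.1))

# Logical vector marking which columns are numeric (x and z here)
num_cols <- sapply(df, is.numeric)

# One column of percentiles per numeric column of df
res <- sapply(df[num_cols], quantile, probs = c(0.25, 0.5, 0.75))

res
```

The result is a matrix with one row per requested percentile and one column per numeric column.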

Example 5: Calculate percentiles by group

We can also group values together and find the percentile within each group. For this, we use the group_by() function from the dplyr package, and then apply the quantile function inside summarize().




library(dplyr)

df <- data.frame(x = c(2, 13, 5, 36, 12, 50),
                 y = c('a', 'b', 'c', 'c', 'c', 'b'))

# Median of x within each group of y
df %>%
  group_by(y) %>%
  summarize(res = quantile(x, probs = 0.5))

Output:

A tibble: 3 x 2
y       res
<chr> <dbl>
a       2  
b      31.5
c      12 
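
If installing dplyr is not an option, a similar per-group result can be obtained in base R. The sketch below uses tapply() as one alternative; it is not the approach used in the example above.

```r
df <- data.frame(x = c(2, 13, 5, 36, 12, 50),
                 y = c('a', 'b', 'c', 'c', 'c', 'b'))

# Split df$x by the groups in df$y and compute the median of each group;
# probs is passed through to quantile()
res <- tapply(df$x, df$y, quantile, probs = 0.5)

res
```

Here the result is a named array with one entry per group rather than a tibble.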

Example 6: Visualizing percentiles

Plotting the sorted values against their percentiles makes the distribution easier to understand.




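df <- data.frame(x = c(2, 13, 5, 36, 12, 50),
                 y = c('a', 'b', 'c', 'c', 'c', 'b'),
                 z = c(2.1, 6, 3.8, 4.8, 2.2, 1.1))

n <- length(df$x)

# Percentile positions 0, 0.2, ..., 1 on the x-axis, sorted values of x on the y-axis
plot((0:(n - 1)) / (n - 1), sort(df$x), type = 'h',
     xlab = "Percentile",
     ylab = "Value")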

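
A percentile plot maps percentiles to values. The reverse lookup, from a value to the share of data at or below it, can be sketched with the ecdf() function from base R. The names Fn and rank_13 are illustrative.

```r
x <- c(2, 13, 5, 36, 12, 50)

# ecdf() returns a function giving the fraction of observations <= its argument
Fn <- ecdf(x)

# Share of the data at or below 13 (four of the six values)
rank_13 <- Fn(13)

rank_13
```

Because ecdf() is a step function while quantile() interpolates by default, the two are not exact inverses on small samples.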