Skewness and Kurtosis in R Programming

In statistics, skewness and kurtosis are numerical measures that describe the shape of a data distribution. They complement graphical methods such as plots and histograms: skewness quantifies asymmetry, and kurtosis describes the peak and tails. Both are often used as a quick check of how far a data set departs from a normal distribution.

To calculate skewness and kurtosis in R, the moments package is required. Install it once with install.packages("moments") and load it with library(moments).

Skewness

Skewness is a numerical measure of the asymmetry of a distribution or data set. It indicates where the majority of the data values lie relative to the mean.

Formula:
{\displaystyle \gamma_{1}=\frac{\frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{3}}{\left(\frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}\right)^{3 / 2}}}
where,

\gamma_{1} represents the coefficient of skewness
x_{i} represents the i^\text{th} value in the data vector
\bar{x} represents the mean of the data vector
n represents the total number of observations
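
As a minimal sketch, the formula above can be evaluated directly in base R. The helper name skewness_manual is hypothetical and used only for illustration; it is meant to mirror the moment-based formula, not to replace skewness() from the moments package.

```r
# Hypothetical helper: moment-based skewness, following the formula above
skewness_manual <- function(x) {
  n <- length(x)
  d <- x - mean(x)      # deviations from the mean
  m2 <- sum(d^2) / n    # second central moment
  m3 <- sum(d^3) / n    # third central moment
  m3 / m2^(3 / 2)
}

x <- c(40, 41, 42, 43, 50)
skew_x <- skewness_manual(x)
print(round(skew_x, 4))
```

The value obtained this way should agree with the one printed by skewness(x) for the same vector in the positive skew example.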

There exist 3 types of skewness values on the basis of which the asymmetry of the graph is decided. These are as follows:

Positive Skew

Positive skewness describes a distribution whose tail extends towards higher values.

If the coefficient of skewness is greater than 0 i.e. \gamma_{1}>0  , then the graph is said to be positively skewed with the majority of data values less than the mean. Most of the values are concentrated on the left side of the graph.
Example:  

R

# Required for skewness() function
library(moments)
 
# Defining data vector
x <- c(40, 41, 42, 43, 50)
 
# Write the plot to a PNG file
png(file = "positiveskew.png")
 
# Print skewness of distribution
print(skewness(x))
 
# Histogram of distribution
hist(x)
 
# Saving the file
dev.off()

                    

Output:

[1] 1.2099

Graphical Representation: 

Simple Histogram
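
The statement that most values lie below the mean in a positively skewed data set can be checked directly. The following sketch uses only base R and the same data vector as the example above; the variable names are illustrative.

```r
x <- c(40, 41, 42, 43, 50)

# Share of observations that fall below the mean
share_below_mean <- mean(x < mean(x))
print(share_below_mean)

# In this data set the long right tail pulls the mean above the median
print(mean(x) > median(x))
```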

Zero Skewness or Symmetric

Zero skewness, or symmetry, describes a distribution that is balanced around the mean, with the two sides mirroring each other.

If the coefficient of skewness is equal to 0 or approximately close to 0 i.e. \gamma_{1}=0  , then the graph is said to be symmetric. Normally distributed data has this property, although zero skewness alone does not prove that the data is normally distributed.
Example:  

R

# Required for skewness() function
library(moments)
 
# Defining normally distributed data vector
x <- rnorm(50, 10, 10)
 
# Write the plot to a PNG file
png(file = "zeroskewness.png")
 
# Print skewness of distribution
print(skewness(x))
 
# Histogram of distribution
hist(x)
 
# Saving the file
dev.off()

                    

Output (the exact value varies between runs because rnorm() draws random data):

[1] -0.02991511

Graphical Representation: 

Simple Histogram
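
Symmetry and normality are not the same thing. As a small illustration, the sketch below computes the moment-based skewness of a symmetric but clearly non-normal data set in base R. The helper skewness_manual and the vector x_sym are hypothetical names introduced for this example.

```r
# Hypothetical helper implementing the moment-based skewness formula
skewness_manual <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

# Evenly spaced values: symmetric around 3, but not normally distributed
x_sym <- c(1, 2, 3, 4, 5)
skew_sym <- skewness_manual(x_sym)
print(skew_sym)
```

The positive and negative cubed deviations cancel, so the skewness is zero even though the data is not bell-shaped.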

Negative Skew

A negatively skewed, or left-skewed, distribution is asymmetric with a tail that extends towards lower values.

If the coefficient of skewness is less than 0 i.e. \gamma_{1}<0  , then the graph is said to be negatively skewed with the majority of data values greater than the mean. Most of the values are concentrated on the right side of the graph.
Example: 

R

# Required for skewness() function
library(moments)
 
# Defining data vector
x <- c(10, 11, 21, 22, 23, 25)
 
# Write the plot to a PNG file
png(file = "negativeskew.png")
 
# Print skewness of distribution
print(skewness(x))
 
# Histogram of distribution
hist(x)
 
# Saving the file
dev.off()

                    

Output: 

[1] -0.5794294

Graphical Representation: 

Simple Histogram
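
Negative and positive skew are mirror images of each other. The sketch below, assuming the same moment-based formula and a hypothetical base R helper skewness_manual, shows that reflecting the data around zero flips the sign of the coefficient.

```r
# Hypothetical helper implementing the moment-based skewness formula
skewness_manual <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3 / 2)
}

x <- c(10, 11, 21, 22, 23, 25)
skew_x <- skewness_manual(x)           # negative: tail towards lower values
skew_reflected <- skewness_manual(-x)  # same magnitude, opposite sign

print(c(original = skew_x, reflected = skew_reflected))
```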

Kurtosis

Kurtosis is a numerical measure that describes the peakedness or flatness of a distribution together with the weight of its tails, relative to the normal distribution.
Formula: 
{\displaystyle \gamma_{2}=\frac{\frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{4}}{\left(\frac{1}{n} \sum_{i=1}^{n}\left(x_{i}-\bar{x}\right)^{2}\right)^{2}} }
where,

\gamma_{2} represents the coefficient of kurtosis
x_{i} represents the i^\text{th} value in the data vector
\bar{x} represents the mean of the data vector
n represents the total number of observations
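
The kurtosis formula can be sketched in base R in the same way as skewness. The helper name kurtosis_manual is hypothetical. Note that this formula gives kurtosis on the scale where the normal distribution equals 3; subtracting 3 gives what is usually called excess kurtosis.

```r
# Hypothetical helper: moment-based kurtosis, following the formula above
kurtosis_manual <- function(x) {
  n <- length(x)
  d <- x - mean(x)      # deviations from the mean
  m2 <- sum(d^2) / n    # second central moment
  m4 <- sum(d^4) / n    # fourth central moment
  m4 / m2^2
}

x <- c(1, 2, 3, 4, 5)
kurt_x <- kurtosis_manual(x)
excess_kurt_x <- kurt_x - 3   # excess kurtosis relative to the normal distribution

print(kurt_x)
print(excess_kurt_x)
```

For this vector the second moment is 2 and the fourth moment is 6.8, so the coefficient is 6.8 / 4 = 1.7.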

There exist 3 types of Kurtosis values on the basis of which the sharpness of the peak is measured. These are as follows:

Platykurtic

A platykurtic distribution has lighter tails than the normal distribution.

If the coefficient of kurtosis is less than 3 i.e. \gamma_{2}<3  , then the data distribution is platykurtic. Being platykurtic doesn’t mean that the graph is flat-topped.
Example: 

R

# Required for kurtosis() function
library(moments)
 
# Defining data vector
x <- c(rep(61, each = 10), rep(64, each = 18),
rep(65, each = 23), rep(67, each = 32), rep(70, each = 27),
rep(73, each = 17))
 
# Write the plot to a PNG file
png(file = "platykurtic.png")
 
# Print kurtosis of distribution
print(kurtosis(x))
 
# Histogram of distribution
hist(x)
 
# Saving the file
dev.off()

                    

Output: 

[1] 2.258318

Graphical Representation: 

Histogram
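
A flat, uniform distribution is a standard platykurtic case, with a theoretical kurtosis of 1.8. The sketch below approximates it with an evenly spaced grid of values so that the result is deterministic. The helper kurtosis_manual and the grid size of 1001 points are assumptions made for this illustration.

```r
# Hypothetical helper implementing the moment-based kurtosis formula
kurtosis_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

# Evenly spaced values on [0, 1] stand in for a uniform distribution
x_grid <- seq(0, 1, length.out = 1001)
kurt_grid <- kurtosis_manual(x_grid)

print(kurt_grid)       # close to 1.8
print(kurt_grid < 3)   # platykurtic
```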

Mesokurtic

A mesokurtic distribution has tails similar in weight to those of the normal distribution.

If the coefficient of kurtosis is equal to 3 or approximately close to 3 i.e. \gamma_{2}=3  , then the data distribution is mesokurtic. The normal distribution has a kurtosis of exactly 3, so samples drawn from it give values close to 3.
Example: 

R

# Required for kurtosis() function
library(moments)
 
# Defining data vector
x <- rnorm(100)
 
# Write the plot to a PNG file
png(file = "mesokurtic.png")
 
# Print kurtosis of distribution
print(kurtosis(x))
 
# Histogram of distribution
hist(x)
 
# Saving the file
dev.off()

                    

Output (the exact value varies between runs because rnorm() draws random data):

[1] 2.963836

Graphical Representation: 

Histogram
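
Because the example draws random data, the sample kurtosis only approximates 3, and small samples can stray noticeably. The sketch below, using a hypothetical base R helper kurtosis_manual and an arbitrary seed chosen only for reproducibility, compares a few sample sizes.

```r
# Hypothetical helper implementing the moment-based kurtosis formula
kurtosis_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

set.seed(42)  # arbitrary seed so that repeated runs give the same numbers

# Sample kurtosis of standard normal data for increasing sample sizes
sizes <- c(100, 10000, 100000)
kurt_by_size <- sapply(sizes, function(n) kurtosis_manual(rnorm(n)))

print(data.frame(n = sizes, kurtosis = kurt_by_size))
```

Larger samples should generally give values nearer to 3, which is why a single run with 100 observations is only a rough indication.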

Leptokurtic

A leptokurtic distribution has heavier tails than the normal distribution.

If the coefficient of kurtosis is greater than 3 i.e. \gamma_{2}>3  , then the data distribution is leptokurtic and shows a sharp peak on the graph.
Example: 

R

# Required for kurtosis() function
library(moments)
 
# Defining data vector
x <- c(rep(61, each = 2), rep(64, each = 5),
rep(65, each = 42), rep(67, each = 12), rep(70, each = 10))
 
# Write the plot to a PNG file
png(file = "leptokurtic.png")
 
# Print kurtosis of distribution
print(kurtosis(x))
 
# Histogram of distribution
hist(x)
 
# Saving the file
dev.off()

                    

Output:  

[1] 3.696788

Graphical Representation: 

Histogram
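
Heavy tails in practice often mean a few values lying far from the bulk of the data. As a rough sketch, the made-up vector below has 98 identical values and two extreme ones, and its kurtosis is far above 3. The helper kurtosis_manual and the vector x_tails are hypothetical names introduced for this example.

```r
# Hypothetical helper implementing the moment-based kurtosis formula
kurtosis_manual <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

# 98 values at the centre plus one extreme value on each side
x_tails <- c(rep(0, 98), -10, 10)
kurt_tails <- kurtosis_manual(x_tails)

print(kurt_tails)
```

Here the second moment is 2 and the fourth moment is 200, so the coefficient is 200 / 4 = 50.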



Last Updated : 05 Jul, 2023