# Variability in R Programming

**Variability** (also known as **statistical dispersion**) is another feature of descriptive statistics. Together, measures of central tendency and measures of variability make up descriptive statistics. Variability describes how widely the values of a data set are spread around a central point.

**Example:** Suppose there are two data sets with the same mean value:

A = 4, 4, 5, 6, 6

Mean(A) = 5

B = 1, 1, 5, 9, 9

Mean(B) = 5

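The mean alone cannot tell these two data sets apart, even though the values of B are spread much more widely than those of A. To differentiate between them, R offers various measures of variability.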
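As a quick illustration, the two example data sets can be compared directly in R. This is a minimal sketch: the variable names `A`, `B`, `mean_A`, and so on are chosen here for illustration, and `var()` (covered in the Variance section) is used as just one possible measure of spread.

```r
# The two example data sets
A <- c(4, 4, 5, 6, 6)
B <- c(1, 1, 5, 9, 9)

# Both data sets have the same mean
mean_A <- mean(A)
mean_B <- mean(B)
print(c(mean_A, mean_B))

# Their variances differ, which reveals the difference in spread
var_A <- var(A)
var_B <- var(B)
print(c(var_A, var_B))
```

The means are identical, while the variance of `B` comes out much larger than that of `A`.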

#### Measures of Variability

R offers the following measures of variability to differentiate between data sets:

- Variance
- Standard Deviation
- Range
- Mean Deviation
- Interquartile Range

#### Variance

Variance is a measure that shows how far each value lies from a particular point, preferably the mean. Mathematically, it is defined as the average of the squared differences from the mean.

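**Formula:**

$$\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}$$

**where,**

$\sigma^2$ specifies the variance of the data set

$x_i$ specifies the $i$-th value in the data set

$\mu$ specifies the mean of the data set

$n$ specifies the total number of observations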

In R, the built-in function `var()` calculates the variance of a data set. Note that `var()` returns the sample variance, which divides the sum of squared differences by `n - 1` rather than `n`.

**Syntax:** `var(x)`

**Parameter:** `x` is the data vector

**Example:**

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# Print variance of x
print(var(x))
```

**Output:**

[1] 23.76667
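The output above reflects the fact that `var()` divides by `n - 1` (sample variance) rather than by `n` (the plain average of squared differences). The following sketch computes both versions by hand so the difference is visible. The names `sq_dev`, `sample_var`, and `population_var` are illustrative and are not built-in R objects.

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)
n <- length(x)

# Squared differences from the mean
sq_dev <- (x - mean(x))^2

# Sample variance: divide by n - 1 (this is what var() returns)
sample_var <- sum(sq_dev) / (n - 1)

# Population variance: divide by n (the plain average)
population_var <- sum(sq_dev) / n

print(sample_var)
print(population_var)

# Check the manual sample variance against var()
print(isTRUE(all.equal(sample_var, var(x))))
```

For large data sets the two values are close. For a small vector like this one, the gap is noticeable.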

#### Standard Deviation

Standard deviation measures how spread out the data values are with respect to the mean. Mathematically, it is calculated as the square root of the variance.

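**Formula:**

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n}}$$

**where,**

$\sigma$ specifies the standard deviation of the data set

$x_i$ specifies the $i$-th value in the data set

$\mu$ specifies the mean of the data set

$n$ specifies the total number of observations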

In R, the built-in function `sd()` calculates the standard deviation of a data set. It is equivalent to taking the square root of `var(x)`, which is the approach the example uses.

**Example:**

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# Standard deviation
d <- sqrt(var(x))

# Print standard deviation of x
print(d)
```

**Output:**

[1] 4.875107
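As a quick check, base R's `sd()` function should agree with the square root of the variance computed above. The sketch below compares the two. The variable names `sd_builtin` and `sd_manual` are illustrative.

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# Standard deviation using the built-in function
sd_builtin <- sd(x)

# Standard deviation as the square root of the variance
sd_manual <- sqrt(var(x))

print(sd_builtin)

# Both approaches should give the same value
print(isTRUE(all.equal(sd_builtin, sd_manual)))
```

Like `var()`, `sd()` uses the `n - 1` denominator, so it reports the sample standard deviation.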

#### Range

Range is the difference between the maximum and minimum values of a data set. In R, **max()** and **min()** are used to compute it, unlike the **range()** function, which returns the minimum and maximum values of the data set rather than their difference.

**Example:**

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# range() function output
print(range(x))

# Using max() and min() function
# to calculate the range of data set
print(max(x) - min(x))
```

**Output:**

[1]  5 16

[1] 11
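Since `range()` already returns the minimum and maximum as a two-element vector, the statistical range can also be obtained by applying base R's `diff()` to that result. This is one possible shortcut, not the only way to do it. The names `r` and `spread` are illustrative.

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# range() returns c(minimum, maximum)
r <- range(x)

# diff() subtracts the first element from the second: maximum - minimum
spread <- diff(r)

print(spread)
```

The result matches `max(x) - min(x)` from the example above.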

#### Mean Deviation

Mean deviation is the arithmetic mean of the absolute differences between each value and a central value. The central value can be the mean, median, or mode.

**Formula:**

$$MD = \frac{\sum_{i=1}^{n} |x_i - \mu|}{n}$$

**where,**

$x_i$ specifies the $i$-th value in the data set

$\mu$ specifies the mean of the data set

$n$ specifies the total number of observations

Base R has no built-in function for the mean deviation about the mean (the built-in `mad()` computes the median absolute deviation, which is a different measure), so the example computes it directly from the formula.

**Example:**

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# Mean deviation
md <- sum(abs(x - mean(x))) / length(x)

# Print mean deviation
print(md)
```

**Output:**

[1] 4.166667
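Because the central value can be the mean or the median, the calculation can be wrapped in a small reusable function. The sketch below defines a hypothetical helper, `mean_deviation()`, which is not part of base R. Its `center` argument is an assumption made here for illustration and takes the function used to compute the central value.

```r
# Hypothetical helper (not a base R function):
# mean of absolute differences from a chosen central value
mean_deviation <- function(x, center = mean) {
  mean(abs(x - center(x)))
}

# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# Mean deviation about the mean (the default)
md_mean <- mean_deviation(x)

# Mean deviation about the median
md_median <- mean_deviation(x, center = median)

print(md_mean)
print(md_median)
```

For this particular vector the two results happen to coincide. In general, the deviation about the mean and about the median can differ.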

#### Interquartile Range

The interquartile range is based on splitting a data set into parts called quartiles. Three quartile values (Q1, Q2, Q3) divide the whole data set into 4 equal parts. Q2 is the median of the whole data set.

Mathematically, the interquartile range is depicted as:

IQR = Q3 – Q1

**where,**

Q3 specifies the third quartile (roughly, the median of the larger half of the values)

Q1 specifies the first quartile (roughly, the median of the smaller half of the values)

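In R, the built-in function `IQR()` calculates the interquartile range of a data set. It obtains the quartiles from `quantile()`, which interpolates between data points by default, so the result can differ from a hand calculation based on the medians of the two halves.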

**Syntax:** `IQR(x)`

**Parameter:** `x` specifies the data set

**Example:**

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# Print Interquartile range
print(IQR(x))
```

**Output:**

[1] 8.5
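To see where this value comes from, the quartiles can be computed explicitly with base R's `quantile()` and subtracted. This is a minimal sketch that assumes the default settings of `quantile()`. The names `q` and `iqr_manual` are illustrative.

```r
# Defining vector
x <- c(5, 5, 8, 12, 15, 16)

# First and third quartiles (25th and 75th percentiles)
q <- quantile(x, probs = c(0.25, 0.75))
print(q)

# IQR = Q3 - Q1; unname() drops the percentile labels
iqr_manual <- unname(q[2] - q[1])
print(iqr_manual)

# Check against the built-in function
print(isTRUE(all.equal(iqr_manual, IQR(x))))
```

With the default interpolation, Q1 and Q3 fall between observed data points, which is why the interquartile range is not a difference of two values from the vector.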