Covariance Matrix

A covariance matrix is a matrix that describes the covariance between each pair of elements of a random vector. It is also known as the variance-covariance matrix because the variance of each element lies on the main diagonal and the covariances lie in the off-diagonal entries. A covariance matrix is always square, symmetric, and positive semi-definite. It is widely used in stochastic modeling and principal component analysis.

What is Covariance Matrix?

The variance-covariance matrix is a square matrix whose diagonal elements are variances and whose off-diagonal elements are covariances. The covariance between two variables can take any real value: positive, negative, or zero. A positive covariance suggests that the two variables tend to increase or decrease together, whereas a negative covariance indicates that one tends to decrease as the other increases. If two variables do not vary together linearly, their covariance is zero.

Covariance Matrix Example 

Suppose there are two data sets, X = [10, 5] and Y = [3, 9]. The sample variance of X is 12.5 and the sample variance of Y is 18. The sample covariance between the two variables is -15. The covariance matrix is as follows:

\begin{bmatrix} \text{Variance of Set X} & \text{Covariance of Both Sets}\\ \text{Covariance of Both Sets} & \text{Variance of Set Y} \end{bmatrix}=\begin{bmatrix} 12.5 & -15\\ -15& 18 \end{bmatrix}
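As a quick check of the numbers above, the following minimal Python sketch recomputes the matrix for X and Y. The helper name `sample_cov` is our own, and it assumes the sample (n – 1) divisor, which is the divisor that reproduces the values quoted above.

```python
def sample_cov(a, b):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)


X = [10, 5]
Y = [3, 9]

# cov(X, X) is the variance of X, so one helper fills every cell.
matrix = [
    [sample_cov(X, X), sample_cov(X, Y)],
    [sample_cov(Y, X), sample_cov(Y, Y)],
]
print(matrix)  # [[12.5, -15.0], [-15.0, 18.0]]
```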

Covariance Matrix Formula

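The general form of a covariance matrix for k variables x_1, …, x_k is given as follows:

\begin{bmatrix} \mathrm{var}(x_1) & \mathrm{cov}(x_1, x_2) & \cdots & \mathrm{cov}(x_1, x_k) \\ \mathrm{cov}(x_2, x_1) & \mathrm{var}(x_2) & \cdots & \mathrm{cov}(x_2, x_k) \\ \vdots & \vdots & \ddots & \vdots \\ \mathrm{cov}(x_k, x_1) & \mathrm{cov}(x_k, x_2) & \cdots & \mathrm{var}(x_k) \end{bmatrix}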

where,

  • Sample Variance: var(x) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1}
  • Sample Covariance: cov(x, y) = \frac{\sum_{i=1}^{n}\left (x_{i} -\overline{x}\right )\left(y_{i}-\overline{y}\right)}{n-1}
  • Population Variance: var(x) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu\right )^{2} }{n}
  • Population Covariance: cov(x, y) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu_{x}\right )\left ( y_{i}-\mu_{y} \right ) }{n}

Here, μ is the mean of the population (μ_x and μ_y for the variables x and y)

\overline{x} is the mean of the sample

n is the number of observations

x_i is the i-th observation of the variable x
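The only difference between the sample and population formulas is the divisor. The sketch below makes that explicit with a single function; the parameter name `ddof` (borrowed from the NumPy convention) and the observations `x` and `y` are our own illustrative choices, not part of the formulas above.

```python
def covariance(x, y, ddof=1):
    """Covariance of x and y.

    ddof=1 divides by n - 1 (sample formula);
    ddof=0 divides by n (population formula).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    total = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return total / (n - ddof)


def variance(x, ddof=1):
    """Variance is the covariance of a variable with itself."""
    return covariance(x, x, ddof)


# Hypothetical observations, chosen only for illustration.
x = [2, 4, 6, 8]
y = [1, 3, 2, 6]

print(variance(x, ddof=1))       # sample variance
print(variance(x, ddof=0))       # population variance
print(covariance(x, y, ddof=1))  # sample covariance
print(covariance(x, y, ddof=0))  # population covariance
```

For the same data the sample value is always larger in magnitude than the population value, because the same sum is divided by n – 1 instead of n.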

Let’s see the format of Covariance Matrix of 2 ⨯ 2 and 3 ⨯ 3

2 ⨯ 2 Covariance Matrix

We know that in a 2 ⨯ 2 matrix there are two rows and two columns. Hence, the 2 ⨯ 2 Covariance Matrix can be expressed as \begin{bmatrix}\mathrm{var(x)}& \mathrm{cov(x,y)} \\\mathrm{cov(x,y)} &\mathrm{var(y)}\end{bmatrix}

3 ⨯ 3 Covariance Matrix

In a 3⨯3 Matrix there are 3 rows and 3 columns. We know that in a Covariance Matrix the diagonal elements are variance and non-diagonal elements are covariance. Hence, a 3⨯3 Covariance Matrix can be given as \begin{bmatrix}\mathrm{var(x)}&\mathrm{cov(x,y)} &\mathrm{cov(x,z)} \\\mathrm{cov(x,y)} &\mathrm{var(y)} &\mathrm{cov(y,z)} \\\mathrm{cov(x,z)} &\mathrm{cov(y,z)} &\mathrm{var(z)} \\\end{bmatrix}
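The same layout extends to any number of variables. As a rough sketch, the function below builds the full matrix from a list of variables; the name `covariance_matrix` and the data for `x`, `y`, and `z` are hypothetical, and the sample (n – 1) divisor is assumed by default.

```python
def covariance_matrix(variables, ddof=1):
    """Return the k x k covariance matrix for a list of k equal-length variables."""
    n = len(variables[0])
    k = len(variables)
    means = [sum(v) / n for v in variables]
    matrix = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i, k):
            total = sum(
                (a - means[i]) * (b - means[j])
                for a, b in zip(variables[i], variables[j])
            )
            # Fill both (i, j) and (j, i): the matrix is symmetric.
            matrix[i][j] = matrix[j][i] = total / (n - ddof)
    return matrix


# Hypothetical data for three variables x, y, z.
x = [1, 2, 3, 4]
y = [2, 4, 6, 8]
z = [4, 3, 2, 1]

for row in covariance_matrix([x, y, z]):
    print(row)
```

Three variables give three rows and three columns, with the variances on the diagonal.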

How to Find Covariance Matrix?

The dimensions of a covariance matrix are determined by the number of variables in a given data set. If there are only two variables in a set, then the covariance matrix would have two rows and two columns. Similarly, if a data set has three variables, then its covariance matrix would have three rows and three columns. 

The data pertains to marks scored by Anna, Caroline, and Laura in Psychology and History. Make a covariance matrix.

| Student | Psychology (X) | History (Y) |
|---|---|---|
| Anna | 80 | 70 |
| Caroline | 63 | 20 |
| Laura | 100 | 50 |

The following steps have to be followed:

Step 1: Find the mean of variable X. Sum up all the observations in variable X and divide the sum by the number of observations. Thus, (80 + 63 + 100)/3 = 81.

Step 2: Subtract the mean from all observations. (80 – 81), (63 – 81), (100 – 81).

Step 3: Take the squares of the differences obtained above and then add them up. Thus, (80 – 81)² + (63 – 81)² + (100 – 81)² = 686.

Step 4: Find the variance of X by dividing the value obtained in Step 3 by 1 less than the total number of observations. var(X) = [(80 – 81)² + (63 – 81)² + (100 – 81)²] / (3 – 1) = 686 / 2 = 343.

Step 5: Similarly, repeat steps 1 to 4 to calculate the variance of Y. The mean of Y is (70 + 20 + 50)/3 ≈ 47, and Var(Y) ≈ 633.

Step 6: Choose a pair of variables.

Step 7: Subtract the mean of the first variable (X) from all observations; (80 – 81), (63 – 81), (100 – 81).

Step 8: Repeat the same for variable Y, using its mean of approximately 47; (70 – 47), (20 – 47), (50 – 47).

Step 9: Multiply the corresponding terms: (80 – 81)(70 – 47), (63 – 81)(20 – 47), (100 – 81)(50 – 47).

Step 10: Find the covariance by adding these values and dividing the sum by (n – 1). Cov(X, Y) = [(80 – 81)(70 – 47) + (63 – 81)(20 – 47) + (100 – 81)(50 – 47)] / (3 – 1) = 520 / 2 = 260.

Step 11: Use the general formula for the covariance matrix to arrange the terms. The matrix becomes: \begin{bmatrix} 343 & 260\\ 260& 633 \end{bmatrix}
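The steps above can be sketched in Python as follows. The variable names are our own, and the script keeps the unrounded mean of Y (140/3 ≈ 46.67) rather than 47. Running it gives var(X) = 343, var(Y) ≈ 633.33, and cov(X, Y) = 260.

```python
psychology = [80, 63, 100]  # X: Anna, Caroline, Laura
history = [70, 20, 50]      # Y: Anna, Caroline, Laura
n = len(psychology)

# Steps 1-2: mean of each variable and deviations from it.
mean_x = sum(psychology) / n
mean_y = sum(history) / n
dev_x = [x - mean_x for x in psychology]
dev_y = [y - mean_y for y in history]

# Steps 3-5: sum of squared deviations divided by n - 1.
var_x = sum(d * d for d in dev_x) / (n - 1)
var_y = sum(d * d for d in dev_y) / (n - 1)

# Steps 6-10: sum of products of paired deviations divided by n - 1.
cov_xy = sum(dx * dy for dx, dy in zip(dev_x, dev_y)) / (n - 1)

# Step 11: arrange the values in the 2 x 2 layout.
matrix = [[var_x, cov_xy], [cov_xy, var_y]]
for row in matrix:
    print([round(value, 2) for value in row])
```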

Properties of Covariance Matrix

The Properties of Covariance Matrix are mentioned below:

  • A covariance matrix is always square, implying that the number of rows in a covariance matrix is always equal to the number of columns in it.
  • A covariance matrix is always symmetric, implying that the transpose of a covariance matrix is always equal to the original matrix.
  • A covariance matrix is always positive semi-definite.
  • The eigenvalues of a covariance matrix are always real and non-negative.
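These properties can be checked numerically. The sketch below does so for the matrix of the earlier two-observation example (X = [10, 5], Y = [3, 9]); it relies on the closed-form eigenvalues of a symmetric 2 ⨯ 2 matrix, so it is an illustration for that size only, and all names in it are our own.

```python
import math

sigma = [[12.5, -15.0], [-15.0, 18.0]]  # matrix for X = [10, 5], Y = [3, 9]

# Square: as many rows as columns.
is_square = all(len(row) == len(sigma) for row in sigma)

# Symmetric: entry (i, j) equals entry (j, i).
is_symmetric = all(
    sigma[i][j] == sigma[j][i]
    for i in range(len(sigma))
    for j in range(len(sigma))
)

# Eigenvalues of a symmetric 2 x 2 matrix [[a, b], [b, d]] in closed form.
a, b, d = sigma[0][0], sigma[0][1], sigma[1][1]
center = (a + d) / 2
radius = math.sqrt(((a - d) / 2) ** 2 + b ** 2)
eigenvalues = (center - radius, center + radius)


def quadratic_form(v):
    """v^T * sigma * v, which is never negative for a positive semi-definite matrix."""
    return sum(v[i] * sigma[i][j] * v[j] for i in range(2) for j in range(2))


print(is_square, is_symmetric)
print(eigenvalues)
print(quadratic_form([1.0, -1.0]), quadratic_form([6.0, 5.0]))
```

One eigenvalue here is exactly zero, and the quadratic form vanishes for the vector [6, 5]. That is why the property is stated as positive semi-definite rather than positive definite.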

Solved Examples on Covariance Matrix

Example 1: The marks scored by 3 students in Physics and Biology are given below:

| Student | Physics (X) | Biology (Y) |
|---|---|---|
| A | 92 | 80 |
| B | 60 | 30 |
| C | 100 | 70 |

Calculate Covariance Matrix from the above data.

Solution:

Sample variance is given by \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1}.

Here, \overline{x} = 84, n = 3

var(x) = [(92 – 84)² + (60 – 84)² + (100 – 84)²] / (3 – 1) = 448

Also, \overline{y} = 60, n = 3

var(y) = [(80 – 60)² + (30 – 60)² + (70 – 60)²] / (3 – 1) = 700

Now, cov(x, y) = cov(y, x) = [(92 – 84)(80 – 60) + (60 – 84)(30 – 60) + (100 – 84)(70 – 60)] / (3 – 1) = 520.

The sample covariance matrix is given as: \begin{bmatrix} 448 & 520\\ 520& 700 \end{bmatrix}

Example 2. Prepare the population covariance matrix from the following table:

| Age | Number of People |
|---|---|
| 29 | 68 |
| 26 | 60 |
| 30 | 58 |
| 35 | 40 |

Solution:

Let x be the number of people and y be the age. Population variance is given by \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu\right )^{2} }{n}.

Here, μx = 56.5, n = 4

var(x) = [(68 – 56.5)² + (60 – 56.5)² + (58 – 56.5)² + (40 – 56.5)²] / 4 = 104.75

Also, μy = 30, n = 4

var(y) = [(29 – 30)² + (26 – 30)² + (30 – 30)² + (35 – 30)²] / 4 = 10.5

Now, cov(x, y) = \frac{\sum_{i=1}^{4}\left ( x_{i} -\mu_{x}\right )\left ( y_{i}-\mu_{y} \right ) }{4}

cov(x, y) = [(68 – 56.5)(29 – 30) + (60 – 56.5)(26 – 30) + (58 – 56.5)(30 – 30) + (40 – 56.5)(35 – 30)] / 4 = -108 / 4 = -27

The population covariance matrix is given as: \begin{bmatrix} 104.75 &-27 \\ -27& 10.5 \end{bmatrix}
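The same result can be reproduced with a short script. In this sketch the table is stored row by row and then split into columns; the helper name `population_cov` is our own, and it divides by n because the question asks for the population matrix.

```python
# Rows of the table: (age, number of people).
rows = [(29, 68), (26, 60), (30, 58), (35, 40)]

# Unzip the rows into one list per column.
ages, people = (list(column) for column in zip(*rows))


def population_cov(a, b):
    """Population covariance: divide by n, not n - 1."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / n


# x = number of people, y = age, as in the solution above.
matrix = [
    [population_cov(people, people), population_cov(people, ages)],
    [population_cov(ages, people), population_cov(ages, ages)],
]
print(matrix)  # [[104.75, -27.0], [-27.0, 10.5]]
```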

Example 3. Interpret the following covariance matrix:

\begin{bmatrix} & X & Y & Z\\ X & 60 & 32 & -4\\ Y & 32 & 30 & 0\\ Z & -4 & 0 & 80 \end{bmatrix}

Solution:

  1. The diagonal elements 60, 30, and 80 indicate the variance in data sets X, Y, and Z respectively. Y shows the lowest variance whereas Z displays the highest variance.
  2. The covariance for X and Y is 32. As this is a positive number, it means that when X increases (or decreases), Y also tends to increase (or decrease).
  3. The covariance for X and Z is -4. As it is a negative number, it implies that when X increases, Z tends to decrease, and vice versa.
  4. The covariance for Y and Z is 0. This means that there is no linear relationship between the two data sets.
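The reading rules used in this solution (variances on the diagonal, the sign of each off-diagonal entry) can be written down as a small helper. This is only an illustrative sketch; the function name `describe` and the wording of its output are our own.

```python
names = ["X", "Y", "Z"]
sigma = [
    [60, 32, -4],
    [32, 30, 0],
    [-4, 0, 80],
]


def describe(sigma, names):
    """Return one plain-language line per variance and per pair of variables."""
    lines = []
    for i, name in enumerate(names):
        lines.append(f"var({name}) = {sigma[i][i]}")
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            value = sigma[i][j]
            if value > 0:
                trend = "tend to move in the same direction"
            elif value < 0:
                trend = "tend to move in opposite directions"
            else:
                trend = "show no linear relationship"
            lines.append(f"cov({names[i]}, {names[j]}) = {value}: {trend}")
    return lines


for line in describe(sigma, names):
    print(line)
```

Only the upper triangle is visited, since the symmetric lower triangle carries the same information.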

Example 4. Find the sample covariance matrix for the following data:

| X | Y | Z |
|---|---|---|
| 75 | 10.5 | 45 |
| 65 | 12.8 | 65 |
| 22 | 7.3 | 74 |
| 15 | 2.1 | 76 |
| 18 | 9.2 | 56 |

Solution:

Sample variance is given by \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1}.

n = 5, μx = 22.4, var(X) = 321.2 / (5 – 1) = 80.3

μy = 12.58, var(Y) = 132.148 / 4 = 33.037

μz = 64, var(Z) = 570 / 4 = 142.5

cov(X, Y) = \frac{\sum_{i=1}^{5}\left ( x_{i} -22.4\right )\left ( y_{i}-12.58\right ) }{5-1} = -13.865

cov(X, Z) = \frac{\sum_{i=1}^{5}\left ( x_{i} -22.4\right )\left ( z_{i}-64 \right ) }{5-1} = 14.25

cov(Y, Z) = \frac{\sum_{i=1}^{5}\left ( y_{i} -12.58\right )\left ( z_{i}-64 \right ) }{5-1} = -39.525

The covariance matrix is  given as:

\begin{bmatrix} 80.3 & -13.865 &14.25 \\ -13.865 & 33.037 & -39.5250\\ 14.25 & -39.5250 & 142.5 \end{bmatrix}

FAQs on Covariance Matrix

1. Define Covariance Matrix

A covariance matrix is a square matrix that describes the covariance between each pair of elements of a random vector, with the variance of each element on the main diagonal.

2. What is the Formula for Covariance Matrix?

The Formula for Covariance Matrix is given as

\begin{bmatrix} \operatorname{Var}\left(x_1\right) & \cdots & \operatorname{Cov}\left(x_1, x_k\right) \\ \vdots & \ddots & \vdots \\ \operatorname{Cov}\left(x_k, x_1\right) & \cdots & \operatorname{Var}\left(x_k\right) \end{bmatrix}

for k variables x_1, …, x_k, where,

  • Sample Variance: var(x) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\overline{x}\right )^{2} }{n-1}
  • Sample Covariance: cov(x, y) = \frac{\sum_{i=1}^{n}\left (x_{i} -\overline{x}\right )\left(y_{i}-\overline{y}\right)}{n-1}
  • Population Variance: var(x) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu\right )^{2} }{n}
  • Population Covariance: cov(x, y) = \frac{\sum_{i=1}^{n}\left ( x_{i} -\mu_{x}\right )\left ( y_{i}-\mu_{y} \right ) }{n}

3. What is the General Form of a 3 ⨯ 3 Covariance Matrix?

The general form of a 3 ⨯ 3 covariance matrix is given as follows:

\begin{bmatrix}\mathrm{var(x)}&\mathrm{cov(x,y)} &\mathrm{cov(x,z)} \\\mathrm{cov(x,y)} &\mathrm{var(y)} &\mathrm{cov(y,z)} \\\mathrm{cov(x,z)} &\mathrm{cov(y,z)} &\mathrm{var(z)} \\\end{bmatrix}

4. What are the Properties of Covariance Matrix?

A covariance matrix is square, symmetric (its transpose is equal to the matrix itself), and positive semi-definite, so its eigenvalues are always real and non-negative.

5. What are the sectors where Covariance Matrix can be used?

Covariance matrices are used in Mathematics, Machine Learning, Finance, and Economics. For example, the Cholesky decomposition of a covariance matrix is used in Monte Carlo simulation to generate correlated random variables for mathematical models.
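As a rough illustration of the Cholesky use case, the sketch below factors the covariance matrix from Example 1 into a lower-triangular matrix L with L·Lᵀ equal to the covariance matrix, and then draws correlated samples as mean + L·z, where z holds independent standard normal values. The function names `cholesky` and `draw`, the seed, and the sample count are our own assumptions. The factorization shown requires a positive definite matrix, so it would fail for a singular covariance matrix.

```python
import math
import random


def cholesky(a):
    """Lower-triangular L with L * L^T = a, for a symmetric positive-definite matrix."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                lower[i][j] = math.sqrt(a[i][i] - s)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def draw(mean, lower, rng):
    """One correlated sample: mean + L * z, with z independent standard normals."""
    z = [rng.gauss(0.0, 1.0) for _ in mean]
    return [
        m + sum(lower[i][k] * z[k] for k in range(i + 1))
        for i, m in enumerate(mean)
    ]


sigma = [[448.0, 520.0], [520.0, 700.0]]  # covariance matrix from Example 1
mean = [84.0, 60.0]                        # sample means from Example 1
lower = cholesky(sigma)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [draw(mean, lower, rng) for _ in range(20000)]
print(lower)
print(samples[:3])
```

With enough draws, the sample covariance matrix of `samples` should approach `sigma`, which is what makes the factorization useful for simulation.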



Last Updated : 27 Nov, 2023