How to Calculate Correlation in Excel

Correlation is a concept from statistics. In statistical terms, correlation is the linear association between two entities. Put simply, it describes how much one entity tends to change when the other changes. Correlation is often confused with another common term in statistics, causation. The key distinction is that correlation does not mean a change in the first entity causes a change in the second, or vice versa.

Let’s understand this difference with the help of an example. Crime rates in a city are often observed to rise during the summer, and ice cream sales rise during the summer as well. Higher temperatures lead people to prefer cooler foods for relief from the heat, which increases ice cream sales. This is a case of causation. By contrast, the increase in ice cream sales and the increase in the crime rate during summer are correlated, but neither is the cause of the other.

There can be either a positive correlation or a negative correlation between two entities. The degree of correlation is usually expressed with the Pearson correlation coefficient, named after Karl Pearson, who formalized it. The statistical formula for Pearson’s coefficient is:

Correlation(x, y) = \frac{Cov(x, y)}{\sigma_{x} \cdot \sigma_{y}}

Where x and y are two separate entities, Cov(x, y) is the covariance between x and y, and σx and σy are the standard deviations of x and y, respectively. To know more about the mathematical equation and how it is used, you can refer to https://www.geeksforgeeks.org
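As a rough illustration of the definition above, the following Python sketch divides the covariance by the product of the two standard deviations. The function names and the five sample points are made up for illustration, and the sketch assumes the sample (n - 1) versions of covariance and standard deviation, which the formula above does not specify.

```python
import math


def covariance(xs, ys):
    """Sample covariance of two equally long sequences (n - 1 divisor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def std_dev(values):
    """Sample standard deviation (n - 1 divisor)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def correlation(xs, ys):
    """Pearson correlation as Cov(x, y) / (sigma_x * sigma_y)."""
    return covariance(xs, ys) / (std_dev(xs) * std_dev(ys))


# Illustrative values only, not taken from the article.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = correlation(xs, ys)
print(round(r, 4))
```

Using the population (n) divisor in both helpers instead would give the same coefficient, because the divisor cancels between numerator and denominator.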

What is Correlation in Excel?

Correlation is the measurement of the strength and direction of the relationship between two variables. The correlation coefficient ranges from -1 to 1. A positive correlation indicates that the variables move in the same direction, while a negative correlation indicates that they move in opposite directions. A correlation coefficient of 0 indicates no linear relationship between the variables.
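The three cases can be illustrated with a small Python sketch. The `pearson` helper and the toy data are assumptions made for illustration. The last case shows why a coefficient of 0 only rules out a linear relationship: y is fully determined by x, yet the coefficient is 0.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


xs = [-2, -1, 0, 1, 2]  # toy values chosen for illustration

same_direction = pearson(xs, [2 * x + 1 for x in xs])  # y rises with x
opposite_direction = pearson(xs, [-3 * x for x in xs])  # y falls as x rises
curved = pearson(xs, [x * x for x in xs])  # y = x^2, not a straight line

print(same_direction, opposite_direction, curved)
```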

How to Prepare Data for Correlation Analysis in Excel

It is essential to make sure that your data is well organized in a spreadsheet before using correlation. Each variable should have its own column and each row should represent an observation or data point. You can refer to the below points to prepare your data:

  • Open Excel: After launching Microsoft Excel, create a new spreadsheet or open an existing sheet that contains the data you want to analyze.
  • Organize your data: Enter the data in the appropriate cells, ensuring that each variable has its own column and each row represents an observation.
  • Data Format: Your data should be in numeric format for accurate correlation analysis. If any of it is stored in a non-numeric format, such as text, convert it to numeric format first.
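The same preparation idea can be sketched outside Excel. The Python snippet below is only an illustration: the helper names and the sample rows are hypothetical. It converts text cells to numbers and drops any row in which a value cannot be read as a number, so that every remaining row is a complete observation.

```python
def to_number(cell):
    """Return the cell as a float, or None if it cannot be read as a number."""
    try:
        return float(str(cell).strip())
    except ValueError:
        return None


def clean_rows(rows):
    """Keep only rows in which every cell converts to a number."""
    cleaned = []
    for row in rows:
        values = [to_number(cell) for cell in row]
        if None not in values:
            cleaned.append(values)
    return cleaned


# Hypothetical raw rows: numbers stored as text, stray spaces, and gaps.
raw_rows = [["155", "66"], [" 178 ", 82], ["n/a", "62"], ["162", ""]]
rows = clean_rows(raw_rows)
print(rows)
```

Dropping incomplete rows is one possible policy. Whether that is appropriate depends on the data set.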

Excel Correlation Formula

You can also compute the correlation yourself. The correlation formula is:

Correl(X, Y) = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{\sqrt{\sum (X - \bar{X})^{2} \sum (Y - \bar{Y})^{2}}}

where X and Y are measurements, ∑ is the sum, and the X and Y with bars over them indicate the mean value of the measurements.
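The short derivation below is a sketch of why the sum form agrees with the covariance definition given earlier. It assumes the sample versions of covariance and standard deviation (divisor n - 1), which the article does not state. The same cancellation happens with the population divisor n.

```latex
\begin{aligned}
r &= \frac{\operatorname{Cov}(X, Y)}{\sigma_X \, \sigma_Y}
   = \frac{\frac{1}{n-1}\sum (X - \bar{X})(Y - \bar{Y})}
          {\sqrt{\frac{1}{n-1}\sum (X - \bar{X})^{2}}\;
           \sqrt{\frac{1}{n-1}\sum (Y - \bar{Y})^{2}}} \\[4pt]
  &= \frac{\sum (X - \bar{X})(Y - \bar{Y})}
          {\sqrt{\sum (X - \bar{X})^{2} \, \sum (Y - \bar{Y})^{2}}}
\end{aligned}
```

The factor 1/(n - 1) appears once in the numerator and once in the denominator (as the product of two square roots), so it cancels.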

How to Calculate Correlation in Excel

The value of the correlation coefficient ranges from -1 to +1. The closer the value is to -1 or +1, the more strongly the two entities are related to one another. If the correlation coefficient comes out to be 0, we say that there is no linear relationship between the two entities. Let’s understand this with the help of an example, in which we will calculate the Pearson correlation coefficient using Excel. Suppose we have records of the height and weight of 10 students in a class, given as:

Height (in cm)    Weight (in kg)
155               66
178               82
148               62
162               70
165               71
172               74
158               64
152               65
176               80
185               93

We can calculate correlation in Excel using two methods:

Method 1: Using CORREL() Function

Excel has a built-in CORREL() function that can be used for calculating the Pearson correlation coefficient. The basic syntax for CORREL() is given as:

=CORREL(array1, array2) 

Where array1 and array2 are the arrays of records of the first entity and second entity, respectively.

Step 1: To calculate the correlation coefficient between the two attributes, enter the following formula in cell A13:

=CORREL(A2:A11, B2:B11) 

We pass the first array, Height (in cm) from A2:A11 as the first parameter, and the second array, Weight (in kg) from B2:B11 as the second parameter inside the CORREL() formula.

Using the CORREL() function to calculate Pearson’s correlation coefficient

The correlation coefficient comes out to be 0.959232649, which is very close to +1, so we can conclude that the height and weight of the students are strongly positively correlated. In other words, a taller student is likely to weigh more as well.
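The result can be cross-checked outside Excel. The Python sketch below applies the sum-of-deviations form of Pearson's formula to the ten height and weight records from the table. The function and variable names are illustrative, and the code is not a description of how Excel computes CORREL internally.

```python
import math

# The ten records from the table above.
heights_cm = [155, 178, 148, 162, 165, 172, 158, 152, 176, 185]
weights_kg = [66, 82, 62, 70, 71, 74, 64, 65, 80, 93]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


r = pearson(heights_cm, weights_kg)
print(round(r, 4))  # 0.9592, in line with the CORREL() result
```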

Method 2: Using the Data Analysis Tool

Step 1: In the menu bar, select the Data tab.

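Step 2: From the Data tab, select the Data Analysis option. If this option is not visible, first enable the Analysis ToolPak add-in (File > Options > Add-ins).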

Step 3: The Data Analysis dialog box will appear. In the dialog box, select the Correlation option.

Data Analysis dialog box

Step 4: A second dialog box for correlation will appear. First, give the input range by selecting the entire table. Since our data is grouped by columns, select the Columns option. Our data has labels in the first row, so tick the checkbox for labels in the first row. The output can be placed in the current sheet, a new worksheet, or a new workbook, as required. Here, select the new worksheet option and click the OK button.

Filling all the values inside the correlation dialog box

Step 5: The output will get automatically generated in the new worksheet.

The correlation table generated using the Data Analysis tool

In the new worksheet, a correlation table is generated in which the correlation value between height and weight is 0.959232649, the same value we obtained with the first method.

Excel correlations are a good place to start when creating a marketing, sales, and spending plan, but they don’t provide the full picture. Excel’s built-in data analysis options are worth using to quickly assess the correlation between two variables, and the result can then serve as a starting point for more in-depth analysis.

What is a Correlation Matrix in Excel?

In Excel, a correlation matrix is a powerful tool used to analyze the relationships between multiple variables in a dataset. It provides a comprehensive view of the correlation coefficients between all possible pairs of variables, allowing us to understand how variables are interrelated. If you want to create a correlation matrix, make sure your data is organized with each variable in a separate column and each observation in a row.

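Next, go to the “Data” tab, click “Data Analysis”, and choose “Correlation”. Select the entire range of data, including the column labels, as the input range, tick the checkbox for labels in the first row, and click “OK”. Excel will calculate the correlation coefficients and display them in a matrix format.

Note that the CORREL function itself (“Formulas” tab >> “More Functions” >> “Statistical” >> “CORREL”) returns a single coefficient for one pair of ranges. To build a matrix with it, enter one CORREL formula for each pair of variables.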
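To make the structure of a correlation matrix concrete, here is a minimal Python sketch that computes the coefficient for every pair of columns. The `correlation_matrix` helper and the dictionary layout are assumptions made for illustration. The example reuses the height and weight columns from earlier, and further columns could be added to the dictionary in the same way.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Map each pair of column names to its correlation coefficient."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# One list per variable, one position per observation.
columns = {
    "Height": [155, 178, 148, 162, 165, 172, 158, 152, 176, 185],
    "Weight": [66, 82, 62, 70, 71, 74, 64, 65, 80, 93],
}

matrix = correlation_matrix(columns)
for name, row in matrix.items():
    print(name, {other: round(value, 4) for other, value in row.items()})
```

Each diagonal entry is 1 because a variable is perfectly correlated with itself, and the matrix is symmetric because the correlation of x with y equals that of y with x.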

Tips for Correlation Analysis

  1. Data Cleaning: Make sure that your data is accurate and error-free before performing the correlation analysis. Incorrect or missing data can affect the output.
  2. Sample Size: Correlation analysis is more reliable with larger sample sizes. Smaller sample sizes may lead to less accurate results.
  3. Causation vs. Correlation: Correlation does not imply causation. Even with a strong correlation, it is essential to explore other factors and conduct further research before establishing causation.

FAQs on How to Use Correlation in Excel

Q1: What is Correlation in Excel?

Answer:

Correlation is the measurement of the strength and direction of the relationship between two variables. The correlation coefficient ranges from -1 to 1. A positive correlation indicates that the variables move in the same direction, while a negative correlation indicates that they move in opposite directions. A correlation coefficient of 0 indicates no linear relationship between the variables.

Q2: How to calculate correlation coefficients in Excel?

Answer:

To calculate a correlation coefficient in Excel, follow the steps below:

Step 1: Select an empty cell.

Step 2: Enter the formula ‘=CORREL(array1, array2)’, replacing “array1” and “array2” with the ranges of data you want to analyze.

Q3: What are the possible values of the correlation coefficient in Excel?

Answer:

In Excel, the correlation coefficient can range from -1 to 1. A correlation coefficient of -1 means there is a perfect negative correlation between the variables, where one increases while the other decreases. A correlation coefficient of 1 indicates a perfect positive correlation, where both variables increase together.

Q4: How to visualize correlation in Excel using a scatter plot?

Answer:

To visualize the correlation using a scatter plot in Excel, follow the steps below:

Step 1: Select the data you want to plot.

Step 2: Go to the “Insert” tab and click Scatter in the “Charts” group.

Step 3: Choose the scatter plot type that suits your requirement.

Step 4: Read the chart. With a positive correlation, the data points slope upward from left to right, while with a negative correlation they slope downward.



Last Updated : 06 Dec, 2023