
How to Calculate Correlation Between Two Columns in Pandas?

This article shows how to calculate the correlation between two columns of a pandas DataFrame.

Correlation summarizes the strength and direction of the linear association between two quantitative variables. The correlation coefficient is denoted by r and takes values between -1 and +1. A positive value of r indicates a positive association, a negative value indicates a negative association, and a value near 0 indicates little or no linear association.



Calling the corr() method on one column and passing the other column as the argument returns the correlation between the two. By default, it computes the Pearson correlation coefficient.

Syntax:



dataframe['first_column'].corr(dataframe['second_column'])

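where dataframe is the input DataFrame, and first_column and second_column are the names of the two columns to correlate.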

Example 1: Python program to get the correlation between two columns




# import pandas module
import pandas as pd
 
# create dataframe with 3 columns
data = pd.DataFrame({
    "column1": [12, 23, 45, 67],
    "column2": [67, 54, 32, 1],
    "column3": [34, 23, 56, 23]
}
)
# display dataframe
print(data)
 
# correlation between column1 and column2
print(data['column1'].corr(data['column2']))
 
# correlation between column2 and column3
print(data['column2'].corr(data['column3']))
 
# correlation between column1 and column3
print(data['column1'].corr(data['column3']))

Output:

   column1  column2  column3
0       12       67       34
1       23       54       23
2       45       32       56
3       67        1       23
-0.9970476685163736
0.07346999975265099
0.0
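The first value printed above can be reproduced directly from the definition of r: the sum of the products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The following is a minimal sketch of that calculation, assuming the default Pearson method; the variable names (x, y, dx, dy, r_manual, r_pandas) are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

# the same values as column1 and column2 in Example 1
x = pd.Series([12, 23, 45, 67])
y = pd.Series([67, 54, 32, 1])

# deviations of each value from its column mean
dx = x - x.mean()
dy = y - y.mean()

# Pearson r computed from its definition
r_manual = (dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum())

# the same quantity as computed by pandas
r_pandas = x.corr(y)

print(r_manual)
print(r_pandas)
```

Both values should agree up to floating-point rounding, which is a quick way to check that corr() is measuring what you expect.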

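It is also possible to get the pairwise correlation of all numeric columns at once by calling corr() on the DataFrame itself. The result is a correlation matrix with one row and one column for each column of the DataFrame.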

Syntax:

dataframe.corr()

Example 2: Get the pairwise correlation of all columns




# import pandas module
import pandas as pd
 
# create dataframe with 3 columns
data = pd.DataFrame({
    "column1": [12, 23, 45, 67],
    "column2": [67, 54, 32, 1],
    "column3": [34, 23, 56, 23]
}
)
# get the pairwise correlation of all columns
print(data.corr())

Output:

          column1   column2  column3
column1  1.000000 -0.997048  0.00000
column2 -0.997048  1.000000  0.07347
column3  0.000000  0.073470  1.00000
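A single pair can be read out of the correlation matrix by label. The sketch below also covers a DataFrame that contains a non-numeric column. Depending on the pandas version, calling corr() on such a DataFrame may raise an error, so the numeric columns are selected first with select_dtypes(). The "label" column and the variable names numeric, matrix, and r are hypothetical and added only for illustration.

```python
import pandas as pd

data = pd.DataFrame({
    "column1": [12, 23, 45, 67],
    "column2": [67, 54, 32, 1],
    "label": ["a", "b", "c", "d"],  # hypothetical non-numeric column
})

# keep only the numeric columns before computing correlations
numeric = data.select_dtypes(include="number")

# pairwise correlation matrix of the numeric columns
matrix = numeric.corr()

# read the correlation of one pair of columns by row and column label
r = matrix.loc["column1", "column2"]
print(r)
```

The value read from the matrix should match data['column1'].corr(data['column2']) up to floating-point rounding, since both compute the same Pearson coefficient.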
