How to Get the Descriptive Statistics for Pandas DataFrame?

The describe() method in Pandas computes descriptive statistics. For numeric data it reports the count, mean, standard deviation, minimum, maximum, and percentiles; for non-numeric data it reports the count, the number of unique values, the most frequent value, and its frequency. This article shows how to get the descriptive statistics for a Pandas DataFrame.

Python DataFrame.describe() Syntax

Syntax: df['cname'].describe(percentiles = None, include = None, exclude = None)
df.describe(percentiles = None, include = None, exclude = None)

Parameters:

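  • percentiles: list of percentiles to include in the output, each given as a number between 0 and 1. The default is [0.25, 0.5, 0.75].
  • include: list of data types to include in the result, or 'all' to summarize every column. The default, None, summarizes only the numeric columns. Ignored when describe() is called on a single column.
  • exclude: list of data types to leave out of the result. The default, None, excludes nothing. Ignored when describe() is called on a single column.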
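As a minimal sketch of how the three parameters behave, the snippet below builds the same product data used in the rest of this article and calls describe() with each parameter in turn. The variable names (price_stats, numeric_stats, text_stats) are illustrative only, and the 'number' selector is assumed to be accepted here in the same way as in select_dtypes().

```python
import pandas as pd

# Same product data as the sample DataFrame used in this article
df = pd.DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# percentiles: ask for the 10th and 90th percentiles instead of the quartiles
price_stats = df['Price'].describe(percentiles=[0.1, 0.9])
print(price_stats)

# include: summarize only the numeric columns (Price and Year)
numeric_stats = df.describe(include=['number'])
print(numeric_stats)

# exclude: leave out the numeric columns, so only Product is summarized
text_stats = df.describe(exclude=['number'])
print(text_stats)
```

The percentile rows are labelled with percentage strings such as '10%' and '90%', which is also how they can be looked up in the result.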

Creating a Sample DataFrame

The sample Pandas DataFrame created here is used throughout the article to demonstrate descriptive statistics in Pandas and how they are calculated.

Python3




# Import package
from pandas import DataFrame
 
# Create DataFrame
cart = {'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
        'Price': [20000, 28000, 22000, 19000, 45000],
        'Year': [2014, 2015, 2016, 2017, 2018]
        }
df = DataFrame(cart, columns = ['Product', 'Price', 'Year'])
 
# Original DataFrame
print("Original DataFrame:\n", df)


Output:

Original DataFrame:
  Product  Price  Year
0  Mobile  20000  2014
1      AC  28000  2015
2  Mobile  22000  2016
3    Sofa  19000  2017
4  Laptop  45000  2018

Get the Descriptive Statistics for Pandas DataFrame

The following examples show how to compute descriptive statistics in Pandas:

  • Descriptive Statistics in Pandas of Price Column
  • Descriptive Statistics in Pandas of Year Column
  • Descriptive Statistics of Whole DataFrame
  • Descriptive Statistics in Pandas of Data Individually

Descriptive Statistics in Pandas of Price Column

This example calls describe() on the 'Price' column of the sample DataFrame and prints the resulting count, mean, standard deviation, minimum, quartiles, and maximum.

Python3




# Describing descriptive statistics of Price
print("\nDescriptive statistics of Price:\n")
stats = df['Price'].describe()
print(stats)


Output:

Descriptive statistics of Price:
count        5.000000
mean     26800.000000
std      10756.393448
min      19000.000000
25%      20000.000000
50%      22000.000000
75%      28000.000000
max      45000.000000
Name: Price, dtype: float64
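Because describe() on a single column returns a Series indexed by statistic name, individual values can be read back by label. The following is a small sketch of that idea; the names mean_price and iqr are illustrative, and the interquartile range is shown only as one example of combining two entries.

```python
import pandas as pd

# Same product data as the sample DataFrame used in this article
df = pd.DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

stats = df['Price'].describe()

# Look up a single statistic by its index label
mean_price = stats['mean']

# Combine two entries: interquartile range = 75th percentile - 25th percentile
iqr = stats['75%'] - stats['25%']

print(mean_price)  # 26800.0
print(iqr)         # 8000.0
```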

Descriptive Statistics in Pandas of Year Column

This example calls describe() on the 'Year' column of the sample DataFrame and prints the same set of statistics: count, mean, standard deviation, minimum, quartiles, and maximum.

Python3




# Describing descriptive statistics of Year
print("\nDescriptive statistics of year:\n")
stats = df['Year'].describe()
print(stats)


Output:

Descriptive statistics of year:
count       5.000000
mean     2016.000000
std         1.581139
min      2014.000000
25%      2015.000000
50%      2016.000000
75%      2017.000000
max      2018.000000
Name: Year, dtype: float64

Descriptive Statistics of Whole DataFrame

This example passes include='all' to describe() so that every column of the DataFrame is summarized. The categorical 'Product' column gets the count, the number of unique values, the most frequent value (top), and its frequency (freq). The numerical columns get the count, mean, standard deviation, minimum, quartiles, and maximum. A statistic that does not apply to a column is shown as NaN.

Python3




# Describing descriptive statistics of whole dataframe
print("\nDescriptive statistics of whole dataframe:\n")
stats = df.describe(include='all')
print(stats)


Output:

Descriptive statistics of whole dataframe:
       Product         Price         Year
count        5      5.000000     5.000000
unique       4           NaN          NaN
top     Mobile           NaN          NaN
freq         2           NaN          NaN
mean       NaN  26800.000000  2016.000000
std        NaN  10756.393448     1.581139
min        NaN  19000.000000  2014.000000
25%        NaN  20000.000000  2015.000000
50%        NaN  22000.000000  2016.000000
75%        NaN  28000.000000  2017.000000
max        NaN  45000.000000  2018.000000
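For comparison, here is a sketch of what describe() returns when include is left at its default: only the numeric columns are summarized, so the NaN-filled rows for 'Product' disappear. Transposing the result with .T is an optional presentation choice, not something the original example does; it puts one column per row, which can be easier to read for wide DataFrames.

```python
import pandas as pd

# Same product data as the sample DataFrame used in this article
df = pd.DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Default call: only the numeric columns (Price and Year) are described
numeric_only = df.describe()
print(numeric_only)

# Transposed view: one row per column, one column per statistic
transposed = numeric_only.T
print(transposed)
```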

Descriptive Statistics in Pandas of Data Individually

Each statistic can also be computed individually. This example calculates the count, mean, maximum value, and standard deviation of the 'Price' column with the count(), mean(), max(), and std() methods and prints each result.

Python3




# Count of Price
print("\nCount of Price:")
counts = df['Price'].count()
print(counts)
 
# Mean of Price
print("\nMean of Price:")
m = df['Price'].mean()
print(m)
 
# Maximum value of Price
print("\nMaximum value of Price:")
mx = df['Price'].max()
print(mx)
 
# Standard deviation of Price (sample standard deviation, rounded to 6 decimals)
print("\nStandard deviation of Price:")
sd = df['Price'].std()
print(round(sd, 6))


Output:

Count of Price:
5
Mean of Price:
26800.0
Maximum value of Price:
45000
Standard deviation of Price:
10756.393448
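When several individual statistics are needed together, one alternative to separate method calls is Series.agg() with a list of function names. This is a sketch of that approach rather than the method used above, and the name summary is illustrative.

```python
import pandas as pd

# Same product data as the sample DataFrame used in this article
df = pd.DataFrame({
    'Product': ['Mobile', 'AC', 'Mobile', 'Sofa', 'Laptop'],
    'Price': [20000, 28000, 22000, 19000, 45000],
    'Year': [2014, 2015, 2016, 2017, 2018],
})

# Compute count, mean, max, and sample standard deviation in one call;
# the result is a Series indexed by the function names
summary = df['Price'].agg(['count', 'mean', 'max', 'std'])
print(summary)
```

The result can be indexed by name, for example summary['std'], in the same way as the output of describe().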


Last Updated : 28 Dec, 2023