How to Calculate Rolling Median in Pandas?

In this article, we will see how to calculate the rolling median in pandas.

A rolling metric is usually calculated on time series data. It shows how the values are changing by aggregating them over the last ‘n’ occurrences, where ‘n’ is known as the window size. The aggregation is usually the mean (simple average). However, we can also use the median as the aggregation for certain kinds of analyses.
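
To make the definition concrete, here is a minimal pure-Python sketch of a rolling median over the last n values. The helper name `rolling_median` and the sample list are illustrative assumptions and are not part of pandas; positions without a full window are returned as None.

```python
from statistics import median


def rolling_median(values, n):
    """Return the median of the last `n` values at each position.

    Hypothetical helper for illustration only. Positions that do not
    yet have `n` values are filled with None.
    """
    result = []
    for i in range(len(values)):
        if i + 1 < n:
            # The window is not full yet
            result.append(None)
        else:
            # Take the current value and the n - 1 values before it
            window = values[i + 1 - n: i + 1]
            result.append(median(window))
    return result


sample = [101, 94, 112, 100, 134]
medians = rolling_median(sample, 3)
print(medians)
```

This loop only illustrates the idea; in practice the pandas rolling() method described below does the same job without a hand-written loop.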

Before we begin, let us install the pandas library using pip:

pip install pandas

The pandas.core.window.rolling.Rolling.median() function calculates the rolling median. A pandas.core.window.rolling.Rolling object is obtained by calling the rolling() method on a DataFrame or Series.
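
As a small sketch of this two-step flow, the snippet below calls rolling() on a Series and then median() on the resulting object. The Series values and the variable names `s`, `roller`, and `rolled` are chosen only for illustration.

```python
import pandas as pd
from pandas.core.window.rolling import Rolling

# Illustrative values, not taken from a real dataset
s = pd.Series([101, 94, 112, 100, 134])

# rolling() returns a Rolling object; no median is computed yet
roller = s.rolling(window=3)
print(isinstance(roller, Rolling))

# Calling median() on the Rolling object returns a new Series
rolled = roller.median()
print(rolled)
```

The same pattern applies to a DataFrame, where median() returns a DataFrame with one rolling median per column.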

Example 1:

In this example, we use the pandas.core.window.rolling.Rolling.median() function to calculate the rolling median of the given DataFrame for window sizes 1, 2, 3, and 4. The window size is set through the window parameter of the rolling() method. We add each result to the original DataFrame as a new column so that we can compare them. As the output shows, for a window size of ‘n’ the first n-1 rows are NaN, because a full window is not yet available. With a window size of 4, the value at index 5 is the median of the records at indices 2 to 5. Similarly, the value at index 10 is the median of the records at indices 7 to 10.

Python




# Import the `pandas` library
import pandas as pd
  
# Create the pandas dataframe
df = pd.DataFrame({
    "value": [101, 94, 112, 100, 134, 124,
              119, 127, 143, 128, 141]
})
  
# Calculate the rolling median for window = 1
w1_roll_median = df.rolling(window=1).median()
  
# Calculate the rolling median for window = 2
w2_roll_median = df.rolling(window=2).median()
  
# Calculate the rolling median for window = 3
w3_roll_median = df.rolling(window=3).median()
  
# Calculate the rolling median for window = 4
w4_roll_median = df.rolling(window=4).median()
  
# Add the rolling median series to the original 
# dataframe for comparison
df['w1_roll_median'] = w1_roll_median
df['w2_roll_median'] = w2_roll_median
df['w3_roll_median'] = w3_roll_median
df['w4_roll_median'] = w4_roll_median
  
# Print the dataframe
print(df)


Output:

    value  w1_roll_median  w2_roll_median  w3_roll_median  w4_roll_median
0     101           101.0             NaN             NaN             NaN
1      94            94.0            97.5             NaN             NaN
2     112           112.0           103.0           101.0             NaN
3     100           100.0           106.0           100.0           100.5
4     134           134.0           117.0           112.0           106.0
5     124           124.0           129.0           124.0           118.0
6     119           119.0           121.5           124.0           121.5
7     127           127.0           123.0           124.0           125.5
8     143           143.0           135.0           127.0           125.5
9     128           128.0           135.5           128.0           127.5
10    141           141.0           134.5           141.0           134.5
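
If the leading NaN values are not wanted, one option is the min_periods parameter of rolling(), which sets the minimum number of observations needed to produce a value. The sketch below reuses the data from this example; setting min_periods=1 is an assumption made for illustration, not something the example above does.

```python
import pandas as pd

# Same data as in Example 1
df = pd.DataFrame({
    "value": [101, 94, 112, 100, 134, 124,
              119, 127, 143, 128, 141]
})

# With min_periods=1 the median is computed on partial windows,
# so the first rows use however many values are available so far
partial = df["value"].rolling(window=4, min_periods=1).median()
print(partial.head(4))
```

Note that the first three values are then medians of fewer than 4 records, so they are not directly comparable with the later rows.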

Example 2:

In this example, we take the stock price of Tata Motors for the last 3 weeks. The rolling median is calculated with a window size of 7, which corresponds to a one-week time frame. Each value in the w7_roll_median column is therefore the median of the stock price over the most recent 7 records. Since the window size is 7, the first 6 records are NaN, as discussed earlier.

Python




# Import the `pandas` library
import pandas as pd
  
# Create the pandas dataframe
df = pd.DataFrame({
    "value": [
        506.40, 487.85, 484.90, 489.70, 501.40, 509.65, 510.75,
        503.45, 507.05, 505.45, 519.05, 530.15, 509.70, 486.10,
        495.50, 488.65, 492.75, 460.20, 461.45, 458.60, 475.25,
    ]
})
  
# Calculate the rolling median for window = 7
w7_roll_median = df.rolling(window=7).median()
  
# Add the rolling median series to the original
# dataframe for comparison
df['w7_roll_median'] = w7_roll_median
  
# Print the dataframe
print(df)


Output:

     value  w7_roll_median
0   506.40             NaN
1   487.85             NaN
2   484.90             NaN
3   489.70             NaN
4   501.40             NaN
5   509.65             NaN
6   510.75          501.40
7   503.45          501.40
8   507.05          503.45
9   505.45          505.45
10  519.05          507.05
11  530.15          509.65
12  509.70          509.70
13  486.10          507.05
14  495.50          507.05
15  488.65          505.45
16  492.75          495.50
17  460.20          492.75
18  461.45          488.65
19  458.60          486.10
20  475.25          475.25
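
To check the result, a single window can be recomputed by hand. The following sketch compares two rows of the rolling output with the plain median of the corresponding 7 records; the variable names are illustrative.

```python
import pandas as pd

# Same prices as in Example 2
prices = pd.Series([
    506.40, 487.85, 484.90, 489.70, 501.40, 509.65, 510.75,
    503.45, 507.05, 505.45, 519.05, 530.15, 509.70, 486.10,
    495.50, 488.65, 492.75, 460.20, 461.45, 458.60, 475.25,
])

rolled = prices.rolling(window=7).median()

# Row 6 should equal the plain median of rows 0 through 6
first_window = prices.iloc[0:7].median()
print(first_window, rolled.iloc[6])

# Row 20 should equal the plain median of rows 14 through 20
last_window = prices.iloc[14:21].median()
print(last_window, rolled.iloc[20])
```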


Last Updated : 19 Dec, 2021