Python | Pandas dataframe.clip()
Last Updated :
16 Nov, 2018
Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
Pandas dataframe.clip() trims values at the specified thresholds. We can use this function to put a lower limit and an upper limit on the values that any cell in the dataframe can hold.
Syntax: DataFrame.clip(lower=None, upper=None, axis=None, inplace=False, *args, **kwargs)
Parameters:
lower : Minimum threshold value. All values below this threshold will be set to it.
upper : Maximum threshold value. All values above this threshold will be set to it.
axis : Align object with lower and upper along the given axis.
inplace : Whether to perform the operation in place on the data.
*args, **kwargs : Additional keywords have no effect but might be accepted for compatibility with numpy.
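The parameters above can be tried out on a small frame. The following is a minimal sketch, assuming a hypothetical dataframe named scores (not part of this article's examples), that shows what happens when only lower or only upper is passed, and that clip() leaves the original dataframe untouched when inplace is left at its default of False.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration.
scores = pd.DataFrame({"x": [-3, 0, 7], "y": [10, -6, 2]})

# Only a lower bound: values below 0 are raised to 0, nothing is capped.
floor_only = scores.clip(lower=0)

# Only an upper bound: values above 5 are lowered to 5, nothing is raised.
cap_only = scores.clip(upper=5)

# With inplace=False (the default) a new DataFrame is returned,
# so the original frame still holds its unclipped values.
print(floor_only)
print(cap_only)
print(scores)
```

Leaving a bound as None simply means that side is not limited, which is convenient when only a floor or only a ceiling is needed.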
Example #1: Use clip() function to trim values of a data frame below and above a given threshold value.
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})
# Print the dataframe
df
Now trim all the values below -4 to -4 and all the values above 9 to 9. Values between -4 and 9 remain unchanged.
df.clip(-4, 9)
Output :
   A  B  C
0 -4 -1  9
1  8 -4  4
2  9  6 -4
3 -4  4  7
4  5  9  3
5  3  3 -2
Notice that no value in the data frame is greater than 9 or smaller than -4.
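The claim about the clipped range can also be checked programmatically rather than by eye. The sketch below rebuilds the dataframe from Example #1 and confirms two things: the overall minimum and maximum after clipping, and that values already inside the range were not modified. The names clipped, inside and unchanged are illustrative helpers, not part of the pandas API.

```python
import pandas as pd

# Same data as in Example #1.
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})

# Clip every cell to the closed range [-4, 9].
clipped = df.clip(lower=-4, upper=9)

# Smallest and largest values across the whole frame after clipping.
overall_min = clipped.min().min()
overall_max = clipped.max().max()
print(overall_min, overall_max)

# Boolean mask of cells that were already inside [-4, 9].
inside = (df >= -4) & (df <= 9)

# Cells inside the range should be identical before and after clipping.
unchanged = clipped[inside].equals(df[inside])
print(unchanged)
```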
Example #2: Use clip() function to clip values using specific lower and upper thresholds for each column element in the dataframe.
# importing pandas as pd
import pandas as pd
# Creating the dataframe
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})
# Print the dataframe
df
When axis=0, the values are clipped across the rows, so each row gets its own pair of limits. We therefore provide one lower and one upper threshold for every column element, i.e. as many thresholds as there are rows.
Create a Series to store the lower and upper threshold values for each column element.
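# Lower limit for each individual column element
lower_limit = pd.Series([1, -3, 2, 3, -2, -1])
# Upper limit for each individual column element
upper_limit = lower_limit + 5
# Print lower_limit
print(lower_limit)
# Print upper_limit
print(upper_limit)
Output : for rows 0 through 5, lower_limit holds 1, -3, 2, 3, -2, -1 and upper_limit holds 6, 2, 7, 8, 3, 4.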
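Now we apply these limits to the dataframe.
# Applying different limit values to each column element
df.clip(lower_limit, upper_limit, axis=0)
Output :
   A  B  C
0  1  1  6
1  2 -3  2
2  7  6  2
3  3  4  7
4  3  3  3
5  3  3 -1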
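To make the role of the axis argument more concrete, the sketch below repeats the row-wise clipping from Example #2 and contrasts it with axis=1, where the threshold Series is matched against the column labels instead of the row labels. The per-column limits col_lower and col_upper are hypothetical values chosen only for illustration.

```python
import pandas as pd

# Same data as in Example #2.
df = pd.DataFrame({"A": [-5, 8, 12, -9, 5, 3],
                   "B": [-1, -4, 6, 4, 11, 3],
                   "C": [11, 4, -8, 7, 3, -2]})

# Per-row limits: one entry per row label (0 through 5).
lower_limit = pd.Series([1, -3, 2, 3, -2, -1])
upper_limit = lower_limit + 5
by_row = df.clip(lower_limit, upper_limit, axis=0)

# Per-column limits (hypothetical values): one entry per column label.
col_lower = pd.Series({"A": 0, "B": -2, "C": 1})
col_upper = pd.Series({"A": 10, "B": 5, "C": 8})
by_col = df.clip(col_lower, col_upper, axis=1)

print(by_row)
print(by_col)
```

In both cases the index of the threshold Series has to line up with the labels along the chosen axis: row labels for axis=0, column names for axis=1.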