Python – Removing Constant Features From the Dataset

Features that hold the same value in every row of the dataset are known as constant features. Because such a feature never varies, it carries no information about the target and is redundant, so it is good practice to remove it from the dataset. Removing redundant features and keeping only the informative ones falls under the filter methods of feature selection.
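
As a quick illustration of the definition, here is a minimal sketch that finds constant columns by counting distinct values with pandas. The toy frame and its column names (`source`, `clicks`) are made up for this example and are not part of the article's dataset.

```python
import pandas as pd

# Hypothetical toy frame: "source" holds the same value in every row
df = pd.DataFrame({
    "source": ["web", "web", "web", "web"],
    "clicks": [10, 25, 7, 31],
})

# A column is constant when it has exactly one distinct value
constant_cols = [col for col in df.columns if df[col].nunique(dropna=False) == 1]
print(constant_cols)               # ['source']

# Drop the constant columns and keep the rest
reduced = df.drop(columns=constant_cols)
print(reduced.columns.tolist())    # ['clicks']
```

This check works directly on text columns, so it can be handy for a first look before any encoding is done.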

Now let's see how we can remove constant features in Python.

Consider the following self-created dataset:

Portal           Article's_category   Views
GeeksforGeeks    Python               545
GeeksforGeeks    Data Science         1505
GeeksforGeeks    Data Science         1157
GeeksforGeeks    Data Science         2541
GeeksforGeeks    Mathematics          5726
GeeksforGeeks    Python               3125
GeeksforGeeks    Data Science         3131
GeeksforGeeks    Mathematics          6525
GeeksforGeeks    Mathematics          15000

Code: Create DataFrame of the above data

# Import pandas to create DataFrame
import pandas as pd
  
# Make DataFrame of the given data
data = pd.DataFrame({"Portal":['GeeksforGeeks', 'GeeksforGeeks', 'GeeksforGeeks', 'GeeksforGeeks', 'GeeksforGeeks',
                               'GeeksforGeeks', 'GeeksforGeeks', 'GeeksforGeeks', 'GeeksforGeeks'],
                    "Article's_category":['Python', 'Data Science', 'Data Science', 'Data Science', 'Mathematics', 
                                          'Python', 'Data Science', 'Mathematics', 'Mathematics'],
                    "Views":[545, 1505, 1157, 2541, 5726, 3125, 3131, 6525, 15000]})

Code: Convert the categorical data to numerical data



# import ordinal encoder from sklearn
from sklearn.preprocessing import OrdinalEncoder
ord_enc = OrdinalEncoder()
  
# Transform the data
data[["Portal","Article's_category"]] = ord_enc.fit_transform(data[["Portal","Article's_category"]])
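
To see which number each category was mapped to, one option is to inspect the encoder's `categories_` attribute after fitting. The sketch below rebuilds only the category column so that it runs on its own; the names `categories` and `enc` are chosen just for this example.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Same category column as in the article, rebuilt so the snippet is self-contained
categories = pd.DataFrame({"Article's_category": [
    'Python', 'Data Science', 'Data Science', 'Data Science', 'Mathematics',
    'Python', 'Data Science', 'Mathematics', 'Mathematics']})

enc = OrdinalEncoder()
codes = enc.fit_transform(categories)

# categories_ holds one array per encoded column;
# the position of a label in that array is its numeric code
for code, label in enumerate(enc.categories_[0]):
    print(code, label)
# 0 Data Science
# 1 Mathematics
# 2 Python
```

With the default settings the labels are sorted, which is why 'Python' becomes 2.0 in the transformed data.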

Code: Fit the data to VarianceThreshold.

# import VarianceThreshold
from sklearn.feature_selection import VarianceThreshold
var_threshold = VarianceThreshold(threshold=0)   # threshold = 0 for constant
  
# fit the data
var_threshold.fit(data)
  
# We can check the variance of different features as
print(var_threshold.variances_)

Output: Variance of different features:

[0.00000000e+00 6.17283951e-01 1.76746269e+07]
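
The first two values can be checked by hand. The sketch below assumes the encoding Data Science = 0, Mathematics = 1, Python = 2 and uses NumPy's population variance (dividing by n, not n - 1); the array names are illustrative only.

```python
import numpy as np

# Encoded 'Portal' column: a single category, so every value is 0
portal_codes = np.zeros(9)

# Encoded "Article's_category" column in row order
category_codes = np.array([2, 0, 0, 0, 1, 2, 0, 1, 1], dtype=float)

# np.var uses ddof=0 by default, i.e. the population variance
print(np.var(portal_codes))      # 0.0
print(np.var(category_codes))    # approximately 0.617284
```

A variance of exactly 0 for 'Portal' is what marks it as a constant feature.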

Code: Transform the data

print(var_threshold.transform(data))
print('*' * 10,"Separator",'*' * 10)
  
# shapes of data before transformed and after transformed
print("Earlier shape of data: ", data.shape)
print("Shape after transformation: ", var_threshold.transform(data).shape)

Output:

[[2.000e+00 5.450e+02]
 [0.000e+00 1.505e+03]
 [0.000e+00 1.157e+03]
 [0.000e+00 2.541e+03]
 [1.000e+00 5.726e+03]
 [2.000e+00 3.125e+03]
 [0.000e+00 3.131e+03]
 [1.000e+00 6.525e+03]
 [1.000e+00 1.500e+04]]
********** Separator **********
Earlier shape of data:  (9, 3)
Shape after transformation:  (9, 2)

As you can see, the data originally had 9 observations with 3 features.
After the transformation it has 9 observations with 2 features. The removed feature is 'Portal', the only column whose variance is zero.
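
Since `transform` returns a plain array, the column names are lost. One possible way to recover them is the selector's `get_support()` mask, as in the sketch below. It rebuilds an already-encoded copy of the data so that it runs on its own; the names `selector`, `kept`, `dropped` and `reduced` are chosen for this example.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Already-encoded version of the article's data, rebuilt so the snippet is self-contained
data = pd.DataFrame({
    "Portal": [0.0] * 9,
    "Article's_category": [2.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 1.0, 1.0],
    "Views": [545, 1505, 1157, 2541, 5726, 3125, 3131, 6525, 15000],
})

selector = VarianceThreshold(threshold=0)
selector.fit(data)

# get_support() returns a boolean mask: True for the columns that are kept
mask = selector.get_support()
kept = data.columns[mask].tolist()
dropped = data.columns[~mask].tolist()
print("Kept:", kept)        # Kept: ["Article's_category", 'Views']
print("Dropped:", dropped)  # Dropped: ['Portal']

# Select with the mask so the result stays a DataFrame with named columns
reduced = data.loc[:, mask]
print(reduced.shape)        # (9, 2)
```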


