Features that hold a single value for every observation in the dataset are known as constant features. Because such a feature never varies, it carries no information about the target and is redundant, so it is good practice to remove it from the dataset. Removing redundant features and keeping only the necessary ones falls under the filter methods of feature selection.
Now let's see how we can remove constant features in Python.
Consider a small self-created dataset for this article, with 9 observations and 3 features, one of which is 'Portal':
Code: Create DataFrame of the above data
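A minimal sketch of this step, under stated assumptions. Only the column name `Portal` appears in the text; the names `Category` and `Views`, the constant string `"GeeksforGeeks"`, and the labels `cat_a`, `cat_b`, `cat_c` are hypothetical placeholders. The numeric values are taken from the transformed output shown later in the article.

```python
import pandas as pd

# "Portal" is the only column name given in the article. "Category", "Views",
# and every string value below are hypothetical placeholders.
data = pd.DataFrame({
    "Portal": ["GeeksforGeeks"] * 9,  # same value in every row: a constant feature
    "Category": ["cat_c", "cat_a", "cat_a", "cat_a", "cat_b",
                 "cat_c", "cat_a", "cat_b", "cat_b"],
    "Views": [545, 1505, 1157, 2541, 5726, 3125, 3131, 6525, 15000],
})

print(data)
# A constant feature has exactly one unique value.
print(data.nunique())
```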
Code: Convert the categorical data to numerical data
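One possible way to do this conversion is scikit-learn's `LabelEncoder`, which maps each distinct string to an integer. This is a sketch, not necessarily the encoder the article used; the column names `Category` and `Views` and all string values are hypothetical placeholders.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical placeholder data; only the column name "Portal" comes from the article.
data = pd.DataFrame({
    "Portal": ["GeeksforGeeks"] * 9,
    "Category": ["cat_c", "cat_a", "cat_a", "cat_a", "cat_b",
                 "cat_c", "cat_a", "cat_b", "cat_b"],
    "Views": [545, 1505, 1157, 2541, 5726, 3125, 3131, 6525, 15000],
})

# Encode each string column separately; labels are numbered in sorted order.
for column in ["Portal", "Category"]:
    data[column] = LabelEncoder().fit_transform(data[column])

print(data)
```

After encoding, the constant `Portal` column becomes a column of identical integers, so its variance is zero.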
Code: Fit the data to VarianceThreshold.
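A minimal sketch of the fitting step, assuming the already-encoded data is held in a NumPy array `X` whose columns are the encoded portal, a hypothetical category column, and a hypothetical views column. `VarianceThreshold` with `threshold=0.0` keeps only features whose variance is greater than zero.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Encoded data: column 0 is the constant 'Portal' feature; the meaning of the
# other two columns (category code, views) is an assumption for illustration.
X = np.array([
    [0, 2, 545],
    [0, 0, 1505],
    [0, 0, 1157],
    [0, 0, 2541],
    [0, 1, 5726],
    [0, 2, 3125],
    [0, 0, 3131],
    [0, 1, 6525],
    [0, 1, 15000],
], dtype=float)

selector = VarianceThreshold(threshold=0.0)
selector.fit(X)

# Per-feature values computed during fit. With threshold=0, some scikit-learn
# versions may report a value other than the plain variance for non-constant columns.
print(selector.variances_)
# Population variance computed directly with NumPy, for comparison.
print(X.var(axis=0))
# Boolean mask of the features that would be kept.
print(selector.get_support())
```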
Output: Variance of different features:
[0.00000000e+00 6.17283951e-01 1.76746269e+07]
Code: Transform the data
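A sketch of the transform step under the same assumptions as before: `X` is a hypothetical NumPy array holding the encoded data, with the constant 'Portal' feature in column 0. `fit_transform` fits the selector and returns only the columns with non-zero variance.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Encoded data; column 0 is the constant 'Portal' feature (column roles are assumed).
X = np.array([
    [0, 2, 545],
    [0, 0, 1505],
    [0, 0, 1157],
    [0, 0, 2541],
    [0, 1, 5726],
    [0, 2, 3125],
    [0, 0, 3131],
    [0, 1, 6525],
    [0, 1, 15000],
], dtype=float)

# Fit the selector and drop every zero-variance column in one call.
X_reduced = VarianceThreshold(threshold=0.0).fit_transform(X)

print(X_reduced)
print("*" * 10, "Separator", "*" * 10)
print("Earlier shape of data:", X.shape)
print("Shape after transformation:", X_reduced.shape)
```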
Output:
[[2.000e+00 5.450e+02]
 [0.000e+00 1.505e+03]
 [0.000e+00 1.157e+03]
 [0.000e+00 2.541e+03]
 [1.000e+00 5.726e+03]
 [2.000e+00 3.125e+03]
 [0.000e+00 3.131e+03]
 [1.000e+00 6.525e+03]
 [1.000e+00 1.500e+04]]
********** Separator **********
Earlier shape of data: (9, 3)
Shape after transformation: (9, 2)
As you can observe, the data originally had 9 observations with 3 features.
After the transformation, it has 9 observations with 2 features. The removed feature is 'Portal', the only one whose variance is 0.
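Because the transformed array no longer carries column names, it can help to confirm by name which feature was dropped. The sketch below uses the selector's `get_support()` mask for that; the column names `Category` and `Views` are hypothetical, and the data is assumed to be already encoded.

```python
import pandas as pd
from sklearn.feature_selection import VarianceThreshold

# Already-encoded data; "Category" and "Views" are hypothetical column names.
data = pd.DataFrame({
    "Portal": [0] * 9,
    "Category": [2, 0, 0, 0, 1, 2, 0, 1, 1],
    "Views": [545, 1505, 1157, 2541, 5726, 3125, 3131, 6525, 15000],
})

selector = VarianceThreshold(threshold=0.0).fit(data)
mask = selector.get_support()  # True for features that are kept

kept = list(data.columns[mask])
removed = list(data.columns[~mask])

print("Kept features:", kept)
print("Removed features:", removed)
```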