SimpleImputer is a scikit-learn class (in the sklearn.impute module) that helps handle missing data in a dataset used for predictive modelling. It replaces missing values, NaN by default, with a statistic computed from each column or with a specified constant.
It is used by creating a SimpleImputer() instance, whose constructor takes the following arguments:
missing_values : The placeholder for the missing values that have to be imputed. The default is NaN.
strategy : The statistic that replaces the missing values in the dataset. The strategy argument can take the values 'mean' (default), 'median', 'most_frequent' and 'constant'.
fill_value : The constant value that replaces the missing values when strategy is 'constant'.
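The 'constant' and 'most_frequent' strategies can be sketched as follows. This is a minimal illustration, assuming NumPy and scikit-learn are installed; the toy array X and the variable names are hypothetical and chosen only for this example.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy data: a single numeric column with one missing entry.
X = np.array([[1.0], [np.nan], [3.0], [3.0]])

# strategy='constant' fills every missing entry with fill_value.
constant_imputer = SimpleImputer(missing_values=np.nan,
                                 strategy="constant", fill_value=0.0)
X_constant = constant_imputer.fit_transform(X)

# strategy='most_frequent' fills with the most common observed value
# in the column (3.0 for this toy data).
frequent_imputer = SimpleImputer(strategy="most_frequent")
X_frequent = frequent_imputer.fit_transform(X)

print(X_constant.ravel())
print(X_frequent.ravel())
```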
Code: Python code illustrating the use of the SimpleImputer class.
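A minimal sketch of such a program, assuming NumPy and scikit-learn are installed. The variable names are illustrative; the sample matrix is the one shown under "Original Data".

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Sample matrix with one missing entry in every column.
data = [[12, np.nan, 34], [10, 32, np.nan], [np.nan, 11, 20]]
print("Original Data :\n", data)

# Replace each NaN with the mean of the observed values in its column.
imputer = SimpleImputer(missing_values=np.nan, strategy="mean")
imputer = imputer.fit(data)          # learn the per-column means
imputed = imputer.transform(data)    # fill the missing entries
print("Imputed Data :\n", imputed)
```

The fit and transform steps can also be combined into a single call to fit_transform.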
Original Data :
[[12, nan, 34] [10, 32, nan] [nan, 11, 20]]
Imputed Data :
[[12, 21.5, 34] [10, 32, 27] [11, 11, 20]]
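Remember : The mean or median is computed along each column of the matrix, using only the values that are present. For example, the missing value in the second column is replaced by (32 + 11) / 2 = 21.5.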
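The column-wise behaviour can be checked directly. The sketch below, assuming NumPy and scikit-learn are installed, compares the per-column means computed by NumPy (ignoring NaN) with the statistics_ attribute that a fitted SimpleImputer stores; the variable names are illustrative.

```python
import numpy as np
from sklearn.impute import SimpleImputer

data = np.array([[12, np.nan, 34],
                 [10, 32, np.nan],
                 [np.nan, 11, 20]])

# Mean of each column (axis=0), skipping the NaN entries.
column_means = np.nanmean(data, axis=0)

# After fitting, statistics_ holds the fill value used for each column.
imputer = SimpleImputer(strategy="mean").fit(data)

print(column_means)
print(imputer.statistics_)
```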