In feature engineering, the task of converting categorical features into numerical ones is called encoding.
There are various ways to handle categorical features, such as OneHotEncoding, LabelEncoding, and FrequencyEncoding (replacing each category by its count). MeanEncoding works in a similar way: each category is replaced by the mean of the target over the rows that belong to that category.
The example below uses a DataFrame with two columns, SubjectName and Target. SubjectName is categorical, so it is converted into a numerical feature by applying Mean Encoding.
    SubjectName  Target
0            s1       1
1            s2       0
2            s3       1
3            s1       1
4            s4       1
5            s3       0
6            s2       0
7            s1       1
8            s2       1
9            s4       1
10           s1       0
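The DataFrame shown above can be rebuilt with a few lines of pandas. This is a minimal sketch that assumes the column names SubjectName and Target exactly as they appear in the printed table; the variable names `data` and `df` are illustrative.

```python
import pandas as pd

# Sample data copied from the table above: one categorical column and a binary target.
data = {
    "SubjectName": ["s1", "s2", "s3", "s1", "s4", "s3", "s2", "s1", "s2", "s4", "s1"],
    "Target": [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
}
df = pd.DataFrame(data)

print(df)
```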
Code: Counting the data points for each category in SubjectName
SubjectName
s1    4
s2    3
s3    2
s4    2
Name: Target, dtype: int64
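One way to obtain the per-category counts above is a `groupby` followed by `count()`. The sketch below rebuilds the sample DataFrame so that it runs on its own; the name `counts` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "SubjectName": ["s1", "s2", "s3", "s1", "s4", "s3", "s2", "s1", "s2", "s4", "s1"],
    "Target": [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
})

# Number of rows (non-null Target values) for each category of SubjectName.
counts = df.groupby("SubjectName")["Target"].count()

print(counts)
```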
Code: Grouping the data by SubjectName and computing the mean of Target for each category
SubjectName
s1    0.750000
s2    0.333333
s3    0.500000
s4    1.000000
Name: Target, dtype: float64
The output shows, for each category in SubjectName, the mean of Target. Because Target is binary (1 = positive, 0 = negative), this mean is the fraction of rows in that category with a positive target.
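The grouped means can be computed the same way as the counts, swapping `count()` for `mean()`. A minimal sketch on the sample data; the name `mean_by_subject` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "SubjectName": ["s1", "s2", "s3", "s1", "s4", "s3", "s2", "s1", "s2", "s4", "s1"],
    "Target": [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
})

# Mean of the binary target per category, i.e. the share of positive rows.
mean_by_subject = df.groupby("SubjectName")["Target"].mean()

print(mean_by_subject)
```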
Code: Finally, mapping each category in df['SubjectName'] to its mean value
Output: Mean Encoded Data
    SubjectName  Target
0      0.750000       1
1      0.333333       0
2      0.500000       1
3      0.750000       1
4      1.000000       1
5      0.500000       0
6      0.333333       0
7      0.750000       1
8      0.333333       1
9      1.000000       1
10     0.750000       0
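The final mapping step can be sketched with `Series.map`, which looks up each category in the per-category means and replaces it with the corresponding value. The example is self-contained and overwrites the SubjectName column in place, as the output above suggests; the name `mean_by_subject` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "SubjectName": ["s1", "s2", "s3", "s1", "s4", "s3", "s2", "s1", "s2", "s4", "s1"],
    "Target": [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
})

# Per-category mean of the target.
mean_by_subject = df.groupby("SubjectName")["Target"].mean()

# Replace every category label with its mean target value.
df["SubjectName"] = df["SubjectName"].map(mean_by_subject)

print(df)
```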
Pros of MeanEncoding:
- Captures information about the target within the label, which can yield more predictive features
- Creates a monotonic relationship between the variable and the target
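Cons of MeanEncoding:
- It may cause over-fitting in the model, because the encoding is derived from the target itself. Categories with few rows (such as s4, which has only two) receive extreme values that may not generalize.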
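The over-fitting risk can be seen on the sample data by computing the means on only part of the rows and applying them to the rest. The sketch below assumes an arbitrary split (first 8 rows for fitting, last 3 held out) that is not part of the original example; falling back to the overall training mean for unseen categories is likewise an assumption, and the names `train`, `held_out`, `train_means` and `global_mean` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "SubjectName": ["s1", "s2", "s3", "s1", "s4", "s3", "s2", "s1", "s2", "s4", "s1"],
    "Target": [1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
})

# Assumed split for illustration only: first 8 rows to fit, last 3 held out.
train, held_out = df.iloc[:8], df.iloc[8:]

# Fit the encoding on the training rows only.
train_means = train.groupby("SubjectName")["Target"].mean()
global_mean = train["Target"].mean()

# Apply it to the held-out rows; categories unseen in training
# would fall back to the overall training mean.
held_out_encoded = held_out["SubjectName"].map(train_means).fillna(global_mean)

print(train_means)
print(pd.DataFrame({"encoded": held_out_encoded, "Target": held_out["Target"]}))
```

On this split, s1 and s2 are encoded as 1.0 and 0.0 from the training rows, yet the held-out s1 row has target 0 and the held-out s2 row has target 1. The encoding fits the training rows perfectly but does not carry over to the unseen ones.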