Data binning, also called bucketing, is a data pre-processing method used to reduce the effects of minor observation errors. The original data values are divided into small intervals known as bins, and each value is then replaced by a representative value calculated for its bin, such as the bin mean or median. This has a smoothing effect on the input data and may also reduce the chance of overfitting on small datasets.
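As a minimal sketch of the replacement step, the snippet below smooths already-binned data by bin means. The function name `smooth_by_bin_means` and the sample bins are hypothetical and chosen only for illustration; the mean is one possible representative value, and every bin is assumed to be non-empty.

```python
def smooth_by_bin_means(bins):
    """Replace every value with the mean of the bin it belongs to."""
    # Assumes each bin contains at least one value.
    return [[sum(b) / len(b)] * len(b) for b in bins]


# Hypothetical data that has already been divided into three bins.
bins = [[4, 8, 15], [21, 21, 24], [25, 28, 34]]
smoothed = smooth_by_bin_means(bins)
print(smoothed)
# [[9.0, 9.0, 9.0], [22.0, 22.0, 22.0], [29.0, 29.0, 29.0]]
```

Each bin keeps its size, but the variation inside it disappears, which is the smoothing effect described above.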
There are two methods of dividing data into bins:
- Equal Frequency Binning : each bin holds the same number of values.
- Equal Width Binning : each bin spans the same width w = (max – min) / (number of bins), so the upper boundaries of the bins are min + w, min + 2w, …, min + nw.
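Equal frequency binning (3 bins of 4 values each):
Input : [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
Output : [5, 10, 11, 13] [15, 35, 50, 55] [72, 92, 204, 215]

Equal width binning (3 bins, so w = (215 – 5) / 3 = 70):
Input : [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
Output : [5, 10, 11, 13, 15, 35, 50, 55, 72] [92] [204, 215]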
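The arithmetic behind the equal width boundaries can be checked with a few lines of Python. This is only a sketch: the choice of 3 bins is an assumption taken from the examples, and the variable names are illustrative.

```python
data = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]
n_bins = 3  # assumed number of bins

low, high = min(data), max(data)
w = (high - low) / n_bins  # width of every bin

# Boundaries: min, min + w, min + 2w, ..., min + nw
edges = [low + i * w for i in range(n_bins + 1)]
print(w)      # 70.0
print(edges)  # [5.0, 75.0, 145.0, 215.0]
```

Because most of the values lie below 75, an equal width split of this list is very uneven, which is the usual trade-off against equal frequency binning on skewed data.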
Code : Implementation of Binning Technique
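The following is a minimal sketch of both methods in plain Python, not a definitive implementation. The function names `equal_frequency` and `equal_width` are illustrative. It assumes half-open intervals for equal width binning, with the maximum value clamped into the last bin so that neither the minimum nor the maximum is dropped.

```python
def equal_frequency(values, n_bins):
    """Split the sorted values into n_bins groups of (nearly) equal size."""
    ordered = sorted(values)
    size, extra = divmod(len(ordered), n_bins)
    bins, start = [], 0
    for i in range(n_bins):
        # The first `extra` bins take one additional value when the
        # length is not an exact multiple of n_bins.
        end = start + size + (1 if i < extra else 0)
        bins.append(ordered[start:end])
        start = end
    return bins


def equal_width(values, n_bins):
    """Assign each value to one of n_bins intervals of identical width."""
    low, high = min(values), max(values)
    w = (high - low) / n_bins
    bins = [[] for _ in range(n_bins)]
    for v in sorted(values):
        # Index of the interval [low + i*w, low + (i+1)*w) containing v;
        # the maximum is clamped into the last bin.
        idx = min(int((v - low) / w), n_bins - 1) if w > 0 else 0
        bins[idx].append(v)
    return bins


data = [5, 10, 11, 13, 15, 35, 50, 55, 72, 92, 204, 215]

print("equal frequency binning")
print(*equal_frequency(data, 3))
print("equal width binning")
print(equal_width(data, 3))
```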
Output :
equal frequency binning
[5, 10, 11, 13] [15, 35, 50, 55] [72, 92, 204, 215]
equal width binning
[[5, 10, 11, 13, 15, 35, 50, 55, 72], [92], [204, 215]]
Improved By : vc17srimouli