sklearn.Binarizer() in Python

sklearn.preprocessing.Binarizer() is a transformer class that belongs to the preprocessing module of scikit-learn. It discretizes continuous feature values into two classes, 0 and 1, by comparing each value with a threshold.

Example #1:
The pixel values of an 8-bit grayscale image range from 0 (black) to 255 (white), and one needs the image to be pure black and white. Using Binarizer() with a threshold of 127, one can convert pixel values from 0 – 127 to 0 and from 128 – 255 to 1.
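A minimal sketch of this pixel example, assuming a small hypothetical patch of grayscale values (the array below is made up for illustration and is not taken from a real image):

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical 2x4 patch of 8-bit grayscale pixel values
pixels = np.array([[0, 64, 127, 128],
                   [200, 255, 100, 130]])

# Values <= 127 become 0 (black), values > 127 become 1 (white)
binarizer = Binarizer(threshold=127)
black_white = binarizer.fit_transform(pixels)

print(black_white)
```

Because the comparison is strict, the boundary value 127 itself is mapped to 0.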

Example #2:
One has a machine record with “Success Percentage” as a feature. These values are continuous, ranging from 10% to 99%, but a researcher simply wants to use this data to predict a pass or fail status for the machine based on the other given parameters.
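This second scenario could be sketched as follows. Both the percentages and the 60% cut-off are assumptions chosen for illustration, since the example does not specify them:

```python
import numpy as np
from sklearn.preprocessing import Binarizer

# Hypothetical success percentages, one machine per row
success_pct = np.array([[10.0], [45.5], [60.0], [72.3], [99.0]])

# Assumed cut-off: above 60% counts as "pass" (1), otherwise "fail" (0)
pass_fail = Binarizer(threshold=60.0).fit_transform(success_pct)

print(pass_fail.ravel())
```

The resulting 0/1 column can then serve as the target label for a classifier trained on the remaining parameters.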

Syntax :

sklearn.preprocessing.Binarizer(*, threshold=0.0, copy=True)

Parameters :

threshold : [float, optional] Values less than or equal to the threshold are mapped to 0, all others to 1. The default threshold is 0.0.
copy : [boolean, optional] If set to False, binarization is performed in place and a copy is avoided where possible. The default is True.
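The two parameters can be illustrated with a short sketch. The input arrays are hypothetical, and whether copy=False really overwrites the input depends on its type, so the code only inspects the returned values:

```python
import numpy as np
from sklearn.preprocessing import Binarizer

values = np.array([[-1.5, 0.0, 0.5, 2.0]])

# Default threshold is 0.0: only strictly positive values map to 1
default_result = Binarizer().fit_transform(values)
print(default_result)

# copy=False asks the transformer to work in place where possible,
# so the input array itself may be overwritten
in_place = np.array([[1.0, 5.0, 10.0]])
result = Binarizer(threshold=5.0, copy=False).fit_transform(in_place)
print(result)
```

Note that 0.0 in the first array and 5.0 in the second both map to 0, since values equal to the threshold are not counted as above it.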

Return :

Binarized feature values, obtained by calling transform() or fit_transform() on the Binarizer object

Download the dataset:
Go to the link and download Data.csv

Below is the Python code explaining sklearn.Binarizer()


# Python code explaining how
# to Binarize feature values
 
""" PART 1
    Importing Libraries """
 
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

# Sklearn library 
from sklearn import preprocessing

""" PART 2
    Importing Data """
 
data_set = pd.read_csv(
        'C:\\Users\\dell\\Desktop\\Data_for_Feature_Scaling.csv')
# print() is needed to display the first rows when run as a script
print(data_set.head())

# The Age and Salary feature columns
# are selected by position using slicing
# so that their values can be binarized
age = data_set.iloc[:, 1].values
salary = data_set.iloc[:, 2].values
print("\nOriginal age data values : \n", age)
print("\nOriginal salary data values : \n", salary)

""" PART 3
    Binarizing values """

from sklearn.preprocessing import Binarizer

# Binarizer expects 2-D input, so each 1-D array is reshaped
x = age
x = x.reshape(1, -1)
y = salary
y = y.reshape(1, -1)

# For age, let threshold be 35
# For salary, let threshold be 61000
# (recent scikit-learn versions require threshold as a keyword argument)
binarizer_1 = Binarizer(threshold=35)
binarizer_2 = Binarizer(threshold=61000)

# Transformed feature
print("\nBinarized age : \n", binarizer_1.fit_transform(x))

print("\nBinarized salary : \n", binarizer_2.fit_transform(y))

Output :

   Country  Age  Salary  Purchased
0   France   44   72000          0
1    Spain   27   48000          1
2  Germany   30   54000          0
3    Spain   38   61000          0
4  Germany   40    1000          1

Original age data values : 
 [44 27 30 38 40 35 78 48 50 37]

Original salary data values : 
 [72000 48000 54000 61000  1000 58000 52000 79000 83000 67000]

Binarized age : 
 [[1 0 0 1 1 0 1 1 1 1]]

Binarized salary : 
 [[1 0 0 0 0 0 0 1 1 1]]
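If the CSV file is not at hand, the result can be reproduced from the Age and Salary values printed in the output above. The sketch below builds the data in memory, keeps one sample per row (a common alternative to the single-row reshape used earlier), and checks the age result against a plain NumPy comparison:

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import Binarizer

# Age and Salary values copied from the printed output
data_set = pd.DataFrame({
    "Age": [44, 27, 30, 38, 40, 35, 78, 48, 50, 37],
    "Salary": [72000, 48000, 54000, 61000, 1000,
               58000, 52000, 79000, 83000, 67000],
})

# One sample per row, one feature per column
age = data_set[["Age"]].to_numpy()
salary = data_set[["Salary"]].to_numpy()

binarized_age = Binarizer(threshold=35).fit_transform(age).ravel()
binarized_salary = Binarizer(threshold=61000).fit_transform(salary).ravel()

# Plain NumPy equivalent of the thresholding rule
manual_age = (data_set["Age"].to_numpy() > 35).astype(int)

print(binarized_age)
print(binarized_salary)
print(np.array_equal(binarized_age, manual_age))
```

Age 35 and salary 61000 both map to 0, because values equal to the threshold fall into the lower class.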


