How to convert Categorical features to Numerical Features in Python?

Last Updated : 26 Jan, 2022

Most machine learning models cannot work directly with features that have categorical values. Categorical variables usually hold string-type values, so we have to convert those strings to numbers. This can be accomplished by creating new features based on the categories and assigning values to them. In this article, we are going to see how to convert categorical features to numerical features in Python.
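As a rough illustration of "creating new features based on the categories", the sketch below builds one 0/1 column per category with pandas get_dummies(). The toy DataFrame and its values are made up for this example and are not taken from the article's CSV file; the steps that follow use a different technique (label encoding), which keeps a single column.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only.
toy = pd.DataFrame({"origin": ["usa", "japan", "europe", "usa"]})

# One new indicator column per category; dtype=int gives 0/1 instead of booleans.
one_hot = pd.get_dummies(toy["origin"], prefix="origin", dtype=int)
print(one_hot)
```

Each row has a 1 in exactly one of the new columns, so no artificial ordering is introduced between the categories.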

Stepwise Implementation

Step 1: Import the necessary packages and modules

Python3

# import packages and modules
import pandas as pd
from sklearn import preprocessing

Step 2: Import the CSV file

We will use the pandas read_csv() method to import the CSV file. The file used in this example is cluster_mpg.csv.

Python3

# import the CSV file 
df = pd.read_csv('cluster_mpg.csv') 
print(df.head()) 

Output: the first five rows of the DataFrame are printed.

Step 3: Get all features with categorical values

We use df.info() to find the categorical features. Columns that hold string values are listed with Dtype “object”.

Python3

df.info()

Output:

In this dataset, the columns “origin” and “name” have the object dtype.
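Instead of reading the df.info() listing by eye, the object-typed columns can also be collected programmatically. The following is a minimal sketch using select_dtypes(); the small DataFrame is a hypothetical stand-in for the article's data, and the string columns are created with an explicit object dtype because, depending on the pandas version, strings may otherwise be given a dedicated string dtype.

```python
import pandas as pd

# Hypothetical stand-in for the article's DataFrame; the values are made up.
sample = pd.DataFrame({
    "mpg": pd.Series([18.0, 24.0, 26.0]),
    "origin": pd.Series(["usa", "japan", "europe"], dtype=object),
    "name": pd.Series(["car a", "car b", "car c"], dtype=object),
})

# Keep only the columns whose dtype is object and list their names.
categorical_cols = sample.select_dtypes(include="object").columns.tolist()
print(categorical_cols)
```

The resulting list can then be looped over to encode each categorical column in turn.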

Step 4: Convert string values of origin column to numerical values

We will create a preprocessing.LabelEncoder() object and fit it on the “origin” column using its fit() method.

Python3

label_encoder = preprocessing.LabelEncoder() 
label_encoder.fit(df["origin"]) 

Step 5: Get the unique values out of the categorical features

We will use label_encoder.classes_ attribute for this purpose.

classes_ : ndarray of shape (n_classes,)

Holds the label for each class.

Python3

# finding the unique classes
print(list(label_encoder.classes_))

Output:

['europe', 'japan', 'usa']
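The position of each label in classes_ is the integer code that the encoder assigns to it. The sketch below makes that mapping explicit as a dictionary; the encoder here is fitted on a short, made-up list of labels rather than on the article's CSV column.

```python
from sklearn import preprocessing

# Fit on a small made-up list of labels (illustration only).
encoder = preprocessing.LabelEncoder()
encoder.fit(["usa", "japan", "europe", "usa"])

# classes_ is sorted, and the index of each label is its numeric code.
mapping = {str(label): code for code, label in enumerate(encoder.classes_)}
print(mapping)
```

With these three labels, 'europe' maps to 0, 'japan' to 1 and 'usa' to 2.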

Step 6: Transforming the categorical values

Python3

# values after transforming the categorical column. 
print(label_encoder.transform(df["origin"])) 

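Output: an array of integer codes, one per row, in which 'europe' is encoded as 0, 'japan' as 1 and 'usa' as 2.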
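In practice the encoded values are usually stored back in the DataFrame, and it is sometimes useful to recover the original strings later. The following is a minimal sketch, assuming a made-up DataFrame named cars and a hypothetical new column name origin_code; it uses fit_transform() to fit and encode in one call and inverse_transform() to decode.

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical toy data, invented for illustration only.
cars = pd.DataFrame({"origin": ["usa", "japan", "europe", "usa"]})

encoder = preprocessing.LabelEncoder()

# Fit the encoder and store the integer codes in a new column.
cars["origin_code"] = encoder.fit_transform(cars["origin"])

# Map the integer codes back to the original string labels.
decoded = encoder.inverse_transform(cars["origin_code"])

print(cars)
print(list(decoded))
```

Keeping the fitted encoder around matters: the same object has to be reused to transform new data or to decode predictions consistently.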
