How to Create Dummy Variables in Python with Pandas?
Last Updated: 16 Jan, 2022
A dataset may contain various types of values, and some of them are categorical. To use categorical values efficiently in a program, we create dummy variables. A dummy variable is a binary variable that indicates whether a separate categorical variable takes on a specific value.
Explanation:
For example, if a temperature attribute takes three categorical values, three dummy variables are created, one for each value. We can create dummy variables in Python using the pandas get_dummies() function.
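To make the definition concrete, here is a minimal sketch that builds the indicator columns by hand, without get_dummies(). The sample values and the name `manual` are assumptions chosen for illustration; the column naming simply mimics the `column_value` pattern.

```python
import pandas as pd

# Hypothetical sample data, chosen for illustration only
df = pd.DataFrame({'Temperature': ['Hot', 'Cold', 'Warm', 'Cold']})

# One binary column per distinct value: 1 where the row matches, else 0
manual = pd.DataFrame({
    'Temperature_' + value: (df['Temperature'] == value).astype(int)
    for value in sorted(df['Temperature'].unique())
})
print(manual)
```

Each row has exactly one 1, in the column that matches its original category.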
Syntax: pandas.get_dummies(data, prefix=None, prefix_sep='_', ...)
Parameters:
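- data: the input data, e.g. a pandas DataFrame, a Series, a list, or a NumPy array.
- prefix: a string (or a list or dict of strings) prepended to the names of the dummy columns.
- prefix_sep: the separator placed between the prefix and the category value. The default is '_'.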
Return Type: A DataFrame containing the dummy-coded data.
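The effect of prefix and prefix_sep is easiest to see on the resulting column names. The following is a small sketch, assuming a hypothetical Series of temperature labels. The `dtype=int` argument is not part of the syntax shown above; it is an additional get_dummies() parameter used here only to get 0/1 integers instead of booleans.

```python
import pandas as pd

# Hypothetical sample data, chosen for illustration only
s = pd.Series(['Hot', 'Cold', 'Warm', 'Cold'])

# prefix is prepended to each category name; prefix_sep joins the two parts
dummies = pd.get_dummies(s, prefix='temp', prefix_sep='-', dtype=int)
print(dummies.columns.tolist())
print(dummies)
```

Without a prefix, a Series produces columns named after the category values alone.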
Step-by-step Approach:
- Import necessary modules
- Create or load the data
- Call get_dummies() on the data to get the dummy variables
Example 1:
Python3
import pandas as pd

# Sample data with one categorical column
df = pd.DataFrame({'Temperature': ['Hot', 'Cold', 'Warm', 'Cold']})
print(df)

# One dummy column is created per distinct Temperature value
print(pd.get_dummies(df))
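Output:
The original DataFrame is printed first. get_dummies() then returns a DataFrame with the columns Temperature_Cold, Temperature_Hot and Temperature_Warm, where each row is marked only in the column matching its temperature. The indicator values are booleans in pandas 2.0 and later, and 0/1 integers in earlier versions.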
Example 2:
Here a Series built from a list is passed to get_dummies().
Python3
import pandas as pd

# Series of single-character categories: 'a', 'b', 'c', 'a'
s = pd.Series(list('abca'))
print(s)

# One dummy column is created per distinct character
print(pd.get_dummies(s))
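Output:
The Series is printed first. get_dummies() then returns a DataFrame with the columns a, b and c. Rows 0 and 3 are marked in column a, row 1 in column b, and row 2 in column c.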
Example 3:
Here is another example, using a DataFrame with two string columns and one numeric column.
Python3
import pandas as pd

# Columns A and B are categorical (strings); column C is numeric
df = pd.DataFrame({'A': ['hello', 'vignan', 'geeks'],
                   'B': ['vignan', 'hello', 'hello'],
                   'C': [1, 2, 3]})
print(df)

# Only the string columns are converted to dummy columns
print(pd.get_dummies(df))
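Output:
The original DataFrame is printed first. get_dummies() then returns a DataFrame with the columns C, A_geeks, A_hello, A_vignan, B_hello and B_vignan. The numeric column C is left unchanged, while the string columns A and B are each replaced by one indicator column per distinct value.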
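The result of Example 3 can also be inspected programmatically. The sketch below rebuilds the same DataFrame and checks which columns were encoded; the names `encoded` and `a_columns` are assumptions introduced for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': ['hello', 'vignan', 'geeks'],
                   'B': ['vignan', 'hello', 'hello'],
                   'C': [1, 2, 3]})
encoded = pd.get_dummies(df)

# The numeric column C comes first and is unchanged; A and B are expanded
print(encoded.columns.tolist())

# Within the dummies of one source column, exactly one entry is set per row
a_columns = [c for c in encoded.columns if c.startswith('A_')]
print(encoded[a_columns].sum(axis=1).tolist())
```

The per-row sum of 1 across the A_ columns reflects that each row belongs to exactly one category of A.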