
Python Pandas – get_dummies() method
  • Last Updated : 13 Oct, 2020

pandas.get_dummies() converts categorical data into dummy (indicator) variables. Each distinct category becomes its own column, and every row is marked in the column that matches its value.

Syntax: pandas.get_dummies(data, prefix=None, prefix_sep='_', dummy_na=False, columns=None, sparse=False, drop_first=False, dtype=None)

Parameters:

  • data: Array-like, Series, or DataFrame whose values are to be converted into dummy variables.
  • prefix: String to prepend to the new column names. When calling get_dummies on a DataFrame, pass a list with length equal to the number of columns being encoded, or a dictionary mapping column names to prefixes. Default value is None.
  • prefix_sep: Separator/delimiter placed between the prefix and the category value. Default is '_'.
  • dummy_na: Whether to add a column that indicates NaN values. Default value is False, in which case NaNs are ignored.
  • columns: Column names in the DataFrame to encode. Default value is None, in which case all columns with object or category dtype are converted.
  • sparse: Specifies whether the dummy-encoded columns should be backed by a SparseArray (True) or a regular NumPy array (False). Default value is False.
  • drop_first: Whether to remove the first level, giving k-1 dummies out of k categorical levels. Default value is False.
  • dtype: Data type for the new columns. Only a single dtype is allowed. Default value is np.uint8 in pandas versions before 2.0 and bool from pandas 2.0 onward.

Returns: DataFrame (dummy-coded data)
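As a minimal sketch of how columns, drop_first and prefix_sep interact, consider the snippet below. The DataFrame and its column names (color, size, qty) are hypothetical sample data, and dtype=int is passed only so that the indicators are 0/1 regardless of the pandas version.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
df = pd.DataFrame({
    "color": ["red", "green", "red"],
    "size": ["S", "M", "S"],
    "qty": [1, 2, 3],
})

# Encode only 'color'; 'size' and 'qty' pass through unchanged.
only_color = pd.get_dummies(df, columns=["color"], dtype=int)
print(list(only_color.columns))
# ['size', 'qty', 'color_green', 'color_red']

# drop_first=True drops the first level of each encoded column (k-1 dummies).
reduced = pd.get_dummies(df, columns=["color", "size"], drop_first=True, dtype=int)
print(list(reduced.columns))
# ['qty', 'color_red', 'size_S']

# prefix_sep changes the separator between the prefix and the level.
dotted = pd.get_dummies(df["color"], prefix="color", prefix_sep=".", dtype=int)
print(list(dotted.columns))
# ['color.green', 'color.red']
```

Columns that are not encoded come first in the result, followed by the dummy columns, whose levels are sorted.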

Example 1:




import pandas as pd
 
con = pd.Series(list('abcba'))
print(pd.get_dummies(con))



 
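Output: a DataFrame with five rows, one per element of the Series, and three indicator columns a, b and c. Each row has a single indicator set, in the column that matches its value.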

 Example 2:


import pandas as pd
import numpy as np
 
 
# list
li = ['s', 'a', 't', np.nan]
print(pd.get_dummies(li))



Output:

The result has no NaN column because dummy_na is False by default. The row for the NaN entry has no indicator set.

Example 3: (To get NaN column)


import pandas as pd
import numpy as np
 
 
# list
li = ['s', 'a', 't', np.nan]
print(pd.get_dummies(li, dummy_na=True))



Output: the result now has a fourth column, labelled NaN, that marks the missing entry.
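The difference between Examples 2 and 3 can be checked directly. The sketch below reuses the same list and compares the two calls; dtype=int is an added assumption so the indicators are 0/1 on any pandas version, and the variable names without_na and with_na are illustrative.

```python
import numpy as np
import pandas as pd

li = ['s', 'a', 't', np.nan]

# Default (dummy_na=False): the NaN entry gets a row of zeros and no column.
without_na = pd.get_dummies(li, dtype=int)
print(without_na)

# dummy_na=True: an extra column, labelled NaN, marks the missing entry.
with_na = pd.get_dummies(li, dummy_na=True, dtype=int)
print(with_na)

# Row sums show whether every row has exactly one indicator set.
print(without_na.sum(axis=1).tolist())  # [1, 1, 1, 0]
print(with_na.sum(axis=1).tolist())     # [1, 1, 1, 1]
```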



Example 4:


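import pandas as pd

# DataFrame built from a dictionary: two object columns and one numeric column
diff = pd.DataFrame({'R': ['a', 'c', 'd'],
                     'T': ['d', 'a', 'c'],
                     'S_': [1, 2, 3]})

# one prefix per encoded column: 'R' -> 'column1', 'T' -> 'column2'
print(pd.get_dummies(diff, prefix=['column1', 'column2']))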



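Output: the numeric column S_ is left unchanged, and the object columns R and T are replaced by indicator columns prefixed with column1_ and column2_ respectively.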
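To see which columns Example 4 produces, a short sketch can list the column labels. It also shows the dictionary form of prefix, which ties each prefix to a column name instead of relying on list order. Here dtype=int is an added assumption for version-independent 0/1 values, and the names encoded and by_name are illustrative.

```python
import pandas as pd

diff = pd.DataFrame({'R': ['a', 'c', 'd'],
                     'T': ['d', 'a', 'c'],
                     'S_': [1, 2, 3]})

# List form: prefixes are matched to the encoded columns in order (R, then T).
encoded = pd.get_dummies(diff, prefix=['column1', 'column2'], dtype=int)
print(list(encoded.columns))
# ['S_', 'column1_a', 'column1_c', 'column1_d', 'column2_a', 'column2_c', 'column2_d']

# Dictionary form: each prefix is tied to its column by name.
by_name = pd.get_dummies(diff, prefix={'R': 'column1', 'T': 'column2'}, dtype=int)
print(list(by_name.columns) == list(encoded.columns))  # True
```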

