Change the data type of a column or a Pandas Series
Last Updated: 17 Aug, 2020
A pandas Series is a one-dimensional labeled array that can hold data of any type: integers, strings, floats, Python objects, and so on. Its axis labels are collectively called the index.
This article shows how to change the data type of a DataFrame column or of a Series in pandas.
Method 1: Using the DataFrame.astype() method.
Pass any Python, NumPy, or pandas data type to cast every column of the DataFrame to that type, or pass a dictionary that maps column names to data types to cast only the selected columns.
Syntax: DataFrame.astype(dtype, copy=True, errors='raise')
Returns: an object of the same type as the caller, cast to the requested dtype.
Let’s see the examples:
Example 1: Change the data type of every column to “str”.
Python3
import pandas as pd

# Sample DataFrame with two integer columns and one string column
df = pd.DataFrame({'srNo': [1, 2, 3],
                   'Name': ['Geeks', 'for', 'Geeks'],
                   'id': [111, 222, 333]})
print(df)
print(df.dtypes)
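Output: the three-row DataFrame, followed by its dtypes. “srNo” and “id” are integer columns and “Name” holds strings.
Now, change the data type of every column to string.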
Python3
# Cast every column to str
df = df.astype(str)
print(df.dtypes)
Output: all three columns are now reported as string columns (dtype object in pandas 1.x and 2.x).
Example 2: Now, let us change the data type of the “id” column from “int” to “str”. We create a dictionary and specify the column name with the desired data type.
Python3
import pandas as pd

# Sample DataFrame; 'No' and 'id' start out as integer columns
df = pd.DataFrame({'No': [1, 2, 3],
                   'Name': ['Geeks', 'for', 'Geeks'],
                   'id': [111, 222, 333]})
print(df)
print(df.dtypes)
Output: the DataFrame and its dtypes. “No” and “id” are integer columns and “Name” holds strings.
Now, change the data type of ‘id’ column to string.
Python3
# Map each column to convert to its target type
data_types_dict = {'id': str}
df = df.astype(data_types_dict)
print(df.dtypes)
Output: only “id” has changed to a string column; “No” is still an integer column.
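The dictionary form accepts more than one column and more than one kind of type specifier. The following is a minimal sketch on hypothetical data (the column names `score` and `group` are not part of the examples above) that mixes a Python type, a NumPy dtype name, and a pandas dtype name in one call.

```python
import pandas as pd

# Hypothetical sample data, not taken from the examples above.
df = pd.DataFrame({
    'id': [111, 222, 333],
    'score': [1, 2, 3],
    'group': ['a', 'b', 'a'],
})

# Keys are column names; values may be Python types, NumPy dtype
# names, or pandas dtype names.
converted = df.astype({
    'id': str,            # Python type
    'score': 'float32',   # NumPy dtype given by name
    'group': 'category',  # pandas categorical dtype
})

print(converted.dtypes)
```

Because astype() returns a new object, `df` keeps its original dtypes unless the result is assigned back to it.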
Example 3: Convert the data type of the “grade” column from “float” to “int”.
Python3
import pandas as pd

# Sample results; 'grade' is a float column
result_data = {'name': ['Alia', 'Rima', 'Kate', 'John',
                        'Emma', 'Misa', 'Matt'],
               'grade': [13.5, 7.1, 11.5, 3.77,
                         8.21, 21.22, 17.5],
               'qualify': ['yes', 'no', 'yes', 'no',
                           'no', 'yes', 'yes']}
df = pd.DataFrame(result_data)
print(df)
print(df.dtypes)
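Output: the seven-row DataFrame and its dtypes. “grade” is float64.
Now, we convert the data type of the “grade” column from “float” to “int”. Note that astype(int) discards the fractional part instead of rounding.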
Python3
# Cast only the 'grade' column; fractional parts are truncated
df['grade'] = df['grade'].astype(int)
print(df)
print(df.dtypes)
Output: “grade” now holds the truncated values 13, 7, 11, 3, 8, 21, 17 and has an integer dtype.
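If truncation is not what you want, one option is to round before casting. The sketch below uses a small hypothetical Series (not the DataFrame above) to compare the two approaches; it is an illustration, not part of the original example.

```python
import pandas as pd

# Hypothetical values chosen to show the difference.
grades = pd.Series([13.5, 7.1, 3.77, 17.5], name='grade')

truncated = grades.astype(int)          # drops the fractional part
rounded = grades.round().astype(int)    # rounds first, then casts

print(truncated.tolist())  # [13, 7, 3, 17]
print(rounded.tolist())    # [14, 7, 4, 18]
```

The same astype() call works on a standalone Series exactly as it does on a DataFrame column.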
Method 2: Using the DataFrame.apply() method.
Pass pandas.to_numeric, pandas.to_datetime, or pandas.to_timedelta to apply() to convert one or more columns to a numeric, datetime, or timedelta type, respectively.
Syntax: DataFrame.apply(func, axis=0, args=(), **kwargs) or Series.apply(func, args=(), **kwargs)
Returns: the DataFrame or Series that results from applying the function.
Let’s see the example:
Example: Convert the data type of the “B” column, which mixes integers and numeric strings and is therefore stored as “object”, to “int”.
Python3
import pandas as pd

# 'B' and 'C' mix numbers with numeric strings, so pandas stores
# both columns with the object dtype
df = pd.DataFrame({
    'A': ['a', 'b', 'c', 'd', 'e'],
    'B': [12, 22, 35, '47', '55'],
    'C': [1.1, '2.1', 3.0, '4.1', '5.1']})
print(df)
print(df.dtypes)
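Output: the five-row DataFrame and its dtypes. “B” and “C” are object columns because they mix numbers and numeric strings.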
Now, we convert the datatype of column “B” into an “int” type.
Python3
# Apply pd.to_numeric to the selected column(s)
df[['B']] = df[['B']].apply(pd.to_numeric)
print(df.dtypes)
Output: “B” is now int64; “A” and “C” are unchanged.
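The same pattern should carry over to pandas.to_datetime and pandas.to_timedelta, which Method 2 mentions but does not demonstrate. The sketch below uses hypothetical columns (`amount`, `when`, `elapsed`) and also passes errors='coerce' to pandas.to_numeric, an option of that function that replaces unparseable values with NaN instead of raising.

```python
import pandas as pd

# Hypothetical data; the column names are for illustration only.
df = pd.DataFrame({
    'amount': ['10', '20', 'n/a'],
    'when': ['2020-08-17', '2020-08-18', '2020-08-19'],
    'elapsed': ['1 days', '2 days', '3 days'],
})

# Extra keyword arguments given to apply() are forwarded to the
# function, so errors='coerce' reaches pd.to_numeric and turns
# the unparseable 'n/a' into NaN.
df[['amount']] = df[['amount']].apply(pd.to_numeric, errors='coerce')

# Parse date strings and duration strings column by column.
df[['when']] = df[['when']].apply(pd.to_datetime)
df[['elapsed']] = df[['elapsed']].apply(pd.to_timedelta)

print(df.dtypes)
```

Because “amount” now contains NaN, it ends up as a float column rather than an integer one.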