Change Data Type for one or more columns in Pandas Dataframe
Last Updated: 29 Sep, 2023
pandas offers several ways to change the data type of one or more DataFrame columns. This article covers DataFrame.astype(), DataFrame.apply() with the pandas.to_* converters, DataFrame.infer_objects(), and DataFrame.convert_dtypes().
Change column type into string object using DataFrame.astype()
DataFrame.astype() casts a pandas object to a specified dtype. It can also convert any suitable existing column to a categorical type. Passing str, as in the example below, converts every column to strings, which the output reports as the object dtype.
Python3
import pandas as pd

# Sample DataFrame; column C mixes floats, ints, and numeric strings
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': ['a', 'b', 'c', 'd', 'e'],
    'C': [1.1, '1.0', '1.3', 2, 5]})

# Cast every column to string
df = df.astype(str)
print(df.dtypes)
Output:
A    object
B    object
C    object
dtype: object
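The categorical conversion mentioned above can be sketched in the same way. This is a minimal illustration, assuming a hypothetical DataFrame with a repetitive text column named grade; the column names and values are invented for the example.

```python
import pandas as pd

# Hypothetical data: 'grade' holds a small set of repeated labels
df = pd.DataFrame({
    "grade": ["a", "b", "a", "c"],
    "score": [1, 2, 3, 4],
})

# Cast a single column to the categorical dtype; 'score' is untouched
df["grade"] = df["grade"].astype("category")

print(df.dtypes)
print(df["grade"].cat.categories.tolist())  # ['a', 'b', 'c']
```

Assigning the result back to one column leaves the rest of the DataFrame as it was, which is often preferable to casting the whole frame.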
Change column type in pandas using dictionary and DataFrame.astype()
astype() accepts any Python, NumPy, or pandas dtype, which is then applied to every column of the DataFrame. To change only selected columns, pass a dictionary that maps column names to dtypes; columns not listed in the dictionary keep their current type.
Python3
import pandas as pd

df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': ['a', 'b', 'c', 'd', 'e'],
    'C': [1.1, '1.0', '1.3', 2, 5]})

# Map column names to target dtypes; column B is left as it is
convert_dict = {'A': int,
                'C': float}

df = df.astype(convert_dict)
print(df.dtypes)
Output:
A      int64
B     object
C    float64
dtype: object
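The dictionary form also accepts dtype names as strings, and astype() raises an error when a value cannot be converted. The sketch below illustrates both points; the data, including the deliberately invalid entry "oops", is hypothetical.

```python
import pandas as pd

# Hypothetical data: column C contains one value that is not a number
df = pd.DataFrame({
    "A": [1, 2, 3],
    "B": ["a", "b", "c"],
    "C": ["1.5", "2.5", "oops"],
})

# String aliases such as "int32" work as well as Python types
df = df.astype({"A": "int32"})
print(df["A"].dtype)

# astype() does not skip bad values: an unconvertible entry raises ValueError
try:
    df.astype({"C": float})
    failed = False
except ValueError as exc:
    failed = True
    print("Conversion failed:", exc)
```

When a column may contain such values, a converter that can coerce errors, such as pandas.to_numeric, is usually the safer choice.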
Change column type in pandas using DataFrame.apply()
Passing pandas.to_numeric, pandas.to_datetime, or pandas.to_timedelta to apply() converts one or more columns to numeric, datetime, or timedelta types, respectively.
Python3
import pandas as pd

# Columns A and C mix numbers with numeric strings
df = pd.DataFrame({
    'A': [1, 2, 3, '4', '5'],
    'B': ['a', 'b', 'c', 'd', 'e'],
    'C': [1.1, '2.1', 3.0, '4.1', '5.1']})

# apply() calls pd.to_numeric on each selected column
df[['A', 'C']] = df[['A', 'C']].apply(pd.to_numeric)
print(df.dtypes)
Output:
A      int64
B     object
C    float64
dtype: object
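Extra keyword arguments given to apply() are forwarded to the converter, so options such as errors="coerce" can be used to turn unparseable values into NaN instead of raising. The following is a minimal sketch with hypothetical column names (amount, qty, when) that also shows pandas.to_datetime used the same way.

```python
import pandas as pd

# Hypothetical data: 'amount' contains one entry that is not a number
df = pd.DataFrame({
    "amount": ["10", "20.5", "n/a"],
    "qty": ["1", "2", "3"],
    "when": ["2023-01-01", "2023-02-15", "2023-03-31"],
})

# errors="coerce" is passed through apply() to pd.to_numeric
df[["amount", "qty"]] = df[["amount", "qty"]].apply(pd.to_numeric, errors="coerce")

# The same pattern works with pd.to_datetime
df[["when"]] = df[["when"]].apply(pd.to_datetime)

print(df.dtypes)
print(df["amount"].isna().tolist())  # [False, False, True]
```

Coercing is convenient for messy input, but it silently replaces bad values with NaN, so it is worth checking how many values were lost afterwards.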
Change column type in pandas using DataFrame.infer_objects()
DataFrame.infer_objects() attempts a soft conversion by inferring a better dtype for object-dtype columns. Non-object and unconvertible columns are left unchanged.
Python3
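import pandas as pd

# dtype='object' forces every column to be stored as generic Python objects
df = pd.DataFrame({
    'A': [1, 2, 3, 4, 5],
    'B': ['a', 'b', 'c', 'd', 'e'],
    'C': [1.1, 2.1, 3.0, 4.1, 5.1]
}, dtype='object')

# Infer a more specific dtype for each object column where possible
df = df.infer_objects()
print(df.dtypes)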
Output:
A      int64
B     object
C    float64
dtype: object
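"Soft" conversion means that infer_objects() only recognises values that are already numbers; it does not parse strings. The sketch below, using hypothetical column names, contrasts a column of real integers with a column of digit strings.

```python
import pandas as pd

# Hypothetical data: 'ints' holds real integers, 'digits' holds strings
df = pd.DataFrame({
    "ints": [1, 2, 3],
    "digits": ["1", "2", "3"],
}, dtype="object")

out = df.infer_objects()

# 'ints' becomes an integer column; 'digits' is not parsed and stays non-numeric
print(out.dtypes)
```

To turn the digit strings into numbers, an explicit converter such as pandas.to_numeric or astype() is needed.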
Change column type in pandas using convert_dtypes()
The convert_dtypes() method returns a new DataFrame in which each column is converted to the best possible dtype that supports pd.NA.
Python3
import pandas as pd

# pd.NA marks the missing values, so both columns start as object dtype
data = {
    "name": ["Aman", "Hardik", pd.NA],
    "qualified": [True, False, pd.NA]
}
df = pd.DataFrame(data)

print("Original_dtypes:")
print(df.dtypes)

# Returns a new DataFrame; df itself is not modified
newdf = df.convert_dtypes()

print("New_dtypes:")
print(newdf.dtypes)
Output:
Original_dtypes:
name         object
qualified    object
dtype: object
New_dtypes:
name          string
qualified    boolean
dtype: object
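convert_dtypes() is also useful for numeric columns that became float only because they contain missing values. The sketch below uses hypothetical columns (count, ratio) to show whole-number floats moving to the nullable Int64 dtype while true fractions move to Float64.

```python
import numpy as np
import pandas as pd

# Hypothetical data: NaN forces both columns to float64 on creation
df = pd.DataFrame({
    "count": [1.0, 2.0, np.nan],   # whole numbers stored as floats
    "ratio": [0.5, 1.5, np.nan],   # genuine fractional values
})

out = df.convert_dtypes()

# 'count' can be represented as nullable integers; 'ratio' stays floating
print(out.dtypes)
print(out["count"].isna().sum())  # 1
```

The missing entries are kept as pd.NA in the converted columns rather than being dropped or filled.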