
Python | Pandas DataFrame.astype()


Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.

The DataFrame.astype() method is used to cast a pandas object to a specified dtype. The astype() function can also convert any suitable existing column to a categorical type.

The DataFrame.astype() function comes in very handy when we want to convert a particular column from one data type to another. We can also pass a Python dictionary to change the type of more than one column at once. Each key in the dictionary is a column name, and the corresponding value is the new data type we want that column to have.
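The dictionary form can be tried without any CSV file. The following is a minimal sketch on a small in-memory DataFrame; the sample values are made up for illustration, and only the column names mirror the ones used later in this article.

```python
import pandas as pd

# Made-up sample data; not taken from nba.csv.
df = pd.DataFrame({
    "Name": ["Avery", "Jae", "John"],
    "Age": [25.0, 25.0, 27.0],
    "Weight": [180.0, 235.0, 205.0],
})

# Keys are column labels, values are the target dtypes.
# Columns that are not mentioned (Weight here) keep their current dtype.
converted = df.astype({"Name": "category", "Age": "int64"})

print(df.dtypes)         # dtypes before the cast
print(converted.dtypes)  # dtypes after the cast
```

Note that astype() returns a new object here, so df itself still has its original dtypes unless the result is assigned back to it.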

Pandas DataFrame.astype() Syntax

Syntax: DataFrame.astype(dtype, copy=True, errors='raise', **kwargs)

Parameters:

  • dtype : Use a numpy.dtype or Python type to cast entire pandas object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type to cast one or more of the DataFrame’s columns to column-specific types.
  • copy : Return a copy when copy=True (be very careful setting copy=False as changes to values then may propagate to other pandas objects).
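  • errors : Control raising of exceptions on invalid data for the provided dtype. It accepts one of two values:
      • raise : allow exceptions to be raised (the default)
      • ignore : suppress exceptions; on error, return the original object
  • kwargs : keyword arguments to pass on to the constructor (recent pandas releases no longer accept extra keyword arguments here)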

Returns: casted : an object of the same type as the caller
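The parameters above can be illustrated with a short sketch. It uses a hypothetical two-column DataFrame (the names qty and price are made up) in which one value cannot be parsed as an integer, and it shows a single dtype applied to the whole object as well as the two settings of errors.

```python
import pandas as pd

# Hypothetical data: the last value in "qty" cannot be parsed as an integer.
df = pd.DataFrame({"qty": ["1", "2", "abc"], "price": [1.5, 2.0, 3.25]})

# A single dtype casts every column of the object to that type.
as_text = df.astype(str)

# errors="raise" (the default) lets the conversion error propagate.
try:
    df.astype({"qty": "int64"})
    raised = False
except (ValueError, TypeError) as exc:
    raised = True
    print("Conversion failed:", exc)

# errors="ignore" suppresses the exception and hands back the data unchanged.
unchanged = df.astype({"qty": "int64"}, errors="ignore")
print(unchanged.equals(df))
```

Because errors="ignore" fails silently, it is usually worth checking the resulting dtypes afterwards to confirm whether the cast actually happened.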

The code examples below use a CSV file named “nba.csv”.

Python Pandas DataFrame.astype() Function Examples

Below are some examples of Pandas DataFrame.astype() Function:

Example 1: Convert the Weight Column Data Type

In this example, the code uses pandas to read data from a CSV file named “nba.csv” into a DataFrame and prints the first 10 rows for initial data exploration.

Python3




# importing pandas as pd
import pandas as pd
 
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
 
# Printing the first 10 rows
print(df.head(10))


The data contains some NaN values, and a column that holds NaN cannot be cast to int64. To avoid an error, we first drop every row that contains a NaN value.

Python3




# drop all those rows which
# have any 'nan' value in it.
df.dropna(inplace=True)
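To see why the rows with missing values are dropped first, here is a small sketch on a made-up Series (the values are illustrative, not taken from nba.csv). It attempts the int64 cast with a NaN present, then repeats it after dropna().

```python
import numpy as np
import pandas as pd

# Made-up weights with one missing value.
weights = pd.Series([180.0, np.nan, 205.0], name="Weight")

# NaN has no int64 representation, so the direct cast fails.
try:
    weights.astype("int64")
    failed = False
except ValueError as exc:
    failed = True
    print("Cast failed:", exc)

# After dropping the missing value, the same cast succeeds.
clean = weights.dropna().astype("int64")
print(clean.dtype)
```

Dropping rows is the approach this article takes; whether discarding data is acceptable depends on the analysis at hand.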


Here, the variable before stores the type of the first element in the ‘Weight’ column before conversion, and after stores the type of the same element once the entire ‘Weight’ column has been cast to ‘int64’. Both values are printed at the end. If pandas read the column as floating point (which it does for numeric columns that contain missing values), before should be numpy.float64 and after should be numpy.int64. The first element is selected with .iloc[0] rather than by label, because the row labelled 0 may no longer exist after rows have been dropped.

Python3




# let's find out the data type of the first Weight value
before = type(df["Weight"].iloc[0])
 
# Now we will convert the column to 'int64' type.
df["Weight"] = df["Weight"].astype('int64')
 
# let's find out the data type after casting
after = type(df["Weight"].iloc[0])
 
# print the value of before
print(before)
 
# print the value of after
print(after)



We will now print the DataFrame.

Python3




# print the data frame and see
# what it looks like after the change
print(df)
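Example 1 checks the type of a single element. As an alternative sketch, the column's dtype attribute reports the type of the whole column directly; the values below are made up and simply stand in for the Weight column.

```python
import pandas as pd

# Made-up values standing in for the Weight column.
weight = pd.Series([180.0, 235.0, 205.0], name="Weight")

# dtype describes the whole column, not just one element.
before = weight.dtype

# Cast the column and read the dtype again.
weight = weight.astype("int64")
after = weight.dtype

print(before, "->", after)
```

Reading dtype avoids picking a particular row, so it works the same way regardless of which index labels remain after cleaning.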


Example 2: Change the Data Type of More than One Column at Once

In this example, we change the Name column to the categorical type and the Age column to int64. The code first reads the CSV file “nba.csv” into a DataFrame (df) and drops the rows containing NaN values. It then calls df.info() to display the current data type of each column in the cleaned DataFrame.

Python3




# importing pandas as pd
import pandas as pd
 
# Making data frame from the csv file
df = pd.read_csv("nba.csv")
 
# Drop the rows with 'nan' values
df = df.dropna()
 
# print the existing data type of each column
df.info()



Now let’s change the data types of both columns at once.

Python3




# Passed a dictionary to astype() function
df = df.astype({"Name":'category', "Age":'int64'})
 
# Now print the data type
# of all columns after change
df.info()



Now, we will print the DataFrame.

Python3




# print the data frame
# too after the change
print(df)





Last Updated : 03 Dec, 2023