Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
astype() is one of the most important methods. It is used to change the data type of a series or of one or more data frame columns. When a data frame is made from a CSV file, the columns are imported and their data types are inferred automatically, and the inferred type is often not the one the column should have. For example, a salary column could be imported as strings, but to do arithmetic on it we have to convert it into float.
astype() is used to do such data type conversions.
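The salary case described above can be sketched as follows. This is a minimal illustration, not the article's original code: the data frame, its column names, and its values are made up for the example.

```python
import pandas as pd

# Hypothetical data: salaries stored as text, as can happen after reading a CSV
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Salary": ["1000.5", "2500", "400.25"],
})

# The column holds strings at this point, so numeric operations would not work as intended
print(df["Salary"].dtype)

# Convert the column to float and assign it back
df["Salary"] = df["Salary"].astype(float)
print(df["Salary"].dtype)

# Arithmetic now behaves numerically
total = df["Salary"].sum()
print(total)
```

Note that astype() returns a new object, so the result has to be assigned back to the column for the change to persist.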
Syntax: DataFrame.astype(dtype, copy=True, errors='raise')
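dtype: Data type to convert the series into (for example str, float, int). A dict mapping column names to data types can also be passed to convert only those columns.
copy: If True, returns a copy of the data frame/series.
errors: Controls whether an exception is raised when the data cannot be converted to the requested type, for example non-numeric text to int. 'raise' raises the exception and 'ignore' suppresses it and returns the original object.
Return type: Same type as the caller (data frame or series), with the data types changed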
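A short sketch of how these parameters behave, assuming a small made-up data frame (the column names and values are hypothetical). It shows the dict form of dtype and the default errors='raise' behaviour when a value cannot be converted.

```python
import pandas as pd

# Hypothetical frame in which every column is stored as text
df = pd.DataFrame({
    "Number": ["7", "12", "30"],
    "Weight": ["180.5", "235", "205"],
    "Team": ["X", "Y", "Z"],
})

# A dict converts only the named columns; the result is a new data frame
converted = df.astype({"Number": int, "Weight": float})
print(converted.dtypes)

# With errors='raise' (the default), text that is not numeric cannot become int
try:
    df.astype({"Team": int})
    failed = False
except ValueError as exc:
    failed = True
    print("Conversion failed:", exc)
```

The original df is left unchanged by both calls, since astype() does not modify the caller in place.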
In the following examples, the data frame used contains data of some NBA players.
In this example, the data frame is imported and .dtypes is called on it to view the data type of each column. After that, some columns are converted using the .astype() method, and .dtypes is viewed again to see the changes.
After the conversion, the data types of the columns are changed accordingly.
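The steps of this example can be sketched as below. Because the original data set is not reproduced here, the sketch builds a tiny stand-in data frame in memory; the player names, numbers, and salaries are invented for illustration, and the choice of which columns to convert is an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the NBA data set; all values are made up
data = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Number": [0.0, 99.0, 30.0],
    "Salary": [5000000.0, 6500000.0, 1200000.0],
})

# Data types before any conversion
before = data.dtypes
print(before)

# Convert the jersey number to text and the salary to integer
data["Number"] = data["Number"].astype(str)
data["Salary"] = data["Salary"].astype(int)

# Data types after the conversion
after = data.dtypes
print(after)
```

Comparing the two printed listings shows which columns changed type and which were left as they were.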