Python is a great language for doing data analysis, primarily because of the fantastic ecosystem of data-centric Python packages. Pandas is one of those packages and makes importing and analyzing data much easier.
astype() is one of the most important methods. It is used to change the data type of a series. When a data frame is created from a CSV file, the data type of each column is inferred automatically, and the inferred type is often not the one the column should have. For example, a salary column could be imported as a string, but to perform arithmetic on it we have to convert it to float.
astype() is used to do such data type conversions.
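The salary case described above can be sketched as follows. This is a minimal illustration, not the article's data set: the column names and values are made up, and the strings stand in for a column that was imported as text.

```python
import pandas as pd

# Hypothetical data: salaries that arrived as strings (for example from a CSV)
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Salary": ["50000.5", "62000", "48000.25"],
})

# Convert the string column to float so arithmetic works on it
df["Salary"] = df["Salary"].astype(float)

# Numeric operations are now possible
total = df["Salary"].sum()
print(df.dtypes)
print(total)
```

Without the conversion, summing the column would concatenate the strings rather than add the numbers.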
Syntax: DataFrame.astype(dtype, copy=True, errors='raise')
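dtype: Data type to convert the series into (for example str, float, int). For a data frame, a dict mapping column names to data types can also be passed to convert only specific columns.
copy: Whether to return a copy of the dataframe/series.
errors: Controls what happens when the conversion fails, for example when a non-numeric string is converted to float. 'raise' raises the error; 'ignore' suppresses it and returns the original object.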
Return type: The same type as the caller (Series or DataFrame), with changed data types
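The parameters above can be tried on a small example. The sketch below uses a made-up data frame (the column names "id" and "score" are assumptions for illustration) to show a dict passed as dtype and the default errors='raise' behaviour when a value cannot be converted.

```python
import pandas as pd

# Hypothetical data: both columns hold strings; "score" has one bad value
df = pd.DataFrame({
    "id": ["1", "2", "3"],
    "score": ["9.5", "unknown", "7.0"],
})

# A dict maps column names to target types; unlisted columns are left alone
converted = df.astype({"id": int})
print(converted.dtypes)

# With the default errors='raise', an invalid conversion raises an exception
try:
    df.astype({"score": float})
    failed = False
except ValueError as exc:
    failed = True
    print("Conversion failed:", exc)
```

Note that astype() returns a new object here; df itself keeps its original data types unless the result is assigned back.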
In the following examples, the data frame used contains data on some NBA players.
In this example, the data frame is imported and .dtypes is called on it to view the data type of each column. Some columns are then converted using the .astype() method, and the dtypes are viewed again to see the changes.
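A sketch of these steps is given below. Since the original data set is not reproduced here, the code builds a small stand-in data frame in place of reading the CSV file; the column names and all values are hypothetical and only meant to resemble player data.

```python
import pandas as pd

# Hypothetical stand-in for the NBA data set (normally loaded with pd.read_csv)
data = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Team": ["Team X", "Team X", "Team Y"],
    "Number": [0.0, 99.0, 30.0],
    "Salary": [7730337.0, 6796117.0, 1148640.0],
})

# Data types before conversion
before = data.dtypes
print(before)

# Convert Salary from float to int, and Number from float to string
data["Salary"] = data["Salary"].astype(int)
data["Number"] = data["Number"].astype(str)

# Data types after conversion
after = data.dtypes
print(after)
```

Converting a float column directly to str keeps the decimal part, so 99.0 becomes "99.0"; converting to int first would give "99" instead.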
As the output shows, the data types of the columns were converted accordingly.