Conversion Functions in Pandas DataFrame
Python is well suited to data analysis, largely because of its ecosystem of data-centric packages. Pandas is one of those packages, and it makes importing and analyzing data much easier. The examples in this article use the “nba.csv” file.
Cast a pandas object to a specified dtype
The DataFrame.astype() function casts a pandas object to a specified dtype.
It can also convert any suitable existing column to a categorical type.
Code #1: Convert the data type of the Weight column.
Because the data contain some “nan” values, we first drop all rows containing any missing value to avoid errors during the cast.
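The steps above can be sketched as follows. Since nba.csv is not reproduced here, this example assumes a small made-up dataframe with hypothetical Name, Team, and Weight columns as a stand-in. It is only an illustration of dropna() followed by astype(), not the article's original listing.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for nba.csv: made-up rows and only three columns.
df = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Team": ["Team X", "Team Y", "Team X", None],
    "Weight": [180.0, 235.0, np.nan, 205.0],
})

# Drop every row that contains at least one missing value.
df = df.dropna()

# Weight was float64 because it held NaN; with the NaN rows gone,
# it can be cast safely to an integer dtype.
df["Weight"] = df["Weight"].astype("int64")

# astype() can also turn a suitable column into a categorical column.
df["Team"] = df["Team"].astype("category")

print(df.dtypes)
```

Casting a column that still contains NaN to an integer dtype raises an error, which is why the rows are dropped first.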
Infer better data type for input object column
The DataFrame.infer_objects() function attempts to infer a better data type for object-dtyped columns. It performs a soft conversion of those columns, leaving non-object and unconvertible columns unchanged. The inference rules are the same as during normal Series/DataFrame construction.
Code #1: Use the infer_objects() function to infer a better data type.
Let’s see the dtype (data type) of each column in the dataframe.
As we can see in the output, the first and third columns are of object type, whereas the second column is of int64 type. Now slice the dataframe and create a new dataframe from it.
As we can see in the output, columns “A” and “C” are of object type even though they contain integer values. So, let’s try the infer_objects() function.
Now, if we look at the dtype of each column, we can see that columns “A” and “C” are now of int64 type.
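A minimal sketch of this walkthrough is shown below. The column values are illustrative assumptions: columns “A” and “C” start with a string so that pandas stores them as object, and the first row is then sliced away.

```python
import pandas as pd

# Illustrative data: the string in the first row forces "A" and "C" to object dtype.
df = pd.DataFrame({
    "A": ["sofia", 5, 8, 11, 100],
    "B": [2, 8, 77, 4, 11],
    "C": ["amy", 11, 4, 6, 9],
})
print(df.dtypes)

# Slice off the first row; the remaining values in "A" and "C" are all integers,
# but the columns keep their object dtype.
df_new = df[1:]
print(df_new.dtypes)

# Soft conversion: object columns holding only integers are re-inferred.
df_new = df_new.infer_objects()
print(df_new.dtypes)
```

Note that infer_objects() returns a new dataframe rather than modifying the original in place, so the result has to be assigned.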
Detect missing values
The DataFrame.isna() function detects missing values. It returns a boolean object of the same size indicating whether each value is NA. NA values, such as None or numpy.nan, are mapped to True; everything else is mapped to False. Values such as empty strings ('') or numpy.inf are not considered NA (in older pandas versions this could be changed by setting pandas.options.mode.use_inf_as_na = True, an option that has been deprecated since pandas 2.1).
Code #1: Use the isna() function to detect the missing values in a dataframe.
Let’s use the isna() function to detect the missing values.
In the output, cells corresponding to missing values contain True; all other cells contain False.
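The behaviour described above can be checked with a small sketch. The dataframe below is made up for illustration; it includes an empty string and numpy.inf to show that neither counts as missing by default.

```python
import numpy as np
import pandas as pd

# Made-up data containing NaN, None, an empty string, and infinity.
df = pd.DataFrame({
    "A": [12.0, np.nan, 5.0],
    "B": ["x", "", None],
    "C": [np.inf, 2.0, 3.0],
})

# Boolean dataframe of the same shape: True where a value is missing.
mask = df.isna()
print(mask)

# Summing the mask counts the missing values in each column.
print(mask.sum())
```

Only the NaN in column “A” and the None in column “B” are flagged; the empty string and the infinite value are treated as ordinary values.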
Detecting existing/non-missing values
The DataFrame.notna() function detects existing (non-missing) values in the dataframe. It returns a boolean object of the same size as the object it is applied to, indicating whether each individual value is non-missing. Non-missing values are mapped to True and missing values are mapped to False.
Code #1: Use the notna() function to find all the non-missing values in the dataframe.
Let’s use the dataframe.notna() function to find all the non-missing values in the dataframe.
As we can see in the output, all the non-missing values in the dataframe have been mapped to True. There is no False value because there is no missing value in the dataframe.
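As a rough illustration, the sketch below applies notna() to a made-up dataframe with no missing values, and then to a second hypothetical series with one missing entry for contrast. Neither dataset comes from the original article.

```python
import pandas as pd

# Made-up dataframe with no missing values.
df = pd.DataFrame({
    "A": [14, 4, 5],
    "B": [5, 2, 54],
    "C": ["a", "b", "c"],
})

# Every cell is present, so every entry in the result is True.
present = df.notna()
print(present)

# For contrast: a series with one missing entry maps that entry to False.
s = pd.Series([1.0, None, 3.0])
print(s.notna())
```

notna() is the element-wise inverse of isna(), so either one can be used to build a filter mask.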
Methods for conversion in DataFrame
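| Method | Description |
|---|---|
| DataFrame.convert_objects() | Attempt to infer better dtype for object columns. Removed in pandas 1.0; infer_objects() covers dtype inference in current versions. |
| DataFrame.copy() | Return a copy of this object’s indices and data. |
| DataFrame.bool() | Return the bool of a single element PandasObject. Deprecated since pandas 2.1. |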
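Of the methods listed, DataFrame.copy() is the one most often needed alongside type conversion. A minimal sketch with a made-up single-column dataframe shows that the default (deep) copy is independent of the original:

```python
import pandas as pd

# Made-up dataframe for illustration.
df = pd.DataFrame({"A": [1, 2, 3]})

# copy() makes a deep copy by default, duplicating both data and index.
df_copy = df.copy()

# Modifying the copy leaves the original untouched.
df_copy.loc[0, "A"] = 99

print(df.loc[0, "A"])
print(df_copy.loc[0, "A"])
```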