
Pandas Convert Column To String Type

Last Updated : 27 Jan, 2024

Pandas is a Python library widely used for analyzing and manipulating large datasets. One of its main strengths is handling and transforming data. During data preprocessing, we often need to convert a column to a specific data type. In this article, we'll look at how to convert a Pandas column to a string type.

Let us understand the different ways of converting Pandas columns to string types:

astype() Method:

The astype() method in Pandas is a straightforward way to change the data type of a column to any desired type. The astype method has the following syntax:

.astype(data_Type)

Let us understand this using an example:

Here we pass str to astype() to convert the integer column to strings. In the output below, df.info() reports the column's Dtype as object, which is how this version of Pandas stores Python strings.

Python3




import pandas as pd
 
# sample data
data = {'NumericColumn': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['NumericColumn'] = df['NumericColumn'].astype(str)
df.info()


Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 1 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   NumericColumn  4 non-null      object
dtypes: object(1)
memory usage: 160.0+ bytes
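astype() can also convert several columns at once when it is given a mapping of column names to types. The following is a minimal sketch of that idea; the DataFrame and its column names (id, score, qty) are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data; the column names are illustrative only
df = pd.DataFrame({
    "id": [101, 102, 103],
    "score": [9.5, 7.25, 8.0],
    "qty": [1, 2, 3],
})

# Pass a {column: dtype} mapping to convert only the listed columns
converted = df.astype({"id": str, "score": str})

# Every value in a converted column is now a Python str
id_values = converted["id"].tolist()
all_str = all(isinstance(v, str) for v in id_values)

print(id_values)
print(all_str)
print(converted.dtypes)  # 'qty' is not in the mapping, so it stays numeric
```

Columns that are not listed in the mapping keep their original dtype, which is useful when only identifier-like columns should become text.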

apply() Function:

The apply() function is another way to convert the data type. It calls a function on each value of the column, which gives us more flexibility in data transformations. In this method, the function is usually written as a lambda.

Lambda Function

Before we see how to convert to a string type using apply(), let's look at lambda functions. In Python, a lambda is an anonymous function that can be defined in a single line without using the def keyword. Its syntax is concise: the lambda keyword is followed by the arguments, a colon, and an expression.
It can take multiple arguments but has only one expression, which is evaluated and returned. The syntax of a lambda function is as follows:

lambda arguments: expression

The following example shows a lambda function in use:

Python3




add = lambda x, y: x + y  # Adds two numbers
result = add(5, 3)  # Calls the lambda function
print(result)


Output:

8

Now, let's see how a lambda can be used with the apply() function to convert a column to a string type. The lambda is a quick way of telling Pandas which change to apply to each value.

Python3




import pandas as pd
 
# sample data
data = {'NumericColumn': [1, 2, 3, 4]}
df = pd.DataFrame(data)
df['NumericColumn'] = df['NumericColumn'].apply(lambda x: str(x))
df.info()


Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4 entries, 0 to 3
Data columns (total 1 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   NumericColumn  4 non-null      object
dtypes: object(1)
memory usage: 160.0+ bytes
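Because str already takes a single argument, it can be passed to apply() directly, without wrapping it in a lambda. The short sketch below compares the two forms on the same sample column; it is an illustration, not a performance claim.

```python
import pandas as pd

df = pd.DataFrame({"NumericColumn": [1, 2, 3, 4]})

# str is a one-argument callable, so it can be handed to apply() as-is
direct = df["NumericColumn"].apply(str)

# Equivalent form using a lambda, as in the example above
with_lambda = df["NumericColumn"].apply(lambda x: str(x))

same_result = direct.tolist() == with_lambda.tolist()
print(direct.tolist())
print(same_result)
```

The lambda form becomes worthwhile when each value needs more than a plain str() call, for example when adding a prefix or formatting a number.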

map() Function:

The map() function is our next method for conversion. This method is useful when we need to apply conversion based on a mapping dictionary:

Python3




import pandas as pd
 
# Create a sample DataFrame
data = {'NumericColumn': [1, 2, 3, 4, 5]}
df = pd.DataFrame(data)
 
# Define a mapping dictionary for conversion
mapping_dict = {1: 'One', 2: 'Two', 3: 'Three', 4: 'Four', 5: 'Five'}
 
# Use map() to convert 'NumericColumn' based on the mapping dictionary
df['NumericColumn'] = df['NumericColumn'].map(mapping_dict)
 
# Check the DataFrame
df.info()


Output:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5 entries, 0 to 4
Data columns (total 1 columns):
 #   Column         Non-Null Count  Dtype 
---  ------         --------------  ----- 
 0   NumericColumn  5 non-null      object
dtypes: object(1)

Here, the values [1, 2, 3, 4, 5] are replaced with ['One', 'Two', 'Three', 'Four', 'Five'].
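One detail to keep in mind with a mapping dictionary is that values without a matching key become missing (NaN) in the result. The sketch below shows this and one possible fallback; the value 6 and the fallback rule are assumptions made for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 6])
mapping_dict = {1: "One", 2: "Two"}  # 6 deliberately has no entry

# Values with no matching key are turned into missing values
mapped = s.map(mapping_dict)
missing_mask = mapped.isna().tolist()

# Possible fallback: keep the plain string form of unmapped values
with_fallback = s.map(lambda x: mapping_dict.get(x, str(x))).tolist()

print(missing_mask)
print(with_fallback)
```

Checking isna() after map() is a simple way to find values that the dictionary does not cover.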

Frequently Asked Questions

Q. Why is it important to handle missing values before converting a column to a string type?

Missing values usually do not raise an error during conversion, but they can silently distort the result. When astype(str) produces an object column, as in the examples above, a missing value (NaN) becomes the literal text 'nan' and is no longer detected as missing. To avoid this, replace missing values with fillna() before converting.

Another option is to convert to the Pandas nullable string dtype, which keeps missing values as a missing-value marker (<NA>) instead of turning them into text:

astype('string')
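As a minimal sketch of handling missing values during conversion, the example below converts a column to the Pandas "string" dtype, checks that the missing entry is still recognized as missing, and then fills it. The sample values and the 'missing' placeholder are assumptions chosen for illustration.

```python
import pandas as pd

# None is stored as NaN because the other values are floats
s = pd.Series([1.5, None, 3.0])

# The "string" dtype converts the numbers to text and keeps the gap as missing
as_string = s.astype("string")
missing_preserved = as_string.isna().tolist()

# Replace the missing entry with an explicit placeholder
filled = as_string.fillna("missing").tolist()

print(missing_preserved)
print(filled)
```

Whether to fill missing values or keep them as missing depends on how the column is used later, for example whether it is exported as text or analyzed further.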

Q. How should I format the numeric values before conversion?

While converting numeric values, we need to consider rounding or formatting of the numeric values.

For this, we can use '{:.2f}'.format(x), which renders each numeric value as a string with two decimal places.

df['NumericColumn'] = df['NumericColumn'].apply(lambda x: '{:.2f}'.format(x))
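The same formatting can be tried on a small column to see its effect. This is a sketch using hypothetical sample values; an f-string is used here as an equivalent of '{:.2f}'.format(x).

```python
import pandas as pd

# Hypothetical sample values chosen for illustration
df = pd.DataFrame({"NumericColumn": [3.14159, 2.5, 10.0]})

# Format each value as text with exactly two decimal places
formatted = df["NumericColumn"].apply(lambda x: f"{x:.2f}")
values = formatted.tolist()

print(values)  # ['3.14', '2.50', '10.00']
```

Note that the result is text, so it should be produced only after any numeric calculations are finished.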

Q. Why use the apply() function with a lambda function for conversion?

The apply() function applies a given function to each element of a column. Using a lambda function inside apply() lets you define exactly how each element should be converted.


