
Rename column name with an index number of the CSV file in Pandas

In this blog post, we will learn how to rename the column name with an index number of the CSV file in Pandas.

Renaming Column Name with an Index Number

Pandas, a data manipulation library for Python, includes several methods for working with structured data such as CSV files. When a CSV file is read with header=None, Pandas labels the columns 0, 1, 2, and so on, and you may want to replace those labels, or any existing column name, with something more descriptive. Pandas makes this straightforward: look up the current label by its position in df.columns and pass it to the rename() function together with the new name.
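As a minimal sketch of the default integer labels described above, the following reads a header-less CSV and renames the columns by those labels. The in-memory CSV text and the new names are assumptions made for illustration, not part of the dataset used later in this article.

```python
import io
import pandas as pd

# Header-less CSV text, held in memory for illustration (hypothetical sample data)
csv_text = "Alice,12,F\nBob,13,M\n"

# header=None tells pandas there is no header row, so columns are labeled 0, 1, 2
df = pd.read_csv(io.StringIO(csv_text), header=None)
print(list(df.columns))

# The integer labels can be used directly as keys in rename()
df = df.rename(columns={0: "Name", 1: "Age", 2: "Gender"})
print(list(df.columns))
```

Here the keys 0, 1 and 2 are the actual column labels, which is why they can be passed to rename() without going through df.columns.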



Methods to Rename Column Names

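In Pandas, there are primarily two ways to rename columns: calling the rename() function with a mapping from old names to new names, and assigning a new list of names to df.columns (built later in this post with a list comprehension).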

Before we get started, let’s go over some fundamental notions about this topic.



Column Index

In a data frame, a column index has two main functions:

Why Rename Columns?

The names of your columns are important identifiers for the various attributes in your dataset. Sometimes, they might be too lengthy or complex, making it challenging to work with them. Renaming columns helps simplify data processing and makes your code easier to read.

Pandas Column Name Concepts

Pandas Implementation

Let’s create a simple dataset to demonstrate the renaming process:




# Import the csv module
import csv
 
# Define the data as a list of dictionaries
data = [
    {"Name": "Alice", "Age": 12, "Gender": "F", "Grade": "A"},
    {"Name": "Bob", "Age": 13, "Gender": "M", "Grade": "B"},
    {"Name": "Charlie", "Age": 14, "Gender": "M", "Grade": "C"},
    {"Name": "David", "Age": 12, "Gender": "M", "Grade": "A"},
    {"Name": "Eve", "Age": 13, "Gender": "F", "Grade": "B"}
]
 
# Open a new csv file for writing; newline="" avoids blank rows on Windows
with open("data.csv", "w", newline="") as file:
    # Create a csv writer object
    writer = csv.DictWriter(file, fieldnames=["Name", "Age", "Gender", "Grade"])
    # Write the header row
    writer.writeheader()
    # Write the data rows
    writer.writerows(data)
# The with block closes the file automatically, so no explicit close() is needed

Using rename() function

Renaming a Single Column Name with an Index Number

The df.rename() function takes a dictionary through its columns argument, in which the keys are the old column names and the values are the new ones. rename() matches columns by label, not by position, so to rename a column by its index number we first look up its current label with df.columns[i] and use that label as the key. Assume, for instance, that we wish to change the name of the second column (index 1) from “Age” to “Years”. The following code does this:




import pandas as pd
df = pd.read_csv('data.csv')
print(df.columns)
df = df.rename(columns={df.columns[1]: 'Years'})
print(df)

Output:

Index(['Name', 'Age', 'Gender', 'Grade'], dtype='object')
      Name  Years Gender Grade
0    Alice     12      F     A
1      Bob     13      M     B
2  Charlie     14      M     C
3    David     12      M     A
4      Eve     13      F     B

rename() does not modify the DataFrame in place; it returns a new DataFrame in which ‘Age’ has become ‘Years’, and assigning the result back to df replaces the original. Printing the DataFrame shows the updated column name.

Renaming Multiple Column Names with Index Numbers

To rename several columns by index number, we can use the same df.rename() function with a larger dictionary containing one key-value pair per column. For example, suppose we wish to change the labels of the first and third columns (index 0 and 2) from “Name” and “Gender” to “Student” and “Sex”, respectively. The following code does this:




df = df.rename(columns={df.columns[0]: 'Student', df.columns[2]: 'Sex'})
print(df)

Output:

   Student  Years Sex Grade
0    Alice     12   F     A
1      Bob     13   M     B
2  Charlie     14   M     C
3    David     12   M     A
4      Eve     13   F     B

As before, rename() returns a new DataFrame, here with ‘Name’ and ‘Gender’ changed to ‘Student’ and ‘Sex’, and the assignment back to df replaces the original. Printing the DataFrame shows the updated column names.
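One caveat with looking up labels through df.columns[i]: because rename() matches by label, every column that shares that label is renamed. The sketch below shows one way to rename strictly by position instead. rename_by_position is a hypothetical helper written for this illustration, not a Pandas function, and the sample DataFrame is made up.

```python
import pandas as pd

def rename_by_position(df, new_names):
    """Return a copy of df with columns renamed by zero-based position.

    new_names maps a column position to its new label.
    (Hypothetical helper for illustration; not part of pandas.)
    """
    columns = list(df.columns)
    for position, name in new_names.items():
        columns[position] = name  # raises IndexError for an out-of-range position
    renamed = df.copy()
    renamed.columns = columns
    return renamed

# Two columns share the label "Score", so a label-based rename would change both
scores = pd.DataFrame([[1, 2, 3]], columns=["Id", "Score", "Score"])
result = rename_by_position(scores, {2: "Bonus"})
print(list(result.columns))  # ['Id', 'Score', 'Bonus']
```

Building the full list of names and assigning it to the columns attribute touches only the requested positions, at the cost of copying the DataFrame.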

Using List Comprehension




import pandas as pd
 
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 22],
        'Salary': [50000, 60000, 45000]}
 
df = pd.DataFrame(data)
 
# Display original dataset
print("Original Dataset:")
print(df)
 
# Rename columns with index numbers using list comprehension
df.columns = [f'Column_{index}' for index in range(len(df.columns))]
 
# Display dataset with renamed columns
print("\nDataset with Renamed Columns:")
print(df)

Output:

Original Dataset:
      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   22   45000

Dataset with Renamed Columns:
  Column_0  Column_1  Column_2
0    Alice        25     50000
1      Bob        30     60000
2  Charlie        22     45000

The example first displays the original dataset to provide context. The list comprehension iterates over range(len(df.columns)) to produce one index number per column and builds a new name from each number. Assigning the resulting list to df.columns replaces every column name at once, so the list must contain exactly one name per column.
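If the original names should be kept alongside the index numbers, enumerate() can supply both the position and the current name of each column. The following is a minimal sketch; the "index_name" naming scheme and the one-row DataFrame are assumptions chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice"], "Age": [25], "Salary": [50000]})

# enumerate() yields (position, original_name) pairs for each column
df.columns = [f"{index}_{name}" for index, name in enumerate(df.columns)]
print(list(df.columns))  # ['0_Name', '1_Age', '2_Salary']
```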

Conclusion

In conclusion, Pandas provides efficient methods for renaming columns with index numbers, aiding clarity and standardization in data manipulation tasks.

Frequently Asked Questions

Why would I want to rename columns with index numbers?

Renaming columns with index numbers can be helpful when you want a concise and standardized way to reference columns, especially in scenarios where original column names may be complex or inconsistent.

Can I rename the index too?

Yes. Pass a dictionary that maps the old row labels to the new ones through the index argument of df.rename().
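A short sketch of renaming row labels this way, using a made-up two-row DataFrame and illustrative label names:

```python
import pandas as pd

df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [12, 13]})

# index= takes a mapping from old row labels to new ones
df = df.rename(index={0: "first", 1: "second"})
print(list(df.index))  # ['first', 'second']
```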

How can I rename the column names without using the index numbers?

You can use the column names directly as the keys in the dictionary that you pass to the df.rename() method. For example, suppose we want to rename the column ‘Grade’ to ‘Score’. We can use the following code: df = df.rename(columns={'Grade': 'Score'})

How can I rename the column names without modifying the original DataFrame?

By default, df.rename() returns a new DataFrame with the renamed columns and leaves the original untouched. inplace=False is the default, so passing it explicitly is optional. For example, suppose we want to rename the column ‘Sex’ to ‘Gender’ but keep the original DataFrame intact. We can use the following code: df_new = df.rename(columns={'Sex': 'Gender'})
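The sketch below checks this behavior on a small made-up DataFrame: the result is stored under a new name and the original column names are unchanged.

```python
import pandas as pd

df = pd.DataFrame({"Sex": ["F", "M"], "Grade": ["A", "B"]})

# rename() returns a new DataFrame; df itself is not modified
df_new = df.rename(columns={"Sex": "Gender"})
print(list(df.columns))      # ['Sex', 'Grade']
print(list(df_new.columns))  # ['Gender', 'Grade']
```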

