
Rename column name with an index number of the CSV file in Pandas

Last Updated : 05 Feb, 2024

In this blog post, we will learn how to rename the column name with an index number of the CSV file in Pandas.

Renaming Column Name with an Index Number

Pandas, a data manipulation package for Python, includes several methods for working with structured data such as CSV files. When a CSV file has no header row and is read with header=None, Pandas labels the columns 0, 1, 2, and so on; when the file does have a header, the names it supplies may still be long or unclear. In either case you may want to replace a column's name by referring to its position rather than typing out the existing name. Pandas makes this straightforward: look up the label at a given position with df.columns[i] and pass it to the rename() function.

Methods to Rename Column Names

In Pandas, there are primarily two ways to rename columns:

  • Using the rename() function: Pass a dictionary to the columns argument of rename(). The dictionary's keys are the current column names and its values are the new names. To rename by index number, use df.columns[i] as the key.
  • Using list comprehension: Build a complete list of new column names from the index numbers and assign it to df.columns. This approach is particularly useful when there are many columns to rename at once.

Before we get started, let’s go over some fundamental notions about this topic.

Column Index

In a data frame, a column index has two main functions:

  • Labeling: Just like row labels for rows, it gives each column in the DataFrame a distinct identity. This makes it simple for users to identify and make explicit references to different columns.
  • Location: It indicates where a column is located inside the DataFrame. Instead of depending on names that could be confusing, this enables accessing particular columns based on their index position.
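The two roles can be sketched as follows. This is a minimal illustration on a small in-memory DataFrame; the sample data is hypothetical and not taken from a file.

```python
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [12, 13], "Grade": ["A", "B"]})

# Labeling: select a column by its name
ages_by_label = df["Age"]

# Location: select the same column by its 0-based position
ages_by_position = df.iloc[:, 1]

# Converting between the two views
label_at_1 = df.columns[1]                    # position -> label
position_of_age = df.columns.get_loc("Age")   # label -> position

print(label_at_1, position_of_age)
```

Both selections return the same column; df.columns[i] is the piece that lets a position stand in for a name when renaming.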

Why Rename Columns?

The names of your columns are important identifiers for the various attributes in your dataset. Sometimes, they might be too lengthy or complex, making it challenging to work with them. Renaming columns helps simplify data processing and makes your code easier to read.

Pandas Column Name Concepts

  • By default, Pandas takes column names from the first row of the CSV file. It assigns integer names (0, 1, 2…) only when the file is read with header=None
  • You can view and work with these default names, but descriptive names are preferable
  • The rename() method maps existing names to new names and returns a new DataFrame
  • Column positions start from 0, and df.columns[i] returns the name of the column at position i
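As a minimal sketch of the default integer names, the example below reads a headerless CSV from an in-memory string (the content is hypothetical, chosen only for illustration) and then renames the columns.

```python
import io
import pandas as pd

# Hypothetical headerless CSV content, held in memory for illustration
raw = io.StringIO("Alice,12,F\nBob,13,M\n")

# header=None tells pandas there is no header row,
# so the columns receive the integer labels 0, 1, 2
df = pd.read_csv(raw, header=None)
print(list(df.columns))

# The default labels are integers, so the dictionary keys are integers too
df = df.rename(columns={0: "Name", 1: "Age", 2: "Gender"})
print(list(df.columns))
```

Here the index numbers are themselves the column labels, so they can be used as dictionary keys directly. When the file has a header row, the keys must be the existing names instead.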

Pandas Implementation

Let’s create a simple dataset to demonstrate the renaming process:

Python3




# Import the csv module
import csv
 
# Define the data as a list of dictionaries
data = [
    {"Name": "Alice", "Age": 12, "Gender": "F", "Grade": "A"},
    {"Name": "Bob", "Age": 13, "Gender": "M", "Grade": "B"},
    {"Name": "Charlie", "Age": 14, "Gender": "M", "Grade": "C"},
    {"Name": "David", "Age": 12, "Gender": "M", "Grade": "A"},
    {"Name": "Eve", "Age": 13, "Gender": "F", "Grade": "B"}
]
 
# Open a new csv file for writing; newline="" avoids blank rows on Windows
with open("data.csv", "w", newline="") as file:
    # Create a csv writer object
    writer = csv.DictWriter(file, fieldnames=["Name", "Age", "Gender", "Grade"])
    # Write the header row
    writer.writeheader()
    # Write the data rows
    writer.writerows(data)
 
# The with block closes the file automatically, so no explicit close() is needed


Using rename() function

Renaming a Single Column Name with an Index Number

Using the df.rename() function, we can change the name of a single column by its index number. The function takes a dictionary in its columns argument: the keys are the existing column names and the values are the new names. Because rename() matches columns by name, not by position, we first look up the name at the desired position with df.columns[i] and use that as the key. For instance, to change the name of the second column (index 1) from "Age" to "Years", we can use the following code:

Python3




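import pandas as pd

# Load the CSV written above; its first row supplies the column names
df = pd.read_csv('data.csv')
print(df.columns)

# df.columns[1] is the name at position 1 ('Age'); rename() returns a new DataFrame
df = df.rename(columns={df.columns[1]: 'Years'})
print(df)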


Output:

Index(['Name', 'Age', 'Gender', 'Grade'], dtype='object')
      Name  Years Gender Grade
0    Alice     12      F     A
1      Bob     13      M     B
2  Charlie     14      M     C
3    David     12      M     A
4      Eve     13      F     B

rename() does not modify the DataFrame in place. It returns a new DataFrame, which is why the result is assigned back to df. After the assignment, the column name has changed from 'Age' to 'Years', as the printed DataFrame shows.

Renaming Multiple Column Names with Index Numbers

To rename several columns by index number, we use the same df.rename() function with a larger dictionary containing one key-value pair per column. Suppose we want to change the names of the first and third columns (index 0 and 2) from "Name" and "Gender" to "Student" and "Sex", respectively. We can use the following code:

Python3




df = df.rename(columns={df.columns[0]: 'Student', df.columns[2]: 'Sex'})
print(df)


Output:

   Student  Years Sex Grade
0    Alice     12   F     A
1      Bob     13   M     B
2  Charlie     14   M     C
3    David     12   M     A
4      Eve     13   F     B

As before, rename() returns a new DataFrame and the result is assigned back to df. The column names change from 'Name' and 'Gender' to 'Student' and 'Sex', respectively, as the printed DataFrame shows.
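When several positions need renaming, or when a DataFrame might contain duplicate column names, working on the list of names directly is an alternative, since rename() with a dictionary renames every column that shares the matched name. The sketch below uses a hypothetical helper, rename_by_position, which is not part of Pandas.

```python
import pandas as pd

def rename_by_position(df, new_names):
    """Return a copy of df with the columns at the given positions renamed.

    new_names maps 0-based positions to new names, e.g. {0: "Student"}.
    Hypothetical helper for illustration; not a pandas function.
    """
    labels = list(df.columns)
    for position, name in new_names.items():
        labels[position] = name
    renamed = df.copy()
    renamed.columns = labels
    return renamed

# Illustrative one-row DataFrame with the same columns as data.csv
df = pd.DataFrame({"Name": ["Alice"], "Age": [12], "Gender": ["F"], "Grade": ["A"]})
result = rename_by_position(df, {0: "Student", 2: "Sex"})
print(list(result.columns))  # ['Student', 'Age', 'Sex', 'Grade']
```

The helper copies the DataFrame so the caller's original is left untouched, mirroring the behavior of rename().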

Using List Comprehension

Python3




import pandas as pd
 
data = {'Name': ['Alice', 'Bob', 'Charlie'],
        'Age': [25, 30, 22],
        'Salary': [50000, 60000, 45000]}
 
df = pd.DataFrame(data)
 
# Display original dataset
print("Original Dataset:")
print(df)
 
# Rename columns with index numbers using list comprehension
df.columns = [f'Column_{index}' for index in range(len(df.columns))]
 
# Display dataset with renamed columns
print("\nDataset with Renamed Columns:")
print(df)


Output:

Original Dataset:
      Name  Age  Salary
0    Alice   25   50000
1      Bob   30   60000
2  Charlie   22   45000

Dataset with Renamed Columns:
  Column_0  Column_1  Column_2
0    Alice        25     50000
1      Bob        30     60000
2  Charlie        22     45000

Here we first display the original dataset to provide context. The list comprehension iterates over range(len(df.columns)) to produce one new name per column position, and assigning the resulting list to df.columns replaces all the names at once. Unlike rename(), this assignment modifies the DataFrame in place, and the list must contain exactly one name for each column.
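If the new names should keep the original name alongside the position, enumerate() yields both together. A minimal sketch follows; the naming scheme (position, underscore, original name) is an assumption chosen for illustration.

```python
import pandas as pd

# Illustrative one-row DataFrame
df = pd.DataFrame({"Name": ["Alice"], "Age": [25], "Salary": [50000]})

# enumerate() pairs each column name with its 0-based position
df.columns = [f"{index}_{name}" for index, name in enumerate(df.columns)]

print(list(df.columns))  # ['0_Name', '1_Age', '2_Salary']
```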

Conclusion

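In conclusion, Pandas offers two simple ways to rename columns by index number: rename() with a dictionary keyed on df.columns[i], which returns a new DataFrame, and direct assignment of a new list of names to df.columns, which replaces every name at once.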

Frequently Asked Questions

Why would I want to rename columns with index numbers?

Renaming columns with index numbers can be helpful when you want a concise and standardized way to reference columns, especially in scenarios where original column names may be complex or inconsistent.

Can I rename the index too?

Yes. Pass a dictionary to the index argument of rename() to map existing row labels to new ones.
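A minimal sketch on a small hypothetical DataFrame; the row labels and the index name used here are illustrative.

```python
import pandas as pd

# Illustrative data with the default row labels 0 and 1
df = pd.DataFrame({"Name": ["Alice", "Bob"], "Age": [12, 13]})

# Keys are existing row labels, values are the new labels
df_rows = df.rename(index={0: "first", 1: "second"})

# rename_axis() names the index itself rather than relabeling its rows
df_named = df.rename_axis("student_id")

print(list(df_rows.index))
print(df_named.index.name)
```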

How can I rename the column names without using the index numbers?

You can use the column names directly as the keys in the dictionary that you pass to the df.rename() method. For example, to rename the column 'Grade' to 'Score', we can use the following code: df = df.rename(columns={'Grade': 'Score'})

How can I rename the column names without modifying the original DataFrame?

This is the default behavior: df.rename() returns a new DataFrame with the renamed columns and leaves the original untouched, because inplace defaults to False. Assign the result to a new variable instead of back to df. For example, to rename the column 'Sex' to 'Gender' while keeping the original DataFrame intact, we can use the following code: df_new = df.rename(columns={'Sex': 'Gender'})
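A short check of this behavior, using a hypothetical one-row DataFrame:

```python
import pandas as pd

# Illustrative data
df = pd.DataFrame({"Name": ["Alice"], "Sex": ["F"]})

# rename() returns a new DataFrame; df itself is not changed
df_new = df.rename(columns={"Sex": "Gender"})

print(list(df.columns))      # ['Name', 'Sex']
print(list(df_new.columns))  # ['Name', 'Gender']
```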


