Adding a New Column to an Existing DataFrame in Pandas
Last Updated: 21 Dec, 2023
Adding new columns to an existing DataFrame is a fundamental task in data analysis with Pandas. It lets you enrich your data with additional information for further analysis and manipulation. This article covers several methods for adding new columns, including simple assignment, the insert() method, and the assign() method.
What is a Pandas DataFrame?
A Pandas DataFrame is a two-dimensional, size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns). It’s a fundamental data structure in the Python data science ecosystem and provides a powerful way to work with tabular data.
Here are some key features of a Pandas DataFrame:
- Data representation: Stores data in a table format with rows and columns.
- Heterogeneous data types: Can hold different data types in different columns (e.g., integers, floats, strings, booleans).
- Labeling: Each row and column has a label (index and column names).
- Mutable: Allows data manipulation and modification.
- Powerful operations: Provides various functions and methods for data analysis, manipulation, and exploration.
- Extensible: Can be customized and extended with additional functionalities through libraries and user-defined functions.
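As a quick illustration of the features listed above, the following sketch builds a small DataFrame and inspects its labels and column types. The column names and values here are made up for illustration only.

```python
import pandas as pd

# A small table with a different data type in each column
# (names and values are illustrative only)
df = pd.DataFrame({
    'Name': ['Jai', 'Princi'],     # strings
    'Height': [5.1, 6.2],          # floats
    'Enrolled': [True, False],     # booleans
})

print(df.columns.tolist())  # column labels
print(df.index.tolist())    # row labels (default integer index)
print(df.shape)             # (number of rows, number of columns)
print(df.dtypes)            # one data type per column
```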
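Adding a New Column to an Existing DataFrame in Pandas
There are multiple ways to add a new column to an existing DataFrame in Pandas:
- Using DataFrame.insert()
- Using DataFrame.assign()
- Using a dictionary
- Using a list
- Using DataFrame.loc[]
- Adding more than one column at a time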
Creating a Sample Dataframe
The examples in this article use the following sample DataFrame:
Python3
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
df = pd.DataFrame(data)
print(df)
Output:
     Name  Height Qualification
0     Jai     5.1           Msc
1  Princi     6.2            MA
2  Gaurav     5.1           Msc
3    Anuj     5.2           Msc
Note: when a new column is added from a list, the list must contain exactly one value per row of the DataFrame; otherwise Pandas raises a ValueError.
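The length requirement can be seen in a small sketch. It assumes a four-row DataFrame and a deliberately short list (the variable names too_short and length_error are illustrative), and catches the resulting error instead of letting the script stop.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj']})

too_short = [21, 23, 24]  # only 3 values for 4 rows

try:
    df['Age'] = too_short
    length_error = None
except ValueError as exc:
    # Pandas refuses the assignment because the lengths differ
    length_error = exc
    print('Could not add column:', exc)
```

The DataFrame is left without an 'Age' column after the failed assignment.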
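Add a New Column to an Existing DataFrame using DataFrame.insert()
DataFrame.insert() adds a column at any position you choose, not just at the end. It takes the position, the column name, and the values, plus an optional allow_duplicates flag that permits inserting a column whose name already exists. The DataFrame is modified in place.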
Python3
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
df = pd.DataFrame(data)
# Insert 'Age' at position 2, i.e. as the third column
df.insert(2, "Age", [21, 23, 24, 21], allow_duplicates=True)
print(df)
Output:
     Name  Height  Age Qualification
0     Jai     5.1   21           Msc
1  Princi     6.2   23            MA
2  Gaurav     5.1   24           Msc
3    Anuj     5.2   21           Msc
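The sketch below looks a little closer at how insert() behaves. It uses a two-row DataFrame and a made-up 'ID' column to show that the method works in place and returns None, and that reusing an existing column name is rejected when allow_duplicates is left at its default.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Height': [5.1, 6.2]})

# Position 0 places the new column first; insert() modifies df in place
result = df.insert(0, 'ID', [101, 102])
print(result)                # None
print(df.columns.tolist())   # ['ID', 'Name', 'Height']

# Inserting a column name that already exists raises ValueError
# unless allow_duplicates=True is passed
try:
    df.insert(1, 'Height', [0.0, 0.0])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc
    print('Rejected:', exc)
```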
Adding Columns to Pandas DataFrame using DataFrame.assign()
This method returns a new DataFrame containing the original columns plus the new one. The original DataFrame is left unchanged.
Python3
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
df = pd.DataFrame(data)
# The keyword name becomes the new column's name
df2 = df.assign(address=['Delhi', 'Bangalore', 'Chennai', 'Patna'])
print(df2)
Output:
     Name  Height Qualification    address
0     Jai     5.1           Msc      Delhi
1  Princi     6.2            MA  Bangalore
2  Gaurav     5.1           Msc    Chennai
3    Anuj     5.2           Msc      Patna
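Because assign() returns a new object, it can also compute a column from existing ones by passing a callable. The following is a minimal sketch; the column name Height_rounded is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Height': [5.1, 6.2]})

# The callable receives the DataFrame and returns the new column's values
df2 = df.assign(Height_rounded=lambda d: d['Height'].round())

print(df.columns.tolist())   # original is unchanged
print(df2.columns.tolist())  # new DataFrame has the extra column
```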
Pandas Add Column to DataFrame using a Dictionary
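We can use a Python dictionary to add a new column to a Pandas DataFrame. Use the values of an existing column as the dictionary keys and the corresponding new-column entries as the dictionary values, then look each row up with Series.map(). Assigning the dictionary directly to the column does not perform this lookup.
Python3
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
# Keys are values from the existing 'Name' column;
# values are the entries for the new column
address = {'Jai': 'Delhi', 'Princi': 'Bangalore',
           'Gaurav': 'Chennai', 'Anuj': 'Patna'}
df = pd.DataFrame(data)
# Look up each name in the dictionary to build the new column
df['Address'] = df['Name'].map(address)
print(df)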
Output:
     Name  Height Qualification    Address
0     Jai     5.1           Msc      Delhi
1  Princi     6.2            MA  Bangalore
2  Gaurav     5.1           Msc    Chennai
3    Anuj     5.2           Msc      Patna
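One way to look values up by an existing column is Series.map(). The sketch below assumes a hypothetical lookup dictionary, city_by_name, that is missing one of the names, to show what happens to unmatched rows and how a default can be filled in.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav']})

# Hypothetical lookup table with no entry for 'Gaurav'
city_by_name = {'Jai': 'Delhi', 'Princi': 'Bangalore'}

# Names without a dictionary entry become NaN
df['Address'] = df['Name'].map(city_by_name)

# Replace the missing entries with an explicit default
df['Address_filled'] = df['Address'].fillna('Unknown')
print(df)
```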
Adding a New Column to a Pandas DataFrame using a List
This example assigns the list “address” to a new “Address” column of the sample DataFrame df created earlier. The list values are matched to rows by position.
Python3
# df is the sample DataFrame created earlier
address = ['Delhi', 'Bangalore', 'Chennai', 'Patna']
df['Address'] = address
print(df)
Output:
     Name  Height Qualification    Address
0     Jai     5.1           Msc      Delhi
1  Princi     6.2            MA  Bangalore
2  Gaurav     5.1           Msc    Chennai
3    Anuj     5.2           Msc      Patna
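A related point worth checking when the values come from somewhere other than a plain list: a list is matched to rows by position, while a Pandas Series is matched by index label. The sketch below uses made-up column names (FromList, FromSeries) and a Series whose index is deliberately reversed.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav']})

cities = ['Delhi', 'Bangalore', 'Chennai']

# A list is matched to rows by position
df['FromList'] = cities

# A Series is matched to rows by index label, not by position
df['FromSeries'] = pd.Series(cities, index=[2, 1, 0])
print(df)
```

Here row 0 receives 'Delhi' from the list but 'Chennai' from the Series, because the Series value labeled 0 is 'Chennai'.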
Add a New Column to an Existing Pandas DataFrame using DataFrame.loc[]
This example creates a Pandas DataFrame named df with columns “Name”, “Height”, and “Qualification” and adds a new column “Address” using the loc indexer.
Python3
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc']}
df = pd.DataFrame(data)
address = ["Delhi", "Bangalore", "Chennai", "Patna"]
# Select all rows (:) and the new column label "Address"
df.loc[:, "Address"] = address
print(df)
Output:
     Name  Height Qualification    Address
0     Jai     5.1           Msc      Delhi
1  Princi     6.2            MA  Bangalore
2  Gaurav     5.1           Msc    Chennai
3    Anuj     5.2           Msc      Patna
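Since loc selects rows as well as columns, it can also fill a new column differently for different rows. The following sketch is one possible use; the 'Group' column, its labels, and the 6.0 threshold are arbitrary choices made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi', 'Gaurav'],
                   'Height': [5.1, 6.2, 5.1]})

# Create the column with a default value for every row
df.loc[:, 'Group'] = 'standard'

# Overwrite only the rows that satisfy the condition
df.loc[df['Height'] > 6.0, 'Group'] = 'tall'
print(df)
```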
Adding More than One Column to an Existing DataFrame
This example expands an existing Pandas DataFrame df with two new columns, “Age” and “State”, by unpacking a dictionary of column lists into assign().
Python3
import pandas as pd
data = {'Name': ['Jai', 'Princi', 'Gaurav', 'Anuj'],
        'Height': [5.1, 6.2, 5.1, 5.2],
        'Qualification': ['Msc', 'MA', 'Msc', 'Msc'],
        'Address': ['Delhi', 'Bangalore', 'Chennai', 'Patna']}
df = pd.DataFrame(data)
age = [22, 25, 23, 24]
state = ['NCT', 'Karnataka', 'Tamil Nadu', 'Bihar']
new_data = {'Age': age, 'State': state}
# Unpack the dictionary so each key becomes a keyword argument to assign()
df = df.assign(**new_data)
print(df)
Output:
     Name  Height Qualification    Address  Age       State
0     Jai     5.1           Msc      Delhi   22         NCT
1  Princi     6.2            MA  Bangalore   25   Karnataka
2  Gaurav     5.1           Msc    Chennai   23  Tamil Nadu
3    Anuj     5.2           Msc      Patna   24       Bihar
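Several columns can also be attached in one step with pd.concat(). This is an alternative technique rather than the approach used above; the sketch assumes the new columns are first collected in their own DataFrame (here called extra) that shares the original's index.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Jai', 'Princi'], 'Height': [5.1, 6.2]})

# New columns collected in their own DataFrame, sharing df's index
extra = pd.DataFrame({'Age': [22, 25], 'State': ['NCT', 'Karnataka']},
                     index=df.index)

# axis=1 joins the two side by side, aligning rows on the index
combined = pd.concat([df, extra], axis=1)
print(combined)
```

Like assign(), pd.concat() returns a new DataFrame and leaves df untouched.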
Conclusion
Understanding how to add new columns to DataFrames is essential for data exploration and manipulation in Pandas. Choosing the appropriate method depends on the specific context and desired outcome. By mastering these techniques, you can effectively manipulate, analyze, and gain valuable insights from your data.