
How To Concatenate Two or More Pandas DataFrames?


Two or more DataFrames can be concatenated with the pandas.concat() function, which combines them either along rows (axis=0, the default) or along columns (axis=1). In this article, we will see how to concatenate two or more Pandas DataFrames, and how related methods such as merge() and join() differ from plain concatenation.

Concatenate Two or More Pandas DataFrames in Python

There are several ways to combine DataFrames vertically or horizontally. This article covers the following commonly used methods:

  • Concatenating Two Pandas DataFrame
  • Using pd.merge() to Concatenate Two DataFrames
  • pd.DataFrame.reindex() for Vertical Concatenation With Index Alignment
  • Using pd.concat() with sort=False to Keep the Original Column Order
  • Using pandas.concat() to Concatenate Two DataFrames
  • Concatenate Multiple DataFrames Using pandas.concat()
  • Using DataFrame.join() to Join Two DataFrames
  • Using DataFrame.append() to Concatenate Two DataFrames

Create a Sample DataFrame

First, create the DataFrames that we will concatenate. We will use pandas to build them (NumPy is imported as well, although the examples in this article do not rely on it).

Python3




import pandas as pd
import numpy as np
 
df = pd.DataFrame({'Courses': ["GFG", "JS", "Python", "Numpy"],
                   'Fee': [20000, 25000, 22000, 24000]})
 
df1 = pd.DataFrame({'Courses': ["Matplotlib", "SSC", "CHSL", "Java"],
                    'Fee': [25000, 25200, 24500, 24900]})
 
df2 = pd.DataFrame({'Duration': ['30day', '40days', '35days', '60days'],
                    'Discount': [1000, 2300, 2500, 2000]})
 
print("DataFrame 1:")
print(df)
print("DataFrame 2:")
print(df1)
print("DataFrame 3:")
print(df2)


Output:

DataFrame 1:
  Courses    Fee
0     GFG  20000
1      JS  25000
2  Python  22000
3   Numpy  24000
DataFrame 2:
      Courses    Fee
0  Matplotlib  25000
1         SSC  25200
2        CHSL  24500
3        Java  24900
DataFrame 3:
  Duration  Discount
0    30day      1000
1   40days      2300
2   35days      2500
3   60days      2000

Concatenate Two Pandas DataFrames Vertically and Horizontally

We pass the DataFrames to pd.concat() as a list and specify the axis along which to concatenate: axis=0 stacks them along rows, and axis=1 places them side by side along columns.

Python3




# concatenating df and df1 along rows
vertical_concat = pd.concat([df, df1], axis=0)
 
# concatenating df1 and df2 along columns
horizontal_concat = pd.concat([df1, df2], axis=1)
 
print("Vertical:")
print(vertical_concat)
print("Horizontal:")
print(horizontal_concat)


Output:

Vertical:
      Courses    Fee
0         GFG  20000
1          JS  25000
2      Python  22000
3       Numpy  24000
0  Matplotlib  25000
1         SSC  25200
2        CHSL  24500
3        Java  24900
Horizontal:
      Courses    Fee Duration  Discount
0  Matplotlib  25000    30day      1000
1         SSC  25200   40days      2300
2        CHSL  24500   35days      2500
3        Java  24900   60days      2000
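
The vertical result above repeats the row labels 0 to 3, because pd.concat() keeps each input's original index. The following is a minimal sketch of what that means in practice, using two shortened DataFrames that are illustrative stand-ins rather than the article's full data:

```python
import pandas as pd

# Shortened, illustrative versions of the sample DataFrames
df = pd.DataFrame({'Courses': ["GFG", "JS"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["SSC", "Java"], 'Fee': [25200, 24900]})

# Vertical concatenation keeps each frame's original index labels
stacked = pd.concat([df, df1], axis=0)
print(stacked.index.tolist())      # [0, 1, 0, 1]

# Label 0 now matches two rows, one from each input frame
print(stacked.loc[0])

# ignore_index=True discards the old labels and numbers the rows 0..n-1
renumbered = pd.concat([df, df1], axis=0, ignore_index=True)
print(renumbered.index.tolist())   # [0, 1, 2, 3]
```

If the duplicated labels are not meaningful, passing ignore_index=True is usually the simpler choice.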

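Combining DataFrames using pd.merge()

pd.merge() does not concatenate in the strict sense: it performs a database-style join, matching rows of two DataFrames on common columns or indexes. The "how" parameter specifies the type of merge (inner, outer, left, right, or cross). In the example below, df and df1 share no values in 'Courses', so an outer merge keeps every row from both DataFrames, and adding the two suffixed Fee columns (after filling missing values with 0) reproduces the result of a vertical concatenation. Fee becomes a float column because the merge introduces NaN values before fillna(0) is applied, and the row order can differ between pandas versions because an outer merge may sort the join keys.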

Python3




result = pd.merge(df, df1, on='Courses', how='outer', suffixes=('_df1', '_df2')).fillna(0)
 
result['Fee'] = result['Fee_df1'] + result['Fee_df2']
result = result[['Courses', 'Fee']]
 
print(result)


Output :

      Courses      Fee
0         GFG  20000.0
1          JS  25000.0
2      Python  22000.0
3       Numpy  24000.0
4  Matplotlib  25000.0
5         SSC  25200.0
6        CHSL  24500.0
7        Java  24900.0
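
To make the difference between merging and concatenating more visible, here is a small sketch in which the two DataFrames do share some key values. The frames `fees` and `durations` are hypothetical and not part of the article's sample data:

```python
import pandas as pd

# Hypothetical frames that overlap on two course names
fees = pd.DataFrame({'Courses': ["GFG", "JS", "Python"],
                     'Fee': [20000, 25000, 22000]})
durations = pd.DataFrame({'Courses': ["JS", "Python", "Java"],
                          'Duration': ['40days', '35days', '60days']})

# Outer merge: one row per distinct key, with columns from both frames.
# indicator=True adds a '_merge' column showing where each row came from.
merged = pd.merge(fees, durations, on='Courses', how='outer', indicator=True)
print(merged)

# Inner merge keeps only the keys present in both frames
inner = pd.merge(fees, durations, on='Courses', how='inner')
print(inner)

# concat only stacks rows; it does not match rows on 'Courses'
stacked = pd.concat([fees, durations], ignore_index=True)
print(stacked.shape)   # (6, 3)
```

The merge produces four rows (one per distinct course), while the concatenation produces six rows with NaN wherever a column is absent from the source frame.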

Using pd.DataFrame.reindex() After Vertical Concatenation

DataFrame.reindex() does not concatenate DataFrames by itself. It conforms a DataFrame to a new set of index labels, inserting NaN rows for any labels that are missing. It can be applied after pd.concat() to control which row labels appear in the result and in what order.

Example: The code below concatenates df1 and df with ignore_index=True, which discards their original indices and numbers the rows 0 to 7. It then reindexes the result with range(8); because those labels already exist, the DataFrame is unchanged.

Python3




result = pd.concat([df1, df], ignore_index=True)  # Concatenate and reset index
result = result.reindex(range(8))  # Labels 0-7 already exist, so nothing changes
 
print(result)


Output :

      Courses    Fee
0  Matplotlib  25000
1         SSC  25200
2        CHSL  24500
3        Java  24900
4         GFG  20000
5          JS  25000
6      Python  22000
7       Numpy  24000
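
As a rough illustration of what reindex() does on its own, the sketch below applies it to a single DataFrame with three different sets of labels. The variable names are illustrative only:

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["GFG", "JS", "Python", "Numpy"],
                   'Fee': [20000, 25000, 22000, 24000]})

# Existing labels in a new order: rows are selected and reordered
reordered = df.reindex([3, 2, 1, 0])
print(reordered)

# Labels 4 and 5 do not exist in df, so they become NaN-filled rows
extended = df.reindex(range(6))
print(extended)

# Reindexing to the labels a frame already has changes nothing
same = df.reindex(range(4))
print(same.equals(df))   # True
```

Note that Fee is converted to float in `extended`, since an integer column cannot hold NaN.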

Using pd.concat() with sort=False to Keep the Original Column Order

pd.concat() combines DataFrames either vertically (along rows) or horizontally (along columns). The sort parameter controls whether the non-concatenation axis (the column labels, when stacking rows) is sorted when the inputs are not already aligned. sort=False, which is the default in current pandas versions, keeps the columns in their original order and skips that sorting step. It never reorders the rows.

Example: The code below concatenates df1 and df along rows (axis=0, the default). With sort=False, the column names are not sorted in the result.

Python3




result = pd.concat([df1, df], sort=False)
 
print(result)


Output :

      Courses    Fee
0  Matplotlib  25000
1         SSC  25200
2        CHSL  24500
3        Java  24900
0         GFG  20000
1          JS  25000
2      Python  22000
3       Numpy  24000
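
Because df1 and df have identical columns, sort has no visible effect in the output above. The sketch below uses two small hypothetical DataFrames (`left` and `right`, with made-up column names) whose columns differ, so the effect of the parameter can be seen:

```python
import pandas as pd

# Hypothetical frames with partly different, unsorted column names
left = pd.DataFrame({'b': [1, 2], 'a': [3, 4]})
right = pd.DataFrame({'a': [5, 6], 'c': [7, 8]})

# sort=False (the default): columns stay in order of first appearance
unsorted_cols = pd.concat([left, right], sort=False, ignore_index=True)
print(unsorted_cols.columns.tolist())   # ['b', 'a', 'c']

# sort=True: the non-concatenation axis (the columns here) is sorted
sorted_cols = pd.concat([left, right], sort=True, ignore_index=True)
print(sorted_cols.columns.tolist())     # ['a', 'b', 'c']

# The rows keep their input order either way
print(sorted_cols['a'].tolist())        # [3, 4, 5, 6]
```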

Concatenate Two or More Pandas DataFrames in Python using pandas.concat()

`pandas.concat()` combines two or more DataFrames by stacking them on top of each other (axis=0) or placing them side by side (axis=1).

Example: In this example, pd.concat() concatenates df and df1 vertically, producing a new DataFrame named result. Passing ignore_index=True replaces the original row labels with a fresh index running from 0 to 7. The final result is printed.

Python3




result = pd.concat([df, df1], ignore_index=True)
print(result)


Output:

      Courses    Fee
0         GFG  20000
1          JS  25000
2      Python  22000
3       Numpy  24000
4  Matplotlib  25000
5         SSC  25200
6        CHSL  24500
7        Java  24900
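
When the origin of each row matters, an alternative to ignore_index=True is the keys parameter of pd.concat(), which labels every input DataFrame in an outer index level. A minimal sketch, in which the labels 'batch_a' and 'batch_b' are made up for illustration:

```python
import pandas as pd

# Shortened, illustrative versions of the sample DataFrames
df = pd.DataFrame({'Courses': ["GFG", "JS"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["SSC", "Java"], 'Fee': [25200, 24900]})

# keys= tags each input frame in an outer index level (a MultiIndex)
labeled = pd.concat([df, df1], keys=['batch_a', 'batch_b'])
print(labeled)

# All rows that came from one source can be selected by its key
print(labeled.loc['batch_b'])
```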

Concat Multiple DataFrames in Python using pandas.concat()

pandas.concat() accepts a list containing any number of DataFrames and concatenates them along the specified axis (0 for vertical, 1 for horizontal).

Example: This example concatenates the three DataFrames created earlier (df, df1, and df2) vertically with pd.concat(), producing a new DataFrame named result with a reset index. Because df2 has different columns from df and df1, pandas fills the cells that have no data with NaN, and the numeric columns that receive NaN (Fee and Discount) are converted to float.

Python3




result = pd.concat([df, df1, df2], ignore_index=True)
print(result)


Output:

       Courses      Fee Duration  Discount
0          GFG  20000.0      NaN       NaN
1           JS  25000.0      NaN       NaN
2       Python  22000.0      NaN       NaN
3        Numpy  24000.0      NaN       NaN
4   Matplotlib  25000.0      NaN       NaN
5          SSC  25200.0      NaN       NaN
6         CHSL  24500.0      NaN       NaN
7         Java  24900.0      NaN       NaN
8          NaN      NaN    30day    1000.0
9          NaN      NaN   40days    2300.0
10         NaN      NaN   35days    2500.0
11         NaN      NaN   60days    2000.0
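
The NaN cells above come from the default join='outer' behaviour, which keeps the union of all columns. The sketch below contrasts it with join='inner', which keeps only the columns present in every input. The frame `extra` is a hypothetical example, not one of the article's DataFrames:

```python
import pandas as pd

df = pd.DataFrame({'Courses': ["GFG", "JS"], 'Fee': [20000, 25000]})
# Hypothetical frame with one additional column
extra = pd.DataFrame({'Courses': ["SSC", "Java"],
                      'Fee': [25200, 24900],
                      'Discount': [2300, 2000]})

# Default join='outer': union of columns, gaps filled with NaN
outer = pd.concat([df, extra], ignore_index=True)
print(outer)

# join='inner': only the columns shared by every frame are kept
inner = pd.concat([df, extra], join='inner', ignore_index=True)
print(inner)
```

With join='inner' no NaN is introduced, so Fee keeps its integer dtype.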

Join Two DataFrames using DataFrame.join()

DataFrame.join() combines the columns of two DataFrames side by side. By default it matches rows on the index and performs a left join; the how parameter selects a left, right, inner, or outer join, and the on parameter matches a column of the calling DataFrame against the index of the other. The two DataFrames must not share column names unless lsuffix or rsuffix is supplied.

Example: In this example, join() combines two DataFrames on their indices, adding the Duration and Discount columns to the course data. The result is a new DataFrame named result, which is printed.

Python3




result = df.join(df2)
print(result)


Output:

  Courses    Fee Duration  Discount
0     GFG  20000    30day      1000
1      JS  25000   40days      2300
2  Python  22000   35days      2500
3   Numpy  24000   60days      2000
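
join() raises an error when both DataFrames contain columns with the same name. The following sketch shows the error and one way around it with the lsuffix and rsuffix parameters. The suffix strings are arbitrary choices for illustration:

```python
import pandas as pd

# Shortened, illustrative versions of the sample DataFrames
df = pd.DataFrame({'Courses': ["GFG", "JS"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["SSC", "Java"], 'Fee': [25200, 24900]})

# Both frames have 'Courses' and 'Fee', so a plain join raises ValueError
try:
    df.join(df1)
except ValueError as exc:
    print("join failed:", exc)

# Suffixes disambiguate the overlapping column names
side_by_side = df.join(df1, lsuffix='_left', rsuffix='_right')
print(side_by_side.columns.tolist())
# ['Courses_left', 'Fee_left', 'Courses_right', 'Fee_right']
```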

Combine Two DataFrames in Python using DataFrame.append()

`DataFrame.append()` concatenated two DataFrames vertically, adding the rows of one below the other and returning a new DataFrame. It was deprecated in pandas 1.4 and removed in pandas 2.0, so it works only on older versions; on current versions use pd.concat([df, df1], ignore_index=True) instead. The DataFrames do not need identical columns, but any column missing from one of them is filled with NaN.

Example: In this example, append() adds the rows of df1 below those of df, producing a new DataFrame named result with a reset index, which is printed.

Python3




# Requires pandas < 2.0; DataFrame.append() was removed in pandas 2.0
result = df.append(df1, ignore_index=True)
print(result)


Output:

      Courses    Fee
0         GFG  20000
1          JS  25000
2      Python  22000
3       Numpy  24000
4  Matplotlib  25000
5         SSC  25200
6        CHSL  24500
7        Java  24900
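
On pandas 2.0 and later, the same output can be produced with pd.concat(). The sketch below shows that replacement, plus one common pattern for adding several new rows at once. The `new_rows` list is a made-up example:

```python
import pandas as pd

# Shortened, illustrative versions of the sample DataFrames
df = pd.DataFrame({'Courses': ["GFG", "JS"], 'Fee': [20000, 25000]})
df1 = pd.DataFrame({'Courses': ["SSC", "Java"], 'Fee': [25200, 24900]})

# Replacement for df.append(df1, ignore_index=True)
result = pd.concat([df, df1], ignore_index=True)
print(result)

# To add individual rows, collect them first and concatenate once
new_rows = [{'Courses': "CHSL", 'Fee': 24500},
            {'Courses': "Numpy", 'Fee': 24000}]
result = pd.concat([result, pd.DataFrame(new_rows)], ignore_index=True)
print(result)
```

Collecting rows in a list and concatenating once avoids copying the whole DataFrame for every added row.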


Last Updated : 18 Dec, 2023