Open In App

Merge two Pandas DataFrames with complex conditions

Last Updated : 07 Apr, 2021
Improve
Improve
Like Article
Like
Save
Share
Report

In this article, we let’s discuss how to merge two Pandas Dataframe with some complex conditions. Dataframes in Pandas can be merged using pandas.merge() method.

Syntax:

pandas.merge(parameters)

Returns : A DataFrame of the two merged objects.

While working on datasets there may be a need to merge two data frames with some complex conditions, below are some examples of merging two data frames with some complex conditions.

Example 1 :

Merging two data frames with merge() function with the parameters as the two data frames. 

Python3




# importing pandas as pd
import pandas as pd
  
# creating dataframes
df1 = pd.DataFrame({'ID': [1, 2, 3, 4], 
                    'Name': ['John', 'Tom', 'Simon', 'Jose']})
  
df2 = pd.DataFrame({'ID': [1, 2, 3, 5], 
                    'Class': ['Second', 'Third', 'Fifth', 'Fourth']})
  
# merging df1 and df2 with merge function
df = pd.merge(df1, df2)
print(df)


Output : 

Example 2 :

Merging two data frames with merge() function on some specified column name of the data frames.

Python3




# importing pandas as pd
import pandas as pd
  
# creating dataframes
df1 = pd.DataFrame({'Name': ['John', 'Tom', 'Simon', 'Jose'],
                    'Age': [5, 6, 4, 5]})
  
df2 = pd.DataFrame({'Name': ['John', 'Tom', 'Jose'],
                    'Class': ['Second', 'Third', 'Fifth']})
  
# merging df1 and df2 with merge function 
# with the common column as Name
df = pd.merge(df1, df2, on='Name')
print(df)


Output : 

Example 3 :

Merging two data frames with all the values in the first data frame and NaN for the not matched values from the second data frame. The same can be done to merge with all values of the second data frame what we have to do is just give the position of the data frame when merging as left or right.

Python3




# importing pandas as pd
import pandas as pd
  
# creating dataframes
df1 = pd.DataFrame({'Name': ['John', 'Tom', 'Simon', 'Jose'], 
                    'Age': [5, 6, 4, 5]})
  
df2 = pd.DataFrame({'Name': ['John', 'Tom', 'Jose'],
                    'Class': ['Second', 'Third', 'Fifth']})
  
# merging df1 and df2 with merge function 
# with the common column as Name
df = pd.merge(df1, df2, on='Name', how="left")
print(df)


Output : 

Example 4 :

Merging two data frames with all the values of both the data frames using merge function with an outer join. The same can be done do join two data frames with inner join as well.

Python3




# importing pandas as pd
import pandas as pd
  
# creating dataframes
df1 = pd.DataFrame({'Name':['John', 'Tom', 'Simon', 'Jose'],
                    'Age':[5, 6, 4, 5]})
  
df2 = pd.DataFrame({'Name':['John', 'Tom', 'Philip'],
                    'Class':['Second', 'Third', 'Fifth']})
  
# merging df1 and df2 with merge function 
# with the records of both the dataframes
df = pd.merge(df1, df2, how = "outer")
print(df)


Output : 

Example 5 : 

Merging data frames with the indicator value to see which data frame has that particular record.

Python3




# importing pandas as pd
import pandas as pd
  
# creating dataframes
df1 = pd.DataFrame({'Name':['John', 'Tom', 'Simon', 'Jose'],
                    'Age':[5, 6, 4, 5]})
  
df2 = pd.DataFrame({'Name':['John', 'Tom', 'Simon', 'Tom'],
                    'Class':['Second', 'Third', 'Fifth', 'Fourth']})
  
# merging df1 and df2 with merge function 
# with the records of both the dataframes
df = pd.merge(df1, df2, how = 'left', indicator = True)
print(df)


Output : 

Example 6 :

Merging data frames with the one-to-many relation in the two data frames. The same can be done to merge with many-to-many, one-to-one, and one-to-many type of relationship.

Python3




# importing pandas as pd
import pandas as pd
  
# creating dataframes
df1 = pd.DataFrame({'Name':['John', 'Tom', 'Simon', 'Jose'],
                    'Age':[5, 6, 4, 5]})
  
df2 = pd.DataFrame({'Name':['John', 'Tom', 'Simon', 'Tom'],
                    'Class':['Second', 'Third', 'Fifth', 'Fourth']})
  
# merging df1 and df2 with merge function with 
# the one to many relations from both the dataframes
df = pd.merge(df1, df2, validate = 'one_to_many')
print(df)


Output : 



Like Article
Suggest improvement
Previous
Next
Share your thoughts in the comments

Similar Reads