Combine two DataFrames in R with different columns

In this article, we will discuss how to combine two data frames with different columns in the R programming language.

Method 1 : Using plyr package

The plyr package in R provides tools for manipulating and combining data. It can be installed with the following command, after which it is loaded with library():

install.packages("plyr")

The rbind.fill() method, an enhancement of the rbind() method in base R, combines data frames that have different columns. The input data frames may differ in both the names and the number of their columns. Any column that is missing from one of the data frames is filled with NA in that data frame's rows. The output data frame contains every column that is present in at least one of the input data frames.



Syntax:

rbind.fill(df1, df2)
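
To make the NA-filling behaviour concrete, here is a minimal base R sketch of the same idea. The helper name fill_and_bind and the sample data frames a and b are hypothetical and are not part of plyr. This only illustrates the idea; it is not how rbind.fill() is implemented.

```r
# Hypothetical helper (not part of plyr): pads each data frame with the
# columns it lacks, then stacks the rows with base R's rbind().
fill_and_bind <- function(df1, df2) {
  # union of column names, in order of first appearance
  all_cols <- union(names(df1), names(df2))
  # add every column a data frame is missing, filled with NA
  for (col in setdiff(all_cols, names(df1))) df1[[col]] <- NA
  for (col in setdiff(all_cols, names(df2))) df2[[col]] <- NA
  # give both data frames the same column order, then bind the rows
  rbind(df1[all_cols], df2[all_cols])
}

# sample data: only column y is shared
a <- data.frame(x = 1:2, y = c("p", "q"))
b <- data.frame(y = c("r", "s", "t"), z = TRUE)

combined <- fill_and_bind(a, b)
print(combined)
```

The result has five rows and the columns x, y and z. The rows that came from b have NA in x, and the rows that came from a have NA in z.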

The example below applies rbind.fill() to two data frames that share one column (col4) and otherwise have different columns.

Example:




# loading the required library
library("plyr")
  
# declaring first data frame
data_frame1 <- data.frame(col1 = c(2,4,6), 
                          col2 = c(4,6,8), 
                          col3 = c(8,10,12), 
                          col4 = LETTERS[1:3])
print ("First Dataframe")
print (data_frame1)
  
# declaring second data frame
data_frame2 <- data.frame(col4 = letters[1:4], 
                          col5 = TRUE)
print ("Second Dataframe")
print (data_frame2)
  
print ("Combining Dataframe")
  
# binding data frames
rbind.fill(data_frame1,data_frame2)

Output

[1] "First Dataframe"
  col1 col2 col3 col4
1    2    4    8    A
2    4    6   10    B
3    6    8   12    C
[1] "Second Dataframe"
  col4 col5
1    a TRUE
2    b TRUE
3    c TRUE
4    d TRUE
[1] "Combining Dataframe"
  col1 col2 col3 col4 col5
1    2    4    8    A   NA
2    4    6   10    B   NA
3    6    8   12    C   NA
4   NA   NA   NA    a TRUE
5   NA   NA   NA    b TRUE
6   NA   NA   NA    c TRUE
7   NA   NA   NA    d TRUE

Method 2: Using dplyr package

The dplyr package in R provides a set of functions for data manipulation. It can be installed with the following command, after which it is loaded with library():

install.packages("dplyr")

The bind_rows() method combines data frames that have different columns. The input data frames may differ in both the names and the number of their columns. Any column that is missing from one of the data frames is filled with NA in that data frame's rows. The output data frame contains every column that is present in at least one of the input data frames.

Syntax:

bind_rows(df1, df2)
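
When rows from several data frames are stacked, it can help to record which input each row came from. The sketch below assumes dplyr is installed and uses the .id argument of bind_rows() together with named inputs. The data frames and the names y2022, y2023 and source are made up for illustration.

```r
library(dplyr)

# illustrative data: the two data frames share only the "item" column
sales_2022 <- data.frame(item = c("a", "b"), units = c(10, 20))
sales_2023 <- data.frame(item = "c", price = 2.5)

# naming the inputs and passing .id adds a column that holds those names
combined <- bind_rows(y2022 = sales_2022, y2023 = sales_2023, .id = "source")
print(combined)
```

The source column comes first in the result and holds the label of the input data frame for each row. The units and price columns are filled with NA wherever the originating data frame did not have them.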

The example below applies bind_rows() to two data frames that have no column names in common.

Example:




# loading the required library
library("dplyr")
  
# declaring first data frame
data_frame1 <- data.frame(col1 = c(2,4,6), 
                          col2 = c(4,6,8), 
                          col3 = c(8,10,12), 
                          col4 = c(20,16,14))
print ("First Dataframe")
print (data_frame1)
  
# declaring second data frame
data_frame2 <- data.frame(col5 = letters[1:4], 
                          col6 = TRUE)
print ("Second Dataframe")
print (data_frame2)
  
print ("Combining Dataframe")
  
# binding data frames
bind_rows(data_frame1,data_frame2)

Output

[1] "First Dataframe"
  col1 col2 col3 col4
1    2    4    8   20
2    4    6   10   16
3    6    8   12   14
[1] "Second Dataframe"
  col5 col6
1    a TRUE
2    b TRUE
3    c TRUE
4    d TRUE
[1] "Combining Dataframe"
  col1 col2 col3 col4 col5 col6
1    2    4    8   20 <NA>   NA
2    4    6   10   16 <NA>   NA
3    6    8   12   14 <NA>   NA
4   NA   NA   NA   NA    a TRUE
5   NA   NA   NA   NA    b TRUE
6   NA   NA   NA   NA    c TRUE
7   NA   NA   NA   NA    d TRUE

If a column name appears in both input data frames, the values from both are placed in a single column of that name in the output, as the following example shows.

Example:




# loading the required library
library("dplyr")
  
# declaring first data frame
data_frame1 <- data.frame(col1 = c(2,4,6), 
                          col2 = c(4,6,8), 
                          col3 = c(8,10,12), 
                          col4 = LETTERS[1:3])
print ("First Dataframe")
print (data_frame1)
  
# declaring second data frame
data_frame2 <- data.frame(col4 = letters[1:4], 
                          col5 = TRUE)
print ("Second Dataframe")
print (data_frame2)
  
print ("Combining Dataframe")
  
# binding data frames
bind_rows(data_frame1,data_frame2)

Output

[1] "First Dataframe"
  col1 col2 col3 col4
1    2    4    8    A
2    4    6   10    B
3    6    8   12    C
[1] "Second Dataframe"
  col4 col5
1    a TRUE
2    b TRUE
3    c TRUE
4    d TRUE
[1] "Combining Dataframe"
  col1 col2 col3 col4 col5
1    2    4    8    A   NA
2    4    6   10    B   NA
3    6    8   12    C   NA
4   NA   NA   NA    a TRUE
5   NA   NA   NA    b TRUE
6   NA   NA   NA    c TRUE
7   NA   NA   NA    d TRUE
