Order DataFrame rows according to vector with specific order in R

This article shows how to reorder the rows of a data frame so that the values in one of its columns follow the order given by a vector. Two functions can be used for this:

  • match() function
  • left_join() function

Example dataset:

data <- data.frame(x1 = 1:5,               
                  x2 = letters[1:5],
                  x3 = 6:10)
                                                     
data
  x1 x2 x3
1  1  a  6
2  2  b  7
3  3  c  8
4  4  d  9
5  5  e 10

Vector with specific ordering:

vec <- c("b", "e", "a", "c", "d")               
vec                                           
# "b" "e" "a" "c" "d"

Method 1: Using match() function to Sort Data Frame According to Vector.

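match() returns an integer vector giving, for each element of its first argument, the position of the first match in its second argument. Calling match(vec, data$x2) therefore yields the row number of each value of vec in data, and indexing data with those row numbers returns the rows in the order of vec.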

Syntax: match(x, table, nomatch = NA_integer_, incomparables = NULL)

Parameters:

  • x: vector or NULL: the values to be matched. Long vectors are supported.
  • table: vector or NULL: the values to be matched against. Long vectors are not supported.
  • nomatch: the value to be returned in the case when no match is found. Note that it is coerced to integer.
  • incomparables: A vector of values that cannot be matched. Any value in x matching a value in this vector is assigned the nomatch value. For historical reasons, FALSE is equivalent to NULL.
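
To make the return value and the nomatch parameter concrete, here is a minimal sketch. The names lookup and query and their values are hypothetical, chosen only for illustration.

# Hypothetical lookup table and query values, chosen only for illustration.
lookup <- c("a", "b", "c", "d", "e")
query  <- c("b", "e", "a", "z")

# Position of each query value in lookup; "z" has no match, so it yields NA.
pos_default <- match(query, lookup)
print(pos_default)
# [1]  2  5  1 NA

# With nomatch = 0, unmatched values yield 0 instead of NA.
pos_zero <- match(query, lookup, nomatch = 0)
print(pos_zero)
# [1] 2 5 1 0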

Code:

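# Example data frame
data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)
# Desired order of the values in column x2
vec <- c("b", "e", "a", "c", "d")

# match() gives, for each element of vec, its row position in data$x2
new_dataset <- data[match(vec, data$x2), ]
new_dataset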

Output:

  x1 x2 x3
2  2  b  7
5  5  e 10
1  1  a  6
3  3  c  8
4  4  d  9

As the output shows, the rows of the new data frame follow the order of the values in vec. The row names (2, 5, 1, 3, 4) are the positions the rows had in the original data frame.
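
The match() call above assumes that every value of vec occurs in data$x2 and that no row of data is left out of vec. When that does not hold, the following sketch shows two possible variations. The vector partial_vec is a hypothetical example, and stringsAsFactors = FALSE is set explicitly so the result is the same on R versions before 4.0.

data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10,
                   stringsAsFactors = FALSE)

# Hypothetical ordering vector: "z" is not in data$x2, and "c", "d" are omitted.
partial_vec <- c("b", "z", "e", "a")

# nomatch = 0 turns unmatched entries into index 0, which row indexing drops.
only_listed <- data[match(partial_vec, data$x2, nomatch = 0), ]
print(only_listed)

# Swapping the arguments ranks every row of data by its position in the vector;
# order() places rows that are missing from the vector (NA rank) last.
all_rows <- data[order(match(data$x2, partial_vec)), ]
print(all_rows)

The first variant keeps only the rows named in the vector, while the second keeps every row of data and moves the unlisted ones to the end in their original relative order.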

Method 2: Using left_join() Function of dplyr Package:

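First, install and load the dplyr package. left_join() keeps the rows of its first argument (x) in their original order, so joining a one-column data frame built from the vector with the original data frame returns the rows in the order of the vector.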

Syntax: left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

Parameters:

  • x, y: tbls to join
  • by: a character vector of variables to join by. If NULL, the default, *_join() will do a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they’re right (to suppress the message, simply explicitly list the variables that you want to join).
  • copy: If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.
  • suffix: If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

Code:

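# Install dplyr once if it is not already available
# install.packages("dplyr")
library("dplyr")

data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)

vec <- c("b", "e", "a", "c", "d")

# The left table fixes the row order; data supplies the other columns
new_dataset <- left_join(data.frame(x2 = vec),
                         data,
                         by = "x2")
print(new_dataset)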

Output:

  x2 x1 x3
1  b  2  7
2  e  5 10
3  a  1  6
4  c  3  8
5  d  4  9
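
Unlike the match() result, this output lists the join column x2 first and numbers the rows 1 to 5. If the original column order is wanted, one option is to select the columns by name afterwards. This is a minimal sketch: the name restored is hypothetical, and the guard assumes dplyr may not be installed.

data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10,
                   stringsAsFactors = FALSE)
vec <- c("b", "e", "a", "c", "d")

# Run only when dplyr is installed (an assumption of this sketch).
if (requireNamespace("dplyr", quietly = TRUE)) {
  joined <- dplyr::left_join(data.frame(x2 = vec, stringsAsFactors = FALSE),
                             data,
                             by = "x2")
  # Select the columns by name to restore the original x1, x2, x3 order.
  restored <- joined[, names(data)]
  print(restored)
}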

Last Updated : 26 Mar, 2021