
# Order DataFrame rows according to vector with specific order in R

In this article, we will see how to reorder the rows of a data frame so that the values in one of its columns follow the order given by a vector. There are two functions we can use to do this:

• match() function
• left_join() function

Example dataset:

```
data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)

data
#   x1 x2 x3
# 1  1  a  6
# 2  2  b  7
# 3  3  c  8
# 4  4  d  9
# 5  5  e 10
```

Vector with specific ordering:

```
vec <- c("b", "e", "a", "c", "d")
vec
# [1] "b" "e" "a" "c" "d"
```

Method 1: Using match() function to Sort Data Frame According to Vector.

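match() returns a vector of the positions of the (first) matches of its first argument in its second argument. Indexing the data frame with match(vec, data$x2) therefore selects its rows in the order given by vec.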

Syntax: match(x, table, nomatch = NA_integer_, incomparables = NULL)

Parameters:

• x: vector or NULL: the values to be matched. Long vectors are supported.
• table: vector or NULL: the values to be matched against. Long vectors are not supported.
• nomatch: the value to be returned in the case when no match is found. Note that it is coerced to integer.
• incomparables: A vector of values that cannot be matched. Any value in x matching a value in this vector is assigned the nomatch value. For historical reasons, FALSE is equivalent to NULL.
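To make the return value concrete, here is a minimal sketch of what match() gives for the example vector. The names idx, with_missing, and with_zero are illustrative only and do not appear elsewhere in this article.

```r
x2  <- letters[1:5]                      # same values as data$x2
vec <- c("b", "e", "a", "c", "d")

# Position of each element of vec inside x2
idx <- match(vec, x2)
idx
# [1] 2 5 1 3 4

# A value that is absent from the table yields NA by default
with_missing <- match(c("b", "z"), x2)
with_missing
# [1]  2 NA

# nomatch replaces that NA with another integer
with_zero <- match(c("b", "z"), x2, nomatch = 0)
with_zero
# [1] 2 0
```

Note that an NA index produces a row of NA values when used to subset a data frame, so it is worth checking that every value of the vector actually occurs in the column.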

Code:

```
# Example data frame
data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)

# Vector that defines the desired row order
vec <- c("b", "e", "a", "c", "d")

# Reorder the rows so that x2 follows vec
new_dataset <- data[match(vec, data$x2), ]
new_dataset
```

Output:

```
  x1 x2 x3
2  2  b  7
5  5  e 10
1  1  a  6
3  3  c  8
4  4  d  9
```

As the output shows, the rows of the new data frame follow the order of the values in the vector.
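One caveat with data[match(vec, data$x2), ] is that it returns only the rows whose x2 value appears in the vector. If the vector covers just some of the values, a possible variation (an assumption of this sketch, not part of the method above) is to swap the arguments of match() and pass the result to order(), which keeps every row and places the unmatched ones last. The names partial_vec, only_matched, and all_rows are illustrative.

```r
data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)

# A vector that names only some of the values in x2
partial_vec <- c("e", "b")

# Indexing by match(vec, column) keeps only the matched rows
only_matched <- data[match(partial_vec, data$x2), ]
only_matched$x2   # e, b

# order(match(column, vec)) keeps all rows; rows without a match
# get NA from match(), and order() sorts NA values last
all_rows <- data[order(match(data$x2, partial_vec)), ]
all_rows$x2       # e, b, a, c, d
```

When the vector contains every value of the column exactly once, both forms return the same rows in the same order.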

Method 2: Using left_join() Function of dplyr Package:

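First, we have to install and load the dplyr package. We can then use left_join() with a one-column data frame built from the vector as the left table. The result keeps the rows of the left table in their original order, so the joined rows follow the order of the vector.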

Syntax: left_join(x, y, by = NULL, copy = FALSE, suffix = c(".x", ".y"), ...)

Parameters:

• x, y: tbls to join
• by: a character vector of variables to join by. If NULL, the default, *_join() will do a natural join, using all variables with common names across the two tables. A message lists the variables so that you can check they’re right (to suppress the message, simply explicitly list the variables that you want to join).
• copy: If x and y are not from the same data source, and copy is TRUE, then y will be copied into the same src as x. This allows you to join tables across srcs, but it is a potentially expensive operation so you must opt into it.
• suffix: If there are non-joined duplicate variables in x and y, these suffixes will be added to the output to disambiguate them. Should be a character vector of length 2.

Code:

```
# Install (needed only once) and load dplyr
install.packages("dplyr")
library("dplyr")

# Example data frame
data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)

# Vector that defines the desired row order
vec <- c("b", "e", "a", "c", "d")

# Join data onto a data frame built from vec
new_dataset <- left_join(data.frame(x2 = vec),
                         data,
                         by = "x2")
print(new_dataset)
```

Output:

```
  x2 x1 x3
1  b  2  7
2  e  5 10
3  a  1  6
4  c  3  8
5  d  4  9
```
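In this output the join column x2 comes first, because it comes from the left table. If the original column order matters, one way to restore it is to index the result by the column names of data. This is a minimal sketch that assumes dplyr is already installed; the names joined and restored are illustrative.

```r
library(dplyr)

data <- data.frame(x1 = 1:5,
                   x2 = letters[1:5],
                   x3 = 6:10)
vec <- c("b", "e", "a", "c", "d")

# Rows follow vec, but the columns come out as x2, x1, x3
joined <- left_join(data.frame(x2 = vec), data, by = "x2")

# Put the columns back in the order they have in data
restored <- joined[, names(data)]
names(restored)
# [1] "x1" "x2" "x3"
```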
