
Sorting, Ordering, and Ranking: Unraveling R’s Powerful Functions

Last Updated : 02 Feb, 2024

R is a programming language widely used by data scientists and analysts. Its packages and built-in functions make data manipulation and transformation straightforward. In this article, we look at three of those functions, sort(), order(), and rank(), and at how each one is used.

Difference between the Sorting, Ordering, and Ranking Functions in R

sort()

This function returns the values of a vector arranged in ascending (the default) or descending order.

order()

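This function returns the permutation of indices that would sort the data, that is, the positions of the elements in sorted order. It is typically used to index a vector or the rows of a data frame.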

rank()

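This function returns the rank of each element, that is, the position the element would take in the sorted data. Ties are handled through the ties.method argument, with options such as ‘average’ (the default), ‘min’, ‘max’, ‘first’, ‘last’, and ‘random’.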
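
As a quick side-by-side comparison, the sketch below applies all three functions to one small vector. The vector v is made up for illustration and is not taken from the datasets used later in the article.

```r
# Hypothetical vector, chosen so that the three results differ
v <- c(30, 10, 40, 20)

sorted_v <- sort(v)    # the values themselves, in ascending order
order_v  <- order(v)   # positions of the values in sorted order
rank_v   <- rank(v)    # position each value would take after sorting

print(sorted_v)  # 10 20 30 40
print(order_v)   # 2 4 1 3
print(rank_v)    # 3 1 4 2
```

Reading the results: order() says the smallest value sits at position 2 of v, while rank() says the first value of v (30) is the third smallest.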

Sorting

Sorting is a fundamental operation in data analysis: it arranges data in a meaningful order. The sort() function sorts vectors, while a data frame is sorted by indexing its rows with order(). Before sorting, it can be useful to check whether the data is already sorted.

How to check if the data is sorted?

We can use the is.unsorted() function to check whether a vector is sorted. It returns TRUE when the vector is not in ascending order and FALSE when it is.

R




# Create a sample data frame
sample_data <- data.frame(
  ID = c(101, 102, 103, 104, 105),
  Value = c(15, 22, 30, 18, 25)
)
 
# Display the original data frame
cat("Original Data Frame:\n")
print(sample_data)
 
# Check whether the 'Value' column is unsorted (TRUE means it is NOT in ascending order)
cat("\nIs 'Value' column unsorted?\n")
print(is.unsorted(sample_data$Value))


Output:

Original Data Frame:
   ID Value
1 101    15
2 102    22
3 103    30
4 104    18
5 105    25

[1] TRUE

The output is TRUE because is.unsorted() reports whether the vector is not sorted: the Value column drops from 30 to 18, so it is not in ascending order.
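
To make the return values of is.unsorted() concrete, here is a small sketch with made-up vectors. The strictly argument is part of base R; the vector names are only illustrative.

```r
# Hypothetical vectors for illustration
sorted_vec   <- c(15, 18, 22, 25, 30)
unsorted_vec <- c(15, 22, 30, 18, 25)
with_repeat  <- c(15, 18, 18, 25)

res_sorted   <- is.unsorted(sorted_vec)     # FALSE: already ascending
res_unsorted <- is.unsorted(unsorted_vec)   # TRUE: 30 is followed by 18
res_repeat   <- is.unsorted(with_repeat)    # FALSE: equal neighbours are allowed
res_strict   <- is.unsorted(with_repeat, strictly = TRUE)  # TRUE: repeats break strict order

# Negate the result to answer "is it sorted?"
is_sorted <- !is.unsorted(sorted_vec)       # TRUE
print(c(res_sorted, res_unsorted, res_repeat, res_strict, is_sorted))
```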

Sorting Numeric Vector

Suppose we have a vector of students' marks and want to arrange it in order so that we can find the top scorer of the class.

R




# Numeric vector representing scores
scores <- c(85, 92, 78, 95, 89)
 
# Sort the scores in ascending order
sorted_scores <- sort(scores)
 
# Display the sorted scores
print(sorted_scores)


Output:

[1] 78 85 89 92 95
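
Since the goal is to find the top scorer, it may be more convenient to sort from highest to lowest. A minimal sketch using the decreasing argument of sort() on the same scores:

```r
# Same scores as in the example above
scores <- c(85, 92, 78, 95, 89)

# Sort from highest to lowest
sorted_desc <- sort(scores, decreasing = TRUE)
print(sorted_desc)   # 95 92 89 85 78

# The first element is now the highest score
top_score <- sorted_desc[1]
print(top_score)     # 95
```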

Sorting Character Vector

sort() also arranges character vectors in alphabetical order.

R




# Character vector representing city names
cities <- c("New York", "London", "Paris", "Tokyo", "Sydney")
 
# Sort the cities alphabetically
sorted_cities <- sort(cities)
 
# Display the sorted cities
print(sorted_cities)


Output:

[1] "London"   "New York" "Paris"    "Sydney"   "Tokyo"   

Sorting Data Frame

sort() does not accept a data frame. Instead, we pass a column to order() and use the resulting indices to rearrange the rows.

R




# Create a data frame
employee_data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Salary = c(60000, 75000, 50000, 90000)
)
 
# Sort the data frame based on Salary column
sorted_employee_data <- employee_data[order(employee_data$Salary), ]
 
# Display the sorted data frame
print(sorted_employee_data)


Output:

     Name Salary
3 Charlie  50000
1   Alice  60000
2     Bob  75000
4   David  90000
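
The sorted data frame keeps its original row names (3, 1, 2, 4). The sketch below sorts the same data from highest to lowest salary and resets the row names; the variable name by_salary_desc is only illustrative.

```r
employee_data <- data.frame(
  Name = c("Alice", "Bob", "Charlie", "David"),
  Salary = c(60000, 75000, 50000, 90000)
)

# Reorder the rows by Salary, highest first
by_salary_desc <- employee_data[order(employee_data$Salary, decreasing = TRUE), ]

# Reset the row names so they run 1, 2, 3, 4 again
rownames(by_salary_desc) <- NULL

print(by_salary_desc)
```

After the reset, David appears in row 1 and Charlie in row 4.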

Sorting Laptop Dataset

We will use an external dataset based on Laptop Price.

Loading and understanding the dataset

We first load the dataset using the read.csv() function; make sure you replace the path with the location of the file on your machine. The head() function displays the first 6 rows of the dataset.

R




#load the dataset
data<-read.csv('C:\\Users\\GFG19565\\Downloads\\Laptop_price.csv')
#display first 6 rows
head(data)


Output:

  X Manufacturer Category    Screen GPU OS CPU_core Screen_Size_cm CPU_frequency RAM_GB
1 0         Acer        4 IPS Panel   2  1        5         35.560           1.6      8
2 1         Dell        3   Full HD   1  1        3         39.624           2.0      4
3 2         Dell        3   Full HD   1  1        7         39.624           2.7      8
4 3         Dell        4 IPS Panel   2  1        5         33.782           1.6      8
5 4           HP        4   Full HD   2  1        7         39.624           1.8      8
6 5         Dell        3   Full HD   1  1        5         39.624           1.6      8
  Storage_GB_SSD Weight_kg Price
1            256      1.60   978
2            256      2.20   634
3            256      2.20   946
4            128      1.22  1244
5            256      1.91   837
6            256      2.20  1016

Checking if the dataset is sorted or not?

R




# Check if the dataframe is sorted
cat("\nIs the dataframe sorted by Price?\n")
print(all(diff(data$Price) >= 0))
 
# Sorting in ascending order based on 'Price' column
sorted_data_price_asc <- data[order(data$Price), ]
 
# Display the sorted values of the 'Price' column
cat("\nSorted Data (Ascending Order by Price):\n")
print(sorted_data_price_asc$Price)


Output:

Is the dataframe sorted by Price?

[1] FALSE

Sorted Data (Ascending Order by Price):
  [1]  527  558  616  634  634  685  697  710  723  727  733  735  761  761  761  786
 [17]  786  800  808  812  837  837  860  866  876  883  888  888  888  888  892  896
 [33]  913  913  922  925  934  935  939  939  939  946  951  951  975  977  978  989
 [49] 1000 1002 1003 1010 1013 1016 1023 1053 1053 1054 1057 1066 1068 1075 1085 1089
 [65] 1091 1091 1092 1105 1117 1117 1117 1117 1118 1119 1123 1129 1142 1142 1142 1142
 [81] 1146 1157 1167 1167 1172 1179 1184 1188 1192 1195 1198 1200 1200 1206 1206 1206
 [97] 1208 1213 1219 1219 1236 1241 1244 1244 1245 1251 1255 1256 1268 1269 1269 1283
[113] 1286 1294 1306 1310 1325 1327 1333 1333 1334 1371 1374 1383 1390 1392 1394 1396
[129] 1396 1396 1396 1404 1418 1419 1420 1421 1442 1452 1453 1460 1480 1498 1498 1499
[145] 1501 1507 1513 1515 1518 1523 1524 1531 1541 1544 1548 1561 1562 1598 1607 1611
[161] 1626 1632 1641 1648 1650 1656 1696 1702 1709 1714 1714 1714 1731 1739 1749 1763
[177] 1777 1777 1777 1813 1813 1815 1841 1842 1855 1861 1870 1872 1874 1880 1891 1904
[193] 1904 1904 1905 1950 1950 1953 1983 2006 2012 2031 2069 2082 2095 2095 2096 2096
[209] 2120 2124 2125 2147 2158 2208 2223 2236 2240 2255 2285 2312 2323 2340 2349 2361
[225] 2361 2414 2417 2509 2509 2623 2655 2712 3059 3073 3301 3665 3810 3810

Sorting in Descending Order

We can also sort by price in descending order to find the most expensive laptops. Negating the column inside order() reverses the order; this works because Price is numeric.

R




# Sorting in descending order based on 'Price' column
sorted_data_price_desc <- data[order(-data$Price), ]
 
# Display the sorted data
cat("\nSorted Data (Descending Order by Price):\n")
head(sorted_data_price_desc)


Output:

      X Manufacturer Category    Screen GPU OS CPU_core Screen_Size_cm CPU_frequency RAM_GB
65   64         Asus        1   Full HD   3  1        7         43.942           2.9     16
145 144       Lenovo        3 IPS Panel   3  1        7         43.180           2.8      8
78   77         Dell        5   Full HD   3  1        7         43.942           2.9     16
160 159        Razer        1   Full HD   3  1        7         35.560           2.8     16
181 180           HP        5   Full HD   3  1        7         39.624           2.8     16
122 121         Dell        5   Full HD   3  1        7         39.624           2.8      8
    Storage_GB_SSD Weight_kg Price
65             256      3.60  3810
145            256      3.40  3810
78             256      3.42  3665
160            256      1.95  3301
181            256      2.60  3073
122            256      1.78  3059
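
Negation only works for numeric columns. The sketch below, using small made-up vectors rather than the laptop dataset, compares it with the decreasing argument and with xtfrm(), which are two ways to get a descending order for character data.

```r
# Hypothetical values for illustration
prices <- c(978, 634, 946, 1244)
makers <- c("Dell", "Acer", "HP", "Asus")

# Two equivalent ways to get a descending order for numeric data
idx_neg <- order(-prices)
idx_dec <- order(prices, decreasing = TRUE)
print(idx_neg)   # 4 1 3 2
print(idx_dec)   # 4 1 3 2

# -makers would raise an error, so use decreasing = TRUE ...
idx_chr <- order(makers, decreasing = TRUE)
# ... or convert to a sortable numeric key first with xtfrm()
idx_xtfrm <- order(-xtfrm(makers))
print(makers[idx_chr])   # "HP" "Dell" "Asus" "Acer"
```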

Sorting Screen Size

R




# Sorting in ascending order based on 'Screen_Size_cm' column
sorted_data_screen_size_asc <- data[order(data$Screen_Size_cm), ]
 
# Display the sorted data
cat("\nSorted Data (Ascending Order by Screen Size):\n")
head(sorted_data_screen_size_asc)


Output:

      X Manufacturer Category    Screen GPU OS CPU_core Screen_Size_cm CPU_frequency RAM_GB
115 114       Lenovo        4 IPS Panel   2  1        5          30.48           2.5      8
162 161       Lenovo        4 IPS Panel   2  1        7          30.48           2.7      8
226 225       Lenovo        4 IPS Panel   2  1        7          30.48           2.7      8
236 235       Lenovo        4 IPS Panel   2  1        5          30.48           2.6      8
85   84           HP        4   Full HD   2  1        7          31.75           2.7      8
187 186         Dell        4   Full HD   2  1        7          31.75           2.8     16
    Storage_GB_SSD Weight_kg Price
115            256      1.36  1815
162            256      1.36  2012
226            256      1.36  2096
236            256      1.36  2236
85             256      1.26  1696
187            256      1.18  2361

Ordering

In R, the order() function returns the indices that would arrange the elements of a vector in sorted order. Those indices can then be used to reorder the vector itself or the rows of a data frame.

R




# Example vector
x <- c(5, 2, 8, 1, 3)
 
# Get the order of elements in ascending order
order_result <- order(x)
print(order_result)


Output:

[1] 4 2 5 1 3

The result is a vector of positions: the first value, 4, says that the smallest element (1) is at position 4 of x, and the last value, 3, says that the largest element (8) is at position 3.

Sorting a Data Frame Using Order

R




# Example data frame
df <- data.frame(ID = c(101, 102, 103, 104),
                 Value = c(25, 18, 32, 12))
 
# Use order to sort the data frame by the 'Value' column
sorted_df <- df[order(df$Value), ]
print(sorted_df)


Output:

   ID Value
4 104    12
2 102    18
1 101    25
3 103    32

Using Order for Descending Order

order() also accepts the argument decreasing = TRUE, which returns the indices that arrange the data from largest to smallest.

R




# Example vector
x <- c(5, 2, 8, 1, 3)
 
# Get the order of elements in descending order
order_desc <- order(x, decreasing = TRUE)
print(order_desc)


Output:

[1] 3 1 5 2 4

Difference between Order and Sort

We can also find the difference between these two functions with the help of an example.

R




# Example vector
x <- c(5, 2, 8, 1, 3)
 
# Order
order_result <- order(x)
 
# Sort
sort_result <- sort(x)
 
print("Order Result:")
print(x[order_result])
 
print("Sort Result:")
print(sort_result)


Output:

[1] "Order Result:"
[1] 1 2 3 5 8

[1] "Sort Result:"
[1] 1 2 3 5 8

sort() returns the sorted values directly, whereas order() returns the indices of the elements in sorted order. Indexing the vector with those indices, as in x[order(x)], gives the same result as sort(x).
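
One practical consequence is that the indices from order() can rearrange a second, parallel vector, which sort() cannot do. A minimal sketch, where the labels vector is a made-up companion to x:

```r
x <- c(5, 2, 8, 1, 3)
# Hypothetical labels, one per element of x
labels <- c("five", "two", "eight", "one", "three")

idx <- order(x)

# Indexing x with the order gives the same values as sort(x)
same_as_sort <- identical(x[idx], sort(x))
print(same_as_sort)     # TRUE

# The same indices reorder the parallel vector consistently
sorted_labels <- labels[idx]
print(sorted_labels)    # "one" "two" "three" "five" "eight"
```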

Ordering Amazon Dataset

In this example, we will use an external dataset, the Amazon Seller Order Status dataset. You can download it from Kaggle: https://www.kaggle.com/datasets/pranalibose/amazon-seller-order-status-prediction

Loading and Exploring Dataset

R




#load libraries
library(readxl)
#load dataset
data<- read_xlsx("C:\\Users\\GFG19565\\Downloads\\orders_data.xlsx")
#display dataset
head(data)


Output:

A tibble: 6 × 12
  order_no            order_date buyer ship_city ship_state sku   description quantity item_total
  <chr>               <chr>      <chr> <chr>     <chr>      <chr> <chr>       <chr>    <chr>     
1 405-9763961-5211537 Sun, 18 J… Mr.   CHANDIGA… CHANDIGARH SKU:… 100% Leath… 1        ₹449.00   
2 404-3964908-7850720 Tue, 19 O… Minam PASIGHAT, ARUNACHAL… SKU:… Women's Se… 1        ₹449.00   
3 171-8103182-4289117 Sun, 28 N… yati… PASIGHAT, ARUNACHAL… SKU:… Women's Se… 1        ₹449.00   
4 405-3171677-9557154 Wed, 28 J… aciya DEVARAKO… TELANGANA  SKU:… Pure 100% … 1        NA        
5 402-8910771-1215552 Tue, 28 S… Susm… MUMBAI,   MAHARASHT… SKU:… Pure Leath… 1        ₹1,099.00 
6 406-9292208-6725123 Thu, 17 J… Subi… HOWRAH,   WEST BENG… SKU:… Women's Tr… 1        ₹200.00 

Order by a Specific Column in Ascending Order

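We can order the rows by the order_date column. Note that order_date is stored as character (<chr>), so the rows are arranged alphabetically by the date string ("Fri, 1 Oc…" comes first), not chronologically. The column would have to be converted to a date type to find the earliest or latest orders.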

R




# Order by 'order_date' in ascending order
data_ordered_date <- data[order(data$order_date), ]
head(data_ordered_date)


Output:

A tibble: 6 × 12
  order_no            order_date buyer ship_city ship_state sku   description quantity item_total
  <chr>               <chr>      <chr> <chr>     <chr>      <chr> <chr>       <chr>    <chr>     
1 402-8678022-3083562 Fri, 1 Oc… Heena MUMBAI,   MAHARASHT… SKU:… 100% Pure … 1        ₹399.00   
2 402-6701060-6592325 Fri, 1 Oc… Heena MUMBAI,   MAHARASHT… SKU:… Women's Pu… 1        ₹399.00   
3 405-4776641-5401922 Fri, 1 Oc… Rath… AHMEDABA… GUJARAT    SKU:… Pure 100% … 1        ₹250.00   
4 171-7361479-0297146 Fri, 10 D… Amol  PUNE,     MAHARASHT… SKU:… Women's Se… 4        ₹1,796.00 
5 402-2278272-1998728 Fri, 10 D… Dalr… BENGALUR… KARNATAKA  SKU:… Women's Se… 1        ₹449.00   
6 171-3733329-6916359 Fri, 10 D… Shah… MUMBAI,   MAHARASHT… SKU:… Women's Se… 1        ₹449.00  
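
To order chronologically, the strings have to be parsed into dates first. The sketch below uses made-up date strings in day/month/year form; this format is an assumption for illustration and is not the format of the order_date column, which is truncated in the output above.

```r
# Hypothetical date strings (NOT the dataset's actual format)
date_strings <- c("10/12/2021", "1/10/2021", "9/12/2021")

# Parse the strings into Date objects using an explicit format
parsed_dates <- as.Date(date_strings, format = "%d/%m/%Y")

# Order by the parsed dates: earliest first
chronological <- date_strings[order(parsed_dates)]
print(chronological)   # "1/10/2021" "9/12/2021" "10/12/2021"

# Latest first
latest_first <- date_strings[order(parsed_dates, decreasing = TRUE)]
print(latest_first)    # "10/12/2021" "9/12/2021" "1/10/2021"
```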

Order by a Specific Column in Descending Order

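We can also order by the item_total column in descending order with decreasing = TRUE. Because item_total is stored as character, the comparison is alphabetical rather than numeric: ₹899.00 is placed ahead of larger amounts such as ₹1,099.00.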

R




# Order by 'item_total' in descending order
data_ordered_item_total_desc <- data[order(data$item_total, decreasing = TRUE), ]
head(data_ordered_item_total_desc)


Output:

A tibble: 6 × 12
  order_no            order_date buyer ship_city ship_state sku   description quantity item_total
  <chr>               <chr>      <chr> <chr>     <chr>      <chr> <chr>       <chr>    <chr>     
1 403-9089686-7304307 Mon, 6 De… J     BENGALUR… KARNATAKA  SKU:… Stunning W… 1        ₹899.00   
2 408-6770537-3774707 Sun, 17 O… Paro… Mumbai,   MAHARASHT… SKU:… Women's Se… 2        ₹898.00   
3 405-6918787-5602743 Wed, 25 A… Mosin MAHALING… KARNATAKA  SKU:… Ultra Slim… 1        ₹649.00   
4 402-2054361-4513137 Mon, 6 Se… Rame… JALESWAR, ODISHA     SKU:… Ultra Slim… 1        ₹649.00   
5 405-1111150-1834754 Sun, 5 Se… Jai   HYDERABA… TELANGANA  SKU:… Ultra Slim… 1        ₹649.00   
6 171-5917046-2682765 Thu, 7 Oc… Anku  GUWAHATI, ASSAM      SKU:… Ultra Slim… 1        ₹649.00 
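
For a numeric ordering, the currency symbol and thousands separators need to be stripped before converting to numbers. A minimal sketch, assuming the amounts follow the pattern seen in the output above; the item_total vector here is a small hand-made sample, not the full column.

```r
# Hand-made sample in the same style as the item_total column
item_total <- c("₹449.00", "₹1,099.00", NA, "₹200.00", "₹899.00")

# Keep only digits and the decimal point, then convert to numeric
amount <- as.numeric(gsub("[^0-9.]", "", item_total))
print(amount)            # 449 1099 NA 200 899

# Order by the numeric amount, largest first (NA goes last by default)
idx <- order(amount, decreasing = TRUE)
print(item_total[idx])   # "₹1,099.00" "₹899.00" "₹449.00" "₹200.00" NA
```

The same idx could be used to index the rows of the data frame, for example data[idx, ], once amount has been computed from the full column.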

Order by Multiple Columns

We can also order by multiple columns at once. Here we order by ship_state in ascending order and, within each state, by order_date in descending order. Passing both columns to a single order() call makes the second column break ties in the first; method = "radix" allows a separate decreasing flag for each column.

R




# Order by 'ship_state' (ascending); for rows with the same 'ship_state',
# order by 'order_date' (descending).
# With method = "radix", 'decreasing' may hold one flag per sort key.
final_order <- order(data$ship_state, data$order_date,
                     decreasing = c(FALSE, TRUE), method = "radix")
 
# Use the final order to rearrange the rows in the dataset
data_ordered_state_date_desc <- data[final_order, ]
head(data_ordered_state_date_desc)


The rows are now grouped by ship_state in ascending order, and within each state they appear in descending order of the order_date string.
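
Multi-key ordering is easier to follow on a tiny example. The data frame below is made up for illustration: state plays the role of ship_state, and day is a numeric stand-in for a date.

```r
# Hypothetical data: two states, two orders each
df <- data.frame(
  state = c("GOA", "ASSAM", "GOA", "ASSAM"),
  day   = c(3, 1, 7, 5),
  stringsAsFactors = FALSE
)

# Ascending by state, then descending by day within each state.
# Option 1: negate the numeric tie-breaker
idx_neg <- order(df$state, -df$day)

# Option 2: one decreasing flag per key (requires method = "radix")
idx_radix <- order(df$state, df$day,
                   decreasing = c(FALSE, TRUE), method = "radix")

print(idx_neg)        # 4 2 3 1
print(df[idx_neg, ])  # ASSAM day 5, ASSAM day 1, GOA day 7, GOA day 3
```

Both options give the same row order here. The second also works when the tie-breaking column is character, where negation is not available.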

Ranking

The rank() function in R computes the rank of each element of a vector, that is, the position the element would occupy if the vector were sorted. The example below uses the default settings.

R




# Example vector
scores <- c(80, 95, 80, 72, 90)
 
# Using rank with default ties.method ("average")
ranked_scores <- rank(scores)
 
# Display the result
cat("Original Scores:", scores, "\n")
cat("Ranked Scores:", ranked_scores, "\n")


Output:

Original Scores: 80 95 80 72 90 

Ranked Scores: 2.5 5 2.5 1 4 

The tied values (80 in this case) received an average rank of 2.5.
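
By default the lowest score gets rank 1. For a leaderboard where the highest score should be rank 1, one common approach is to rank the negated values; the sketch below also uses ties.method = "min" so that tied scores share the better position.

```r
scores <- c(80, 95, 80, 72, 90)

# Rank the negated scores so that the highest score gets rank 1;
# "min" gives tied scores the smallest rank of their group
position <- rank(-scores, ties.method = "min")
print(position)   # 3 1 3 5 2
```

Here 95 is in first place, 90 in second, both 80s share third place, and 72 is fifth (no fourth place is awarded).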

Additional Parameters

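The na.last argument controls how missing values are ranked. With na.last = TRUE, NA values receive the highest ranks. The other options are FALSE (NA values are ranked first), NA (NA values are removed), and "keep" (NA values keep the rank NA, which is the default).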

R




# Handling missing values
scores_with_na <- c(80, 95, NA, 72, 90)
ranked_scores_with_na <- rank(scores_with_na, na.last = TRUE)
scores_with_na
ranked_scores_with_na


Output:

[1] 80 95 NA 72 90

[1] 2 4 5 1 3
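
For comparison, the sketch below runs the same vector through the other na.last options.

```r
scores_with_na <- c(80, 95, NA, 72, 90)

# "keep" (the default): the NA stays NA in the result
rank_keep <- rank(scores_with_na, na.last = "keep")
print(rank_keep)    # 2 4 NA 1 3

# FALSE: the NA is ranked first, pushing the other ranks up by one
rank_first <- rank(scores_with_na, na.last = FALSE)
print(rank_first)   # 3 5 1 2 4

# NA: the missing value is dropped, so the result has only four ranks
rank_drop <- rank(scores_with_na, na.last = NA)
print(rank_drop)    # 2 4 1 3
```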

Ties Handling

We can use the ties.method argument to specify how tied values are ranked.

R




scores <- c(80, 95, 80, 72, 90)
 
# Using rank with different ties.method options
ranked_scores_first <- rank(scores, ties.method = "first")
ranked_scores_last <- rank(scores, ties.method = "last")
ranked_scores_random <- rank(scores, ties.method = "random")
 
# Display the results
cat("Ranked Scores (First):", ranked_scores_first, "\n")
cat("Ranked Scores (Last):", ranked_scores_last, "\n")
cat("Ranked Scores (Random):", ranked_scores_random, "\n")


Output:

Ranked Scores (First): 2 5 3 1 4 

Ranked Scores (Last): 3 5 2 1 4 

Ranked Scores (Random): 3 5 2 1 4 

  1. ranked_scores_first <- rank(scores, ties.method = "first"): This line ranks the scores using the “first” method. Tied values receive increasing ranks in the order in which they appear in the vector, so the two 80s get ranks 2 and 3.
  2. ranked_scores_last <- rank(scores, ties.method = "last"): This line ranks the scores using the “last” method. Tied values receive decreasing ranks in order of appearance, so the two 80s get ranks 3 and 2.
  3. ranked_scores_random <- rank(scores, ties.method = "random"): This line ranks the scores using the “random” method. Ties are broken in random order, so the ranks of the two 80s can change from run to run; the output above shows one possible outcome.
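
The remaining ties.method options give tied values one shared rank instead of distinct ones. A short sketch on the same scores:

```r
scores <- c(80, 95, 80, 72, 90)

# "min": tied values share the smallest rank of their group
rank_min <- rank(scores, ties.method = "min")
print(rank_min)   # 2 5 2 1 4

# "max": tied values share the largest rank of their group
rank_max <- rank(scores, ties.method = "max")
print(rank_max)   # 3 5 3 1 4

# "average" (the default): tied values share the mean of their ranks
rank_avg <- rank(scores, ties.method = "average")
print(rank_avg)   # 2.5 5.0 2.5 1.0 4.0
```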

Conclusion

In this article, we understood how to use rank, sort, and order functions in R with the help of different examples. We explored how these functions make data handling and manipulation easier.


