Slice() Function In R

Last Updated : 19 Jan, 2024

slice() is a function from the dplyr package in the R programming language that selects rows of a data frame by their position. It is used to make subsets of data frames: positive row indexes keep rows, and negative row indexes drop them.

R slice() function syntax:

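Syntax : slice(.data, ..., .by = NULL, .preserve = FALSE)

Parameters:

  1. .data: the data frame (or tibble) to subset.
  2. ... (three dots): integer row positions. Use positive values to keep rows or negative values to drop them; the values must be all positive or all negative, and positions beyond the last row are silently ignored.
  3. .by: optionally, columns to group by for this one operation, as an alternative to group_by() (available in dplyr 1.1.0 and later).
  4. .preserve: relevant for grouped data only; FALSE (the default) recalculates the grouping structure from the result, TRUE keeps it as it is.

The n argument (a number of rows) and the prop argument (a proportion of rows) belong to the helper functions slice_head(), slice_tail(), slice_sample(), slice_min() and slice_max(), not to slice() itself.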
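As a minimal sketch of how slice() treats row positions, the snippet below assumes a recent dplyr release and uses a made-up three-row tibble called demo_data (a hypothetical name, not part of the article's dataset). It shows that positions past the last row are ignored and that mixing positive and negative positions raises an error.

```r
library(dplyr)

# Hypothetical three-row tibble, used only for this illustration
demo_data <- tibble(ID = 1:3, Score = c(85, 70, 60))

# Positions past the last row are silently ignored: only rows 2 and 3 exist
in_range <- slice(demo_data, 2:5)
print(in_range)

# Positive and negative positions cannot be mixed; tryCatch() captures the error
mixed <- tryCatch(
  slice(demo_data, c(1, -2)),
  error = function(e) "error: positions must be all positive or all negative"
)
print(mixed)
```

The exact wording of the error differs between dplyr versions, which is why the sketch replaces it with its own message.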

In this article, we will learn about the R slice() function with the help of multiple examples.

Importing the dplyr Package for Data Manipulation

Installation (If Not Installed):

Install the dplyr package with install.packages("dplyr"). This package is required to run the slice() function.

install.packages("dplyr")

Loading the Package:

Load the dplyr package with library(dplyr) so that slice() and its related functions are available.

library(dplyr)

Let’s create a simple dataset to explore various scenarios using the slice() function.

The data frame ‘sample_data’ below consists of the ID, Age, Gender, Score1, Score2, Status, and Income columns for ten students. We will run the examples on this dataset to understand the slice() function.

R




# Creating a sample dataset
sample_data <- tibble(
  ID = c(1:10),
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29),
  Gender = c("Male", "Female", "Male", "Female", "Male", "Male", "Female", "Male",
             "Female", "Male"),
  Score1 = c(85, 70, 60, 75, 90, 80, 92, 78, 65, 88),
  Score2 = c(75, 82, 88, 95, 70, 68, 80, 85, 77, 93),
  Status = c("Active", "Inactive", "Active", "Active", "Inactive", "Active", "Inactive",
             "Active", "Inactive", "Active"),
  Income = c(50000, 60000, 75000, 55000, 80000, 90000, 72000, 65000, 82000, 70000)
)
# Display the sample dataset
print("Original dataset:")
print(sample_data)


Output:

[1] "Original dataset:"
# A tibble: 10 × 7
      ID   Age Gender Score1 Score2 Status   Income
   <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
 1     1    25 Male       85     75 Active    50000
 2     2    30 Female     70     82 Inactive  60000
 3     3    35 Male       60     88 Active    75000
 4     4    28 Female     75     95 Active    55000
 5     5    22 Male       90     70 Inactive  80000
 6     6    40 Male       80     68 Active    90000
 7     7    33 Female     92     80 Inactive  72000
 8     8    26 Male       78     85 Active    65000
 9     9    38 Female     65     77 Inactive  82000
10    10    29 Male       88     93 Active    70000

slice() Function in R Examples

Single Row Selection

We can select a single row of a dataset by passing the index of the row we want. Here we select the 3rd row of the ‘sample_data’ dataset: we pass the dataset and the row index as arguments to slice(), and it returns the 3rd row.

R




single_row <- slice(sample_data, 3)
print(single_row)


Output:

# A tibble: 1 × 7
     ID   Age Gender Score1 Score2 Status Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>   <dbl>
1     3    35 Male       60     88 Active  75000
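The same idea extends to the last row. As a hedged sketch (demo_data is a hypothetical, trimmed copy of the article's dataset with two columns only), the dplyr helper n() can stand in for the row count inside slice():

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# n() evaluates to the number of rows, so this picks the last row
last_row <- slice(demo_data, n())
print(last_row)
```

This avoids hard-coding the row count, so the call keeps working if rows are added or removed.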

Multiple Row Selection

We can select multiple rows of a dataset by passing their indexes as a vector created with the c() function. Here we select the 1st, 5th, and 8th rows of the ‘sample_data’ dataset, so only those three rows are printed.

R




multiple_rows <- slice(sample_data, c(1, 5, 8))
print(multiple_rows)


Output:

# A tibble: 3 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     1    25 Male       85     75 Active    50000
2     5    22 Male       90     70 Inactive  80000
3     8    26 Male       78     85 Active    65000
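The order of the indexes matters. In the sketch below (again on a hypothetical, trimmed demo_data), slice() returns the rows in the order in which the indexes are given rather than in their original order:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# Rows come back in the order requested: 8th, then 1st, then 5th
reordered <- slice(demo_data, c(8, 1, 5))
print(reordered)
```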

Range Selection

We can select a range of consecutive rows by passing an index range written with the colon operator. Here we select the 2nd through the 6th rows of the ‘sample_data’ dataset and print them.

R




range_selection <- slice(sample_data, 2:6)
print(range_selection)


Output:

# A tibble: 5 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     2    30 Female     70     82 Inactive  60000
2     3    35 Male       60     88 Active    75000
3     4    28 Female     75     95 Active    55000
4     5    22 Male       90     70 Inactive  80000
5     6    40 Male       80     68 Active    90000
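For comparison, here is a small sketch that sets the range selection next to base R bracket indexing. The demo_data tibble is a hypothetical, trimmed copy of the article's data; both calls are expected to pick the same rows.

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# dplyr: select rows 2 to 6 by position
with_slice <- slice(demo_data, 2:6)

# Base R: the same rows with bracket indexing
with_base <- demo_data[2:6, ]

# Check that both approaches picked the same IDs
print(identical(with_slice$ID, with_base$ID))
```

The slice() form fits more naturally into a %>% pipeline, which is the main reason to prefer it in dplyr code.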

Negative Indexing

We can exclude rows from the selection by using negative indexing: put a ‘-’ sign in front of the c() call that lists the indexes we don’t need. slice() then drops those rows and keeps the rest. Here we exclude the 3rd and 7th rows and print the remaining rows.

R




negative_indexing <- slice(sample_data, -c(3, 7))
print(negative_indexing)


Output:

# A tibble: 8 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     1    25 Male       85     75 Active    50000
2     2    30 Female     70     82 Inactive  60000
3     4    28 Female     75     95 Active    55000
4     5    22 Male       90     70 Inactive  80000
5     6    40 Male       80     68 Active    90000
6     8    26 Male       78     85 Active    65000
7     9    38 Female     65     77 Inactive  82000
8    10    29 Male       88     93 Active    70000
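Negative indexing also works with ranges and with n(). A minimal sketch, assuming the hypothetical trimmed tibble demo_data:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# Drop the first three rows; the parentheses make the minus apply to the whole range
without_first_three <- slice(demo_data, -(1:3))
print(without_first_three)

# Drop the last row, whatever the number of rows is
without_last <- slice(demo_data, -n())
print(without_last)
```

Note the parentheses in -(1:3): without them, -1:3 would be the sequence from -1 to 3, which mixes negative and positive positions.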

Conditional Selection

We can use conditions to select the rows of a dataset. slice() itself expects row positions, so the condition is wrapped in the which() function, which returns the positions of the rows where the condition is TRUE. Here which() finds the rows where Age is greater than 30, and slice() selects those rows.

R




conditional_selection <- slice(sample_data, which(sample_data$Age > 30))
print(conditional_selection)


Output:

# A tibble: 4 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     3    35 Male       60     88 Active    75000
2     6    40 Male       80     68 Active    90000
3     7    33 Female     92     80 Inactive  72000
4     9    38 Female     65     77 Inactive  82000
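In dplyr, a condition like this is more commonly written with filter(). The sketch below (using the hypothetical trimmed tibble demo_data) shows that both forms select the same rows, and that inside slice() a column can be referred to by name without the sample_data$ prefix:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# slice() needs positions, so which() converts the condition into row numbers
via_slice <- slice(demo_data, which(Age > 30))

# filter() takes the logical condition directly
via_filter <- filter(demo_data, Age > 30)

print(via_slice)
print(identical(via_slice$ID, via_filter$ID))
```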

Top N Rows

We can select the top rows by using the slice_head() function. We pass the dataset and number of rows as arguments. Here we are selecting the top 4 (n=4) rows of the dataset.

R




top_n_rows <- slice_head(sample_data, n = 4)
print(top_n_rows)


Output:

# A tibble: 4 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     1    25 Male       85     75 Active    50000
2     2    30 Female     70     82 Inactive  60000
3     3    35 Male       60     88 Active    75000
4     4    28 Female     75     95 Active    55000
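Instead of a fixed number of rows, slice_head() also accepts a proportion through its prop argument. A minimal sketch on the hypothetical trimmed tibble demo_data, selecting the first half of the rows:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# prop = 0.5 keeps the first 50% of the rows (5 of the 10 rows here)
first_half <- slice_head(demo_data, prop = 0.5)
print(first_half)
```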

Bottom N Rows

We can select the bottom rows by using the slice_tail() function. We pass the dataset and number of rows as arguments. Here we are selecting the bottom 3 (n=3) rows of the dataset.

R




bottom_n_rows <- slice_tail(sample_data, n = 3)
print(bottom_n_rows)


Output:

# A tibble: 3 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     8    26 Male       78     85 Active    65000
2     9    38 Female     65     77 Inactive  82000
3    10    29 Male       88     93 Active    70000
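The slice helpers also respect groups. As a hedged sketch (demo_data is a hypothetical, trimmed copy of the article's data), grouping by Gender before slice_tail() returns the last row of each group rather than the last row overall:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (ID and Gender only)
demo_data <- tibble(
  ID     = 1:10,
  Gender = c("Male", "Female", "Male", "Female", "Male",
             "Male", "Female", "Male", "Female", "Male")
)

# Take the last row within each Gender group, then drop the grouping
last_per_gender <- demo_data %>%
  group_by(Gender) %>%
  slice_tail(n = 1) %>%
  ungroup()
print(last_per_gender)
```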

Random Row Selection

We can select rows at random with the slice_sample() function by passing the dataset and the number of rows to select as arguments. Here we select two random rows from the dataset. Because the selection is random, the rows you get will usually differ from the output shown below.

R




random_rows <- slice_sample(sample_data, n = 2)
print(random_rows)


Output:

# A tibble: 2 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     9    38 Female     65     77 Inactive  82000
2     1    25 Male       85     75 Active    50000
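To make a random selection repeatable, we can fix the random seed with set.seed() before sampling. A minimal sketch, where the seed value 42 and the trimmed tibble demo_data are arbitrary choices for illustration:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# Setting the same seed before each draw reproduces the same rows
set.seed(42)
first_draw <- slice_sample(demo_data, n = 2)

set.seed(42)
second_draw <- slice_sample(demo_data, n = 2)

print(first_draw)
print(identical(first_draw$ID, second_draw$ID))
```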

Alternate Row Selection

We can select alternate rows of a dataset by combining the slice() function with the seq() function: the sequence starts at index 1 and increases by 2 up to the number of rows of the dataset, so the 1st, 3rd, 5th, 7th, and 9th rows are selected.

R




alternate_rows <- slice(sample_data, seq(1, nrow(sample_data), by = 2))
print(alternate_rows)


Output:

# A tibble: 5 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     1    25 Male       85     75 Active    50000
2     3    35 Male       60     88 Active    75000
3     5    22 Male       90     70 Inactive  80000
4     7    33 Female     92     80 Inactive  72000
5     9    38 Female     65     77 Inactive  82000
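Starting the sequence at 2 selects the even-numbered rows instead. A short sketch on the hypothetical trimmed tibble demo_data, using n() in place of nrow():

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (two columns only)
demo_data <- tibble(
  ID  = 1:10,
  Age = c(25, 30, 35, 28, 22, 40, 33, 26, 38, 29)
)

# Start at row 2 and step by 2 up to the last row
even_rows <- slice(demo_data, seq(2, n(), by = 2))
print(even_rows)
```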

The slice() function can be combined with other dplyr functions; the ones most commonly combined with it are arrange() and filter().

arrange() function:

In R, arrange() is a function that sorts a data frame by one or more variables. Here we sort the dataset by Score1 in descending order and print the top 3 rows of the sorted dataset.

R




sorted_top_3 <- sample_data %>% arrange(desc(Score1)) %>% slice_head(n = 3)
print(sorted_top_3)


Output:

# A tibble: 3 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     7    33 Female     92     80 Inactive  72000
2     5    22 Male       90     70 Inactive  80000
3    10    29 Male       88     93 Active    70000
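dplyr also offers slice_max(), which picks the rows with the largest values of a column in one step. The sketch below (on a hypothetical, trimmed demo_data) is an alternative to the arrange() plus slice_head() combination, not a replacement for it. One difference to keep in mind is that slice_max() keeps tied rows by default (with_ties = TRUE), so it can return more than n rows when values repeat.

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (ID and Score1 only)
demo_data <- tibble(
  ID     = 1:10,
  Score1 = c(85, 70, 60, 75, 90, 80, 92, 78, 65, 88)
)

# The three rows with the highest Score1 values
top_3 <- slice_max(demo_data, Score1, n = 3)
print(top_3)
```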

filter() function:

In R, the filter() function keeps the rows that satisfy a condition. Here we filter the dataset to the rows where Age is greater than 25 and print the first two rows of the result.

R




age_gt_25_top_2 <- sample_data %>% filter(Age > 25) %>% slice_head(n = 2)
print(age_gt_25_top_2)


Output:

# A tibble: 2 × 7
     ID   Age Gender Score1 Score2 Status   Income
  <int> <dbl> <chr>   <dbl>  <dbl> <chr>     <dbl>
1     2    30 Female     70     82 Inactive  60000
2     3    35 Male       60     88 Active    75000
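The three functions can also be chained in a single pipeline. As a sketch under the same assumptions (demo_data is a hypothetical, trimmed copy of the article's data), the code below keeps the active rows, sorts them by Income in descending order, and takes the first two:

```r
library(dplyr)

# Trimmed, hypothetical copy of the article's data (ID, Status and Income only)
demo_data <- tibble(
  ID     = 1:10,
  Status = c("Active", "Inactive", "Active", "Active", "Inactive",
             "Active", "Inactive", "Active", "Inactive", "Active"),
  Income = c(50000, 60000, 75000, 55000, 80000,
             90000, 72000, 65000, 82000, 70000)
)

# filter() narrows the rows, arrange() orders them, slice_head() takes the top two
top_active <- demo_data %>%
  filter(Status == "Active") %>%
  arrange(desc(Income)) %>%
  slice_head(n = 2)
print(top_active)
```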

Conclusion

In conclusion, slice() is a function from the dplyr package in R that extracts subsets of rows from a dataset by position. In this article, we covered the basic syntax of slice(), its parameters, various examples of using it, and the functions it is most commonly combined with, arrange() and filter().


