Slice() From Dplyr In R

Last Updated : 08 Feb, 2024

With so much data around us today, working with it can be difficult. The dplyr package for R is a powerful and versatile tool for data manipulation. It provides many functions, and among them slice() is particularly useful for extracting specific rows from a data frame based on their indexes (positions).

In this article, we will look at the slice() function in detail and explore how it can help with data manipulation.

Introduction to Slice() function in dplyr

The slice() function in dplyr allows users to subset data frames by selecting specific rows based on their indexes or positions. In simple words, as the word slice suggests, it is like taking a piece or part of a data frame using the position of data. Using this function, we could get any desired part of the dataframe and we could use that part for some other purposes.

This function has a simple syntax and integrates easily with other dplyr functions, which makes it a valuable tool for data wrangling tasks. The basic syntax of the slice() function is:

slice(.data, ..., .preserve = FALSE)

Here,

  • .data – the data frame or tibble to be sliced
  • ... – the row positions to keep (positive integers) or to drop (negative integers)
  • .preserve – controls what happens to the grouping structure of a grouped data frame: it is recalculated from the resulting rows when FALSE (the default) and kept as is when TRUE
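As a small illustration of how row positions behave on grouped data, the sketch below takes the first row of each cyl group. It assumes dplyr is already installed and uses the built-in mtcars dataset. On a grouped data frame, the positions are counted within each group.

```r
library(dplyr)

# Group the built-in mtcars dataset by number of cylinders,
# then keep position 1 within each group
first_per_cyl <- slice(group_by(mtcars, cyl), 1)

print(first_per_cyl)
```

The result has one row per distinct value of cyl.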

Now, let us look at how to use the slice() function, step by step.

Steps to implement Slicing in R

Step 1: Install and load the packages

The first step is to install the dplyr package, which provides this function, and load it into the environment.

install.packages('dplyr')

library(dplyr)

Step 2: Data preparation

Next, we need some data to slice, so this step prepares a small data frame.

df <- data.frame(
  id = c(101, 102, 103, 104, 105),
  name = c('Madhu', 'Ram', 'Krishna', 'Radha', 'Lakshmi'),
  gender = c('F', 'M', 'M', 'F', 'F'),
  dob = as.Date(c('1992-05-15', '1988-12-31', '1995-07-20', '1990-03-10', '1987-11-05')),
  state = c('CA', 'NY', 'TX', 'FL', 'WA'),
  stringsAsFactors = FALSE
)

The code above creates a basic data frame with several rows and columns using data.frame(). We can either create our own data frame or work with an existing dataset.

Step 3: Slice operation

Now it is time to perform the slice operation on the data frame.

df2 <- df %>% slice(2,3)

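Here slice() selects rows 2 and 3 of df, and the result is stored in a new variable, df2. Each argument to slice() is an individual row position, not the start and end of a range: slice(2, 5) would return only rows 2 and 5. To select a range, use the colon operator, as in slice(2:5), which includes both endpoints.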
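How slice() treats separate positions versus a range can be checked with a short sketch. It rebuilds a reduced copy of the data frame from Step 2, keeping only the id and name columns purely for illustration.

```r
library(dplyr)

# Reduced copy of the Step 2 data frame (illustrative)
df <- data.frame(
  id = c(101, 102, 103, 104, 105),
  name = c('Madhu', 'Ram', 'Krishna', 'Radha', 'Lakshmi')
)

# Two separate positions: rows 2 and 5 only
two_rows <- slice(df, 2, 5)

# A range built with the colon operator: rows 2, 3, 4 and 5
row_range <- slice(df, 2:5)

print(two_rows)
print(row_range)
```

The first call returns two rows and the second returns four, which shows that comma-separated arguments are individual positions rather than range limits.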

Note: The code above uses the pipe operator (%>%), which comes from the magrittr package and is made available when dplyr is loaded. It passes the data frame on its left as the first argument to the function on its right.
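To make the role of the pipe concrete, the sketch below compares a piped call with the equivalent direct call on the built-in mtcars dataset. Both forms are expected to return the same rows.

```r
library(dplyr)

# Piped form: mtcars is passed as the first argument of slice()
piped <- mtcars %>% slice(2, 3)

# Equivalent direct call
direct <- slice(mtcars, 2, 3)

# Compare the two results
print(identical(piped, direct))
```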

These were the steps involved in using the slice() function. Now, let us look at the related slicing functions in the dplyr package, which make data analysis much simpler.

Types of Slicing Methods

The dplyr package provides several other slicing functions to cater to different needs, such as selecting rows by index, choosing the first or last rows, extracting the rows with the minimum or maximum values of a column, or randomly sampling rows from a dataset.

Now, let us see each type in detail with an example. I will use the mtcars dataset, which ships with R, to demonstrate each slicing method.

1. slice(): Slices the dataframe by row index

This function slices the data frame using row indexes. We can select a single row, a range of rows, or multiple non-contiguous rows. The syntax is shown below.

one_row <- slice(df, n) # Slice nth row

rows_in_range <- slice(df, n1:n2) #Slice rows in range n1 to n2

multiple_rows <- slice(df, c(n1,n3,n6)) #Slice non-continuous rows using vector

R




#install and load package
install.packages('dplyr')
library(dplyr)
#load dataset to df variable
df <- mtcars
# Slice nth row
one_row <- slice(df, 2) 
cat("The 2nd row is:\n")
print(one_row)
# Slice rows in range n1 to n2
rows_in_range <- slice(df, 2:6)
cat("The rows in the range of 2 and 6 are:\n")
print(rows_in_range)
 
# Slice non-contiguous rows using a vector
multiple_rows <- slice(df, c(4,6,8))
cat("The 4th, 6th, 8th rows are:\n")
print(multiple_rows)


Output:

The 2nd row is:
              mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag  21   6  160 110  3.9 2.875 17.02  0  1    4    4
The rows in the range of 2 and 6 are:
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
The 4th, 6th, 8th rows are:
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
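Row positions do not have to be hard-coded. As a sketch, dplyr's n() helper, which gives the number of rows in the current group (here the whole data frame), can be used inside slice() to reach rows counted from the end.

```r
library(dplyr)

df <- mtcars

# Last row of the data frame
last_row <- slice(df, n())

# Last two rows of the data frame
last_two <- slice(df, (n() - 1):n())

print(last_row)
print(last_two)
```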

2. slice_head(): Select the top rows

This function returns the first rows of a data frame. The argument n specifies how many rows to take from the top.

head_df <- slice_head(df, n = number) # Select the first n rows

print(head_df)

R




head_df <- slice_head(df, n = 4)
cat("The first 4 rows are:\n")
print(head_df)


Output:

The first 4 rows are:
                mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4      21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag  21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710     22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive 21.4   6  258 110 3.08 3.215 19.44  1  0    3    1

3. slice_tail(): Select the bottom rows

This function is similar to the above but is for the bottom part of the dataframe.

tail_df <- slice_tail(df, n = number) # Select the last rows

print(tail_df)

R




tail_df <- slice_tail(df, n = 4)
cat("The last 4 rows are:\n")
print(tail_df)


Output:

The last 4 rows are:
                mpg cyl disp  hp drat   wt qsec vs am gear carb
Ford Pantera L 15.8   8  351 264 4.22 3.17 14.5  0  1    5    4
Ferrari Dino   19.7   6  145 175 3.62 2.77 15.5  0  1    5    6
Maserati Bora  15.0   8  301 335 3.54 3.57 14.6  0  1    5    8
Volvo 142E     21.4   4  121 109 4.11 2.78 18.6  1  1    4    2
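Besides n, slice_head() and slice_tail() also accept a prop argument that selects a fraction of the rows instead of a fixed count. A minimal sketch on mtcars, which has 32 rows:

```r
library(dplyr)

df <- mtcars

# First quarter of the rows (32 * 0.25 = 8 rows)
head_quarter <- slice_head(df, prop = 0.25)

# Last quarter of the rows
tail_quarter <- slice_tail(df, prop = 0.25)

print(nrow(head_quarter))
print(nrow(tail_quarter))
```

Using prop is convenient when the size of the data frame is not known in advance.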

4. slice_min(): Select the minimum of a column

As the name suggests, this function returns the rows with the smallest values of a column, which is specified with the order_by argument. By default it returns one row (n = 1) and keeps ties (with_ties = TRUE), so more than one row may come back when several rows share the minimum.

min_df <- slice_min(df, order_by = B) # Select the row(s) with the minimum value in column 'B'

print(min_df)

R




# Select the row(s) with the minimum value in column 'mpg'
min_df <- slice_min(df, order_by = mpg)
cat("The row with the least mpg:\n")
print(min_df)


Output:

The row with the least mpg:
                     mpg cyl disp  hp drat    wt  qsec vs am gear carb
Cadillac Fleetwood  10.4   8  472 205 2.93 5.250 17.98  0  0    3    4
Lincoln Continental 10.4   8  460 215 3.00 5.424 17.82  0  0    3    4
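Two rows are returned above because both cars share the lowest mpg value and slice_min() keeps ties by default. The sketch below shows how the n and with_ties arguments change the number of rows returned.

```r
library(dplyr)

df <- mtcars

# Default behaviour: ties are kept, so every row sharing the minimum is returned
min_with_ties <- slice_min(df, order_by = mpg, n = 1)

# with_ties = FALSE returns exactly n rows
min_single <- slice_min(df, order_by = mpg, n = 1, with_ties = FALSE)

# The three rows with the lowest mpg
lowest_three <- slice_min(df, order_by = mpg, n = 3)

print(nrow(min_with_ties))
print(nrow(min_single))
print(nrow(lowest_three))
```

The same arguments work with slice_max().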

5. slice_max(): Select the maximum of a column

This function is the opposite of slice_min(). It selects the rows with the largest values of the column given in order_by.

max_df <- slice_max(df, order_by = B) # Select the row(s) with the maximum value in column 'B'

print(max_df)

R




# Select the row with the maximum value in column 'disp'
max_df <- slice_max(df, order_by = disp)
cat("The row with the maximum disp:\n")
print(max_df)


Output:

The row with the maximum disp:
                    mpg cyl disp  hp drat   wt  qsec vs am gear carb
Cadillac Fleetwood 10.4   8  472 205 2.93 5.25 17.98  0  0    3    4

6. slice_sample(): Select random rows

As the name suggests, this function selects random rows from the data frame. The argument n specifies how many rows to select.

random_df <- slice_sample(df, n = number) # select n random rows

print(random_df)

R




random_df <- slice_sample(df, n = 3)
cat("3 random rows are:\n")
print(random_df)


Output (the rows are sampled at random, so they will differ between runs):

3 random rows are:
               mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4     21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Honda Civic   30.4   4  75.7  52 4.93 1.615 18.52  1  1    4    2
Maserati Bora 15.0   8 301.0 335 3.54 3.570 14.60  0  1    5    8
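Because the selection is random, repeated runs usually return different rows. A common way to make the result repeatable is to fix the random seed with set.seed() before sampling. The sketch below uses an arbitrary seed value of 42 and also shows the prop argument, which samples a fraction of the rows.

```r
library(dplyr)

df <- mtcars

# Fixing the seed makes the random selection repeatable
set.seed(42)
sample_a <- slice_sample(df, n = 3)

set.seed(42)
sample_b <- slice_sample(df, n = 3)

# Both samples contain the same rows
print(identical(sample_a, sample_b))

# prop selects a fraction of the rows instead of a fixed count
sample_quarter <- slice_sample(df, prop = 0.25)
print(nrow(sample_quarter))
```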

More Examples on Slice()

Now, let us look at some examples of slicing and where it is used in data analysis.

Example 1: Suppose we want to find the rows that meet a particular condition. Let's say we have a data frame of student names and their scores, and we need all the students whose score is above 85. First we create the data frame, and then we use slice() to select the rows where Score > 85.

R




# Create a dataframe
class_score <- data.frame(
  ID = 1:5,
  Name = c("Krishna", "Sony", "Priya", "Rahul", "Rama"),
  Score = c(85, 92, 78, 88, 95)
)
 
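# Print the original dataframe
print(class_score)

# Slice the dataframe based on condition
top_scorers <- class_score %>% slice(which(Score > 85))
 
# Print the top scorers
cat("dataframe based on condition that score > 85\n")
print(top_scorers)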


Output:

  ID    Name Score
1  1 Krishna    85
2  2    Sony    92
3  3   Priya    78
4  4   Rahul    88
5  5    Rama    95
dataframe based on condition that score > 85
  ID  Name Score
1  2  Sony    92
2  4 Rahul    88
3  5  Rama    95

We created a data frame called class_score with the columns ID, Name, and Score. Then we used the pipe operator, as discussed before, to pass class_score to slice() and stored the result in a new variable. The argument of slice() is a call to which(), which returns the positions at which the condition Score > 85 is TRUE. slice() then keeps exactly those rows.
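For a condition like this one, dplyr's filter() function is the more direct tool, because it takes the condition itself rather than row positions. The sketch below, which rebuilds the same class_score data frame, shows that both approaches select the same students.

```r
library(dplyr)

class_score <- data.frame(
  ID = 1:5,
  Name = c("Krishna", "Sony", "Priya", "Rahul", "Rama"),
  Score = c(85, 92, 78, 88, 95)
)

# Position-based selection: which() turns the condition into row positions
by_position <- class_score %>% slice(which(Score > 85))

# Condition-based selection with filter()
by_condition <- class_score %>% filter(Score > 85)

print(by_condition)
```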

Example 2: Let us now take the example dataset of cricket teams and their scores. Our task is to find the top teams from the dataframe.

R




# cricket teams dataframe
cricket_data <- data.frame(
  Team = c("India", "Australia", "England", "Pakistan", "South Africa"),
  Score = c(320, 289, 275, 241, 305)
)
 
cricket_data
# Arrange data in descending order of scores and then select the top 3 rows
top_scores <- cricket_data %>% arrange(desc(Score)) %>% slice(1:3) 
 
 
# Display the top_scores dataset
print("Top 3 Scores:")
print(top_scores)


Output:

          Team Score
1        India   320
2    Australia   289
3      England   275
4     Pakistan   241
5 South Africa   305
[1] "Top 3 Scores:"
          Team Score
1        India   320
2 South Africa   305
3    Australia   289

Here, we took a data frame of cricket teams and their scores and found the top scorers, again using the pipe operator. First, we arranged the data frame in decreasing order of the Score column, and then we used slice() to keep the first three rows, which are the top three scores.
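The same result can be sketched in a single step with slice_max(), described earlier, which removes the need for a separate arrange() call. The variable name top_scores_max is used here only for illustration.

```r
library(dplyr)

cricket_data <- data.frame(
  Team = c("India", "Australia", "England", "Pakistan", "South Africa"),
  Score = c(320, 289, 275, 241, 305)
)

# Top 3 rows by Score without an explicit arrange() step
top_scores_max <- slice_max(cricket_data, order_by = Score, n = 3)

print(top_scores_max)
```

Note that slice_max() keeps ties by default, so it can return more than n rows when scores are equal, whereas arrange() followed by slice(1:3) always returns exactly three.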

Conclusion

The slice() function in the dplyr package is a powerful tool for extracting specific rows from a data frame based on their positions. It is simple to use and can be mastered with a little practice. By mastering it, data analysts and scientists can streamline their data wrangling tasks and spend more time drawing insights from their datasets.

FAQs on Slice() function

What does slice() function do in R?

The slice function helps us extract rows from any data frame using their positions.

How does slice() differ from other functions like filter() in dplyr?

The filter() function extracts rows that satisfy a condition on their values, whereas slice() selects rows purely by their position.

Does slice() modify the original data frame?

No, it doesn't modify the original data frame. It returns a new data frame containing the selected rows.

Can I use negative positions with slice()?

Yes. Negative positions tell slice() which rows to drop instead of which rows to keep. Positive and negative positions cannot be mixed in the same call.
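A minimal sketch of negative positions on the built-in mtcars dataset, which has 32 rows:

```r
library(dplyr)

df <- mtcars

# Drop the first row and keep the rest
without_first <- slice(df, -1)

# Drop the first three rows
without_first_three <- slice(df, -(1:3))

print(nrow(without_first))
print(nrow(without_first_three))
```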

What happens if I provide positions that are out of range?

If a position is larger than the number of rows in the data frame, slice() silently ignores it rather than raising an error. If none of the positions exist, the result is a data frame with zero rows.
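How slice() handles positions beyond the last row can be checked directly. The sketch below asks for position 100 in mtcars, which has only 32 rows. With current dplyr versions, such positions are expected to be dropped without an error.

```r
library(dplyr)

df <- mtcars  # 32 rows

# Position 100 does not exist in df
beyond <- slice(df, 100)

# A valid position combined with a non-existent one
mixed <- slice(df, c(2, 100))

print(nrow(beyond))
print(nrow(mixed))
```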


