How to print the first or last rows of a data set

Last Updated : 26 Mar, 2024

A data set is a collection of data organized in tabular form, like a spreadsheet or a database table. Programming languages represent data sets in different ways: C++ uses vectors or arrays of structs/objects, Python uses pandas DataFrames, and Java uses 2D arrays or collections. The R programming language has a built-in data structure called a data frame, which is similar to a pandas DataFrame in Python.

The following example creates and prints a data frame in R:

R




# Creating a data frame
my_data <- data.frame(
  S.No. = c(1, 2, 3, 4, 5),
  Name = c("Ram", "Rohan", "Aman", "Rahul", "Ravi"),
  Age = c(19, 20, 18, 21, 20),
  Score = c(95, 85, 80, 90, 88)
)
 
# Display the data frame
print(my_data)


Output:

  S.No.  Name Age Score
1     1   Ram  19    95
2     2 Rohan  20    85
3     3  Aman  18    80
4     4 Rahul  21    90
5     5  Ravi  20    88

Why extract the first or last rows of a dataset?

  1. Data Validation: Printing and checking the first and last rows helps validate that the dataset was loaded correctly.
  2. Better Understanding of Data: When you are working with a new dataset, printing the first and last few rows gives a quick view of its structure, data types, and values.
  3. Debugging: While writing and testing code, extracting the first/last rows is a quick way to check whether the data analysis steps are producing the desired results.
  4. Quality Assurance: Inspecting the first/last rows can reveal potential data quality problems early, so they can be fixed.
  5. Documentation: Showing the first/last rows alongside shared code helps other readers understand what the code works on.

The sections below cover several ways to extract rows from the top and bottom of a data frame.

Method 1: Using head() and tail() functions

The head() and tail() functions in R extract the first and last few rows of a dataset/data frame. Their general form is head(data, n) and tail(data, n). The example below applies them to the my_data data frame created above.

R




head(my_data, n = 1)
tail(my_data, n = 1)


Output:

  S.No. Name Age Score
1     1  Ram  19    95

  S.No. Name Age Score
5     5 Ravi  20    88

In the calls head(data, n) and tail(data, n):

  • data: The data frame/matrix/vector whose first/last rows are to be printed.
  • n: The number of rows to print. If n is omitted, it defaults to 6.

Now let us apply these functions to another data set.

R




# Creating a data frame
my_data <- data.frame(
  ID = c(1, 2, 3, 4, 5),
  Item = c("oil", "cream", "shampoo", "soap", "gel"),
  Price = c(50, 80, 70, 30, 100),
  BestBefore = c(12, 6, 24, 24, 12)
)
 
# First 2 rows
head(my_data, n = 2)
# Blank line between the two outputs
cat("\n")
# Last 2 rows
tail(my_data, n = 2)


Output:

  ID  Item Price BestBefore
1  1   oil    50         12
2  2 cream    80          6

  ID Item Price BestBefore
4  4 soap    30         24
5  5  gel   100         12

As the output shows, with n = 2 the first two and the last two rows are printed.
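Two further behaviours of head() and tail() are worth knowing: the default row count and negative values of n. The following is a minimal sketch on a small made-up data frame (the name df and its columns are assumptions used only for illustration).

```r
# Hypothetical example data: ten numbered rows
df <- data.frame(id = 1:10, value = (1:10)^2)

# With n omitted, head() and tail() return up to 6 rows
first_default <- head(df)
last_default <- tail(df)

# A negative n drops rows instead of keeping them:
# head(df, n = -2) keeps everything except the last 2 rows,
# tail(df, n = -2) keeps everything except the first 2 rows
all_but_last_two <- head(df, n = -2)
all_but_first_two <- tail(df, n = -2)

print(nrow(first_default))     # 6
print(all_but_last_two$id)     # 1 2 3 4 5 6 7 8
print(all_but_first_two$id)    # 3 4 5 6 7 8 9 10
```

Negative values are handy when a file has a known number of header or footer rows that should be skipped.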

Method 2: Using index slicing

The syntax for index slicing is:

my_dataFrame[ a:b , c:d ]

Here a and b are the first and last rows of the range to extract (both inclusive, counting from 1), and c and d are the first and last columns. Leaving the part after the comma empty selects all columns.

Now let's use it to extract the first row of a data frame.

R




# Create a data frame
my_dataFrame <- data.frame(
  Name = c("Alice", "David", "Neil"),
  Age = c(20, 25, 27),
  Score = c(95, 80, 75)
)
 
# Slice row 1 (the empty column index keeps all columns)
sliced_my_dataFrame <- my_dataFrame[1, ]
print(sliced_my_dataFrame)
 
# Print the last 2 rows of my_data, the data frame created in Method 1
my_data[(nrow(my_data) - 1):nrow(my_data), ]


Output:

   Name Age Score
1 Alice  20    95

  ID Item Price BestBefore
4  4 soap    30         24
5  5  gel   100         12

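Both calls use the same square-bracket syntax. The only difference is that for the last two rows we first find the total number of rows with nrow() and then use it to build the row range. Note that the second call operates on my_data, the data frame from Method 1, which is why its output shows the item columns.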
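Index slicing does not check that the requested rows exist, so it helps to build the row range defensively. The sketch below is one possible way to do that. The data frame scores and the variable k are names chosen for this illustration.

```r
# Example data (same shape as my_dataFrame above)
scores <- data.frame(
  Name = c("Alice", "David", "Neil"),
  Age = c(20, 25, 27),
  Score = c(95, 80, 75)
)

# Rows 1 to 2 and columns 1 to 2 in a single slice
top_left <- scores[1:2, 1:2]

# Asking for more rows than exist does not fail:
# the missing rows come back filled with NA
padded <- scores[1:5, ]

# Last k rows, never requesting more rows than the data frame has.
# seq_len(n) is 1, 2, ..., n (and empty when n is 0), and tail() on
# that vector keeps at most its last k positions.
k <- 2
n <- nrow(scores)
last_k <- scores[tail(seq_len(n), k), ]

print(top_left)
print(last_k)
```

Using seq_len() instead of writing 1:n by hand also avoids the surprise that 1:0 counts down (1, 0) when a data frame is empty.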

Method 3: Using slice_head() and slice_tail() from the dplyr package

slice_head() and slice_tail() extract a subset of rows from the start and the end of a data frame, respectively, which makes them another way to get the first/last rows of a dataset.

Let us use them to print the first and last rows of a data frame.

R




library(dplyr)

# Creating a data frame
geeks_languages <- data.frame(
  Language = c("Python", "Java", "C++", "JavaScript", "R"),
  Articles = c(1200, 950, 800, 1100, 300),
  AverageRating = c(4.8, 4.5, 4.3, 4.7, 4.0)
)

# Using slice_head() to extract the first n rows
first_row <- slice_head(geeks_languages, n = 1)
first_row
 
# Using slice_tail() to extract the last n rows
last_row <- slice_tail(geeks_languages, n = 1)
last_row


Output:

  Language Articles AverageRating
1   Python     1200           4.8

  Language Articles AverageRating
1        R      300             4
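dplyr also provides the more general slice() function, which selects rows by position. The sketch below assumes dplyr is installed and uses a shortened copy of the data frame above under the assumed name languages. Inside slice(), the dplyr helper n() gives the number of rows.

```r
library(dplyr)

# Shortened copy of the example data
languages <- data.frame(
  Language = c("Python", "Java", "C++", "JavaScript", "R"),
  Articles = c(1200, 950, 800, 1100, 300)
)

# Row at position 1
first_row <- slice(languages, 1)

# n() is the total number of rows, so this is the last row
last_row <- slice(languages, n())

# The last two rows, by position
last_two <- slice(languages, (n() - 1):n())

print(first_row)
print(last_row)
print(last_two)
```

slice_head() and slice_tail() are usually easier to read for this task, while slice() is useful when the wanted positions are not simply at the start or end.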

Method 4: Using subset() function

subset(), as the name indicates, extracts a subset of a data set based on a condition. Because it selects rows by condition rather than by position, it can return the first or last row only when some column (such as an ID) identifies those rows. Its syntax is as follows:

subset(x, subset, select, drop = FALSE)
  • x: The data frame on which the operation is carried out.
  • subset: A logical condition that defines which rows are selected.
  • select: Optional. An expression that defines which columns are kept.
  • drop: Optional. A logical value; if TRUE, the result is simplified where possible, for example to a vector when only one column is selected.

R




geek_data <- data.frame(
  Geek_ID = c(1, 2, 3, 4, 5),
  Geek_Name = c("Prim", "kadane", "Geek", "HackerMan", "Dijkstra"),
  Age = c(26, 24, 20, 22, 25)
)
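# Select the row whose Geek_ID is 1 (the first row in this data)
first_row <- subset(geek_data, Geek_ID == 1)
first_row
cat("\n")
# Select the row whose Geek_ID is 5 (the last row in this data)
last_row <- subset(geek_data, Geek_ID == 5)
last_row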


Output:

  Geek_ID Geek_Name Age
1       1      Prim  26

  Geek_ID Geek_Name Age
5       5  Dijkstra  25
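The example above works because Geek_ID happens to match the row order. When no such column exists, one option is to build the positions separately and use them in the condition. This is a minimal sketch of that idea. The helper variables n and position are assumptions introduced here, and they must not clash with column names in the data frame.

```r
geek_data <- data.frame(
  Geek_ID = c(1, 2, 3, 4, 5),
  Geek_Name = c("Prim", "kadane", "Geek", "HackerMan", "Dijkstra"),
  Age = c(26, 24, 20, 22, 25)
)

# Row positions 1..n, kept outside the data frame
n <- nrow(geek_data)
position <- seq_len(n)

# First 2 rows: keep rows whose position is at most 2
first_two <- subset(geek_data, position <= 2)

# Last 2 rows, keeping only two columns through the select argument
last_two <- subset(geek_data, position > n - 2, select = c(Geek_Name, Age))

print(first_two)
print(last_two)
```

For purely positional extraction, head(), tail(), or index slicing are more direct. subset() is most useful when the rows are chosen by a condition on the data.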

These are several ways to print the first or last rows of a data set. Any of them can be used, depending on the use case and the user's convenience.


