Create Subsets of a Data frame in R Programming – subset() Function

The subset() function in R creates subsets of a data frame by filtering rows with a logical condition, by selecting columns, or both. It can also be used to drop columns from a data frame.

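Syntax: subset(x, subset, select)

Parameters: 

  • x: Data frame to be subsetted
  • subset: Logical condition indicating which rows to keep
  • select: Expression indicating which columns to keep (or, with a minus sign, which columns to drop)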
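As a minimal sketch of how a row condition and a column selection can be combined in a single call, the example below uses the subset and select arguments of subset() together. The data frame scores and its column names are hypothetical and exist only for illustration.

```r
# Hypothetical data frame used only for illustration
scores <- data.frame(
  id    = 1:4,
  group = c("a", "b", "a", "b"),
  score = c(10, 20, 30, 40)
)

# Keep rows where score exceeds 15, and only the id and score columns
high <- subset(scores, subset = score > 15, select = c(id, score))
print(high)
```

Both arguments are optional: leaving out the row condition keeps every row, and leaving out select keeps every column.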

Create Subsets of Data Frames in R Programming Language

The examples below create subsets of a data frame using the subset() function in R.

Example 1: Basic example of R – subset() Function

R




# R program to create
# subset of a data frame

# Creating a Data Frame
# (row1, row2 and row3 are column names)
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)
print("Original Data Frame")
print(df)

# Creating a Subset that keeps only the column row2
df1 <- subset(df, select = row2)
print("Modified Data Frame")
print(df1)


Output: 

[1] "Original Data Frame"
  row1 row2 row3
1    0    3    6
2    1    4    7
3    2    5    8
[1] "Modified Data Frame"
  row2
1    3
2    4
3    5

In the above code, the original data frame remains intact, and a new data frame, df1, is created that holds only the selected column (row2) from the original data frame.
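The select argument is not limited to a single column. As a small sketch reusing the same data frame, several columns can be named with c(), and a contiguous run of columns can be given as a range of column names. The variable names two_cols and range_cols are chosen here for illustration.

```r
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)

# Pick two named columns
two_cols <- subset(df, select = c(row1, row3))

# Pick a contiguous range of columns by name
range_cols <- subset(df, select = row1:row2)

print(two_cols)
print(range_cols)

# df itself is not modified by either call
print(names(df))
```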

Example 2: Drop columns from a Data frame using subset()

R




# R program to create
# subset of a data frame

# Creating a Data Frame
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)
print("Original Data Frame")
print(df)

# Dropping the columns row2 and row3;
# the result overwrites df
df <- subset(df, select = -c(row2, row3))
print("Modified Data Frame")
print(df)


Output: 

[1] "Original Data Frame"
  row1 row2 row3
1    0    3    6
2    1    4    7
3    2    5    8
[1] "Modified Data Frame"
  row1
1    0
2    1
3    2

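In the above code, the columns row2 and row3 are dropped. Because the result is assigned back to df, the original data frame is overwritten and those columns are no longer available in df.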
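If the original data frame is still needed, one option is to assign the result to a new name instead. The sketch below does that and also shows a base R bracket-indexing call that drops the same columns; the names trimmed and trimmed2 are chosen here for illustration.

```r
df <- data.frame(row1 = 0:2, row2 = 3:5, row3 = 6:8)

# Assign to a new name so that df itself is left unchanged
trimmed <- subset(df, select = -c(row2, row3))

# Bracket indexing that drops the same columns;
# drop = FALSE keeps the result a data frame
trimmed2 <- df[, !(names(df) %in% c("row2", "row3")), drop = FALSE]

print(names(df))
print(names(trimmed))
print(names(trimmed2))
```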

Example 3: Logical AND and OR using subset

R




# Create a data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", "Jayesh", "Abhishek", "Shivang"),
  Age = c(25, 30, 22, 35, 28)
)
 
df
 
# Subset based on age greater than 25 and ID less than 4
subset_df <- subset(df, subset = Age > 25 & ID < 4)
 
# Subset based on age greater than 30 or ID equal to 2
subset_df2 <- subset(df, subset = Age > 30 | ID == 2)
 
# Print the results
print(subset_df)
print(subset_df2)


Output:

  ID     Name Age
1  1  Nishant  25
2  2    Vipul  30
3  3   Jayesh  22
4  4 Abhishek  35
5  5  Shivang  28

  ID  Name Age
2  2 Vipul  30

  ID     Name Age
2  2    Vipul  30
4  4 Abhishek  35
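Other logical operators can be used in the condition in the same way. As a brief sketch on the same data, the example below matches against a set of values with %in%, negates a condition with !, and uses parentheses to group & and | explicitly. The result names by_name, not_over_25 and mixed are chosen here for illustration.

```r
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", "Jayesh", "Abhishek", "Shivang"),
  Age = c(25, 30, 22, 35, 28)
)

# Rows whose Name is one of the listed values
by_name <- subset(df, Name %in% c("Vipul", "Shivang"))

# Negation: rows where Age is NOT greater than 25
not_over_25 <- subset(df, !(Age > 25))

# Parentheses make the grouping of & and | explicit
mixed <- subset(df, (Age > 25 & ID < 4) | Name == "Shivang")

print(by_name)
print(not_over_25)
print(mixed)
```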

Example 4: Subsetting with Missing Values

R




# Create a data frame
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", NA, "Abhishek", NA),
  Age = c(25, 30, NA, 35, NA)
)
 
df
 
# Keep only the rows where Age is not missing
subset_df <- subset(df, subset = !is.na(Age))
 
# Print the result
print(subset_df)


Output:

  ID     Name Age
1  1  Nishant  25
2  2    Vipul  30
3  3     <NA>  NA
4  4 Abhishek  35
5  5     <NA>  NA

  ID     Name Age
1  1  Nishant  25
2  2    Vipul  30
4  4 Abhishek  35
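Missing values also matter when the condition does not test for NA explicitly. subset() treats a condition that evaluates to NA as FALSE, so those rows are dropped, whereas bracket indexing with the same condition returns an extra all-NA row for each NA. The following sketch compares the two on the same data; the threshold 26 and the result names are chosen here for illustration.

```r
df <- data.frame(
  ID = 1:5,
  Name = c("Nishant", "Vipul", NA, "Abhishek", NA),
  Age = c(25, 30, NA, 35, NA)
)

# subset() drops rows where the condition evaluates to NA
with_subset <- subset(df, Age > 26)

# Bracket indexing returns an all-NA row for every NA in the condition
with_brackets <- df[df$Age > 26, ]

print(with_subset)
print(with_brackets)
```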



Last Updated : 24 Nov, 2023