Split DataFrame into Custom Bins in R

  • Last Updated : 14 Feb, 2022

In this article, we will see how to split a dataframe into custom bins in the R programming language.

The cut() function in base R divides the range of a numeric vector into intervals and assigns each value to the interval in which it falls. The result is a factor in which each interval corresponds to one level. When breaks is given as a vector of cut points, the number of levels is therefore one less than the length of the breaks argument. To split a dataframe, cut() is applied to the row numbers, and the rows are then grouped by the resulting levels.

Syntax: cut(x, breaks, labels = NULL)

Arguments :

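  • x – Numeric vector to be divided into intervals
  • breaks – Either a numeric vector of two or more cut points, or a single number giving the number of intervals
  • labels – Labels for the resulting levels; by default, labels of the form "(a,b]" are constructed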
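As a minimal sketch of how these arguments interact, the snippet below applies cut() to a plain numeric vector before any dataframe is involved. The vector x and the label names "low" and "high" are made up for illustration.

```r
# Hypothetical example vector; the values are chosen only for illustration
x <- c(1, 3, 5, 7, 9)

# Three cut points define two intervals: (0,4] and (4,10]
default_bins <- cut(x, breaks = c(0, 4, 10))
print(levels(default_bins))        # "(0,4]"  "(4,10]"

# Custom labels: one label per interval, i.e. length(breaks) - 1 labels
named_bins <- cut(x, breaks = c(0, 4, 10), labels = c("low", "high"))
print(as.character(named_bins))    # "low" "low" "high" "high" "high"
```

Note that the result is a factor, so levels() lists the intervals even when some of them contain no values.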

Example 1: Split dataframe into Custom Bins

R




# creating a dataframe
data_frame <- data.frame(col1 = 1:10,
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE, 4),
                                  rep(FALSE, 6)))
print("Original DataFrame")
print(data_frame)
 
# number of rows in the dataframe
rows <- nrow(data_frame)
 
# custom bins: rows 1-6 fall in (0,6], rows 7-10 in (6,10]
bins <- cut(1:rows,
            breaks = c(0, 6, rows))
level_bins <- levels(bins)
 
# creating one subset of the dataframe per bin
for (i in seq_along(level_bins)) {
  assign(paste0("data_frame_", i),
         data_frame[bins == level_bins[i], ])
}
 
# retrieving dataframe subsets
print("DataFrame Subset 1")
print(data_frame_1)
 
print("DataFrame Subset 2")
print(data_frame_2)

 

 

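Output: rows 1 to 6 fall in the bin (0,6] and are stored in data_frame_1, while rows 7 to 10 fall in the bin (6,10] and are stored in data_frame_2.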
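The loop with assign() creates one variable per bin. As an alternative sketch (a different technique from the one used in the example, not a replacement for it), base R's split() can group the rows by the same bins and return all subsets in a single named list. The name subsets is chosen here for illustration.

```r
# Same dataframe as in the example above
data_frame <- data.frame(col1 = 1:10,
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE, 4), rep(FALSE, 6)))

# Bin the row numbers exactly as before
bins <- cut(seq_len(nrow(data_frame)),
            breaks = c(0, 6, nrow(data_frame)))

# split() returns a named list with one dataframe per bin
subsets <- split(data_frame, bins)

print(names(subsets))        # "(0,6]"  "(6,10]"
print(subsets[["(0,6]"]])    # rows 1 to 6
```

Keeping the subsets in a list avoids creating variables dynamically and makes it easy to loop over them with lapply().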

 

 

Example 2: Here four break values (0, 2, 4 and the number of rows) are specified. They define three intervals and therefore divide the rows into three subsets of the original dataframe.

 

R




# creating a dataframe
data_frame <- data.frame(col1 = 1:10,
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE, 4),
                                  rep(FALSE, 6)))
print("Original DataFrame")
print(data_frame)
 
# number of rows in the dataframe
rows <- nrow(data_frame)
 
# custom bins: (0,2], (2,4] and (4,10]
bins <- cut(1:rows,
            breaks = c(0, 2, 4, rows))
level_bins <- levels(bins)
 
# creating one subset of the dataframe per bin
for (i in seq_along(level_bins)) {
  assign(paste0("data_frame_", i),
         data_frame[bins == level_bins[i], ])
}
 
# retrieving dataframe subsets
print("DataFrame Subset 1")
print(data_frame_1)
 
print("DataFrame Subset 2")
print(data_frame_2)
 
print("DataFrame Subset 3")
print(data_frame_3)

 

 

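Output: data_frame_1 contains rows 1 and 2, data_frame_2 contains rows 3 and 4, and data_frame_3 contains rows 5 to 10.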
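The breaks in this example start at 0 rather than 1 because cut() builds intervals that are open on the left by default, so a first break of 1 would leave row 1 outside every bin. The sketch below illustrates this and shows the include.lowest argument of cut() as one way to keep row 1 when the breaks start at 1. The names open_bins and closed_bins are made up for illustration.

```r
rows <- 10

# Starting the breaks at 1 leaves row 1 outside the first interval (1,2]
open_bins <- cut(1:rows, breaks = c(1, 2, 4, rows))
print(is.na(open_bins[1]))    # TRUE

# include.lowest = TRUE closes the first interval on the left: [1,2]
closed_bins <- cut(1:rows, breaks = c(1, 2, 4, rows),
                   include.lowest = TRUE)
print(levels(closed_bins))    # "[1,2]"  "(2,4]"  "(4,10]"
print(table(closed_bins))     # counts per bin: 2, 2, 6
```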

 

 

Example 3: The cut() function can also be given the number of intervals instead of explicit break points. This number is passed as the second argument. The range of the row numbers is then divided into that many intervals of equal width, and the level labels are generated automatically. The following code divides the dataframe into 5 bins of equal size:

 

R




# creating a dataframe
data_frame <- data.frame(col1 = 1:10,
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE, 4),
                                  rep(FALSE, 6)))
 
print("Original DataFrame")
print(data_frame)
 
# number of rows in the dataframe
rows <- nrow(data_frame)
 
# custom bins: 5 intervals of equal width over the row numbers
bins <- cut(1:rows, 5)
level_bins <- levels(bins)
 
# creating one subset of the dataframe per bin
for (i in seq_along(level_bins)) {
  assign(paste0("data_frame_", i),
         data_frame[bins == level_bins[i], ])
}
 
# retrieving dataframe subsets
print("DataFrame Subset 1")
print(data_frame_1)
 
print("DataFrame Subset 2")
print(data_frame_2)
 
print("DataFrame Subset 3")
print(data_frame_3)
 
print("DataFrame Subset 4")
print(data_frame_4)
 
print("DataFrame Subset 5")
print(data_frame_5)

 

 

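Output: with 10 rows and 5 equal-width intervals, each subset contains two consecutive rows. data_frame_1 holds rows 1 and 2, data_frame_2 holds rows 3 and 4, and so on up to data_frame_5 with rows 9 and 10.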
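To check how the rows are distributed when only a number of bins is given, the bins can be inspected directly. The following is a minimal sketch; the label prefix "part_" is a made-up name for illustration.

```r
# Five equal-width intervals over the row numbers 1 to 10
bins <- cut(1:10, 5)

# Number of bins and number of rows that fall in each of them
print(nlevels(bins))    # 5
print(table(bins))      # every bin holds 2 rows here

# Optional: replace the generated interval labels with custom names
named_bins <- cut(1:10, 5, labels = paste0("part_", 1:5))
print(levels(named_bins))
```

Because the intervals have equal width rather than equal counts, the bins hold the same number of rows here only because the row numbers are evenly spaced and divide evenly into 5 groups.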

 

 

