Split DataFrame into Custom Bins in R

  • Last Updated : 17 Oct, 2021

This article shows how to split a dataframe into custom bins in the R programming language.

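The cut() function in base R divides the range of a numeric vector into intervals and assigns each value to the interval in which it falls. The result is a factor whose levels correspond to the intervals. When breaks is a vector of cut points, the number of levels is one less than the length of breaks; when breaks is a single number, that number is the number of levels. To split a dataframe, cut() is applied to the row positions and the resulting factor is used to select the rows of each bin. Intervals are closed on the right by default, for example (0,6], which is why the examples below start the breaks at 0 rather than 1.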

Syntax: cut(x, breaks, labels = NULL)

Arguments :

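  • x – Numeric vector to be divided
  • breaks – Either a numeric vector of two or more unique cut points, or a single number (2 or more) giving the number of intervals
  • labels – Labels for the levels of the resulting factor; by default they are built in "(a,b]" interval notation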
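
Before applying cut() to a dataframe, it may help to see what it returns for a plain vector. The following is a minimal sketch; the vector x and the label names "low" and "high" are illustrative assumptions, not part of the article's examples.

```r
# A small numeric vector to bin (illustrative values)
x <- c(1, 3, 5, 7, 9)

# Three cut points define two right-closed intervals: (0,4] and (4,10]
bins <- cut(x, breaks = c(0, 4, 10))
print(bins)
print(nlevels(bins))  # one less than length(breaks)

# Same intervals with custom labels (hypothetical names)
named_bins <- cut(x, breaks = c(0, 4, 10), labels = c("low", "high"))
print(table(named_bins))
```

The values 1 and 3 fall in the first interval and 5, 7 and 9 in the second, so the factor has two levels even though three cut points were supplied.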

Example 1: Split dataframe into Custom Bins



R




# creating a dataframe 
data_frame <- data.frame(col1 = c(1:10),
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE,4),
                                  rep(FALSE,6)))
print("Original DataFrame")
print(data_frame)
  
# number of rows in the dataframe
rows <- nrow(data_frame)
  
# custom bins over the row positions: (0,6] and (6,rows]
bins <- cut(seq_len(rows), breaks = c(0, 6, rows))
level_bins <- levels(bins)
  
# creating one dataframe subset per bin (data_frame_1, data_frame_2)
for (i in seq_along(level_bins)) {
  assign(paste0("data_frame_", i),
         data_frame[bins == level_bins[i], ])
}
  
# printing the dataframe subsets
print("DataFrame Subset 1")
print(data_frame_1)
  
print("DataFrame Subset 2")
print(data_frame_2)

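Output: data_frame_1 holds rows 1 to 6 (interval (0,6]) and data_frame_2 holds rows 7 to 10 (interval (6,10]).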
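
Creating numbered variables with assign() works, but the subsets can be awkward to loop over afterwards. A common alternative in base R is split(), which returns all subsets in a single named list. This is a sketch of that alternative and not the approach the example above uses; the name subsets is an assumption.

```r
# Same dataframe as in the example above
data_frame <- data.frame(col1 = 1:10,
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE, 4), rep(FALSE, 6)))

# Bin the row positions into (0,6] and (6,10]
bins <- cut(seq_len(nrow(data_frame)), breaks = c(0, 6, nrow(data_frame)))

# split() returns a list with one dataframe per factor level
subsets <- split(data_frame, bins)

print(names(subsets))        # the interval labels become the list names
print(sapply(subsets, nrow)) # number of rows in each subset
print(subsets[["(0,6]"]])    # access one subset by its label
```

Because the result is a list, functions such as lapply() can be applied to every subset without knowing in advance how many bins there are.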

Example 2: Illustrates the usage where four break points (0, 2, 4 and the number of rows) are specified, thereby dividing the rows into three subsets of the original dataframe.

R




# creating a dataframe 
data_frame <- data.frame(col1 = c(1:10),
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE,4),
                                  rep(FALSE,6)))
print("Original DataFrame")
print(data_frame)
  
# number of rows in the dataframe
rows <- nrow(data_frame)
  
# custom bins over the row positions: (0,2], (2,4] and (4,rows]
bins <- cut(seq_len(rows), breaks = c(0, 2, 4, rows))
level_bins <- levels(bins)
  
# creating one dataframe subset per bin (data_frame_1 to data_frame_3)
for (i in seq_along(level_bins)) {
  assign(paste0("data_frame_", i),
         data_frame[bins == level_bins[i], ])
}
  
# printing the dataframe subsets
print("DataFrame Subset 1")
print(data_frame_1)
  
print("DataFrame Subset 2")
print(data_frame_2)
  
print("DataFrame Subset 3")
print(data_frame_3)

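Output: data_frame_1 holds rows 1 and 2, data_frame_2 holds rows 3 and 4, and data_frame_3 holds rows 5 to 10.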
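
The examples so far bin the row positions. The same idea can be applied to the values of a column, with the labels argument giving each bin a readable name. The sketch below assumes the column col1 is the one to bin on; the label names "first", "second" and "third" and the column name group are hypothetical.

```r
# Dataframe with a numeric column to bin on
data_frame <- data.frame(col1 = 1:10,
                         col2 = letters[1:10])

# Bin the values of col1 into (0,2], (2,4] and (4,10] with custom labels
data_frame$group <- cut(data_frame$col1,
                        breaks = c(0, 2, 4, 10),
                        labels = c("first", "second", "third"))

# One dataframe per label
by_group <- split(data_frame, data_frame$group)
print(by_group$second)
```

Here by_group$second contains the rows whose col1 value is 3 or 4. The labels vector must have one entry per interval, that is, one fewer than the number of break points.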

Example 3: The cut() function can also be given the number of intervals into which the range is to be divided. This is specified as the second argument (breaks). The range is then split into that many intervals of equal width, and the level labels are generated automatically. The following code divides the 10 rows of the dataframe into 5 bins of 2 rows each:

R




# creating a dataframe 
data_frame <- data.frame(col1 = c(1:10),
                         col2 = letters[1:10],
                         col3 = c(rep(TRUE,4),
                                  rep(FALSE,6)))
  
print("Original DataFrame")
print(data_frame)
  
# number of rows in the dataframe
rows <- nrow(data_frame)
  
# 5 equal-width bins over the row positions
bins <- cut(seq_len(rows), breaks = 5)
level_bins <- levels(bins)
  
# creating one dataframe subset per bin (data_frame_1 to data_frame_5)
for (i in seq_along(level_bins)) {
  assign(paste0("data_frame_", i),
         data_frame[bins == level_bins[i], ])
}
  
# printing the dataframe subsets
print("DataFrame Subset 1")
print(data_frame_1)
  
print("DataFrame Subset 2")
print(data_frame_2)
  
print("DataFrame Subset 3")
print(data_frame_3)
  
print("DataFrame Subset 4")
print(data_frame_4)
  
print("DataFrame Subset 5")
print(data_frame_5)

Output: five subsets of two rows each, holding rows 1-2, 3-4, 5-6, 7-8 and 9-10 respectively.
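
When a single number is passed as breaks, cut() produces intervals of equal width, which is not the same as an equal number of values per bin. For evenly spaced row positions the two coincide, but for skewed values they do not. The sketch below uses table() to count the members of each bin; the vector skewed is an illustrative assumption.

```r
# Evenly spaced row positions: every bin receives the same number of rows
even_counts <- table(cut(1:10, breaks = 5))
print(even_counts)

# Skewed values (illustrative): equal-width bins, very unequal counts
skewed <- c(1, 2, 3, 4, 100)
skewed_counts <- table(cut(skewed, breaks = 3))
print(skewed_counts)
```

In the second case four of the five values land in the first interval, the middle interval is empty, and only the value 100 falls in the last one.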



