
How to calculate the mode of all rows or columns from a dataframe in R?

In this article, we will discuss how to calculate the mode of all rows and columns from a dataframe in R Programming Language.

Method 1: Using DescTools package

The DescTools package in R is used for descriptive analysis. It contains a collection of miscellaneous basic statistics functions and convenience wrappers for efficiently describing data. It can be installed with the following command:

install.packages("DescTools")

The Mode() function of this package returns the most frequently occurring value of the input vector. If several values are tied for the highest frequency, all of them are returned.

Syntax: Mode(x, na.rm = FALSE)

Arguments : 

x – a (non-empty) vector of values, for example numeric, character, logical or factor.

na.rm (Default : FALSE) – indicates whether missing values should be removed before the mode is computed.

In this approach, a for loop iterates over all the columns, and each column is passed as a vector to the Mode() function.

Code:

library("DescTools")

# declaring a dataframe; stringsAsFactors = TRUE stores col1
# as a factor (the default behaviour before R 4.0.0)
data_frame <- data.frame(col1 = c("b", "b", "d", "e", "e"),
                         col2 = c(0, 2, 1, 2, 5),
                         col3 = c(TRUE, FALSE, FALSE,
                                  TRUE, TRUE),
                         stringsAsFactors = TRUE)

print("Original dataframe")
print(data_frame)
print("Mode of columns")

# iterating over all the columns of the
# dataframe
for (i in seq_len(ncol(data_frame))) {

  # calculating mode of ith column
  mod_val <- Mode(data_frame[, i])
  cat(i, ": ", mod_val, "\n")
}


Output:

[1] "Original dataframe"
  col1 col2  col3
1    b    0  TRUE
2    b    2 FALSE
3    d    1 FALSE
4    e    2  TRUE
5    e    5  TRUE
[1] "Mode of columns"
1 :  1 3
2 :  2
3 :  TRUE

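In the previous example, col1 is a factor (as happens when strings are converted to factors, the default before R 4.0.0), so cat() prints the integer codes of the modal levels (1 and 3) instead of the labels b and e. This makes the output ambiguous. To avoid the problem, the result of Mode() can be converted explicitly with as.character().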

library("DescTools")

# declaring a dataframe; stringsAsFactors = TRUE stores col1
# as a factor (the default behaviour before R 4.0.0)
data_frame <- data.frame(col1 = c("b", "b", "d", "e", "e"),
                         col2 = c(0, 2, 1, 2, 5),
                         col3 = c(TRUE, FALSE, FALSE, TRUE, TRUE),
                         stringsAsFactors = TRUE)

print("Original dataframe")
print(data_frame)
print("Mode of columns")

# iterating over all the columns
# of the dataframe
for (i in seq_len(ncol(data_frame))) {

  # calculating mode of ith column and converting
  # it to character so that factor labels are printed
  mod_val <- as.character(Mode(data_frame[, i]))
  cat(i, ": ", mod_val, "\n")
}


Output:

[1] "Original dataframe"
  col1 col2  col3
1    b    0  TRUE
2    b    2 FALSE
3    d    1 FALSE
4    e    2  TRUE
5    e    5  TRUE
[1] "Mode of columns"
1 :  b e
2 :  2
3 :  TRUE
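
The difference between the two outputs comes down to how cat() treats factors. The following is a minimal base R sketch of that behaviour, assuming the column is stored as a factor; the variable names f and modal are illustrative and not part of the code above.

```r
# a factor with levels b, d, e (levels are sorted alphabetically by default)
f <- factor(c("b", "b", "d", "e", "e"))

# pick the two modal values by position; the result is still a factor
modal <- f[c(1, 4)]

# cat() writes the underlying integer codes of the factor
cat(modal, "\n")

# converting to character first writes the labels instead
cat(as.character(modal), "\n")
```

The first call prints the codes 1 and 3, which are the positions of b and e among the levels; the second prints b and e.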

Method 2: User-defined method

A for loop iterates over all the columns of the dataframe, and the mode of each column is calculated with a user-defined function that works in the following steps:

Step 1: Compute the unique values of the vector using the unique() function, which returns each distinct value once, in order of first appearance.

Step 2: The match() function is called to return, for each element of its first argument, the position of its (first) match in its second argument. The first argument is the original column vector and the second is the vector of unique values.

match(col, unique_vec)

Step 3: The tabulate() function is then invoked on this integer vector of positions. It counts the number of occurrences of each integer, which gives the frequency of each unique value.

Step 4: The largest of these counts is found using the max() function. The unique values whose count equals this maximum are returned as the mode of the column; if several values are tied, all of them are returned.
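
The four steps can be traced on a small numeric vector. This is only an illustrative sketch: the vector matches col2 of the example dataframe, and the variable names are chosen for this walkthrough.

```r
x <- c(0, 2, 1, 2, 5)

# Step 1: distinct values in order of first appearance
unq <- unique(x)               # 0 2 1 5

# Step 2: position of each element of x within unq
pos <- match(x, unq)           # 1 2 3 2 4

# Step 3: how often each position (unique value) occurs
counts <- tabulate(pos)        # 1 2 1 1

# Step 4: highest count, then the value(s) that reach it
top <- max(counts)             # 2
result <- unq[counts == top]   # 2
print(result)
```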

Code:

# create function to compute the mode
# (named get_mode to avoid masking base::mode)
get_mode <- function(x) {

  # unique values of the vector
  unq_data <- unique(x)

  # position of each element in the vector of unique values
  map_data <- match(x, unq_data)

  # number of occurrences of each unique value
  tabulate_data <- tabulate(map_data)

  # highest number of occurrences
  max_val <- max(tabulate_data)

  # return the value(s) that occur most often
  unq_data[tabulate_data == max_val]
}

# declaring a dataframe; stringsAsFactors = TRUE stores col1
# as a factor (the default behaviour before R 4.0.0)
data_frame <- data.frame(col1 = c("b", "b", "d", "e", "e"),
                         col2 = c(0, 2, 1, 2, 5),
                         col3 = c(TRUE, FALSE, FALSE, TRUE, TRUE),
                         stringsAsFactors = TRUE)
print("Original dataframe")
print(data_frame)
print("Mode of columns")

# iterating over all the columns of
# the dataframe
for (i in seq_len(ncol(data_frame))) {

  # calculating mode of ith column
  mod_val <- get_mode(data_frame[, i])
  print(mod_val)
}


Output:

[1] "Original dataframe"
  col1 col2  col3
1    b    0  TRUE
2    b    2 FALSE
3    d    1 FALSE
4    e    2  TRUE
5    e    5  TRUE
[1] "Mode of columns"
[1] b e
Levels: b d e
[1] 2
[1] TRUE
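
The examples above work column by column. The same idea can be applied to rows with apply() and MARGIN = 1. The sketch below is one possible approach rather than a definitive one: vec_mode is a hypothetical, compact restatement of the user-defined function, and num_df is an illustrative all-numeric dataframe. apply() converts the dataframe to a matrix first, so columns of mixed types would all be coerced to character.

```r
# compact restatement of the user-defined mode function (hypothetical name)
vec_mode <- function(x) {
  unq <- unique(x)
  counts <- tabulate(match(x, unq))
  unq[counts == max(counts)]
}

# an all-numeric dataframe, so the rows keep their numeric type
num_df <- data.frame(a = c(1, 4, 7),
                     b = c(1, 5, 7),
                     c = c(2, 5, 7))

# MARGIN = 1 passes each row to vec_mode as a vector
row_modes <- apply(num_df, 1, vec_mode)

# modes of rows 1 to 3 are 1, 5 and 7
print(row_modes)
```

Every row here has a single mode, so apply() returns a plain vector. If rows had different numbers of tied modes, apply() would return a list instead.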


Last Updated : 22 Feb, 2023