Get All Factor Levels of DataFrame Column in R

  • Last Updated : 17 May, 2021

In R, categorical data is represented by factors. A factor column of a data frame stores each value as one of a fixed set of levels, which are the distinct values the column can take. By default, factor() sorts these levels (alphabetically for strings) rather than keeping them in order of first appearance. In this article we will discuss how to get all factor levels of the columns of a data frame in R.
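As a small illustration of how levels relate to the underlying values, the sketch below builds a factor in base R. The vector `sizes` is a hypothetical sample made up for this example and is not part of the original article.

```r
# Hypothetical sample data, used only for illustration
sizes <- c("small", "large", "medium", "small")

# factor() collects the distinct values and sorts them to form the levels
size_factor <- factor(sizes)

print(levels(size_factor))     # the distinct levels, sorted
print(nlevels(size_factor))    # number of distinct levels
print(as.integer(size_factor)) # each value stored as an index into the levels
```

Note that "small" appears twice in the data but only once among the levels.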

The hardhat package in R provides functionality for preprocessing, predicting, and validating input. It is intended for constructing modeling packages.

Syntax:

install.packages("hardhat")

The get_levels() method in this package extracts the levels from every factor column in the specified data frame. Its main use is to capture the original factor levels of the predictors in a training set, which is the data frame in this case. It takes a single argument, a data frame or data.table, and returns a named list in which each factor column is mapped to a character vector of its levels. Columns of any other data type are left out of the result.



Syntax:

get_levels(data_frame)

Each entry in the result lists the distinct levels of one column. A value that occurs several times in a column appears only once in the output, since every occurrence belongs to the same factor level.
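For readers who do not have hardhat installed, roughly the same result can be sketched in base R. This is only an approximation for illustration and not hardhat's actual implementation. The helper name `factor_levels` and the sample data frame are hypothetical, and returning NULL when there are no factor columns is an assumption about how get_levels() behaves.

```r
# Hypothetical helper: a base R approximation of what get_levels() returns
factor_levels <- function(df) {
  # keep only the columns that are factors
  factor_cols <- Filter(is.factor, df)
  # assumption: nothing to report when no column is a factor
  if (length(factor_cols) == 0) {
    return(NULL)
  }
  # map each remaining column name to its character vector of levels
  lapply(factor_cols, levels)
}

# Hypothetical sample data: one factor column and one numeric column
df <- data.frame(grade = factor(c("b", "a", "b")), score = c(1, 2, 3))
print(factor_levels(df))
```

Only `grade` appears in the result, with the duplicate "b" collapsed into a single level.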

Example 1:

R




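# getting required libraries
library("hardhat")
  
# declaring data frame; stringsAsFactors = TRUE is needed in
# R >= 4.0.0, where strings are no longer converted to factors by default
data_frame <- data.frame(
  col1 = letters[4:6], 
  col3 = c("geeks","for","geeks"),
  stringsAsFactors = TRUE)
  
print("Original DataFrame")
print(data_frame)
  
print("Factors")
get_levels(data_frame)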

Output

[1] "Original DataFrame"
  col1  col3
1    d geeks
2    e   for
3    f geeks
[1] "Factors"
$col1
[1] "d" "e" "f"

$col3
[1] "for"   "geeks"

get_levels() returns output only for the columns of the data frame that are of the factor type. The following program illustrates this data type compatibility when computing the factor levels of the columns in a data frame.

Example 2:

R




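# getting required libraries
library("hardhat")
  
# declaring data frame; col2 is logical and stays logical, while
# stringsAsFactors = TRUE turns the character column col3 into a factor
data_frame <- data.frame(col1 = factor(c(2,4,6)), 
                         col2 = FALSE, col3 = LETTERS[1:3],
                         stringsAsFactors = TRUE)
  
print("Original DataFrame")
print(data_frame)
  
print("Factors")
get_levels(data_frame)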

Output

[1] "Original DataFrame"
  col1  col2 col3
1    2 FALSE    A
2    4 FALSE    B
3    6 FALSE    C
[1] "Factors"
$col1
[1] "2" "4" "6"

$col3
[1] "A" "B" "C"
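To see in advance which columns qualify, the column types can be inspected with base R. The sketch below rebuilds a data frame like the one in Example 2. Passing `stringsAsFactors = TRUE` is a choice made here so that the character column becomes a factor on R 4.0.0 and later.

```r
# Rebuild a data frame like the one in Example 2
data_frame <- data.frame(col1 = factor(c(2, 4, 6)),
                         col2 = FALSE,
                         col3 = LETTERS[1:3],
                         stringsAsFactors = TRUE)

# TRUE for each column that is a factor
is_factor_col <- sapply(data_frame, is.factor)
print(is_factor_col)

# names of the factor columns only
print(names(data_frame)[is_factor_col])
```

The logical column col2 is reported as FALSE, which matches its absence from the levels listed above.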

To produce output for a column of an incompatible type, wrap it in factor(vec) when declaring and defining the column, where vec is the incompatible vector.
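A minimal sketch of that conversion, using only base R, is shown below. It starts from a data frame like the one in Example 2 and converts the non-factor columns after the fact. Converting after creation rather than at declaration is a variation chosen here for illustration.

```r
# Data frame like the one in Example 2, without automatic factor conversion
data_frame <- data.frame(col1 = factor(c(2, 4, 6)),
                         col2 = FALSE,
                         col3 = LETTERS[1:3])

# wrap the incompatible columns in factor()
data_frame$col2 <- factor(data_frame$col2)
data_frame$col3 <- factor(data_frame$col3)

# every column now has levels that can be extracted
print(levels(data_frame$col2))
print(levels(data_frame$col3))
```

Once converted, the logical column carries the single level "FALSE", since that is the only value it contains.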



