The Factor Issue in a DataFrame in R Programming

Data frames are generic data objects in R that store tabular data. They are among the most widely used data objects in R programming because tabular data is convenient to analyze. A data frame can also be thought of as a matrix in which each column may hold a different data type.

Factor issue in a data frame in R

R automatically assigns a data type to the data you enter. Numeric values are stored as numeric variables, but in R versions before 4.0.0, data.frame() converts character columns to factors by default. A factor is a variable whose values are restricted to a fixed set of categories, called levels, and R takes the distinct strings you supply as the only levels available. So let’s understand this through an example. The R code below creates a data frame and then modifies two of its cells so we can see where the problem arises.
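As a minimal sketch of what a factor looks like on its own, outside a data frame, the snippet below builds one from a small character vector. The variable name languages and its values are illustrative assumptions, not part of the article's data frame.

```r
# A small character vector converted to a factor (illustrative values)
languages <- factor(c("R", "Python", "Java", "R"))

# The distinct values become the factor's levels, sorted alphabetically
print(levels(languages))

# Internally, each element is stored as an integer index into the levels
print(as.integer(languages))
```

The levels are the only values the factor accepts, which is what causes the trouble in the data frame example that follows.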

Example:

# R program to illustrate
# the factor issue in a data frame
  
# Creating a dataframe 
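df = data.frame( 
  "Name" = c("Amiya", "Raj", "Asish"), 
  "Language" = c("R", "Python", "Java"), 
  "Age" = c(22, 25, 45),
  # TRUE was the default before R 4.0.0; set it
  # explicitly to reproduce the issue on newer versions
  stringsAsFactors = TRUE
)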
print(df) 
  
# Manipulating the data frame
df[1, 3] = 37
df[3, 2] = "C"
  
print(df)

Output:

   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45

   Name Language Age
1 Amiya        R  37
2   Raj   Python  25
3 Asish     <NA>  45
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
  invalid factor level, NA generated

First, changing the element in the first row, third column succeeds because Age is a numeric column, which accepts any number. Changing the element in the third row, second column, however, triggers a warning: “C” is not one of the existing levels of the Language factor, so R stores NA in its place. The cell where we wanted “C” now holds NA, and the warning message mentions the word factor. The question now is how to get rid of the factor issue.
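Before fixing the problem, it can help to confirm that the column really is a factor and to see which levels it accepts. The following is a rough sketch using the same data; stringsAsFactors = TRUE is passed explicitly here as an assumption, so that the behaviour is the same on any R version.

```r
# Recreate the data frame with character columns stored as factors
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = TRUE  # explicit, so the result does not depend on the R version
)

# Column classes: Name and Language are factors, Age is numeric
print(sapply(df, class))

# The only values the Language column currently accepts
print(levels(df$Language))

# "C" is not among the levels, which is why the assignment produced NA
print("C" %in% levels(df$Language))
```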



Resolving the factor issue

A new entry in a factor column must match one of the levels already defined; otherwise R issues the warning shown above and stores NA. To avoid this, pass the additional argument stringsAsFactors = FALSE when defining the data frame, so that character columns stay as character vectors. In R versions before 4.0.0 this argument defaults to TRUE, which is why replacing a string with a new string produces the warning; since R 4.0.0 the default is FALSE. Now try the same manipulation again.

Example:

# R program to illustrate
# resolving the factor issue in a data frame
  
# Creating a dataframe 
df = data.frame( 
  "Name" = c("Amiya", "Raj", "Asish"), 
  "Language" = c("R", "Python", "Java"), 
  "Age" = c(22, 25, 45),
  # Passing an additional argument 
  # to resolve factor issue
  stringsAsFactors = FALSE
)
print(df) 
  
# Manipulating the data frame
df[1, 3] = 37
df[3, 2] = "C"
  
print(df)

Output:

   Name Language Age
1 Amiya        R  22
2   Raj   Python  25
3 Asish     Java  45

   Name Language Age
1 Amiya        R  37
2   Raj   Python  25
3 Asish        C  45

From the above output, you can see that there is no NA anymore: the Language column is now a character vector, so “C” is stored as intended.
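When a data frame already exists with factor columns and cannot easily be recreated, two other approaches may work. The sketch below is illustrative only: it either adds the new level before assigning, or converts the column to character. The replacement name "Amit" is a made-up value for demonstration.

```r
# A data frame whose character columns are already factors
df <- data.frame(
  Name = c("Amiya", "Raj", "Asish"),
  Language = c("R", "Python", "Java"),
  Age = c(22, 25, 45),
  stringsAsFactors = TRUE
)

# Option 1: add "C" to the existing levels, then assign it
levels(df$Language) <- c(levels(df$Language), "C")
df[3, 2] <- "C"

# Option 2: convert a factor column to character, then assign freely
df$Name <- as.character(df$Name)
df[1, 1] <- "Amit"  # hypothetical replacement value

print(df)
```

Option 1 keeps the column as a factor, which suits data that is genuinely categorical; option 2 suits free-form text.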

Technical Content Engineer at GeeksForGeeks