How to Fix: Invalid factor level, NA generated in R
Last Updated :
18 Mar, 2022
In this article, we look at how to fix the following R warning, with examples: invalid factor level, NA generated.
R raises this warning when you try to assign a value to a factor variable and that value is not one of the factor's defined levels. The complete warning message is given below:
Warning message:
In `[<-.factor`(`*tmp*`, iseq, value = "C") :
invalid factor level, NA generated
When the warning might occur
Let’s create a data frame.
R
# Create a data frame whose team column is a factor
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Display the data frame and its structure
dataframe
str(dataframe)
Output:
In this example, the team variable has only three levels: “Alpha”, “Beta”, and “Charlie”. Now, we will try to insert an additional row at the end of the data frame with the team name “Gamma”.
Example:
R
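# Create a data frame whose team column is a factor
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Try to append a row whose team is not an existing level.
# list() is used instead of c() so that 99 stays numeric;
# c('Gamma', 99) would coerce it to the string "99".
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)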
Output:
R produces the warning message because the value “Gamma” is not one of the existing levels of the team column. Note that this is only a warning, not an error: R still inserts the new row at the end of the data frame, but the team cell contains NA instead of “Gamma”.
R
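# Create a data frame whose team column is a factor
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Append a row with an unknown level (list() keeps points numeric)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)

# The new row is present, but its team value is NA
dataframe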
Output:
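To see why the NA appears, it can help to inspect the factor's levels before assigning. The following is a minimal sketch, assuming the same data frame as above; the variable names team_levels and gamma_is_level are introduced here only for illustration.

```r
# Recreate the example data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# The set of levels was fixed when the factor was created
team_levels <- levels(dataframe$team)
print(team_levels)

# Check whether the value we want to assign is a known level;
# if it is not, assigning it yields NA and the warning
gamma_is_level <- 'Gamma' %in% team_levels
print(gamma_is_level)
```

A check like this before the assignment shows whether a new value would be accepted or silently turned into NA.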
How the warning can be avoided:
We can avoid this warning by first converting the factor variable to a character variable, then adding the new row, and finally converting the variable back to a factor. Because the factor is rebuilt from the updated values, the new value becomes one of its levels.
Example:
R
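# Create a data frame whose team column is a factor
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Convert the factor to character so any value is accepted
dataframe$team <- as.character(dataframe$team)

# Append the new row (list() keeps points numeric)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)

# Convert back to a factor; 'Gamma' is now one of the levels
dataframe$team <- as.factor(dataframe$team)

dataframe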
Output:
As you can see in the output, both the warning and the NA value are gone from the data frame. Now let’s display the structure of the modified data frame:
R
# Create a data frame whose team column is a factor
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Convert to character, append the row, then convert back to a factor
dataframe$team <- as.character(dataframe$team)
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)
dataframe$team <- as.factor(dataframe$team)

# Display the structure of the modified data frame
str(dataframe)
Output:
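As an alternative to the round trip through character, the new level can be registered on the factor before the row is added. This is a minimal sketch of that variant, not part of the approach shown above; it uses list() for the new row on the assumption that the points column should remain numeric.

```r
# Recreate the example data frame
dataframe <- data.frame(team = factor(c('Alpha', 'Alpha',
                                        'Beta', 'Beta',
                                        'Charlie', 'Charlie',
                                        'Charlie')),
                        points = c(96, 91, 86, 89, 93, 87, 91))

# Add 'Gamma' to the existing levels before assigning it
levels(dataframe$team) <- c(levels(dataframe$team), 'Gamma')

# The assignment now succeeds without a warning;
# list() keeps the points column numeric
dataframe[nrow(dataframe) + 1, ] <- list('Gamma', 99)

str(dataframe)
```

This keeps the column a factor throughout, which may be preferable when the order of the existing levels matters.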