Handling Missing Values in R Programming

As the name indicates, missing values are elements whose value is not known. NA and NaN are reserved words in R: NA indicates a missing value, and NaN ("not a number") is the result of an arithmetic operation that is undefined.
Missing values are common in practice. For example, some cells in a spreadsheet are empty. An impossible arithmetic operation such as 0 / 0 produces NaN.

Finding Missing values

Missing values in R are detected with a few built-in functions:

  • is.na() Function:
    This function returns a logical vector with one element for each element of its argument: TRUE where the value is NA (or NaN) and FALSE otherwise.

    x <- c(NA, 3, 4, NA, NA, NA)
    is.na(x)

    Output:

    [1]  TRUE FALSE FALSE  TRUE  TRUE  TRUE
    
  • is.nan() Function:
    This function returns a logical vector with one element for each element of its argument: TRUE where the value is NaN and FALSE otherwise. An NA is not NaN, so it gives FALSE.



    x <- c(NA, 3, 4, NA, NA, 0 / 0, 0 / 0)
    is.nan(x)

    Output:

    [1] FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE
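
Because both functions return logical vectors, their results can be summed or indexed. The following is a minimal sketch of counting and locating missing entries; the variable names are illustrative only, and anyNA() is a base R helper that the text above does not mention.

```r
x <- c(NA, 3, 4, NA, NA, 0 / 0)

# Count the missing entries (TRUE counts as 1); NaN is included by is.na()
n_missing <- sum(is.na(x))

# Positions of the missing entries
missing_at <- which(is.na(x))

# Count only the NaN entries
n_nan <- sum(is.nan(x))

# Quick check for at least one missing value
has_missing <- anyNA(x)
```

Here n_missing covers both NA and NaN, while n_nan counts the single 0 / 0 entry.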
    

Properties of Missing Values:

  • For testing objects that are NA use is.na()
  • For testing objects that are NaN use is.nan()
  • NA exists for each atomic type. The integer type has an integer NA (NA_integer_), the character type has a character NA (NA_character_), etc.
  • A NaN value is also counted as NA, but the reverse is not true: is.na() returns TRUE for NaN, while is.nan() returns FALSE for NA.
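
The last two properties can be checked directly. This is a small sketch using the typed NA constants from base R; the variable names are only for illustration.

```r
# Each atomic type has its own NA constant
na_types <- c(typeof(NA),            # plain NA is logical
              typeof(NA_integer_),
              typeof(NA_real_),
              typeof(NA_character_))

# NaN counts as missing for is.na(), but NA is not NaN
nan_is_na <- is.na(NaN)
na_is_nan <- is.nan(NA)
```

When a plain NA is placed in a vector with other values, it is coerced to the NA of that vector's type.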

A vector can also be created with one or more NAs.

x <- c(NA, 3, 4, NA, NA, NA)
x

Output:

[1] NA  3  4 NA NA NA

Removing NA or NaN values

There are two ways to remove missing values:

  • Extracting every value that is not NA or NaN:

    Example 1:

    x <- c(1, 2, NA, 3, NA, 4)
    # Logical vector marking the missing entries
    d <- is.na(x)
    # Keep only the entries that are not missing
    x[!d]

    Output:

    [1] 1 2 3 4
    

    Example 2:



    x <- c(1, 2, 0 / 0, 3, NA, 4, 0 / 0)
    x
    # is.na() is TRUE for NaN as well, so both kinds are dropped
    x[!is.na(x)]

    Output:

    [1]   1   2 NaN   3  NA   4 NaN
    [1] 1 2 3 4
    
  • Using the complete.cases() function: it returns a logical vector that is TRUE for every case with no missing value. It also works on data frames, where each row is one case.
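
As a minimal sketch of the second approach, complete.cases() can be used to filter the rows of a data frame. The data frame and its column names below are made up for illustration.

```r
# Hypothetical data frame with missing values in two columns
df <- data.frame(id    = 1:4,
                 score = c(10, NA, 30, 40),
                 grade = c("A", "B", NA, "D"))

# TRUE for rows that have no missing value in any column
keep <- complete.cases(df)

# Keep only the complete rows
clean <- df[keep, ]
```

Rows 2 and 3 each contain an NA, so only rows 1 and 4 remain in clean.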

Missing Value Filter Functions

The modeling functions in R accept an na.action argument that tells the function what to do when it encounters NA values.

The function then calls one of the missing value filter functions.
A missing value filter function takes the data set and returns a new one in which the NAs have been dealt with, for example by dropping the rows that contain them.

The default missing value filter function is na.omit. It omits every row containing even one NA.

The missing value filter functions are:

  • na.omit: omits every row containing even one NA
  • na.fail: stops with an error if any NA is encountered
  • na.exclude: omits every row containing even one NA but keeps a record of their original positions, so that residuals and predictions are padded back to the original length
  • na.pass: returns the data unchanged, NAs included

# Creating a data frame
df <- data.frame(c1 = 1:8,
                 c2 = factor(c("B", "A", "B", "C",
                               "A", "C", "B", "A")))

# Filling some NA in data frame
df[4, 1] <- df[6, 2] <- NA

# Printing all the levels (NA is not considered one)
levels(df$c2)

# Fails if NA is encountered
na.fail(df)

# Excludes every row containing even one NA
# (not reached when run as a script, because na.fail() halts above)
na.exclude(df)

Output:

[1] "A" "B" "C"
Error in na.fail.default(df) : missing values in object
Calls: na.fail -> na.fail.default
Execution halted
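
Since na.fail() stops the script above, the other filter functions never run there. The following sketch rebuilds the same data frame and compares na.omit, na.exclude and na.pass, then shows how na.exclude differs from na.omit inside a model. The lm() model c1 ~ c2 is only an illustrative assumption and has no modeling meaning.

```r
df <- data.frame(c1 = 1:8,
                 c2 = factor(c("B", "A", "B", "C",
                               "A", "C", "B", "A")))
df[4, 1] <- NA
df[6, 2] <- NA

omitted  <- na.omit(df)     # rows 4 and 6 dropped
excluded <- na.exclude(df)  # same rows dropped, recorded with class "exclude"
passed   <- na.pass(df)     # returned unchanged, NAs included

# The dropped rows are recorded in the "na.action" attribute
omit_class    <- class(attr(omitted, "na.action"))
exclude_class <- class(attr(excluded, "na.action"))

# In a model, na.exclude pads residuals back to the original length
fit_omit <- lm(c1 ~ c2, data = df, na.action = na.omit)
fit_excl <- lm(c1 ~ c2, data = df, na.action = na.exclude)
n_res_omit <- length(residuals(fit_omit))
n_res_excl <- length(residuals(fit_excl))
```

With na.omit the residual vector has one entry per row that was kept, while with na.exclude it has one entry per original row, with NA in the positions of the dropped rows.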

Special Cases

There are two special cases where NA is denoted or presented differently:

  • Factor vectors: <NA> is the symbol displayed in factor vectors for missing values.
  • NaN: this is a special case of NA. It is produced when an arithmetic operation yields a result which is not a number. For example, dividing zero by zero produces NaN.
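
Both special cases can be seen in a short sketch. The variable names are illustrative, and capture.output() is used here only to hold the printed text in a variable.

```r
# A factor with one missing entry
f <- factor(c("A", NA, "B"))

# Capture what print() shows; the missing entry appears as <NA>
printed <- capture.output(print(f))

# NA is not one of the levels
f_levels <- levels(f)

# Dividing zero by zero yields NaN, which is.na() also reports as missing
z <- 0 / 0
z_flags <- c(is.nan(z), is.na(z))
```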


