Complete Cases in R with Examples

In this article, we will discuss what the complete.cases() function is and where it is used in the R Programming Language.

What is the complete.cases() Function

The complete.cases() function in the R Programming Language returns a logical vector that marks which cases are complete, i.e., contain no missing values (NA). For a vector, a case is a single element; for a matrix or data frame, a case is a row. This makes the function especially handy for filtering datasets that may have missing data.

Syntax:

complete.cases(...)

Parameters: ...: one or more vectors, matrices, or data frames to check for missing values
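
Beyond a single object, complete.cases() also accepts several objects in one call and reports a position as complete only when it is non-missing in all of them. A minimal sketch, assuming two made-up vectors x and y of equal length (these names are illustrative and not part of the original examples):

```r
# Hypothetical example vectors of equal length
x <- c(1, NA, 3, 4)
y <- c("a", "b", NA, "d")

# A position counts as complete only if it is non-missing in both x and y
both_complete <- complete.cases(x, y)
print(both_complete)
```

Positions 2 and 3 are reported as FALSE because each has an NA in one of the two vectors.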

Perform complete.cases() Function on Vector

# R Program to return 
# cases which are complete 
 
# Creating a vector 

vec <- c(1, 2, 3, 4, NA, 3) 
 
# Calling complete.cases() function 

complete.cases(vec) 
 
# Keeping only the non-missing elements and printing them

vec1 <- vec[complete.cases(vec)] 
vec1 

Output:

[1]  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
[1] 1 2 3 4 3
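
Because the result is a logical vector, it can also be used for counting. The sketch below reuses the vector from the example above; the names is_complete, n_complete, and n_missing are illustrative. For a plain vector, the result should match !is.na(vec) element by element.

```r
# Same vector as in the example above
vec <- c(1, 2, 3, 4, NA, 3)

# Logical vector: TRUE where the element is not missing
is_complete <- complete.cases(vec)

# For a plain vector this agrees with !is.na()
same_as_is_na <- all(is_complete == !is.na(vec))

# TRUE counts as 1 and FALSE as 0, so sum() counts the cases
n_complete <- sum(is_complete)
n_missing <- sum(!is_complete)

print(n_complete)
print(n_missing)
```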

Perform complete.cases() Function on Matrix

# Create a matrix with missing values

matrix_data <- matrix(c(1, 2, NA, 4, 5, 6, 7, 8, 9),3,3)
 
# Identify complete cases in the matrix

complete_rows <- complete.cases(matrix_data)
 
# Print the original matrix

print(matrix_data)
 
# Extract complete cases from the matrix

complete_matrix <- matrix_data[complete_rows, , drop = FALSE]
 
# Print the matrix with complete cases

print(complete_matrix)

Output:

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]   NA    6    9
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
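
To find out which rows were dropped rather than which were kept, the logical vector can be negated and passed to which(). A short sketch using the same matrix as above; the name incomplete_row_index is illustrative:

```r
# Same matrix as in the example above (filled column by column)
matrix_data <- matrix(c(1, 2, NA, 4, 5, 6, 7, 8, 9), 3, 3)

# Indices of the rows that contain at least one NA
incomplete_row_index <- which(!complete.cases(matrix_data))
print(incomplete_row_index)
```

Here only row 3 is reported, since the NA sits in the third row of the first column.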

Perform complete.cases() Function on Data frame

# Create a data frame with missing values

data_frame <- data.frame(
  StudentID = c(101, 102, 103, 104),
  ExamScore = c(85, NA, 92, 68),
  Attendance = c(90, 75, NA, 80),
  Grade = c("A", "B", "C", "D")
)
 
# Print the original data frame

print(data_frame)
 
# Identify complete cases in the data frame

complete_rows <- complete.cases(data_frame)
 
# Extract complete cases from the data frame

complete_data_frame <- data_frame[complete_rows, , drop = FALSE]
 
# Print the data frame with complete cases

print(complete_data_frame)

Output:

  StudentID ExamScore Attendance Grade
1       101        85         90     A
2       102        NA         75     B
3       103        92         NA     C
4       104        68         80     D
  StudentID ExamScore Attendance Grade
1       101        85         90     A
4       104        68         80     D
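
For comparison, base R's na.omit() drops the same rows in a single step. The sketch below, using the same data frame, checks that both approaches keep the same rows; the names by_complete_cases and by_na_omit are illustrative. Note that na.omit() attaches an extra na.action attribute to its result, so the two objects are compared by their row names rather than with identical().

```r
# Same data frame as in the example above
data_frame <- data.frame(
  StudentID = c(101, 102, 103, 104),
  ExamScore = c(85, NA, 92, 68),
  Attendance = c(90, 75, NA, 80),
  Grade = c("A", "B", "C", "D")
)

# Approach 1: logical indexing with complete.cases()
by_complete_cases <- data_frame[complete.cases(data_frame), ]

# Approach 2: na.omit() removes every row that contains an NA
by_na_omit <- na.omit(data_frame)

# Both keep the original row names of the surviving rows
print(rownames(by_complete_cases))
print(rownames(by_na_omit))
```

complete.cases() is the more flexible choice when the logical vector itself is needed, for example to count or inspect the incomplete rows before removing them.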

Remove Rows Containing NA in Specific Columns of a Data frame

# Create a data frame with missing values

data_frame <- data.frame(
  StudentID = c(101, 102, 103, 104),
  ExamScore = c(85, NA, 92, 68),
  Attendance = c(90, 75, NA, 80),
  Grade = c("A", "B", "C", "D")
)
 
# Print the original data frame

print(data_frame)
 
# Identify rows where the ExamScore column is not missing

complete_rows <- complete.cases(data_frame[ , 'ExamScore'])
 
# Keep only the rows with a non-missing ExamScore

complete_data_frame <- data_frame[complete_rows, , drop = FALSE]
 
# Print the filtered data frame

print(complete_data_frame)

Output:

  StudentID ExamScore Attendance Grade
1       101        85         90     A
2       102        NA         75     B
3       103        92         NA     C
4       104        68         80     D
  StudentID ExamScore Attendance Grade
1       101        85         90     A
3       103        92         NA     C
4       104        68         80     D

Here we remove only the rows that contain NA in the ExamScore column. Row 3 is kept even though its Attendance value is NA, because that column was not checked.
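
The same idea extends to more than one column by passing a subset of columns to complete.cases(). The sketch below uses a modified version of the data frame in which Grade is also missing for one student (a hypothetical change made only for illustration); the names cols_to_check and filtered are likewise illustrative.

```r
# Variant of the data frame above, with one Grade set to NA for illustration
data_frame <- data.frame(
  StudentID = c(101, 102, 103, 104),
  ExamScore = c(85, NA, 92, 68),
  Attendance = c(90, 75, NA, 80),
  Grade = c("A", "B", "C", NA)
)

# Only these columns are checked for missing values
cols_to_check <- c("ExamScore", "Attendance")

# Keep rows that are complete in the checked columns
filtered <- data_frame[complete.cases(data_frame[, cols_to_check]), , drop = FALSE]
print(filtered)
```

Rows 2 and 3 are removed because of the NA in ExamScore and Attendance respectively, while row 4 is kept despite its missing Grade, since Grade is not among the checked columns.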
