Extract unique columns from a matrix using R

Last Updated : 20 Feb, 2024

A matrix is a rectangular arrangement of numbers in rows and columns: rows run horizontally and columns run vertically. In the R programming language, matrices are two-dimensional, homogeneous data structures, meaning every element has the same type.

To create a matrix in R, use the matrix() function. Its first argument is a vector holding the elements; the nrow and ncol arguments set the number of rows and columns.

Note: By default, matrix() fills the matrix column by column. Pass byrow = TRUE to fill it row by row instead.
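To illustrate the note above, the short sketch below fills the same six values first column by column (the default) and then row by row. The variable names by_col and by_row are chosen here for illustration only.

```r
# Fill a 2x3 matrix column by column (the default, byrow = FALSE)
by_col <- matrix(1:6, nrow = 2, ncol = 3)
print(by_col)
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6

# Fill the same values row by row
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row)
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    4    5    6
```

The same vector therefore produces two different matrices, which is why the fill order matters when the data is written out by hand.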

Extract unique columns from a matrix using R

First, let's create a matrix.

R




A = matrix(
  # Sequence of elements, written three at a time (one group per column)
  c(1,2,1, 5,3,5, 1,2,1, 1,2,3, 1,1,1, 1,5,4, 1,1,1),

  # Number of rows
  nrow = 3,

  # Number of columns
  ncol = 7

  # byrow = FALSE is the default, so the elements fill the matrix
  # column by column; pass byrow = TRUE to fill it row by row
)

# Naming rows
rownames(A) = c("a", "b", "c")

# Naming columns
colnames(A) = c("a", "b", "c", "d", "e", "f", "g")
   
cat("The 3x7 matrix:\n")
print(A)


Output:

The 3x7 matrix:
  a b c d e f g
a 1 5 1 1 1 1 1
b 2 3 2 2 1 5 1
c 1 5 1 3 1 4 1

Method 1: Using unique() function

Columns 1 and 3 (a and c) are identical, and so are columns 5 and 7 (e and g), so we will use the unique() function to extract the unique columns of the matrix.

Syntax:

unique(x, incomparables, fromLast, nmax, ..., MARGIN)

  • x: A vector, data frame, array, or NULL.
  • incomparables: A vector of values that cannot be compared. FALSE means that all values can be compared, and may be the only value accepted for methods other than the default. It is coerced internally to the same type as x.
  • fromLast: A logical value (TRUE or FALSE) indicating whether duplication should be considered from the last element, i.e., whether the last (rightmost) of identical elements is kept.
  • nmax: The maximum number of unique items expected.
  • ...: Arguments for particular methods.
  • MARGIN: The array margin to be held fixed. For a matrix, MARGIN = 1 (the default) compares rows and MARGIN = 2 compares columns.
  • Return value: A vector, data frame, or array with duplicate elements, rows, or columns removed.
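As a rough illustration of the MARGIN and fromLast parameters described above, the sketch below uses a small hypothetical matrix M (not part of the original example) whose first and third columns are identical.

```r
# Hypothetical 2x3 matrix: columns "x" and "z" hold the same values
M <- matrix(c(1, 2,  3, 4,  1, 2), nrow = 2,
            dimnames = list(NULL, c("x", "y", "z")))

# MARGIN = 2 compares columns; the first occurrence ("x") is kept
first_kept <- unique(M, MARGIN = 2)
print(colnames(first_kept))   # "x" "y"

# fromLast = TRUE scans from the right, so the last occurrence ("z") is kept
last_kept <- unique(M, MARGIN = 2, fromLast = TRUE)
print(colnames(last_kept))    # "y" "z"

# MARGIN = 1 (the default) compares rows; the two rows differ, so both remain
print(nrow(unique(M)))        # 2
```

The values returned are the same either way; fromLast only changes which of the identical columns (and therefore which column name) survives.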

R




A = matrix(c(1,2,1, 5,3,5, 1,2,1, 1,2,3, 1,1,1, 1,5,4, 1,1,1), nrow = 3, ncol = 7)

# Naming rows
rownames(A) = c("a", "b", "c")

# Naming columns
colnames(A) = c("a", "b", "c", "d", "e", "f", "g")

# MARGIN = 2 makes unique() compare columns instead of rows
cat("Unique columns:\n")
unique(A, MARGIN = 2)


Output:

Unique columns:
  a b d e f
a 1 5 1 1 1
b 2 3 2 1 5
c 1 5 3 1 4

As the output shows, only the first occurrence of each column is kept; the later duplicates (c and g) are removed.

Some other examples of unique() are shown below.

R




# R program to show
# unique() function
  
# Initializing an input vector with some
# duplicate values
V <- c(1, 2, 3, 4, 4, 5, 6, 5, 6, 5)
   
# Calling the unique() function over the
# vector to remove duplicate values from it
unique(V)


Output:

[1] 1 2 3 4 5 6

Unique rows from the specified matrix

R




A = matrix(c(1,2,3,4, 1,2,3,4, 1,2,4,4, 1,2,4,4), nrow = 4, ncol = 4, byrow = TRUE)

# Naming rows
rownames(A) = c("a", "b", "c", "d")

# Naming columns
colnames(A) = c("a", "b", "c", "d")

# With the default MARGIN = 1, unique() compares rows
cat("Unique rows:\n")
unique(A)


Output:

Unique rows:
  a b c d
a 1 2 3 4
c 1 2 4 4

As the output shows, every later occurrence of a duplicate row is removed.

Unique rows from the specified data frame

R




# R program to illustrate
# unique() function

# Creating a data frame
my_class <- data.frame(
  Student = c('Rohit', 'Anjali', 'Rohit', 'Rohan', 'Anjali'),
  Age = c(22, 23, 22, 22, 23),
  Gender = c('Male', 'Female', 'Male', 'Male', 'Female')
)
my_class

# Printing a blank line
writeLines("\n")

# Keeping only the unique rows
unique(my_class)


Output:

  Student Age Gender
1   Rohit  22   Male
2  Anjali  23 Female
3   Rohit  22   Male
4   Rohan  22   Male
5  Anjali  23 Female

  Student Age Gender
1   Rohit  22   Male
2  Anjali  23 Female
4   Rohan  22   Male

Method 2: Using duplicated() function

In R, the duplicated() function identifies duplicated elements in a vector, or duplicated rows in a matrix or data frame.

  • When used with a matrix or data frame, duplicated() returns a logical vector indicating whether each row is a duplicate, i.e., whether an identical row has already occurred above it.
  • In the code below, duplicated(t(A)) flags the duplicate columns of A (the rows of the transposed matrix), and !duplicated(t(A)) negates this to mark the columns to keep.
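As a small illustration of the points above, the sketch below shows the logical vector that duplicated() returns, first for a plain vector and then for the columns of the same 3x7 matrix. It also uses the MARGIN argument of duplicated(), which is not covered in this article, as a possible alternative to transposing; the names v, via_transpose, and via_margin are illustrative only.

```r
# duplicated() on a vector flags every repeat of an earlier value
v <- c(1, 2, 2, 3, 1)
print(duplicated(v))   # FALSE FALSE TRUE FALSE TRUE

# The same 3x7 matrix used throughout the article
A <- matrix(c(1,2,1, 5,3,5, 1,2,1, 1,2,3, 1,1,1, 1,5,4, 1,1,1),
            nrow = 3, ncol = 7)

# Option 1: transpose so that columns become rows, then check rows
via_transpose <- duplicated(t(A))

# Option 2 (assumed alternative): ask duplicated() to compare columns directly
via_margin <- duplicated(A, MARGIN = 2)

# Both flag columns 3 and 7 as repeats of columns 1 and 5
print(which(via_transpose))   # 3 7
print(which(via_margin))      # 3 7
```

Either logical vector can be negated with ! and used to index the columns of A.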

R




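A = matrix(
  # 21 elements for a 3x7 matrix, filled column by column
  c(1, 2, 1, 5, 3, 5, 1, 2, 1, 1, 2, 3, 1, 1, 1, 1, 5, 4, 1, 1, 1),
  nrow = 3,
  ncol = 7
)

rownames(A) = c("a", "b", "c")
colnames(A) = c("a", "b", "c", "d", "e", "f", "g")

cat("The 3x7 matrix:\n")
print(A)

# Extract unique columns: t(A) turns columns into rows, duplicated() flags
# repeated rows, and ! keeps the rest; drop = FALSE keeps the result a
# matrix even if only one column remains
unique_columns <- A[, !duplicated(t(A)), drop = FALSE]

cat("\nUnique columns:\n")
print(unique_columns)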


Output:

The 3x7 matrix:
  a b c d e f g
a 1 5 1 1 1 1 1
b 2 3 2 2 1 5 1
c 1 5 1 3 1 4 1

Unique columns:
  a b d e f
a 1 5 1 1 1
b 2 3 2 1 5
c 1 5 3 1 4

This code first transposes the matrix with t(A), so that the columns of A become rows. duplicated() then flags the rows of the transposed matrix (the columns of A) that repeat an earlier one, and ! negates the result. Finally, this logical vector is used to select the columns of A that are not duplicates. The result is stored in the unique_columns variable and printed.
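One caveat with this approach: when a logical index leaves only a single column, R's default subsetting returns a plain vector rather than a one-column matrix, unless drop = FALSE is passed. The sketch below shows this on a hypothetical matrix B (not from the original example) and checks that both methods then give the same result.

```r
# Hypothetical 2x3 matrix in which all three columns are identical
B <- matrix(c(1, 2, 1, 2, 1, 2), nrow = 2)

# Only the first column is kept
keep <- !duplicated(t(B))

# Default subsetting drops the single remaining column to a plain vector
print(is.matrix(B[, keep]))            # FALSE

# drop = FALSE preserves the 2x1 matrix shape
print(dim(B[, keep, drop = FALSE]))    # 2 1

# With drop = FALSE, the duplicated() approach matches unique(B, MARGIN = 2)
same <- identical(unique(B, MARGIN = 2), B[, keep, drop = FALSE])
print(same)                            # TRUE
```

unique() with MARGIN = 2 always returns a matrix, so adding drop = FALSE is what makes the two methods interchangeable.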


