Convert matrix or dataframe to sparse Matrix in R
A sparse matrix is a matrix in which most of the elements are zero. By default, the Matrix package stores such matrices in a compressed, column-oriented format that keeps only the non-zero elements, with the entries of each column ordered by increasing row index. In this article, we will convert a matrix and a dataframe to a sparse matrix in the R programming language.
Converting matrix to sparse matrix
A matrix in R is a collection of elements of the same type arranged in a two-dimensional layout of rows and columns. We can construct a matrix in R using the matrix() function.
The first step is to install the Matrix package using install.packages("Matrix") and then load it using the library() function. Next, we construct our matrix using the matrix() function from base R. After the matrix has been created, we convert it to an equivalent sparse matrix using as().
sparsematrix <- as(BaseMatrix, "sparseMatrix")
- sparsematrix : This is the resulting sparse matrix, converted from our base matrix.
- BaseMatrix : This is our sample R matrix.
- "sparseMatrix" : This is the name of the target class passed to the as() function to convert the base R matrix to sparse format.
Example: Converting matrix to sparse matrix in R
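The steps above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the matrix dimensions and the three non-zero values are made up for the example, and it assumes the Matrix package is already installed.

```r
# install.packages("Matrix")  # run once if the package is not installed
library(Matrix)

# Build a 4 x 5 base R matrix that is mostly zeros (illustrative values)
BaseMatrix <- matrix(0, nrow = 4, ncol = 5)
BaseMatrix[1, 2] <- 3
BaseMatrix[3, 4] <- 7
BaseMatrix[4, 1] <- 5

# Convert the dense matrix to a sparse matrix
sparsematrix <- as(BaseMatrix, "sparseMatrix")

# Printing shows the zero entries as dots
print(sparsematrix)
```

Only the three non-zero entries are stored in sparsematrix; converting back with as.matrix(sparsematrix) should reproduce the original dense matrix.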
Converting a dataframe to sparse matrix
A dataframe is a table-like, two-dimensional structure with rows and columns, and is one of the most common ways of storing data in R. We will convert the dataframe to a sparse matrix by using the sparseMatrix() function from the Matrix package.
Syntax: sparseMatrix(i = ep, j = ep, p, x, dims, dimnames, symmetric = FALSE, triangular = FALSE, index1 = TRUE, repr = "C", giveCsparse = (repr == "C"), check = TRUE, use.last.ij = FALSE)
- i, j : These are integer vectors of the same length that give the row and column indices of the non-zero entries of the matrix.
- p : This is an integer vector of pointers, one for each column (or row), giving the zero-based index of the first element of that column (or row). Exactly one of i, j and p must be missing.
- x : This is an optional vector of values for the matrix entries. If it is omitted, the result is a pattern matrix that records only the positions of the non-zero entries.
- dims : This is an optional non-negative integer vector of length 2 giving the dimensions of the matrix. It defaults to c(max(i), max(j)).
- dimnames : This is an optional list of length 2 holding the row and column names.
- symmetric : This is a logical value. It should be TRUE if the resulting matrix should be symmetric and FALSE otherwise.
- triangular : This is a logical value. It should be TRUE if the resulting matrix should be triangular and FALSE otherwise.
- index1 : This is a logical scalar. If it is TRUE (the default), the counting of rows and columns starts at 1. If it is FALSE, the counting of rows and columns starts at 0.
- repr : This is a character string, one of "C", "R" or "T", that specifies the sparse representation used for the result: compressed column-oriented, compressed row-oriented, or triplet.
- giveCsparse : This is a logical value indicating whether the resulting matrix is Csparse or Tsparse. It is deprecated in recent versions of the Matrix package in favor of repr.
- check : This is a logical value indicating whether a validity check is performed.
- use.last.ij : This is a logical value indicating whether, in the case of duplicated (i, j) pairs, only the last one should be used. With the default FALSE, the x values of duplicated pairs are added together.
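To make the main arguments concrete, here is a small sketch that builds a sparse matrix directly from row indices, column indices and values. The index and value vectors are invented for illustration; the second call shows the same entries supplied with zero-based indices through index1 = FALSE.

```r
library(Matrix)

# Row indices, column indices and values of the non-zero entries (1-based)
i <- c(1, 3, 4)
j <- c(2, 4, 1)
x <- c(3, 7, 5)

# Build a 4 x 5 sparse matrix from the (i, j, x) triplets
m <- sparseMatrix(i = i, j = j, x = x, dims = c(4, 5))

# The same entries given with zero-based indices
m0 <- sparseMatrix(i = i - 1, j = j - 1, x = x, dims = c(4, 5),
                   index1 = FALSE)

print(m)
```

Both calls should describe the same matrix; passing dims explicitly avoids relying on the default of c(max(i), max(j)), which would otherwise drop trailing all-zero rows or columns.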
Example: Converting dataframe to sparse matrix in R
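One possible way to apply sparseMatrix() to a dataframe is sketched below. The dataframe df and its values are hypothetical, and the approach assumes that all columns are numeric: the dataframe is first turned into a base matrix, the positions of the non-zero cells are located with which(), and those positions and values are passed to sparseMatrix().

```r
library(Matrix)

# A small, mostly-zero dataframe (illustrative values)
df <- data.frame(
  a = c(0, 2, 0, 0),
  b = c(0, 0, 0, 4),
  c = c(1, 0, 0, 0)
)

# Convert the dataframe to a numeric base matrix first
dense <- as.matrix(df)

# Locate the non-zero cells as (row, col) index pairs
nz <- which(dense != 0, arr.ind = TRUE)

# Build the sparse matrix from the positions and their values
sparse_df <- sparseMatrix(
  i = nz[, "row"],
  j = nz[, "col"],
  x = dense[nz],
  dims = dim(dense),
  dimnames = list(NULL, colnames(df))
)

print(sparse_df)
```

For a dataframe that fits comfortably in memory, as(as.matrix(df), "sparseMatrix") is a shorter alternative that should give an equivalent result; the explicit sparseMatrix() call is mainly useful when the non-zero positions are already known.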