Data Serialization (RDS) using R

Last Updated : 17 Jan, 2024

This article explains data serialization in R. One common serialization method in R is the RDS (R Data Serialization) format.

Data Serialization (RDS) using R

Data serialization is the process of converting data structures or objects into a format that can be easily stored, transmitted, and reconstructed later. In the R programming language, one common method for data serialization is the RDS (R Data Serialization) format. The RDS format allows us to save an R object, such as a data frame or a model, to a file and later read it back into R.

  • Serialization: The process of converting a complex data structure into a well-defined format suitable for storage and transmission.

Significance of Data Serialization:

  1. Data Preservation: When working with complex data structures in R, you usually need to keep an object’s class, attributes, and structure intact. Serialization to RDS preserves them, so the object restored by deserialization, or “unpacking,” matches the one that was saved.
  2. Data Sharing: Different applications or systems often need to exchange data. A serialized file gives them a single, well-defined format to pass around. RDS itself is specific to R, so it suits sharing between R sessions and R users.
  3. Storage Efficiency: Data stored in a binary serialization format like RDS, which is compressed by default, usually takes less space than the same data stored in human-readable text formats like CSV or JSON. This can matter a great deal when working with big datasets.
  4. Reduced Data Transfer Overhead: Sending data that is already serialized avoids repeated conversion between formats on each side of a network transfer. This can lower resource use and speed up data transmission.
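
The data preservation point can be illustrated with a small sketch. It assumes a made-up data frame (the column names `level` and `when` are hypothetical) and compares a round trip through RDS with a round trip through CSV, using temporary files so nothing is left behind in the working directory.

```r
# Hypothetical example data: a factor column and a Date column
df <- data.frame(
  level = factor(c("low", "high", "low"), levels = c("low", "high")),
  when  = as.Date(c("2024-01-01", "2024-01-02", "2024-01-03"))
)

# Temporary file paths (assumed locations, chosen only for this sketch)
rds_path <- tempfile(fileext = ".rds")
csv_path <- tempfile(fileext = ".csv")

# Write the same data frame in both formats
saveRDS(df, rds_path)
write.csv(df, csv_path, row.names = FALSE)

# Read both files back
from_rds <- readRDS(rds_path)
from_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# The RDS copy keeps classes and factor levels
print(identical(df, from_rds))

# The CSV copy comes back as plain text columns
print(class(from_csv$level))
print(class(from_csv$when))
```

With CSV, the factor levels and the Date class have to be restored by hand after reading, while the RDS copy needs no extra work.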

Basic Concepts in RDS

  1. RDS (R Data Serialization):
    • RDS is a binary serialization format in R used to save R objects to a file.
    • It allows you to save and load R objects while preserving their class, attributes, and structure.
  2. Serialization Functions:
    • saveRDS() function in R is used to serialize an R object to a file.
    • readRDS() function in R is used to deserialize and read the R object back into the R environment.
  3. Saving and Loading Data: Use saveRDS() to save R objects to a file, and readRDS() to load them back into R.
  4. Serialization of Different Data Types:
    • RDS can serialize various data types, including vectors, lists, data frames, and more.
    • It’s suitable for saving individual objects or entire datasets.
  5. Alternative Formats:
    • Besides RDS, other serialization formats like CSV, JSON, and Feather may be used based on specific requirements.
    • Choose the format that best fits the use case in terms of performance, interoperability, and storage size.
  6. Compressing Serialized Data: For large datasets, compression reduces file size. saveRDS() applies gzip compression by default, and its compress argument also accepts “bzip2” or “xz”, or FALSE to turn compression off.
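
As a rough sketch of the compression options, the following saves the same object three times with different `compress` settings and compares file sizes. The test vector `x` and the names in `paths` are invented for illustration, and the actual sizes depend on the data.

```r
# Hypothetical, highly repetitive data that compresses well
x <- rep(c("alpha", "beta", "gamma"), times = 10000)

# One temporary file per compression setting
paths <- c(none = tempfile(), gzip = tempfile(), xz = tempfile())

saveRDS(x, paths[["none"]], compress = FALSE)
saveRDS(x, paths[["gzip"]], compress = "gzip")
saveRDS(x, paths[["xz"]],   compress = "xz")

# File sizes in bytes, labelled by compression setting
sizes <- setNames(file.size(paths), names(paths))
print(sizes)

# readRDS() detects the compression on its own when reading
restored <- readRDS(paths[["xz"]])
print(identical(x, restored))
```

Stronger compression such as "xz" tends to give smaller files at the cost of slower writes, so the default is often a reasonable starting point.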

Serializing and Deserializing Using the saveRDS() and readRDS() Functions:

  • Step 1: Serialize an R object to an RDS file with saveRDS()
  • Step 2: Deserialize the RDS file back into an R object with readRDS()

Key Concepts:

  1. Serialization: Serialization converts data objects into a specific format that is storable or transmissible. In R, this is often done using the saveRDS() function.
  2. Deserialization: The reverse process, in which the readRDS() function converts serialized data back into its original R data structure.
  3. RDS File Format: RDS files, usually given the extension .rds or .RDS, are binary files that each store one serialized R object. Compared to standard text formats like CSV, they are usually more space-efficient.
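
To make the idea of a "storable or transmissible" format concrete, here is a minimal sketch that serializes an object in memory with base R's `serialize()` and `unserialize()` instead of writing a file. The object `obj` is a made-up example.

```r
# Hypothetical object to serialize
obj <- list(id = 1L, tags = c("a", "b"))

# connection = NULL makes serialize() return the bytes as a raw vector
bytes <- serialize(obj, connection = NULL)
print(is.raw(bytes))
print(length(bytes))

# unserialize() rebuilds the object from those bytes
restored <- unserialize(bytes)
print(identical(obj, restored))
```

saveRDS() and readRDS() do the same kind of conversion, with a file as the destination and source of the bytes.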

Serialize and Deserialize a Data Frame

Data serialization is a key part of programming: it lets us store and transfer complex data structures and rebuild them later. The .RDS file format is frequently used for serialization in the R programming community. It makes distributing data or storing it for later use easier, because it lets us save R objects while preserving their class properties and structure. The sections below walk through several examples of data serialization using RDS.
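
A data frame round trip might look like the sketch below. It uses the built-in `mtcars` dataset and a temporary file path, both chosen only for illustration.

```r
# Take a small data frame with character row names
cars <- head(mtcars)

# Serialize the data frame to a temporary .rds file
path <- tempfile(fileext = ".rds")
saveRDS(cars, file = path)

# Deserialize it back into R
loaded_cars <- readRDS(path)

# Row names, column names, and column types all survive the round trip
print(rownames(loaded_cars))
print(identical(cars, loaded_cars))
```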

Serializing and Deserializing a List

R




# Creating a list
my_list <- list(numbers = 1:5, colors = c("red", "blue", "green"))
 
# Save the list to a file
saveRDS(my_list, "serialized_list.rds")
 
# Read the serialized list back into R
loaded_list <- readRDS("serialized_list.rds")
 
# Display the loaded list
print(loaded_list)


Output:

$numbers
[1] 1 2 3 4 5

$colors
[1] "red"   "blue"  "green"

Serialize a List of Data Frames

R




# Create two data frames
df1 <- data.frame(Name = c("Ram", "Mina", "Sonu"), Age = c(32, 22, 24))
df2 <- data.frame(City = c("India", "Africa", "Japan"),
                  Population = c(8398456, 2350456, 2765494))
 
# Create a list of data frames
list_of_dfs <- list(data_frame1 = df1, data_frame2 = df2)
 
# Serialize the list of data frames
saveRDS(list_of_dfs, file = "list_of_data_frames.RDS")
 
# Deserialize the list of data frames
loaded_list_of_dfs <- readRDS("list_of_data_frames.RDS")
 
# Access and print both data frames
print(loaded_list_of_dfs$data_frame1)
print(loaded_list_of_dfs$data_frame2)


Output:

  Name Age
1  Ram  32
2 Mina  22
3 Sonu  24

    City Population
1  India    8398456
2 Africa    2350456
3  Japan    2765494

Serialize a Custom R Object

R




# Create a custom R object (a named list; structure() adds no extra attributes here)
custom_object <- structure(list(
  name = "Minakshi",
  age = 22,
  city = "Bihar",
  hobbies = c("Reading", "Writing", "Cooking"),
  scores = c(math = 95, science = 89, history = 75)
))
 
# Serialize the custom object
saveRDS(custom_object, file = "custom_object.RDS")
 
# Deserialize the custom object
loaded_custom_object <- readRDS("custom_object.RDS")
 
# Access and print the loaded custom object
print(loaded_custom_object)


Output:

$name
[1] "Minakshi"

$age
[1] 22

$city
[1] "Bihar"

$hobbies
[1] "Reading" "Writing" "Cooking"

$scores
   math science history 
     95      89      75 

The saved file appears in the working directory under the name we gave it, so we can read it back whenever it is required.
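
One way to confirm that a file was written is to check for it from R itself. This sketch saves a small placeholder object into the session's temporary directory (an assumption made here to avoid cluttering the working directory) and then inspects the file.

```r
# Save a placeholder object under a known file name
path <- file.path(tempdir(), "custom_object.RDS")
saveRDS(list(name = "example"), file = path)

# Check that the file exists and look at its size in bytes
exists_after_save <- file.exists(path)
size_in_bytes <- file.size(path)
print(exists_after_save)
print(size_in_bytes)

# List all .RDS files in that directory
print(list.files(tempdir(), pattern = "\\.RDS$"))

# Remove the file once it is no longer needed
unlink(path)
```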

This process is particularly useful for saving and sharing R objects, especially when the data or objects are too large to be easily shared in code form.

Keep in mind that while RDS is a convenient format for saving and loading R objects, it is specific to R. If you need to exchange data with other programming languages, you might want to consider other formats like CSV, JSON, or binary formats that are more widely supported.
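
For data that has to leave R, a plain-text export is one option. The sketch below writes a small, made-up data frame to CSV with base R's `write.csv()` and shows that the result is ordinary text that other tools can read. The data and the temporary path are assumptions for illustration.

```r
# Hypothetical data to share with a non-R program
people <- data.frame(Name = c("Ram", "Mina", "Sonu"), Age = c(32, 22, 24))

# Write it as CSV instead of RDS
csv_path <- tempfile(fileext = ".csv")
write.csv(people, csv_path, row.names = FALSE)

# The file is plain text, one row per line
csv_lines <- readLines(csv_path)
print(csv_lines)

# Reading it back into R works too, though classes are re-inferred
back <- read.csv(csv_path, stringsAsFactors = FALSE)
print(back)
```

The trade-off is that CSV carries only tabular values, so R-specific details such as factor levels or attributes are not stored in the file.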


