Redis and its role in System Design

Redis is an open-source, in-memory data structure store used as a database, cache, and message broker. It is widely used for its fast performance, flexibility, and ease of use. 

  • What is Redis
  • Redis Data Types
  • Benefits of using Redis
  • Working Architecture of Redis
    • 1. Single Redis Instance
    • 2. Redis HA (High Availability)
    • 3. Redis Sentinel
    • 4. Redis Cluster / Redis Cluster Master-Slave Model
      • What is gossiping in the Redis cluster?
  • Types of Redis Persistence Models
    • 1. RDB (Redis Database) Persistence Model:
      • Snapshotting in RDB
      • Advantages of RDB (Redis Database)
      • Disadvantages of RDB (Redis Database)
    • 2. AOF (Append-Only File) Persistence Model
      • How AOF works?
      • Advantages of AOF
      • Disadvantages of AOF
    • 3. No Persistence Model
    • 4. Hybrid (RDB+AOF) Persistence Model
  • Availability, Consistency, and Partitioning in Redis
  • Can we use Redis as an alternative to the original DB?
  • Conclusion

Redis Data Storage Types

Redis allows developers to store, retrieve, and manipulate data in a variety of data structures: strings, bitmaps, bitfields, hashes, lists, sets, sorted sets, geospatial indexes, HyperLogLogs, and streams.

Redis data types

Benefits of using Redis

All Redis data resides in the server’s main memory, in contrast to databases such as PostgreSQL and SQL Server, which store most data on disk. Because no disk access is needed to serve a request, Redis can handle far more operations per second and respond far faster than a disk-based database. Typical read and write operations complete in under a millisecond, and a well-provisioned deployment can serve on the order of millions of operations per second.

Redis can serve as a cache in both monolithic and distributed architectures. Looking up a value by key in memory (much like a hash table) is far cheaper than running the equivalent query against the original SQL database, so placing Redis in front of the database speeds up data retrieval.
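The caching idea above is often implemented as the cache-aside pattern: check the cache first, and only on a miss query the database and store the result. The sketch below is a minimal illustration under stated assumptions; `TinyCache` is an in-process stand-in for Redis (not a Redis client), and `FakeDatabase`, `get_user`, and the 60-second TTL are hypothetical names and values chosen for the example.

```python
import time


class TinyCache:
    """In-process stand-in for a Redis cache: key -> (value, expiry time)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if expires_at is not None and time.monotonic() >= expires_at:
            del self._data[key]  # expired entries behave like a cache miss
            return None
        return value

    def set(self, key, value, ttl_seconds=None):
        expires_at = None if ttl_seconds is None else time.monotonic() + ttl_seconds
        self._data[key] = (value, expires_at)


class FakeDatabase:
    """Hypothetical stand-in for the slower primary SQL database."""

    def __init__(self):
        self.calls = 0

    def load_user(self, user_id):
        self.calls += 1
        return {"id": user_id, "name": f"user-{user_id}"}


def get_user(cache, db, user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached                  # cache hit: the database is not touched
    user = db.load_user(user_id)       # cache miss: fall back to the database
    cache.set(key, user, ttl_seconds)  # populate the cache for later reads
    return user


cache, db = TinyCache(), FakeDatabase()
get_user(cache, db, 7)
get_user(cache, db, 7)
print(db.calls)  # 1 (the second read was served from the cache)
```

The TTL bounds how long a stale value can be served after the database changes; choosing it is a trade-off between freshness and database load.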

How redis is useful for caching

Working Architecture of Redis

There are several Redis architectures, depending on the use case and scale:

1. Single Redis Instance

This is the most straightforward Redis deployment. It lets users set up and run a small instance that can help them grow and speed up their services. Its drawback is that the instance is a single point of failure: if it crashes or becomes unavailable, every call to Redis fails, which degrades the performance and speed of the overall system.

Single Redis Instance

2. Redis HA (High Availability)

  • Another popular setup is a main deployment with one or more secondary deployments that are kept in sync through replication. The secondary instances help scale reads from Redis and provide failover if the main instance is lost.

Redis HA (secondary failover)

3. Redis Sentinel

  • Sentinel is a distributed system: a group of sentinel processes work together and coordinate state to keep Redis highly available. Sentinel has the following responsibilities:
    • Monitoring: Checking that the main and secondary instances are working as expected.
    • Notification: Notifying system administrators about events occurring in the Redis instances.
    • Management during failure: Sentinel nodes can start a failover process if the primary instance has been unavailable for long enough and enough sentinel nodes (a quorum) agree that it is down.
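The "enough nodes agree" rule can be sketched as a simple quorum check. This is only an illustration of the idea, not Sentinel's actual implementation; the function name, the shape of `reports`, and the sentinel names are assumptions made for the example.

```python
def primary_considered_down(reports, quorum):
    """Return True when at least `quorum` sentinels report the primary as unreachable.

    reports: mapping of sentinel name -> True if that sentinel cannot reach the primary.
    """
    votes = sum(1 for unreachable in reports.values() if unreachable)
    return votes >= quorum


reports = {"sentinel-a": True, "sentinel-b": True, "sentinel-c": False}
print(primary_considered_down(reports, quorum=2))  # True
print(primary_considered_down(reports, quorum=3))  # False
```

Requiring agreement from several sentinels keeps a single sentinel with a bad network link from triggering an unnecessary failover.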

Redis sentinel

4. Redis Cluster / Redis Cluster Master-Slave Model: The Ultimate Architecture of Redis

The Redis cluster is the ultimate architecture of Redis. It allows for horizontal scaling of Redis.

In Redis cluster, we decide to spread the data we are storing across multiple machines, which is known as Sharding. So each such Redis instance in the cluster is considered a shard of the whole data.

Redis Cluster uses algorithmic sharding. In its simplest form, algorithmic sharding works as follows: to find the shard for a given key, we hash the key and take the result modulo the number of shards. Because the hash function is deterministic, a given key always maps to the same shard, so we can reason about where a particular key will be when we read it in the future. The weakness of this simple scheme is that changing the number of shards changes the mapping for most keys.
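A minimal sketch of hash-and-modulo sharding follows. It uses `zlib.crc32` as the deterministic hash purely for illustration (that choice, the key names, and the helper names are assumptions, not what Redis uses), and it also measures how many keys would land on a different shard if the shard count grew from 3 to 4.

```python
import zlib


def shard_for(key, num_shards):
    """Deterministic shard choice: hash the key, then take it modulo the shard count."""
    return zlib.crc32(key.encode("utf-8")) % num_shards


def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the number of shards changes."""
    moved = sum(1 for k in keys if shard_for(k, old_shards) != shard_for(k, new_shards))
    return moved / len(keys)


keys = [f"user:{i}" for i in range(10_000)]
print(shard_for("user:42", 3) == shard_for("user:42", 3))  # True: same key, same shard
print(moved_fraction(keys, 3, 4))  # a large fraction of keys change shard
```

With a uniform hash, roughly three quarters of the keys move when going from 3 to 4 shards, which is why plain modulo sharding makes adding shards expensive.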

Redis Cluster Architecture in System Design

To handle the addition of shards to the system (resharding), Redis Cluster maps all data to hash slots rather than directly to shards. There are 16384 hash slots, and a key’s slot is the CRC16 of the key modulo 16384. Each shard owns a subset of the slots, so when we add a new shard, we simply move hash slots from shard to shard, which simplifies adding new primary instances to the cluster. This can be done without downtime and with minimal performance impact. Let’s look at a simplified example:

Consider the number of hash slots to be 10,000 (numbered 0 to 9999).
Instance1 contains hash slots 0 to 4999.
Instance2 contains hash slots 5000 to 9999.

Now, let’s say we add another instance. The distribution of hash slots becomes:

Instance1 contains hash slots 0 to 3333.
Instance2 contains hash slots 3334 to 6666.
Instance3 contains hash slots 6667 to 9999.
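The same idea can be sketched with the real slot count of 16384. The code below is an illustration under assumptions: it computes the slot with Python's `binascii.crc_hqx` (a CRC-CCITT implementation that, with an initial value of 0, matches the CRC16 variant Redis Cluster specifies), it ignores hash tags, and the slot ranges, instance names, and helper names are made up for the example. It shows one possible reassignment in which the new instance takes slots from the tail of each existing instance.

```python
import binascii

NUM_SLOTS = 16384  # number of hash slots in Redis Cluster


def key_slot(key):
    """CRC16 of the key modulo 16384 (hash tags are ignored in this sketch)."""
    return binascii.crc_hqx(key.encode("utf-8"), 0) % NUM_SLOTS


def build_slot_map(ranges):
    """ranges: {instance name: (first slot, last slot)}, inclusive on both ends."""
    slot_map = [None] * NUM_SLOTS
    for name, (first, last) in ranges.items():
        for slot in range(first, last + 1):
            slot_map[slot] = name
    return slot_map


def move_slots(slot_map, first, last, target):
    """Reassign the inclusive slot range first..last to `target`, in place."""
    for slot in range(first, last + 1):
        slot_map[slot] = target


before = build_slot_map({"instance1": (0, 8191), "instance2": (8192, 16383)})

# Resharding: the new instance takes slots from the tail of each existing instance.
after = list(before)
move_slots(after, 5461, 8191, "instance3")
move_slots(after, 13653, 16383, "instance3")

moved = sum(1 for old, new in zip(before, after) if old != new)
print(moved, "of", NUM_SLOTS, "slots changed owner")  # 5462 of 16384 slots changed owner

slot = key_slot("user:42")
print(before[slot], "->", after[slot])  # the key moves only if its slot was reassigned
```

A key's slot never changes; only the slot's owner does. That is why only about a third of the data has to move here, instead of most of it as with plain modulo sharding.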

What is gossiping in the Redis cluster?

To determine the health of the entire cluster, Redis Cluster uses gossiping. Consider a cluster with 3 main instances, each with a secondary node. All these nodes constantly exchange information to determine which nodes are currently available to serve requests. If enough shards agree that instance1 is not responsive, they can promote instance1’s secondary to primary to keep the cluster healthy. As a general rule of thumb, an odd number of primary nodes with two replicas each gives the most robust and fault-tolerant setup.

Types of Redis Persistence Models

Redis provides two main persistence options to save data to disk: RDB and AOF. Each has its own advantages and disadvantages, and which one to use depends on the specific needs of the application. Persistence can also be disabled entirely, or the two options can be combined. The options are listed below:

1. RDB (Redis Database) Persistence Model:

RDB is a point-in-time snapshot of the Redis dataset stored as a binary file. The RDB file contains a representation of the dataset at a particular point in time and can be used to restore the dataset in case of a server crash or restart. RDB is very efficient in terms of disk space usage and performance, as it uses a binary format to store the data.

RDB can be configured to save the data periodically or based on certain conditions, such as a minimum number of write operations. The downside of RDB is that if the server crashes, all writes made since the last snapshot are lost.

Snapshotting in RDB

Snapshotting is a process in Redis persistence that creates a point-in-time snapshot of the entire dataset in memory and saves it to disk in a binary format. This snapshot can be used to restore the dataset in case of a server crash or restart. Redis supports snapshotting through its RDB persistence mechanism.

The snapshotting process works as follows:

  • Redis forks a child process from the parent process.
  • The child process sees the dataset exactly as it was at the moment of the fork. Thanks to the operating system’s copy-on-write memory sharing, the data is not physically copied up front.
  • The child process writes this view of the dataset to a temporary RDB file.
  • The child process renames the temporary file to the final RDB file name, atomically replacing any existing RDB file.
  • The child process terminates. The parent process keeps serving requests throughout.
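The "write to a temporary file, then rename" part of these steps can be sketched in a few lines. This is a simplified illustration and not how Redis itself is implemented: it does not fork, it stores JSON instead of the binary RDB format, and the function names and file names are assumptions.

```python
import json
import os
import tempfile


def save_snapshot(dataset, final_path):
    """Write the dataset to a temporary file, then atomically rename it into place."""
    directory = os.path.dirname(os.path.abspath(final_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="temp-", suffix=".snapshot")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(dataset, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())        # make sure the bytes reach the disk
        os.replace(tmp_path, final_path)  # atomic: readers see the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)           # do not leave a half-written temp file behind
        raise


def load_snapshot(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "dump.snapshot")
    save_snapshot({"counter": 1}, path)
    save_snapshot({"counter": 2}, path)  # replaces the previous snapshot
    print(load_snapshot(path))           # {'counter': 2}
```

Because the final file name only ever points at a completely written file, a crash in the middle of a save leaves the previous snapshot intact.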

Redis can be configured to perform snapshotting automatically based on save points, each of which combines a minimum amount of time elapsed since the last snapshot with a minimum number of write operations. For example, with save points such as save 60 10000, save 300 10, and save 900 1, a heavy workload that changes many keys is snapshotted about once per minute, a lighter one about every 5 minutes, and a nearly idle one every 15 minutes.
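The decision of when a snapshot is due can be sketched as a check over a list of (seconds, changes) rules. The three rules below are an assumed example configuration matching the one-minute, five-minute, and fifteen-minute tiers described above; the function and variable names are hypothetical.

```python
# Each rule reads: snapshot if at least `seconds` have passed AND at least
# `changes` write operations happened since the last snapshot.
SAVE_POINTS = [(900, 1), (300, 10), (60, 10000)]  # assumed example configuration


def snapshot_due(save_points, seconds_since_last_save, changes_since_last_save):
    """Return True if any save point is satisfied."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


print(snapshot_due(SAVE_POINTS, 60, 10000))  # True: heavy write load, one minute passed
print(snapshot_due(SAVE_POINTS, 60, 50))     # False: not enough changes yet
print(snapshot_due(SAVE_POINTS, 300, 50))    # True: moderate load, five minutes passed
print(snapshot_due(SAVE_POINTS, 900, 1))     # True: a single change, fifteen minutes passed
```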

snapshotting

Advantages of RDB (Redis Database)

  • RDB files are perfect for backups, as each one is a very compact single-file point-in-time representation of the Redis data. This allows us to easily restore different versions of the dataset in case of disasters.
  • RDB is very good for disaster recovery, since a single compact file can be transferred to remote data centers.

Disadvantages of RDB (Redis Database)

The main disadvantages of RDB are:

  • RDB is not optimal if we need to minimize the chance of data loss in case Redis stops working.
  • We can configure different save points at which an RDB file is produced. However, we will usually create an RDB snapshot every five minutes or more, so if Redis stops working without a correct shutdown for any reason, we should be prepared to lose the latest minutes of data.

2. AOF (Append-Only File) Persistence Model

AOF logs every write operation received by the server to a file, in a format that is relatively easy to read and parse. Replaying this log at startup reconstructs the dataset after a crash. AOF provides better durability than RDB, because every write is recorded rather than only those captured by the latest snapshot.

How often the log is flushed to disk is controlled by the fsync policy: on every write, once per second, or never (leaving it to the operating system). The trade-off is that AOF can lead to slower performance and larger disk usage than RDB, as it records every write operation.

The append-only file is an alternative, more durable strategy for Redis, since snapshotting alone is not very durable.
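The append-and-replay idea can be sketched with a tiny key-value store that logs every write before applying it. This is an illustration only: it uses JSON lines rather than the format Redis actually writes, supports just SET and DEL, and the class, method, and file names are assumptions.

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy key-value store that logs every write and rebuilds its state by replaying the log."""

    def __init__(self, path, fsync_always=False):
        self.path = path
        self.fsync_always = fsync_always
        self.data = {}
        self._replay()
        self._log = open(path, "a", encoding="utf-8")

    def _replay(self):
        if not os.path.exists(self.path):
            return
        with open(self.path, "r", encoding="utf-8") as log:
            for line in log:
                self._apply(json.loads(line))

    def _apply(self, command):
        op, key = command[0], command[1]
        if op == "SET":
            self.data[key] = command[2]
        elif op == "DEL":
            self.data.pop(key, None)

    def execute(self, *command):
        self._log.write(json.dumps(command) + "\n")  # append the write to the log
        self._log.flush()
        if self.fsync_always:
            os.fsync(self._log.fileno())  # strongest durability, slowest policy
        self._apply(command)

    def close(self):
        self._log.close()


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "appendonly.log")
    store = AppendOnlyStore(path, fsync_always=True)
    store.execute("SET", "a", "1")
    store.execute("SET", "b", "2")
    store.execute("DEL", "a")
    store.close()

    recovered = AppendOnlyStore(path)  # simulates a restart: state comes from the log
    print(recovered.data)              # {'b': '2'}
    recovered.close()
```

The `fsync_always` flag mirrors the trade-off between durability and speed: syncing on every write loses the least data but costs the most per operation.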

AOF can be turned on in the configuration file with:

appendonly yes

AOF

How AOF works?

Since every write is appended, the AOF keeps growing, so Redis periodically rewrites it in the background into a compact form. The rewrite works as follows:

  • Redis forks a child process from the parent process.
  • The child process sees the dataset as it was at the moment of the fork.
  • The child process writes that dataset to a new AOF in a temporary file.
  • The parent process accumulates all the new changes in an in-memory buffer (but at the same time it writes the new changes to the old append-only file, so if the rewriting fails, we are safe).
  • When the child is done rewriting the file, the parent process gets a signal and appends the in-memory buffer to the end of the file generated by the child.
  • Then, Redis atomically renames the new file to replace the old one and starts appending new data to the new file.
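The effect of these steps can be sketched on an in-memory list of commands. The sketch assumes a toy log of SET and DEL tuples and hypothetical helper names; it leaves out forking and file handling and only shows why the compacted log plus the buffered writes is equivalent to the old log.

```python
def replay(commands):
    """Apply SET/DEL commands in order and return the resulting dataset."""
    state = {}
    for op, key, *args in commands:
        if op == "SET":
            state[key] = args[0]
        elif op == "DEL":
            state.pop(key, None)
    return state


def rewrite_log(commands):
    """Return the shortest command list that rebuilds the same final dataset."""
    return [("SET", key, value) for key, value in replay(commands).items()]


old_log = [
    ("SET", "counter", "1"),
    ("SET", "counter", "2"),
    ("SET", "tmp", "x"),
    ("DEL", "tmp"),
    ("SET", "counter", "3"),
]

compact = rewrite_log(old_log)        # child: compact form of the dataset at fork time
buffered = [("SET", "name", "redis")]  # parent: writes that arrived during the rewrite
new_log = compact + buffered          # parent appends its buffer to the child's file

print(compact)                                           # [('SET', 'counter', '3')]
print(replay(new_log) == replay(old_log + buffered))     # True
```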

Advantages of AOF

  • AOF is much more durable than RDB, and we can choose among different fsync policies: no fsync at all, fsync every second, or fsync at every query.
  • It is an append-only log, so there are no seeks and no corruption problems if there is a power outage.
  • The redis-check-aof tool can fix a half-written command if the log ends abruptly, for example because the disk filled up.

Disadvantages of AOF

The main disadvantages of AOF are:

  • AOF files are generally bigger than the equivalent RDB files for the same dataset.
  • AOF can be slower than RDB, depending on the exact fsync policy.
  • AOF reduces the risk of data loss but does not eliminate it: depending on the fsync policy, the most recent writes can still be lost, although typically far fewer than in RDB mode. The price is that RDB is faster.

Which one to choose: Redis Database (RDB) or Append-Only File (AOF)?

The general guideline is to use both persistence methods if we want a degree of data safety comparable to what a conventional database such as PostgreSQL can provide. If we care a lot about our data but can still live with a few minutes of data loss in case of disasters, we can simply use RDB alone.

3. No Persistence Model

Redis also provides an option to disable persistence altogether, in which case the data is stored only in memory. This option is useful when Redis is used as a cache, and the data can be regenerated if lost.

4. Hybrid (RDB+AOF) Persistence Model

Redis provides an option to use both RDB and AOF persistence together, which is known as hybrid persistence. This option provides the benefits of both RDB and AOF, as the AOF log is used to replay write operations after a restart, and the RDB snapshot is used to restore the dataset at a specific point in time.

Availability, Consistency, and Partitioning in Redis

Here’s a brief overview of how Redis handles availability, consistency, and partitioning:

  • Availability: Redis uses a master-slave replication model to ensure high availability. A single “master” node accepts all writes, and multiple “slave” nodes replicate data from the master asynchronously. In the event of a failure of the master node, one of the slave nodes can be promoted to become the new master.
  • Consistency: On a single node, each command executes atomically, so a read of a key that follows a write to it on the same node sees the new value. Because replication is asynchronous, however, a read served by a replica may briefly return stale data, and writes acknowledged by the master but not yet replicated can be lost during a failover. Redis therefore does not guarantee strong consistency across nodes: different nodes can temporarily see different views of the data, which also affects operations that span multiple keys.
  • Partitioning: Redis supports sharding, which allows the dataset to be partitioned across multiple nodes. Redis uses a hash-based partitioning scheme, where each key is assigned to a specific node based on its hash value. Redis also provides a mechanism for redistributing data when nodes are added to or removed from the cluster.
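The consistency caveat that comes with replication can be illustrated with a toy primary and replica. The sketch assumes replication is asynchronous, meaning a write is acknowledged before the replica has received it; the class and function names are hypothetical and no networking is involved.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.data = {}
        self.pending = deque()  # writes not yet shipped to the replica

    def set(self, key, value):
        self.data[key] = value
        self.pending.append((key, value))  # acknowledged before it is replicated


class Replica:
    def __init__(self):
        self.data = {}


def replicate(primary, replica):
    """Ship all pending writes to the replica (happens some time after the write)."""
    while primary.pending:
        key, value = primary.pending.popleft()
        replica.data[key] = value


primary, replica = Primary(), Replica()
primary.set("balance", 100)

print(primary.data.get("balance"))  # 100: the primary sees its own write
print(replica.data.get("balance"))  # None: the replica has not caught up yet

replicate(primary, replica)
print(replica.data.get("balance"))  # 100: the replica is consistent again
```

If the primary failed before `replicate` ran, the promoted replica would never see the write, which is the data-loss window that asynchronous replication leaves open.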

Can we use Redis as an alternative to the original DB?

Based on the above discussion, Redis may seem to be a better option than the original DB, as it provides faster retrievals. Even so, Redis is usually not used as the primary database in a system.

Redis should generally play a supporting role that improves the performance of the overall system, because in terms of the CAP theorem a replicated Redis deployment guarantees neither strong consistency nor continuous availability.

There is also the question of durability: if the server crashes, we lose any data that exists only in memory. For some apps it is okay to lose this data in a crash, but for others it is really important to reload the Redis data immediately after the server restarts.

Conclusion

Overall, Redis is a powerful tool for system design, but it may not be suitable for all use cases. It is important to carefully consider its limitations when deciding whether to use Redis in a particular application.



Last Updated : 27 Mar, 2023