CAP Theorem in System Design

The CAP Theorem is a core concept in system design. It describes a trade-off that has to be considered whenever we design a distributed system. In this article, we will discuss the CAP Theorem and how it helps in designing an efficient system.

Important Topics for the CAP Theorem in System Design

What is the CAP Theorem in System Design?
Properties of CAP Theorem in System Design
Trade-Offs in the CAP Theorem
Example to Understand the CAP Theorem
Use Cases of the CAP Theorem in System Design
Advantages of CAP Theorem in System Design
Disadvantages of CAP Theorem in System Design

1. What is the CAP Theorem in System Design?

[Figure: CAP Theorem]

The CAP theorem states that a networked shared-data system (that is, a distributed system) can provide only two of the three desired characteristics at the same time: Consistency, Availability, and Partition tolerance.

CAP Theorem is also known as Brewer’s theorem and it was introduced by the computer scientist Eric Brewer at the Symposium on Principles of Distributed Computing in 2000.

The theorem provides a way of thinking about the trade-offs involved in designing and building distributed systems.
It helps to explain why certain types of systems may be more appropriate for certain use cases.
According to Brewer, the theorem states that a distributed system can have at most two of these guarantees.

2. Properties of CAP Theorem in System Design

The three distributed-system characteristics to which the CAP Theorem refers are:

1. Consistency

Consistency means that all clients see the same data at the same time, no matter which node they connect to in a distributed system. With eventual consistency, the guarantee is looser: clients will eventually see the same data on all the nodes at some point in the future, but not necessarily right after a write.

[Figure: Consistency]

Below is the explanation of the above diagram:

All nodes in the system see the same data at the same time. This is because the nodes are constantly communicating with each other and sharing updates.
Any changes made to the data on one node are immediately propagated to all other nodes, ensuring that everyone has the same up-to-date information.
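
The difference between the strong guarantee described above and eventual consistency can be sketched in a few lines of Python. This is a minimal single-process illustration, not how any particular database implements replication: the class names `Node`, `StronglyConsistentCluster`, and `EventuallyConsistentCluster`, and the `propagate` method standing in for background replication, are all assumptions made for this example.

```python
class Node:
    """One replica holding a small key-value store (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class StronglyConsistentCluster:
    """A write returns only after every replica has applied it."""

    def __init__(self, nodes):
        self.nodes = nodes

    def write(self, key, value):
        for node in self.nodes:  # synchronous replication to all replicas
            node.data[key] = value

    def read(self, node_index, key):
        return self.nodes[node_index].data.get(key)


class EventuallyConsistentCluster:
    """A write is applied to one replica and queued for the others."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.pending = []  # (target node, key, value) updates not yet delivered

    def write(self, node_index, key, value):
        self.nodes[node_index].data[key] = value
        for i, node in enumerate(self.nodes):
            if i != node_index:
                self.pending.append((node, key, value))

    def read(self, node_index, key):
        return self.nodes[node_index].data.get(key)

    def propagate(self):
        """Deliver all queued updates (stands in for background replication)."""
        for node, key, value in self.pending:
            node.data[key] = value
        self.pending.clear()


strong = StronglyConsistentCluster([Node("n1"), Node("n2")])
strong.write("x", 1)
strong_reads = [strong.read(0, "x"), strong.read(1, "x")]  # both replicas agree

eventual = EventuallyConsistentCluster([Node("n1"), Node("n2")])
eventual.write(0, "x", 1)
stale_read = eventual.read(1, "x")   # n2 has not received the write yet
eventual.propagate()
fresh_read = eventual.read(1, "x")   # after propagation the replicas converge
```

In the eventually consistent cluster, a client that reads from the second node before `propagate()` runs gets no value at all, while the strongly consistent cluster never exposes that window.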

2. Availability

Availability means that all non-failing nodes in a distributed system return a response for all read and write requests in a bounded amount of time, even if one or more other nodes are down.

[Figure: Availability]

Below is the Explanation of the above Diagram:

Users send requests and receive responses, even though the diagram does not show specific network components. This implies that the system is available and functioning.
Every request receives a response, whether successful or not. This is a crucial aspect of availability, as it guarantees that users always get feedback and aren’t left hanging.

3. Partition Tolerance

Partition Tolerance means that the system continues to operate despite arbitrary message loss or failure in parts of the system. Distributed systems guaranteeing partition tolerance can gracefully recover from partitions once the partition heals.

[Figure: Partition Tolerance]

Below is the Explanation of the above Diagram:

The diagram addresses network failures, a common cause of partitions. It suggests that the system is designed to function even when parts of the network become unreachable.
It shows network disruptions and visually demonstrates that the system remains operational. This is a key characteristic of partition tolerance.
The system can adapt to arbitrary partitioning, meaning it can handle unpredictable network failures without complete failure.
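
To make "message loss" and "recovery once the partition heals" concrete, here is a small sketch of a simulated network. Everything in it is hypothetical: the `Network` class, its `partition` and `heal` methods, and the choice to hold back undeliverable messages (standing in for a sender that retries) are assumptions for illustration only. A real network simply drops the messages and leaves retrying to the nodes.

```python
class Network:
    """Delivers messages between named nodes unless a partition separates them."""

    def __init__(self):
        self.inboxes = {}
        self.blocked = set()     # unordered pairs of nodes that cannot talk
        self.undelivered = []    # messages held back while a partition lasts

    def register(self, name):
        self.inboxes[name] = []

    def partition(self, a, b):
        """Cut the link between nodes a and b."""
        self.blocked.add(frozenset((a, b)))

    def send(self, sender, receiver, message):
        if frozenset((sender, receiver)) in self.blocked:
            self.undelivered.append((sender, receiver, message))
            return False         # the message did not get through
        self.inboxes[receiver].append(message)
        return True

    def heal(self):
        """Remove all partitions and retry the messages that were held back."""
        self.blocked.clear()
        held, self.undelivered = self.undelivered, []
        for sender, receiver, message in held:
            self.send(sender, receiver, message)


net = Network()
net.register("A")
net.register("B")

net.partition("A", "B")
sent_during = net.send("A", "B", "update-1")   # fails while partitioned
inbox_during = list(net.inboxes["B"])

net.heal()
inbox_after = list(net.inboxes["B"])           # the update arrives after healing
```

Node A keeps running during the partition; only its messages to B are affected, and B catches up once the link is restored.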

3. Trade-Offs in the CAP Theorem

We can classify the systems into the following three categories:

[Figure: Trade-off in the CAP Theorem]

1. CA System

A CA System delivers consistency and availability across all the nodes. It can’t do this if there is a partition between any two nodes in the system, and therefore it doesn’t support partition tolerance.

2. CP System

A CP System delivers consistency and partition tolerance at the expense of availability. When a partition occurs between two nodes, the system shuts down the inconsistent node (that is, makes it unavailable) until the partition is resolved. Commonly cited examples of such databases are MongoDB, Redis, and HBase.

3. AP System

An AP System delivers availability and partition tolerance at the expense of consistency. When a partition occurs, all nodes remain available, but those at the wrong end of a partition might return an older version of the data than others. (When the partition is resolved, AP databases typically resync the nodes to repair all the inconsistencies in the system.) Examples: CouchDB, Cassandra, and DynamoDB.

4. Example to Understand the CAP Theorem

In the figure above,

We have a straightforward distributed system where S1 and S2 are two servers that can talk to each other. We assume the system is partition tolerant, and we will show that during a partition it can be either consistent or available, but not both.
Suppose there is a network failure and S1 and S2 cannot talk to each other. Now assume that the client makes a write to S1. The client then sends a read to S2.
Since S1 and S2 cannot talk, they have different views of the data. If the system has to remain consistent, S2 must deny the request and thus give up on availability.
If the system has to remain available, S2 must answer with data that may be stale and thus give up on consistency. This illustrates the CAP Theorem.
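
The scenario above can be replayed as a short simulation. This is a sketch under stated assumptions: the `Server` class, the `can_reach_peer` flag that models the partition, the `mode` argument, and the dictionary-shaped responses are all invented for this example.

```python
class Server:
    """One of the two servers, S1 or S2 (hypothetical model)."""

    def __init__(self, name, value):
        self.name = name
        self.value = value
        self.peer = None
        self.can_reach_peer = True   # False while the network partition lasts

    def write(self, value):
        self.value = value
        if self.can_reach_peer:
            self.peer.value = value  # replicate to the other server

    def read(self, mode):
        # A CP server refuses to answer when it cannot confirm with its peer
        # that it holds the latest value; an AP server answers anyway.
        if mode == "CP" and not self.can_reach_peer:
            return {"ok": False, "error": "unavailable during partition"}
        return {"ok": True, "value": self.value}


def run_scenario(mode):
    s1, s2 = Server("S1", "old"), Server("S2", "old")
    s1.peer, s2.peer = s2, s1

    # Network failure: S1 and S2 cannot talk to each other.
    s1.can_reach_peer = s2.can_reach_peer = False

    s1.write("new")        # the client writes to S1
    return s2.read(mode)   # the client then reads from S2


cp_result = run_scenario("CP")   # request denied: consistent, not available
ap_result = run_scenario("AP")   # stale answer: available, not consistent
```

With `mode="CP"` the read is rejected, and with `mode="AP"` the read succeeds but returns `"old"` even though `"new"` was already written to S1.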

5. Use Cases of the CAP Theorem in System Design

Here we will see how each of these trade-offs applies in real applications:

5.1 Banking Transactions (CP System)

Problem Statement:

Imagine a bank teller updating your account balance on a secure computer system. This system prioritizes consistency (C) and partition tolerance (P).

Why do we use a CP System?

Each transaction must be accurately reflected across all servers (consistency) even if individual branches face network disruption (partition tolerance).
Inconsistency could lead to double spending or incorrect balances, unacceptable situations in financial transactions.
While data is always consistent, some users might experience momentary delays during network issues due to stricter synchronization requirements.
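
One common way to obtain this CP behaviour is to require a majority (quorum) of replicas for every operation. The sketch below is a simplified illustration of that idea, not the design of any real banking system: `Replica`, `QuorumAccount`, the `reachable` flag, and the version counter are assumptions, and there is a single coordinator with no concurrent requests. Production systems typically rely on a consensus protocol such as Raft or Paxos instead.

```python
class Replica:
    """One copy of the account (hypothetical)."""

    def __init__(self, balance):
        self.version = 0
        self.balance = balance
        self.reachable = True   # set to False to simulate a partitioned replica


class QuorumAccount:
    """Serves requests only when a majority of replicas can be reached."""

    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1

    def _live_majority(self):
        live = [r for r in self.replicas if r.reachable]
        if len(live) < self.quorum:
            raise RuntimeError("unavailable: majority of replicas unreachable")
        return live

    def balance(self):
        # Any two majorities overlap, so the highest version is the latest write.
        newest = max(self._live_majority(), key=lambda r: r.version)
        return newest.balance

    def withdraw(self, amount):
        live = self._live_majority()
        newest = max(live, key=lambda r: r.version)
        if amount > newest.balance:
            raise ValueError("insufficient funds")
        new_version = newest.version + 1
        new_balance = newest.balance - amount
        for replica in live:
            replica.version, replica.balance = new_version, new_balance
        return new_balance


replicas = [Replica(100) for _ in range(3)]
account = QuorumAccount(replicas)

replicas[2].reachable = False
after_first = account.withdraw(30)   # two of three replicas: still a majority

replicas[1].reachable = False
try:
    account.withdraw(30)             # only one replica left: refuse the request
    rejected = False
except RuntimeError:
    rejected = True

# The partition heals, but a different replica drops out.
replicas[1].reachable = replicas[2].reachable = True
replicas[0].reachable = False
final_balance = account.balance()    # the stale replica is outvoted by version
```

The second withdrawal is refused instead of being applied on a minority, which is the availability cost the section describes, and the stale replica never causes an incorrect balance to be reported.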

5.2 Social Media Newsfeed (AP System)

Problem Statement:

Think of your newsfeed on a social media platform constantly updating with new posts and stories. This system prioritizes availability (A) and partition tolerance (P).

Why do we use an AP System?

Users expect immediate access to their newsfeeds (availability) even if parts of the network are temporarily down (partition tolerance). Slight inconsistencies in data, like seeing a friend’s post slightly sooner on one device than another, are tolerable in this context.
Data might not be perfectly consistent across all servers immediately after updates. Users might occasionally see slightly different versions of their newsfeed before data propagates across the system.
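
A simple way to let replicas diverge and later converge is to store the feed as a set of posts and merge replicas with a set union (a grow-only set, one of the simplest conflict-free replicated data types). The following is a sketch of that technique only: the `FeedReplica` class, the tuple layout of a post, and the integer timestamps are assumptions, and deletions or edits are not handled.

```python
class FeedReplica:
    """One copy of a user's newsfeed (hypothetical)."""

    def __init__(self):
        self.posts = set()   # (timestamp, author, text) tuples

    def add_post(self, timestamp, author, text):
        """Always accepted locally, even if other replicas are unreachable."""
        self.posts.add((timestamp, author, text))

    def timeline(self):
        """Newest posts first."""
        return [text for _, _, text in sorted(self.posts, reverse=True)]

    def merge(self, other):
        """Union is commutative and idempotent, so merge order does not matter."""
        self.posts |= other.posts


east, west = FeedReplica(), FeedReplica()

# During a partition each replica accepts different posts.
east.add_post(1, "ana", "hello")
west.add_post(2, "bo", "hi")
east_before = east.timeline()
west_before = west.timeline()

# After the partition heals, the replicas exchange their state.
east.merge(west)
west.merge(east)
east_after = east.timeline()
west_after = west.timeline()
```

Before the merge the two users see different feeds; afterwards both replicas return the same timeline without any post being lost.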

5.3 Online Shopping Cart (Hybrid AP and CP System)

Problem Statement:

Imagine an online shopping cart, adding items, and checking out. This system might employ a hybrid approach balancing CAP trade-offs.

Why do we use both AP and CP Systems?

Adding items to the cart could be available and partition-tolerant (AP), allowing uninterrupted browsing even if temporary network glitches occur.
But when confirming the order and processing payment, the system might switch to a CP mode, ensuring consistency across all servers before finalizing the transaction.
The system requires careful design to switch seamlessly between availability and consistency modes at the right points to handle different stages of the user journey effectively.
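
The switch between the two modes can be expressed as a different policy per operation. The sketch below assumes a single user's cart replicated on a few nodes; `CartService`, its `reachable` list, and the rule that checkout needs every replica are hypothetical simplifications chosen to keep the example short.

```python
class CartService:
    """A single user's cart replicated across several nodes (hypothetical)."""

    def __init__(self, replica_count):
        self.carts = [set() for _ in range(replica_count)]
        self.reachable = [True] * replica_count  # links between the replicas

    def add_item(self, replica, item):
        """AP path: accept the write on whichever replica the user reached."""
        self.carts[replica].add(item)
        return "added"

    def checkout(self):
        """CP path: proceed only when all replicas can be reached and merged."""
        if not all(self.reachable):
            raise RuntimeError("checkout unavailable: replicas unreachable")
        merged = set().union(*self.carts)
        self.carts = [set(merged) for _ in self.carts]
        return sorted(merged)


service = CartService(replica_count=2)
service.reachable[1] = False                 # a partition isolates replica 1

first_add = service.add_item(0, "book")      # still accepted
second_add = service.add_item(1, "pen")      # accepted on the other side too

try:
    service.checkout()                       # refused while partitioned
    checkout_blocked = False
except RuntimeError:
    checkout_blocked = True

service.reachable[1] = True                  # the partition heals
order = service.checkout()                   # carts are merged, then finalized
```

Adding items never fails, while checkout waits until the replicas agree, so the order contains the items added on both sides of the partition.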

6. Advantages of CAP Theorem in System Design

Provides a Framework for Decision-Making:
- It clarifies the fundamental choices involved in designing distributed systems – choosing two out of three key properties (Consistency, Availability, and Partition Tolerance).
- This framework forces engineers to explicitly prioritize their system’s goals, leading to more informed and conscious design decisions.
Promotes Understanding of Trade-offs:
- By outlining the limitations of achieving all three CAP properties simultaneously, the theorem prevents unrealistic expectations and encourages consideration of the inherent compromises involved.
- This awareness of trade-offs enables designers to make balanced choices that prioritize the most critical properties for their specific context.
Guides System Architecture and Technology Selection:
- Understanding the CAP implications helps choose appropriate database technologies, replication strategies, and communication protocols based on the desired properties.
- This knowledge informs decisions about using eventually consistent NoSQL databases for high availability or strongly consistent RDBMS for critical data integrity.
Enhances System Resilience and Performance:
- Focusing on specific CAP combinations leads to tailored solutions for resilience against network failures (partition tolerance) or ensuring responsiveness under heavy load (availability).
- This targeted approach results in systems that are better equipped to handle real-world challenges and maintain optimal performance under specific conditions.
Fosters Communication and Collaboration:
- The CAP theorem establishes a common language for discussing and understanding the critical factors in distributed system design.
- This shared terminology facilitates communication between developers, architects, and stakeholders, leading to better collaboration and decision-making.
Inspires Innovation and Exploration:
- While outlining limitations, the CAP theorem also motivates research and development of new technologies and protocols that attempt to push the boundaries of consistency, availability, and partition tolerance.
- This ongoing exploration leads to advancements in distributed systems theory and practical solutions that can overcome certain trade-offs in specific scenarios.

7. Disadvantages of CAP Theorem in System Design

Oversimplification:
- The CAP theorem focuses on three core properties, but real-world systems might involve other crucial factors like performance, data durability, and latency. Neglecting these aspects can lead to incomplete or suboptimal design solutions.
Abstract Trade-offs:
- The CAP theorem defines theoretical bounds, but choosing the appropriate CAP combination for a specific application can be challenging. Quantifying the acceptable level of inconsistency or downtime for different scenarios requires careful analysis and may not always have clear-cut answers.
Lack of Guidance for Hybrid Systems:
- While the CAP theorem helps choose between prioritized combinations, it doesn’t explicitly provide guidance for designing systems that might require switching between different prioritizations at different stages or for specific data subsets.
Potential Misinterpretation:
- Misunderstanding the nuances of the CAP theorem can lead to misinformed decisions. For example, prioritizing availability might be misinterpreted as compromising data integrity, resulting in unnecessary sacrifices in consistency when it’s not justified.

8. Conclusion

The CAP theorem is a valuable tool, but it’s important to be aware of its limitations and apply it critically within the context of your specific system design challenges. Use its insights to make informed decisions, explore hybrid approaches when necessary, and stay open to adapting your solutions as needs and technologies evolve. Despite highlighting inherent limitations, it serves as a decision-making framework for designing reliable, efficient, and user-centric distributed systems: it clarifies trade-offs, informs technology choices, enhances system resilience, and can inspire further innovation in the field.
