Open In App

The CAP Theorem in DBMS

The CAP theorem, originally introduced as the CAP principle, can be used to explain some of the competing requirements in a distributed system with replication. It is a tool used to make system designers aware of the trade-offs while designing networked shared-data systems. 

The three letters in CAP refer to three desirable properties of distributed systems with replicated data: consistency (among replicated copies), availability (of the system for read and write operations) and partition tolerance (in the face of the nodes in the system being partitioned by a network fault). 



The CAP theorem states that it is not possible to guarantee all three of the desirable properties – consistency, availability, and partition tolerance at the same time in a distributed system with data replication. 

The theorem states that networked shared-data systems can only strongly support two of the following three properties: 
 



The use of the word consistency in CAP and its use in ACID do not refer to the same identical concept. 

In CAP, the term consistency refers to the consistency of the values in different copies of the same data item in a replicated distributed system. In ACID, it refers to the fact that a transaction will not violate the integrity constraints specified on the database schema.

The CAP theorem states that distributed databases can have at most two of the three properties: consistency, availability, and partition tolerance. As a result, database systems prioritize only two properties at a time.

The following figure represents which database systems prioritize specific properties at a given time:

CAP theorem with databases examples

           The system prioritizes availability over consistency and can respond with possibly stale data.

           Example databases: Cassandra, CouchDB, Riak, Voldemort.

           The system prioritizes availability over consistency and can respond with possibly stale data.
           The system can be distributed across multiple nodes and is designed to operate reliably even in the face of network partitions.
           Example databases: Amazon DynamoDB, Google Cloud Spanner.

           The system prioritizes consistency over availability and responds with the latest updated data.
           The system can be distributed across multiple nodes and is designed to operate reliably even in the face of network partitions.
           Example databases: Apache HBase, MongoDB, Redis.

           It’s important to note that these database systems may have different configurations and settings that can change their behavior with           respect to consistency, availability, and partition tolerance. Therefore, the exact behavior of a database system may depend on its            configuration and usage.

for example, Neo4j, a graph database, the CAP theorem still applies. Neo4j prioritizes consistency and partition tolerance over availability, which means that in the event of a network partition or failure, Neo4j will sacrifice availability to maintain consistency.

Article Tags :