Weak Levels of Consistency

Every transaction in a Database Management System must follow the ACID properties, which allow a sequence of basic operations to execute on a database without violating any constraints and to leave a correct, consistent result on completion. Consistency, the ‘C’ in ACID, is one of these properties and is defined for a Non-Distributed Database System. In this article, however, we discuss Consistency Levels for a Distributed Database System and hence follow the CAP Theorem.

Note that consistency, as used here, does not refer to the ‘C’ in ACID. It is a property of a Distributed Database System and corresponds to the ‘C’ in CAP, as used in the CAP Theorem. Consistency in the CAP sense means that a concurrent, distributed system follows rules that make it behave, from the user's point of view, like a single-threaded, centralised database system. Since we discuss concurrent threads, any transaction that runs as part of a thread must still adhere to the ACID properties within itself.

Consistency is hence a property which ensures that every copy of the database applies the same series of operations in the same specified order, so that all copies, including the original database, reach the same final state.

Here, all the READ operations on a data item at any particular instant have only one possible result and hence must return the same value. This must hold for thread-based concurrent execution in a distributed database system.
Each READ must reflect the most recent WRITE of that data item.
This condition holds irrespective of which server processed that particular WRITE.
Even though READ and WRITE operations are executed at distinct nodes of a distributed system, the result must not be affected; this creates an additional responsibility to maintain a global record of the order in which READ and WRITE operations execute for every variable.
This procedure lets the nodes exchange information and present the same order of results, which ensures consistency.

This is the definition of almost perfect consistency, which is termed Atomic Consistency. Absolutely perfect consistency is termed Strict Consistency, but it is impractical to implement and hence remains largely theoretical. Both levels are discussed in detail below.

We begin with the basic definition of Consistency and of a Consistency Level. We then list the distinct levels of consistency, categorize each level from the strongest to the weakest, and conclude with a brief discussion of Degree-Two Consistency and Cursor Stability.

Difference between Strong Consistency and Weak Consistency:

S.No	Strong Consistency	Weak Consistency
1.	The current state of the database follows a single, universally agreed sequence of state changes.	Different views of the database state may see different, mismatched updates to the state.
2.	Strict Consistency, Atomic Consistency, and Sequential Consistency are stronger levels of consistency.	Causal Consistency and Eventual Consistency are weaker levels of consistency.
3.	The end user is unaware that the database is replicated.	The application developer must be explicitly aware of the replicated nature of the data items in the database.
4.	The user sees the database as if only one copy existed, which continuously reflects each state transition in a forward direction as operations are applied.	The developer must adapt to the replicated nature of the data, which makes development more complex than under strong consistency.

Consistency Levels:

In an ACID-based database system, each Isolation Level gives a transaction a specific degree of isolation, and the chosen level significantly affects the performance of the database. Similarly, a vast majority of Database Management Systems offer a range of consistency levels, so that a level can be chosen according to the needs of an application. A stronger level adds correctness at the cost of performance, that is, at the cost of the CPU utilisation and throughput the system could otherwise achieve.

The purpose of defining distinct levels of consistency is to specify rules that avoid conflicts among individual but concurrent threads which access a shared memory space, and hence shared data. Here we focus on the basics, using only READ and WRITE operations over individual data items.

Note that, as with weaker Isolation Levels, relatively weaker Consistency Levels yield improved performance in terms of total CPU utilisation.

Different levels at which consistency is being offered are as below:

Sequential Consistency
Strict Consistency
Atomic Consistency or Linear Consistency
Causal Consistency
Eventual Consistency

For all the examples below, we assume that the initial value of both X and Y is 0. Also, note the following notation, used throughout, for a given thread ‘T’ and data item ‘V’:

R:V = x ⇒ Read the value of 'V'. The value of 'V' is 'x' at the instant.

W:V = y ⇒ Write the value of 'V'. The value of 'V' is updated to 'y' at the instant.

Each level is discussed in detail below:

1. Sequential Consistency:

Key characteristics for Sequential Consistency are as follows:

The most important condition, which forms the basis of the definition of Sequential Consistency, is that all WRITE operations are placed in a single global order across the database system.
Every executing thread must observe this global order of WRITE operations, irrespective of which thread executed each operation and which data items it modified.
However, the global order need not be the same as the real-time sequence in which the operations were executed.
Any ordering of the WRITE operations is acceptable, as long as all the threads agree on it and follow it.
The only restriction is that the writes and reads originating from the same thread of execution must not be reordered.

T₁	W:X = 100
T₂		W:Y = 200
T₃			R:Y = 200	R:X = 0	R:X = 100
T₄			R:Y = 200	R:X = 100

Schedule S-1

S-1 follows Sequential Consistency and Causal Consistency but does not adhere to the conditions specified for Linear Consistency and Strict Consistency.

Observe that threads T₃ and T₄ both see Y updated before X. Even though T₁ and T₂ show that, in real time, Y was updated after X, a global ordering need not follow the real-time ordering, as specified above. Hence, the schedule is still sequentially consistent, because all the threads agree on one order of WRITE operations for S-1.

T₁	W:X = 100
T₂		W:Y = 200
T₃			R:Y = 200	R:X = 0	R:X = 100
T₄			R:Y = 0	R:X = 100	R:Y = 200

Schedule S-2

S-2 follows only Causal Consistency and does not adhere to the conditions specified for Sequential Consistency, Linear Consistency or Strict Consistency.

S-2 does not adhere to the definition of Sequential Consistency because T₄ does not follow the global ordering followed by T₃:

According to T₃, the WRITE of Y comes before the WRITE of X.
Contrary to T₃, T₄ sees the update to X earlier than the update to Y.
The schedule violates Sequential Consistency because the threads do not all agree on one ordering of the WRITE operations.

Note that the consistency levels, discussed next, exist as extensions of Sequential Consistency, where real-time constraints are specified on all the WRITE operations.

2. Strict Consistency:

Key characteristics for Strict Consistency are as follows:

Strict Consistency is the highest, and hence the strongest, level of consistency.
It requires the WRITE operations to be ordered according to the real-time sequence in which they occur.
A strictly consistent schedule requires that an earlier WRITE operation is seen before a later WRITE operation.
Also, every READ operation must return the value of the most recent WRITE in real-time order, irrespective of which thread of execution initiated that WRITE.
It is not feasible to implement Strict Consistency in real distributed systems, as it is impossible to obtain global agreement on a precise current time, and hence on the order of execution, across a large distributed database.

T₁	W:X = 100
T₂		W:Y = 200
T₃			R:Y = 200	R:X = 100
T₄			R:X = 100	R:Y = 200

Schedule S-3

S-3 follows Sequential Consistency, Strict Consistency, Atomic Consistency and Causal Consistency. Also note that S-1 and S-2 are not Strictly Consistent.

S-1 and S-2 do not follow Strict Consistency because each of them contains a READ of X = 0 (R:X = 0) or a READ of Y = 0 (R:Y = 0) after the value of X or Y has already been updated to a new value. S-3 is strictly consistent because all the READ operations reflect the most recent WRITE operation for each variable.
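
The rule can be expressed as a short check. This is only a sketch under stated assumptions: every operation is treated as instantaneous, the time of an operation is taken to be its column position in the schedule table, and the name is_strictly_consistent is a hypothetical helper, not part of any library.

```python
def is_strictly_consistent(ops, initial=0):
    """ops is a list of (time, kind, var, value) tuples.

    Every READ must return the value of the most recent WRITE in real time.
    At equal times WRITEs are applied before READs (a modelling assumption).
    """
    latest = {}
    ordered = sorted(ops, key=lambda op: (op[0], op[1] != "W"))
    for _time, kind, var, value in ordered:
        if kind == "W":
            latest[var] = value
        elif latest.get(var, initial) != value:
            return False  # a READ returned a stale value
    return True


# Times are the column positions of the schedule tables.
S1 = [
    (1, "W", "X", 100), (2, "W", "Y", 200),
    (3, "R", "Y", 200), (4, "R", "X", 0), (5, "R", "X", 100),  # T3
    (3, "R", "Y", 200), (4, "R", "X", 100),                    # T4
]
S3 = [
    (1, "W", "X", 100), (2, "W", "Y", 200),
    (3, "R", "Y", 200), (4, "R", "X", 100),  # T3
    (3, "R", "X", 100), (4, "R", "Y", 200),  # T4
]

print(is_strictly_consistent(S3))  # True
print(is_strictly_consistent(S1))  # False: T3 reads X = 0 after W:X = 100
```

The check relies on a single global clock for the time values, which is exactly what a real distributed system cannot provide.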

3. Atomic Consistency or Linear Consistency:

A schedule that satisfies the conditions for Atomic Consistency or Linear Consistency is said to be a Linearizable Schedule. Linear Consistency is the highest level of consistency that can be used in practice for distributed database systems. It is also known as Atomic Consistency, the term used in the context of the CAP Theorem.

Atomic Consistency is very similar to Strict Consistency; the differences between the two are as follows:

The linearizability model acknowledges that there is a significant time period t₂ - t₁ between t = t₁, when an operation is submitted to the system, and t = t₂, when the system acknowledges that the operation has completed.
In a distributed system, a WRITE operation can be propagated and applied at all the appropriate locations, which may include replicated copies of the database, during this time period.
Atomic Consistency places no ordering constraint on operations whose start and end timestamps overlap.
An ordering constraint exists only for operations whose time intervals do not overlap; only in such cases must the earlier WRITE be seen before the later WRITE.

T₁	W:X = 100
T₂		W:Y = 200
T₃	R:X = 0		R:Y = 200
T₄			R:X = 100	R:Y = 200

Schedule S-4

S-4 follows Atomic Consistency and hence S-4 is Linearizable. Moreover, S-4 follows Sequential Consistency as well as Causal Consistency. Note that S-4 is not Strictly Consistent.

The schedule S-4 does not follow Strict Consistency because the READ of X by T₃ is initiated a bit after the WRITE of X by T₁, yet R:X returns 0. Nonetheless, the schedule is linearizable because this READ of X by T₃ and the WRITE of X by T₁ overlap in time, and therefore linearizability does not require the READ of X by T₃ to reflect the result of the WRITE of X by T₁.

While Strict and Atomic Consistency are stronger than Sequential Consistency, it must be noted that Sequential Consistency itself is a very high level of consistency, and there exist weaker levels of consistency below it.
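
The role of overlapping operations can be made concrete with a small brute-force check. This is a sketch under assumptions: the start and end times below are invented for illustration (the schedule table only shows that T₃'s READ of X overlaps T₁'s WRITE), and is_linearizable is a hypothetical helper name.

```python
def is_linearizable(ops, initial=0):
    """ops is a list of (start, end, kind, var, value) tuples.

    Searches for a total order in which every READ returns the latest WRITE
    and any operation that finished before another started comes first.
    Overlapping operations may be ordered either way.
    """
    def search(remaining, store):
        if not remaining:
            return True
        for i, (start, _end, kind, var, value) in enumerate(remaining):
            # Real-time rule: nothing still unplaced may have ended before this starts.
            if any(o[1] < start for j, o in enumerate(remaining) if j != i):
                continue
            if kind == "R" and store.get(var, initial) != value:
                continue
            new_store = dict(store)
            if kind == "W":
                new_store[var] = value
            if search(remaining[:i] + remaining[i + 1:], new_store):
                return True
        return False

    return search(list(ops), {})


# S-4 with assumed intervals: T3's READ of X (2..4) overlaps T1's WRITE of X (1..3).
S4 = [
    (1, 3, "W", "X", 100),    # T1
    (2, 4, "R", "X", 0),      # T3, overlaps the WRITE above
    (5, 6, "W", "Y", 200),    # T2
    (7, 8, "R", "Y", 200),    # T3
    (7, 8, "R", "X", 100),    # T4
    (9, 10, "R", "Y", 200),   # T4
]
# Same operations, but the READ of X now starts after the WRITE has completed.
S4_no_overlap = [(1, 2, "W", "X", 100), (3, 4, "R", "X", 0)] + S4[2:]

print(is_linearizable(S4))             # True
print(is_linearizable(S4_no_overlap))  # False
```

Only the overlap makes R:X = 0 acceptable: once the READ starts after the WRITE has been acknowledged, the same result is rejected.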

4. Causal Consistency:

Causal Consistency is a popular and very useful consistency level below Sequential Consistency.
In Sequential Consistency, all WRITE operations must be globally ordered, whether or not they are related to each other. Causal Consistency does not impose any ordering constraint on unrelated WRITE operations.
If a thread executes a READ of a data item (R:X) and then a WRITE of the same or a different data item (W:Y), Causal Consistency assumes that the subsequent WRITE may have been caused by the READ. Therefore, it enforces the order of X and Y: every thread must observe the WRITE of Y after the WRITE of X.

T₁	W:X = 100
T₂		R:X = 100	W:Y = 200
T₃				R:X = 100	R:Y = 0	R:Y = 200
T₄				R:Y = 200	R:X = 100

Schedule S-5

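Here, S-5 follows Causal Consistency: every thread that observes W:Y = 200 also observes W:X = 100, the WRITE that may have caused it. As drawn, the reads of T₃ and T₄ are also compatible with a single global order (W:X before W:Y); what S-5 rules out is Linear and Strict Consistency, because T₃ reads Y = 0 after W:Y = 200 has already completed.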

Also, observe that even though S-2 (refer to Sequential Consistency above) does not follow Sequential Consistency, it does follow Causal Consistency, because its WRITE operations on X and Y are not causally related and so no order between them is enforced.

T₁	W:X = 100
T₂		R:X = 100	W:Y = 200
T₃				R:X = 100	R:Y = 0	R:Y = 200
T₄				R:Y = 200	R:X = 0

Schedule S-6

The schedule S-6 does not follow any of the consistency levels discussed so far: T₄ observes W:Y = 200, which T₂ issued only after reading X = 100, and then still reads X = 0. Note, however, that the schedule is Eventually Consistent; Eventual Consistency, the weakest level, is discussed next.
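
The difference between S-5 and S-6 can be sketched in code. This is a simplified check, not a general causal-consistency verifier: it assumes each variable is written at most once, that a READ returning a non-initial value has observed that variable's WRITE, and the helper names causal_dependencies and is_causally_consistent are hypothetical.

```python
def causal_dependencies(threads, initial=0):
    """Map each written variable to the variables whose WRITE causally precedes it."""
    deps = {}
    for ops in threads.values():
        seen = set()  # WRITEs this thread has issued or observed so far
        for kind, var, value in ops:
            if kind == "W":
                deps[var] = set(seen)
                seen.add(var)
            elif value != initial:
                seen.add(var)
    changed = True
    while changed:  # transitive closure
        changed = False
        for before in deps.values():
            extra = set()
            for b in list(before):
                extra |= deps.get(b, set())
            if not extra <= before:
                before |= extra
                changed = True
    return deps


def is_causally_consistent(threads, initial=0):
    deps = causal_dependencies(threads, initial)
    for ops in threads.values():
        must_see = set()  # WRITEs that this thread is now required to observe
        for kind, var, value in ops:
            if kind == "W" or value != initial:
                must_see |= {var} | deps.get(var, set())
            elif var in must_see:
                return False  # saw an effect, then missed its possible cause
    return True


S5 = {
    "T1": [("W", "X", 100)],
    "T2": [("R", "X", 100), ("W", "Y", 200)],
    "T3": [("R", "X", 100), ("R", "Y", 0), ("R", "Y", 200)],
    "T4": [("R", "Y", 200), ("R", "X", 100)],
}
S6 = dict(S5, T4=[("R", "Y", 200), ("R", "X", 0)])

print(is_causally_consistent(S5))  # True
print(is_causally_consistent(S6))  # False
```

In S-6 the check fails at T₄: reading Y = 200 obliges the thread to see W:X = 100 as well, so the later R:X = 0 is rejected.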

5. Eventual Consistency:

Eventual Consistency is the weakest level of consistency.
The only assurance offered here is that if no WRITE operations occur for a sufficiently long period of time, the threads will eventually agree on the value written by the latest WRITE operation.
Hence, eventually, all the copies of the database will reflect the same value for a data item.
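
Convergence of this kind can be illustrated with a toy model. The sketch below assumes a last-writer-wins rule with globally comparable timestamps, which is only one of several possible reconciliation strategies; the Replica class and its methods are hypothetical names for this illustration.

```python
class Replica:
    """A copy of the database that keeps, per key, the value with the newest timestamp."""

    def __init__(self):
        self.data = {}  # key -> (timestamp, value)

    def write(self, key, value, timestamp):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key, default=0):
        return self.data.get(key, (None, default))[1]

    def sync_from(self, other):
        # Pull every entry from another replica; newer timestamps win.
        for key, (timestamp, value) in other.data.items():
            self.write(key, value, timestamp)


replicas = [Replica() for _ in range(3)]
replicas[0].write("X", 100, timestamp=1)
replicas[2].write("X", 300, timestamp=2)

before = [r.read("X") for r in replicas]
print(before)  # [100, 0, 300]: the copies disagree

# No further WRITEs arrive; each replica pulls from every other replica once.
for a in replicas:
    for b in replicas:
        if a is not b:
            a.sync_from(b)

after = [r.read("X") for r in replicas]
print(after)   # [300, 300, 300]: all copies agree on the latest WRITE
```

Between the two print statements a reader may see any of the three values, which is why this level gives no guarantee beyond eventual agreement.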

Note the difference between Serializability and Linearizability:

S.No	Serializability	Linearizability
1.	Serializability is an assurance about transactions, that is, groups of one or more operations over one or more objects.	Linearizability is an assurance about single operations on single objects.
2.	By definition, it assures that the execution of a set of transactions, which generally contain READ and WRITE operations over a number of data items, is equivalent to some serial execution of those transactions.	It specifies a real-time constraint on the behaviour of a set of single operations, such as READ and WRITE, on a single object, such as a data item or a distributed register.
3.	Serializability stands for the “I,” or isolation, in ACID. If each of the users’ transactions preserves logical correctness (the “C,” or consistency, in ACID), a serializable execution preserves that correctness as well.	Linearizability for READ and WRITE operations is as explained under “Atomic Consistency” and is the “C,” or “consistency,” in Gilbert and Lynch’s proof of the CAP Theorem.
4.	Serializability does not imply any deterministic or real-time order; it simply requires that some equivalent serial execution exists.	Once a READ returns a specific value, all later READs should return that value or the value of a later WRITE, as discussed for Linear or Atomic Consistency.

Degree-Two Consistency:

The aim of Degree-Two Consistency is to avoid cascading aborts without necessarily ensuring serializability.

The lock modes used by the locking protocol are the same two that are used in the Two-Phase Lock Protocol: Shared (S) and Exclusive (X).

A transaction must hold the appropriate lock while accessing a data item, but strict adherence to Two-Phase Locking is not required: contrary to 2-PL, S-Locks can be released at any time and locks can be acquired at any time.
Note that Exclusive Locks (X-Locks) cannot be released until the transaction has either committed or aborted.
A transaction may execute a READ operation on the same data item repeatedly and get a different value each time, as with the weaker consistency levels discussed above.
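
These rules can be sketched with a toy lock table. This is a single-threaded simulation under assumptions, not a real lock manager: a conflicting request raises an error instead of waiting, and the class DegreeTwoLockTable and its method names are hypothetical.

```python
class DegreeTwoLockTable:
    """S-Locks may be released at any time; X-Locks only at commit or abort."""

    def __init__(self):
        self.shared = {}     # item -> set of transactions holding an S-Lock
        self.exclusive = {}  # item -> transaction holding the X-Lock

    def lock_s(self, txn, item):
        holder = self.exclusive.get(item)
        if holder is not None and holder != txn:
            raise RuntimeError(f"{txn} must wait: {holder} holds X on {item}")
        self.shared.setdefault(item, set()).add(txn)

    def lock_x(self, txn, item):
        holder = self.exclusive.get(item)
        others = self.shared.get(item, set()) - {txn}
        if (holder is not None and holder != txn) or others:
            raise RuntimeError(f"{txn} must wait for {item}")
        self.exclusive[item] = txn

    def unlock_s(self, txn, item):
        self.shared.get(item, set()).discard(txn)

    def finish(self, txn):
        """Commit or abort: the only point at which X-Locks are released."""
        for item in [i for i, t in self.exclusive.items() if t == txn]:
            del self.exclusive[item]
        for holders in self.shared.values():
            holders.discard(txn)


db = {"Q": 10}
locks = DegreeTwoLockTable()

locks.lock_s("T1", "Q")
first_read = db["Q"]
locks.unlock_s("T1", "Q")      # early release is allowed for S-Locks

locks.lock_x("T2", "Q")
db["Q"] = 20
try:
    locks.lock_s("T1", "Q")    # T2 has not finished, so T1 cannot read yet
    blocked = False
except RuntimeError:
    blocked = True
locks.finish("T2")             # commit releases the X-Lock

locks.lock_s("T1", "Q")
second_read = db["Q"]
locks.unlock_s("T1", "Q")

print(first_read, second_read, blocked)  # 10 20 True
```

T₁ never reads uncommitted data, which is what prevents cascading aborts, yet its two reads of Q return different values.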

An example of a non-serializable schedule with Degree-Two Consistency is given below:

T₁	T₂
lock-S(Q)
read(Q)
unlock(Q)
	lock-X(Q)
	read(Q)
	write(Q)
	unlock(Q)
lock-S(Q)
read(Q)
unlock(Q)
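
That this schedule is not serializable can be confirmed with the usual precedence-graph test for conflict serializability. The sketch below is an illustration under assumptions: the schedule is encoded by hand as (transaction, operation, item) tuples with the lock steps omitted, and the function names are hypothetical.

```python
def precedence_edges(schedule):
    """Add an edge Ti -> Tj when an operation of Ti conflicts with a later one of Tj."""
    edges = set()
    for i, (t1, k1, item1) in enumerate(schedule):
        for t2, k2, item2 in schedule[i + 1:]:
            if t1 != t2 and item1 == item2 and "W" in (k1, k2):
                edges.add((t1, t2))
    return edges


def has_cycle(edges):
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
    visiting, done = set(), set()

    def visit(node):
        if node in done:
            return False
        if node in visiting:
            return True  # back edge: the graph contains a cycle
        visiting.add(node)
        found = any(visit(n) for n in graph.get(node, ()))
        visiting.discard(node)
        done.add(node)
        return found

    return any(visit(n) for n in list(graph))


# The schedule above, with lock and unlock steps omitted.
schedule = [
    ("T1", "R", "Q"),
    ("T2", "R", "Q"),
    ("T2", "W", "Q"),
    ("T1", "R", "Q"),
]
serial = [("T1", "R", "Q"), ("T1", "R", "Q"), ("T2", "R", "Q"), ("T2", "W", "Q")]

edges = precedence_edges(schedule)
print(sorted(edges))                         # [('T1', 'T2'), ('T2', 'T1')]
print(has_cycle(edges))                      # True: not conflict-serializable
print(has_cycle(precedence_edges(serial)))   # False
```

The first READ of T₁ must precede T₂'s WRITE while the second must follow it, so no serial order of T₁ and T₂ can produce the same reads.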

Cursor Stability:

Cursor Stability is a form of Degree-Two Consistency. It is designed for programs that iterate over the tuples of a relation by using cursors.

Instead of locking the entire relation, cursor stability ensures that:

Only the tuple currently being processed by the iteration is locked in shared mode with an S-Lock.
Any tuple that is modified is locked in exclusive mode with an X-Lock until the transaction commits, so as to ensure consistency.

Together, these two rules ensure Degree-Two Consistency, which is a weaker level of consistency.
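
The two rules can be traced with a short sketch. It only records which lock steps a cursor scan would take and does not implement real locking; the function names, the row layout and the update rule are assumptions made for illustration.

```python
def scan_with_cursor_stability(rows, should_update, update):
    """Scan rows (a dict of row id -> value) and return (lock trace, X-Locks still held)."""
    trace = []
    x_locks = set()
    for row_id in list(rows):
        trace.append(("lock-S", row_id))         # only the current tuple is S-locked
        if should_update(rows[row_id]):
            trace.append(("upgrade-X", row_id))  # modified tuple: X-Lock kept until commit
            x_locks.add(row_id)
            rows[row_id] = update(rows[row_id])
        else:
            trace.append(("unlock-S", row_id))   # released as soon as the cursor moves on
    return trace, x_locks


def commit(x_locks, trace):
    """Release every X-Lock; this happens only when the transaction commits."""
    for row_id in sorted(x_locks):
        trace.append(("unlock-X", row_id))
    x_locks.clear()


rows = {1: 50, 2: 150, 3: 70}
trace, held = scan_with_cursor_stability(rows, lambda v: v > 100, lambda v: v + 10)
held_after_scan = set(held)
commit(held, trace)

print(held_after_scan)  # {2}
print(rows)             # {1: 50, 2: 160, 3: 70}
print(trace[-1])        # ('unlock-X', 2)
```

At any moment the scan holds at most one S-Lock plus the X-Locks of the tuples it has changed, which is why heavily accessed relations stay available to other transactions.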

Advantages:

It eliminates the need for the Two-Phase Lock Protocol.
It is used in practice on heavily accessed relations as an easy means of increasing concurrency and improving system performance.

Drawbacks:

Serializability is not assured by the procedure.
The developer must code the application so that database consistency is preserved even though non-serializable schedules are likely to occur.
Hence, Cursor Stability is suitable only for a limited class of transactions, and only with careful, safe code design, as a trade-off for better system performance.
