Design of Parallel Databases | DBMS
A parallel DBMS is a DBMS that runs across multiple processors or CPUs and is designed mainly to execute query operations in parallel wherever possible. A parallel DBMS links a number of smaller machines to achieve the same throughput as expected from a single large machine.
There are three main architectural designs for a parallel DBMS. They are as follows:
- Shared Memory Architecture
- Shared Disk Architecture
- Shared Nothing Architecture
Let’s discuss them one by one:
1. Shared Memory Architecture :
In Shared Memory Architecture, multiple CPUs are attached to an interconnection network. They share a single, global main memory and common disk arrays. Note that in this architecture, a single copy of a multi-threaded operating system and a multi-threaded DBMS can support all of these CPUs. Shared memory is a tightly coupled architecture in which multiple CPUs share their memory; it is also known as Symmetric Multiprocessing (SMP). This architecture spans a wide range of systems, from personal workstations that support a few microprocessors in parallel to large RISC-based machines.
- It has high-speed data access for a limited number of processors.
- The communication is efficient.
- It cannot scale beyond roughly 80 to 100 CPUs in parallel.
- The bus or the interconnection network gets blocked as a large number of CPUs are added.
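The shared-memory idea can be illustrated with a small sketch. This is not real DBMS code: the threads below stand in for CPUs, and a plain Python list stands in for a table held in the single global memory; all names (`table`, `scan`, `partials`) are hypothetical.

```python
import threading

# Rough sketch of shared memory parallelism: several "CPUs" (threads)
# scan partitions of ONE table held in a single shared main memory,
# as in an SMP machine. Nothing is copied between workers.
table = list(range(100_000))   # shared data, visible to every thread
partials = [0] * 4             # one result slot per worker

def scan(worker_id, lo, hi):
    # Each worker reads the SAME in-memory table directly.
    partials[worker_id] = sum(table[j] for j in range(lo, hi)
                              if table[j] % 2 == 0)

workers = []
step = len(table) // 4
for i in range(4):
    t = threading.Thread(target=scan, args=(i, i * step, (i + 1) * step))
    workers.append(t)
    t.start()
for t in workers:
    t.join()

total = sum(partials)          # cheap final merge: memory is shared
print(total)
```

Because every worker sees the same memory, combining partial results is trivial; the downside, as noted above, is that the shared bus becomes the bottleneck as CPUs are added.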
2. Shared Disk Architecture :
In Shared Disk Architecture, various CPUs are attached to an interconnection network. Each CPU has its own memory, and all of them have access to the same disk. Since memory is not shared among CPUs, each node has its own copy of the operating system and the DBMS. Shared disk architecture is a loosely coupled architecture optimized for applications that are inherently centralized. Such systems are also known as clusters.
- The interconnection network is no longer a bottleneck, since each CPU has its own memory.
- Load-balancing is easier in shared disk architecture.
- There is better fault tolerance.
- If the number of CPUs increases, the problems of interference and memory contentions also increase.
- There also exists a scalability problem.
3. Shared Nothing Architecture :
Shared Nothing Architecture is a multiple-processor architecture in which each processor has its own memory and disk storage. Multiple CPUs are attached to an interconnection network through a node. Note that no two CPUs can access the same disk area; in this architecture, no memory or disk resources are shared at all. It is also known as Massively Parallel Processing (MPP).
- It has better scalability, as no resources are shared.
- Multiple CPUs can be added.
- The cost of communication is higher, as it involves sending data and software interaction at both ends.
- The cost of non-local disk access is higher than in shared disk architectures.
Note that this technology is typically used for very large databases (on the order of 10^12 bytes, i.e. terabytes) or for systems that process thousands of transactions per second.