Heterogeneous and other DSM systems | Distributed systems
A distributed shared memory is a system that allows end-user processes to access shared data without the need for inter-process communication. The shared-memory paradigm applied to loosely-coupled distributed-memory systems is known as Distributed Shared Memory (DSM).
Distributed shared memory (DSM) is a type of memory architecture in computer science that allows physically distinct memories to be addressed as one logically shared address space. Sharing here does not refer to a single central memory, but rather to the address space.
In other words, the goal of a DSM system is to make inter-process communications transparent to end-users.
Need for Heterogeneous DSM (HDSM):
Many computing environments have heterogeneity, which is almost always unavoidable because hardware and software are generally tailored to a certain application area. Supercomputers and multiprocessors, for example, excel at compute-intensive tasks but struggle with user interface and device I/O. Personal computers and workstations, on the other hand, are frequently equipped with excellent user interfaces.
Many applications necessitate complex user interfaces, specific I/O devices, and large amounts of computational power. Artificial intelligence, CAM, interactive graphics, and interactive simulation are all examples of such applications. As a result, integrating diverse machines into a coherent distributed system and sharing resources among them is very desirable.
HDSM is beneficial for distributed and parallel applications that need to access resources from numerous hosts at the same time.
In a heterogeneous computing environment, applications can take advantage of the strengths of several computing architectures, so heterogeneity is often desirable in distributed systems. A heterogeneous DSM system makes memory sharing possible between machines with different architectures. The two major issues in building a heterogeneous DSM are:
(i) Data Compatibility and conversion
(ii) Block Size Selection
Data compatibility & conversion:
Data compatibility and conversion is the first design concern in a heterogeneous DSM system. Machines with different architectures may use different byte orderings and floating-point representations, so data sent from one machine to another must be converted to the destination machine’s format. The unit of data transmission (the block) must be converted according to the data type of its contents. Application programmers must therefore be involved, because they are familiar with the memory layout. In heterogeneous DSM systems, data conversion can be accomplished either by organizing the system as a collection of source language objects or by allowing only one type of data in each block.
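As a rough illustration of the byte-ordering problem, the sketch below uses Python's standard `struct` module to show what happens when a block of 32-bit integers written by a little-endian machine is read by a big-endian machine, with and without conversion. The helper name `convert_int32_block` is hypothetical and exists only for this example. It assumes the block holds nothing but 32-bit signed integers.

```python
import struct

# Hypothetical helper for illustration; assumes the block contains only
# 32-bit signed integers and its length is a multiple of 4.
def convert_int32_block(block: bytes, src_order: str, dst_order: str) -> bytes:
    """Re-encode a block of int32 values from src_order to dst_order.

    Orders use struct's prefixes: '<' is little-endian, '>' is big-endian.
    """
    count = len(block) // 4
    values = struct.unpack(f"{src_order}{count}i", block)
    return struct.pack(f"{dst_order}{count}i", *values)

# A block as written by a little-endian machine.
little = struct.pack("<3i", 1, 2, 3)

# Reading it with big-endian rules and no conversion gives wrong values.
misread = struct.unpack(">3i", little)

# Converting to the destination format first gives the original values.
converted = convert_int32_block(little, "<", ">")
correct = struct.unpack(">3i", converted)

print(misread)  # (16777216, 33554432, 50331648)
print(correct)  # (1, 2, 3)
```

The conversion is only possible because the type of the contents is known. This is why the block has to be converted according to the data type it holds.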
- DSM as a collection of source language objects:
In the first data conversion technique, the DSM is structured as a collection of source language objects. The unit of data migration in this case is either a shared variable or an object. The compiler can use conversion routines directly to translate between different machine architectures. Before accessing a remote object or variable, the DSM system checks whether the requesting node and the node that holds the object are compatible. If the nodes are incompatible, it invokes a conversion routine, translates the shared variable or object, and migrates it.
This approach is employed in the Agora shared memory system. While it is convenient for data conversion, its performance is low. The objects of programming languages are scalars, arrays, and structures. Each of them requires access rights, and migrating each one incurs communication overhead. Access to large arrays may result in false sharing and thrashing, and because of the limited packet size of transport protocols, their migration entails fragmentation and reassembly.
- DSM as one type of data block:
In the second data conversion technique, each block may contain only one type of data. The Mermaid DSM uses this approach, with the page size equal to the block size. Additional information is kept in the page table entry, such as the type of data stored in the page and the amount of data allocated to the page. When a page fault occurs or an incompatibility is detected, the system converts the page to the appropriate format.
There are a few drawbacks to this method. Because each block contains only one type of data, fragmentation can waste memory. Compilers on the heterogeneous machines must also agree on the sizes of data types and on the order of fields within compound structures. Even though only a small part of the page may be accessed, the complete page is converted and sent. Transparency is reduced, because users must supply the conversion routine for each user-defined data type, as well as the mapping from the data type to its conversion routine. Finally, if data is converted too frequently, the accuracy of floating-point numbers may suffer.
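The single-type-per-page idea can be sketched as follows. This is a minimal illustration, not Mermaid's actual implementation: the `PageTableEntry` class, the `TYPE_FORMATS` table, and the `convert_page` function are all hypothetical names. The only conversion handled is byte order for two assumed element types.

```python
import struct
from dataclasses import dataclass

# struct format characters for the element types this sketch supports
# (an assumed, deliberately tiny set).
TYPE_FORMATS = {"int32": "i", "float64": "d"}

@dataclass
class PageTableEntry:
    data_type: str  # the single type of data stored in the page
    count: int      # number of elements allocated in the page

def convert_page(page: bytes, entry: PageTableEntry,
                 src_order: str, dst_order: str) -> bytes:
    """Convert every allocated element of a typed page to dst_order."""
    fmt = f"{entry.count}{TYPE_FORMATS[entry.data_type]}"
    used = struct.calcsize(src_order + fmt)
    values = struct.unpack(src_order + fmt, page[:used])
    # Bytes past the allocated region are copied through unchanged.
    return struct.pack(dst_order + fmt, *values) + page[used:]

# A page holding three doubles followed by 8 unallocated bytes.
entry = PageTableEntry(data_type="float64", count=3)
page = struct.pack("<3d", 1.5, -2.25, 3.0) + bytes(8)

converted_page = convert_page(page, entry, "<", ">")
restored = struct.unpack(">3d", converted_page[:24])
print(restored)  # (1.5, -2.25, 3.0)
```

Note that the whole allocated region is converted on every transfer, even if the faulting access touches a single element. That is the cost described in the drawbacks above.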
Block size selection:
The choice of block size is another difficulty in building heterogeneous DSM systems. Heterogeneous machines in a DSM system can have different virtual memory page sizes. Any of the following algorithms can be used to choose the block size:
- Largest page size: As the name implies, the DSM block size is the largest virtual memory page size among all machines. Because page sizes are always powers of two, multiple smaller virtual memory pages fit exactly within a single DSM block. When a page fault occurs on a node with a smaller page size, it receives multiple pages, including the required one. The larger block size makes false sharing and thrashing more likely.
- Smallest page size: The DSM block size is the smallest virtual memory page size among all machines. When a page fault occurs on a node with a larger page size, multiple blocks are sent. This approach reduces data contention but increases communication and block table management overhead.
- Intermediate page size: Given the aforementioned two procedures, the optimum option is to choose a block size that falls between the machines’ largest and smallest virtual memory page sizes. This method is used to balance the issues that come with large and small blocks.
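The three policies can be compared with a small sketch. The function names are hypothetical. The text does not say how the intermediate size is picked, so the rule used here (the power of two midway between the smallest and largest exponents) is an assumption made for illustration.

```python
def choose_block_size(page_sizes, policy):
    """Pick a DSM block size from the machines' page sizes (powers of two)."""
    smallest, largest = min(page_sizes), max(page_sizes)
    if policy == "largest":
        return largest
    if policy == "smallest":
        return smallest
    if policy == "intermediate":
        # Assumed rule: average the exponents of the two extremes.
        lo_exp = smallest.bit_length() - 1
        hi_exp = largest.bit_length() - 1
        return 1 << ((lo_exp + hi_exp) // 2)
    raise ValueError(f"unknown policy: {policy}")

def blocks_per_fault(page_size, block_size):
    """DSM blocks transferred to fill one faulting page (at least one)."""
    return max(1, page_size // block_size)

def pages_per_block(page_size, block_size):
    """Local pages covered by one DSM block (at least one)."""
    return max(1, block_size // page_size)

page_sizes = [1024, 4096, 8192]  # example page sizes in bytes
for policy in ("largest", "smallest", "intermediate"):
    block = choose_block_size(page_sizes, policy)
    print(policy, block,
          "blocks per fault on the 8 KiB node:", blocks_per_fault(8192, block),
          "pages per block on the 1 KiB node:", pages_per_block(1024, block))
```

With these example sizes, the largest-page policy delivers eight 1 KiB pages per block to the small-page node, while the smallest-page policy needs eight blocks to satisfy one fault on the 8 KiB node. The intermediate choice falls between the two.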
Based on how data caching is managed, there are three approaches to designing a DSM system:
1. DSM managed by the OS
2. DSM managed by the MMU hardware
3. DSM managed by the language runtime system.
- DSM managed by the OS: In this approach, data caching is managed by the operating system, as in systems such as Ivy and Mirage. Each node has its own memory, and a page fault traps to the operating system, which then exchanges messages to locate and fetch the required block. The operating system is in charge of data placement and migration.
- DSM managed by the MMU hardware: In this approach, memory caching is managed by the MMU hardware and the cache. Snooping bus protocols are used to keep the caches consistent. The DEC Firefly, for example, manages shared memory in hardware. Directories can be used to keep track of which nodes hold which data blocks. In the event of a page fault, the MMU locates and transfers the requested page.
- DSM managed by the language runtime system: In this approach, the DSM is organized as a set of programming language elements, such as shared variables, data structures, and shared objects. The programming language and the operating system handle the placement and migration of these shared variables or objects at runtime. Features that indicate the data usage pattern can be added to the programming language. Such a system can support several consistency protocols and can apply them at the granularity of individual data items. However, this technique imposes an extra burden on the programmer. Examples of such systems are Munin and Midway, which use shared variables, and Orca and Linda, which use shared objects.
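The fault-driven fetch described for the OS-managed approach can be mimicked in a single-process simulation. This is only a sketch under strong assumptions: the `Node` class and its methods are hypothetical, a plain dictionary stands in for the directory, blocks are migrated rather than replicated, and no consistency protocol is modelled. It is not how Ivy or Mirage are implemented.

```python
class Node:
    """A simulated DSM node; all names here are illustrative."""

    def __init__(self, name, directory, cluster):
        self.name = name
        self.directory = directory  # block id -> name of the node holding it
        self.cluster = cluster      # node name -> Node
        self.blocks = {}            # blocks currently in local memory
        self.faults = 0
        cluster[name] = self

    def create(self, block_id, data):
        self.blocks[block_id] = data
        self.directory[block_id] = self.name

    def read(self, block_id):
        # A missing block plays the role of a page fault.
        if block_id not in self.blocks:
            self._handle_fault(block_id)
        return self.blocks[block_id]

    def _handle_fault(self, block_id):
        self.faults += 1
        # Locate the current holder, then migrate the block here.
        holder = self.cluster[self.directory[block_id]]
        self.blocks[block_id] = holder.blocks.pop(block_id)
        self.directory[block_id] = self.name

directory, cluster = {}, {}
a = Node("A", directory, cluster)
b = Node("B", directory, cluster)
a.create(0, b"hello")

first = b.read(0)   # faults, fetches the block from A
second = b.read(0)  # served from B's local memory
print(first, b.faults, directory[0])
```

The second read causes no fault because the block now lives on node B, which is the locality benefit of moving whole blocks.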
Advantages of DSM:
- Moving a whole block takes advantage of locality of reference.
- Passing-by-reference and passing complex data structures are made easier with a single address space.
- There is no memory access bottleneck because there is no single bus.
- Because DSM programs have a similar programming interface, they are portable.
- The total virtual memory space available is very large.
Difference between a Homogeneous DSM & Heterogeneous DSM:
When components of a distributed application share memory directly through DSM, they are more tightly coupled than when data is shared through message passing or remote procedure calls. As a result, extending a DSM to a heterogeneous system environment is difficult.
The performance of a homogeneous DSM is slightly better than that of a heterogeneous DSM. Despite a number of challenges in data conversion, heterogeneous DSM can be implemented with functional and performance transparency comparable to that of homogeneous DSM.