Concurrent Merge Sort in Shared Memory
Given a number ‘n’ and a n numbers, sort the numbers using Concurrent Merge Sort. (Hint: Try to use shmget, shmat system calls).
Part1: The algorithm (HOW?)
Recursively make two child processes, one for the left half, one of the right half. If the number of elements in the array for a process is less than 5, perform a Insertion Sort. The parent of the two children then merges the result and returns back to the parent and so on. But how do you make it concurrent?
Part2: The logical (WHY?)
The important part of the solution to this problem is not algorithmic, but to explain concepts of Operating System and kernel.
To achieve concurrent sorting, we need a way to make two processes to work on the same array at the same time. To make things easier Linux provides a lot of system calls via simple API endpoints. Two of them are, shmget() (for shared memory allocation) and shmat() (for shared memory operations). We create a shared memory space between the child process that we fork. Each segment is split into left and right child which is sorted, the interesting part being they are working concurrently! The shmget() requests the kernel to allocate a shared page for both the processes.
Why traditional fork() does not work?
The answer lies in what fork() actually does. From the documentation, “fork() creates a new process by duplicating the calling process”. The child process and the parent process run in separate memory spaces. At the time of fork() both memory spaces have the same content. Memory writes, file-descriptor(fd) changes, etc, performed by one of the processes do not affect the other. Hence we need a shared memory segment.
Sorting Done Successfully
Try to time the code and compare its performance with the traditional sequential code. You would be surprised to know that sequential sort performance better!
When, say left child, access the left array, the array is loaded into the cache of a processor. Now when the right array is accessed (because of concurrent accesses), there is a cache miss since the cache is filled with left segment and then right segment is copied to the cache memory. This to-and-fro process continues and it degrades the performance to such a level that it performs poorer than the sequential code.
There are ways to reduce the cache misses by controlling the workflow of the code. But they cannot be avoided completely!
This article is contributed by Pinkesh Badjatiya .If you like GeeksforGeeks and would like to contribute, you can also write an article using write.geeksforgeeks.org or mail your article to firstname.lastname@example.org. See your article appearing on the GeeksforGeeks main page and help other Geeks.
Please write comments if you find anything incorrect, or you want to share more information about the topic discussed above.