Sorting with Tapes: Balanced Merge

Balanced merge is an external sorting technique in which the data is stored on multiple tapes (or other forms of sequential storage). The tapes are divided into two groups: odd-numbered tapes and even-numbered tapes. Each tape initially holds sorted data, and the algorithm's goal is to merge the data from all of the tapes into a single sorted output.

Approach: Here’s an approach for the above algorithm:

The algorithm works by repeatedly making passes over the data, merging pairs of tapes at a time. On each pass, the tapes are divided into pairs, and each pair's data is merged into a temporary output tape. If the number of tapes is odd, the unpaired tape is carried over to the next pass unchanged. After all pairs have been processed, the data on the temporary output tapes is copied back to the original tapes, and the process is repeated with a new set of tape pairs until all of the data is on a single tape.
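The core operation in each pass is a two-way merge of sorted tapes. The sketch below is one possible way to write it so that each tape is read strictly front to back, one record at a time, as a sequential device would require. The name `merge_tapes` and the use of Python iterators to stand in for tapes are assumptions made for illustration, not part of the original code.

```python
from typing import Iterable, Iterator


def merge_tapes(tape1: Iterable[int], tape2: Iterable[int]) -> Iterator[int]:
    """Yield the records of two sorted tapes in sorted order.

    Each tape is consumed sequentially through an iterator, so no
    random access into either tape is needed.
    """
    it1, it2 = iter(tape1), iter(tape2)
    done = object()  # sentinel marking an exhausted tape
    a, b = next(it1, done), next(it2, done)

    # While both tapes still have records, emit the smaller head record.
    while a is not done and b is not done:
        if a <= b:  # on ties, take from tape1 first
            yield a
            a = next(it1, done)
        else:
            yield b
            b = next(it2, done)

    # One tape is exhausted: copy the remainder of the other.
    while a is not done:
        yield a
        a = next(it1, done)
    while b is not done:
        yield b
        b = next(it2, done)


print(list(merge_tapes([1, 4, 5], [2, 3, 6])))  # [1, 2, 3, 4, 5, 6]
```

Because the function only ever holds one record from each input, memory use stays constant regardless of tape length; the caller decides where the merged records are written.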

Illustration: Here is an illustration with the approach used:

Suppose we have the following list of data that we want to sort: data[] = [5, 2, 4, 6, 1, 3]

Here are the steps involved in the sorting process:

• Step 1: Initialize the tapes and divide the data among them. We create three empty tapes and distribute the data round-robin, so the element at index i goes to tape i % 3:
    tapes = [[], [], []]
    for i, x in enumerate(data):
        tapes[i % 3].append(x)
    # tapes is now [[5, 6], [2, 1], [4, 3]]

• Step 2: Sort the data on each tape using an internal sorting technique. We can use the built-in sort function to sort the data on each tape:
    for tape in tapes:
        tape.sort()
    # tapes is now [[5, 6], [1, 2], [3, 4]]

• Step 3: Repeatedly make passes over the tapes, merging pairs of tapes at a time. On each pass, we divide the tapes into pairs and merge the data from each pair into a temporary output tape using the merge function. A tape left without a partner is passed to merge with None and carried over unchanged:
    while len(tapes) > 1:
        new_tapes = []
        for i in range(0, len(tapes), 2):
            tape1 = tapes[i]
            tape2 = tapes[i+1] if i+1 < len(tapes) else None
            new_tapes.append(merge(tape1, tape2))
        tapes = new_tapes
    # After the first pass, tapes is [[1, 2, 5, 6], [3, 4]]
    # After the second pass, tapes is [[1, 2, 3, 4, 5, 6]]

• Step 4: Copy the data from the temporary output tapes back to the original tapes. In the code above, this is the assignment tapes = new_tapes at the end of each pass.
• Step 5: Repeat the process until all of the data is on a single tape. Here the loop stops after the second pass, when only one tape remains, and the sorting process is complete.

The final result is a sorted version of the original data:
sorted_data = tapes[0]
# sorted_data is [1, 2, 3, 4, 5, 6]
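To check the walk-through by hand, it can help to record the tapes after every stage. The following is a minimal sketch under some assumptions: `trace_balanced_merge` is a hypothetical helper name, and the standard-library `heapq.merge` is swapped in for the article's own `merge` function so that the snippet is self-contained.

```python
from heapq import merge as merge_sorted  # standard-library merge of sorted inputs


def trace_balanced_merge(data, num_tapes=3):
    """Return (label, tapes) snapshots after distribution, sorting, and each pass."""
    # Distribute the data round-robin across the tapes.
    tapes = [[] for _ in range(num_tapes)]
    for i, x in enumerate(data):
        tapes[i % num_tapes].append(x)
    history = [("distributed", [list(t) for t in tapes])]

    # Sort each tape internally.
    for tape in tapes:
        tape.sort()
    history.append(("sorted", [list(t) for t in tapes]))

    # Merge pairs of tapes until a single tape remains.
    pass_no = 0
    while len(tapes) > 1:
        new_tapes = []
        for i in range(0, len(tapes), 2):
            pair = tapes[i:i + 2]  # holds a single tape when the count is odd
            new_tapes.append(list(merge_sorted(*pair)))
        tapes = new_tapes
        pass_no += 1
        history.append((f"pass {pass_no}", [list(t) for t in tapes]))
    return history


for label, state in trace_balanced_merge([5, 2, 4, 6, 1, 3]):
    print(label, state)
# distributed [[5, 6], [2, 1], [4, 3]]
# sorted [[5, 6], [1, 2], [3, 4]]
# pass 1 [[1, 2, 5, 6], [3, 4]]
# pass 2 [[1, 2, 3, 4, 5, 6]]
```

With three tapes, two passes are needed: the first pass merges tapes 0 and 1 and carries tape 2 over, and the second pass merges the two remaining tapes.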

The steps involved in implementing the balanced merge sorting algorithm are:

• Initialize the tapes and divide the data among them.
• Sort the data on each tape using an internal sorting technique.
• Repeatedly make passes over the tapes, merging pairs of tapes at a time.
• Copy the data from the temporary output tapes back to the original tapes.
• Repeat the process until all of the data is on a single tape.

Here is the code that demonstrates this approach:

C++

```cpp
// C++ code for the above approach
#include <algorithm>
#include <iostream>
#include <vector>
using namespace std;

// Function to merge two sorted tapes
vector<int> merge(vector<int> tape1, vector<int> tape2) {
    // If the second tape is empty, return the first tape
    if (tape2.size() == 0) {
        return tape1;
    }

    // Merge the data from tape1 and tape2
    // into a temporary output tape
    vector<int> output_tape;

    int i = 0, j = 0;

    // Take the smaller front element while
    // both tapes still have data
    while (i < tape1.size() && j < tape2.size()) {
        if (tape1[i] < tape2[j]) {
            output_tape.push_back(tape1[i]);
            i++;
        } else {
            output_tape.push_back(tape2[j]);
            j++;
        }
    }

    // Copy whatever is left on either tape
    output_tape.insert(output_tape.end(), tape1.begin() + i, tape1.end());
    output_tape.insert(output_tape.end(), tape2.begin() + j, tape2.end());
    return output_tape;
}

// Function for Balanced merge sort
vector<int> balanced_merge_sort(vector<int> data, int num_tapes) {
    // Initialize the tapes
    vector<vector<int>> tapes(num_tapes);

    // Divide the data among the tapes
    for (int i = 0; i < data.size(); i++) {
        tapes[i % num_tapes].push_back(data[i]);
    }

    // Sort each tape
    for (int i = 0; i < num_tapes; i++) {
        sort(tapes[i].begin(), tapes[i].end());
    }

    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.size() > 1) {
        vector<vector<int>> new_tapes;
        for (int i = 0; i < tapes.size(); i += 2) {
            vector<int> tape1 = tapes[i];
            vector<int> tape2 = (i + 1 < tapes.size()) ? tapes[i + 1] : vector<int>();
            new_tapes.push_back(merge(tape1, tape2));
        }
        tapes = new_tapes;
    }

    // Return the final merged tape
    return tapes[0];
}

// Driver code
int main()
{
    // Input
    vector<int> data = {5, 2, 4, 6, 1, 3};

    // Function call
    vector<int> sorted_data = balanced_merge_sort(data, 3);

    for (int i = 0; i < sorted_data.size(); i++) {
        cout << sorted_data[i] << " ";
    }
    cout << endl;
    return 0;
}

// This code is contributed by lokeshpotta20.
```

Java

```java
// Java code for the above approach
import java.io.*;
import java.util.*;

class GFG {

  // Function to merge two sorted tapes
  public static List<Integer> merge(List<Integer> tape1,
                                    List<Integer> tape2)
  {
    // If the second tape is empty, return the first tape
    if (tape2.size() == 0) {
      return tape1;
    }

    // Merge the data from tape1 and tape2
    // into a temporary output tape
    List<Integer> outputTape = new ArrayList<>();

    int i = 0, j = 0;

    // Take the smaller front element while
    // both tapes still have data
    while (i < tape1.size() && j < tape2.size()) {
      if (tape1.get(i) < tape2.get(j)) {
        outputTape.add(tape1.get(i));
        i++;
      }
      else {
        outputTape.add(tape2.get(j));
        j++;
      }
    }

    // Copy whatever is left on either tape
    outputTape.addAll(tape1.subList(i, tape1.size()));
    outputTape.addAll(tape2.subList(j, tape2.size()));
    return outputTape;
  }

  // Function for Balanced merge sort
  public static List<Integer>
  balancedMergeSort(List<Integer> data, int numTapes)
  {
    // Initialize the tapes
    List<List<Integer>> tapes = new ArrayList<>();
    for (int i = 0; i < numTapes; i++)
      tapes.add(new ArrayList<>());

    // Divide the data among the tapes
    for (int i = 0; i < data.size(); i++) {
      tapes.get(i % numTapes).add(data.get(i));
    }

    // Sort each tape
    for (int i = 0; i < numTapes; i++) {
      tapes.get(i).sort(Integer::compareTo);
    }

    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.size() > 1) {
      List<List<Integer>> newTapes = new ArrayList<>();
      for (int i = 0; i < tapes.size(); i += 2) {
        List<Integer> tape1 = tapes.get(i);
        List<Integer> tape2
          = (i + 1 < tapes.size())
              ? tapes.get(i + 1)
              : new ArrayList<>();
        newTapes.add(merge(tape1, tape2));
      }
      tapes = newTapes;
    }

    // Return the final merged tape
    return tapes.get(0);
  }

  public static void main(String[] args)
  {
    // Input
    List<Integer> data = List.of(5, 2, 4, 6, 1, 3);

    // Function call
    List<Integer> sortedData = balancedMergeSort(data, 3);

    for (int i = 0; i < sortedData.size(); i++) {
      System.out.print(sortedData.get(i) + " ");
    }
    System.out.println();
  }
}

// This code is contributed by lokesh.
```

Python3

```python
# Python code for the above approach

# Function for Balanced merge sort
def balanced_merge_sort(data, num_tapes):

    # Initialize the tapes
    tapes = [[] for _ in range(num_tapes)]

    # Divide the data among the tapes
    for i, x in enumerate(data):
        tapes[i % num_tapes].append(x)

    # Sort each tape
    for tape in tapes:
        tape.sort()

    # Repeatedly make passes over the tapes,
    # merging pairs of tapes at a time
    while len(tapes) > 1:
        new_tapes = []
        for i in range(0, len(tapes), 2):
            tape1 = tapes[i]
            tape2 = tapes[i + 1] if i + 1 < len(tapes) else None
            new_tapes.append(merge(tape1, tape2))
        tapes = new_tapes

    # Return the final merged tape
    return tapes[0]


# Function to merge two sorted tapes
def merge(tape1, tape2):
    # If there is no second tape, return the first tape
    if tape2 is None:
        return tape1

    # Merge the data from tape1 and tape2
    # into a temporary output tape
    output_tape = []

    i = j = 0

    # Take the smaller front element while
    # both tapes still have data
    while i < len(tape1) and j < len(tape2):
        if tape1[i] < tape2[j]:
            output_tape.append(tape1[i])
            i += 1
        else:
            output_tape.append(tape2[j])
            j += 1

    # Copy whatever is left on either tape
    output_tape.extend(tape1[i:])
    output_tape.extend(tape2[j:])
    return output_tape


# Driver code

# Input
data = [5, 2, 4, 6, 1, 3]

# Function call
sorted_data = balanced_merge_sort(data, 3)

print(sorted_data)
```

C#

```csharp
// C# code for the above approach
using System;
using System.Collections.Generic;

public class GFG {

  // Function to merge two sorted tapes
  public static List<int> merge(List<int> tape1,
                                List<int> tape2)
  {
    // If the second tape is empty, return the first tape
    if (tape2.Count == 0) {
      return tape1;
    }

    // Merge the data from tape1 and tape2
    // into a temporary output tape
    List<int> outputTape = new List<int>();

    int i = 0, j = 0;

    // Take the smaller front element while
    // both tapes still have data
    while (i < tape1.Count && j < tape2.Count) {
      if (tape1[i] < tape2[j]) {
        outputTape.Add(tape1[i]);
        i++;
      }
      else {
        outputTape.Add(tape2[j]);
        j++;
      }
    }

    // Copy whatever is left on either tape
    outputTape.AddRange(
      tape1.GetRange(i, tape1.Count - i));
    outputTape.AddRange(
      tape2.GetRange(j, tape2.Count - j));
    return outputTape;
  }

  // Function for Balanced merge sort
  public static List<int>
  balancedMergeSort(List<int> data, int numTapes)
  {
    // Initialize the tapes
    List<List<int>> tapes = new List<List<int>>();
    for (int i = 0; i < numTapes; i++)
      tapes.Add(new List<int>());

    // Divide the data among the tapes
    for (int i = 0; i < data.Count; i++) {
      tapes[i % numTapes].Add(data[i]);
    }

    // Sort each tape
    for (int i = 0; i < numTapes; i++) {
      tapes[i].Sort();
    }

    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.Count > 1) {
      List<List<int>> newTapes = new List<List<int>>();
      for (int i = 0; i < tapes.Count; i += 2) {
        List<int> tape1 = tapes[i];
        List<int> tape2 = (i + 1 < tapes.Count)
          ? tapes[i + 1]
          : new List<int>();
        newTapes.Add(merge(tape1, tape2));
      }
      tapes = newTapes;
    }

    // Return the final merged tape
    return tapes[0];
  }

  static public void Main()
  {

    // Input
    List<int> data = new List<int>{ 5, 2, 4, 6, 1, 3 };

    // Function call
    List<int> sortedData = balancedMergeSort(data, 3);

    for (int i = 0; i < sortedData.Count; i++) {
      Console.Write(sortedData[i] + " ");
    }
    Console.WriteLine();
  }
}

// This code is contributed by lokeshmvs21.
```

JavaScript

```javascript
// JavaScript code for the above approach

// Function to merge two sorted tapes
function merge(tape1, tape2) {
    // If the second tape is empty,
    // return the first tape
    if (tape2.length == 0) {
        return tape1;
    }

    // Merge the data from tape1 and tape2
    // into a temporary output tape
    let output_tape = [];

    let i = 0, j = 0;

    // Take the smaller front element while
    // both tapes still have data
    while (i < tape1.length && j < tape2.length) {
        if (tape1[i] < tape2[j]) {
            output_tape.push(tape1[i]);
            i++;
        } else {
            output_tape.push(tape2[j]);
            j++;
        }
    }

    // Copy whatever is left on either tape
    output_tape.push(...tape1.slice(i));
    output_tape.push(...tape2.slice(j));
    return output_tape;
}

// Function for Balanced merge sort
function balanced_merge_sort(data, num_tapes) {
    // Initialize the tapes
    let tapes = new Array(num_tapes);
    for (let i = 0; i < num_tapes; i++) {
        tapes[i] = [];
    }

    // Divide the data among the tapes
    for (let i = 0; i < data.length; i++) {
        tapes[i % num_tapes].push(data[i]);
    }

    // Sort each tape (numeric comparator, because the
    // default sort compares elements as strings)
    for (let i = 0; i < num_tapes; i++) {
        tapes[i].sort((a, b) => a - b);
    }

    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.length > 1) {
        let new_tapes = [];
        for (let i = 0; i < tapes.length; i += 2) {
            let tape1 = tapes[i];
            let tape2 = (i + 1 < tapes.length) ? tapes[i + 1] : [];
            new_tapes.push(merge(tape1, tape2));
        }
        tapes = new_tapes;
    }

    // Return the final merged tape
    return tapes[0];
}

// Driver code
let data = [5, 2, 4, 6, 1, 3]; // Input
let sorted_data = balanced_merge_sort(data, 3);

console.log(sorted_data); // Output
```

Output

`[1, 2, 3, 4, 5, 6]`

Time Complexity:  O(n log n)
Auxiliary Space: O(n)
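One way to see where the O(n log n) bound comes from: with k tapes, sorting each tape of about n/k elements costs O(n log(n/k)) in total, and every merge pass touches all n elements once while roughly halving the number of tapes, so the merging costs O(n log k). The two terms add up to O(n log n). The sketch below counts the passes for a given number of tapes; `count_passes` is a hypothetical helper written for this illustration.

```python
import math


def count_passes(num_tapes: int) -> int:
    """Count the pairwise merge passes needed to reduce num_tapes to one tape."""
    passes = 0
    while num_tapes > 1:
        # Each pass merges tapes in pairs; an unpaired tape is carried over.
        num_tapes = (num_tapes + 1) // 2
        passes += 1
    return passes


for k in (1, 2, 3, 8, 9):
    # The simulated count agrees with ceil(log2(k)) for these values.
    assert count_passes(k) == math.ceil(math.log2(k))
    print(k, count_passes(k))
```

For the three tapes used in the illustration this gives two passes, which matches the walk-through.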

Advantages of balanced merge:

1. The merges within a pass operate on separate pairs of tapes, so they can run in parallel to speed up the sorting process.
2. The time complexity is O(n log n), which is generally considered efficient for large amounts of data.
3. It can handle large amounts of data.
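The parallelism point can be illustrated with a small sketch: within one pass, each pair of tapes is merged independently of the others, so the merges can be handed to a pool of workers. The function names below are hypothetical, `heapq.merge` stands in for the article's `merge` function, and a thread pool is used only to show the structure. In CPython, threads are unlikely to speed up CPU-bound merging because of the global interpreter lock, so a process pool or real I/O-bound tape reads would be needed for an actual gain.

```python
from concurrent.futures import ThreadPoolExecutor
from heapq import merge as merge_sorted


def merge_pair(pair):
    """Merge one or two sorted tapes into a new sorted list."""
    return list(merge_sorted(*pair))


def parallel_pass(tapes, executor):
    """Run one pass, handing every pair merge to the executor."""
    pairs = [tapes[i:i + 2] for i in range(0, len(tapes), 2)]
    # executor.map returns results in the same order as the input pairs.
    return list(executor.map(merge_pair, pairs))


def parallel_balanced_merge(sorted_tapes, max_workers=4):
    """Merge already-sorted tapes down to a single sorted list."""
    tapes = [list(t) for t in sorted_tapes]
    if not tapes:
        return []
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        while len(tapes) > 1:
            tapes = parallel_pass(tapes, executor)
    return tapes[0]


print(parallel_balanced_merge([[5, 6], [1, 2], [3, 4]]))  # [1, 2, 3, 4, 5, 6]
```

Passes still have to run one after another, since each pass consumes the output of the previous one; only the merges inside a single pass are independent.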