Open In App

Sorting with Tapes : Balanced Merge

Improve
Improve
Like Article
Like
Save
Share
Report

Balanced merge is a type of sorting algorithm where data is stored on multiple tapes (or other forms of storage). The tapes are divided into two groups: odd-numbered tapes and even-numbered tapes. The data on each tape is initially sorted, and the algorithm’s goal is to merge the data from all of the tapes into a single sorted output.

Approach: Here’s an approach for the above algorithm:

The algorithm works by repeatedly making passes over the data, merging pairs of tapes at a time. On each pass, the tapes are divided into pairs, and each pair’s data is merged into a temporary output tape. After all, pairs have been processed, the data on the temporary output tapes are copied back to the original tapes, and the process is repeated with a new set of tape pairs until all of the data is on a single tape.

Illustration: Here is an illustration with the approach used:

Suppose we have the following list of data that we want to sort: data[] = [5, 2, 4, 6, 1, 3]

Here are the steps involved in the sorting process:

  • Step 1: Initialize the tapes and divide the data among them. We create three empty tapes and divide the data among them like this: tapes = [[], [], []]
            for i, x in enumerate(data):
                 tapes[i % 3].append(x)
    # tapes is now [[5, 4, 1], [2, 6, 3], []]
     
  • Step 2: Sort the data on each tape using an internal sorting technique. We can use the built-in sort function to sort the data on each:
          for tape in tapes:
                tape.sort()
    # tapes is now [[1, 4, 5], [2, 3, 6], []]
     
  • Step 3: Repeatedly make passes over the tapes, merging pairs of tapes at a time. On each pass, we divide the tapes into pairs and merge the data from each pair into a temporary output tape using the merge function: 
              while len(tapes) > 1:
                  new_tapes = []
             for i in range(0, len(tapes), 2):
                  tape1 = tapes[i]
                  tape2 = tapes[i+1] if i+1 < len(tapes) else None
                  new_tapes.append(merge(tape1, tape2))
                  tapes = new_tapes
    # After the first pass, tapes is [[1, 2, 3, 4, 5, 6]]
     
  • Step 4: Copy the data from the temporary output tapes back to the original tapes. Since we only have one tape at this point, we can skip this step.
  • Step 5: Repeat the process until all of the data is on a single tape. Since we only have one tape at this point, the sorting process is complete.

The final result is a sorted version of the original data:
sorted_data = tapes[0]
# sorted_data is [1, 2, 3, 4, 5, 6]

The steps involved in implementing the balanced merge sorting algorithm are:

  • Initialize the tapes and divide the data among them.
  • Sort the data on each tape using an internal sorting technique.
  • Repeatedly make passes over the tapes, merging pairs of tapes at a time.
  • Copy the data from the temporary output tapes back to the original tapes.
  • Repeat the process until all of the data is on a single tape.

Here is the code that demonstrates this approach:

C++




// C++ code for the above approach
#include <bits/stdc++.h>
using namespace std;
 
vector<int> merge(vector<int> tape1, vector<int> tape2) {
    if (tape2.size() == 0) {
        return tape1;
    }
 
    // Merge the data from tape1 and tape2
    // into a temporary output tape
    vector<int> output_tape;
 
    int i = 0, j = 0;
 
    // Merge all the tapes left
    while (i < tape1.size() && j < tape2.size()) {
        if (tape1[i] < tape2[j]) {
            output_tape.push_back(tape1[i]);
            i++;
        } else {
            output_tape.push_back(tape2[j]);
            j++;
        }
    }
    output_tape.insert(output_tape.end(), tape1.begin() + i, tape1.end());
    output_tape.insert(output_tape.end(), tape2.begin() + j, tape2.end());
    return output_tape;
}
 
// Function for Balanced merge sort
vector<int> balanced_merge_sort(vector<int> data, int num_tapes) {
    // Initialize the tapes
    vector<vector<int>> tapes(num_tapes);
 
    // Divide the data among the tapes
    for (int i = 0; i < data.size(); i++) {
        tapes[i % num_tapes].push_back(data[i]);
    }
 
    // Sort each tape
    for (int i = 0; i < num_tapes; i++) {
        sort(tapes[i].begin(), tapes[i].end());
    }
 
    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.size() > 1) {
        vector<vector<int>> new_tapes;
        for (int i = 0; i < tapes.size(); i += 2) {
            vector<int> tape1 = tapes[i];
            vector<int> tape2 = (i + 1 < tapes.size()) ? tapes[i + 1] : vector<int>();
            new_tapes.push_back(merge(tape1, tape2));
        }
        tapes = new_tapes;
    }
 
    // Return the final merged tape
    return tapes[0];
}
 
// Driver code
int main()
{
   
    // Input
    vector<int> data = {5, 2, 4, 6, 1, 3};
 
    // Function call
    vector<int> sorted_data = balanced_merge_sort(data, 3);
 
    for (int i = 0; i < sorted_data.size(); i++) {
        cout << sorted_data[i] << " ";
    }
    cout << endl;
    return 0;
}
 
// This code is contributed by lokeshpotta20.


Java




// Java code for the above approach
import java.io.*;
import java.util.*;
 
class GFG {
 
  public static List<Integer> merge(List<Integer> tape1,
                                    List<Integer> tape2)
  {
    if (tape2.size() == 0) {
      return tape1;
    }
 
    // Merge the data from tape1 and tape2
    // into a temporary output tape
    List<Integer> outputTape = new ArrayList<>();
 
    int i = 0, j = 0;
 
    // Merge all the tapes left
    while (i < tape1.size() && j < tape2.size()) {
      if (tape1.get(i) < tape2.get(j)) {
        outputTape.add(tape1.get(i));
        i++;
      }
      else {
        outputTape.add(tape2.get(j));
        j++;
      }
    }
    outputTape.addAll(tape1.subList(i, tape1.size()));
    outputTape.addAll(tape2.subList(j, tape2.size()));
    return outputTape;
  }
 
  // Function for Balanced merge sort
  public static List<Integer>
    balancedMergeSort(List<Integer> data, int numTapes)
  {
    // Initialize the tapes
    List<List<Integer> > tapes = new ArrayList<>();
    for (int i = 0; i < numTapes; i++)
      tapes.add(new ArrayList<>());
 
    // Divide the data among the tapes
    for (int i = 0; i < data.size(); i++) {
      tapes.get(i % numTapes).add(data.get(i));
    }
 
    // Sort each tape
    for (int i = 0; i < numTapes; i++) {
      tapes.get(i).sort(Integer::compareTo);
    }
 
    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.size() > 1) {
      List<List<Integer> > newTapes
        = new ArrayList<>();
      for (int i = 0; i < tapes.size(); i += 2) {
        List<Integer> tape1 = tapes.get(i);
        List<Integer> tape2
          = (i + 1 < tapes.size())
          ? tapes.get(i + 1)
          : new ArrayList<>();
        newTapes.add(merge(tape1, tape2));
      }
      tapes = newTapes;
    }
 
    // Return the final merged tape
    return tapes.get(0);
  }
 
  public static void main(String[] args)
  {
    // Input
    List<Integer> data = List.of(5, 2, 4, 6, 1, 3);
 
    // Function call
    List<Integer> sortedData
      = balancedMergeSort(data, 3);
 
    for (int i = 0; i < sortedData.size(); i++) {
      System.out.print(sortedData.get(i) + " ");
    }
    System.out.println();
  }
}
 
// This code is contributed by lokesh.


Python3




# Python code for the above approach
 
# Function for Balanced merge sort
 
 
def balanced_merge_sort(data, num_tapes):
 
    # Initialize the tapes
    tapes = [[] for _ in range(num_tapes)]
 
    # Divide the data among the tapes
    for i, x in enumerate(data):
 
        tapes[i % num_tapes].append(x)
 
    # Sort each tape
    for tape in tapes:
        tape.sort()
 
    # Repeatedly make passes over the tapes,
    # merging pairs of tapes at a time
    while len(tapes) > 1:
 
        new_tapes = []
        for i in range(0, len(tapes), 2):
            tape1 = tapes[i]
            tape2 = tapes[i + 1] if i + 1 < len(tapes) else None
            new_tapes.append(merge(tape1, tape2))
        tapes = new_tapes
 
    # Return the final merged tape
    return tapes[0]
 
 
def merge(tape1, tape2):
    if tape2 is None:
        return tape1
 
    # Merge the data from tape1 and tape2
    # into a temporary output tape
    output_tape = []
 
    i = j = 0
 
    # Merge all the tapes left
    while i < len(tape1) and j < len(tape2):
 
        if tape1[i] < tape2[j]:
            output_tape.append(tape1[i])
            i += 1
        else:
            output_tape.append(tape2[j])
            j += 1
    output_tape.extend(tape1[i:])
    output_tape.extend(tape2[j:])
    return output_tape
 
# Driver code
 
 
# Input
data = [5, 2, 4, 6, 1, 3]
 
# Function call
sorted_data = balanced_merge_sort(data, 3)
 
print(sorted_data)


C#




// C# code for the above approach
using System;
using System.Collections.Generic;
 
public class GFG {
 
  public static List<int> merge(List<int> tape1,
                                List<int> tape2)
  {
    if (tape2.Count == 0) {
      return tape1;
    }
 
    // Merge the data from tape1 and tape2
    // into a temporary output tape
    List<int> outputTape = new List<int>();
 
    int i = 0, j = 0;
 
    // Merge all the tapes left
    while (i < tape1.Count && j < tape2.Count) {
      if (tape1[i] < tape2[j]) {
        outputTape.Add(tape1[i]);
        i++;
      }
      else {
        outputTape.Add(tape2[j]);
        j++;
      }
    }
    outputTape.AddRange(
      tape1.GetRange(i, tape1.Count - i));
    outputTape.AddRange(
      tape2.GetRange(j, tape2.Count - j));
    return outputTape;
  }
 
  // Function for Balanced merge sort
  public static List<int>
    balancedMergeSort(List<int> data, int numTapes)
  {
    // Initialize the tapes
    List<List<int> > tapes = new List<List<int> >();
    for (int i = 0; i < numTapes; i++)
      tapes.Add(new List<int>());
    // Divide the data among the tapes
    for (int i = 0; i < data.Count; i++) {
      tapes[i % numTapes].Add(data[i]);
    }
 
    // Sort each tape
    for (int i = 0; i < numTapes; i++) {
      tapes[i].Sort();
    }
 
    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.Count > 1) {
      List<List<int> > newTapes
        = new List<List<int> >();
      for (int i = 0; i < tapes.Count; i += 2) {
        List<int> tape1 = tapes[i];
        List<int> tape2 = (i + 1 < tapes.Count)
          ? tapes[i + 1]
          : new List<int>();
        newTapes.Add(merge(tape1, tape2));
      }
      tapes = newTapes;
    }
 
    // Return the final merged tape
    return tapes[0];
  }
 
  static public void Main()
  {
 
    // Input
    List<int> data = new List<int>{ 5, 2, 4, 6, 1, 3 };
 
    // Function call
    List<int> sortedData = balancedMergeSort(data, 3);
 
    for (int i = 0; i < sortedData.Count; i++) {
      Console.Write(sortedData[i] + " ");
    }
    Console.WriteLine();
  }
}
 
// This code is contributed by lokeshmvs21.


Javascript




// JavaScript code for the above approach
 
// Function to merge two sorted tapes
function merge(tape1, tape2) {
    // If one of the tapes is empty,
    // return the other tape
    if (tape2.length == 0) {
        return tape1;
    }
 
    // Merge the data from tape1 and tape2
    // into a temporary output tape
    let output_tape = [];
 
    let i = 0, j = 0;
 
    // Merge all the tapes left
    while (i < tape1.length && j < tape2.length) {
        if (tape1[i] < tape2[j]) {
            output_tape.push(tape1[i]);
            i++;
        } else {
            output_tape.push(tape2[j]);
            j++;
        }
    }
    output_tape.push(...tape1.slice(i));
    output_tape.push(...tape2.slice(j));
    return output_tape;
}
 
// Function for Balanced merge sort
function balanced_merge_sort(data, num_tapes) {
    // Initialize the tapes
    let tapes = new Array(num_tapes);
    for (let i=0; i<num_tapes; i++) {
        tapes[i] = [];
    }
 
    // Divide the data among the tapes
    for (let i = 0; i < data.length; i++) {
        tapes[i % num_tapes].push(data[i]);
    }
 
    // Sort each tape
    for (let i = 0; i < num_tapes; i++) {
        tapes[i].sort(function(a, b) {
            return a - b;
        });
    }
 
    // Repeatedly make passes over the tapes,
    // merging pairs of tapes at a time
    while (tapes.length > 1) {
        let new_tapes = [];
        for (let i = 0; i < tapes.length; i += 2) {
            let tape1 = tapes[i];
            let tape2 = (i + 1 < tapes.length) ? tapes[i + 1] : [];
            new_tapes.push(merge(tape1, tape2));
        }
        tapes = new_tapes;
    }
 
    // Return the final merged tape
    return tapes[0];
}
 
// Driver code
let data = [5, 2, 4, 6, 1, 3]; // Input
let sorted_data = balanced_merge_sort(data, 3);
 
console.log(sorted_data); // Output


Output

[1, 2, 3, 4, 5, 6]

Time Complexity:  O(n log n)
Auxiliary Space: O(n)

Advantages of Balanced Merge:

  1. Can take advantage of parallelism to speed up the sorting process
  2. The time complexity of O(n log n), which is generally considered to be efficient for large amounts of data
  3. Can handle large amounts of data

Disadvantages of Balanced Merge:

  1. Requires additional space to store the data on the tapes and the temporary output tapes during the merge process
  2. May not be the most efficient choice for smaller datasets

Benefits of using this over other sorting techniques:

  • Good choice for sorting large amounts of data when multiple tapes (or other forms of sequential storage) are available
  • Can be used to take advantage of parallelism to speed up the sorting process
  • Maybe a good choice when you need to sort data in an external storage device (such as a tape drive or hard drive)


Last Updated : 01 Feb, 2023
Like Article
Save Article
Previous
Next
Share your thoughts in the comments
Similar Reads