Why is it faster to process a sorted array than an unsorted array?

Last Updated : 19 Sep, 2023

The C++, Java, Python, C# and JavaScript programs below illustrate that sorting the data can make the same loop run faster than it does on unsorted data. Let's try out these sample programs to understand the problem statement better.

Implementation:

CPP

// CPP program to demonstrate processing 
// time of sorted and unsorted array 
#include <iostream> 
#include <algorithm> 
#include <cstdlib>   // rand() 
#include <ctime>     // clock(), CLOCKS_PER_SEC 
using namespace std; 
  
const int N = 100001; 
  
int main() 
{ 
    int arr[N]; 
  
    // Assign random values to array 
    for (int i=0; i<N; i++) 
        arr[i] = rand()%N; 
  
    // for loop for unsorted array 
    int count = 0; 
    double start = clock(); 
    for (int i=0; i<N; i++) 
        if (arr[i] < N/2) 
            count++; 
  
    double end = clock(); 
    cout << "Time for unsorted array :: "
        << ((end - start)/CLOCKS_PER_SEC) 
        << endl; 
    sort(arr, arr+N); 
  
    // for loop for sorted array 
    count = 0; 
    start = clock(); 
  
    for (int i=0; i<N; i++) 
        if (arr[i] < N/2) 
            count++; 
  
    end = clock(); 
    cout << "Time for sorted array :: "
        << ((end - start)/CLOCKS_PER_SEC) 
        << endl; 
  
    return 0; 
} 

Java

// Java implementation for the above approach. 
  
import java.util.Arrays; 
  
public class Main { 
    static final int N = 100001; 
  
    public static void main(String[] args) { 
        int[] arr = new int[N]; 
  
        // Assign random values to array 
        for (int i = 0; i < N; i++) 
            arr[i] = (int)(Math.random() * N); 
  
        // for loop for unsorted array 
        int count = 0; 
        // nanoTime() is used because the loop may finish in under a millisecond 
        long start = System.nanoTime(); 
        for (int i = 0; i < N; i++) 
            if (arr[i] < N/2) 
                count++; 
  
        long end = System.nanoTime(); 
        System.out.println("Time for unsorted array :: " + (end - start) / 1e9); 
  
        Arrays.sort(arr); 
  
        // for loop for sorted array 
        count = 0; 
        start = System.nanoTime(); 
        for (int i = 0; i < N; i++) 
            if (arr[i] < N/2) 
                count++; 
  
        end = System.nanoTime(); 
        System.out.println("Time for sorted array :: " + (end - start) / 1e9); 
    } 
} 
  
// contributed by Rishabh

Python3

import random 
import time 
  
N = 100001
  
# Assign random values to array 
# randrange(N) yields 0..N-1, matching the other implementations 
arr = [random.randrange(N) for i in range(N)] 
  
# for loop for unsorted array 
count = 0
start = time.time() 
for i in range(N): 
    if arr[i] < N/2: 
        count += 1
  
end = time.time() 
print("Time for unsorted array ::", end - start) 
  
arr.sort() 
  
# for loop for sorted array 
count = 0
start = time.time() 
for i in range(N): 
    if arr[i] < N/2: 
        count += 1
  
end = time.time() 
print("Time for sorted array ::", end - start) 

C#

using System; 
  
namespace Demo 
{ 
    class Program 
    { 
        const int N = 100001; 
  
        static void Main(string[] args) 
        { 
            int[] arr = new int[N]; 
  
            // Assign random values to array 
            Random rand = new Random(); 
            for (int i = 0; i < N; i++) 
            { 
                arr[i] = rand.Next(N); 
            } 
  
            // for loop for unsorted array 
            int count = 0; 
            // Stopwatch offers finer resolution than DateTime.Now for short intervals 
            var timer = System.Diagnostics.Stopwatch.StartNew(); 
            for (int i = 0; i < N; i++) 
            { 
                if (arr[i] < N / 2) 
                { 
                    count++; 
                } 
            } 
            timer.Stop(); 
            Console.WriteLine("Time for unsorted array :: " + timer.Elapsed.TotalSeconds); 
  
            Array.Sort(arr); 
  
            // for loop for sorted array 
            count = 0; 
            timer.Restart(); 
            for (int i = 0; i < N; i++) 
            { 
                if (arr[i] < N / 2) 
                { 
                    count++; 
                } 
            } 
            timer.Stop(); 
            Console.WriteLine("Time for sorted array :: " + timer.Elapsed.TotalSeconds); 
  
            Console.ReadKey(); 
        } 
    } 
}

Javascript

const N = 100001; 
const arr = []; 
  
// Assign random values to array 
for (let i = 0; i < N; i++) { 
    arr[i] = Math.floor(Math.random() * N); 
} 
  
// for loop for unsorted array 
let count = 0; 
let start = new Date().getTime(); 
for (let i = 0; i < N; i++) { 
  if (arr[i] < N/2) { 
      count++; 
  } 
} 
  
let end = new Date().getTime(); 
console.log("Time for unsorted array :: " + (end - start) / 1000.0); 
  
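// A numeric comparator is required: the default sort() compares elements as strings 
arr.sort((a, b) => a - b); 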
  
// for loop for sorted array 
count = 0; 
start = new Date().getTime(); 
for (let i = 0; i < N; i++) { 
  if (arr[i] < N/2) { 
      count++; 
  } 
} 
  
end = new Date().getTime(); 
console.log("Time for sorted array :: " + (end - start) / 1000.0); 
  
// This code is contributed by shivhack999

Output

Time for unsorted array :: 0.000844
Time for sorted array :: 0.00023

Observe that processing the sorted array takes less time than processing the unsorted array, even though both loops execute the same comparison over the same values. The reason for this difference is branch prediction.

What is branch prediction?
In computer architecture, branch prediction means guessing whether a conditional branch (jump) in the instruction flow of a program will be taken or not. Pipelined processors generally do branch prediction in some form, because they must choose the address of the next instruction to fetch before the current branch instruction has finished executing.

How does branch prediction apply to the above case?
The if condition checks arr[i] < N/2, which is arr[i] < 50000 here. In the sorted array, the condition is true for every element below 50000 and false for every element from 50000 onwards. The outcome changes only once, so the CPU's branch predictor guesses correctly almost every time. Note that this prediction is done by the processor hardware at run time, not by the compiler, and the condition is still evaluated for every element.

Case 1 : Sorted array

    T = if condition true
    F = if condition false
    arr[] = {0,1,2,3,4,5,6, .... , 49999,50000,50001, ... , 100000}
            {T,T,T,T,T,T,T, .... ,   T,    F,    F,   ... ,    F  }

The branch is very easy to predict in the sorted array, because the sequence of outcomes is TTTTTTTTTTTT………FFFFFFFFFFFFF

Case 2 : Unsorted array

    T = if condition true
    F = if condition false
    arr[] = {5,0,50000,70000,17,13, ... , 3,91000,10}
            {T,T,  F,    F,  T, T,  ... , T,  F,   T}

Here it is very difficult to predict whether the if condition will be true or false, so the predictor guesses wrong often and branch prediction gives little benefit.
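One way to see the difference between the two cases is to count how often the outcome of the comparison flips from one element to the next. The sketch below is illustrative only: `count_flips` is a hypothetical helper, and the seed is an arbitrary assumption used just to make runs repeatable.

```python
import random


def count_flips(values, threshold):
    """Count how often (value < threshold) changes between neighbours."""
    flips = 0
    for prev, cur in zip(values, values[1:]):
        if (prev < threshold) != (cur < threshold):
            flips += 1
    return flips


N = 100001
random.seed(1)  # arbitrary seed (assumption), only for repeatable runs
data = [random.randrange(N) for _ in range(N)]

unsorted_flips = count_flips(data, N // 2)
sorted_flips = count_flips(sorted(data), N // 2)

print("Outcome flips in unsorted array ::", unsorted_flips)
print("Outcome flips in sorted array ::", sorted_flips)
```

A sorted array can flip at most once (from T to F), while random data is expected to flip on roughly half of the steps.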

Branch prediction relies on history: the predictor looks at how the branch behaved in its previous executions and guesses that the same pattern will continue. If the guess is correct, the CPU continues executing without delay. If the guess is wrong, the CPU must flush the pipeline, discard the instructions it executed speculatively, and resume from the correct path after the branch.
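To make the history idea concrete, here is a minimal simulation of a textbook 2-bit saturating counter, one of the simplest history-based predictors. It is a sketch under stated assumptions, not a model of any particular CPU: real predictors are far more elaborate, and `count_mispredictions`, the starting state and the seed are all chosen here for illustration.

```python
import random


def count_mispredictions(outcomes):
    """Simulate one 2-bit saturating counter over a sequence of outcomes.

    States 0 and 1 predict "not taken"; states 2 and 3 predict "taken".
    """
    state = 2  # assumed starting state: weakly "taken"
    misses = 0
    for taken in outcomes:
        predicted_taken = state >= 2
        if predicted_taken != taken:
            misses += 1
        # Move the counter one step towards the actual outcome.
        if taken:
            state = min(state + 1, 3)
        else:
            state = max(state - 1, 0)
    return misses


N = 100001
random.seed(1)  # arbitrary seed (assumption), only for repeatable runs
data = [random.randrange(N) for _ in range(N)]

unsorted_misses = count_mispredictions([x < N // 2 for x in data])
sorted_misses = count_mispredictions([x < N // 2 for x in sorted(data)])

print("Mispredictions for unsorted array ::", unsorted_misses)
print("Mispredictions for sorted array ::", sorted_misses)
```

In this model the sorted array costs only a couple of mispredictions around the single T-to-F transition, whereas random data is mispredicted on a large fraction of the iterations, and each miss corresponds to a pipeline flush on real hardware.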

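If the data cannot be made predictable and the compiler does not remove the branch on its own, the programmer can restructure the code to improve performance, for example by replacing the conditional with arithmetic so that there is no branch to mispredict.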
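The shape of such a rewrite can be sketched as follows. The function names are hypothetical, and the example only shows the transformation itself: in CPython the interpreter overhead dominates, so any real speedup should be expected (and measured) in compiled languages such as C++, where the comparison result can be added directly without a conditional jump.

```python
def count_with_branch(values, threshold):
    """Original form: the if statement is a conditional branch."""
    count = 0
    for v in values:
        if v < threshold:
            count += 1
    return count


def count_branchless(values, threshold):
    """Arithmetic form: the comparison result (0 or 1) is added directly."""
    count = 0
    for v in values:
        count += int(v < threshold)  # True -> 1, False -> 0
    return count


sample = [5, 0, 50000, 70000, 17, 13, 3, 91000, 10]
print(count_with_branch(sample, 50000))
print(count_branchless(sample, 50000))
```

Both functions return the same count for any input; only the way the comparison result is consumed differs.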
