Speed up code execution with the help of pragmas in C/C++

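Without any optimization option, the primary goal of a compiler is to reduce the cost of compilation and to make debugging produce the expected results. Optimizations are therefore suppressed by default and have to be requested explicitly, usually through command-line flags such as -O2, although not every optimization is controlled directly by a flag. When the command line is not under our control, GCC also lets us request optimizations from inside the source file with #pragma GCC optimize. This pragma is a GCC extension, so other compilers may not honor it.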
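
As a minimal sketch of how the pragma is scoped, assuming GCC: #pragma GCC optimize applies to the functions defined after it, and #pragma GCC push_options together with #pragma GCC pop_options can limit it to one region of a file. The function names sumFast and sumDefault are hypothetical and only illustrate the scoping; both compute the same result.

```cpp
// Sketch only: scoping "#pragma GCC optimize" to one
// function. The pragmas are GCC-specific; a compiler that
// does not know them is expected to ignore them, possibly
// with a warning.
#include <vector>

// Save the current options, then raise the level
#pragma GCC push_options
#pragma GCC optimize("O3")

// Hypothetical helper compiled with O3
long long sumFast(const std::vector<int>& v)
{
    long long total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}

// Restore the options that were active before
#pragma GCC pop_options

// Hypothetical helper compiled with whatever options
// were given on the command line
long long sumDefault(const std::vector<int>& v)
{
    long long total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}
```

The programs in this article take the simpler route of placing a single pragma at the top of the file, so that it covers every function defined in it.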

Example of an unoptimized program: Let us compute the prime numbers up to 10000000.

Below is the code with no optimization:



// C++ program to calculate the prime
// numbers up to 10000000 using the Sieve
// of Eratosthenes with NO optimization
  
#include <cmath>
#include <ctime>
#include <iostream>
#include <vector>
#define N 10000005
using namespace std;
  
// Boolean array for Prime Number
vector<bool> prime(N, true);
  
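// Sieve implemented to find prime
// numbers
void sieveOfEratosthenes()
{
    // 0 and 1 are not prime
    prime[0] = false;
    prime[1] = false;
    for (int i = 2; i <= sqrt(N); ++i) {
        if (prime[i]) {
            // Valid indices are 0 to N - 1,
            // so stop before N
            for (int j = i * i; j < N; j += i) {
                prime[j] = false;
            }
        }
    }
}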
  
// Driver Code
int main()
{
    // Initialise clock variables to
    // measure the execution time
    // without optimization
    clock_t start, end;
  
    // Start clock
    start = clock();
  
    // Function call to find Prime Numbers
    sieveOfEratosthenes();
  
    // End clock
    end = clock();
  
    // Calculate the time difference
    double time_taken
        = double(end - start)
          / double(CLOCKS_PER_SEC);
  
    // Print the Calculated execution time
    cout << "Execution time: " << time_taken
         << " secs";
  
    return 0;
}



Output:

Execution time: 0.592183 secs
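
A single measurement can vary noticeably from one run to the next, so the timings in this article are best read as indicative. As a hedged sketch (not the method used to produce the numbers shown here), one way to get steadier figures is to repeat the measurement and keep the fastest run. The helper name bestTimeSeconds and the use of std::chrono::steady_clock are assumptions made for illustration.

```cpp
// Sketch only: repeat a measurement and keep the best
// (smallest) wall-clock time. bestTimeSeconds is a
// hypothetical helper, not part of the original program.
#include <algorithm>
#include <chrono>
#include <limits>

template <typename Func>
double bestTimeSeconds(Func work, int runs)
{
    double best = std::numeric_limits<double>::infinity();
    for (int r = 0; r < runs; ++r) {
        auto start = std::chrono::steady_clock::now();
        work();
        auto end = std::chrono::steady_clock::now();
        std::chrono::duration<double> elapsed = end - start;
        best = std::min(best, elapsed.count());
    }
    // Stays at infinity if runs <= 0
    return best;
}
```

For the sieve, the vector would have to be reset before every run, for example by passing a lambda that calls prime.assign(N, true) and then sieveOfEratosthenes(); the reset is then included in the measured time.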

The following optimization levels are available:

  1. O1: Compiling with O1 takes somewhat more time, and considerably more memory for large functions, than compiling without optimization. The compiler tries to reduce both code size and execution time, without performing any optimizations that take a great deal of compilation time. O1 therefore rarely gives the largest gains, but it is the baseline on which the higher levels build.

    Below is the previous program with O1 optimization:

    // C++ program to calculate the prime
    // numbers up to 10000000 using the Sieve
    // of Eratosthenes with O1 optimization
      
    // To see the working of controlled
    // optimization "O1"
    #pragma GCC optimize("O1")
      
    #include <cmath>
    #include <ctime>
    #include <iostream>
    #include <vector>
    #define N 10000005
    using namespace std;
      
    // Boolean array for Prime Number
    vector<bool> prime(N, true);
      
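    // Sieve implemented to find prime
    // numbers
    void sieveOfEratosthenes()
    {
        // 0 and 1 are not prime
        prime[0] = false;
        prime[1] = false;
        for (int i = 2; i <= sqrt(N); ++i) {
            if (prime[i]) {
                // Valid indices are 0 to N - 1,
                // so stop before N
                for (int j = i * i; j < N; j += i) {
                    prime[j] = false;
                }
            }
        }
    }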
      
    // Driver Code
    int main()
    {
        // Initialise clock variables to
        // measure the execution time
        // with O1 optimization
        clock_t start, end;
      
        // Start clock
        start = clock();
      
        // Function call to find Prime Numbers
        sieveOfEratosthenes();
      
        // End clock
        end = clock();
      
        // Calculate the time difference
        double time_taken
            = double(end - start)
              / double(CLOCKS_PER_SEC);
      
        // Print the Calculated execution time
        cout << "Execution time: " << time_taken
             << " secs.";
      
        return 0;
    }

    Output:

    Execution time: 0.384945 secs.
    
  2. O2: O2 optimizes to a greater extent than O1. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. Compared to O1, this option increases both compilation time and the performance of the generated code. O2 turns on all optimization flags specified by O1.

    Below is the previous program with O2 optimization:

    // C++ program to calculate the prime
    // numbers up to 10000000 using the Sieve
    // of Eratosthenes with O2 optimization
      
    // To see the working of controlled
    // optimization "O2"
    #pragma GCC optimize("O2")
      
    #include <cmath>
    #include <ctime>
    #include <iostream>
    #include <vector>
    #define N 10000005
    using namespace std;
      
    // Boolean array for Prime Number
    vector<bool> prime(N, true);
      
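    // Sieve implemented to find prime
    // numbers
    void sieveOfEratosthenes()
    {
        // 0 and 1 are not prime
        prime[0] = false;
        prime[1] = false;
        for (int i = 2; i <= sqrt(N); ++i) {
            if (prime[i]) {
                // Valid indices are 0 to N - 1,
                // so stop before N
                for (int j = i * i; j < N; j += i) {
                    prime[j] = false;
                }
            }
        }
    }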
      
    // Driver Code
    int main()
    {
        // Initialise clock variables to
        // measure the execution time
        // with O2 optimization
        clock_t start, end;
      
        // Start clock
        start = clock();
      
        // Function call to find Prime Numbers
        sieveOfEratosthenes();
      
        // End clock
        end = clock();
      
        // Calculate the time difference
        double time_taken
            = double(end - start)
              / double(CLOCKS_PER_SEC);
      
        // Print the Calculated execution time
        cout << "Execution time: " << time_taken
             << " secs.";
      
        return 0;
    }

    Output:

    Execution time: 0.288337 secs.
    
  3. O3: O3 enables all the optimizations of O2 and turns on a number of additional flags, among them -floop-interchange, -floop-unroll-and-jam and -fpeel-loops. More aggressive optimization does not guarantee faster code: in the single run shown below, the O3 build is slower than the O2 build, so it is worth measuring rather than assuming.

    Below is the previous program with O3 optimization:

    // C++ program to calculate the prime
    // numbers up to 10000000 using the Sieve
    // of Eratosthenes with O3 optimization
      
    // To see the working of controlled
    // optimization "O3"
    #pragma GCC optimize("O3")
      
    #include <cmath>
    #include <ctime>
    #include <iostream>
    #include <vector>
    #define N 10000005
    using namespace std;
      
    // Boolean array for Prime Number
    vector<bool> prime(N, true);
      
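    // Sieve implemented to find prime
    // numbers
    void sieveOfEratosthenes()
    {
        // 0 and 1 are not prime
        prime[0] = false;
        prime[1] = false;
        for (int i = 2; i <= sqrt(N); ++i) {
            if (prime[i]) {
                // Valid indices are 0 to N - 1,
                // so stop before N
                for (int j = i * i; j < N; j += i) {
                    prime[j] = false;
                }
            }
        }
    }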
      
    // Driver Code
    int main()
    {
        // Initialise clock variables to
        // measure the execution time
        // with O3 optimization
        clock_t start, end;
      
        // Start clock
        start = clock();
      
        // Function call to find Prime Numbers
        sieveOfEratosthenes();
      
        // End clock
        end = clock();
      
        // Calculate the time difference
        double time_taken
            = double(end - start)
              / double(CLOCKS_PER_SEC);
      
        // Print the Calculated execution time
        cout << "Execution time: " << time_taken
             << " secs.";
      
        return 0;
    }

    Output:

    Execution time: 0.580154 secs.
    
  4. Os: Os optimizes for size. It enables all O2 optimizations except those that often increase code size. It also enables -finline-functions, causes the compiler to tune for code size rather than execution speed, and performs further optimizations designed to reduce code size.

    Below is the previous program with Os optimization:

    // C++ program to calculate the prime
    // numbers up to 10000000 using the Sieve
    // of Eratosthenes with Os optimization
      
    // To see the working of controlled
    // optimization "Os"
    #pragma GCC optimize("Os")
      
    #include <cmath>
    #include <ctime>
    #include <iostream>
    #include <vector>
    #define N 10000005
    using namespace std;
      
    // Boolean array for Prime Number
    vector<bool> prime(N, true);
      
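    // Sieve implemented to find prime
    // numbers
    void sieveOfEratosthenes()
    {
        // 0 and 1 are not prime
        prime[0] = false;
        prime[1] = false;
        for (int i = 2; i <= sqrt(N); ++i) {
            if (prime[i]) {
                // Valid indices are 0 to N - 1,
                // so stop before N
                for (int j = i * i; j < N; j += i) {
                    prime[j] = false;
                }
            }
        }
    }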
      
    // Driver Code
    int main()
    {
        // Initialise clock variables to
        // measure the execution time
        // with Os optimization
        clock_t start, end;
      
        // Start clock
        start = clock();
      
        // Function call to find Prime Numbers
        sieveOfEratosthenes();
      
        // End clock
        end = clock();
      
        // Calculate the time difference
        double time_taken
            = double(end - start)
              / double(CLOCKS_PER_SEC);
      
        // Print the Calculated execution time
        cout << "Execution time: " << time_taken
             << " secs.";
      
        return 0;
    }

    Output:

    Execution time: 0.317845 secs.
    
  5. Ofast: Ofast enables all O3 optimizations, and therefore those of O1 and O2 as well. It additionally enables optimizations that are not valid for all standard-compliant programs, most notably -ffast-math, which relaxes strict floating-point semantics. Ofast is popular among competitive programmers, but because it disregards strict standards compliance it should be used with care in code that depends on exact floating-point behavior. If more than one optimization level is declared, the last one declared takes effect.

    Below is the previous program with Ofast optimization:

    // C++ program to calculate the prime
    // numbers up to 10000000 using the Sieve
    // of Eratosthenes with Ofast optimization
      
    // To see the working of controlled
    // optimization "Ofast"
    #pragma GCC optimize("Ofast")
      
    #include <cmath>
    #include <ctime>
    #include <iostream>
    #include <vector>
    #define N 10000005
    using namespace std;
      
    // Boolean array for Prime Number
    vector<bool> prime(N, true);
      
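    // Sieve implemented to find prime
    // numbers
    void sieveOfEratosthenes()
    {
        // 0 and 1 are not prime
        prime[0] = false;
        prime[1] = false;
        for (int i = 2; i <= sqrt(N); ++i) {
            if (prime[i]) {
                // Valid indices are 0 to N - 1,
                // so stop before N
                for (int j = i * i; j < N; j += i) {
                    prime[j] = false;
                }
            }
        }
    }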
      
    // Driver Code
    int main()
    {
        // Initialise clock variables to
        // measure the execution time
        // with Ofast optimization
        clock_t start, end;
      
        // Start clock
        start = clock();
      
        // Function call to find Prime Numbers
        sieveOfEratosthenes();
      
        // End clock
        end = clock();
      
        // Calculate the time difference
        double time_taken
            = double(end - start)
              / double(CLOCKS_PER_SEC);
      
        // Print the Calculated execution time
        cout << "Execution time: " << time_taken
             << " secs.";
      
        return 0;
    }

    Output:

    Execution time: 0.303287 secs.
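
To check which optimization settings the compiler considers active, GCC predefines a few macros: __OPTIMIZE__ when optimizing, __OPTIMIZE_SIZE__ when optimizing for size, and __FAST_MATH__ when -ffast-math is in effect. The sketch below reports them. The function name optimizationSummary is hypothetical, and whether these macros follow a #pragma GCC optimize (rather than only the command line) is an assumption that should be verified on the compiler in use.

```cpp
// Sketch only: report optimization-related macros that
// GCC predefines. optimizationSummary is a hypothetical
// helper name, not part of the original programs.
#include <string>

std::string optimizationSummary()
{
    std::string summary;
#ifdef __OPTIMIZE__
    summary += "optimizing; ";
#else
    summary += "not optimizing; ";
#endif
#ifdef __OPTIMIZE_SIZE__
    summary += "tuned for size; ";
#else
    summary += "not tuned for size; ";
#endif
#ifdef __FAST_MATH__
    summary += "fast-math on";
#else
    summary += "fast-math off";
#endif
    return summary;
}
```

Printing the result from main next to the execution time makes it easier to tell which settings a given timing belongs to.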
    


