Related Articles

# Speed up Code executions with help of Pragma in C/C++

• Difficulty Level : Medium
• Last Updated : 11 Jun, 2021

The primary goal of a compiler is to reduce the cost of compilation and to make debugging produce the expected results. Not all optimizations are controlled directly by a flag, sometimes we need to explicitly declare flags to produce optimizations. By default optimizations are suppressed. To use suppressed optimizations we will use pragmas
Example for unoptimized program: Let us consider an example to calculate Prime Numbers up to 10000000.
Below is the code with no optimization:

## CPP

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with NO optimization` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``          ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``         ``<< ``" secs"``;` `    ``return` `0;``}`
Output:
`Execution time: 0.592183 secs`

Following are the Optimization:

1. O1: Optimizing compilation at O1 includes more time and memory to break down larger functions. The compiler makes an attempt to reduce both code and execution time. At O1 hardly any optimizations produce great results, but O1 is a setback for an attempt for better optimizations.
Below is the implementation of previous program with O1 optimization:

## CPP

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with O1 optimization` `// To see the working of controlled``// optimization "O1"``#pragma GCC optimize("O1")` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``          ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``         ``<< ``" secs."``;` `    ``return` `0;``}`
1.
Output:
`Execution time: 0.384945 secs.`

1.
2. O2: Optimizing compilation at O2 optimize to a greater extent. As compared to O1, this option increases both compilation time and the performance of the generated code. O2 turns on all optimization flags specified by O1.
Below is the implementation of previous program with O2 optimization:

## CPP

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with O2 optimization` `// To see the working of controlled``// optimization "O2"``#pragma GCC optimize("O2")` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``          ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``         ``<< ``" secs."``;` `    ``return` `0;``}`
1.
Output:

`Execution time: 0.288337 secs.`

1.
2. O3: All the optimizations at level O2 are specified by O3 and a list of other flags are also enabled. Few of the flags which are included in O3 are floop-interchange -floop-unroll-jam and -fpeel-loops.
Below is the implementation of previous program with O3 optimization:

## CPP

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with O3 optimization` `// To see the working of controlled``// optimization "O3"``#pragma GCC optimize("O3")` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``          ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``         ``<< ``" secs."``;` `    ``return` `0;``}`
1.
Output:
`Execution time: 0.580154 secs.`

1.
2. Os: It is optimize for size. Os enables all O2 optimizations except the ones that have increased code size. It also enables -finline-functions, causes the compiler to tune for code size rather than execution speed and performs further optimizations designed to reduce code size.
Below is the implementation of previous program with Os optimization:

## CPP

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with Os optimization` `// To see the working of controlled``// optimization "Os"``#pragma GCC optimize("Os")` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``          ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``         ``<< ``" secs."``;` `    ``return` `0;``}`
1.
Output:
`Execution time: 0.317845 secs.`

1.
2. Ofast: Ofast enables all O3 optimizations. It also has the number of enabled flags that produce super optimized results. Ofast combines optimizations produced by each of the above O levels. This optimization is usually preferred by a lot of competitive programmers and is hence recommended. In case more than one optimizations are declared the last declared one gets enabled.
Below is the implementation of previous program with Ofast optimization:

## CPP

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with Ofast optimization` `// To see the working of controlled``// optimization "Ofast"``#pragma GCC optimize("Ofast")` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``          ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``         ``<< ``" secs."``;` `    ``return` `0;``}`
1.
Output:
`Execution time: 0.303287 secs.`

1.

To further achieve optimizations at architecture level we can use targets with pragmas. These optimizations can produce surprising results. However it is recommended to use target with any of the optimizations specified above.
Below is the implementation of previous program with Target

## CPP14

 `// C++ program to calculate the Prime``// Numbers upto 10000000 using Sieve``// of Eratosthenes with Ofast optimization along with target optimizations` `// To see the working of controlled``// optimization "Ofast"``#pragma GCC optimize("Ofast")``#pragma GCC target("avx,avx2,fma")` `#include ``#include ``#include ``#define N 10000005``using` `namespace` `std;` `// Boolean array for Prime Number``vector<``bool``> prime(N, ``true``);` `// Sieve implemented to find Prime``// Number``void` `sieveOfEratosthenes()``{``    ``for` `(``int` `i = 2; i <= ``sqrt``(N); ++i) {``        ``if` `(prime[i]) {``            ``for` `(``int` `j = i * i; j <= N; j += i) {``                ``prime[j] = ``false``;``            ``}``        ``}``    ``}``}` `// Driver Code``int` `main()``{``    ``// Initialise clock to calculate``    ``// time required to execute without``    ``// optimization``    ``clock_t` `start, end;` `    ``// Start clock``    ``start = ``clock``();` `    ``// Function call to find Prime Numbers``    ``sieveOfEratosthenes();` `    ``// End clock``    ``end = ``clock``();` `    ``// Calculate the time difference``    ``double` `time_taken``        ``= ``double``(end - start)``        ``/ ``double``(CLOCKS_PER_SEC);` `    ``// Print the Calculated execution time``    ``cout << ``"Execution time: "` `<< time_taken``        ``<< ``" secs."``;` `    ``return` `0;``}`
Output:
`Execution time: 0.292147 secs.` My Personal Notes arrow_drop_up