When we run a program, we sometimes see absurd results instead of the expected output. In C and C++, undefined behavior refers to operations for which the language standard imposes no requirements on what the program does. A program that contains undefined behavior may fail to compile, crash, produce incorrect results, or fortuitously do exactly what the programmer intended. Because the standard allows any outcome, the result of executing such a program is unpredictable.
For a C programmer, understanding undefined behavior is very important for writing correct and efficient code, especially when the C code is embedded in a larger system design.
Examples:
Division By Zero
int val = 5;
return val / 0; // undefined behavior
Memory accesses outside of array bounds
int arr[4] = {0, 1, 2, 3};
return arr[5]; // undefined behavior for indexing out of bounds
Signed integer overflow
int x = INT_MAX;
printf("%d", x + 1); // undefined behavior
Null pointer dereference
int* val = NULL;
int x = *val; // undefined behavior for dereferencing a null pointer
Modification of string literal
char* s = "geeksforgeeks";
s[0] = 'e'; // undefined behavior
Accessing a NULL Pointer, etc.
int* ptr = NULL;
printf("%d", *ptr); // undefined behavior for accessing NULL Pointer
Compilers may diagnose simple cases of these errors, but they are not required to diagnose undefined behavior and often do not.
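Since the compiler may stay silent, the arithmetic cases above (division by zero and signed overflow) can instead be ruled out by checking the operands before the operation runs. The following is a minimal sketch; `checked_div` and `checked_add` are hypothetical helper names, not standard library functions.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: stores a / b in *out and returns true,
// or returns false when the division would be undefined.
bool checked_div(int a, int b, int *out)
{
    assert(out != NULL);
    if (b == 0)
        return false; // division by zero
    if (a == INT_MIN && b == -1)
        return false; // quotient does not fit in an int
    *out = a / b;
    return true;
}

// Hypothetical helper: stores a + b in *out and returns true,
// or returns false when the sum would overflow an int.
bool checked_add(int a, int b, int *out)
{
    assert(out != NULL);
    if (b > 0 && a > INT_MAX - b)
        return false; // would exceed INT_MAX
    if (b < 0 && a < INT_MIN - b)
        return false; // would fall below INT_MIN
    *out = a + b;
    return true;
}
```

A caller tests the return value and only uses the result when it is true, for example `if (checked_add(x, 1, &sum)) printf("%d", sum);`.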
Following are some C/C++ programs that showcase undefined behavior:
Program 1:
C++
#include <iostream>
using namespace std;
int main()
{
    int x = 25, y = 0;
    int z = x / y; // undefined behavior: division by zero
    cout << z;
    return 0;
}
C
#include <stdio.h>
int main()
{
    int x = 25, y = 0;
    int z = x / y; // undefined behavior: division by zero
    printf("%d", z);
    return 0;
}
Program 2:
C++
#include <iostream>
using namespace std;
int main()
{
    bool val; // never initialized
    if (val) // undefined behavior: reads an indeterminate value
        cout << "TRUE";
    else
        cout << "FALSE";
}
C
#include <stdio.h>
int main(void)
{
    typedef enum { False, True } bool;
    bool val; // never initialized
    if (val) // undefined behavior: reads an indeterminate value
        printf("TRUE");
    else
        printf("FALSE");
}
Program 3:
C++
#include <iostream>
using namespace std;
int main()
{
    int* ptr = NULL;
    cout << *ptr; // undefined behavior: null pointer dereference
    return 0;
}
C
#include <stdio.h>
int main()
{
    int* ptr = NULL;
    printf("%d", *ptr); // undefined behavior: null pointer dereference
    return 0;
}
Program 4:
C++
#include <iostream>
using namespace std;
int main()
{
    int arr[5]; // elements are never initialized
    for (int i = 0; i <= 5; i++)
        cout << arr[i]; // undefined behavior: i == 5 is out of bounds
    return 0;
}
C
#include <stdio.h>
int main()
{
    int arr[5]; // elements are never initialized
    for (int i = 0; i <= 5; i++)
        printf("%d ", arr[i]); // undefined behavior: i == 5 is out of bounds
    return 0;
}
Program 5:
C++
#include <iostream>
#include <climits>
using namespace std;
int main()
{
    int x = INT_MAX;
    cout << x + 1; // undefined behavior: signed integer overflow
    return 0;
}
C
#include <stdio.h>
#include <limits.h>
int main()
{
    int x = INT_MAX;
    printf("%d", x + 1); // undefined behavior: signed integer overflow
    return 0;
}
Program 6:
C++
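#include <iostream>
using namespace std;
int main()
{
    // Note: this conversion is ill-formed since C++11; many compilers only warn
    char* s = "geeksforgeeks";
    s[0] = 'e'; // undefined behavior: modifies a string literal
    return 0;
}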
C
#include <stdio.h>
int main()
{
    char* s = "geeksforgeeks";
    s[0] = 'e'; // undefined behavior: modifies a string literal
    return 0;
}
Program 7:
C++
#include <iostream>
using namespace std;
int main()
{
    int i = 8;
    int p = i++ * i++; // undefined behavior: i is modified twice, unsequenced
    cout << p;
}
C
#include <stdio.h>
int main()
{
    int i = 8;
    int p = i++ * i++; // undefined behavior: i is modified twice, unsequenced
    printf("%d\n", p);
}
Explanation: The expression i++ * i++ modifies i twice without an intervening sequence point, so its behavior is undefined. The program produces 72 as output in most compilers, but building software on this assumption is not a good idea.
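One way to get a defined result is to give each increment its own statement. A minimal sketch follows; `multiply_and_advance` is a hypothetical name, and the left-to-right order it encodes is an assumption about what the author of `i++ * i++` intended.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper: multiplies *i by its successor and advances *i by two,
// with every read and write of *i in a separate, fully sequenced statement.
int multiply_and_advance(int *i)
{
    assert(i != NULL);
    int first = *i;   // value before the first increment
    *i += 1;
    int second = *i;  // value before the second increment
    *i += 1;
    return first * second;
}
```

Starting from `i = 8`, this returns 8 * 9 = 72 and leaves `i` at 10 on every conforming compiler, because no statement modifies `i` more than once.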
The output of all of the above programs is unpredictable (or undefined). Compilers implementing the C and C++ standards are free to do anything in these cases, because the standards leave the behavior undefined.
Languages like Java trap errors as soon as they are found, but C and C++ in some cases keep executing the code in a faulty manner, which may produce unpredictable results. The program can crash with any type of error message, or it can silently corrupt data, which is a grave issue to deal with.
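C performs no such trapping on its own, but a program can add an explicit check where it matters. The sketch below reports an out-of-range index to the caller instead of reading past the array; `checked_get` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical helper: copies arr[index] into *out and returns true,
// or returns false (leaving *out untouched) when index is out of range.
bool checked_get(const int *arr, size_t len, size_t index, int *out)
{
    assert(arr != NULL && out != NULL);
    if (index >= len)
        return false; // would read outside the array
    *out = arr[index];
    return true;
}
```

The check costs one comparison per access, which is the kind of overhead the language leaves for the programmer to opt into.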
Importance of knowing about Undefined Behaviour: If a user starts learning in a C/C++ environment without a clear understanding of undefined behavior, it can cause plenty of problems later. In particular, when debugging someone else’s code, it can be difficult to trace a fault back to the undefined behavior at its root.
Risks and Disadvantages of Undefined Behaviour
- Programmers sometimes rely on how a particular implementation (or compiler) handles undefined behavior, which may cause problems when the compiler is changed or upgraded. For example, Program 7 produces 72 as output in most compilers, but building software on this assumption is not a good idea.
- Undefined behavior may also cause security vulnerabilities, especially when array accesses are not bounds-checked, which opens the door to buffer overflow attacks.
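As a sketch of how the buffer overflow risk can be reduced, the function below refuses to copy a string that does not fit in the destination. `copy_if_fits` is a hypothetical helper name, and rejecting oversized input (rather than truncating it) is an assumption made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copies src into dst only if the whole string,
// including the terminating '\0', fits in dst_size bytes.
bool copy_if_fits(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL);
    size_t needed = strlen(src) + 1; // characters plus the terminator
    if (needed > dst_size)
        return false;                // copying would overflow dst
    memcpy(dst, src, needed);
    return true;
}
```

When the function returns false, the destination buffer is left unchanged, so the caller can report the error without any memory having been overwritten.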
Advantages of Undefined Behaviour
- C and C++ have undefined behavior because it allows compilers to avoid many checks. For example, high-performance array code does not need to check bounds on every access, which avoids the need for a complex optimization pass to move such checks out of loops. Compilers can also take advantage of the undefined nature of signed overflow when optimizing tight loops, which are reported to speed up by thirty to fifty percent as a result.
- Another advantage of undefined signed overflow is that a variable’s value can be stored and manipulated in a processor register that is larger than the variable’s type in the source code.
- It also enables compile-time checks that would not be possible if signed overflow were defined to wrap around, because compilers and analysis tools can then treat any overflow as an error.
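The loop below is a sketch of the kind of code these points are about. Because overflow of the signed counter `i` is undefined, a compiler may assume `i` never wraps, which can let it keep `i` in a wider register and compute the trip count ahead of time. Whether a given compiler actually does so is not guaranteed. `sum_every_other` is a hypothetical name.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

// Hypothetical helper: sums the elements at even indices of arr.
long sum_every_other(const int *arr, int len)
{
    // The precondition keeps i += 2 from ever overflowing at run time;
    // the compiler, for its part, is allowed to assume it never does.
    assert(arr != NULL && len >= 0 && len <= INT_MAX - 1);
    long total = 0;
    for (int i = 0; i < len; i += 2)
        total += arr[i];
    return total;
}
```

If `i` were an `unsigned int`, wrap-around would be defined behavior and the compiler would have to preserve it, which can rule out some of these loop transformations.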
More Examples of undefined behavior
- Sequence Points in C | Set 1
- “delete this” in C++
- Passing NULL to printf in C
- Accessing array out of bounds in C/C++
- Use of realloc()
- Execution of printf with ++ operators
- Virtual destruction using shared_ptr in C++
- Virtual Destructor