Any time you use an *f function, whether it be printf, scanf, or their derivatives (fprintf, fscanf, etc…), you are doing more things than you might realize. Not only are you reading (or writing) something, but-and here’s the problem- you are interpreting it. The format string can be thought of as kind of an ‘eval’ function like you would see if you program in Lisp. So the problem with simply reading input from the user and echoing it back out is that a malevolent actor can simply insert a function pointer or executable code and voila, you are no longer in control.
Advantage of using scanf():
The user doesn’t need to know the size of the stack, which is the starter code is one-hundred bytes. That’s kind of good, although anyone can just sit there trying longer and longer input strings until BufferOverflow error occur. Ideally, you would just write a script that automatically tries increasing string sizes and reads the exit code of the program. Once it detects an error, you simply work your way back to a close estimate of the stack. Any modern operating system using memory address randomization to make the whole process of hijacking an application harder, but it’s by no means impossible.
fgets() over scanf():
fgets function is short for file-get-string. Remember that files can be pretty much anything on *nix systems (sockets, streams, or actual files), so we can use it to read from standard input, which again, is also technically a file. This also makes our program more robust, because as mentioned in the source, simply adjust the code to read from the command line, a file, or stdin, without much trouble.
Crucially, the fgets function also allows us to specify a specific buffer length, thereby disallowing any buffer overflow attacks. If you look at the full code below, you’ll notice that a default buffer size has been defined as a macro. Remember that C can’t use ‘const int’ variables to initialize an array properly. You can hack it using variable-length arrays (VLA’s), but it’s not ideal, and I strongly recommend against it. So while in C++ we would normally use literally anything else, here we do use preprocessor macros, but keep in mind that C and C++ have vastly different capabilities when it comes to their static type checking, namely that C++ is good over C. Hence fgets() method can actually Handle Errors during I/O.
So the code to actually read the user input is as follows:
As expected, we dynamically allocate a buffer of predetermined size. Dynamically allocating it as opposed to stack-allocating it gives us another layer of protection against stack smashing.
Note: We are explicitly zeroing out the memory in the inputBuffer. C does nothing for you, and malloc is no exception. The call to malloc returns a pointer to the first memory cell of the requested memory, but the values in every single one of those cells are unchanged from before the call. For all intents and purposes, they are garbage, so we zero them out. Note also that by zeroing out, we are automatically giving ourselves the null terminator, although the fgets function does actually append a null to the end of the input string, provided there is enough room in the buffer.
When the user inputs too long string:
You’ll notice that I check to make sure the last read value was not a new line. If that is true, this means that the user passed in an input string that was too long. To fix this, we need to set our ‘result’ variable to NULL so we cycle through the loop again, but we also need to clear out the input buffer. Otherwise, the program will simply read from the old input, which has not yet been used, rather than prompting the user for additional input. To handle this, I supply two additional functions.
Note: Usually, to clear the input buffer, one would just call fseek(stdin, 0, SEEK_END); but it doesn’t seem to work on every platform. Hence, the above-described method is used.
Note: if fgets returns an error, you should call ferror() to figure out what went wrong.