Open In App

Why to use fgets() over scanf() in C?

Improve
Improve
Like Article
Like
Save
Share
Report

Any time you use an *f function, whether it be printf, scanf, or their derivatives (fprintf, fscanf, etc…), you are doing more things than you might realize. Not only are you reading (or writing) something, but-and here’s the problem- you are interpreting it. The format string can be thought of as kind of an ‘eval’ function like you would see if you program in Lisp. So the problem with simply reading input from the user and echoing it back out is that a malevolent actor can simply insert a function pointer or executable code and voila, you are no longer in control. 
Advantage of using scanf(): The user doesn’t need to know the size of the stack, which is the starter code is one-hundred bytes. That’s kind of good, although anyone can just sit there trying longer and longer input strings until BufferOverflow error occur. Ideally, you would just write a script that automatically tries increasing string sizes and reads the exit code of the program. Once it detects an error, you simply work your way back to a close estimate of the stack. Any modern operating system using memory address randomization to make the whole process of hijacking an application harder, but it’s by no means impossible. 
fgets() over scanf(): fgets function is short for file-get-string. Remember that files can be pretty much anything on *nix systems (sockets, streams, or actual files), so we can use it to read from standard input, which again, is also technically a file. This also makes our program more robust, because as mentioned in the source, simply adjust the code to read from the command line, a file, or stdin, without much trouble. Crucially, the fgets function also allows us to specify a specific buffer length, thereby disallowing any buffer overflow attacks. If you look at the full code below, you’ll notice that a default buffer size has been defined as a macro. Remember that C can’t use ‘const int’ variables to initialize an array properly. You can hack it using variable-length arrays (VLA’s), but it’s not ideal, and I strongly recommend against it. So while in C++ we would normally use literally anything else, here we do use preprocessor macros, but keep in mind that C and C++ have vastly different capabilities when it comes to their static type checking, namely that C++ is good over C. Hence fgets() method can actually Handle Errors during I/O. So the code to actually read the user input is as follows: 

C




char* inputBuffer = malloc(sizeof(char) * DEFAULT_BUFFER_SIZE);
memset(inputBuffer, NUL, DEFAULT_BUFFER_SIZE);
 
char* result = NULL;
 
while (result == NULL) {
    result = fgets(inputBuffer, DEFAULT_BUFFER_SIZE, stdin);
 
    if (inputBuffer[strlen(inputBuffer) - 1] != '\n') {
        ErrorInputStringTooLong();
 
        // Since 'result' is the canary
        // we are using to notify of a failure
        // in execution, set it to NULL, to indicate an error.
        // This is a useful value because
        // if for some reason the fgets f/nction were
        // to fail, the return value would also be NULL.
 
        result = NULL;
    }
}


As expected, we dynamically allocate a buffer of predetermined size. Dynamically allocating it as opposed to stack-allocating it gives us another layer of protection against stack smashing. Note: We are explicitly zeroing out the memory in the inputBuffer. C does nothing for you, and malloc is no exception. The call to malloc returns a pointer to the first memory cell of the requested memory, but the values in every single one of those cells are unchanged from before the call. For all intents and purposes, they are garbage, so we zero them out. Note also that by zeroing out, we are automatically giving ourselves the null terminator, although the fgets function does actually append a null to the end of the input string, provided there is enough room in the buffer. 
When the user inputs too long string: You’ll notice that I check to make sure the last read value was not a new line. If that is true, this means that the user passed in an input string that was too long. To fix this, we need to set our ‘result’ variable to NULL so we cycle through the loop again, but we also need to clear out the input buffer. Otherwise, the program will simply read from the old input, which has not yet been used, rather than prompting the user for additional input. To handle this, I supply two additional functions. 

C




static inline void ErrorInputStringTooLong()
{
 
    // NOTE: Print to stderr, not to stdout.
    fprintf(stderr, "[ERROR]: The input was too long, please try again.\n");
 
    // Having notified the user,
    // clear the input buffer and return.
    ClearInputBuffer();
}
 
static inline void ClearInputBuffer()
{
 
    // This variable is merely here to munch the garbage out of the input
    // buffer. As long as the input buffer's current character is not either
    // a new line character or the end of input, keep reading characters. This
    // seems to be the only portable way of clearing the input buffer.
    char c = NULL;
 
    while ((c = getchar()) != '\n' && c != EOF) {
        // Do nothing until input buffer fully flushed.
    }
}


Note: Usually, to clear the input buffer, one would just call fseek(stdin, 0, SEEK_END); but it doesn’t seem to work on every platform. Hence, the above-described method is used. Note: if fgets returns an error, you should call ferror() to figure out what went wrong.



Last Updated : 22 Feb, 2023
Like Article
Save Article
Previous
Next
Share your thoughts in the comments
Similar Reads