The Pitfalls of Reading User Input in C: a Story About scanf and Stdin
Source: Dev.to
I recently had to write a piece of C code that reads input from stdin, ignores the newline, discards anything that exceeds the buffer, and repeats this in a loop.
Initial Attempt
#include <stdio.h>

void take_input(void)
{
    char buf[20];
    printf("> ");
    scanf("%19[^\n]", buf);
    printf("input=`%s`\n", buf);
}

int main(void)
{
    for (int i = 0; i < 5; ++i) {
        take_input();
    }
    return 0;
}
$ ./main
> hello world↵
input=`hello world`
> input=`hello world`
> input=`hello world`
> input=`hello world`
> input=`hello world`
$ █
It consumed the string once and printed the same value five times. Why?
Stack Behavior
Although char buf[20] is a local variable with automatic storage, every call to take_input() is made from the same place in main, so in practice the stack frame, and buf with it, lands at the exact same address each time.
/* ... */
printf("address=%p, input=`%s`\n", (void*)buf, buf);
/* ... */
$ ./main
> hello world
address=0x7ffcb96c49d0, input=`hello world`
> address=0x7ffcb96c49d0, input=`hello world`
> address=0x7ffcb96c49d0, input=`hello world`
...
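Because the buffer wasn’t initialized and the later scanf calls wrote nothing into it, the old content (“ghost data”) was still sitting there. The contents of an uninitialized array are indeterminate, so this output is an accident of the stack layout rather than guaranteed behavior. Initializing the buffer with zeroes fixes this specific issue: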
char buf[20] = {0};
Now the buffer resets each time, but scanf still behaves oddly:
$ ./main
> hello world
address=0x7ffea5c2d840, input=`hello world`
> address=0x7ffea5c2d840, input=``
> address=0x7ffea5c2d840, input=``
...
The Real Problem: %[^\n] Does Not Consume the Newline
The scanset %19[^\n] reads up to 19 characters that are not a newline, but it leaves the newline in the input stream.
Inspecting stdin with GDB shows that the newline is still pending:
$ gcc -g3 -O0 main.c -o main
$ gdb main
(gdb) break 8
Breakpoint 1 at 0x11d3: file main.c, line 8.
(gdb) run
...
> hello world↵
...
(gdb) p *stdin
$1 = {
_flags = -72539512,
_IO_read_ptr = 0x55555555972b "\n",
_IO_read_end = 0x55555555972c "",
_IO_read_base = 0x555555559720 "hello world\n",
}
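stdin->_IO_read_ptr still points to the newline. When the loop runs again, scanf("%19[^\n]", buf) sees the newline immediately, matches zero characters, and returns 0 (a matching failure) without writing anything to buf. The newline is left in the stream, so every later call fails the same way. The buffer keeps whatever it held before (stale data, or zeroes once it is initialized), and the loop repeats the same output.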
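The matching failure can also be observed without a debugger, because scanf returns the number of conversions it stored. The sketch below is only an illustration: try_read_line is a hypothetical helper that is not part of the original program, and it uses fscanf on an arbitrary FILE * so that it can be run against any stream, not just stdin.

```c
#include <stdio.h>

/* Hypothetical helper (not from the original program): performs the same
   conversion as take_input(), but on any stream, and hands back what
   fscanf reports:
     1   = one conversion was stored in buf
     0   = matching failure (for example, a newline is pending)
     EOF = the stream ended before anything could be read */
int try_read_line(FILE *in, char buf[20])
{
    return fscanf(in, "%19[^\n]", buf);
}
```

Called repeatedly on a stream that holds "hello world" followed by a newline, the first call should return 1 and every call after it should return 0, since nothing ever removes the newline.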
Workable Solutions
1. Using Extra scanf Calls
Force consumption of the remainder of the line and the newline:
void take_input(void)
{
    char buf[20] = {0};    /* stays empty if the user enters a blank line */
    printf("> ");
    scanf("%19[^\n]", buf);
    /* discard the rest of the line, if it's >19 chars */
    scanf("%*[^\n]");
    /* discard the newline */
    scanf("%*c");
    printf("input=`%s`\n", buf);
}
It looks messy, but it works reliably for whitespace, short inputs, and long truncated inputs:
$ ./main
> hey
input=`hey`
> hello world
input=`hello world`
> a b c d e f g h i j k l m o p q r s t u v w x y z
input=`a b c d e f g h i j`
> /* 5 whitespaces */
input=`     `
> abcdefghijklmnpoqrstuvwxyz
input=`abcdefghijklmnpoqrs`
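The version above never looks at what scanf returns, so a loop built on it cannot tell when the input has ended. A minimal sketch of a variant that reports this is shown here. The name read_line_scanf and the choice to take a FILE * are assumptions made for illustration, not part of the original code.

```c
#include <stdio.h>

/* Hypothetical variant of take_input(): reads one line from `in` into buf
   (at most 19 characters), then discards any overflow and the newline.
   Returns 1 if a line was read (it may be empty), 0 at end of input. */
int read_line_scanf(FILE *in, char buf[20])
{
    buf[0] = '\0';                  /* a blank line stores nothing, so start empty */
    int rc = fscanf(in, "%19[^\n]", buf);
    if (rc == EOF) {
        return 0;                   /* nothing left to read */
    }
    fscanf(in, "%*[^\n]");          /* drop anything past 19 characters */
    fscanf(in, "%*c");              /* drop the newline itself */
    return 1;
}
```

A caller can then write while (read_line_scanf(stdin, buf)) { ... } and stop cleanly when the user closes the input, for example with Ctrl-D on a Unix terminal.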
2. fgets – The Standard Way
A more predictable and explicit approach is to read a line with fgets and then trim the newline:
#include <stdio.h>
#include <string.h>

void take_input_2(void)
{
    char buf[20] = {0};    /* stays empty if fgets fails, e.g. at EOF */
    printf("> ");
    if (fgets(buf, sizeof(buf), stdin)) {
        char *newline_ptr = strchr(buf, '\n');
        if (newline_ptr) {
            /* replace \n with \0 to trim it */
            *newline_ptr = '\0';
        } else {
            /* no newline found: the line was too long (or input ended);
               consume the rest of the line */
            int c;
            while ((c = getchar()) != '\n' && c != EOF)
                ;
        }
    }
    printf("input=`%s`\n", buf);
}
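The same idea can be packaged as a reusable helper that tells the caller whether a line was truncated. This is a sketch under assumptions: read_line_fgets and its return codes are invented for illustration, and strcspn is used in place of strchr only because it yields an index directly.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from `in` into buf, whose capacity
   is `size` bytes (assumed to be at least 2).
   Returns -1 at end of input, 0 if the whole line fit,
   and 1 if extra characters had to be discarded. */
int read_line_fgets(FILE *in, char *buf, size_t size)
{
    if (fgets(buf, (int)size, in) == NULL) {
        return -1;
    }
    size_t len = strcspn(buf, "\n");    /* index of '\n', or strlen(buf) if absent */
    if (buf[len] == '\n') {
        buf[len] = '\0';                /* trim the newline */
        return 0;
    }
    /* No newline was stored: either the line was too long or input ended. */
    int c;
    int dropped = 0;
    while ((c = fgetc(in)) != '\n' && c != EOF) {
        dropped = 1;                    /* at least one character was thrown away */
    }
    return dropped;
}
```

A line of exactly size - 1 characters is reported as fitting, because the only thing left to consume after it is the newline.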
3. getline – The Heap Way
getline (POSIX, not ISO C) reads an entire line, allocating (or resizing) a buffer on the heap. You must free the memory yourself:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

void take_input_3(void)
{
    char *line = NULL; /* initialize pointer */
    size_t cap = 0;    /* capacity, updated by getline */
    ssize_t n;
    printf("> ");
    n = getline(&line, &cap, stdin);
    if (n > 0) {
        /* remove trailing newline */
        if (line[n-1] == '\n') {
            line[n-1] = '\0';
        }
        printf("input=`%s`\n", line);
    }
    free(line); /* always free heap memory */
}
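When reading many lines, the buffer does not have to be freed and reallocated on every call: getline reuses the buffer it is given and grows it only when a line does not fit. The sketch below illustrates that pattern. The function longest_line is a hypothetical example, and the _POSIX_C_SOURCE definition is an assumption about what the C library needs in order to declare getline when compiling in a strict standard mode.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: asks the libc to declare getline */
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>           /* ssize_t */

/* Hypothetical helper: returns the length of the longest line in `in`,
   not counting the trailing newline. One buffer serves every line. */
size_t longest_line(FILE *in)
{
    char *line = NULL;           /* getline allocates on first use */
    size_t cap = 0;              /* capacity, grown by getline as needed */
    size_t longest = 0;
    ssize_t n;

    while ((n = getline(&line, &cap, in)) != -1) {
        if (n > 0 && line[n - 1] == '\n') {
            n--;                 /* do not count the newline */
        }
        if ((size_t)n > longest) {
            longest = (size_t)n;
        }
    }
    free(line);                  /* a single free after the loop */
    return longest;
}
```

Unlike the fixed 20-byte buffers used earlier, nothing is truncated here; the cost is a heap allocation whose size depends on the longest line the user supplies.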
Takeaway
The exercise became a reminder of how surprisingly tricky input handling in C can be. scanf looks convenient until you hit edge cases around whitespace and newlines; at that point, “convenience” becomes a liability.
- For predictable, portable, line‑oriented input with truncation handling, fgets is almost always the better choice.
- If you have the luxury of POSIX and heap allocation, getline offers flexibility.
- Save scanf for situations where you know exactly what the input looks like.