For the love of POSIX: A more useful way to not segfault from null pointers
There’s a post making the rounds on Hacker News about actually mapping the address 0 to make it possible to not crash when dereferencing a null pointer.
The thing is, POSIX provides a more useful mechanism for recovering from null pointer dereferences using signal handlers.
See, signals in UNIX can be caught and you can do things with them. Let’s say that we’ve got an event-driven application and we want to move on to the next event if the handler for one event caused a segfault. We can do that by catching the SIGSEGV and SIGBUS signals and returning to our event loop to process the next event.
Let’s look at some code:
    #include <stdio.h>
    #include <signal.h>
    #include <setjmp.h>
    #include <string.h>

    static jmp_buf event_loop_jump;

    static void process_event(int event_id)
    {
        if(event_id % 10 == 0) {
            /* Boom. */
            int *i = 0;
            ++*i;
        } else {
            printf("Processed Event: %i\n", event_id);
        }
    }

    static void handler(int signum)
    {
        printf("Caught %s\n", strsignal(signum));
        longjmp(event_loop_jump, 1);
    }

    static void run_event_loop(void)
    {
        int event_id = 0;

        for(event_id = 1; event_id <= 20; event_id++) {
            if(setjmp(event_loop_jump)) {
                printf("Couldn't process event: %i\n", event_id);
            } else {
                process_event(event_id);
            }
        }
    }

    int main(void)
    {
        signal(SIGSEGV, handler);
        signal(SIGBUS, handler);
        run_event_loop();
        return 0;
    }
And that gives us:
    Processed Event: 1
    Processed Event: 2
    Processed Event: 3
    Processed Event: 4
    Processed Event: 5
    Processed Event: 6
    Processed Event: 7
    Processed Event: 8
    Processed Event: 9
    Caught Bus error
    Couldn't process event: 10
    Processed Event: 11
    Processed Event: 12
    Processed Event: 13
    Processed Event: 14
    Processed Event: 15
    Processed Event: 16
    Processed Event: 17
    Processed Event: 18
    Processed Event: 19
    Caught Bus error
    Couldn't process event: 20
What’s happening, Peter
So, let’s break that down a bit — we’ve created a function called handler that matches the signature required for signal handlers. man signal knows more, or check out the GNU documentation. In main we specify that we want to use that function to handle SIGSEGV and SIGBUS, the typical things you’ll be looking at when dereferencing null pointers. From there we call run_event_loop, whose function should be obvious.
Jump Around
run_event_loop just loops through 20 fake events, but the interesting bit is setjmp. That saves our calling environment (the stack pointer, program counter and registers, basically) so that we can get back to that spot later. During normal execution it just returns 0, so we fall through to our process_event call below, which just pretends to handle an event and prints out a message.
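To see the two return paths of setjmp without any signals involved, here’s a minimal sketch. The names checkpoint, bail_out and try_step are invented for this example and aren’t part of the program above.

```c
#include <setjmp.h>

static jmp_buf checkpoint;

/* Hypothetical helper: never returns normally. It jumps back to the
   setjmp() call that filled checkpoint and makes that call return code. */
static void bail_out(int code)
{
    longjmp(checkpoint, code);
}

/* Returns 0 when the step ran straight through,
   1 when we arrived back at the checkpoint via longjmp(). */
static int try_step(int should_fail)
{
    if (setjmp(checkpoint) != 0) {
        return 1;               /* second return: we got here via longjmp */
    }
    if (should_fail) {
        bail_out(7);            /* any nonzero value takes the branch above */
    }
    return 0;                   /* first return: setjmp gave us 0 */
}
```

Here try_step(0) returns 0 and try_step(1) returns 1. The jump is only valid while the function that called setjmp is still active, which holds here because bail_out is called from inside try_step.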
We’ve wired our event handler to blow up on every tenth event. When it does, execution is suspended and we cut over to our signal handler. Our signal handler, in turn, prints a little message and then breaks out of the signal handler and back into the normal event handling flow with a longjmp call that corresponds to our setjmp from before.
So once we call that we’re back at our line with the setjmp call, only this time it returns the value that we passed to longjmp — 1, and since that evaluates to true, we end up printing a message saying that we weren’t able to process the event.
In Context
On the whole this seems far more useful than mapping address zero. However, There Be Dragons in these waters, and it no doubt opens up a can of worms in terms of what’s done with static variables in the functions you’re breaking out of and whether they’re left in a secure state. In other words, this does open up another potential security vector.
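To make the static variable worry concrete, here’s a small sketch in which a guard flag is left set because we jumped out halfway through an update. To keep it deterministic, a plain longjmp stands in for the fault and the signal handler. The names in_progress, update_shared_state and guarded_update are hypothetical.

```c
#include <setjmp.h>

static jmp_buf recovery_point;
static int in_progress = 0;     /* hypothetical guard around shared state */

/* Returns 0 on success, -1 if the guard claims an update is already running.
   When fail is nonzero, longjmp() stands in for "fault + signal handler". */
static int update_shared_state(int fail)
{
    if (in_progress) {
        return -1;
    }
    in_progress = 1;
    if (fail) {
        longjmp(recovery_point, 1);   /* we leave without clearing the guard */
    }
    in_progress = 0;
    return 0;
}

/* Returns 1 if the update was abandoned by a jump,
   otherwise whatever update_shared_state() returned. */
static int guarded_update(int fail)
{
    if (setjmp(recovery_point) != 0) {
        return 1;
    }
    return update_shared_state(fail);
}
```

Calling guarded_update(0), then guarded_update(1), then guarded_update(0) again yields 0, 1 and -1: after the abandoned update the guard is stuck, and every later call is refused. Locks, half-written buffers and allocator internals can end up in the same position.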
I’ll also note that we don’t actually do this in any of our stuff at Directed Edge, going for the less exciting, but more predictable process watchers to handle restarts when our C++ engine falls foul of memory protection (which, blessedly, is a rare occurrence).
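For comparison, the process watcher approach can be sketched in a few lines with fork and waitpid. This is not our actual watcher, just an illustration under simple assumptions: supervise and flaky_worker are invented names, the worker is a plain function rather than a separate program, and details like EINTR handling and restart back-off are left out.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical supervisor: runs worker in a child process and restarts it
   whenever the child is killed by a signal, up to max_restarts times.
   Returns the number of restarts performed, or -1 if fork/waitpid fails. */
static int supervise(int (*worker)(int attempt), int max_restarts)
{
    int restarts = 0;

    for (;;) {
        int status = 0;
        pid_t pid = fork();

        if (pid < 0) {
            return -1;
        }
        if (pid == 0) {
            _exit(worker(restarts));    /* child: never returns to the loop */
        }
        if (waitpid(pid, &status, 0) < 0) {
            return -1;
        }
        if (WIFSIGNALED(status) && restarts < max_restarts) {
            restarts++;                 /* child crashed: start a fresh one */
            continue;
        }
        return restarts;                /* clean exit, or out of restarts */
    }
}

/* Example worker: crashes on its first two attempts, then exits cleanly. */
static int flaky_worker(int attempt)
{
    if (attempt < 2) {
        raise(SIGSEGV);                 /* default action terminates the child */
    }
    return 0;
}
```

Here supervise(flaky_worker, 5) performs two restarts before the worker exits cleanly. Every restart begins from a fresh address space, so there is no half-updated state to worry about, which is the predictability we’re after.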
Update:
For posterity, and based on some of the comments here and on Hacker News, it seems apt to note that this method isn’t being seriously suggested for production code; the idea behind the post was more as a novelty and a brief exploration in signal handling.