For the love of POSIX: A more useful way to not segfault from null pointers
There’s a post making the rounds on Hacker News about actually mapping the address 0 to make it possible to not crash when dereferencing a null pointer.
The thing is, POSIX provides a more useful mechanism for recovering from null pointer dereferences using signal handlers.
See, signals in UNIX can be caught and you can do things with them. Let’s say that we’ve got an event-driven application and we want to move on to the next event if the handler for one event caused a segfault. We can do that by catching the SIGSEGV and SIGBUS signals and returning to our event loop to process the next event.
Let’s look at some code:
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <string.h>

static jmp_buf event_loop_jump;

static void process_event(int event_id)
{
    if(event_id % 10 == 0)
    {
        /* Boom. */
        int *i = 0;
        ++*i;
    }
    else
    {
        printf("Processed Event: %i\n", event_id);
    }
}

static void handler(int signum)
{
    printf("Caught %s\n", strsignal(signum));
    longjmp(event_loop_jump, 1);
}

static void run_event_loop(void)
{
    int event_id = 0;

    for(event_id = 1; event_id <= 20; event_id++)
    {
        if(setjmp(event_loop_jump))
        {
            printf("Couldn't process event: %i\n", event_id);
        }
        else
        {
            process_event(event_id);
        }
    }
}

int main(void)
{
    signal(SIGSEGV, handler);
    signal(SIGBUS, handler);
    run_event_loop();
    return 0;
}
And that gives us:
Processed Event: 1
Processed Event: 2
Processed Event: 3
Processed Event: 4
Processed Event: 5
Processed Event: 6
Processed Event: 7
Processed Event: 8
Processed Event: 9
Caught Bus error
Couldn't process event: 10
Processed Event: 11
Processed Event: 12
Processed Event: 13
Processed Event: 14
Processed Event: 15
Processed Event: 16
Processed Event: 17
Processed Event: 18
Processed Event: 19
Caught Bus error
Couldn't process event: 20
What’s happening, Peter
So, let’s break that down a bit — we’ve created a function called handler that matches the signature required for signal handlers. man signal knows more, or check out the GNU documentation. In main we specify that we want to use that function to handle SIGSEGV and SIGBUS, the typical things you’ll be looking at when dereferencing null pointers. From there we call run_event_loop, whose function should be obvious.
Jump Around
run_event_loop just loops through 20 fake events, but the interesting bit is setjmp. That saves our calling environment (roughly, the stack pointer, program counter and registers) so that we can jump back to it later. During normal execution it just returns 0, so we fall through to our process_event call below, which just pretends to handle an event and prints out a message.
We’ve wired our event handler to blow up once every 10 times. When it does, execution is suspended and we cut over to our signal handler. Our signal handler, in turn, prints a little message and then breaks out of the signal handler and back into the normal event handling flow with a longjmp call that corresponds to our setjmp from before.
So once we call that we’re back at our line with the setjmp call, only this time it returns the value that we passed to longjmp — 1, and since that evaluates to true, we end up printing a message saying that we weren’t able to process the event.
In Context
On the whole this seems far more useful than mapping address zero. However, There Be Dragons in these waters, and it no doubt opens up a can of worms in terms of what’s done with static variables in the functions you’re breaking out of and if they’re left in a secure state — in other words, this does open up another potential security vector.
I’ll also note that we don’t actually do this in any of our stuff at Directed Edge, going for the less exciting, but more predictable process watchers to handle restarts when our C++ engine falls foul of memory protection (which, blessedly, is a rare occurrence).
Update:
For posterity, and based on some of the comments here and on Hacker News, it seems apt to note that this method isn’t being seriously suggested for production code; the idea behind the post was more as a novelty and a brief exploration in signal handling.
nelhage:
Thanks for the follow-up post. I agree that if you actually want to catch NULL pointer dereferences and try to do something with them, a signal handler is a much better approach. The point of my post, of course, was not that you should ever mmap(NULL) your applications, but as a jumping-off point to showing how to exploit kernel NULL pointer dereferences in part II.
As an additional comment, your code is not quite portable, in that POSIX doesn’t specify what happens to signal dispositions while a handler established using signal() is in effect. It is legal for the OS to reset the signal to SIG_DFL or SIG_IGN, or to block it while the handler is running. Since your handler effectively never returns, this means that you’re not guaranteed to get the second or future segfaults (although, as you’ve shown, it does work on Linux).
If anyone wants to actually use this sort of code in production, you’ll want to use sigaction(2), which allows you to explicitly specify the semantics you want.
March 31, 2010, 1:25 pm
pietro:
A small typo:
for(event_id = 1; event_id <= 20; event_id++)
should be instead
for(event_id = 1; event_id <= 20; event_id++)
March 31, 2010, 1:26 pm
pietro:
ugh, the comment interpreted the html correctly; I meant to say replace
for(event_id = 1; event_id &lt;= 20; event_id++)
with
for(event_id = 1; event_id <= 20; event_id++)
March 31, 2010, 1:28 pm
Scott Wheeler:
Thanks, fixed. WordPress got wonky switching between HTML and Rich Text modes.
March 31, 2010, 1:43 pm
Matt H:
Calling longjmp in a signal handler is undefined according to this site:
https://www.securecoding.cert.org/confluence/display/seccode/SIG32-C.+Do+not+call+longjmp()+from+inside+a+signal+handler
March 31, 2010, 2:27 pm
Scott Wheeler:
@Matt The last comment in the post about static variables is actually addressing that — if the function can leave the global state in a mess then badness can happen.
March 31, 2010, 2:34 pm
Forthe:
On Linux, from the man pages:
sigaction, not signal
siglongjmp, not longjmp
sigsetjmp, not setjmp
C89, C99, and POSIX.1-2001 specify longjmp(). POSIX.1-2001 specifies siglongjmp().
C89, C99, and POSIX.1-2001 specify setjmp(). POSIX.1-2001 specifies sigsetjmp().
sigaction: CONFORMING TO POSIX.1-2001, SVr4.
March 31, 2010, 3:41 pm
Paul Betts:
This may work in the example case, but doing things like this is unsafe in that the stack or the heap (depending on why you faulted) is likely to be trashed on your way towards the signal handler.
Furthermore, it’s a security vulnerability: if an attacker is hunting for a piece of data he’s interested in, or for a function entry point to jump into that isn’t always in the same place, doing this gives him unlimited retries, whereas he would normally get only one shot at finding the stack variable or return address he’s looking for.
March 31, 2010, 6:54 pm