Stopping a broken program in its tracks

Last week I debugged a Node issue where under some conditions, fork() failure could result in some file descriptors being closed, leading shortly to bedlam. In the best cases, the program would immediately crash, but often it would just do the wrong thing. This happens a lot, of course, and in many ways it's worse than a crash because you have no core dump nor even a stack trace to go on.

In my write-up of the bug, I glossed over a key step to debugging the problem:

Using DTrace and reproducing on SmartOS, I stopped the process at the moment close() returned EBADF, and caught this stack trace...

I want to explain this technique in more detail because it's very useful for getting a stack trace or core dump when a program starts going south, even if the program isn't actually going to crash. (This technique isn't specific to Node.js or SmartOS, either, though it's easier with Node.js and on SmartOS.)

In this case, I was debugging a production problem that I reproduced in development, which made things much easier. I was free to kill the process to create a core dump without worrying about disrupting any users. Since the Node program was blowing up on a "close, EBADF" error, I used this DTrace one-liner to make the program dump core precisely when that happened (instead of barreling on with the wrong file descriptor):

# dtrace -w -n 'syscall::close:return  /execname == "node" && arg1 == -1 && errno == EBADF/  { trace(pid); raise(SIGABRT); exit(0); }'

Taking apart the script:

  • dtrace -w: The "-w" flag to "dtrace" enables destructive actions. Without "-w", DTrace won't let you do anything that can significantly change the behavior of running programs (like raising a signal, in this case).
  • syscall::close:return: We're tracing returns from the "close" system call. We're tracing the kernel, not a particular process, so this will catch all closes in all processes.
  • /execname == "node" && arg1 == -1 && errno == EBADF/: We're only going to take action for processes called "node", where the return value from "close" is -1, and the kernel errno is EBADF.
  • { trace(pid); raise(SIGABRT); exit(0); }: The operative piece here is raise(SIGABRT), which immediately sends SIGABRT to the current process. (The reason for trace(pid) is so that you know what process was killed, and the reason for exit(0) is to end the D script as soon as this happens. If you don't exit and you get the predicate wrong, you can accidentally do a lot of damage!)

The default behavior on receiving SIGABRT is to dump core immediately (see signal.h). It's possible to alter this behavior, but very few programs do, so this technique almost always works.

Once I had a core file, I fired up MDB to get a JavaScript stack trace, and from there it was easy to figure out what had gone wrong. For details, see the Node issue.

What about production?

I mentioned that in this case, I'd reproduced the problem in development, so I was free to kill the process. What if (as so often happens) the first time we saw this problem was in production, and we wanted to get a core dump at exactly the right time, but without causing the process to crash? That's not actually much harder, at least on systems with a program like gcore(1):

# dtrace -w -n 'syscall::close:return  /execname == "node" && arg1 == -1 && errno == EBADF/  { trace(pid); stop(); system("gcore %d; prun %d", pid, pid); exit(0); }'

This one is very similar to the previous script, except that instead of raising SIGABRT, we use stop() to stop the process in its tracks, and asynchronously run gcore PID (to save the core dump) followed by prun PID (to start the process running again). This is still technically destructive, because we're stopping the process for however long it takes to save the core dump, but in most cases this isn't too long and the disruption is insignificant.

To recap: you can use DTrace for more than just data collection, including debugging non-fatal buggy program behavior. DTrace can instrument very specific events on the system (like "the close syscall in a Node process returned -1 with EBADF" or any application-level event), and you can use that to take action like stopping the process to take a core dump and then resuming it again. Putting these pieces together makes it possible to gather all kinds of debugging information from a program that's behaving badly.



Post written by David Pacheco