Debugging enhancements in Node 0.12

Background

Node.js provides the richest postmortem debugging facilities of just about any dynamic language. Postmortem debugging refers to debugging programs from a core file, heap dump, or some similar dump of memory. With a core file of a Node program (which is essentially a snapshot of your program's state), you can:

  • Find all JavaScript objects and classify them by their properties. This is very useful for identifying large memory leaks.
  • Find JavaScript objects by their constructors or properties and print them out.
  • Enhanced in 0.12: Print a stack trace of where the program was when the core file was created. (If you're generating a core file on crash, this is the stack where the program crashed.) Unlike the stack you get when a Node program normally crashes, this stack includes function arguments. In 0.12, by default the printed stack trace is much friendlier, and it also contains the "this" argument for each function.
  • New in 0.12: Find all of the JavaScript closures that have been created. Closures can be a major source of memory leaks, so identifying which functions have created lots of closures helps find memory leaks.
  • New in 0.12: Find the chunk of native memory that's being used for a given Node Buffer. When Buffers make up a large percentage of memory used in your program, this can help identify what they're storing.
  • New in 0.12: Find the C pointer for a given V8 "internal" field. This is mostly interesting for add-on developers.

The postmortem approach has several advantages over other debugging techniques:

  • It doesn't require any runtime overhead.
  • It's comprehensive. A core file is a complete snapshot of your program. If there's something missing, you can add it by just storing it in your program.
  • In production, when you find a program behaving badly, you can save a core file and then immediately restart the program. This often fixes the problem, restores service, and still enables you to debug the problem after the fact.
  • Because all the debugging is offline (i.e., the program is not suspended), debuggers can use sophisticated (but expensive) analysis techniques like the ones mentioned above.

You can save core files of running programs using gcore(1) on most systems, and you can configure Node to generate a core file when it crashes using the --abort-on-uncaught-exception flag. Although MDB only runs on illumos-based systems (like SmartOS), it can read core files from GNU/Linux systems as well.

Changes in v0.12

If this is the first you've heard of debugging Node from core files, you'll want to check out the Node.js MDB intro. The rest of this post is just a summary of changes.

These changes are part of the debugger module that ships with Node 0.12, but you can use this module on core files from versions of Node as far back as 0.4. So you can use these features today before even upgrading your programs to use Node 0.12.

Streamlined ::jsstack

In previous versions, ::jsstack's default output includes a lot of information that's not relevant for JavaScript-only problems and is frequently overwhelming for new users. After these changes, it hides internal frames, frame pointers, and instruction pointers by default. It also chops long C++ symbol names to fit lines at 80 columns.

Before:

> ::jsstack8043190 0xb4e1a5ee MyObject (ba780b15)80431b4 0x94e0fba9 80431c8 0xb4e1096c  (as ) (ba7b3d59)8043200 0x94e645bf  (as EventEmitter.emit) (93f24ad9)804321c 0x94e0f7a1 8043240 0xb4e0f9c0  (as parser.onIncoming) (92609b45)804327c 0xb4e0c9f3 parserOnHeadersComplete (ba7917e9)804329c 0x94e0fe19 80432d8 0x94e0a0aa 8043348 _ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0_6ObjectEEEiPS5_Pb+0xef...8047c98 uv_run+0xea8047d08 node::Start+0x17b8047d28 main+0x1b8047d40 _start+0x83

After:

> ::jsstackjs:     MyObject        (1 internal frame elided)js:      (as )js:      (as EventEmitter.emit)        (1 internal frame elided)js:      (as parser.onIncoming)js:     parserOnHeadersComplete        (1 internal frame elided)        (1 internal frame elided)native: _ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0......native: uv_run+0xeanative: node::Start+0x17bnative: main+0x1bnative: _start+0x83

You can get the full native details with "-a" (for "all frames and information").

this pointer and constructor names in ::jsstack -v output

::jsstack -v now includes the this pointer. Also, when it prints out object pointers (for this as well as function arguments), they include the constructor name.

Before:

> ::jsstack -vn08043190 0xb4e1a5ee MyObject (ba780b15)    file: /root/example.js    posn: position 6280431b4 0x94e0fba9 80431c8 0xb4e1096c  (as ) (ba7b3d59)    file: /root/example.js    arg1: ba7ceb01 (JSObject)    arg2: ba7cf781 (JSObject)    posn: position 595...

After:

> ::jsstack -vn0js:     MyObject          file: /root/example.js          posn: position 62        (1 internal frame elided)js:      (as )          file: /root/example.js          posn: position 595          this: ba7b3d7d (JSObject: Server)          arg1: ba7ceb01 (JSObject: IncomingMessage)          arg2: ba7cf781 (JSObject: ServerResponse)...

New dcmd: ::jsconstructor: print constructor of a JavaScript object

Given an object, you can print the constructor that was used to create it:

> ba7ce109::jsconstructorSlowBuffer> ba7ce781::jsconstructorObject> ba7cdb75::jsconstructorHTTPParser

With "-v", the command prints out a pointer to the constructor function:

> ba7b3d7d::jsconstructor -vServer (JSFunction: ba791bbd)

which is useful for passing to ::jssource (see next).

New dcmd: ::jssource: print source for a JavaScript function

Given a function, you can print the source code for it:

> 88a21d0d::jssource -n1file: http.js 1963 1964   socket.addListener('error', function(e) { 1965     self.emit('clientError', e, this); 1966   }); 1967

This closure is an instance of the anonymous function defined inline.

New dcmd: ::nodebuffer: fetch native pointer for a buffer

When debugging memory leaks using ::findjsobjects, you may find that you have a ton of Buffer objects. These represent chunks of memory from the native heap. You often want to know what that memory contains. With the ::nodebuffer command, you can take one of these Buffers:

> b1a9ec5d::jsconstructorBuffer> b1a9ec5d::jsprint{    "length": 1423,    "parent": {        "length": 8192,        "used": 7272,    },    "offset": 0,}

and get a pointer to the underlying C memory:

> b1a9ec5d::nodebuffer8db8c08

Then you can use ::dump (for example) to see what that memory actually contains:

> 8db8c08,0t64::dump           0 1 2 3  4 5 6 7 \/ 9 a b  c d e f  01234567v9abcdef8db8c00:  08200000 f89f103a 00000007 6f707469  . .....:....opti8db8c10:  6f6e7300 00000468 65616400 00000470  ons....head....p8db8c20:  6f737400 00000370 75740000 00066465  ost....put....de8db8c30:  6c657465 00000005 74726163 65000000  lete....trace...8db8c40:  06616363 65707400 00000e61 63636570  .accept....accep

This often points you to a particular HTTP request handler or other code path in your program.

New dcmd: ::jsfunctions: list closures

findjsobjects is very helpful for identifying classes of objects with many instances. Once you've found which objects are being leaked, you can use findjsobjects -r to find out which objects reference the leaked one. Sometimes memory is not referenced by other objects, but rather by closures: JavaScript functions that are pending and may be invoked some time in the future. The new jsfunctions command lists all functions in your JavaScript program, and how many different closures have been created for each one:

> ::jsfunctions    FUNC   #FUNCS NAME                                     FROM...8803e63d     1667  (as parser.onIncoming)       http.js position 58040880389c5      543  (as )                  _stream_readable.js position 2634088038921      543  (as )                  _stream_writable.js position 8134880388d5      543  (as cb)                      _stream_writable.js position 59238800be75      587  (as )                  /home/dap/demo/restify-ex.js position 6168800be35     2114 _ifError                                 /home/dap/demo/node_modules/restify/lib/server.js position 28468800b229      691  (as )                  net.js position 12228880099a1     1755  (as )                  http.js position 558408800997d     1755  (as )                  http.js position 5499988009939     1696 socketOnDrain                            http.js position 5777488009915     2234 serverSocketCloseListener                http.js position 54505880098f1     2234 abortIncoming                            http.js position 5431488009885     1754 g                                        events.js position 5742...

You can take one of these functions and print the source. Here's abortIncoming():

> 880098f1::jssourcefile: http.js 1907 function connectionListener(socket) { 1908   var self = this; 1909   var outgoing = []; 1910   var incoming = []; 1911 1912   function abortIncoming() { 1913     while (incoming.length) { 1914       var req = incoming.shift(); 1915       req.emit('aborted'); 1916       req.emit('close'); 1917     } 1918     // abort socket._httpMessage ? 1919   } 1920 1921   function serverSocketCloseListener() { 1922     debug('server socket close'); 1923     // mark this parser as reusable 1924     if (this.parser) 1925       freeParser(this.parser);

Note that it's defined inside the connectionListener function. This explains why we have 2234 closures for it. Each time connectionListener is called, a new closures of abortIncoming is created to close around socket, self, outgoing, and incoming.

As with findjsobjects, the exact count in the jsfunctions list may be misleading, but large values often reflect memory leaks.

New advanced dcmd: ::v8internal: fetch internal fields

Native code that uses the V8 API (whether part of Node.js itself or part of an add-on) often uses internal fields, which are essentially slots inside an object for storing internal data. For example, Node classes like TcpWrap typically store the underlying libuv handle as internal field 0. So if you have a TCP object, then you can grab internal field 0 to get the underlying libuv handle.

Bug fixes

  • jsprint now quotes property names. (The current approach is simplistic, and the results are not to be confused for valid JSON, but it's much clearer now than it was, particularly for properties with non-alphanumeric characters.)
  • Several bugs in 64-bit core files, particularly printing out source code using "::jsstack -v" and printing two-byte strings. Both 32-bit and 64-bit core files of Node 0.10 and 0.12 are fully supported. Core files from 0.4 and 0.6 should work as well, but have not been recently tested.
  • Two-byte strings are no longer erroneously truncated.

Stay tuned

We're working on a few more impressive tools using jsfunctions. Stay tunedfor details!

For help

For help, file a Node issue or reach out to @dapsays, @tjfontaine, or @misterdjules on Twitter. Joyent also offers Nodejs Support that builds heavily on these tools.



Post written by Dave Pacheco