Post-mortem Debugging and Promises

September 26, 2016 - by Wyatt Lyon Preul

Over the years there has been an increase in the popularity of promises in Node.js and JavaScript in general. This popularity is evidenced by the Promise object becoming part of the ECMAScript specification. As a result of having promises built-in, the language can add new capabilities that rely on them. Async functions are a prime example of a new language feature that uses promises.

One reason for the increase in popularity is a result of people attempting to reap the benefits of universal JavaScript. Code reuse is and will always be a major selling point for adopting Node.js. There is a tremendous amount of financial savings that result from being able to reuse code between clients and servers. More often than not, JavaScript that already exists in an organization comes from applications written to run in a browser, not code created for Node.js. Promises are prevalent in JavaScript applications that run in a browser, instead of the callback convention that is more common within Node.js applications. Therefore, the promises style that already exists to execute in browsers will often be adopted by Node.js server developers, so that existing code doesn't go to waste.

Post-mortem requirements

There are different post-mortem requirements for JavaScript applications running in browsers than those running on the server. On the browser, if there is a programmer error in your JavaScript code, you don't want it to crash, create a core dump, or abort the loading of other JavaScript resources. Instead, you will likely trap the error and report it to a reporting server along with the stack trace. You are not likely to be able to convince a user to let you create a heap snapshot or a core dump and upload it to your server along with any logs. Instead, you are trying to prevent the user from even noticing that an error ever occurred.

On the other hand, server developers tend to want a programmer error to abort the process and a core dump to be created. Occasionally, the stack trace and error logs are enough to diagnose and fix the issue. For every other problem, a core dump will be essential for post-mortem analysis.

A remarkable benefit from generating a core dump is the ability to restart your failed process while retaining a reliable snapshot of the process state at the time of failure. This practice goes hand-in-hand with the common suggestion to fail fast. The closer in time to the incident when a process aborts will mean that the core dump will be closer to the moment in time of the failure. If, however, a process doesn't fail fast then there will be additional work on the part of a developer to move back in time to the point when the incident occurred. Moreover, if you do not have a core dump then restarting a process may not even be possible, as you will want to keep the environment frozen in time when the incident occurred. Therefore, the recommended approach is to fail fast with the generation of a core dump.

Differences with error handling

It's clear that promises are not going anywhere, and are likely to continue to increase in popularity. As a result, it's important to understand some of the limitations that impact post-mortem debugging with promises. To help illustrate these shortcomings, we will first look at a server that doesn't use promises at all.

Non-promise server

'use strict';

const Http = require('http');

const server = Http.createServer((req, res) => {
  let result = '';
  req.on('data', (data) => {
    result += data;
  });

  req.once('end', () => {
    const obj = JSON.parse(result);
    res.writeHead(200);
    res.end(obj.foo.bar);
  });
});

server.listen(8080, () => {
  console.log('listening at http://localhost:8080');
});

The server in this example expects that every request has a JSON payload that contains a property named foo. If, however, a request is made that is not JSON formatted or doesn't have a property named foo then there will be an uncaught exception and the server will crash. For example, if you start the server above and send it the curl command below, then the Node.js server process will crash with the following output.

The curl command:

curl -X POST http://localhost:8080 -d '{ "hi": "world" }'

The output of the server process:

$ node server.js
listening at http://localhost:8080
/server.js:14
    res.end(obj.foo.bar);
                   ^

TypeError: Cannot read property 'bar' of undefined
    at IncomingMessage.req.once (/server.js:14:20)
    at IncomingMessage.g (events.js:286:16)
    at emitNone (events.js:86:13)
    at IncomingMessage.emit (events.js:185:7)
    at endReadableNT (_stream_readable.js:975:12)
    at _combinedTickCallback (internal/process/next_tick.js:74:11)
    at process._tickCallback (internal/process/next_tick.js:98:9)

The logged output is simple enough to debug. The error is a programmer error from no request validation and a dangerous assumption about the structure of the payload. The error is a result of trying to access a property on an undefined variable.

The above is not the recommended way to run a Node.js process in production. Instead, it's a best practice to start Node.js with --abort-on-uncaught-exception so that there is a core dump whenever there is an uncaught exception. When the process executes with this argument, the output will be slightly different, and you will also have a core dump created.

$ node --abort-on-uncaught-exception server.js
listening at http://localhost:8080
Uncaught TypeError: Cannot read property 'bar' of undefined

FROM
IncomingMessage.req.once (/server.js:14:20)
IncomingMessage.g (events.js:286:16)
emitNone (events.js:86:13)
IncomingMessage.emit (events.js:185:7)
endReadableNT (_stream_readable.js:975:12)
_combinedTickCallback (internal/process/next_tick.js:74:11)
process._tickCallback (internal/process/next_tick.js:98:9)
Illegal instruction: 4 (core dumped)

The core dump will be in either the working directory for the server or wherever core dumps reside on your system.

Promise server

If the server in the example is switched to use a promise for the request, there will be a difference in how the error gets managed. Instead of aborting the process, the promise will trap the error and pass it to the catch handler. If there isn't a catch function to handle the error, the request will eventually time out, and no error details will exist on the server. Below is what the updated server code looks like when using promises.

'use strict';

const Http = require('http');

const server = Http.createServer((req, res) => {
  const promise = new Promise((resolve, reject) => {
    let result = '';
    req.on('data', (data) => {
      result += data;
    });

    req.once('end', () => {
      const obj = JSON.parse(result);
      resolve(obj);
    });
  });

  promise.then((obj) => {
    res.writeHead(200);
    res.end(obj.foo.bar);
  }).catch((reason) => {
    res.writeHead(500);
    res.end();
  });
});

server.listen(8080, () => {
  console.log('listening at http://localhost:8080');
});

Now when you send the same invalid request to the server, you get a 500 response, and no error information is output. Even if you remove the catch handler, the server will not abort, and there will be no exception displayed. The same is the case if you start Node.js with --abort-on-uncaught-exception, no core dump exists, and there is no useful output. In Node.js version 6.6.0 and later there will be a warning message advising you to catch or handle the rejection along with the error message. Even still, this warning message doesn't include the error stack.

The behavior above may be desirable; it depends on your needs. Regardless, in the catch handler, the error information can be logged to stderr with console.error(reason). Outputting the following information to your error logs can be used by the developer to diagnose and fix the bug. Below is the output of the error when it's logged.

$ node server.js
listening at http://localhost:8080
TypeError: Cannot read property 'bar' of undefined
    at promise.then (/server.js:20:20)
    at process._tickCallback (internal/process/next_tick.js:103:7)

At this point, all that a developer has to diagnose the issue in the promises example is the error log. The log isn't always enough to determine the cause of the error. Instead, when you do use promises you should interrogate the error object, and if it's not something that the process should be able to recover from, then you should abort the process immediately or rethrow the error. How to rethrow and let the process fail on its own will be explored later. Below is the updated server that logs to stderr and aborts the process manually when there is an error.

'use strict';

const Http = require('http');

const server = Http.createServer((req, res) => {
  const promise = new Promise((resolve, reject) => {
    let result = '';
    req.on('data', (data) => {
      result += data;
    });

    req.once('end', () => {
      const obj = JSON.parse(result);
      resolve(obj);
    });
  });

  promise.then((obj) => {
    res.writeHead(200);
    res.end(obj.foo.bar);
  }).catch((reason) => {
    res.writeHead(500);
    res.end();
    console.error(reason);
    process.abort()
  });
});

server.listen(8080, () => {
  console.log('listening at http://localhost:8080');
});

The response will still be a 500, but now the process will abort, and a core file will be created for further post-mortem debugging. Below is the output from running the updated server.

$ node server.js
listening at http://localhost:8080
TypeError: Cannot read property 'bar' of undefined
    at promise.then (/server.js:20:20)
    at process._tickCallback (internal/process/next_tick.js:103:7)
 1: node::Abort() [/bin/node]
 2: node::Chdir(v8::FunctionCallbackInfo<v8::Value> const&) [/bin/node]
 3: v8::internal::FunctionCallbackArguments::Call(void (*)(v8::FunctionCallbackInfo<v8::Value> const&)) [/bin/node]
 4: v8::internal::MaybeHandle<v8::internal::Object> v8::internal::(anonymous namespace)::HandleApiCallHelper<false>(v8::internal::Isolate*, v8::internal::(anonymous namespace)::BuiltinArguments<(v8::internal::BuiltinExtraArguments)1>) [/bin/node]
 5: v8::internal::Builtin_HandleApiCall(int, v8::internal::Object**, v8::internal::Isolate*) [/bin/node]
 6: 0x1fe0b530961b
Abort trap: 6 (core dumped)

Core dump differences

Another difference between the promise and non-promise based examples is the way that you approach the core dump inspection. In the case of the non-promise server, the process aborted as a result of an unhandled exception. The information that will be most helpful for this analysis is to determine the location of the exception in code, then identify the variable values when the exception occurred. To gather this information, we will use the modular debugger (MDB) with the v8 debugging module from Joyent.

Non-promise server

Below is an abbreviated output of running ::jsstack -v on the non-promise core file.

> ::jsstack -v
native: v8::base::OS::Abort+9
native: v8::internal::IC::TypeError+0x52
native: v8::internal::LoadIC::Load+0x21c
native: v8::internal::Runtime_LoadIC_Miss+0x2f9
        (1 internal frame elided)
js:     <anonymous> (as req.once)
          file: /root/server.js
          posn: line 11
          this: 3783811141d9 (JSObject: IncomingMessage)

              6   let result = '';
              7   req.on('data', (data) => {
              8     result += data;
              9   });
             10
             11   req.once('end', () => {
             12     const obj = JSON.parse(result);
             13     res.writeHead(200);
             14     res.end(obj.foo.bar);
             15   });
             16 });

As you can see in the output, the last function to be executed on the stack is the end handler. The next step is to look at the variables that the end handler function had access to, to get a better picture of the incident. To do this, you will need to locate the address that refers to the function. Use the ::jsfunctions command looking for all functions in the server.js file as shown below. Another option is to add the address or -a argument to ::jsstack to get all address details for the functions on the stack.

> ::jsfunctions -s /server.js
            FUNC   #FUNCS NAME                                     FROM
    378381117f51        1 <anonymous> (as req.once)                /server.js line 11
    378381117be1        1 <anonymous> (as req.on)                  /server.js line 7
    3783810d6a69        1 <anonymous> (as server.listen)           /server.js line 18
    3783810d4af1        1 <anonymous> (as Http.createServer)       /server.js line 5
    378381014bb9        1 <anonymous> (as <anon>)                  /server.js line 1

In the above output, the function on line 11 has the address 378381117f51. Therefore, pass this address into ::jsclosure to see what variables the function referenced. The output below shows the command and the output.

> 378381117f51::jsclosure
    "res": 378381116171: {
        "domain": 1e40ce504101: null,
        "_events": 378381116301: [...],
        "_eventsCount": 100000000: 1,
        "_maxListeners": 1e40ce504189: undefined,
        "output": 3783811164a9: [...],
        "outputEncodings": 378381116591: [...],
        "outputCallbacks": 378381116691: [...],
        "outputSize": 0: 0,
        "writable": 1e40ce504231: true,
        "_last": 1e40ce504299: false,
        "chunkedEncoding": 1e40ce504231: true,
        "shouldKeepAlive": 1e40ce504231: true,
        "useChunkedEncodingByDefault": 3783811174d1: [...],
        "sendDate": 378381111909: function updateOutgoingData,
        "_removedHeader": 145ea0c63831: "OK",
        "_contentLength": c800000000: 200,
        "_hasBody": 1e40ce504189: undefined,
        "_trailer": 1e40ce504189: undefined,
    }
    "result": 378381118951: "{ "hi": "world" }"

Using the above information, rewrite the function that threw the exception by replacing the variables with their values during the incident.

req.once('end', () => {
    const obj = JSON.parse("{ "hi": "world" }");
    res.writeHead(200);
    res.end(obj.foo.bar);
  });

With this information, it's easy to see that foo is undefined and the code tried to access the property bar on foo.

Promise server

Analyzing the information in the process that uses promises will require extra steps, first to look at the error and then to backtrack into the code that caused the error. The function that handles the catch will need to be determined, and the error argument passed to it analyzed. We will start the same as before, with the output of ::jsstack, shown below.

> ::jsstack -v
native: libc.so.1`_lwp_kill+0xa
native: libc.so.1`raise+0x20
native: libc.so.1`abort+0x98
        (1 internal frame elided)
native: v8::internal::FunctionCallbackArguments::Call+0xf4
native: _ZN2v88internal12_GLOBAL__N_119HandleApiCallHelperILb0EEENS0_11M...
native: _ZN2v88internalL21Builtin_HandleApiCallEiPPNS0_6ObjectEPNS0_7Iso...
        (1 internal frame elided)
js:     <anonymous> (as promise.then.catch)
          file: /server.js
          posn: line 21
          this: 1e3b87204189 (Oddball: "undefined")
          arg1: 347ffb11c9f9 (JSObject: TypeError)

             16   });
             17
             18   promise.then((obj) => {
             19     res.writeHead(200);
             20     res.end(obj.foo.bar);
             21   }).catch((reason) => {
             22     res.writeHead(500);
             23     res.end();
             24     console.error(reason);
             25     process.abort()
             26   });
             27 });

The function on line 21 receives the exception. The TypeError object is the first argument and has address 347ffb11c9f9. Using ::jsprint we can display the properties of this object in memory.

> 347ffb11c9f9::jsprint
{
    "message": "Cannot read property 'bar' of undefined",
}

From the above information we can gather the error message, but nothing else. Since we didn't access the stack property on the error object and store it some place, it will not exist for us in memory. Not having the error is a roadblock. The recommendation is that if you are going to use promises that you do access and store the value of stack before executing process.abort.

In any event, we can assume that either the promise or the then function is the culprit. Use the ::jsfunctions command to get a listing of the functions in the server.js file then look at the closures that the then function references.

> ::jsfunctions -s server.js
            FUNC   #FUNCS NAME                                     FROM
    347ffb1198f1        1 <anonymous> (as promise.then.catch)      /server.js line 21
    347ffb1189b1        1 <anonymous> (as promise.then)            /server.js line 18
    347ffb118791        1 <anonymous> (as req.once)                /server.js line 12
    347ffb118421        1 <anonymous> (as req.on)                  /server.js line 8
    347ffb117cb9        1 <anonymous> (as Promise)                 /server.js line 6
    347ffb0d7049        1 <anonymous> (as server.listen)           /server.js line 29
    347ffb0d50d1        1 <anonymous> (as Http.createServer)       /server.js line 5
    347ffb014bb9        1 <anonymous> (as <anon>)                  /server.js line 1

> 347ffb1189b1::jsclosure
    "res": 347ffb116171: {
        "domain": 1e3b87204101: null,
        "_events": 347ffb116301: [...],
        "_eventsCount": 100000000: 1,
        "_maxListeners": 1e3b87204189: undefined,
        "output": 347ffb1164a9: [...],
        "outputEncodings": 347ffb116591: [...],
        "outputCallbacks": 347ffb116691: [...],
        "outputSize": 0: 0,
        "writable": 1e3b87204231: true,
        "_last": 1e3b87204299: false,
        "chunkedEncoding": 1e3b87204231: true,
        "shouldKeepAlive": 1e3b87204231: true,
        "useChunkedEncodingByDefault": 347ffb1174d1: [...],
        "sendDate": 347ffb111909: function updateOutgoingData,
        "_removedHeader": 63dffb63879: "OK",
        "_contentLength": 1f400000000: 500,
        "_hasBody": 1e3b87204189: undefined,
        "_trailer": 1e3b87204189: undefined,
    }
    "req": 347ffb1141d9: {
        "_readableState": 347ffb1142c9: [...],
        "readable": 1e3b87204299: false,
        "domain": 1e3b87204101: null,
        "_events": 347ffb114769: [...],
        "_eventsCount": 100000000: 1,
        "_maxListeners": 1e3b87204189: undefined,
        "socket": 347ffb10f681: [...],
        "connection": 347ffb10f681: [...],
        "httpVersionMajor": 100000000: 1,
        "httpVersionMinor": 100000000: 1,
        "httpVersion": 347ffb115c29: "1.1",
        "complete": 1e3b87204231: true,
        "headers": 347ffb114db1: [...],
        "rawHeaders": 347ffb114fa1: [...],
        "trailers": 347ffb115009: [...],
        "rawTrailers": 347ffb115071: [...],
        "upgrade": 1e3b87204299: false,
        "url": 63dffb20611: "/",
        "method": 347ffb09b909: "POST",
        "statusCode": 1e3b87204101: null,
        "statusMessage": 1e3b87204101: null,
        "client": 347ffb10f681: [...],
        "_consuming": 1e3b87204231: true,
        "_dumped": 1e3b87204299: false,

Unfortunately, the output of ::jsclosure doesn't contain the argument value passed into the then function. In fact, the then function has already executed and was out of scope when the catch executed. At this point, the best course of action is to filter the results of ::findjsobjects and look for the argument object passed to the then function. Unfortunately, the error message won't help to locate the object that caused the issue. Below is an abbreviated output of running the ::findjsobjects -v command.

::findjsobjects -v
          OBJECT #OBJECTS   #PROPS CONSTRUCTOR: PROPS
    347ffb105cc1        1        0 TTY
    347ffb10eeb9        1        0 WriteWrap
    1e3b872e51b9        1        0 process
    347ffb1128e1        1        0 HTTPParser
    347ffb112311        1        0 Timer
    347ffb10ba79        2        0 Signal
    1e3b872e51d1        2        0 <anonymous> (as <anon>)
    1e3b872e50c1        2        0 Float64Array
    347ffb08e2c1        3        0 Uint32Array
    347ffb0ac459        5        0 EventHandlers
    347ffa0c82c9        5        0 Uint8Array
    1e3b87245a11        7        0
    347ffa012449       48        0 ContextifyScript
     63dffb4fa41       87        0 Array
     63dffb4e729      440        0 Object
    347ffb11b001        1        1 Object: hi
    347ffb114769        1        1 EventHandlers: data
    347ffb10c189        1        1 EventHandlers: end
    347ffb105481        1        1 Object: isTTY
    347ffb0fa849        1        1 Object: handles
    347ffb0d3f39        1        1 Object: aborted
    347ffb0ac7a9        1        1 Object: path
    347ffb0a96e1        1        1 Object: globalAgent
    347ffb095e61        1        1 Object: ...
    347ffb095e29        1        1 Object: /server.js
    347ffb08c3c1        1        1 Object: methods
    347ffb08c301        1        1 Object: hasIntl
    347ffb00f691        1        1 Object: STATUS_CODES
     63dffb17e81        1        1 Object:
    347ffb08e1c9        1        2 Object: INSPECT_MAX_BYTES, kMaxLength

Fortunately, we do know that the request made included the property hi, which will allow us to locate the object and print it more easily. However, in the real world, this is likely not going to be the case unless you can correlate the request that caused the issue with the full request log to obtain the value of the request. Even then, it would require looking through the source to reconstruct the details of the request values with the variables stored in memory. Below is an example of running ::findjsobjects with the property argument to find the object with the property hi.

> ::findjsobjects -v -p hi
347ffb11b001
> 347ffb11b001::jsprint
{
    "hi": "world",
}

From this information, we have evidence that the request doesn't have the properties that we expect. However, this is only an assumption since the object doesn't have any references to the raw request and we cannot relate it to the function that caused the exception. Again, if you are going to use promises and plan to use post-mortem debugging, you should do your best to retain the stack string information passed to the catch handler. One idea is to rethrow the error with the stack as the message then to catch this information. The error won't include as much information as the non-promises JavaScript server previously, but it will include more information. Below is an example of an updated catch handler along with the resulting ::jsprint output.

  }).catch((reason) => {
    res.writeHead(500);
    res.end();
    console.error(reason);
    throw new Error(reason.stack);
  }).catch((reason) => {
    process.abort();
  });
> 1ce5291225d9::jsprint
{
    "message": "TypeError: Cannot read property 'bar' of undefined
    at promise.then (/server.js:20:20)
    at process._tickCallback (internal/process/next_tick.js:103:7)",
}

Another option is to use the unhandledRejection event on the process to intercept and log the error details. The unhandledRejection event executes when a rejected promise doesn't have a catch handler to deal with the rejection. Below is the promises example with updated code to show how to use the unhandledRejection.

'use strict';

const Http = require('http');

const server = Http.createServer((req, res) => {
  const promise = new Promise((resolve, reject) => {
    let result = '';
    req.on('data', (data) => {
      result += data;
    });

    req.once('end', () => {
      const obj = JSON.parse(result);
      resolve(obj);
    });
  });

  promise.then((obj) => {
    res.writeHead(200);
    res.end(obj.foo.bar);
  });
});

process.on('unhandledRejection', (reason, promise) => {
  console.error(reason);
  throw reason;
});

server.listen(8080, () => {
  console.log('listening at http://localhost:8080');
});

The output from the above example looks like the following:

 node --abort-on-uncaught-exception server.js
listening at http://localhost:8080
TypeError: Cannot read property 'bar' of undefined
    at promise.then (/server.js:20:20)
    at process._tickCallback (internal/process/next_tick.js:103:7)
Uncaught TypeError: Cannot read property 'bar' of undefined

FROM
process.on (/server.js:26:3)
emitTwo (events.js:106:13)
process.emit (events.js:191:7)
emitPendingUnhandledRejections (internal/process/promises.js:44:22)
process._tickCallback (internal/process/next_tick.js:104:7)
Illegal Instruction (core dumped)

Fortunately, the standard error output does include a stack trace with the correct line number and error that occurred. However, the stack trace information contained in the core dump is still a moment too late. Below is the output of running ::jsstack -v on the core dump.

> ::jsstack -v
native: v8::base::OS::Abort+9
native: v8::internal::Runtime_Throw+0x52
        (1 internal frame elided)
js:     <anonymous> (as process.on)
          file: //server.js
          posn: line 24
          this: 33ca8eee0f51 (JSObject: process)
          arg1: 3d008f11ca29 (JSObject: TypeError)
          arg2: 3d008f119919 (<unknown>)

             19     res.writeHead(200);
             20     res.end(obj.foo.bar);
             21   });
             22 });
             23
             24 process.on('unhandledRejection', (reason, promise) => {
             25   console.error(reason);
             26   throw reason;
             27 });
             28
             29 server.listen(8080, () => {
             30   console.log('listening at http://localhost:8080');
             31 });
             32

js:     emitTwo
          file: events.js
          posn: line 104

Following the full output of the stack in MDB doesn't get you closer to where the real exception occurred. Instead, the stack goes into v8 and Node.js internal functions instead of into the request end event handler. Even printing the output of the TypeError object above only contains the error message.

Conclusion

At this point, it should be obvious that post-mortem debugging with promises isn't as easy as without. That isn't to say that you should avoid promises. Instead, you need to be aware of the tradeoffs and plan for how to debug production issues when they inevitably do occur. Remember to abort the process manually or to rethrow in the unhandledRejection event to get a core dump.

Even with the core dump, backtracking and pinning down the root cause will likely require additional work. This difficulty is largely a result with how promises don't crash immediately upon an exception.

Unfortunately, the fact is that core dumps generated when using promises are created too late and are harder to use for post-mortem debugging purposes than those generated without.