What I Did at nodeconf.eu

October 07, 2013 - by Mr. Max Bruning

The nodeconf.eu conference in beautiful Waterford, Ireland was, like nodedublin last year, an excellent chance to learn about all things Node.js. I want to thank Cian O'Maidin (and staff) once again for another great conference. Hopefully, I'll be back next year.

In this blog, I'll talk about what I did at the conference. Others have already blogged about the talks. The rest of this post is technical.

On the first day, TJ Fontaine and I took turns talking about Manta and about using DTrace and mdb. I went through a problem I ran into when I first started playing with node, and TJ showed a simple example that used Manta to sort through about 150 files of 300-500 MB each in about 5 seconds.

I'll leave TJ to blog about Manta. I'll take this opportunity to write up the problem I covered, and to briefly describe the other sessions and the talk I did at the conference.

I wrote a simple file server. The client side requests a file, and the server (written in Node.js, http://www.joyent.com/technology/nodejs) sends the contents of the file to the client.
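
The actual fileserver.js is in the repository linked at the end of this post. As a rough sketch only (not the exact code), a naive version built on restify and fs.readFile might look something like this:

// A minimal sketch of a naive file server, assuming restify.
// It reads the entire requested file into memory before responding.
var restify = require('restify');
var fs = require('fs');

var server = restify.createServer({ name: 'myapp' });

// Treat the request path as the path of the file to send back.
server.get(/.*/, function (req, res, next) {
  fs.readFile(req.path(), function myread(err, data) {
    if (err)
      return next(err);
    res.send(data.toString());
    next();
  });
});

server.listen(3000, function () {
  console.log('%s listening at %s', server.name, server.url);
});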

Starting the server:

# node fileserver.js &
[1] 93158
myapp listening at http://0.0.0.0:3000
#

And the client:

$ curl http://10.88.88.3:3000//var/tmp/nodeconf/fileserver.js > /tmp/foo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2015    0  2015    0     0  16812      0 --:--:-- --:--:-- --:--:-- 22142
$

This appears to work. But what happens when the file is large? Here, /var/tmp/bigfile is about 80 MB:

$ curl http://10.88.88.3:3000//var/tmp/bigfile > /tmp/foo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (52) Empty reply from server
$

And on the server:

FATAL ERROR: JS Allocation failed - process out of memory
[1]+  Exit 1                  node fileserver.js

The server has run out of memory, at least according to the error message. On some versions of Node.js, a core file is produced; on this version, there is no core file. The error message says that an allocation failed. We could try finding the error message in the Node.js source code, but let's try DTrace first. To allocate memory from the system, a process needs to call either brk(2) or mmap(2) (sbrk() calls brk(2)). Let's watch the brk(2) and mmap(2) system calls for errors (and the associated errno values). Here's a DTrace script to do that.

#!/usr/sbin/dtrace -qs

/* Record the requested mapping size on entry to mmap(2). */
syscall::mmap:entry
/pid == $target/
{
    self->sz = arg1;
}

/* On a failed mmap(2), report the size, errno, and the stack. */
syscall::mmap:return
/self->sz && arg1 == -1/
{
    printf("mmap failed for %d bytes, errno = %d\n", self->sz, errno);
    jstack(500,8000);
    self->sz = 0;
}

/* Record the requested new break address on entry to brk(2). */
syscall::brk:entry
/pid == $target/
{
    self->new_end = arg0;
}

/* On a failed brk(2), report the address, errno, and the stack. */
syscall::brk:return
/self->new_end && arg1 == -1/
{
    printf("brk failed at %x, errno = %d\n", self->new_end, errno);
    jstack(500,8000);
    self->new_end = 0;
}

We'll run the script after the server has started, just before we send the client request that causes the error.

# node fileserver.js &
[1] 51539
myapp listening at http://0.0.0.0:3000
#
# ./mem.d -p `pgrep node`

And run the client again.

$ curl http://10.88.88.3:3000//var/tmp/bigfile > /tmp/foo
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (52) Empty reply from server
$

Back in the window where we're running mem.d:

mmap failed for 503349392 bytes, errno = 11

              libc.so.1`0xfe8959d5
              node`_ZN2v88internal15MemoryAllocator21AllocateAlignedMemoryEjjNS0_13ExecutabilityEPNS0_13VirtualMemoryE+0x5b
              node`_ZN2v88internal15MemoryAllocator13AllocateChunkEiNS0_13ExecutabilityEPNS0_5SpaceE+0x53
              node`_ZN2v88internal15MemoryAllocator17AllocateLargePageEiPNS0_5SpaceENS0_13ExecutabilityE+0x28
              node`_ZN2v88internal16LargeObjectSpace11AllocateRawEiNS0_13ExecutabilityE+0x9b
              node`_ZN2v88internal4Heap11AllocateRawEiNS0_15AllocationSpaceES2_+0x152
              node`_ZN2v88internal4Heap22AllocateRawAsciiStringEiNS0_13PretenureFlagE+0x55
              node`_ZN2v88internal17AllocateRawStringINS0_14SeqAsciiStringEEEPNS0_11MaybeObjectEPNS0_7IsolateEi+0x23
              node`_ZN2v88internalL19SlowQuoteJsonStringIcNS0_14SeqAsciiStringELb0EEEPNS0_11MaybeObjectEPNS0_7IsolateENS0_6VectorIKT_EE+0x3f
              node`_ZN2v88internal23Runtime_QuoteJSONStringENS0_9ArgumentsEPNS0_7IsolateE+0xd6
              << internal code >>
              BasicJSONSerialize at native json.js position 8494
              stringify at native json.js position 10178
              << adaptor >>
              formatJSON at  position 278
              << adaptor >>
              format at  position 2258
              send at  position 8428
              << adaptor >>
              myread at  line 36
              (anon) as (anon) at fs.js position 14328
              << internal >>
              (anon) as (anon) at fs.js position 7136
              << adaptor >>
              << internal >>
              << entry >>
              node`_ZN2v88internalL6InvokeEbNS0_6HandleINS0_10JSFunctionEEENS1_INS0_6ObjectEEEiPS5_Pb+0xef
              node`_ZN2v88internal9Execution4CallENS0_6HandleINS0_6ObjectEEES4_iPS4_Pbb+0xd0
              node`_ZN2v88Function4CallENS_6HandleINS_6ObjectEEEiPNS1_INS_5ValueEEE+0x15c
              node`_ZN4node18MakeDomainCallbackEN2v86HandleINS0_6ObjectEEENS1_INS0_8FunctionEEEiPNS1_INS0_5ValueEEE+0x18f
              node`_ZN4node12MakeCallbackEN2v86HandleINS0_6ObjectEEENS1_INS0_6StringEEEiPNS1_INS0_5ValueEEE+0x9d
              node`_ZN4nodeL5AfterEP7uv_fs_s+0x131
              node`uv__work_done+0x83
              node`uv__async_event+0x56
              node`uv__async_io+0x84
              node`uv__io_poll+0x1a7
              node`uv_run+0xea
              node`_ZN4node5StartEiPPc+0x17b
              node`main+0x1b
              node`_start+0x83

The failing allocation is on a code path that goes through the myread function. More importantly, the errno value of 11 is EAGAIN (see /usr/include/sys/errno.h). According to the mmap(2) man page, EAGAIN is set when the mapping could not be locked in memory, or when there was insufficient room to reserve swap space for the mapping. We're not trying to lock the memory, so let's look at the swap space on the machine.

# swap -sh
total: 53M allocated + 0K reserved = 53M used, 203M available
#

It's trying to map 503349392 bytes (roughly 500 MB), but there isn't enough swap available. We could try increasing the swap space, but a more interesting question is: why is it trying to map roughly 500 MB when "bigfile" is only 80 MB? Answering this question is left as an exercise for the reader. The source code is at https://github.com/max123/nodeconfeu.

Also in this session, an attendee had written a version of the nQueens puzzle in Node.js. The process runs for a long time using up 100% of a CPU, and one question was: "What is node doing during this time?".

For this, we built a flame graph.

First, we start the program. The arguments are the board size and the number of generations to run, so this runs a 250x250 board for 100000 generations.

# node main 250 100000 &
[1] 52005
Running with size 250 in 100000 generations.
#

Let's take a high-level look at what the process is doing.

# prstat -cmLp `pgrep node`
Please wait...
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 52005 root      99 0.0 0.0 0.0 0.0 0.0 0.0 0.7   1 594   8   0 node/1
 52005 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.3 471   2 471   0 node/2
Total: 1 processes, 2 lwps, load averages: 0.99, 0.57, 0.25
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 52005 root      99 0.0 0.0 0.0 0.0 0.0 0.0 0.6   0 775   0   0 node/1
 52005 root     0.0 0.0 0.0 0.0 0.0 0.0 100 0.1 475   3 475   0 node/2
Total: 1 processes, 2 lwps, load averages: 0.99, 0.58, 0.25
(^c)
#

So, node is spending almost 100% of its time running user-level code. That could be anywhere in the node engine, or in the JavaScript that node is running. While the program is running, let's use the DTrace profile provider to sample where it is spending its time for one minute.

# dtrace -n 'profile-997/execname == "node" && arg1/{@[jstack(500,8000)]=count();} tick-60s{exit(0);}' > dtrace.out
dtrace: description 'profile-997' matched 2 probes
#
# ls -l dtrace.out
-rw-r--r-- 1 root root 349781 Sep 18 12:44 dtrace.out
#

So, that's a lot of output. Let's make a flame graph from it.

$ npm install -g stackvis
$ stackvis dtrace flamegraph-svg < dtrace.out > nqueens.svg

Then view nqueens.svg in a browser.

So, the program spent 98.13% of its time in the evaluate function in nqueens.js. Here's the function:

function evaluate(gene) {
  var board = gene;
  var threats = 0;

  var i, j;
  for(i = 0; i < board.length; ++i)
    for(j = 0; j < board.length; ++j)
      if (mutualThreat(board, i, j))
        ++threats;

  var maxThreats = board.length * (board.length - 1);
  var fitness = (maxThreats - threats) / maxThreats;

  return fitness;
}

So, the program spends a lot of time in this function. The next question was: "How can I make this run faster?". This program is completely compute-bound; it does no I/O until it is finished. To make the evaluate function run faster, I suggested trying loop unrolling, i.e., doing more than one piece of work for each inner loop iteration. So for(j = 0; j < board.length; ++j) might be replaced with for(j = 0; j < board.length; j += 5), then calling mutualThreat 5 times within the loop body (taking care of any leftover iterations when board.length is not a multiple of 5); a rough sketch of the idea follows below. The person who asked the question came back later and said he got a 70% performance improvement, but I did not see the code.
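
Here's a minimal sketch of what that unrolling might look like. This is illustrative only, not the attendee's actual code; mutualThreat is assumed to be the same helper used by evaluate above, and the inner loop is unrolled by 4 with a cleanup loop for the remainder:

// Illustrative only: evaluate with the inner loop unrolled by 4.
function evaluateUnrolled(board) {
  var threats = 0;
  var len = board.length;
  var i, j;

  for (i = 0; i < len; ++i) {
    // Do four columns per pass through the inner loop.
    for (j = 0; j + 3 < len; j += 4) {
      if (mutualThreat(board, i, j))     ++threats;
      if (mutualThreat(board, i, j + 1)) ++threats;
      if (mutualThreat(board, i, j + 2)) ++threats;
      if (mutualThreat(board, i, j + 3)) ++threats;
    }
    // Clean up any remaining columns when len is not a multiple of 4.
    for (; j < len; ++j)
      if (mutualThreat(board, i, j))
        ++threats;
  }

  var maxThreats = len * (len - 1);
  return (maxThreats - threats) / maxThreats;
}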

On the third day, after I did my talk on debugging with core files (view the slides), I spent a few hours in the afternoon giving an impromptu DTrace course to several people. This included a description of how DTrace works, what it is useful for, and several examples using it. Everyone who sees DTrace in action is interested, but it really becomes important when you use it to solve a real problem. In general, when I talk about DTrace (or mdb), I find it is better to go through the steps I take to solve a specific problem rather than showing a bunch of examples. When you walk through the steps taken to solve a real problem, you tend to need to use several different tools at different times to find the solution. It's always good to learn about new tools and techniques, especially in "real" situations. The talks at the conference did a good job of this. I'm looking forward to more of the same.

