Announcing the Latest Node Update

I am pleased to announce a new stable version of Node.

This branch brings significant improvements to many areas, with a focus on API polish, ease of use, and backwards compatibility.

For a very brief overview of the relevant API changes since v0.8, please see the API changes wiki page.

Streams2

In a previous post, we introduced the "Streams2" API changes. If you haven't reviewed the changes, please go read that now (at leastthe tl;dr section).

The changes to the stream interface have been a long time in the making. Even from the earliest days of Node, we've all sort of known that this whole "data events come right away" and "pause() is advisory" stuff was unnecessarily awful. In v0.10, we finally bit the bullet and committed to making drastic changes in order to make these things better.

More importantly, all streams in Node-core are built using the same set of easily-extended base classes, so their behavior is much more consistent, and it's easier than ever to create streaming interfaces in your own userland programs.

In fact, the Streams2 API was developed while using it for modules in the npm registry. At the time of this writing, 37 published Node modules already are using the readable-stream library as a dependency. The readable-stream npm package allows you to use the new Stream interface in your legacy v0.8 codebase.

Domains and Error Handling

The domain module has been elevated from "Experimental" to "Unstable" status. It's been given more of a first-class treatment internally, making it easier to handle some of the edge cases that we found using Domains for error handling in v0.8. Specifically, domain error handler no longer relies on process.on('uncaughtException') being raised, and the C++ code in Node is domain-aware.

If you're not already using Domains to catch errors in your programs, and you've found yourself wishing that you could get better debugging information when errors are thrown (especially in the midst of lots of requests and asynchronous calls), then definitely check it out.

Faster process.nextTick

In v0.8 (and before), the process.nextTick() function scheduled its callback using a spinner on the event loop. This usually caused the callback to be fired before any other I/O. However, it was not guaranteed.

As a result, a lot of programs (including some parts of Node's internals) began using process.nextTick as a "do later, but before any actual I/O is performed" interface. Since it usually works that way, it seemed fine.

However, under load, it's possible for a server to have a lot of I/O scheduled, to the point where the nextTick gets preempted for something else. This led to some odd errors and race conditions, which could not be fixed without changing the semantics of nextTick.

So, that's what we did. In v0.10, nextTick handlers are run right after each call from C++ into JavaScript. That means that, if your JavaScript code calls process.nextTick, then the callback will fire as soon as the code runs to completion, but before going back to the event loop. The race is over, and all is good.

However, there are programs out in the wild that use recursive calls to process.nextTick to avoid pre-empting the I/O event loop for long-running jobs. In order to avoid breaking horribly right away, Node will now print a deprecation warning, and ask you to use setImmediate for these kinds of tasks instead.

Latency and Idle Garbage Collection

One of the toughest things to get right in a garbage collected language is garbage collection. In order to try to avoid excessive memory usage, Node used to try to tell V8 to collect some garbage whenever the event loop was idle.

However, knowing exactly when to do this is extremely difficult. There are different degrees of "idleness", and if you get it wrong, you can easily end up spending massive amounts of time collecting garbage when you'd least expect. In practice, disabling the IdleNotification call yields better performance without any excessive memory usage, because V8 is pretty good at knowing when it's the best time to run GC.

So, in v0.10, we just ripped that feature out. (According to another point of view, we fixed the bug that it was ever there in the first place.) As a result, latency is much more predictable and stable. You won't see a difference in the benchmarks as a result of this, but you'll probably find that your app's response times are more reliable.

Performance and Benchmarks

When the Streams2 feature first landed in master, it disrupted a lot of things. We focused first on correctness rather than speed, and as a result of that, we got a correct implementation that was significantly slower.

We have a consistent rule in Node, that it cannot be allowed to get slower for our main use cases. It took a lot of work, but over the last few months, we've managed to get v0.10 to an appropriate level of performance, without sacrificing the API goals that we had in mind.

Benchmarks are complicated beasts. Until this release, we've gotten by with a pretty ad-hoc approach to running benchmarks. However, as we started actually having to track down regressions, the need for a more comprehensive approach was obvious.

Work is underway to figure out the optimum way to get statistically significant benchmark results in an automated way. As it is, we're still seeing significant jitter in some of the data, so take the red and green colors with a grain of salt.

The benchmarks below were run on an Apple 13-inch, Late 2011 MacBook Pro with a 2.8 GHz Intel Core i7 processor, 8GB of 1333MHz DDR3 RAM, running OS X Lion 10.7.5 (11G63b). The numbers are slightly different on Linux and SmartOS, but the conclusions are the same. The raw data is available, as well.

Benchmarks: http

Node is for websites, and websites run over HTTP, so this is the one that people usually care the most about:

http/cluster.js type=bytes length=4: v0.10: 16843 v0.8: 16202 ................. 3.96%http/cluster.js type=bytes length=1024: v0.10: 15505 v0.8: 15065 .............. 2.92%http/cluster.js type=bytes length=102400: v0.10: 1555.2 v0.8: 1566.3 ......... -0.71%http/cluster.js type=buffer length=4: v0.10: 15308 v0.8: 14763 ................ 3.69%http/cluster.js type=buffer length=1024: v0.10: 15039 v0.8: 14830 ............. 1.41%http/cluster.js type=buffer length=102400: v0.10: 7584.6 v0.8: 7433.6 ......... 2.03%http/simple.js type=bytes length=4: v0.10: 12343 v0.8: 11761 .................. 4.95%http/simple.js type=bytes length=1024: v0.10: 11051 v0.8: 10287 ............... 7.43%http/simple.js type=bytes length=102400: v0.10: 853.19 v0.8: 892.75 .......... -4.43%http/simple.js type=buffer length=4: v0.10: 11316 v0.8: 10728 ................. 5.48%http/simple.js type=buffer length=1024: v0.10: 11199 v0.8: 10429 .............. 7.38%http/simple.js type=buffer length=102400: v0.10: 4942.1 v0.8: 4822.9 .......... 2.47%

What we see here is that, overall, HTTP is faster. It's just slightly slower (1-5%) when sending extremely large string messages (ie type=bytes rather than type=buffer). But otherwise, things are about the same, or slightly faster.

Benchmarks: fs

The fs.ReadStream throughput is massively improved, and less affected by the chunk size argument:

fs/read-stream buf size=1024: v0.10: 1106.6 v0.8: 60.597 ................... 1726.12%fs/read-stream buf size=4096: v0.10: 1107.9 v0.8: 235.51 .................... 370.44%fs/read-stream buf size=65535: v0.10: 1108.2 v0.8: 966.84 .................... 14.62%fs/read-stream buf size=1048576: v0.10: 1103.3 v0.8: 959.66 .................. 14.97%fs/read-stream asc size=1024: v0.10: 1081.5 v0.8: 62.218 ................... 1638.21%fs/read-stream asc size=4096: v0.10: 1082.3 v0.8: 174.78 .................... 519.21%fs/read-stream asc size=65535: v0.10: 1083.9 v0.8: 627.91 .................... 72.62%fs/read-stream asc size=1048576: v0.10: 1083.2 v0.8: 627.49 .................. 72.62%fs/read-stream utf size=1024: v0.10: 46.553 v0.8: 16.944 .................... 174.74%fs/read-stream utf size=4096: v0.10: 46.585 v0.8: 32.933 ..................... 41.45%fs/read-stream utf size=65535: v0.10: 46.57 v0.8: 45.832 ...................... 1.61%fs/read-stream utf size=1048576: v0.10: 46.576 v0.8: 45.884 ................... 1.51%

The fs.WriteStream throughput increases considerably, for most workloads. As the size of the chunk goes up, the speed is limited by the underlying system and the cost of string conversion, so v0.8 and v0.10 converge. But for smaller chunk sizes (like you'd be more likely to see in real applications), v0.10 is a significant improvement.

fs/write-stream buf size=2: v0.10: 0.12434 v0.8: 0.10097 ..................... 23.15%fs/write-stream buf size=1024: v0.10: 59.926 v0.8: 49.822 .................... 20.28%fs/write-stream buf size=65535: v0.10: 180.41 v0.8: 179.26 .................... 0.64%fs/write-stream buf size=1048576: v0.10: 181.49 v0.8: 176.73 .................. 2.70%fs/write-stream asc size=2: v0.10: 0.11133 v0.8: 0.08123 ..................... 37.06%fs/write-stream asc size=1024: v0.10: 53.023 v0.8: 36.708 .................... 44.45%fs/write-stream asc size=65535: v0.10: 178.54 v0.8: 174.36 .................... 2.39%fs/write-stream asc size=1048576: v0.10: 185.27 v0.8: 183.65 .................. 0.88%fs/write-stream utf size=2: v0.10: 0.11165 v0.8: 0.080079 .................... 39.43%fs/write-stream utf size=1024: v0.10: 45.166 v0.8: 32.636 .................... 38.39%fs/write-stream utf size=65535: v0.10: 176.1 v0.8: 175.34 ..................... 0.43%fs/write-stream utf size=1048576: v0.10: 182.3 v0.8: 182.82 .................. -0.28%

Benchmark: tls

We switched to a newer version of OpenSSL, and the CryptoStream implementation was significantly changed to support the Stream2 interface.

The throughput of TLS connections is massively improved:

tls/throughput.js dur=5 type=buf size=2: v0.10: 0.90836 v0.8: 0.32381 ....... 180.52%tls/throughput.js dur=5 type=buf size=1024: v0.10: 222.84 v0.8: 116.75 ....... 90.87%tls/throughput.js dur=5 type=buf size=1048576: v0.10: 403.17 v0.8: 360.42 .... 11.86%tls/throughput.js dur=5 type=asc size=2: v0.10: 0.78323 v0.8: 0.28761 ....... 172.32%tls/throughput.js dur=5 type=asc size=1024: v0.10: 199.7 v0.8: 102.46 ........ 94.91%tls/throughput.js dur=5 type=asc size=1048576: v0.10: 375.85 v0.8: 317.81 .... 18.26%tls/throughput.js dur=5 type=utf size=2: v0.10: 0.78503 v0.8: 0.28834 ....... 172.26%tls/throughput.js dur=5 type=utf size=1024: v0.10: 182.43 v0.8: 100.3 ........ 81.88%tls/throughput.js dur=5 type=utf size=1048576: v0.10: 333.05 v0.8: 301.57 .... 10.44%

However, the speed at which we can make connections is somewhat reduced:

tls/tls-connect.js concurrency=1 dur=5: v0.10: 433.05 v0.8: 560.43 .......... -22.73%tls/tls-connect.js concurrency=10 dur=5: v0.10: 438.38 v0.8: 577.93 ......... -24.15%

At this point, it seems like the connection speed is related to the new version of OpenSSL, but we'll be tracking that further.

TLS still has more room for improvement, but this throughput increase is a huge step.

Benchmark: net

The net throughput tests tell an interesting story. When sending ascii messages, they're much faster.

net/net-c2s.js len=102400 type=asc dur=5: v0.10: 3.6551 v0.8: 2.0478 ......... 78.49%net/net-c2s.js len=16777216 type=asc dur=5: v0.10: 3.2428 v0.8: 2.0503 ....... 58.16%net/net-pipe.js len=102400 type=asc dur=5: v0.10: 4.4638 v0.8: 3.0798 ........ 44.94%net/net-pipe.js len=16777216 type=asc dur=5: v0.10: 3.9449 v0.8: 2.8906 ...... 36.48%net/net-s2c.js len=102400 type=asc dur=5: v0.10: 3.6306 v0.8: 2.0415 ......... 77.84%net/net-s2c.js len=16777216 type=asc dur=5: v0.10: 3.2271 v0.8: 2.0636 ....... 56.38%

When sending Buffer messages, they're just slightly slower. (This difference is less than the typical variability of the test, but they were run 20 times and outliers were factored out for this post.)

net/net-c2s.js len=102400 type=buf dur=5: v0.10: 5.5597 v0.8: 5.6967 ......... -2.40%net/net-c2s.js len=16777216 type=buf dur=5: v0.10: 6.1843 v0.8: 6.4595 ....... -4.26%net/net-pipe.js len=102400 type=buf dur=5: v0.10: 5.6898 v0.8: 5.986 ......... -4.95%net/net-pipe.js len=16777216 type=buf dur=5: v0.10: 5.9643 v0.8: 5.9251 ....... 0.66%net/net-s2c.js len=102400 type=buf dur=5: v0.10: 5.473 v0.8: 5.6492 .......... -3.12%net/net-s2c.js len=16777216 type=buf dur=5: v0.10: 6.1986 v0.8: 6.3236 ....... -1.98%

When sending utf-8 messages, they're a bit slower than that:

net/net-c2s.js len=102400 type=utf dur=5: v0.10: 2.2671 v0.8: 2.4606 ......... -7.87%net/net-c2s.js len=16777216 type=utf dur=5: v0.10: 1.7434 v0.8: 1.8771 ....... -7.12%net/net-pipe.js len=102400 type=utf dur=5: v0.10: 3.1679 v0.8: 3.5401 ....... -10.51%net/net-pipe.js len=16777216 type=utf dur=5: v0.10: 2.5615 v0.8: 2.7002 ...... -5.14%net/net-s2c.js len=102400 type=utf dur=5: v0.10: 2.2495 v0.8: 2.4578 ......... -8.48%net/net-s2c.js len=16777216 type=utf dur=5: v0.10: 1.7733 v0.8: 1.8975 ....... -6.55%

You might suspect that this is a result of the new Streams implementation. However, running the same benchmarks without using any of the code in Node's lib/ folder, just calling into the C++ bindings directly, yields consistently similar results.

This slight regression comes along with significant improvements in everything that sits on top of TCP (that is, TLS and HTTP).

Keep an eye out for more work in this area. Fast is never fast enough!

Continuous Integration

To support a higher degree of stability, and hopefully catch issues sooner, we have a Jenkins instance running every commit through the test suite, on each operating system we support. You can watch the action at the Node Jenkins web portal.

Coming soon, we'll have automatically generated nightly builds every day, and eventually, the entire build process will be automated.

While we're pretty rigorous about running tests and benchmarks, it's easy for things to slip by, and our ad-hoc methods are not cutting it any longer. This promises a much lower incidence of the sort of regressions that delayed the release of v0.10 for several months.

Growing Out

A year ago, we said that the innovation in the Node universe would be happening in userland modules. Now, we've finally taken that to its logical conclusion, and moved our iteration on core modules into userland as well. Things like readable-stream and tlsnappy allow us to get much more user-testing, experimentation, and contributions to a feature.

The userland module can live on as a compatibility layer so that libraries can use the new features, even if they need to support older versions of Node. This is a remarkably effective way to do node-core development. Future developments will continue to be iterated in userland modules.

Growing Up

The question comes up pretty often whether Node is "ready for prime time" yet. I usually answer that it depends on your requirements for "prime time", but Node has been powering some high profile sites, and the options for "real" companies using Node for The Business are better than ever.

It would be out of scope to try to provide an exhaustive list of all the companies using Node, and all of the options for support and training. However, here are a few resources that are quickly expanding to fill the "Enterprise Node" space.

For those looking for commercial support, StrongLoop (Ben Noordhuis & Bert Belder's company) has released a distribution containing node v0.10 that they will support on Windows, Mac, Red Hat/Fedora, Debian/Ubuntu and multiple cloud platforms. You can download their Node distribution here.

The Node Firm is a worldwide network of key Node contributors and community members that help organizations succeed with Node. Through corporate training, consulting, architectural guidance, and ongoing consulting subscriptions, they have helped Skype, Qualcomm, and others quickly and effectively embrace Node.

Node would not be what it is without npm, and npm would not be what it is without the registry of published modules. However, relying on the public registry is problematic for many enterprise use-cases. Iris npm is a fully managed private npm registry, from Iris Couch, the team that runs the public npm registry in production.

Joyent, the company you probably know as the custodian of the Node project, provides high performance cloud infrastructure specializing in real-time web and mobile applications. Joyent uses Node extensively throughout their stack, and provides impressive post-mortem debugging and real-time performance analysis tools for Node.js applications. They are my also employer, so I'd probably have to get a "real" job if they weren't sponsoring Node :)

Next Up: v0.12

The focus of Node v0.12 will be to make HTTP better. Node's current HTTP implementation is pretty good, and clearly sufficient to do a lot of interesting things with. However:

  1. The codebase is a mess. We share a lot of code between the Client and Server implementations, but do so in a way that makes it unnecessarily painful to read the code or fix bugs. It will be split up so that client and server are clearly separated, and have cleaner interfaces.
  2. The socket pooling behavior is confusing and weird. We will be adding configurable socket pooling as a standalone utility. This will allow us to implement KeepAlive behavior in a more reasonable manner, as well as providing something that you can use in your own programs.

There is some experimentation going on in the tlsnappy module, which may make its way back into the core TLS implementation and speed things up considerably.

1.0

After 0.12, the next major stable release will be 1.0. At that point, very little will change in terms of the day-to-day operation of the project, but it will mark a significant milestone in terms of our stability and willingness to add new features. However, we've already gotten strict about maintaining backwards compatibility, so this won't really be so much of a shift.

New versions will still come out, especially to pull in new versions of our dependencies, and bugs will continue to be fixed. There's been talk of pinning our release cycles to V8, and automating the release process in some interesting ways.

The goal of Node has always been to eventually be "finished" with the core program. Of course, that's a rather lofty goal, perhaps even impossible. But as we take Node to more places, and use it in more ways, we're getting closer to the day when the relevant innovation happens outside of the core Node program.

Stability in the platform enables growth on top of it.

And now, the traditional release notes:

2013.03.11, Version 0.10.0 (Stable)

  • npm: Upgrade to 1.2.14
  • core: Append filename properly in dlopen on windows (isaacs)
  • zlib: Manage flush flags appropriately (isaacs)
  • domains: Handle errors thrown in nested error handlers (isaacs)
  • buffer: Strip high bits when converting to ascii (Ben Noordhuis)
  • win/msi: Enable modify and repair (Bert Belder)
  • win/msi: Add feature selection for various Node parts (Bert Belder)
  • win/msi: use consistent registry key paths (Bert Belder)
  • child_process: support sending dgram socket (Andreas Madsen)
  • fs: Raise EISDIR on Windows when calling fs.read/write on a dir (isaacs)
  • unix: fix strict aliasing warnings, macro-ify functions (Ben Noordhuis)
  • unix: honor UV_THREADPOOL_SIZE environment var (Ben Noordhuis)
  • win/tty: fix typo in color attributes enumeration (Bert Belder)
  • win/tty: don't touch insert mode or quick edit mode (Bert Belder)

Source Code: nodejs.org/dist/v0.10.0/node-v0.10.0.tar.gz
Macintosh Installer (Universal): nodejs.org/dist/v0.10.0/node-v0.10.0.pkg
Windows Installer: nodejs.org/dist/v0.10.0/node-v0.10.0-x86.msi
Windows x64 Installer: nodejs.org/dist/v0.10.0/x64/node-v0.10.0-x64.msi
Windows x64 Files: nodejs.org/dist/v0.10.0/x64
Linux 32-bit Binary: nodejs.org/dist/v0.10.0/node-v0.10.0-linux-x86.tar.gz
Linux 64-bit Binary: nodejs.org/dist/v0.10.0/node-v0.10.0-linux-x64.tar.gz
Solaris 32-bit Binary: nodejs.org/dist/v0.10.0/node-v0.10.0-sunos-x86.tar.gz
Solaris 64-bit Binary: nodejs.org/dist/v0.10.0/node-v0.10.0-sunos-x64.tar.gz
Other release files: nodejs.org/dist/v0.10.0
Website: nodejs.org/docs/v0.10.0
Documentation: nodejs.org/docs/v0.10.0/api

Shasums:

b9e9bca99cdb5563cad3d3f04baa262e317b827c  node-v0.10.0-darwin-x64.tar.gz0227c9bc3df5b62267b9d4e3b0a92b3a70732229  node-v0.10.0-darwin-x86.tar.gz9f5f7350d6f889ea8e794516ecfea651e8e53d24  node-v0.10.0-linux-x64.tar.gzcc5f1cd6a2f2530bc400e761144bbaf8fcb66cc4  node-v0.10.0-linux-x86.tar.gz42c14b7eab398976b1ac0a8e6e96989059616af5  node-v0.10.0-sunos-x64.tar.gzddcadbac66d1acea48aa6c5462d0a0d7308ea823  node-v0.10.0-sunos-x86.tar.gz70eacf2cca7abec79fac4ca502e8d99590a2108a  node-v0.10.0-x86.msic48c269b9b0f0a95e6e9234d4597d1c8a1c45c5a  node-v0.10.0.pkg7321266347dc1c47ed2186e7d61752795ce8a0ef  node-v0.10.0.tar.gzf8c6f55469551243ea461f023cc57c024f57fef2  node.exe253ae79e411fcfddcf28861559ececb4b335db64  node.expacb8febb5ea714c065f201ced5423b0838fdf1b6  node.lib0fdad1400036dd26d720070f783d3beeb3bb9c0a  node.pdbabcaf8ef606655a05e73ee5d06715ffd022aad22  x64/node-v0.10.0-x64.msie5d0c235629b26430b6e07c07ee2c7dcda82f35e  x64/node.exe43b3fb3a6aaf6a04f578ee607a9455c1e23acf08  x64/node.exp87dd6eb6c8127a1af0dcca639961441fc303d75a  x64/node.lib50aca715777fa42b854e6cfc56b6199a54aabd3c  x64/node.pdb


Post written by Isaac Z. Schlueter