State of Node.js Core Fireside Chat - Questions and Answers

April 25, 2019 - by Joyent

On April 3rd, Joyent brought together members of their engineering team to answer questions about the upcoming Node.js 12 release. Node.js continues to gain popularity and has even crossed over the million downloads a day threshold. Much of this growth can be attributed to its widespread adoption across web applications, mobile, and cloud native applications.

Our first panelist was Colin Ihrig, a member of the Node.js Technical Steering Committee and a frequent speaker at community events. Colin was joined by Wyatt Lyon Preul, who is a long-time Node.js contributor, co-creator of the hapi framework, and author of a few technical books. The last panelist was Lloyd Benson, an architect with expertise in operationalizing Node.js projects at massive scale.

During the discussion the panelists shared a wealth of information. As a result, we decided to post their discussion on our blog. If you would like to listen to the recorded chat instead you can click here to register. The following is a transcribed version of their fireside discussion.

Question #1: What are some operational best practices of Node.js we should consider as we move Node.js applications into production? What tools can I use?

Answer from Lloyd Benson: Here are recommendations for operationalizing Node. I would say the main one is to do it as quickly as possible with the resources you have.

• Centralized logging is very useful with tools like Splunk, Humio, and the ELK stack. You should create alerts based on logging that will help you catch and triage problems before they escalate. This isn’t specific to Node; it’s useful for any application.

• Tracing - You don’t have to instrument every call in your app. Depending on the framework, you can add a plugin or inject JavaScript so you know what’s going on in your application. I have found this useful in many instances.

• Automatic restarts (Docker/systemd/pm2/etc.) are handy. For instance, if you are hit by a DDoS (Distributed Denial of Service) attack, or you are getting a bunch of traffic and running out of memory, how do you recover? I recommend getting as close to the operating system as possible.

On Linux, the common way would be to use systemd or PM2 for this type of thing. If you have a problem with your PM2 process going down, monitoring it will help. But honestly, it is better to have the operating system handle that situation.

• node-report - This is an npm module that can dump human-readable diagnostic information about your Node process. It is actually built into newer versions of Node. There is lots of good information there.

• Core file analysis - This is the last hope if something really bad happened. You can get a lot of information out of a core file with llnode. Before this was available, you had to use SmartOS.

• Metrics for Node - It’s important to look at your event loop delay. In newer versions of Node, you can use the perf_hooks module. I would look at your event emitter before and after those events and see whether the delay increases over time without coming back down. This could be an indicator that you need to do some optimization.

Heap space is an important one. You get about 1.5 GB; if you are creating a lot of objects or have leaks, you will see that number increase. If you hit that 1.5 GB threshold, your application will crash. If you are automatically restarting it, great, but you want to avoid getting there. Things will get slower and slower as your garbage collector has to work harder and harder.

• Take all these metrics and monitor them over time. You can write solutions for monitoring thresholds over time. If your application is restarted, don’t ignore it. You need to ask, why did it restart?
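
The event loop delay and heap numbers mentioned above can be sampled from inside the process. The following is a minimal sketch, assuming a Node version that provides perf_hooks.monitorEventLoopDelay() (Node 11.10 or later). The names sampleMetrics and startMonitoring, the 20 ms resolution, and the reporting interval are illustrative choices, not something the panel prescribed.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');
const v8 = require('v8');

// Convert one snapshot of the histogram and heap statistics into plain numbers.
// Histogram values are reported in nanoseconds, heap sizes in bytes.
function sampleMetrics(histogram) {
  const heap = v8.getHeapStatistics();
  return {
    loopDelayMeanMs: histogram.mean / 1e6,
    loopDelayMaxMs: histogram.max / 1e6,
    heapUsedMb: heap.used_heap_size / (1024 * 1024),
    heapLimitMb: heap.heap_size_limit / (1024 * 1024)
  };
}

// Hypothetical helper: report a sample every intervalMs and return a stop function.
function startMonitoring(intervalMs, onSample) {
  const histogram = monitorEventLoopDelay({ resolution: 20 });
  histogram.enable();
  const timer = setInterval(() => {
    onSample(sampleMetrics(histogram));
    histogram.reset(); // start a fresh window so each sample covers one interval
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for monitoring
  return function stop() {
    clearInterval(timer);
    histogram.disable();
  };
}
```

Calling startMonitoring(10000, console.log) would print a sample every ten seconds. A heapUsedMb value that keeps climbing toward heapLimitMb, or a mean delay that rises and never comes back down, is the pattern worth alerting on.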

Question #2: My particular interest in Node recently has been as I’ve been trying to instrument a distributed tracing strategy for our Node services here at {my company}, which brought me to async_hooks for managing my context. The API is super powerful and I really love it, but we’ve seen a significant perf hit as we’ve tried to integrate it with some of our services. I’ve been trying to follow discussions in the Node community about benchmarking and perf improvements, but would love to hear more detail about where we are/what’s coming up for that particular feature. Our primary use case for node is to back our react front end services, which we have more and more of these days as we continue to break up our old monolith into microservices.

Answer from Wyatt: It looks like they are on the right path by adding additional logging around requests.

The question is more around adding distributed tracing and how you accomplish that with Node. Typically, the way we solve this is with additional logs and timing around a request, recording how long that request takes to get a response. Previously, we’ve relied on a specification called OpenTracing, which is part of the CNCF. It works in Node and other languages, so it’s nice that it’s language neutral. There are likely some OpenTracing modules for the framework you are using. If you are using hapi, you could use a module called traci that I wrote. You can use it to create hooks for requests coming in or going out. You could aggregate all that data in Zipkin or Jaeger and monitor where you are spending your time. It provides insight on how to improve performance and where the bottlenecks are in your system.
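
The "additional logs and timing around a request" idea can be sketched without any tracing library. This is not the OpenTracing API or the traci module, only a hand-rolled illustration. The withTiming wrapper, the x-trace-id header name, and the shape of the log entry are all assumptions made for the example.

```javascript
const crypto = require('crypto');

// Wrap a request handler so that every request is timed and logged
// together with an ID that can be passed along to downstream services.
function withTiming(handler, log) {
  return function timed(req, res) {
    // Reuse an incoming trace ID if present (hypothetical header name),
    // otherwise generate a new one for this request.
    const traceId = req.headers['x-trace-id'] ||
      crypto.randomBytes(8).toString('hex');
    const start = process.hrtime.bigint();

    // 'finish' fires once the response has been handed off to the OS.
    res.once('finish', () => {
      const durationMs = Number(process.hrtime.bigint() - start) / 1e6;
      log({ traceId, method: req.method, url: req.url, durationMs });
    });

    return handler(req, res);
  };
}
```

With the built-in http module this could be used as http.createServer(withTiming(handler, console.log)). A tracing system such as Zipkin or Jaeger adds what this sketch lacks: parent/child spans and aggregation across services.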

Question #3: Will TypeScript affect anything in Node? Will it affect programming in Node?

Answer from Colin: It is having an impact. Many users are expecting TypeScript typings with modules. It helps when you are using something like VS Code for IntelliSense. Long term, I would hope that TypeScript would be folded into JavaScript and go the way of CoffeeScript.

TypeScript provides a good path for developers coming from languages like Java or C# to ease into Node.js. If you are coming from JavaScript to TypeScript, there is more overhead to pick up. With TypeScript, you will end up typing more. It’s not a silver bullet. I have personally fixed bugs that we expected to be covered by TypeScript and they weren’t.

Question #4: What about best practices using Node.js in containers?

Answer from Lloyd: As I mentioned earlier, you want to get as close to the operating system as possible. If you are doing anything in containers and Docker, you can tell Docker to automatically restart your processes. However, you have to be careful in Node. There are some use cases where it’s generally a good practice that your init process is not Node. You can also use tini with Docker. A common thing to use is Alpine for your Docker images to keep those containers small in size. To keep up with versions, I’ll make a core container and then I’ll have patches in my operating system, tied to a specific version. I will push them out and have my Node location installed after that. Then I will import my custom version.

Question #5: We have someone who is interested in JAMstack or serverless web apps.

Answer from Colin: I haven’t used JAMstack, but I’m a big fan of serverless. Node is a perfect match for serverless because of the small runtime, and it starts quickly, which minimizes cold start penalties. If you want to squeeze out more startup performance, webpack all your stuff together to cut down the time for all of the require() calls inside your application. Every time I see a new serverless platform announced, Node is the first, or one of the first, languages that it supports and I think it’s a solid match.

Question #6: What should I do if my Node version is end of life?

Answer from Colin: I would migrate off of an end of life version of Node immediately. This is a problem because there are periodic security releases as dictated by the OpenSSL dependency. If you are running a version that is end of life, I would be willing to bet you are running an insecure version of Node at this point. I would recommend moving to the earliest supported version of Node. Right now that’s Node 6. I would look at changelogs on the Node.js blog.

It’s easy to fall behind on your dependencies. Often, modules get updated, so you should get in the habit of regularly checking your dependencies. Running npm outdated will tell you if your modules are outdated. You can also use a hosted solution like Greenkeeper to keep you up to date. Alternatively, do all that and put it in a CI (continuous integration) solution.

Question #7: What is being removed from the current Node version release?

Answer from Colin: Nothing significant is being removed. I would recommend looking over the Node 12 release issue. Most breaking changes are removing old, deprecated code. The process is to first implement a documentation deprecation that will last an entire release cycle. Then, we do a runtime deprecation. During that deprecation, you will see warnings being printed. That will last for another release cycle. If the ecosystem usage is low enough, then we decide to remove the feature. We still treat changes to some error messages as breaking changes. We are in the process of moving to a new error system that would avoid that. Major releases tend to be very boring. Semver minor releases contain the new features that people are often excited about.
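
In application or library code, the runtime deprecation stage described above can be imitated with util.deprecate(), which wraps a function so that a DeprecationWarning is printed the first time it is called. This is a sketch of the pattern, not Node core’s internal machinery. The names oldSum and newSum and the code DEP_EXAMPLE_0001 are made up for the example.

```javascript
const util = require('util');

// The replacement API that callers should migrate to.
function newSum(a, b) {
  return a + b;
}

// The old name keeps working, but calling it emits a DeprecationWarning
// (once per deprecation code) on stderr.
const oldSum = util.deprecate(
  newSum,
  'oldSum() is deprecated. Use newSum() instead.',
  'DEP_EXAMPLE_0001' // hypothetical deprecation code
);

// Behavior is unchanged during the deprecation period.
const result = oldSum(1, 2);
```

Running Node with --throw-deprecation turns such warnings into errors, which can help find remaining callers before a feature is finally removed.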

Question #8: Is there any Kubernetes support/framework? Like Spring Cloud for Java?

Answer from Colin: No, Node generally doesn’t buy into large frameworks, especially not as part of core. There are no plans for that in Node’s core.

Question #9: My nodeJS product is a gateway component that translates the RPC/protobuf API (from a C++ codebase) to REST and exposes it over different transports (http/socket.io/webRTC/iot). It gets bundled with this C++ server and is deployed through installers for our customers who use Windows. What are some best practices for deploying nodeJS codebase in Windows environments, any recommendations would be highly appreciated - thank you!

Answer from Wyatt: One thing I would recommend is that all the modules are tested out in Windows, but be cognizant that moving between Windows and Linux, there can be issues. In the hapi ecosystem, we have started testing all of our modules on Windows.

Answer from Lloyd: Something to note is that Windows is not case sensitive and this can cause problems since Linux is. For example, require()’ing a file might work on Windows, but not Linux because the casing is incorrect. Also, older versions of npm had issues around excessively long file paths that Windows could not deal with. New versions of npm mitigate that by flattening the node_modules/ directory.
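
One way to catch the casing problem before code moves from Windows to Linux is to compare the name you require() against what is actually on disk. The helper below is a hypothetical sketch, not part of Node or npm, and it only checks the final path segment, not the casing of parent directories.

```javascript
const fs = require('fs');
const path = require('path');

// Return true only if the last segment of filePath matches a directory
// entry exactly, including upper and lower case. On a case-insensitive
// filesystem fs.existsSync() alone would also accept the wrong casing.
function hasExactCase(filePath) {
  const dir = path.dirname(filePath);
  const base = path.basename(filePath);
  return fs.readdirSync(dir).includes(base);
}
```

A check like this could run in a test suite on Windows so that a require('./config') pointing at Config.js fails there too, instead of only after deployment to Linux.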

Question #10: What is the current state of safer Buffers?

Answer from Colin: Here is the background. Node exposed a Buffer() constructor and it works basically like malloc() in C. It returns an uninitialized block of memory. In JavaScript we overloaded the constructor, so you can pass in a number to get a block of a certain size, or pass in a string to get a Buffer representation of that string.

A few years back, it was reported that the ws module had a vulnerability where it was confusing which type was being passed to the constructor and could send back uninitialized memory to users. That is bad, since the memory might contain private information that you don’t want to share over the network, including things like keys, certificates, or passwords. We’ve added new functions to Node to work around the issue. We would like to remove the Buffer() constructor, but we can’t just yet because it would break a lot of existing user code.
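
The functions added for this purpose are Buffer.alloc(), Buffer.allocUnsafe(), and Buffer.from(), each of which does only one of the jobs the overloaded constructor used to do. The sketch below shows the difference. The toPayload helper is a hypothetical example of validating untrusted input, not something taken from the ws fix.

```javascript
// Zero-filled allocation: safe to hand to other code or send over the network.
const zeroed = Buffer.alloc(8);

// Explicit conversion from a string.
const fromText = Buffer.from('hello', 'utf8');

// Uninitialized allocation: faster, but it may contain old memory,
// so it must be fully overwritten before it is exposed to anyone.
const scratch = Buffer.allocUnsafe(8);
scratch.fill(0);

// Hypothetical helper for data that comes from a client. A number is
// rejected instead of being treated as "allocate this many bytes".
function toPayload(input) {
  if (typeof input !== 'string') {
    throw new TypeError('payload must be a string');
  }
  return Buffer.from(input, 'utf8');
}
```

Buffer.from() itself also throws a TypeError when given a number, so the type confusion described above cannot silently turn into an allocation.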

Question #11: Is Node.js expected to remain important once WebAssembly is prevalent on the client side?

Answer from Colin: My two cents is that Node will stay relevant. Anything that happens on the client side like wasm doesn’t impact Node on the server side. WebAssembly is not a replacement for JavaScript, but rather a compilation target.

Question #12: What documentation would we recommend during an upgrade?

Answer from Wyatt: I would look at changelogs from Node’s GitHub repository. Take a look at the version you’re migrating from to the version you’re migrating to. Unfortunately, there is no compiled list for all those changes.

Answer from Lloyd: You need to make sure you have a little time to maintain dependencies or keep up with your versions in order to mitigate problems. If you neglect software it will be bad.

Question #13: What have we learned from supporting enterprise customers? What common mistakes do we see?

Answer from Lloyd: From an operating system point of view, make sure it’s scalable. You should work with smaller chunks of code and do an application run. When adding something new to your load balancer for your application, make sure it scales out. If you can, add more servers until you can contain the problem, and then go back to the beginning. If you are seeing automatic restarts, look at all of your logs. People sometimes just rely on rebooting, and this will hide these problems. I recommend observing them with analytics. If you have more insight, you can react more quickly. Take those statistics over time, in small chunks, and review them.

Answer from Colin: The majority of problems seem to come from infrastructure or overall architecture, and not necessarily problems in the code. For problems in code, you can do profiling and rectify things fairly quickly. Use Node intelligently and don’t try to make it do too much. For example, don’t do TLS termination, or serve static files from Node if you can help it.