Node.js on the Road: Q&A

Node.js on the Road is an event series aimed at sharing Node.js production user stories with the broader community. Watch for key learnings, benefits, and patterns around deploying Node.js.

Question and answer session from Node.js on the Road, Amsterdam.


Emcee: Justin Halsall @juice10
Director of Developer Programs, Bottlenose

TJ Fontaine @tjfontaine
Node.js Project Lead, Joyent

Marijn Deurloo @braindevils
CTO, imgZine

Dominiek Ter Heide @dominiek
CTO, Bottlenose

Luca Maraschi @lucamaraschi
Architect, Icemobile

This is TJ, this is Marijn, Luca and Dominiek over here. Yeah, let's clap. OK, so if you have a question, there's this great thing—most of you were born with them. You have hands. You can stick them in the air, or use any way to get my attention, and then I'll be able to direct your comment or question. We have one over there, what is it?

[audience question]

So the question was: can you comment on the harmony stuff—what's going to happen in a stable Node?

Right, so what we have is a scenario where V8 is our runtime, and they make decisions on which features are generally enabled, and we try not to get out in front of them on when those features are added. What we did add for 0.12—and if people need it for the really broken versions in 0.10—is that you can recompile Node with any V8 feature flag enabled.

So if you need a Node binary where you don't have to pass the harmony flags at runtime, you can actually do that. But Node's going to grow that support whenever V8 grows that support. We're not going to jump out ahead of it. Some of it is great—generators are helpful for a lot of people—but in my opinion not ready for prime time.
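For context, this is the kind of harmony feature being discussed—a minimal generator sketch. On the 0.11/0.12 line it only ran with the flag (`node --harmony`), or in a binary recompiled with the feature enabled as described above; on a stock stable Node of that era it was a syntax error.

```javascript
// Generators: one of the V8 harmony features behind a flag at the time.
// Run on 0.11/0.12 as: node --harmony generators.js
function* counter() {
  yield 1;
  yield 2;
  yield 3;
}

// Drive the iterator by hand (for...of was also behind a flag back then).
var values = [];
var it = counter();
var step = it.next();
while (!step.done) {
  values.push(step.value);
  step = it.next();
}
console.log(values); // [ 1, 2, 3 ]
```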

If Google's not ready to turn it on for all of those Chrome browsers out there, I'm not ready to turn it on for Node.

More hands here we go.

ES6 is developing rapidly now, and it's got its own nice little module system. Now I know you're not from npm, but it's still Node, so it's required to be a thing.

Well what's the future of packages in ES6?

So what's the future of packages in ES6?

So, ES6—right now my understanding is that they are still going back to the drawing board a little bit on modules. Which is the great part about standards: I don't have to, it's not my responsibility. But we do have motivated people working with the standards body to make sure that whatever comes out hopefully works well with Node.

So there's actually even a polyfill module right now for people who are trying to do that work. I know that the default stuff is still a problem for some of the use cases, but we're trying to do that. But Node—we care about backwards compatibility, so when ES6 releases modules (or whatever ES it happens in), and there are modules in it, and V8 gets it, and Node uses that, then at that point in time Node will have to figure out the best way to merge these two things together.

But backwards-compatibility-wise, we still have to be able to support those people who've written all those modules that use require. Since we have two questions on ES6—do any of you guys use harmony or any of those features?

It's not in production.

Fantastic, don't do it.

So it's not production ready.

Any hands, any more hands?

The guy in the back.

[audience question]

So how much was Node a conscious decision, for using it as glue, basically?

I don't think it was a conscious decision, I think it was a natural evolution. In our case, we started with a very slim backend and then moved from a more consumer product to an enterprise product where we had more and more backend. But I think for backends in general, the old principle from the Rails world of—what is it?

Skinny controller, fat model. That kind of went outside of MVC, where in a sense your whole controller level, or your backend application, becomes very skinny, and the fat model becomes these big data solutions like Elasticsearch—you can do so many powerful things. So yeah, that's how I see that.


I would also point out that for a lot of people introducing Node into a legacy environment—when I looked at what Luca said, it reminded me a lot of what Walmart did. So if you listen to what Eran Hammer has to say, he brought in Node specifically to be an analytics platform, and put it in front of a bunch of legacy Java services.

And then they just started doing analytics to convince people about what the platform was actually doing. Then they decided to provide a cohesive API for all these different backend services, so you could take all these legacy, SOAP-ridiculous backend things and turn them into a RESTful API that you can then consume from the mobile side.

So this glue, this composable piece that you put in there—they were able to turn those analytics and that API into a way to say: actually we need to do more caching here to be better, or we need to replace that service with a Node service, or we can now go tell that service team why they're screwing things up so badly for everybody else, and give them information about that. So it was a huge tool for bringing to light how their platform was actually performing, but also for introducing Node itself, and as a result, Walmart has adopted it in huge ways.


[audience question]

Repeat the question?

I can summarize.


So the question is: you're doing evil things with streams in the form of _readableState, or even _writableState—am I going to break you, and/or are we going to formalize some of these things to make it easier for you?

The first answer—which I think is the most fun—is that it's got an underbar in it, and I can do whatever I want. The second answer is we try really, really hard not to break that kind of stuff. Even when we didn't intend for something to be a feature, it's easy—because JavaScript lets you do all sorts of evil things—for people to depend on it.

And so we have all sorts of crazy things being held onto in Node from the 0.2 and 0.3 days, for various kinds of reasons. We try really hard not to break those. That being said, there are a lot of things in the streams API that need to be formalized and written down about what you can do, and we need to provide useful access to that.

A colleague of mine, Dave Pacheco—I've been really excited for him to get into streams, particularly object-mode streams, because he has a wonderful ability to break everything. He doesn't actually break the code; he'll break me down for a week, saying, let me talk to you about this API, I don't think it's right.

And so it's me going back and forth with him about how to do it, non-stop. So I'm really excited he's going to help reinvigorate the whole streams effort—and he writes fantastic documentation. At the end of it we're not necessarily going to end up with streams 4, but we're going to end up with accessors for the state data, and formal API access to those kinds of things, so you can have faith that what you're doing is going to keep working going forward, and you know how to do things like flush out the data in the readable state in a useful way.

I want to add on to that: are you guys using streams as part of your infrastructure? Some. You guys are big on streams. Not big? Yeah, so there are tons of conversations up here about people using async.js. I think most of you guys were using async, right? Are you using async?


You're what?

Nah, we're using Bluebird.

Bluebird, so promises. OK, so anyway, the conversation around control flow—it's not about promises versus callbacks. The best control flow library for Node is actually streams, our own flow control library. So, how many people have tried transform streams in Node? OK. You guys should check out the stream-adventure stuff and learn about Node streams. Transform streams and object-mode streams are amazing, especially if you're approaching it from a functional kind of angle. If you want something that looks monadic or whatever, you can totally approach Node in that way. You can understand the amount of state that you're carrying throughout the system. Or if you're on Unix and you like shell pipelines, this stuff works and looks exactly like a shell pipeline. So start playing around with streams if you haven't yet.

I'm wondering if any of the other speakers find themselves kind of hacking on Node internals every so often. Is that a thing that happens?

No. No, I'm joking. Yeah, well, we have this hacking mentality inside of Icemobile, so we really want to hack instead of following procedure and so on. To be very honest, I don't think we have any team member that hacks on the internals of Node.js, of Node core, but we're definitely open to doing that. So we're hacking on top of it, not deep inside. We're also very scared of making a pull request before TJ is going to validate it, so I'm going to start with the readme.

Yes sir.

Put the comma in the right place.


Yeah, I think for us it's kind of the same: we're not diving in and changing stuff in Node. We do that in Node modules, though, because we find a lot of stuff in Node modules not working, or not working as expected or as we wanted. But we're very careful not to do it in Node, sorry.

Well absolutely do it to Node. Just don't do something evil.

So don't be evil.

Any more questions?
Guy in the back again?

[audience question]


You didn't use any ORMs or anything like that?


So I'm wondering, to the other guys: when you moved over from the thing you were doing before—when you had Rails earlier—what were the things that you really missed in Node?

Well, I remember spending a lot of days just being totally lost in some sort of race condition bug. Just the change of going from a serial kind of language like Ruby to everything happening in parallel is a mind fuck in the beginning. But once you get over that hump it's marvelous—you just have to get over it, I think.

MDB on Linux and Mac would be really appreciated.

Because moving from Visual Studio, where you can see the whole memory stack easily—for us the real problem at this moment is that we're missing a low-level debugger and a profiler. So we're pushed to go on SmartOS, or hack with memwatch or whatever, and that is not really handy for newbies. So maybe just think about Linux, or Visual Studio.

I think for us it's usually finding the right modules. We spend a lot of time evaluating a lot of modules out there, and the process of finding the right one and standardizing on it—you saw our stack before, it's still changing every two weeks, I would say. And it's async vs. promises vs. whatever, it's needle vs. request vs. whatever. So we're still trying to find out what suits us and what doesn't break down.

I mean, you guys kind of hit on what I've deemed the four questions we hear all the time at Node on the Road events: how do I run Node, how do I keep Node running, how do I know what Node's doing when it's running, and what modules am I supposed to use? So these are the themes, and it's important to have events like these, because that's where everybody starts to share their stories of how they got there—until we can actually give some extra discoverability around that. So you guys coming up here and saying what modules you use is a big way to tell people: here's how I'm being successful with Node using these modules.

As far as the tooling around things: yeah, so SmartOS, and you can run DTrace. How many people know what DTrace is? Wow, that's fantastic, I like this. So DTrace is a dynamic tracing platform that you can run on OS X, FreeBSD, and SmartOS, which is an operating system that Joyent maintains. You can potentially run it in some fashion on some distribution, or on something from Oracle.

OK, so you can do a bunch of amazing things. Linux, on the other hand—DTrace needs to be built into the kernel there, and that's a lot of work and a lot of effort. There are people who are doing that; Oracle hypothetically has a version of it—hypothetically, definitely not for you to download for free. But there is also ktap. Node has some built-in static probes, and there's ktap and SystemTap, and those static probes can actually be used on that platform successfully. So if what you need to do is trace HTTP server request and response, or HTTP client request and response, you can actually use those kinds of utilities to do that.

On postmortem analysis: he mentioned MDB, the Modular Debugger, which is a fantastic tool that I spend a lot of time in, and whose syntax can be slightly arcane if you're not used to it—even for a debugger, and debuggers are not necessarily the most approachable things. But what we have is the ability, from a core file, to inspect all of the heap space of your Node process, so you can actually reconstruct all the JavaScript objects that were there. Unlike using heapdump to figure out where a memory leak is going on—where you have to know in advance to take a snapshot, and then load that 1 GB son of a gun into Chrome, which is not going to work that well—with MDB the process may die, and you can take that core file, inspect it, reconstruct state, and ask questions of it.

You can do that with Linux: you can take a core file from Linux to somewhere MDB is running and inspect that state there. Canonical tried to port MDB, and they decided it was going to take too much time and effort. It's all open source; all this stuff is out there for people. It's just that for Joyent, we write an operating system, so it's baked into the platform already.

It's difficult to justify that, but it's open source and we work with a bunch of people who try and do that kind of stuff all the time, so it's exciting and I will happily talk at length about MDB and DTrace to anybody who wants to afterwards, or if you have questions about it.

Great, the man in the checked shirt.

[audience question]

So, what do you use internally to keep everything modular and scale everything with Node?

Yeah, I think I shared most of what we have in our stack. Well, we try to keep it as simple as possible—I think that's the biggest thing—and not get all fancy and try to squeeze everything out of it. We don't want to do that. We just want to keep performance and security as the main issues in mind when building stuff, and bringing it to an enterprise level always has these things in mind: performance and scalability, and security.

[audience question]

So what's your infrastructure like when you deploy the app, what kind of infrastructure do you deploy it on?

Yes, so it's just a normal Ubuntu stack, with HAProxy at the front doing distribution, and then we have nginx running, which distributes requests to PM2. PM2 hands requests to Node, Node runs Express, and in Express we handle all the requests and stuff like that. And it's Mongo, or MySQL, at the back.

So in our case, we're a typical SaaS product and we sell to marketing and the enterprise, so we don't have behind-the-firewall installs yet—we will get there. So we don't have to deal with internal IT people at these enterprises yet. In our case we run everything on bare metal, with Ubuntu as the operating system—about 60 bare metal machines. We lease them; we don't have our own datacenter or anything. But we're very IO-bound. We have Node processes just for APIs and app servers, behind nginx, and we can just add more Node processes. And then we have Node workers in our pipeline working off RabbitMQ, and we can just add more workers if we need to handle more processing. But our biggest scalability challenges are always around Elasticsearch and big data storage solutions.

What do you do to deploy services? Because you have quite a few.

We have a very crappy old Cakefile that we've been using for a long time, and we have Puppet and stuff like that. But we're actually looking for a good devops engineer to join our team, so if you have some ideas, talk to me.

Yeah, our stack is kind of simple. We are moving away from Oracle Linux, from WebLogic and so on—I already showed you. We're running Camp in front as the load balancer, we have nginx, and we're running clusterized instances of Node—we're actually using cluster to multiprocess. We're using NSQ, now in test, to try to scale up this dispatching of messaging, a kind of message bus. We have Redis behind the scenes, and we're using memcache and Redis for caching at the same time. Database-wise, we're still running on an Oracle database because of security compliance with running customers, and we're migrating slowly to Mongo. We are also testing running in the cloud—at this moment we're on-premise in a datacenter, and we're moving into the cloud, so you can imagine Elastic Load Balancer will be our best friend. We are using Chef for deployment, and we have a promotion mechanism built in Jenkins. We're using RPMs—we can argue about that—but yeah, that's a little bit of our stack, that's what we're running at this moment. And we have our own proprietary—well, open source—implementation of Virgilio to clusterize messages across different instances of Virgilio, so you can federate Virgilio across different servers, physical and logical.

And we make processes interop but this is not very interesting.

The guy with the funky hair.

We're pretty big fans of doing continuous deployment, and I did a check online to see what other people are doing to get their code deployed on their servers—deployed on clusterized servers.

And I could not really find something where everyone said: this is the way to go, this is how we should do it. So first of all, I would be interested: is this something that the Node team [xx] has a certain opinion about, instead of just running your main server.js with Node and keeping it alive? And on the other side, I'm interested whether the guys running it in production actually have some of these scripts to quickly get your code from development into production, clusterized, all automatically.

So, continuous deployment: do you do it? How could Node help out with that, and what tools do you use for it?

So we do think about it—the Node core team definitely does think about it. It's weird, because what you're asking for in a lot of cases is an opinionated answer from a framework that tries really hard not to have an opinion on anything. It's like Switzerland. It's fantastic.

But what we try and do—we do think about it. First, let me give you my advice for when you're doing it, and then these guys can tell you how they do it: use whatever your operating system provides for monitoring and launching services. On SmartOS that'd be SMF, which has kernel support for understanding that, because there's always the question of who watches the watcher.

But if you're in a systemd kind of world, an Upstart world, or on launchd because you went with Darwin or something like that: use what the operating system has for managing services, and let it be its problem. That's my suggestion. There are things like PM2 that people here use, but for my money, let your operating system be in control of that.

On the other side of that, how would Node help with the continuous deployment side of things? We're working on it—Bradley Meck from the community (recently hired at NodeSource) did work to make Node modules into archives, kind of like a jar environment, and then he and I worked together on what we might call Node bundles.

So the idea is: once you put your whole environment into an archive, you attach that to an executable, and then you have a single artifact that your operations team can run. You say, "here, go put this in your startup script and start 10 of them," and it figures out how to do that on its own—or you put the configuration file here. It's just this binary, and that's how you deploy it.

If that's how you want to do it, that's the path forward; beyond 0.12 we're going to add that stuff. It's one way to do it. But if you're writing tools—grunt runners and stuff like that, or JSON parsers, or some kind of tooling around Node—it's a good deployment solution for that as well. So that's kind of how Node's looking at it, but for the most part I try to stay out of those opinions.

I already gave you my opinion. We're running PM2, and I'm really, really happy with it, because it just keeps stuff running—that's the thing you mentioned; for us it's really important to keep stuff running. We're not doing continuous deployment yet.

We're actually working on setting up a system for it. We're thinking about what to use, etc. Currently we don't have that many servers yet—in our SaaS model (or, we call it a PaaS model), in our PaaS model we have maybe 20 servers running everything—so it's still stuff like Puppet, git pull, and deploying everything. And PM2 does a silent restart for us as well, so it's very easy, you don't have any downtime, and for us that does the job. Continuous deployment: yes, we want it, but not yet.

So we have two stacks. The open source stack goes through continuous deployment—I recommend you use something like Codeship or [xx], they're pretty good—so all our libraries get deployed to npm and to GitHub continuously. On the other hand, our production stack splits off. For the legacy side, we have everything "automatically" deployed by a human being, who gets an email with the jar file in tarball format (because Outlook blocks it), then he downloads it and copies it—there are different flavors, you can rsync it or SCP it, doesn't matter. The Node stack is getting continuous deployment: we're using RPMs for that, and like I told you, we have a promotion system through Jenkins. If you want my suggestion, the part worth taking is: you can either use git flow—you simply git pull and promote your SHA, which is unique enough, a unique identifier; if you don't trust it, well, you can argue against the unique identifier—or on the other end you can use RPMs. They're both valuable solutions; one is more mechanical, the other one is more lean. That's what we're running at this moment.

OK, so basically the way we do it: I'm a big believer in being responsible as a developer for the things you build, so we have a certain QA system where you need to find someone to help you QA. That's the first gate you have to go through, which means you have to put the branch you're working on for your story onto the staging branch, which goes to the staging server. We still have manual deploy activators there. Then, if you want to go into production, you need final approval: a pull request goes out to the master branch, which is hooked up to CI, so CI has to pass before you're allowed to merge. You also need a code review by one other developer. When that's done it goes into master, and then, yes, you still have to push one manual button to deploy—we could automate that, but it's not a big deal, I think.

Thank you. [audience question]

So, are embedded systems and Node a focus for Joyent?

So, what I'll say is this: V8 runs on ARM, and we try hard to keep Node and libuv compiling and working in that environment. We're not always successful, because the matrix for supporting ARM is huge compared to Intel and the operating systems we currently support.

It's difficult for me as project lead to even want to look at supporting that whole matrix, but we work really hard to keep it working when we can. TooTallNate (Nathan) actually manages the builds for Pis, for instance, which doesn't cover all of the ARM builds out there that people would find useful.

So it's not that we don't find it interesting; it's a question of managing the resources we have available. So if you're part of a group that has resources, time, and effort that can help with that—absolutely, that's a fantastic place to contribute back to the project and the whole ecosystem, because we are seeing a huge uptake (if I may use a buzzword) in the Internet of Things and interest around Node. The Q42 guys are here—if you find one, corner them and talk to them.

That's a fantastic story.

But it's great to be able to do that kind of stuff, and we want to be able to do it. Right now it's difficult to get all of those testing boards set up to go through Jenkins (even though I don't want to use Jenkins, but that's another problem). It's a question of resources and time—getting people to provide that matrix so we can get through all of these things in an efficient manner. So if you want to, absolutely talk to me afterwards and we can discuss how to do that.

Guy in front.

[audience question]

So, how do you monitor the performance of your Node applications?

New Relic is what we use to look at performance, but also to look at the insides—but we only do it in our own environment, because in the enterprise it's usually not allowed. Before we can bring stuff to the enterprise, they usually use tools (from your world) like LoadRunner and stuff like that to stress test our stuff and see where it breaks, and to do concurrent user testing, etc.

So we have a pretty clear picture of how our stack will perform and what it will do. We have a lot of numbers about that; we're really quite confident in that department. And the good thing about being inside the enterprise is that there are no massive spikes in the number of users, because the number of employees is usually quite limited and you can design for that scale. That's a big benefit, as opposed to being out on the internet where, if you become the next hype, there will be 100 million times more requests than you're used to. So it's quite under control.

Luca, you?

So we are on-premise, and at a remote location at the retailer, because we need a connector there for the POS integration. So we are using a Layer 7 solution for the gateway-to-gateway communication, which offers monitoring inside. For our API we have New Relic, and we have Nagios, which basically monitors everything that goes on the network. Camp, on the other side, is also very, very powerful, because it can communicate with Nagios.

So the network is done through Camp, Nagios, and the Layer 7 solution, and software-wise we're using New Relic. We wish we could trace in real time with DTrace, but at this moment we're only doing that in development and test. Acceptance and production are not DTraced. So that's it—very simple.

We do performance testing. We use New Relic as well. We have custom monitoring tools. We use getsentry for error monitoring, we have logs, and we have customers complaining if something goes wrong.

Customers are a great monitoring tool.

I want to put a small button on that. Part of the dynamic tracing coming in 0.12 helps people like New Relic and the StrongOps stuff actually do this kind of work, by letting you—when you're on a platform like SmartOS—observe it from outside the process.

One of the things about performance analysis is that you don't want to change the behavior of your application just by turning on performance tooling. You could run node --prof, which turns on the V8 profiler, but it's a completely different experience, because it's keeping track of a whole bunch of junk that you don't need.

And it's something you have the ability to do at the operating system level: sample the process externally. So when you're trying to figure things out—there are different kinds of performance: the performance through the system, and the performance of the process when it's on CPU—make sure you know which one you're talking about.

So make sure you're using the right tools for the right job, and not just looking at means; look for outliers as well.
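A back-of-the-envelope illustration of that last point, with made-up latency numbers—the mean looks healthy while the tail tells the real story. The helper and values here are purely illustrative:

```javascript
// Ten response times in milliseconds; one bad outlier.
const samples = [5, 6, 5, 7, 6, 5, 6, 5, 6, 480];

const mean = samples.reduce((sum, v) => sum + v, 0) / samples.length;

// Nearest-rank percentile: sort, then index into the sorted array.
function percentile(values, p) {
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

console.log(mean);                    // 53.1 — looks tolerable
console.log(percentile(samples, 99)); // 480  — the outlier the mean hides
```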

Last question.

[audience question]

So the question is about the future of Node, V8, and isolates.

So there's a lot of interesting work. There are a couple of people in the community already doing that, and maybe even some forks of Node that are kind of popular and experimenting with multithreading and isolates—JXcore, for instance.

There are people experimenting with it. The future for Node is that when we see something proved out that people are working with, we can integrate it back into the project proper. There are interesting APIs, like web workers and stuff like that, which maybe we can use for offloading high CPU usage, or CPU-bound workloads in JavaScript—though in large part I would still argue you probably shouldn't be doing that in JavaScript.

So there is a future for that, but right now we have other things that we need to work on, in my opinion. If somebody in the community wants to start working on it and get it nice and solid, we can bring it in. But we have a lot of things we need to work on right now that aren't there, so if you want to help, I would love to talk to you about it.

Thank you very much. This meetup wouldn't be possible without the people at Rockstart, the Amsterdam Node communities, and you guys of course, and Joyent, and all of our speakers. So I would like to say thank you—thank you to everyone.
