Node.js on the Road is an event series aimed at sharing Node.js production user stories with the broader community. Watch for key learnings, benefits, and patterns around deploying Node.js.
Chris Montgomery, Application Architect
This is awesome, this is a great group of people, thanks for coming out, guys. Okay, so, my name is Chris Montgomery, as you heard. I work at Dow Jones, and in case you don't know or have never heard of Dow Jones, there's one thing that a lot of people think about when they think about Dow Jones, and that's typically the Dow Jones Industrial Average. Most people think of that. Actually, that's a very, very, very small part of what we do.
For the most part, Dow Jones is really a media company. So you probably know The Wall Street Journal? It's kind of a big deal in publishing, I guess, and that's really our big boy, and there's a lot of Node.js under the covers of The Wall Street Journal. So that's one of our big production stories of many at Dow Jones.
We also have a lot of other websites, which I'll touch on a little bit as I go here, but Dow Jones as a company has been in business for about 130 years or so, and 125 of those have been The Wall Street Journal, so we've been around for quite some time. If you ask people external to the company, it's certainly media, journalism focused, but if you ask people internally, a lot of people would say that Dow Jones really is a technology company.
We're all about trying to innovate, we're all about trying to use new and upcoming technologies and solve big problems with our technology. So one example of that is this little picture here. This is a 1914 Dow Jones Ticker advertisement. This was used way back in the day to deliver a five-inch ticker, just a little piece of paper with news on it, and it was a way to deliver news to people out on Wall Street, and to kind of tell them that something is going on, instead of having some newsboy run around and say the price of IBM changed, take a look, it's going to affect us in some way.
So really, in some ways, you can almost think of this as the Node.js of 1914. We've also done a number of other things, like we were the first company to transfer the newspaper via satellite, so again, technology is kind of at the forefront of what we do.

So a lot of these big websites you may have heard of. Of course, The Wall Street Journal is one big one. And all of these use Node in some way. The Wall Street Journal, parts of it like the article page, the mobile pieces of our website, markets, things like that, those all use Node. We average around 8 million page views daily, so we have decent traffic. It's not amazing, but it's certainly a good amount that we have to handle. That translates into about 100 requests per second in each of our data centers, and Node really does handle it pretty well, so with those connections and those requests we can really scale on The Wall Street Journal.
Barron's, we have some stuff there, and we're going to do more and more in Node as we go forward. We have another product called Real-Time, also known as DJX. It's a portal where we deliver our content directly to our customers, all of our financial data and things like that. And that's been a big product for us, and one of our keystones, I guess you could say, for using Node.js.
MarketWatch.com, that's the main business here in Minneapolis. We have a shop of 30-35 (something like that) developers that are just a couple of blocks from here, and most of those guys, or a lot of those guys, work on MarketWatch.com, and right now they are going through a big rewrite to turn all of that into a Node.js platform as well. And then Factiva and a number of other internal websites.
So the big question is, we have all this stuff in Node, but how did we get there? Because we are a huge company, and it didn't all start in Node, and it's taken time to move pieces of that into Node. And so I want to talk a little bit about the path and how we got there. So our first big endeavor was WSJ Social, and this was a Facebook news reader that was a POC back in early 2011. We were trying out Node for the first time, seeing how it would go. It was on Node 0.4, and it ended up being very successful, partially because it was a quick development cycle, we had a large user base right off the bat so we could really test some good load, it was a single language, which everybody loves, and design to production just went super quick.
In about three months or so, you could get something out the door, and working and running in real time, so from there it was just a natural path to try and start using it in other places where it made sense. So we started a project called DJ Hub which was an internal website that did a bunch of stuff for us.
Real-Time, which I already talked about, was another big product that we started, and when we started that we really wanted to have something that would support all of our developers, because it's great to jump into Node, but you need something to put a little bit of rails on it when you have, all of a sudden, 100 or 200 developers that get thrown at a project and they all need to work in a similar code base. So around that time, and this was 2012, we needed kind of a framework to support all of these developers, and so that's where Tesla came in, and now when I say Tesla, I don't mean Tesla.js.
If you Google for Tesla.js, that's not us. We have our own thing, we called it Tesla, and it's still internal to us, and we've wanted to open-source it, and we're starting to open-source pieces of it, but currently it's still just our own thing. But out of this, we've kind of been developing it and creating it in such a way that it can support all of our different websites. So around late 2012 or so, our CIO was appointed, and he was originally a project lead of our DJ Hub project, and so that was also a big turning point for us, because then you have a person in leadership that all of a sudden is really excited about using Node.
So what does it look like today? Because it's changed a lot in the last 3 years. Today, we use NPM very heavily. We have an internal registry, we actually have 2 internal registries, along with the public registry, of course, and we proxy that through our internal registries. We've tried a number of different things and so far that's worked the best for us. We have an internal GitHub, where we publish things to.
We use Mongo for a lot. We do a lot of caching, a lot of assets that we bundle. We go through Mongo for a number of reasons. Not always a recommended solution, but we do that. Dev tools, we're just recently moving over to Gulp more and more now. It was a debate for the last few months, and it's kind of our new goal to get rid of some of our custom tooling that we call our Tesla CLI, and move more towards open-source.
And that's kind of the general direction that we've been going in the last six months in particular, trying to pull out pieces that we've written custom and transfer them to open-source. Internal is great, but kind of like some other guys talked about, the community has just come such a long way and there's such great tooling out there. Some of these things we created initially because we thought there wasn't a perfect solution that fit our needs, and well, now we have to maintain them. And so trying to pull some of those things out and going back to the community and supporting the community is starting to become a real push for us.

So we also have Tesla, like I said, and then we have an internal thing called teslad, which is a little daemon that installs and runs and monitors our Node processes, so a lot of things like that around Tesla. And again, since Tesla's internal, I can talk about that a little bit just so that you get a feel for what we're doing. Again, it's an internal framework. We've started to open-source pieces of it, and we're getting there, but we really want to start pulling out the pieces where it makes sense and start using open-source more.
So, what we came up with was a framework to handle that, and a framework that could have all these separate modules that were each encapsulated into their own little piece, with its frontend code, its backend code, tests, and all of that in a single little piece. And ideally, all the developer had to care about was their one module, and you could plop that on a page and it would just work cohesively by itself. And so we've been trying to keep it modular, but in a way that makes sense without having to have all these separate repositories, hundreds and hundreds as it turns out, just for a single application.
Another thing that gives us is the gift of convention. Like I said, with 100 developers, you've got to have some consistency, otherwise you just end up getting spaghetti code for the most part, and you can't have 100 developers all doing it their own way, because you can't support a website long term doing that.
So lessons learned, we've learned a lot. I've already touched on a few things that we've learned. One of those is memory management. Fortunately, we haven't run into any huge memory leaks up to this point. Most of those things we found in our staging environments, so that's good, but we have run into the Node process memory limit a number of times, especially because we do so much caching, probably too much, especially in Node. There's a limit of about 1.7 gigs that we've run into, and so we have to be very, very strict on what we allow to be cached. And of course, the caching is there for a good reason, for performance, but there are certainly some different ways that we could tackle that problem. So that's something we're constantly battling.
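One common way to stay strict about what gets cached is to cap the cache explicitly, so it cannot grow toward the process memory limit. The following is a minimal sketch under that assumption, not the approach described in the talk: the `BoundedCache` name, the entry-count cap, and the least-recently-used eviction policy are all hypothetical choices made for illustration.

```javascript
// Hypothetical bounded cache (illustrative only): evicts the least recently
// used entry once maxEntries is exceeded, so memory use stays capped.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // a Map iterates in insertion order
  }

  get(key) {
    if (!this.entries.has(key)) return undefined;
    const value = this.entries.get(key);
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      // The first key in iteration order is the least recently used.
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }

  get size() {
    return this.entries.size;
  }
}

const cache = new BoundedCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // touch 'a' so 'b' becomes the eviction candidate
cache.set('c', 3); // exceeds the cap, so 'b' is evicted
```

Capping by entry count is the simplest policy; a cap on approximate byte size, or moving large cached values out of the Node process entirely, are other options. The heap ceiling itself can also be raised with V8's `--max-old-space-size` flag, though that only moves the limit rather than removing it.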
Avoid building your own. So that's again something other people have talked about, and what I've talked about a lot already. In a lot of ways, I mean, you have to go back three years and imagine yourself with Node 0.4. A lot of those frameworks were in their infancy, and a lot of them did one or two things really well, but we had an enterprise production application that had to support hundreds of developers and thousands upon thousands of users, and so Express at the time wasn't really going to work for us, and so that's why we built our own.
Well, fast forward now three years later, and there are so many good open-source packages out there to do all kinds of things, and really getting out of the framework business is our big drive going forward, just because we're not a framework business, we're a media company, and that's what we want to focus on.
So, really supporting the community and giving back to the community is going to be a focus.
Testing. Testing doesn't have to be hard. I know a lot of people avoid it just because it can be difficult and tedious, but really there are a lot of good packages out there that make it possible and make it easy to do.
Things like proxyquire, which we've used, Mocha, Sinon. We use Gulp to run those things, and we have a pretty big test suite at this point running with those various things and making it reusable. It really isn't as hard as you might think. So we've been trying to build that and bake that into our process more and more.
Another one: manage your dependencies carefully. Essentially, my big word here is, just don't use a star in your package.json file. Don't use wildcards and just get the latest of everything, because you'll shoot yourself in the foot. If you know what I mean, you've been there. There have been so many times, I can't count, where we've really just killed ourselves because we were getting the latest of some package, and it had a star inside of its dependencies, and then you got the latest that should work but didn't, and things blow up on your staging environment and you have no idea why, or even your production environment for that matter.
So there are a lot of really good things out there, a lot of really good modules out there that will solve these kinds of problems. For one, just be specific about your versions, but Node also has built in some, or NPM I should say, has some things built in like shrinkwrap, which I would suggest using.
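The "no stars" rule is easy to check mechanically. Below is a minimal sketch of such a check, assuming a parsed package.json object as input; the `findWildcardDeps` helper, the list of patterns it treats as wildcards, and the example package contents are hypothetical and only meant to illustrate the idea.

```javascript
// Hypothetical helper (illustrative only): lists dependencies whose version
// range accepts any version instead of naming a specific one.
function findWildcardDeps(pkg) {
  // Ranges npm resolves to "whatever is newest": star, empty, x, or the
  // "latest" dist-tag. This list is an assumption, not an exhaustive set.
  const loose = new Set(['*', '', 'x', 'X', 'latest']);
  const sections = ['dependencies', 'devDependencies'];
  const flagged = [];
  for (const section of sections) {
    const deps = pkg[section] || {};
    for (const [name, range] of Object.entries(deps)) {
      if (loose.has(String(range).trim())) {
        flagged.push(`${name}@${range}`);
      }
    }
  }
  return flagged;
}

// Made-up package contents; the names and versions are placeholders.
const examplePkg = {
  dependencies: { 'pinned-lib': '1.2.3', 'some-lib': '*' },
  devDependencies: { mocha: 'latest', sinon: '1.2.3' },
};

const wildcardDeps = findWildcardDeps(examplePkg);
console.log(wildcardDeps); // [ 'some-lib@*', 'mocha@latest' ]
```

A check like this only covers a project's own package.json. Wildcards inside the dependencies of a dependency, the case described above, are what `npm shrinkwrap` addresses: it records the exact versions of the whole installed tree in `npm-shrinkwrap.json`.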
There are a few other packages out there, like node-pac, that are really good, but essentially, don't use wildcards, be specific about your versions, and you'll be happier in the long run.

And lastly, I've already been talking about this, but participate in the community. There's actually been quite a bit of participation already from Dow Jones, but you might just not know it, because a lot of the people at Dow Jones have been giving back to the community just through their own public user accounts. We've contributed, me and a number of others, to a lot of various projects like the MongoDB Driver, the MongoDB Parser, Hogan, and some XML serializers, so there's actually quite a bit of activity, you just wouldn't know it, but we want to continue to do that because that's, again, really important for us going forward.
So that's really it, and I just want to say thank you, not just for coming to listen to me and these guys here, but just for being a part of the community, and I hope that you guys continue to be active just like we are, because that's the way that Node goes forward. There are a lot of great things about Node, but the community, by far, is my favorite.
And lastly, we are hiring a person here in Minneapolis, so if you're interested and want to work on Node, come talk to me. That's all I got.