Node.js on the Road is an event series aimed at sharing Node.js production user stories with the broader community. Watch for key learnings, benefits, and patterns around deploying Node.js.
Roberto Masiero SVP
ADP Innovation Labs
Thank you guys, thanks for having ADP here. I do have a Twitter handle in the slide, it was not there, it's mostly because they don't use Node yet, so I'm kind of boycotting Twitter. I'm kidding. So a little bit about ADP: ADP is not the security company, it's the payroll company, human capital management; about a $12 billion company all over the world. We run one in five private sector employee payrolls in the US, one in ten in the world, so pretty big company, and ADP Innovation Labs is an idea that we kind of pitched to our CIO about three, three and a half years ago: to start a small department within ADP to do a little more ideation, research, and come up with new stuff that we can then plug in to our core products. We started with a small group in our headquarters; as Bryan said, we opened a nice facility here in Chelsea, and we were up to like 85 engineers there. About myself, you hear the accent, from Brazil, World Cup coming in one month, excited.
Again, moved here about 11 years ago. About to go back after this winter on the East Coast, it sucks. But again, I'm very happy to be at ADP. I was acquired back in Brazil, obviously not for enough money, I'm still there, but I'm very happy to be the lead of ADP Innovation Labs, and that's my Twitter handle there.
So we started the lab, and the first project we ran in the lab was mobile, right? We started mobile back in the days, this is three years ago. Node, I don't think it was ready, really, for us, so we ended up going PHP. We pump a lot of HTML out of the server. It's funny because in those days, you remember those things they called Blackberries?
I'm not kidding. But right after mobile, right, which was very successful, we embarked on another project called Semantic Search, and on that one what we wanted to do is to create a search engine that would cross all the objects of ADP, and we have a lot of objects at ADP, right? HCM and payroll is a complicated thing, and we do benefits, and time, and retirement, and FSA, the whole ACA benefits thing that Obama is going through, so really complicated stuff, and we needed a search engine that would go across all those objects, right?
And we wanted to do even more, right? So we said, why not, instead of just looking at nouns, do verbs as well, right? So we make the search the navigation mode for ADP. So we said, you know what, if I can understand the predicate, and understand that there's a verb there, and I can relate that to an object or a noun, why don't I just say "hire John," and the thing just magically goes and hires John, right?
So we started writing that project, and it was pretty cool, right? So we said, let's not do NLP, let's just do metadata, right? Let's allow it to search a noun and an object. You can see an example there: I see "hire," I see that, I then see, oh, there's an object related to "hire," which is a candidate. Do I have a candidate with the name John? Go, right?
And we needed to be very fast. We wanted the same thing that Google gives you, like millisecond response, very, very fast. So we said, OK, what we need to do is, when the user issues that query predicate, we need to branch this into two queries, right? One against the metadata and one against the index data itself, right?
We need to parallelize these things like crazy because we decided to use instant which means that every character you put on the predicate is a new query firing against the server, right? We said, I don't think PHP can do that, parallel shit and all, so we said why don't we test this Node thing. This was about two and a half years ago, so we said let's write this in Node and we wrote it in Node and it was awesome.
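As an illustration of the two-way branch described here, the following is a minimal sketch, not ADP's code. The functions `queryMetadata` and `queryIndex`, the verb table, and the sample records are all hypothetical stand-ins for the metadata store and the search index. The only point is that both lookups start at once and are combined when both have answered.

```javascript
// Hypothetical stand-in for the metadata lookup: maps a leading verb
// to the object type it acts on. Returns null when no verb is found.
function queryMetadata(predicate) {
  const verbs = { hire: 'candidate', fire: 'employee' };
  const [first] = predicate.trim().toLowerCase().split(/\s+/);
  return Promise.resolve(verbs[first] ? { verb: first, objectType: verbs[first] } : null);
}

// Hypothetical stand-in for the index lookup: matches any word of the
// predicate against a small in-memory list of people.
function queryIndex(predicate) {
  const people = [
    { type: 'candidate', name: 'John Smith' },
    { type: 'employee', name: 'John Doe' },
    { type: 'employee', name: 'Mary Jones' },
  ];
  const words = predicate.trim().toLowerCase().split(/\s+/).filter(Boolean);
  const hits = people.filter((p) => words.some((w) => p.name.toLowerCase().includes(w)));
  return Promise.resolve(hits);
}

// Start both queries immediately and wait for both results.
async function search(predicate) {
  const [meta, hits] = await Promise.all([
    queryMetadata(predicate),
    queryIndex(predicate),
  ]);
  if (!meta) return { action: null, results: hits };
  // A verb was recognized: keep only the objects that verb applies to.
  return { action: meta.verb, results: hits.filter((h) => h.type === meta.objectType) };
}
```

With this sample data, `search('John')` resolves to both Johns with no action, while `search('hire John')` resolves to the action `hire` and only the candidate. An instant search would call `search` on every keystroke and would also need to discard responses that arrive out of order, which this sketch leaves out.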
The thing just screams, and it's very easy, and this is very complicated manipulation of the predicate to change relevance. We're a multi-tenant cloud server provider, so we run about half a million clients in our data centers, so we pay about 30-some million people on any given week, and this is used heavily.
So we need it to be fast, and Node really got us there. So the stack is basically good old Linux, right, nginx, and Node, and the search itself is Solr, and the metadata sits on Mongo. So really, really lean and pretty awesome. So when you use something like this, you go to the search there and you say "John," and it gets you all the objects: employees, candidates, applicants, everything. And then you say "hire John," and then it goes like, I got it, "hire," and I see applicants with the name John on it, go, right?
And that's what we've been using in production now for our tablet app. So you can do things like, you know, "fire John," which is pretty awesome, right, and it's cool because when we demo this it's pretty cool, because on the tablet we added a voice command, so you can literally walk the hallways and go like "fire John," and the guy is just like, I could get fired.
And so it's sweet, right? And it understands all kinds of concepts like time and team and peers, so really awesome engine, right? Then, mobile's very successful. We reach a million users, going really, really well. We have functions like, it's crazy, because we allow people to punch in, so clock in and clock out on the mobile, look at their pay, look at all kinds of stuff, so volumes are going like crazy. So we said we need to do tablet now. Do we go PHP again? It doesn't sound right. Node was so awesome on the search thing, why don't we try this Node thing on tablet? So we ended up doing it and one of the reasons we designed tablet, right?
We came up with this idea of extremely responsive design. So you have this page that can take the entire canvas of a desktop, or the tablet, or a tile on the tablet, or your Google Glass or your watch. The same page goes down to one little thing that sits here, all the way up to a TV. And here you see the tile, and this is a dashboard.
So the dashboard is a bunch of tiles, that are basically—they understand—you see there your pay, your clock and all that stuff. But it understands who you are, where you are, what time of the day it is, if you're online or not, because we don't want to consume a lot of bandwidth, and who's close to you, which is pretty awesome as well.
We're experimenting with iBeacons and all. So that page is the first page you see. And you see how many tiles are in there, right? So we need to do all that in parallel. We need to just like, the guy comes in, we need to pump that page out with like ten tiles, and it needs to be damn fast. So what we did is we used Node, and that's the cool thing about being asynchronous.
You're just like, boom. Every one of those pages gets rendered, get it all back up and boom! Put it there, right? And it changes depending on where you are. So if you're at home, maybe benefits becomes more relevant, because that's the kind of stuff you do with your spouse. You decide, how much are we going to contribute or whatever.
Right? Or retirement. But if you're at work, it's all about punching and looking at your pay and all that stuff. So Node was the perfect thing for this, so very happy, it works and it's awesome. So what we did was, again, we have the consumer app, right? In the consumer app itself, we also innovated on this.
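The dashboard behavior described above, where every tile is rendered in parallel and the order depends on where you are, can be sketched roughly as follows. The tile names, the HTML strings, and the `orderByLocation` table are assumptions made for illustration; the talk only says that relevance changes with context.

```javascript
// Hypothetical tile renderers; each would normally call a backend service.
const tiles = {
  pay: async () => '<div>pay</div>',
  clock: async () => '<div>clock</div>',
  benefits: async () => '<div>benefits</div>',
  // Simulates a backend that fails, to show the page still renders.
  retirement: async () => { throw new Error('backend timeout'); },
};

// Assumed relevance order per location.
const orderByLocation = {
  home: ['benefits', 'retirement', 'pay', 'clock'],
  work: ['clock', 'pay', 'benefits', 'retirement'],
};

async function renderDashboard(location) {
  const names = orderByLocation[location];
  // Start every tile at once. allSettled waits for all of them and
  // reports failures individually instead of rejecting the whole page.
  const settled = await Promise.allSettled(names.map((name) => tiles[name]()));
  return names.map((name, i) => ({
    name,
    html: settled[i].status === 'fulfilled'
      ? settled[i].value
      : '<div>unavailable</div>',
  }));
}
```

Calling `renderDashboard('home')` resolves to four tiles with benefits first, and the failing retirement tile is replaced by a placeholder rather than breaking the page.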
So we have a container that's native on iOS, for example, and it uses Cordova to communicate between the container and the page apps. So it's almost like it thinks that it's sitting on a browser, but it has all the device stuff communicated to the page. It sends a request, which goes through our security gateway. That thing is still like old school, but awesome, like F5 BIG-IP, that does the SSL termination.
We're the Nirvana for hackers I mean, we have social securities of all of you, so we don't want to play around with the edge. The edge is sacred. So in there it's all like two F5s and something else in the middle that they don't even tell me what it is, so it stops all the traffic on the door there, and then that's the Node part, the kind of orange, I don't know who's choosing colors for this slide, it's kind of, but it's OK.
So like the API proxy there is the thing that's all in Node, and that gets all the requests from mobile, tablets, search, everything, and it does a bunch of things for us, and it communicates with the APIs to the backends, to the payroll, HR, benefits, search, everything on the backend. What that API does (ten minutes, good), what that API does is the API facade. It provides the authorization for what you can and cannot do, and all the routing determination. It caches. The application doesn't need to worry about cache, and some of this stuff we can cache forever, so once I cut you a paycheck, you know when that is going to change?
Never, right? If there is a mistake on that it'll come on the next pay check. So we cache that forever, and it just sits there. So it's great because the backends—it was funny when we launched mobile and tablet, the backend, the payroll there was like, you guys are not using, no one is using mobile. Yes, they are.
We just call you once, and you never see us again. We get the payroll, we cache it, and we never go back to you, so that was funny with the mainframe guys. We do aggregation and orchestration. Obviously that's extremely... it's funny because we ended up deploying this on six machines, which I feel ashamed of after seeing two machines from Artsy, but we run active-active on two data centers. We have two tier-four data centers where we host our stuff, so two data centers active, and the minimum that the I/O guys, our operations guys, allow you to deploy per data center is three VMs for high availability.
So if one goes down you still have two, second goes down you still have one. So that's just a rule there, so we have three VMs on each one of the data centers, and we do geo load balancing against them, we're never sticky. You might have one transaction going to data center one, and the second one going to data center two.
So we have to synchronize the cache across the data centers, because you never know where you're going to hit. It depends on the ping speed between your device and the data centers. We take care of all of that for the applications. We do logging and auditing at that layer as well, and we do a lot of filtering as well.
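The "cache forever" idea for data that never changes once issued, such as a past pay statement, might look something like this in the proxy layer. This is a single-process sketch under stated assumptions: `ImmutableCache`, `fetchPayStatement`, and the key format are hypothetical, and the cross-data-center synchronization mentioned above is not modeled at all.

```javascript
// A tiny in-memory cache for results that never change once issued.
// It stores promises, so concurrent requests for the same key share
// one backend call.
class ImmutableCache {
  constructor() {
    this.entries = new Map();
  }

  // Returns the cached promise, or calls fetcher once and remembers it.
  getOrFetch(key, fetcher) {
    if (!this.entries.has(key)) {
      const pending = new Promise((resolve) => resolve(fetcher(key)))
        .catch((err) => {
          this.entries.delete(key); // do not cache failures
          throw err;
        });
      this.entries.set(key, pending);
    }
    return this.entries.get(key);
  }
}

// Hypothetical backend call; the counter shows how often the payroll
// system is actually hit.
let backendCalls = 0;
function fetchPayStatement(key) {
  backendCalls += 1;
  return Promise.resolve({ key, net: 1234.56 });
}

const cache = new ImmutableCache();
const first = cache.getOrFetch('emp42:2014-05-15', fetchPayStatement);
const second = cache.getOrFetch('emp42:2014-05-15', fetchPayStatement);
// Both callers receive the same promise and the backend was called once.
```

This matches the anecdote in the talk: the backend sees one call per pay statement and nothing afterward. A production version would also need bounded memory and some way to replicate entries between data centers.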
So we love the name of the stuff that we create, so this is the APIMP, right? So if you want to pimp up your APIs, you go through the APIMP. So the APIMP is the API multi proxy, it was just the API proxy and I said, you got to put an M in there, so multi proxy. I don't even know what it means. To get all that working, we wrote a couple of pretty cool modules and I think the guys that wrote are right here. Mike and John. So the first one is the PigeonKeeper, we call it PigeonKeeper. I think it's awesome. So the idea is this, to make the dashboard with a bunch of tiles in there.
When do you know it's done? So the idea is like, you send all the pigeons out, and then you wait for all of them to come back. But to do that in Node is pretty hard, right? Because when do you know all the pigeons are back? I'm still missing one, oh my god, what do I do? So we wrote basically a DAG engine. So what is the PigeonKeeper? It's a JS object, right, that orchestrates the execution of all these processes and events.
It's not a workflow, it's not a visual programming language, right? It basically implements a directed acyclic graph. It finds the best path from an initial state to an end state, and it tries to parallelize that as much as possible, right? It understands all the vertices, and the edges, and the dependencies that exist, and it just tries to go down as fast as it can.
And every one of the nodes has a state, and it controls the state on each one of them. So, as I said, it's a JS object, it orchestrates, so you just describe it as a JS object, and it maintains the association between the vertices and the processes, so it's pretty cool. It understands the dependencies, and it does that topological sort.
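For readers unfamiliar with topological sorting, here is a small sketch of Kahn's algorithm over a dependency map. It is a generic textbook version, not PigeonKeeper's implementation, and the vertex names in the example are made up.

```javascript
// dependencies: { vertex: [vertices it depends on] }
// Returns the vertices in an order where every vertex comes after
// everything it depends on. Throws if the graph contains a cycle.
function topologicalSort(dependencies) {
  const inDegree = new Map();   // number of unmet dependencies per vertex
  const dependents = new Map(); // vertex -> vertices waiting on it
  const touch = (v) => {
    if (!inDegree.has(v)) {
      inDegree.set(v, 0);
      dependents.set(v, []);
    }
  };

  for (const [vertex, deps] of Object.entries(dependencies)) {
    touch(vertex);
    for (const dep of deps) {
      touch(dep);
      dependents.get(dep).push(vertex);
      inDegree.set(vertex, inDegree.get(vertex) + 1);
    }
  }

  // Start with every vertex that has nothing to wait for.
  const ready = [...inDegree.keys()].filter((v) => inDegree.get(v) === 0);
  const order = [];
  while (ready.length > 0) {
    const v = ready.shift();
    order.push(v);
    // Finishing v releases each vertex that was waiting on it.
    for (const next of dependents.get(v)) {
      inDegree.set(next, inDegree.get(next) - 1);
      if (inDegree.get(next) === 0) ready.push(next);
    }
  }

  if (order.length !== inDegree.size) throw new Error('cycle detected');
  return order;
}

const order = topologicalSort({
  profile: [],
  pay: ['profile'],
  clock: ['profile'],
  dashboard: ['pay', 'clock'],
});
// order is ['profile', 'pay', 'clock', 'dashboard']
```

Everything sitting in `ready` at the same moment has no unmet dependencies, so those vertices are the ones an engine could run in parallel.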
And they have this shared data object, right Mike? Am I right? So again, he wrote it, so he knows it. He likes this mathematical algorithm stuff that I don't even comprehend, but if you have a question about this, go to Mike. Don't ask me during the Q&A because I don't know. What is the Kahn's Algorithm? I don't know, right?
So basically it implements Kahn's algorithm. A process can be anything: an access to a database, a call to the search, an HTTP request, a file operation, it doesn't matter, right? Every vertex goes through this life cycle: it's not ready, it's ready, it's in process, and then it succeeds or fails. So you see a graph like this, it will try to just go through it, so it parallelizes the three requests, and they are now ready or in progress.
If one of them fails, it will automatically flag whatever depends on it as "I cannot do it." And then it just keeps going down the happy path, and you end up with a final state like this, right? Hopefully everything works, it's not crappy code like this where half of it fails, but it manages all that, right? So in Node, that's pretty hard. Having all those asynchronous things going on can make your brain hurt.
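The life cycle and failure handling described above can be sketched as a very small DAG runner. This is an illustration only: PigeonKeeper had not been published at the time of the talk, so `runGraph`, the `State` names, and the graph shape are assumptions, not its actual API. The sketch assumes the graph is acyclic and that every dependency names an existing vertex.

```javascript
const State = {
  NOT_READY: 'NOT_READY',
  READY: 'READY',
  IN_PROCESS: 'IN_PROCESS',
  SUCCESS: 'SUCCESS',
  FAIL: 'FAIL',
};

// vertices: { id: { deps: [ids], run: async (shared) => {} } }
// Resolves with the final state of every vertex and the shared data object.
function runGraph(vertices) {
  const state = {};
  const shared = {};
  for (const id of Object.keys(vertices)) state[id] = State.NOT_READY;

  return new Promise((resolve) => {
    let running = 0;

    const step = () => {
      // Propagate failure: a vertex whose dependency failed can never run.
      let changed = true;
      while (changed) {
        changed = false;
        for (const [id, v] of Object.entries(vertices)) {
          if (state[id] === State.NOT_READY && v.deps.some((d) => state[d] === State.FAIL)) {
            state[id] = State.FAIL;
            changed = true;
          }
        }
      }
      // Start every vertex whose dependencies have all succeeded.
      for (const [id, v] of Object.entries(vertices)) {
        if (state[id] !== State.NOT_READY) continue;
        if (!v.deps.every((d) => state[d] === State.SUCCESS)) continue;
        state[id] = State.READY;
        state[id] = State.IN_PROCESS; // launched right away in this sketch
        running += 1;
        Promise.resolve()
          .then(() => v.run(shared))
          .then(() => { state[id] = State.SUCCESS; }, () => { state[id] = State.FAIL; })
          .then(() => { running -= 1; step(); });
      }
      // Nothing in flight and nothing left to start: all pigeons are back.
      if (running === 0) resolve({ state, shared });
    };

    step();
  });
}

const result = runGraph({
  profile: { deps: [], run: async (s) => { s.name = 'John'; } },
  pay: { deps: [], run: async () => { throw new Error('payroll down'); } },
  clock: { deps: ['profile'], run: async (s) => { s.clock = `${s.name} clocked in`; } },
  payTile: { deps: ['pay'], run: async () => {} },
});
```

In this example `profile` and `pay` start together, `pay` fails, `payTile` is flagged as failed without ever running, and `clock` still completes on the happy path using the shared data object.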
So the other one is JQL. Just the name is, like, screwed up. JSON Query Language. For this one we started with Stefan Goessner's JSONPath, but there were some issues with that. It's an awesome library, don't get me wrong, but we found a problem with the recursive descent operator, so we just rewrote it.
These two modules, I'm going through legal to open source them, because there's like an army of legal people at ADP that you have to go through to put anything out, but this will be open source and put on GitHub as well. So, as I said about the name, we started calling it JQL, right. So JQL, but it was confusing with SQL, we'll talk about that in the company of people and others.
So we started calling it JayQuil like Vicks Nyquil but for congested birds, they can't sleep, so and that worked for a while, and then we said it sounds French, the thing sounds French, so we tried Je Quelle but that was too French, so we went back to JQL. So that's JQL it's just fun trivia there. So what it does is if you have a simple JSON, it can be anything, as deep as you want, as big as you want, you get the JSON in and it can do things like this.
So you just say, get the prices of everything, and it would just traverse that thing at lightning speed and put an array out of all the prices that you see inside of JSON, so awesome for manipulation, right and we can do like complicated things like get titles of all books priced under 10. So whatever JSON is, you just you say you put this expression in and it will spit out that array for you. And really complicated stuff, like get the title of all books priced under 10 that are rated three or above by the New York Times.
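The three queries mentioned here can be approximated in plain JavaScript with a recursive descent over the document. This sketch does not use JQL's expression syntax, which is not shown in the talk; `descend`, `valuesOf`, and `where` are hypothetical helpers. The sample data follows the bookstore example from Goessner's JSONPath article, with a `ratings.nyt` field added as an assumption so the third query has something to filter on.

```javascript
// Visit every plain object nested anywhere inside a JSON value, depth-first.
function* descend(value) {
  if (value === null || typeof value !== 'object') return;
  if (!Array.isArray(value)) yield value;
  for (const child of Object.values(value)) yield* descend(child);
}

// All values stored under `key`, at any depth.
function valuesOf(json, key) {
  const out = [];
  for (const obj of descend(json)) {
    if (Object.prototype.hasOwnProperty.call(obj, key)) out.push(obj[key]);
  }
  return out;
}

// All objects anywhere in the document that satisfy `predicate`.
function where(json, predicate) {
  return [...descend(json)].filter(predicate);
}

const store = {
  store: {
    book: [
      { title: 'Sayings of the Century', price: 8.95, ratings: { nyt: 4 } },
      { title: 'Sword of Honour', price: 12.99, ratings: { nyt: 5 } },
      { title: 'Moby Dick', price: 8.99, ratings: { nyt: 2 } },
    ],
    bicycle: { color: 'red', price: 19.95 },
  },
};

// "Get the prices of everything" (JSONPath-style: $..price)
const prices = valuesOf(store, 'price');

// "Titles of all books priced under 10"
const cheapTitles = where(store, (o) => 'title' in o && o.price < 10)
  .map((o) => o.title);

// "...that are rated three or above by the New York Times"
const cheapAndRated = where(
  store,
  (o) => 'title' in o && o.price < 10 && Boolean(o.ratings) && o.ratings.nyt >= 3
).map((o) => o.title);
```

Here `prices` is `[8.95, 12.99, 8.99, 19.95]`, `cheapTitles` is the two books under 10, and `cheapAndRated` narrows that to `'Sayings of the Century'`.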
So you just go through, you see the expression, it's really easy to manipulate that big JSON, and it's extremely fast. So we're looking forward to open-sourcing it and letting you guys use this. And then as a future, I can't believe I'll be within time, for the first time in my life, I think that I'll be on time. So as I said, those tiles, they work on the Glass.
They work on your Tesla, on the freaking panel of the Tesla. We're testing wearables as well that can do like clocking in and stuff like that. All of that testing in the lab is being done with Node behind the scenes. Sensors, we've been using iBeacons [xx], it's pretty cool, people don't clock in any more, they just walk into the office or the lab, and they are automatically clocked in, right?
Because we can talk to their phone, which is pretty creepy, right? I mean this protocol, I mean seriously, your phone can be locked, right? No password, and it's still communicating with the thing, right? It's creepy, but anyway, we're using Node for that because Node is perfect. The speed it gives us for the sensor, which is really the attenuation in decibels that you get from the distance and the velocity between you and the beacon. It's awesome. You have to ping that thing like a thousand times per second to get the attenuation, to know I'm walking in this direction toward the device, and Node is awesome because you can throw a lot at it and it will just go.
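To make the attenuation idea concrete, here is a rough sketch that turns signal strength readings into a distance estimate and a direction of travel. It uses the standard log-distance path loss model, which is an assumption on my part; the talk does not say what formula the lab uses. The default calibration value of -59 dBm at one meter, the path loss exponent of 2, and the smoothing factor are all illustrative.

```javascript
// Log-distance path loss model: rssi = txPower - 10 * n * log10(distance)
// txPower: expected RSSI in dBm at 1 meter (a per-beacon calibration value)
// n: path loss exponent, about 2 in open space, higher indoors
function estimateDistance(rssi, txPower = -59, n = 2) {
  return Math.pow(10, (txPower - rssi) / (10 * n));
}

// Exponential moving average to calm down noisy readings.
function smooth(samples, alpha = 0.2) {
  let avg = samples[0];
  for (const s of samples.slice(1)) avg = alpha * s + (1 - alpha) * avg;
  return avg;
}

// Compare the older half of a window of readings with the newer half.
// If the newer estimate is closer, the phone is moving toward the beacon.
function isApproaching(rssiSamples) {
  const half = Math.floor(rssiSamples.length / 2);
  const older = estimateDistance(smooth(rssiSamples.slice(0, half)));
  const newer = estimateDistance(smooth(rssiSamples.slice(half)));
  return newer < older;
}
```

With the defaults, a reading of -59 dBm maps to 1 meter and -79 dBm to 10 meters. Real readings are noisy enough that a high sampling rate plus smoothing, as the talk suggests, matters more than the exact formula.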
So it's really cool, and also we open up our developer community, so developers.adp.com and you can see all the APIs, browse social securities of everybody. No, no, no, no, but all of the APIs that we use to do mobile and tablet and all the stuff that we do inside it's there, so if you want to create an app you can, you submit the app and you become a partner of ADP. All of that powered by Node as well, so what sits behind, and this is all production.
I mean mobile alone is like 2.5 million users now. We serve on any day half a million unique users, coming in and clocking in and all that, and it's growing exponentially so really, really exciting stuff. I'm speaking low because I have eight seconds, seven. Thank you I appreciate it and alright, good stuff, thank you.