550 regression tests in 4 minutes with Joyent Manta

To find a performance regression between Node v0.10 and v0.11, I used Joyent Manta instead of git-bisect to find the offending commit(s) among all 550 commits of our development branch in under 4 minutes. Thankfully the result was bimodal and pointed to a single commit. Here's how you can use Manta to parallelize your development process to quickly identify a regression in your project.

Software inevitably has bugs. We're only human. Hopefully you have a test suite with plenty of coverage such that when you introduce a new bug you'll notice it pretty quickly. Of course, you need to be running your test suite regularly to make sure you notice when a regression was introduced.

However, even with the most diligent and dutiful of engineers and test suites, there may be times when a bug or regression finds its way into your code base. When this happens there are a few ways you can solve it:

  • You are very aware of all changes coming into the system and can therefore easily identify which change introduced the regression and back it out
  • Use a brute-force/hunt-and-peck strategy and revert commits that may or may not be related until you find the offending commit
  • Use git-bisect to do a binary search between a known good and a known bad commit and find the exact commit where the regression was introduced

Bisect

It's almost always to your benefit to use git-bisect, as you will only be checking against a subset of the potential commits that might have introduced the regression. As you remember from your algorithms class, a binary search is O(log n), because at each step you are cutting the size of the search space in half. To work with git-bisect you tell it the last commit sha you knew was good and the first commit sha you know is bad. Git will then split the set of commits in half and ask you whether the commit you're on now is good or bad. If you say it's good, it splits the remaining commits between it and the last bad commit in half; if you say it's bad, it splits the remaining commits between it and the last good commit in half. It then moves the checkout to that point and asks you again, and so on until you find your offending commit. You can even supply a test script to be executed at each step: if the exit code is 0, git-bisect will assume the commit was good, and if the exit code is non-zero it will assume it was bad.
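
For example, a minimal sketch (the good ref and the script name here are placeholders, not anything from this post):

## drive the whole search from a script's exit status
git bisect start HEAD v0.10.0   # known-bad ref first, then known-good ref
git bisect run ./check.sh       # exit 0 = good, non-zero = bad, 125 = skip this commit
git bisect reset                # return to where you started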

There are a few problems with using git-bisect though. You may mistakenly answer wrong, or you may find out the build was broken for other reasons at that commit. That results in you wasting time only to never find the offending commit, or worse, believing you had found the appropriate commit only to be chasing a ghost.

Ultimately though, for every step you need to do your prep work, i.e. build your project. In the case of Node, that means recompiling the source tree at each commit the bisection lands on. So while you may have a beefy workstation, it can still take a lot of time to find your answer.
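
For a performance regression like the one described later in this post, such a bisect script might look roughly like this sketch (the 9-second threshold is invented purely for illustration, and ptime assumes a SmartOS/illumos machine):

#!/bin/sh
## hypothetical check.sh for `git bisect run`: skip commits that don't build,
## and call a commit "bad" if the test script runs slower than a threshold
./configure >/dev/null && make -j8 >/dev/null || exit 125   # 125 = skip this commit
T=$(ptime -m ./node test.js 2>&1 | grep real | awk '{ print $2 }')
## exit 0 (good) under the illustrative 9 second cutoff, 1 (bad) otherwise
awk -v t="$T" 'BEGIN { if (t + 0 < 9) exit 0; exit 1 }'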

Enter Joyent Manta. Manta is a parallel compute service that is backed by an object storage service. Compute jobs are expressed with Unix commands and pipeline semantics, where input is passed in via stdin and chained to the next phase via stdout. A job may have many phases, of which there are two kinds: map and reduce. Parallelization is primarily introduced in map phases.
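
As a quick illustration of those semantics (this example isn't from the post, and ~~/stor/logs is just a hypothetical path), a job that counts lines across a set of objects might look like:

## each map phase runs `wc -l` over one object's contents (delivered on stdin),
## and the single reduce phase sums the per-object counts
mfind -t o ~~/stor/logs | \
  mjob create -o \
    -m 'wc -l' \
    -r "awk '{ sum += \$1 } END { print sum }'"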

Cache your build artifacts

As part of the continuous integration that we do for Node.js, we cache the build artifacts (the results of make install) for every commit on the master branch. Note that Jenkins, when triggered by a push to GitHub, will only build what's changed since the last time a job was run. So if you want per-commit granularity you'll need to manage that on your own.

I use a cron job that launches this script, which keeps track of commits that have been scheduled to be built and what new commits have come in since then, and consequently schedules more builds. The commands that start with 'm' are Manta CLI commands (e.g. mput = put a file into Manta):

## We only care about the commits since we branched off v0.10
START_COMMIT=43c1830e0a9c046d0209421ca732187e1cc3d938
cd /var/tmp/node
git fetch -q
## Sometimes people force-push, don't let that ruin your day.
git reset -q --hard origin/master
## get the list of commits for this branch and preserve the ordering
git log --first-parent --pretty="%H" $START_COMMIT..HEAD > order
mput -qf order /NodeCore/public/builds/node/order
## sort these so comparing what we've already built to what's outstanding is useful
sort < order > known
## find the commits that have been previously scheduled
mfind --type d --maxdepth=1 /NodeCore/public/builds/node | \
  xargs -I{} sh -c 'basename {}' | sort > built
## we now know the commits that we haven't yet built
comm -13 built known > tobuild
JENKINS_URL="http://jenkins.nodejs.org/job/node-build/buildWithParameters?token=supersekrit"
## trigger a jenkins job for each commit
xargs -I{} sh -c 'mmkdir -p /NodeCore/public/builds/node/{} &&
  curl -sS "$JENKINS_URL&GIT_COMMIT={}"' < tobuild

If you don't already have your builds cached in Manta, you could (of course) use Manta itself to build all the commits with something like:

git clone git://github.com/joyent/node
cd node
git log --first-parent --pretty="%H" $START_COMMIT..HEAD > order
mmkdir -p ~~/public/builds/node
xargs -I{} sh -c 'echo | mput ~~/public/builds/node/{}' < order
mfind -t o ~~/public/builds/node | mjob create \
  --init 'git clone git://github.com/joyent/node' \
  -m 'COMMIT=$(basename $MANTA_INPUT_OBJECT);
      mrm $MANTA_INPUT_OBJECT &&
      cd node &&
      git checkout $COMMIT &&
      ./configure --prefix=/build &&
      make -j8 &&
      make install &&
      mmkdir -p ~~/public/builds/node/$COMMIT &&
      tar cj /build | mput ~~/public/builds/node/$COMMIT/build.tar.bz2'

Parallel Regression Tests

A recent regression we had in Node involved trying to figure out when the following code snippet started to take longer to run on master than on v0.10.

mput ~~/public/test.js <

In this example, it's just creating 10K vm contexts. So for every commit we have previously built, let's find out how long it takes to run that script.
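
Since the upload above is truncated, here is a stand-in sketch of what it might have looked like, assuming the script does nothing more than create 10,000 vm contexts as described:

## hypothetical test.js -- the original script isn't shown in full above
mput ~~/public/test.js <<'EOF'
var vm = require('vm');
for (var i = 0; i < 10000; i++) {
  vm.createContext();
}
EOF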

mjob create -o \
## first phase, ask manta to find all of our builds, lower latency than us doing it
## only grab our builds for ia32 smartos and use that name as a key to the next phase (mcat)
  -r 'mfind /NodeCore/public/builds/node -n build.tar.bz2 |
      grep smartos | grep ia32 | xargs mcat' \
## include the test script as an asset
  -s '~~/public/test.js' \
## extract build, set path to that node, grab wall time to run script
  -m 'tar xjf ${MANTA_INPUT_FILE}; export PATH=$PWD/build/bin:$PATH;
      ECODE=$(ptime -m ctrun -i core,signal -l child node /assets/$MANTA_USER/public/test.js 2>&1 |
        grep real | awk "{ print \$2 }");
      echo "$(echo ${MANTA_INPUT_FILE} | cut -f7 -d/) ${ECODE}"' \
## include the original commit order
  -s /NodeCore/public/builds/node/order \
## include a script to sort the results back into commit order
  -s /NodeCore/public/builds/shasort.js \
  --init 'npm install lstream' \
## shasort reads on stdin and sorts it into the given order
  -r 'node /assets/NodeCore/public/builds/shasort.js /assets/NodeCore/public/builds/node/order' < /dev/null

Now, this isn't very scientific. Generally when doing a benchmark (especially in a virtualized environment) you're going to want to run your script multiple times in a row and do some statistical analysis to make sure you're getting significant and accurate results. But just as a first pass, let's see if we can identify any anomalies.
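
If you did want multiple samples, one hedged sketch is to swap the map phase in the job above for a variant like this (the other flags stay the same; five runs is an arbitrary choice):

## emit five timings per build instead of one, so a later pass can compute
## a median and spread for each commit
  -m 'tar xjf ${MANTA_INPUT_FILE}; export PATH=$PWD/build/bin:$PATH;
      COMMIT=$(echo ${MANTA_INPUT_FILE} | cut -f7 -d/);
      for i in 1 2 3 4 5; do
        T=$(ptime -m ctrun -i core,signal -l child node /assets/$MANTA_USER/public/test.js 2>&1 |
            grep real | awk "{ print \$2 }");
        echo "$COMMIT $T";
      done' \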

To reiterate, we've run this test script across every build we have (550 commits at the time I ran this), and in total the job completed in just under 4 minutes.

...
58e4edaf6855025099d400ccc1ac23291b109a41 11.688058687
82ff891e226ecadde68d000a12e7eb1fd0a17d13 10.348681261
fe176929c2963d1e48d34e8f2cb367d9801395a0 12.325443068
0181fee411e217236c4ec0bf22c61466df5a56b5 11.555078847
7684e0b554c7d7ee007959e250700473f64c9fa6 -
d2d07421cad4a20778bf591e279358dd0442382e -
0693d22f86de01b179343cc568a5609726bef9bb -
c56a96c25cabf40801a800c76b44f08d94ac839b -
8985bb8bfd0c4b9fa8dcf001306f1cf7e6c886b4 -
110a9cd8db515c4d1a9ac5cd8837291da7c6c5ea -
9b3de60d3537df657e75887436a5b1df5ed80c2d -
588040d20d87adc1dced78a3c7243b0a27ae8ec5 -
704fd8f3745527fc080f96e54e5ec1857c505399 -
eec43351c44c0bec31a83e1a28be15e30722936a 7.490435929
f0a05e4bc39beb1a15b34dfe906fed3be37c5ac8 5.234022189
28609d17790215ae1b7c7c59e8157ea92cd7cf2f 8.670257433
71ade1c212365099dccb16ee7a9094261629c35a 5.993378213
...

That output is significantly abbreviated. The first column is the commit sha1 and the second is the time it took to run the test script. You'll notice that for some of these commits there's a '-' in the timing field; that means we don't have a build for that commit (because the build was broken).

So we can see a series of commits where the build was broken, and on one side of that stretch of branch history the test script takes 5 ~ 8 seconds to run, while on the other side it takes 10 ~ 12 seconds. It's not always this easy, but if you look at the log for commit 704fd8f you'll notice that this is when we upgraded to V8 3.20. That's pretty damning evidence. And sure enough, if I check out before and after that commit on my local machine I can see that it is indeed the offending commit.
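
That local double-check can be as simple as this sketch, using one sha from each side of the broken stretch in the output above (build steps abbreviated):

## build a commit from the fast side and one from the slow side, then time
## the same test script against each local build
for rev in eec43351c44c0bec31a83e1a28be15e30722936a \
           58e4edaf6855025099d400ccc1ac23291b109a41; do
  git checkout -q $rev
  ./configure >/dev/null && make -j8 >/dev/null
  printf '%s ' $rev
  ptime -m ./node test.js 2>&1 | grep real
done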

Other uses

One of the fun parts about having Manta at your disposal is that you never know just what use case will pop up next. Going forward, the Node team will be using more of these cached builds for regression testing and benchmarking. That will translate into a better Node experience for everyone.

Tangentially, I've been working on a node-bisect script, which will allow core developers and community members alike to do more traditional bisect work over our commit history without having to build Node for each commit. Say you're on Linux and want to find out which commit in our development branch broke your application? No problem: the script will just download the binaries for the commits you're on, run your test case, and reduce to the answer you need. Stay tuned.

Like many things that are painfully manual in testing, finding a regression is generally simple but time-consuming to execute. Using a parallel, automated process makes sense no matter the size of your project.



Post written by TJ Fontaine