Deploying Couchbase in infrastructure containers for unparalleled elasticity and performance

June 03, 2015 - by Casey Bisson

Couchbase's standout performance, built-in sharding, and cross-datacenter replication features make it ideal for cloud-scale applications serving hundreds of millions or billions of users. However, typical cloud infrastructure is ironically unsuited to high-performance Couchbase deployments. The high network and file I/O latency imposed by hardware virtual machines--the hardware hypervisor tax--has forced many users to trade cloud elasticity for bare metal performance.

Infrastructure containers running in Joyent's Triton Elastic Container Service or in private data centers powered by Triton Elastic Container Infrastructure offer an alternative that delivers the bare metal performance and elasticity demanded by sophisticated Couchbase ops teams.

Containerizing an application in infrastructure containers is easy because they offer all the services of a typical Unix host and behave similarly to hardware virtual machines. Containers enjoy their own virtual NICs, filesystems, and all the resource and security isolation that you'd expect of a hardware VM, but with the elastic performance and bursting that's only possible with containers.

Heck, it's easy, let's try it out! You can follow along at home for free by leveraging our $250 Couchbase without Compromise promotion.

Create an infrastructure container running container-optimized CentOS

Couchbase recommends installing on RHEL-based operating systems such as CentOS, so let's start with an infrastructure container running container-optimized CentOS.

You can start a container in the dashboard with just a few clicks. Find the "create instance" button, search for CentOS, and then choose the memory, CPU, and disk size.

Create instance button

Choose CentOS

I prefer to create containers using the command line tools, since that allows me to kickstart the installation with a script that automatically runs as the container is provisioned.

If you've got the tools installed, just copy and paste this code block:

curl -sL -o couchbase-install-triton-centos.bash https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/install-triton-centos.bash
sdc-createmachine \
    --name=couchbase-container-benchmarks-1 \
    --image=$(sdc-listimages | json -a -c "this.name === 'centos-6' && this.type === 'smartmachine'" id | tail -1) \
    --package=$(sdc-listpackages | json -a -c "this.memory === 16384 && /^t4/.test(this.name)" id | tail -1) \
    --networks=$(sdc-listnetworks | json -a -c "this.name ==='default'" id) \
    --networks=$(sdc-listnetworks | json -a -c "this.name ==='Joyent-SDC-Public'" id) \
    --script=./couchbase-install-triton-centos.bash
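The command above resolves the image, package, and network IDs inline, so a filter that matches nothing hands an empty value to sdc-createmachine. As a minimal sketch of a guard, assuming each lookup prints one ID per line as above, a small helper can fail early instead. The name `last_id_or_fail` is hypothetical and not part of the Joyent tools.

```shell
# Hypothetical helper: read candidate IDs on stdin (one per line), print the
# last non-empty one, and fail loudly if the lookup matched nothing.
last_id_or_fail() (
    label=$1
    id=$(grep -v '^[[:space:]]*$' | tail -1)
    if [ -z "$id" ]; then
        echo "no match for ${label}; check the filter" >&2
        exit 1
    fi
    printf '%s\n' "$id"
)

# Usage (assumes the sdc-* tools and json from the command above):
#   image=$(sdc-listimages \
#       | json -a -c "this.name === 'centos-6' && this.type === 'smartmachine'" id \
#       | last_id_or_fail image) || exit 1
```

Resolving each ID into a variable this way also makes it easy to echo the values before creating the container.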

Another advantage of using that code block is that it provisions the container in our beta data center, which features the latest equipment and software before they're deployed in our production data centers. The generation of software and hardware there will be in all our data centers worldwide soon, but it's fun to test on the latest stuff, no?

Install and configure Couchbase

Infrastructure containers look and feel a lot like virtual machines, just faster, so installing software is as simple as SSHing in. You can get the connection information in the portal or via the API.

Container details

Look up the IP address for this new instance using the API:

sdc-listmachines | json -a -c "this.name === 'couchbase-container-benchmarks-1'" ips.1
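The `ips.1` lookup assumes the public address is always the second entry in the list. A hedged alternative, sketched below, picks the first address outside the RFC 1918 private ranges regardless of order. The helper name `first_public_ip` is hypothetical, and it prints nothing when every address is private.

```shell
# Hypothetical helper: pull IPv4 addresses out of whatever arrives on stdin
# (plain lines or a JSON array) and print the first one that is not in the
# 10/8, 172.16/12, or 192.168/16 private ranges.
first_public_ip() {
    grep -Eo '[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+' |
        grep -Ev '^(10\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)' |
        head -1
}

# Usage (assumes the same tools as the command above):
#   sdc-listmachines \
#       | json -a -c "this.name === 'couchbase-container-benchmarks-1'" ips \
#       | first_public_ip
```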

Once in, installing and configuring Couchbase is easy with this script:

curl -sL https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/install-triton-centos.bash | bash
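Piping curl straight into bash runs whatever arrives, including a truncated download. If you'd rather look before you leap, a more cautious sketch saves the script to a file, confirms that bash can parse it, and leaves running it as a separate step. The helper name `fetch_and_check` and the file name `install.bash` are hypothetical.

```shell
# Hypothetical helper: download a script to a file and confirm that bash can
# parse it. It does not execute the script.
fetch_and_check() (
    url=$1
    dest=$2
    # -f makes curl fail on HTTP errors instead of saving the error page.
    curl -fsSL -o "$dest" "$url" || exit 1
    # -n parses the file without running any of its commands.
    bash -n "$dest"
)

# Usage, with the URL from the command above:
#   fetch_and_check \
#       https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/install-triton-centos.bash \
#       install.bash && less install.bash && bash install.bash
```

A syntax check only catches an incomplete or mangled file; it says nothing about what the script does, so reading it is still worthwhile.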

You can see the details of how that script works in the GitHub repo. There are actually three scripts that install, set environment variables, and configure Couchbase.

When done, the script will output some summary information about Couchbase and how to connect to the dashboard. Take a moment to open the dashboard now, because we're going to run some benchmarks in a moment and it will be fun to watch the graphs there while the action happens.

# Couchbase is installed and configured
#
# Dashboard: http://165.225.138.202:8091
# Internal IP: 10.112.5.196
# Bucket: benchmark
# username=Administrator
# password=password
#

Run the benchmarks

The benchmarks will load a large data set and execute a handful of queries designed by decimal.io's Corbin Uselton to test relative performance. As the benchmarks execute, look at the time to load data and query it. Shorter times are better.

The benchmarking tool uses Node.js, so some of the first steps are to install Node.js and some other dependencies. Kick it all off like so:

curl -sL https://raw.githubusercontent.com/misterbisson/couchbase-benchmark/master/bin/benchmark.bash | bash

The result should look similar to the following:

series 1: load test docs [====================] 300000/300000 100% 8.3s elapsed
completed series 1

series 2: load test docs [====================] 300000/300000 100% 7.6s elapsed
completed series 2

series 3: load test docs [====================] 300000/300000 100% 7.9s elapsed
completed series 3

series 4: load test docs [====================] 300000/300000 100% 7.0s elapsed
completed series 4

series 5: load test docs [====================] 300000/300000 100% 7.3s elapsed
completed series 5

series 6: load test docs [====================] 300000/300000 100% 8.7s elapsed
completed series 6

series 7: load test docs [====================] 300000/300000 100% 7.5s elapsed
completed series 7

series 8: load test docs [====================] 300000/300000 100% 8.8s elapsed
completed series 8
query people with SUVs
people with SUVs: 342732
people with SUVs in: 9322ms
query number of convertibles
number of convertibles: 342446
number of convertibles in: 346ms
query average age
average age: 42
average age: 331ms
waiting 60 seconds to run queries again
query people with SUVs
people with SUVs: 342732
people with SUVs in: 7041ms
query number of convertibles
number of convertibles: 342446
number of convertibles in: 343ms
query average age
average age: 42
average age: 436ms
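To compare runs across environments, it can help to boil the load lines down to one number. The sketch below is one way to do that, assuming the output has been saved as plain lines shaped like the sample above, with each load line ending in "<seconds>s elapsed". The helper name `summarize_load_times` and the log file name are hypothetical, and a live progress bar may need its carriage returns stripped first.

```shell
# Hypothetical helper: read benchmark output on stdin and report how many
# load series it contains and their mean load time.
summarize_load_times() {
    awk '
        / elapsed$/ {
            t = $(NF - 1)        # second-to-last field, e.g. "8.3s"
            sub(/s$/, "", t)     # strip the unit
            sum += t
            n++
        }
        END {
            if (n > 0) printf "%d series, mean load %.2fs\n", n, sum / n
        }
    '
}

# Usage, assuming the output was saved to a file named benchmark.log:
#   summarize_load_times < benchmark.log
```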

The benchmarks will run five times, but you can trigger them manually using the following command string:

cloud-benchmark run -d /root/cb-cloud-benchmark-data-79bd88b76cbf9cbec987d84f1ef6ad996973d526 -c couchbase://127.0.0.1
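If you want several manual runs back to back, a small wrapper keeps them tidy. This is only a sketch: `repeat_run` is a hypothetical helper that runs any command a fixed number of times, prints a header before each run so the outputs can be told apart, and stops at the first failure.

```shell
# Hypothetical helper: run a command COUNT times with a header before each
# run, stopping early if a run fails.
repeat_run() (
    count=$1
    shift
    i=1
    while [ "$i" -le "$count" ]; do
        echo "=== run $i of $count ==="
        "$@" || exit 1
        i=$((i + 1))
    done
)

# Usage, with the command string from above:
#   repeat_run 3 cloud-benchmark run \
#       -d /root/cb-cloud-benchmark-data-79bd88b76cbf9cbec987d84f1ef6ad996973d526 \
#       -c couchbase://127.0.0.1
```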

There are instructions in the repository for installing on AWS and on generic CentOS environments, so go ahead and compare the numbers. I think you'll be impressed with the performance per dollar. Say goodbye to the hypervisor tax and say hello to bare metal performance with the ease and scale of cloud infrastructure: Couchbase without Compromise.

Surprises

When installing, you might notice Couchbase detect 32 or 48 CPUs. That's no lie. Containers really do run on bare metal and can see all the CPUs the hardware has available. How many of those CPUs you can actually use (or memory, or disk) depends on the package you selected when creating the container. The installation script includes some additional guidance to tell Couchbase not to try using all the CPUs it can see, for best performance. Part of the container performance advantage is the flexibility to individually schedule threads on any available CPU, but it does require telling the application how many threads it's limited to running simultaneously.
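To make the arithmetic concrete, here is a rough sketch of how a thread limit could be derived from a CPU cap. It is not how the installation script does it. It assumes the cap is expressed as a percentage of one CPU (so 400 means four CPUs), and the helper name `threads_for_cap` is hypothetical. How the resulting number is handed to Couchbase is left to the installation script.

```shell
# Hypothetical helper: given the number of CPUs the container can see and a
# CPU cap as a percentage of one CPU (an assumed unit), print how many
# threads fit under the cap.
threads_for_cap() (
    visible=$1
    cap_percent=$2
    # Round the cap up to whole CPUs, with a floor of one thread.
    allowed=$(( (cap_percent + 99) / 100 ))
    if [ "$allowed" -lt 1 ]; then
        allowed=1
    fi
    # Never ask for more threads than there are visible CPUs.
    if [ "$allowed" -gt "$visible" ]; then
        allowed=$visible
    fi
    echo "$allowed"
)

# Usage: 48 visible CPUs with a cap of 400 yields 4 threads.
#   threads_for_cap 48 400
# On a live system the visible count could come from:
#   getconf _NPROCESSORS_ONLN
```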