Cloud Scale Architectures

Cloud scale architectures allow web applications to really scale: billions of page views per month, millions of users per day, thousands of queries per second.

Achieving this level of scale requires sophisticated architectures of multiple Joyent SmartMachines. Typical architectural patterns adopted by our customers are reviewed below.

Dynamic CPU Bursting for Vertical Scalability

Even the smallest SmartMachine includes automatic CPU bursting. When web applications require additional resources, they have immediate access to a deep resource pool for additional CPU. The performance boost is tremendous. And it comes exactly when needed — not minutes later.

As you scale up the architecture for a web application, just-in-time (JIT) vertical scalability reduces the need for over-provisioning in the web tier, saving operators substantial expense.

Stage 0 – Single SmartMachine Deployment

Successful web applications on a single SmartOS SmartMachine are not uncommon. For example, a PHP application running on Apache can be deployed together with its MySQL database on the same SmartOS SmartMachine instance.

Web applications running on Joyent are connected to Tier-1 networking and have access to services such as NFS drives, which can be used for back-up or for storage shared across multiple web tier servers.

Stage 1 – Basic Tiered Deployment

The recommended approach to a small-scale web application architecture is to separate the database and web tiers. This improves performance over the single-machine deployment because each tier can run on a specialized SmartMachine, such as the MySQL SmartMachine.

This architecture also provides modularity at the network level, which makes it easier to transition to larger-scale web applications without modifying the architecture. As web application traffic grows, you can add infrastructure where you need it. For example, you can increase the size of the database without affecting the web tier.

Stage 2 – Load Balancing, Caching, Redundant Web Tier, and Redundant Database Tier

The second stage adds load balancing. For example, the Zeus SmartMachine provides a simple yet powerful web-based interface to configure and run load balancing and page caching. Load balancing distributes requests across multiple SmartMachines running the web tier. Page caching keeps copies of dynamically generated web pages in memory and serves them quickly, handling tens of thousands of requests per second. Depending on the web application, page caching can dramatically reduce the load on back-end servers.
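To make the two ideas concrete, here is a minimal Python sketch of a load balancer that also caches pages. It is only an illustration, not how the Zeus SmartMachine works internally: the round-robin policy, the class name RoundRobinBalancer, and the modeling of each web tier server as a plain function are all assumptions made for this example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Toy load balancer with an in-memory page cache (illustrative only)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # Each backend is a callable that takes a path and returns a page.
        self._backends = cycle(backends)
        self._cache = {}

    def handle(self, path):
        # Serve a cached copy if one exists; the web tier is not touched.
        if path in self._cache:
            return self._cache[path]
        # Otherwise pass the request to the next backend in rotation.
        backend = next(self._backends)
        page = backend(path)
        self._cache[path] = page
        return page
```

A repeated request for the same path is answered from the cache, so only the first request reaches a web server. A real page cache would also need expiry and invalidation rules, which this sketch leaves out.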

With load balancing in place, the web tier can scale horizontally with the addition of a second SmartMachine. For example, a web application may have a pair of SmartMachines loaded with PHP code and running Apache. The same principle applies for Rails code running on Mongrel or within Passenger.

The database tier can also run a redundant configuration, such as dual master MySQL SmartMachines.

Stage 3 – Redundant Load Balancing, Caching, and Traffic Management

A typical next step is the addition of a second load balancer, such as a Zeus SmartMachine, to deliver redundant traffic management. The load balancers can take care of clustering automatically. Setting up a cluster usually takes only minutes, adds fail-over, and increases caching capacity and throughput.

Stage 4 – Horizontally Scale the Web Tier

In the fourth stage, the addition of extra capacity means horizontally scaling the web tier. If a given web server can handle 500 requests per second and a web application must support 2,500 requests per second, then five SmartMachines are typically required in the web tier.
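The sizing arithmetic above can be sketched in a few lines of Python. The function name web_tier_size is hypothetical, and the calculation assumes every web server has the same capacity and that the load balancer spreads requests evenly. It also leaves no headroom for a failed server or a traffic spike.

```python
def web_tier_size(target_rps: int, per_server_rps: int) -> int:
    """Return the number of web servers needed to serve target_rps."""
    if target_rps < 0 or per_server_rps <= 0:
        raise ValueError("target_rps must be >= 0 and per_server_rps > 0")
    # Integer ceiling division: a partly used server still counts as one.
    return (target_rps + per_server_rps - 1) // per_server_rps


print(web_tier_size(2500, 500))  # prints 5, matching the example above
```

In practice you may want to add one or more servers to this figure so the tier can lose a machine without falling below the target.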

Stage 5 – Vertically Scale the Database Tier

The last major step in scaling a web application typically requires scaling up the capacity of the database tier by increasing the size of your MySQL SmartMachine or Riak SmartMachine. Make sure that your data set size does not exceed 80% of your available memory, so that your database can answer queries from memory without having to hit the disk for data. Retrieving data from memory is much faster than retrieving it from disk.
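The 80% guideline can be turned into a quick check. The sketch below is illustrative: the function names and the example sizes are assumptions, and it uses integer arithmetic on byte counts to avoid rounding surprises.

```python
GIB = 2**30  # bytes in one gibibyte


def fits_in_memory(data_set_bytes: int, memory_bytes: int,
                   max_percent: int = 80) -> bool:
    """True if the data set uses at most max_percent of available memory."""
    return data_set_bytes * 100 <= memory_bytes * max_percent


def min_memory_bytes(data_set_bytes: int, max_percent: int = 80) -> int:
    """Smallest memory size that keeps the data set within max_percent."""
    # Integer ceiling division of data_set_bytes * 100 by max_percent.
    return (data_set_bytes * 100 + max_percent - 1) // max_percent


print(fits_in_memory(12 * GIB, 16 * GIB))  # 12 GiB is 75% of 16 GiB
print(min_memory_bytes(12 * GIB) // GIB)   # memory needed, in GiB
```

Under these assumptions, a 12 GiB data set fits on a 16 GiB machine, while a 13 GiB data set would call for the next larger size.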