Cloud DNS: How to Speed Up Your Cloud Apps at No Extra Charge

October 31, 2011 - by alexsalkever

Note: This is a guest post from Cory von Wallenstein, the VP of Engineering for Dyn, a Joyent Cloud partner that provides enterprise DNS and email delivery services. Follow Cory and Dyn on Twitter at @cvonwallenstein and @DynInc, respectively.

When it comes to the Joyent Cloud, most site owners and operators are familiar with the benefits of powering their applications via Infrastructure-as-a-Service:

  1. Pay for what you use, nothing more
  2. Scale up or down your infrastructure with demand
  3. Automate your infrastructure with APIs and configuration management tools
  4. Reduce provisioning time from days or hours to minutes or seconds in many cases

There’s a fifth major benefit that those on the leading edge of Joyent Cloud technology are taking advantage of, and it becomes more compelling as cloud providers like Joyent continue to expand their offerings and footprints:

  5. Don’t lump all of your infrastructure in one geographical basket; buy smaller pieces of infrastructure close to where your users live, delivering a faster experience that’s more highly available.

The economics of going global in the Joyent Cloud are compelling because you likely won’t pay significantly more than you are now:

  • If you’re already deployed to the cloud and serving 100% of your nominal load from a single physical location, it doesn’t cost significantly more from an infrastructure standpoint to serve 50% of your nominal load from each of two physical Joyent Cloud locations to become highly available.
    • I’ll note there are some architectural caveats here. You’ll need to make sure your application is architected to be served from multiple physical locations, but if you’re intent on maximizing performance and minimizing exposure to downtime, you’re architected accordingly!
  • Once you’re in two locations, each with 50% nominal load but capable of handling 100% of the nominal load in a failover scenario, it doesn’t cost significantly more to leverage your Joyent Cloud provider’s global footprint to segment further. You can serve 25% of your nominal load from each of four global locations, or 10% from each of ten global locations, or adjust based on user geography to serve 80% from two locations and 20% from a third, perhaps on a different continent altogether.
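
To put rough numbers on the bullets above, here is a minimal sketch of the capacity arithmetic. It assumes load is split evenly and that only one location fails at a time; both are simplifying assumptions for illustration, and the function name is made up for this example.

```python
def per_location_plan(num_locations: int) -> dict:
    """Share of nominal load each location serves, and the share it must
    be able to absorb if exactly one other location goes down."""
    if num_locations < 2:
        raise ValueError("need at least two locations to survive an outage")
    normal_share = 1.0 / num_locations
    # After one failure, the survivors split 100% of the load evenly.
    failover_share = 1.0 / (num_locations - 1)
    return {"normal_share": normal_share, "failover_share": failover_share}


for n in (2, 4, 10):
    plan = per_location_plan(n)
    print(f"{n} locations: serve {plan['normal_share']:.0%} each, "
          f"size each for {plan['failover_share']:.0%} to survive one outage")
```

Under these assumptions, two locations each need headroom for 100% of nominal load, while ten locations each need headroom for only about 11%, which is one way to see why segmenting further doesn’t have to cost significantly more.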

So, with compelling benefits (increased availability and performance for a global audience) combined with compelling economics (if it’s true infrastructure-as-a-service and you can divide your workload, it doesn’t cost significantly more to spread your workload globally than it does to serve your workload entirely from one physical location), you’re ready to go global, right?

Well, there’s one important piece missing. Once you can serve your application globally, how do you make sure the right users get to the right location, and that the location is available for them to reach?

After all, if everyone types “example.org” into their address bar, how do you make sure the users on the West Coast of the USA get sent to Joyent Cloud data centers on the West Coast? How do you make sure the users on the East Coast of the USA get sent to Joyent Cloud data centers on the East Coast?

That’s where DynECT Managed DNS services come in, all of which leverage the global DNS system to efficiently and accurately route your users to the right global location. Here’s a quick overview of the many flavors of global load balancing achievable via DNS, ordered from simplest (and least effective) to most advanced (and most effective). Let’s dive in!

Round Robin Load Balancing

The simplest of all global load balancers: if you have three global locations and you want 33% of your global traffic to go to each, simply store three records in your DNS server under the same label (e.g., basic-round-robin.standingonthebrink.com) like so:

basic-round-robin.standingonthebrink.com IN A 1.2.3.4
basic-round-robin.standingonthebrink.com IN A 2.2.3.4
basic-round-robin.standingonthebrink.com IN A 3.2.3.4

You can add these records to your current DNS server or use a managed DNS product like DynECT Managed DNS.

When users request basic-round-robin.standingonthebrink.com, their computers will see three options, and will choose one pseudo-randomly (sort of… read on).

$ dig @NS1.P24.DYNECT.NET basic-round-robin.standingonthebrink.com +short

1.2.3.4

2.2.3.4

3.2.3.4

In a perfect world, 33% of users would land at 1.2.3.4, 33% at 2.2.3.4, and 33% at 3.2.3.4. In reality, many resolvers and browsers have a “preference” for which IP to connect to in this scenario, and will tend to prefer the numerically smallest IP address. You may actually find that 40-50% of your users connect to 1.2.3.4, and 25-30% connect to each of 2.2.3.4 and 3.2.3.4. Not perfectly balanced.
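
A toy model makes the skew easy to see. The sketch below assumes that some fraction of clients always pick the numerically smallest address while everyone else picks uniformly at random. The 20% figure is an illustrative assumption, not a measurement, and real client behavior is more varied than this.

```python
from ipaddress import IPv4Address


def expected_shares(addresses, sticky_fraction):
    """Expected traffic share per address.

    sticky_fraction is the assumed share of clients that always connect
    to the numerically smallest address; the rest choose uniformly.
    """
    smallest = min(addresses, key=IPv4Address)
    uniform = (1.0 - sticky_fraction) / len(addresses)
    return {
        addr: uniform + (sticky_fraction if addr == smallest else 0.0)
        for addr in addresses
    }


shares = expected_shares(["1.2.3.4", "2.2.3.4", "3.2.3.4"], sticky_fraction=0.2)
for addr, share in shares.items():
    print(f"{addr}: {share:.1%}")
```

With a sticky fraction of 0.2, this model puts roughly 47% of traffic on 1.2.3.4 and roughly 27% on each of the other two, which is in the same range as the skew described above.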

It works, but it could be better! Let’s explore the next approach.

DynECT Round Robin Load Balancing

Traditional round robin load balancing skews traffic toward the numerically smaller IP addresses because clients prefer some addresses over others. To address this, the DynECT DNS team added an important enhancement when they built their global load balancing service. Rather than return every configured IP address for a query (in the example above, all three addresses for basic-round-robin.standingonthebrink.com are returned, leaving it up to the client to determine which one to connect to), the DynECT Managed DNS platform can be configured to return only a subset of the configured IP addresses to end users.

By doing so, traffic distribution comes much closer to even, since the client is no longer able to favor one IP address consistently over another. This capability is called the “serve count” and controls how many IP addresses will get returned for a query to the DynECT load balancing service. Here, we’ll configure DynECT Round Robin Load Balancing for dynect-round-robin.standingonthebrink.com:

Figure 2 - Configuring DynECT Round Robin Load Balancing with a Serve Count of 1

Here’s the effect on the answers to the queries; each query receives only a single A record in response (we set the serve count to 1; had we set it to 2 or 3, more answers would be returned). The DNS server is now responsible for introducing the necessary randomness in issuing responses to distribute traffic load.

$ dig @NS1.P24.DYNECT.NET dynect-round-robin.standingonthebrink.com +short

1.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-round-robin.standingonthebrink.com +short

3.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-round-robin.standingonthebrink.com +short

2.2.3.4
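
The idea behind the serve count can be sketched in a few lines. This is an illustration of the concept only, not how the DynECT platform is implemented; the function name and the use of uniform random sampling are assumptions made for the example.

```python
import random
from collections import Counter

CONFIGURED = ["1.2.3.4", "2.2.3.4", "3.2.3.4"]


def answer_query(configured, serve_count, rng=random):
    """Return serve_count distinct addresses drawn at random from the pool."""
    if not 1 <= serve_count <= len(configured):
        raise ValueError("serve_count must be between 1 and the pool size")
    return rng.sample(configured, serve_count)


# With a serve count of 1, the server (not the client) decides which
# address each query sees, so client-side preferences no longer matter.
tally = Counter(answer_query(CONFIGURED, 1)[0] for _ in range(9000))
for addr in CONFIGURED:
    print(addr, tally[addr])
```

Over many queries each address should be handed out roughly a third of the time; the exact counts vary from run to run.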

If we wanted to change how frequently one address is returned in comparison to the others (e.g., send 50% of the traffic to one IP address and 25% to each of the other two), we don’t need to add any additional A records to the service. We can simply adjust the “weight” drop-down to change how frequently each address is returned; here, with a weight of 1 for each IP address, they’re returned equally often in responses to queries, ensuring as even a traffic distribution as possible.
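
As a rough sketch of how weights could translate into traffic shares, the example below assumes that weights are relative, so an address with weight 2 is returned twice as often as one with weight 1. That proportional interpretation is an assumption made for illustration; check the product documentation for the exact semantics.

```python
import random


def weighted_shares(weights):
    """Fraction of responses each address should receive."""
    total = sum(weights.values())
    return {addr: w / total for addr, w in weights.items()}


def pick_one(weights, rng=random):
    """Choose a single address (serve count of 1) according to its weight."""
    addrs = list(weights)
    return rng.choices(addrs, weights=[weights[a] for a in addrs], k=1)[0]


# Equal weights, as in the example above: an even three-way split.
print(weighted_shares({"1.2.3.4": 1, "2.2.3.4": 1, "3.2.3.4": 1}))

# Doubling one weight gives the 50% / 25% / 25% split.
skewed = {"1.2.3.4": 2, "2.2.3.4": 1, "3.2.3.4": 1}
print(weighted_shares(skewed))
print(pick_one(skewed))
```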

But what happens if one of these IP addresses becomes unavailable? Let’s continue exploring the next option for global JoyentCloud traffic management using DNS.

DynECT Active Failover

So we’re happily managing traffic for our globally deployed web application in the JoyentCloud, but what happens if one of our instances of the application experiences a problem? What ensures users don’t get routed to infrastructure that is no longer available?

That’s where DynECT Active Failover comes in. Building on top of DynECT Round Robin Load Balancing, DynECT Active Failover adds a monitoring component to ensure only available IP addresses are returned in response to queries. Here, we turn that on by selecting the “Serve Mode” of “Monitor & Remove on Failure.”

Figure 3 - Add DynECT Active Failover to Monitor IP Addresses

You can pick from a number of monitoring protocols (e.g., HTTP, HTTPS, Ping, SMTP, etc.) and a number of behaviors to execute on failover (e.g., remove the bad address from the load balancer and shift load onto the other IP addresses, remove the bad address from the load balancer and replace it with a standby address, etc.).

The net result is that if your infrastructure becomes unavailable in one of your global locations, your users are routed around the problem and continue to get a seamless web experience.
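
Conceptually, “monitor and remove on failure” boils down to filtering the pool through a health check before answering. Here is a minimal sketch under stated assumptions: the function names, the TCP probe, and the standby address 4.2.3.4 are all hypothetical and are not part of the DynECT service or its API.

```python
import socket

POOL = ["1.2.3.4", "2.2.3.4", "3.2.3.4"]


def tcp_probe(address, port=80, timeout=2.0):
    """One possible health check: can we open a TCP connection?"""
    try:
        with socket.create_connection((address, port), timeout=timeout):
            return True
    except OSError:
        return False


def monitored_pool(addresses, is_healthy, standby=None):
    """Drop unhealthy addresses; optionally bring in a standby address.

    With standby=None, load simply shifts onto the surviving addresses.
    With a standby, it replaces failed capacity if it is itself healthy.
    """
    healthy = [a for a in addresses if is_healthy(a)]
    any_removed = len(healthy) < len(addresses)
    if any_removed and standby is not None and is_healthy(standby):
        healthy.append(standby)
    return healthy


# Simulate an outage at 2.2.3.4 without touching the network.
down = {"2.2.3.4"}
print(monitored_pool(POOL, lambda a: a not in down))
print(monitored_pool(POOL, lambda a: a not in down, standby="4.2.3.4"))
```

In a real deployment you would pass something like tcp_probe (or an HTTP, HTTPS, ping, or SMTP check) as is_healthy and run it on a schedule rather than per query.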

So far so good! We’ve ensured that our traffic is being routed to our JoyentCloud infrastructure in the proportions we want, and if one of those locations becomes unavailable, the traffic will seamlessly shift to the other locations.

But what about the performance of our application? If our users on the East Coast are getting dragged over to the West Coast JoyentCloud infrastructure, they’re receiving a far slower user experience than if we ensured they were routed to the nearby East Coast JoyentCloud infrastructure. We’ll explore geotargeting next!

DynECT Traffic Management (GSLB)

Building on top of our previous examples, we’re now going to add geo-targeting with the following rules for queries to dynect-traffic-management.standingonthebrink.com:

  1. If the end user is determined to be on the West Coast, send them to one of our two IP addresses for the West Coast. If one becomes unavailable, fail over to the other; if both become unavailable, fail over to the East Coast.
  2. If the end user is determined to be on the East Coast, send them to the single East Coast IP address. If that IP address becomes unavailable, send them to the West Coast.
  3. For anyone else in the world, send them to one of the three IP addresses at random. (We could get more specific and further optimize here, but for the sake of brevity, we won’t.)

The following shows the specific rulesets created in DynECT Managed DNS for the US West and US East regions (two of seven regions; the other five are US Central, EU West, EU Central, EU East and Asia). We’ve mapped the appropriate IP addresses, given them equal weight, set the serve count to 1, and set the serve mode to “Monitor & Remove on Failure”. Most importantly, we’ve set “Failover to Global Status” to On: if failure detection removes all of the IP addresses from a region, we fall back to the “global” ruleset, ensuring there are still IP addresses to route users to.
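
The three rules and the fall-back-to-global behavior can be sketched as a small lookup. This is an illustration of the logic only; the data structures and function names are invented for the example, and returning an empty answer when the global fallback is switched off is an assumption, since that case isn’t covered here.

```python
import random

GLOBAL_POOL = ["1.2.3.4", "2.2.3.4", "3.2.3.4"]
REGION_POOLS = {
    "US West": ["1.2.3.4", "2.2.3.4"],
    "US East": ["3.2.3.4"],
}


def candidates(region, is_healthy, failover_to_global=True):
    """Healthy addresses eligible for a query from the given region."""
    regional = REGION_POOLS.get(region)
    if regional is None:
        # No regional ruleset: everyone else uses the global ruleset.
        return [a for a in GLOBAL_POOL if is_healthy(a)]
    healthy = [a for a in regional if is_healthy(a)]
    if healthy or not failover_to_global:
        return healthy
    # Every regional address failed: fall back to the global ruleset.
    return [a for a in GLOBAL_POOL if is_healthy(a)]


def answer(region, is_healthy, rng=random):
    """Serve count of 1 with equal weights: pick one candidate at random."""
    pool = candidates(region, is_healthy)
    return rng.choice(pool) if pool else None


def everything_up(addr):
    return True


print(answer("US East", everything_up))
print(answer("US West", everything_up))
print(answer("EU West", everything_up))
# East Coast outage: East Coast users are sent to the West Coast addresses.
print(candidates("US East", lambda a: a != "3.2.3.4"))
```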

Figure 4 - Add Geotargeting to Improve User Experience

With the geotargeting in effect, we now see the following query behavior from different vantage points around the globe.

From the East Coast, we only see a single IP address returned for each query (note that all IP addresses start with “3”):

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

3.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

3.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

3.2.3.4

From the West Coast, we see load balancing occurring between the two IP addresses configured (note that all IP addresses start with either “1” or “2”, and “3” is nowhere to be found):

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

2.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

1.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

2.2.3.4

From any other vantage point on the Internet (in this example, we’re using a server in London), we are globally load balanced among all three IP addresses (note that we see all three addresses):

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

3.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

1.2.3.4

$ dig @NS1.P24.DYNECT.NET dynect-traffic-management.standingonthebrink.com +short

2.2.3.4

Awesome, we’re really getting in control of our global traffic management now. But what if we need more granularity in our control? What if we want to split users by country? Let’s explore that next.

DynECT Geo Traffic Management

The DynECT Traffic Management service explored above relies on region-based control of how to manage traffic. You choose from seven regions around the world and define which IP addresses should be returned for users within each region.

Depending on the locations of your users, the regions available (three in the USA/North America, three in Europe, and one in Asia) may be far too large. Take Asia as an example.

Let’s explore how we can break the “Asia” region into smaller sub-regions that include:

  1. China, India and surrounding neighbors
  2. Australia
  3. Japan
  4. Everything else in the region

To do so, you’ll want to leverage the DynECT Geo Traffic Management service, which performs a database lookup on the source IP address of every query received and decides where the user is located based on the country believed to be home to that address.

Currently in beta, this service on DynECT Managed DNS allows you to create your own regions composed of individual countries around the world and define the traffic management behavior for users in those countries. It’s perfect for fine-grained control of your global traffic management.
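
To make the country-based approach concrete, here is a minimal sketch of the two steps involved: look up the country for a source IP, then map that country to a custom region. Everything in it is hypothetical. The lookup table uses reserved documentation address ranges with made-up country assignments, the “surrounding neighbors” of China and India are an arbitrary pick, and none of the names correspond to the actual DynECT service.

```python
import ipaddress

# Stand-in for a real IP-to-country database (made-up data).
GEO_DB = {
    ipaddress.ip_network("192.0.2.0/24"): "JP",
    ipaddress.ip_network("198.51.100.0/24"): "AU",
    ipaddress.ip_network("203.0.113.0/24"): "IN",
}

# Custom sub-regions composed of individual countries (ISO 3166 codes).
CUSTOM_REGIONS = {
    "china-india-neighbors": {"CN", "IN", "NP", "BD"},
    "australia": {"AU"},
    "japan": {"JP"},
}
FALLBACK_REGION = "asia-everything-else"


def country_for_ip(source_ip):
    """Return the country code believed to be home to this address, if any."""
    addr = ipaddress.ip_address(source_ip)
    for network, country in GEO_DB.items():
        if addr in network:
            return country
    return None


def region_for_country(country_code):
    """Map a country code to one of our custom sub-regions."""
    if country_code is not None:
        for name, countries in CUSTOM_REGIONS.items():
            if country_code.upper() in countries:
                return name
    return FALLBACK_REGION


for ip in ("192.0.2.10", "198.51.100.7", "203.0.113.99", "10.1.2.3"):
    print(ip, "->", region_for_country(country_for_ip(ip)))
```

Each sub-region would then get its own pool of IP addresses and traffic management behavior, just as the seven built-in regions do.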

So, all in all, we’ve explored a number of ways to manage your global traffic in the Joyent Cloud using DNS, but each additional capability added an additional configuration requirement, increasing the complexity of getting up and running. How can we address this? Enter DynECT Real Time Traffic Management.

DynECT Real Time Traffic Management

The motivations for using DynECT Real Time Traffic Management are simple:

  1. Ensure users are routed to the appropriate infrastructure for them using real time insight into application and global network performance, and
  2. Automatically determine the right global traffic management rules to optimize your performance and availability based on real time network performance and availability measurements.

In all previous examples of DNS load balancing services, you as administrator were required to configure regions and IP addresses manually. By contrast, with DynECT Real Time Traffic Management, you only provide your list of Joyent Cloud IP addresses and the DynECT Managed DNS platform takes care of the rest.

The DynECT Managed DNS platform will measure availability and latency to each of your IP addresses from multiple global Internet vantage points, and automatically create the traffic management rules to ensure users are always routed to the best possible JoyentCloud instances for them.

For instance, let’s say one of your global instances comes under heavy load, which increases the amount of time required to service each individual user. DynECT Real Time Traffic Management will take this performance impact into consideration and automatically adjust the global traffic management rules to transfer some of the users to JoyentCloud instances that are farther away but less heavily loaded, and thus faster. When additional capacity is deployed or the heavy load subsides, DynECT Real Time Traffic Management will detect this condition as well and seamlessly shift global traffic back.
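
The general idea of deriving rules from measurements can be sketched as follows. This is a deliberately simplified illustration with invented vantage point names and latency numbers: it sends each vantage point to its single fastest reachable address, whereas a real system would presumably shift traffic gradually and smooth out noisy measurements.

```python
def best_address(measurements):
    """Pick the reachable address with the lowest measured latency.

    measurements maps address -> latency in milliseconds, or None when
    the address was unreachable from this vantage point.
    """
    reachable = {a: ms for a, ms in measurements.items() if ms is not None}
    if not reachable:
        return None
    return min(reachable, key=reachable.get)


def build_rules(by_vantage):
    """Derive one rule per vantage point from the latest measurements."""
    return {vantage: best_address(m) for vantage, m in by_vantage.items()}


normal = {
    "us-east": {"1.2.3.4": 80, "2.2.3.4": 85, "3.2.3.4": 20},
    "us-west": {"1.2.3.4": 15, "2.2.3.4": 18, "3.2.3.4": 75},
}
print(build_rules(normal))

# The East Coast instance comes under heavy load and slows down, so East
# Coast users are shifted to a farther away but faster instance.
loaded = {
    "us-east": {"1.2.3.4": 80, "2.2.3.4": 85, "3.2.3.4": 150},
    "us-west": {"1.2.3.4": 15, "2.2.3.4": 18, "3.2.3.4": 200},
}
print(build_rules(loaded))
```

Re-running build_rules with fresh measurements once the load subsides would shift East Coast users back to 3.2.3.4.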

It’s All in the DNS

All in all, we’ve explored a number of techniques for globally managing your JoyentCloud traffic using a global DNS footprint, with some particular examples pulled from the DynECT Managed DNS platform. All of this capability is possible thanks to the valuable role DNS plays in the architecture of the Internet: it sits between the end user and everything they want to access, providing a tremendous amount of flexibility and control over the best way for the end user to reach the content they seek.

If you have any questions about how best to globally manage your JoyentCloud traffic, be sure to get in touch with your friendly representative, or reach out to the Dyn.com sales and concierge teams for one-on-one assistance in optimizing your global infrastructure.
