Multi-layered load balancing to achieve speed, throughput, reliability, and compliance

Do you know what this is? Can you spot the multiple layers of load balancing at work here?

Traffic hitting ConfigCat's servers

Do you know why you see distinct groups of lines fluctuating around different trends? They are the result of a design that meets four contradictory goals at the same time. Let me explain.

The figure above shows traffic hitting ConfigCat's servers, and ConfigCat has servers all around the world.

ConfigCat servers around the world

Of course, those servers are behind a load balancer. But not behind a simple one. Because we had...

Four goals when designing our CDN

  1. Small response times. Requests hitting ConfigCat's CDN should be answered by the servers closest to the request's source.
  2. Unlimited throughput. ConfigCat's CDN should handle any increase in the number of requests just by us throwing additional servers at it.
  3. Resilience to server failures. ConfigCat's CDN should tolerate the death of individual CDN servers. Other servers should jump in and take over the load of the dead ones.
  4. Resilience to provider failures. ConfigCat's CDN should stay alive in case one or two datacenters are experiencing issues.

These requirements (just like in any interesting engineering problem) are contradictory.

And we have solved them by...

Utilizing multiple layers of load balancers and multiple providers.

  • The load balancer in the top layer does geolocation-based routing to a second layer of load balancers.
  • Load balancers in the second layer distribute traffic equally across servers.
  • There is a watchdog checking the individual servers and feeding back their health status to the load balancers in the second layer.
  • Redundant providers manage servers in redundant datacenters.

How this works

The top-layer load balancer targets Goal #1. Thanks to geolocation-based routing, it makes sure requests always hit the closest group of servers. Goal #1 ✅.
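To make the idea of "closest group of servers" concrete, here is a minimal sketch of distance-based region selection. It is only an illustration, not how our top-layer load balancer is implemented: the coordinates, the `REGIONS` table, and the function names are all assumptions made up for this example, and real geo-routing typically works from the client's IP address rather than exact coordinates.

```python
import math

# Hypothetical anchor points (latitude, longitude), one per region.
# Illustrative values only, not ConfigCat's actual server locations.
REGIONS = {
    "West Coast": (37.77, -122.42),
    "East Coast": (40.71, -74.01),
    "Europe": (50.11, 8.68),
    "Singapore": (1.35, 103.82),
    "Australia": (-33.87, 151.21),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def closest_region(client_location):
    """Return the name of the region whose anchor point is nearest the client."""
    return min(REGIONS, key=lambda name: haversine_km(client_location, REGIONS[name]))
```

With this table, a client located in Paris, `closest_region((48.86, 2.35))`, is routed to the Europe group, which then hands the request to its own second-layer load balancer.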

Requests hitting the second layer of load balancers will be distributed equally among CDN servers under their control. If there is an increase in the number of requests hitting a certain region of the world, we just add new servers to the load balancer responsible for that region. Goal #2 ✅.
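One simple way to distribute requests equally is round-robin. The sketch below assumes that strategy purely for illustration (this post does not say which algorithm our load balancers use), and the class and method names are hypothetical.

```python
class RegionalLoadBalancer:
    """Round-robin balancer for the servers of one region (hypothetical sketch)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._next = 0  # running counter of requests handed out so far

    def add_server(self, server):
        """Scale out: the new server joins the rotation immediately."""
        self.servers.append(server)

    def pick(self):
        """Return the server that should handle the next request."""
        server = self.servers[self._next % len(self.servers)]
        self._next += 1
        return server
```

Adding capacity to a busy region is then a single `add_server` call, after which every server in that region receives an equal share of the requests again.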

If an individual server dies, the watchdogs will recognize this and tell the load balancers not to send traffic there. This is how Goal #3 is achieved ✅.
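The feedback loop between watchdog and load balancer can be sketched like this. Everything here is an assumption for illustration: the `probe` callable stands in for whatever health check a real watchdog performs (for example an HTTP request), and the class and function names are invented.

```python
class HealthAwareBalancer:
    """Round-robin balancer that skips servers reported as unhealthy (sketch)."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)  # every server starts out healthy
        self._next = 0

    def report(self, server, is_healthy):
        """Called by the watchdog with the latest health status of a server."""
        if is_healthy:
            self.healthy.add(server)
        else:
            self.healthy.discard(server)

    def pick(self):
        """Return the next healthy server, or raise if the whole group is down."""
        candidates = [s for s in self.servers if s in self.healthy]
        if not candidates:
            raise RuntimeError("no healthy servers in this region")
        server = candidates[self._next % len(candidates)]
        self._next += 1
        return server


def run_watchdog_once(balancer, probe):
    """Probe every server once and feed the results back to the balancer.

    `probe` is a hypothetical callable: probe(server) -> bool.
    """
    for server in balancer.servers:
        balancer.report(server, probe(server))
```

A dead server drops out of the rotation after the next watchdog pass and rejoins automatically once its probe succeeds again.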

Additionally, in regions where it is available, we have multiple groups of servers located in different datacenters in different cities, managed by different providers. This adds an extra layer of reliability and tolerance to datacenter-related failures. We cannot survive the world collapsing, but we can certainly tolerate providers going down or cities vanishing. (Goal #4 ✅)

Regions, cities, datacenters

Let's revisit this figure.

ConfigCat servers around the world

What we see here is ten cities (some of them with multiple datacenters) organized into five regions:

  • West Coast,
  • East Coast,
  • Europe,
  • Singapore,
  • Australia.

The whole thing together 🙂

And now, revisiting this figure.

Traffic hitting ConfigCat's servers

Let me explain this.

  • An individual line represents traffic hitting an individual server.
  • The groups of lines represent the traffic hitting a region.
  • Lines in the same group move together because the load balancers in the second layer distribute the traffic roughly equally across them.
  • Lines in different groups follow different trends because traffic patterns differ across the globe.

What we see here is traffic building up in four regions of the world after launching ConfigCat's new global CDN network.

And why do I use the term "ConfigCat's new global CDN network"?

Because ConfigCat has multiple CDN networks.

Multiple CDN networks

Here's the list of ConfigCat's CDN networks on the status page. All networks consist of multiple regions, cities, datacenters, and servers.

CDN Networks on the ConfigCat status page

Cool. But why? Why multiple CDN networks?

To support Data Governance.

  • To allow you to tell us if you want your data distributed in the EU only. So you can easily stay GDPR compliant.
  • To allow you to tell us if you want your data distributed globally. So your users can get the lowest response times possible.
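Conceptually, the choice between the two options boils down to picking which network is allowed to hold your data. Here is a minimal sketch of that idea. The enum, the mapping, and the function name are hypothetical and do not describe our actual configuration; the global region names come from the list earlier in this post, and the EU-only entry is a simplified placeholder.

```python
from enum import Enum


class DataGovernance(Enum):
    """Hypothetical setting mirroring the two options described above."""
    GLOBAL = "global"
    EU_ONLY = "eu_only"


# Illustrative mapping from the setting to the regions that may serve the data.
NETWORK_REGIONS = {
    DataGovernance.GLOBAL: ["West Coast", "East Coast", "Europe", "Singapore", "Australia"],
    DataGovernance.EU_ONLY: ["Europe"],  # placeholder for the EU-only network
}


def may_serve(setting, region):
    """Return True if `region` is allowed to serve data under `setting`."""
    return region in NETWORK_REGIONS[setting]
```

Under this sketch, `may_serve(DataGovernance.EU_ONLY, "Singapore")` is false, so EU-only data never leaves Europe, while the global setting trades that restriction for shorter distances to your users.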

What's next

We will add new CDN networks. Again,

  • To allow you to have your data in the US only.
  • Or Australia only.
  • Or Singapore only.

You get it.