Wednesday, June 17, 2020

Cloud-Native Design Techniques - Scale

Designing systems for the cloud, or to be cloud-native, is all the rage right now. There are numerous articles written about this, but most of them come across like partial cooking instructions where the reader is expected to know the recipe already. I want to talk about cloud-native design, but not make any assumptions about prior knowledge.

For the past 14 years, I've worked on a system that processes electronic payment transactions. When we built this system, we specifically designed the applications to handle scale. We were betting the business on being able to scale the software linearly with business growth. What exactly does this mean and how is it done?

First, let's define what it means for an application to "scale." In the simplest terms, an application is able to "scale" when it can handle more requests or load without changing the application itself, without failing, and generally without significant changes to the infrastructure that executes it. Let's take for example an order handling system. The code below loosely defines this system:
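
A minimal sketch of such an order handling system, written here in Python. It is an illustration under stated assumptions, not the production payment code: the `Order` fields, the in-memory `billing_db` dictionary standing in for a database, and the 10 ms sleep imitating a network round trip are all hypothetical. The functions `handle_order`, `bill_order`, and `ship_order` correspond to the HandleOrder, BillOrder, and ShipOrder names used in this post.

```python
import time
from dataclasses import dataclass


@dataclass
class Order:
    order_id: int
    amount: float
    address: str


# Hypothetical stand-ins for a real database table and a shipping system.
billing_db = {}
shipments = []


def validate_order(order: Order) -> bool:
    """An order is valid when it has a positive amount and an address."""
    return order.amount > 0 and bool(order.address)


def bill_order(order: Order) -> None:
    """Record the charge; the sleep imitates a slow database round trip."""
    time.sleep(0.01)
    billing_db[order.order_id] = order.amount


def ship_order(order: Order) -> None:
    """Queue the order for shipment."""
    shipments.append(order.order_id)


def handle_order(order: Order) -> bool:
    """Validate, then bill, then ship, strictly one step after another."""
    if not validate_order(order):
        return False
    bill_order(order)
    ship_order(order)
    return True
```

Every call to `handle_order` waits for billing to finish before shipping starts, so the slow billing step sets the pace for the whole system.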

We can see in the code that three activities occur for a new order: 1) the order is validated, 2) the order is billed, and 3) the order is shipped. In this example, the order validation must occur for the other two activities to happen, but not all of the activities are chained together. We can see that the process of billing an order interacts with a database. Database interactions generally require requests over a network and can be constrained by the resources available to the database server. This operation could be slow or contentious. If orders occur infrequently, this approach is fine and will work well enough. If orders are streaming in rapidly, this approach will either cause significant delays in processing orders, or it will crash the entire system as resources get overwhelmed.

There are three ways that we could easily modify this system to handle scale. The first approach would be to upgrade the server it runs on, or to run it on multiple servers. A hardware-based approach is the easiest for developers because they don't really have to do anything, but it may not be the best long-term solution. Upgrading the existing server may allow for more orders to be handled for a while, but the system will fail again as the number of orders increases. Running this system on multiple servers may work, or it may increase contention for shared resources, like a database, and cause the system to fail even sooner. If this application is running on a virtual machine in a public cloud, modifying the underlying hardware is an expensive approach to making it scale.

A second approach would be to run the HandleOrder method in a new thread for each order. This approach isn't terrible, but it isn't as easy as it sounds. Simply creating a new thread for each order may work, but that assumes that everything in our HandleOrder method is thread-safe. Database connections and the state around them are often not safe to share between threads, so the order billing operation is a likely trouble spot. In my experience, a lot of code is only thread-safe by accident, not because it was intentionally written that way. Using threading to scale this application may be a valid approach, but it requires diligence and understanding to get it right. The other limitation of this approach is that at some point the resources of the server will be exhausted and the app will no longer scale. When this happens, a hardware approach will be necessary.
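
The threading approach can be sketched as follows, again in Python and again under assumptions. The `ledger` dictionary is a hypothetical stand-in for shared database state, and the sketch uses a bounded thread pool rather than literally one thread per order, which keeps the thread count from growing with the load. The lock is what makes the billing step safe: without it, two threads could read and update the same totals at once.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shared state standing in for a single database connection.
ledger = {"total_billed": 0, "orders": 0}
ledger_lock = threading.Lock()


def bill_order(order_id: int, amount_cents: int) -> None:
    """Update the shared ledger; the lock makes the read-modify-write atomic."""
    with ledger_lock:
        ledger["total_billed"] += amount_cents
        ledger["orders"] += 1


def handle_order(order_id: int, amount_cents: int) -> bool:
    """Validate the order, then bill it."""
    if amount_cents <= 0:
        return False
    bill_order(order_id, amount_cents)
    return True


def handle_orders_concurrently(orders, max_workers=8):
    """Run handle_order for each (order_id, amount_cents) pair on a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda o: handle_order(*o), orders))
```

Note that the lock also serializes billing, so the threads only help with the work that happens outside it. That is the diligence this approach demands: finding every piece of shared state and deciding how to protect it.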

The third approach to making this system scale would be to break up the operations and handle them independently. We see that the order validation is the only required step for billing or shipping an order. The system can be broken up into three pieces, two of which can be called independently. The code below illustrates how the separation could occur:
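
A minimal sketch of that separation, in Python. The dictionary-shaped order, the returned strings, and the module-level thread pool are assumptions made for illustration; in a real system `bill_order` and `ship_order` would call out to separate services.

```python
from concurrent.futures import ThreadPoolExecutor

# Created once so threads are reused across orders (size is an arbitrary choice).
_pool = ThreadPoolExecutor(max_workers=4)


def validate_order(order: dict) -> bool:
    """An order is valid when it has a positive amount and an address."""
    return order.get("amount_cents", 0) > 0 and bool(order.get("address"))


def bill_order(order: dict) -> str:
    """Stand-in for the billing step."""
    return f"billed {order['id']} for {order['amount_cents']} cents"


def ship_order(order: dict) -> str:
    """Stand-in for the shipping step."""
    return f"shipped {order['id']} to {order['address']}"


def handle_order(order: dict):
    """Orchestrate: validate first, then run billing and shipping side by side."""
    if not validate_order(order):
        return None
    billing = _pool.submit(bill_order, order)
    shipping = _pool.submit(ship_order, order)
    # result() blocks until each step finishes and re-raises any exception.
    return billing.result(), shipping.result()
```

Here `handle_order` no longer does the work itself. It checks the one real dependency, validation, and then hands the two independent steps off to run at the same time.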

The code above separates the billing and shipping calls and allows them to run simultaneously since they aren't dependent on each other. (This code is overly simplistic to illustrate my point.) Breaking the system into constituent parts allows us to scale the BillOrder function separately from the ShipOrder function. As our load of orders grows, we can add capacity to whichever function is the bottleneck, using the HandleOrder function as an orchestrator for the work. In a public cloud environment, this approach would allow us to run the billing and shipping functions as either serverless functions or independent Docker containers. (Either option is typically cheaper than running full virtual machines in a public cloud.)
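
To make "scaling the functions separately" concrete, here is a small in-process sketch in Python. It assumes each function reads from its own work queue and that the number of workers per queue can be set independently. The queues and threads are stand-ins: in a cloud deployment they might be a managed message queue feeding containers or serverless functions. The helper names `run_workers` and `process` are hypothetical.

```python
import queue
import threading


def run_workers(work_queue, handler, worker_count, results):
    """Start worker_count threads that drain work_queue; None is the stop signal."""
    def worker():
        while True:
            item = work_queue.get()
            if item is None:
                break
            results.append(handler(item))  # list.append is atomic in CPython

    threads = [threading.Thread(target=worker) for _ in range(worker_count)]
    for t in threads:
        t.start()
    return threads


def process(order_ids, billing_workers=4, shipping_workers=1):
    """Fan validated orders out to independently sized billing and shipping pools."""
    billing_q, shipping_q = queue.Queue(), queue.Queue()
    billed, shipped = [], []
    bill_threads = run_workers(
        billing_q, lambda oid: ("billed", oid), billing_workers, billed)
    ship_threads = run_workers(
        shipping_q, lambda oid: ("shipped", oid), shipping_workers, shipped)

    for oid in order_ids:  # the orchestrator only enqueues work
        billing_q.put(oid)
        shipping_q.put(oid)

    # One stop signal per worker, then wait for everything to drain.
    for _ in bill_threads:
        billing_q.put(None)
    for _ in ship_threads:
        shipping_q.put(None)
    for t in bill_threads + ship_threads:
        t.join()
    return billed, shipped
```

If billing turns out to be the slow step, only `billing_workers` needs to grow; the shipping side is untouched.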

Designing for scale requires an architect to see the divisions in an application and to separate the system along these divisions. Once that separation is properly done, the system is better prepared to take advantage of cloud technologies to handle scale in a more cost-effective manner.
