Amazon Web Services: Load Balancing

So the EC2 instance host our application on the cloud and everything seems fine so far, but probably our beatiful work end up on making the application very popular so a ton of new users is willing to use our software. Eventually that will bring issues on the performance as the hardware offered by an instance used to start with an initial version won't be enough as the usage grows.

One of the ways to solve this is to add more power to our EC2 instace, this is, scaling it vertically. But what happens if the instance dies or simply reboots or if there is a hardware issue, because at the end they are only machine in someone's else home. What happens is that our app becomes unavailable and probably we'll need to put it to work again. The best approach is to have additional instances, so more can be added or removed on demand. This is, scaling our architecture horizontally. Each instance will run the same code and know how to solve each request independently.

Although we can have several replicas of the code, clients accesing our app needs an unique entry point, here is where it comes the concept of Load Balancing. A load balancer is a new piece in our architecture which works as a middle man between the outside requests and the instances running our program. AWS provides this services through Elastic Load Balancing which automatically distributes incoming application traffic across multiple Amazon EC2 instances. It enables you to achieve fault tolerance in your applications, seamlessly providing the required amount of load balancing capacity needed to route application traffic.

Read more details about AWS Elastic Load Balancer here

Autoscaling

When discussing architecture scalability, one of the options mentioned was to do it horizontally. Fortunately, Amazon provides Auto Scaling, a service that helps to maintain application availability and allows to scale the EC2 instances capacity up or down automatically according to conditions you define. You can use Auto Scaling to help ensure that you are running your desired number of Amazon EC2 instances. Auto Scaling can also automatically increase the number of Amazon EC2 instances during demand spikes to maintain performance and decrease capacity during a low usage period of the application. The most common conditions used are CPU usage and Network traffic where we can adjust the values that work better for the expected performance.

Take a look to the docs of Auto Scaling here

Next Steps

Handling all these services can be complicated, but one more time AWS will help us to keep them all together with Elastic Beanstalk