Introduction

Scalability and high availability are must-have features for web applications. Amazon Web Services (AWS) provides various tools and services for building highly scalable, highly available applications. In this tutorial, we’ll learn how to create such a web application using Amazon Auto Scaling.

What is Amazon Auto Scaling?

Amazon Auto Scaling is a service that automatically adjusts the number of instances in your Amazon EC2 Auto Scaling group based on criteria that you define. It helps you maintain application availability by scaling your Amazon EC2 capacity up or down in response to conditions such as fluctuations in traffic.

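Amazon Auto Scaling can use a variety of metrics to adjust the number of instances in your Amazon EC2 Auto Scaling group, such as average CPU utilization and network traffic. Memory utilization can also be used, but only as a custom Amazon CloudWatch metric published from the instances (for example, by the CloudWatch agent). Based on these metrics, Amazon Auto Scaling scales your application horizontally by launching or terminating instances. It does not scale vertically: increasing the capacity of an existing instance means changing its instance type, which is a separate operation.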
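
To make the idea of metric-driven scaling concrete, here is a rough sketch in Python. It uses a simplified proportional model (instance count scaled by the ratio of the observed metric to the target) and clamps the result to the group's minimum and maximum size. This is an illustration only, not the algorithm AWS actually runs, and the function name and parameters are assumptions made for this example.

```python
import math


def desired_capacity(current: int, metric: float, target: float,
                     min_size: int, max_size: int) -> int:
    """Estimate an instance count from a metric, using a simplified model.

    Assumption: load is spread evenly, so the per-instance metric scales
    inversely with the number of instances.
    """
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    # Scale the current instance count by how far the metric is from target.
    proposed = math.ceil(current * metric / target)
    # Never go below the group's minimum or above its maximum.
    return max(min_size, min(max_size, proposed))


# 4 instances at 80% average CPU with a 50% target: 4 * 80 / 50 = 6.4 -> 7
print(desired_capacity(4, 80.0, 50.0, min_size=2, max_size=10))
```

Note that the result is always a change in the number of instances, never in the size of an individual instance.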

Prerequisites

Before starting the tutorial, make sure you have:

An AWS account
Familiarity with Amazon EC2 and Auto Scaling
Familiarity with Amazon Simple Storage Service (S3)
A web application that has been deployed to an Amazon EC2 instance
A basic understanding of load balancers, which distribute incoming traffic across Amazon EC2 instances (we’ll create one in Step 5)

Step 1: Create an Amazon Machine Image (AMI)

An Amazon Machine Image (AMI) is a pre-configured virtual machine image used to create EC2 instances. In this step, we’ll create an AMI of the EC2 instance that contains the application we want to scale.

Open the Amazon EC2 console.
Select Instances from the navigation pane on the left, and then select the instance that runs the application we want to scale.
With the correct instance selected, choose Actions, then Image and templates, then Create image (the same menu is available by right-clicking the instance).
In the Create Image dialog box, enter a unique name and description for the image, and then select Create Image.
Once the image has been created successfully, it will be listed under AMIs in the EC2 console.
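
The same step can be scripted. The sketch below only builds the request parameters for the EC2 CreateImage operation; the helper name, the naming scheme, and the instance ID in the usage line are placeholders invented for illustration. It assumes you would pass the result to boto3 (the AWS SDK for Python), which is not imported here so that the snippet runs without AWS credentials.

```python
from datetime import datetime, timezone


def build_create_image_params(instance_id, app_name, now=None, no_reboot=False):
    """Build keyword arguments for the EC2 CreateImage operation.

    A timestamp is appended to the name because AMI names must be unique
    within an account and region.
    """
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%d-%H%M%S")
    return {
        "InstanceId": instance_id,
        "Name": f"{app_name}-{stamp}",
        "Description": f"Image of {app_name} taken from {instance_id}",
        # False lets EC2 reboot the instance for a consistent file system.
        "NoReboot": no_reboot,
    }


# Hypothetical instance ID; replace it with your own.
params = build_create_image_params("i-0123456789abcdef0", "webapp")
print(params["Name"])

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("ec2").create_image(**params)
```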

Step 2: Create an Auto Scaling Group

In this step, we’ll create an Auto Scaling group that will launch instances based on the AMI we just created.

Open the Amazon EC2 console.
Select the Auto Scaling groups from the navigation bar on the left and select Create an Auto Scaling group.
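In the Create Auto Scaling group dialog box, provide the necessary details such as the name of the group, the launch configuration (described in Step 3; if you have not created one yet, complete Step 3 first), and the subnets in which the group will launch instances.
Under the Advanced options, set the minimum, maximum, and desired size for the group. The minimum and maximum bound the number of instances the group may run, and the desired size is the number it tries to maintain.
For the health check type, select EC2 and set the grace period as required. Once the group is attached to a load balancer's target group, you can additionally enable ELB health checks so that instances failing the load balancer's checks are replaced.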
Once the configuration is complete, select Create Auto Scaling group.
After creation, the Auto Scaling group will be listed in the Auto Scaling groups tab.
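
As a hedged sketch of the same settings in code, the helper below assembles parameters for the Auto Scaling CreateAutoScalingGroup operation and checks that the sizes are consistent before anything is sent. The helper name, group name, and subnet IDs are placeholders made up for this example, and boto3 is assumed but deliberately not imported.

```python
def build_auto_scaling_group_params(name, launch_configuration_name, subnet_ids,
                                    min_size, max_size, desired_capacity,
                                    health_check_type="EC2",
                                    grace_period_seconds=300):
    """Build keyword arguments for the CreateAutoScalingGroup operation."""
    if not 0 <= min_size <= desired_capacity <= max_size:
        raise ValueError("expected 0 <= min_size <= desired_capacity <= max_size")
    if not subnet_ids:
        raise ValueError("at least one subnet is required")
    return {
        "AutoScalingGroupName": name,
        "LaunchConfigurationName": launch_configuration_name,
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired_capacity,
        # The API expects the subnet IDs as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnet_ids),
        "HealthCheckType": health_check_type,
        "HealthCheckGracePeriod": grace_period_seconds,
    }


# Placeholder names and subnet IDs, for illustration only.
params = build_auto_scaling_group_params(
    "web-asg", "web-lc", ["subnet-aaa", "subnet-bbb"],
    min_size=2, max_size=6, desired_capacity=2,
)
print(params["VPCZoneIdentifier"])

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").create_auto_scaling_group(**params)
```

Spreading the subnets across more than one Availability Zone is what gives the group its availability benefit: if one zone fails, instances in the other keep serving traffic.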

Step 3: Create a Launch Configuration

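A launch configuration is a blueprint that describes the instance you want to launch. In this step, we’ll create a launch configuration with the AMI we created in Step 1. Note that AWS now recommends launch templates over launch configurations for new setups; the settings described here carry over to a launch template.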

Open the Amazon EC2 console.
Select the Launch Configurations from the navigation bar on the left and select Create launch configuration.
In the Create launch configuration dialog box, select the AMI we created in step 1.
Choose the instance type, which will determine the hardware specifications of the instances.
In the Configure Details step, provide a unique name and specify the key pair, security groups, and user data as required.
Review and select Create launch configuration.
Finally, make sure that the Auto Scaling group from Step 2 uses this launch configuration.
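
For readers who prefer code, here is a minimal sketch of the same launch configuration expressed as parameters for the Auto Scaling CreateLaunchConfiguration operation. The helper name and every identifier in the usage example (AMI ID, key pair, security group, user data script) are hypothetical placeholders, and boto3 is assumed rather than imported.

```python
def build_launch_configuration_params(name, image_id, instance_type,
                                      key_name, security_group_ids,
                                      user_data=""):
    """Build keyword arguments for the CreateLaunchConfiguration operation."""
    params = {
        "LaunchConfigurationName": name,
        "ImageId": image_id,            # the AMI created in Step 1
        "InstanceType": instance_type,  # hardware size of each instance
        "KeyName": key_name,
        "SecurityGroups": list(security_group_ids),
    }
    # User data is optional, so only include it when a script is given.
    if user_data:
        params["UserData"] = user_data
    return params


# All identifiers below are placeholders, not real resources.
params = build_launch_configuration_params(
    "web-lc", "ami-0123456789abcdef0", "t3.micro",
    "my-key-pair", ["sg-0123456789abcdef0"],
    user_data="#!/bin/bash\nsystemctl start webapp\n",
)
print(sorted(params))

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").create_launch_configuration(**params)
```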

Step 4: Create a Target Group

A target group is a set of targets (such as EC2 instances or containers) that a load balancer routes requests to, together with the health check settings used for those targets. In this step, we’ll create a target group that will be used by the load balancer to distribute incoming traffic.

Open the Amazon EC2 console.
Select Target Groups from the left navigation pane and then select Create target group.
Enter a name for the group.
Under the Target type, select Instances.
Choose the protocol and port on which the instances in the group will receive traffic.
Under Health checks, specify how the health of an instance is determined, such as the protocol and path to request, the interval between checks, and the number of consecutive results needed to mark an instance healthy or unhealthy.
After the configuration is complete, select Create target group.
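
The following sketch shows how these choices map onto parameters for the Elastic Load Balancing CreateTargetGroup operation. The helper name, the default values, and the VPC ID are assumptions chosen for illustration; boto3 is assumed but not imported.

```python
def build_target_group_params(name, vpc_id, port=80, protocol="HTTP",
                              health_check_path="/", interval_seconds=30,
                              healthy_threshold=3, unhealthy_threshold=3):
    """Build keyword arguments for the CreateTargetGroup operation."""
    if len(name) > 32:
        raise ValueError("target group names are limited to 32 characters")
    return {
        "Name": name,
        "Protocol": protocol,
        "Port": port,
        "VpcId": vpc_id,
        "TargetType": "instance",  # register EC2 instances as targets
        "HealthCheckProtocol": protocol,
        "HealthCheckPath": health_check_path,
        "HealthCheckIntervalSeconds": interval_seconds,
        # Consecutive successes or failures needed to change health state.
        "HealthyThresholdCount": healthy_threshold,
        "UnhealthyThresholdCount": unhealthy_threshold,
    }


# Placeholder VPC ID and a hypothetical /health endpoint.
params = build_target_group_params("web-targets", "vpc-0123456789abcdef0",
                                   health_check_path="/health")
print(params["HealthCheckPath"])

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("elbv2").create_target_group(**params)
```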

Step 5: Create a Load Balancer

A load balancer distributes incoming traffic across a group of instances or containers. In this step, we’ll create a load balancer that will distribute incoming traffic to the instances running in the Auto Scaling group.

Open the Amazon EC2 console.
Select Load Balancers from the left navigation pane, and then select Create Load Balancer.
Choose the type of load balancer you want to create. For HTTP and HTTPS web traffic, an Application Load Balancer is the usual choice.
Under the Basic Configuration select:
1. A name for the load balancer
2. A VPC for the load balancer
3. At least one listener to use with the load balancer that specifies a protocol and port for front-end (client to load balancer) and back-end (load balancer to instances) traffic.
In the Configure Security Settings step, select the certificate and security policy for HTTPS traffic. This applies only if you added an HTTPS listener.
In the Configure Security Groups step, select the security group(s) that should be associated with the load balancer.
In the Configure Routing step, select the target group created in step 4.
Review and create the load balancer.
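
The steps above can be sketched as two sets of request parameters: one for the Elastic Load Balancing CreateLoadBalancer operation and one for CreateListener, which forwards traffic to the target group from Step 4. This assumes an internet-facing Application Load Balancer; the helper names and all IDs and ARNs are placeholders, and boto3 is assumed but not imported.

```python
def build_load_balancer_params(name, subnet_ids, security_group_ids):
    """Build keyword arguments for the CreateLoadBalancer operation."""
    if len(subnet_ids) < 2:
        raise ValueError("an Application Load Balancer needs subnets "
                         "in at least two Availability Zones")
    return {
        "Name": name,
        "Subnets": list(subnet_ids),
        "SecurityGroups": list(security_group_ids),
        "Scheme": "internet-facing",
        "Type": "application",
    }


def build_listener_params(load_balancer_arn, target_group_arn,
                          port=80, protocol="HTTP"):
    """Build keyword arguments for the CreateListener operation."""
    return {
        "LoadBalancerArn": load_balancer_arn,
        "Protocol": protocol,
        "Port": port,
        # Forward every request on this listener to the target group.
        "DefaultActions": [
            {"Type": "forward", "TargetGroupArn": target_group_arn},
        ],
    }


# Placeholder IDs and ARNs, for illustration only.
lb_params = build_load_balancer_params(
    "web-alb", ["subnet-aaa", "subnet-bbb"], ["sg-0123456789abcdef0"])
listener_params = build_listener_params("example-lb-arn", "example-tg-arn")
print(listener_params["DefaultActions"][0]["Type"])

# With boto3 installed and credentials configured, these could be sent as:
#   elbv2 = boto3.client("elbv2")
#   elbv2.create_load_balancer(**lb_params)
#   elbv2.create_listener(**listener_params)
```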

Step 6: Configure Auto Scaling Policies

Auto Scaling policies define how the number of instances in the Auto Scaling group should be increased or decreased based on different criteria. In this step, we’ll create a scaling policy that scales the application out when load rises and back in when it falls.

Open the Amazon EC2 console.
Select the Auto Scaling groups from the left navigation pane, and then select the Auto Scaling Group we created in step 2.
Select the Scaling Policies tab, and then select Create Scaling Policy.
In the Create Scaling Policy dialog box, select the Target Tracking Policy type.
Provide a name for the policy and select the target metric for auto-scaling such as CPU utilization or network traffic.
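Set the target value for the selected metric, for example 50 percent average CPU utilization. Auto Scaling then adds or removes instances to keep the metric close to this value.
Set the instance warmup time as required, so that newly launched instances are not counted toward the metric until they are ready to serve traffic. Cooldown periods apply to simple scaling policies rather than to target tracking.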
Click Create and the policy will be created based on the provided configuration.
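
In code, a target tracking policy is described by a metric and a target value for that metric. The sketch below builds parameters for the Auto Scaling PutScalingPolicy operation using the predefined average CPU metric. The helper name, policy name, and the 50 percent default are assumptions for this example, and boto3 is assumed but not imported.

```python
def build_target_tracking_policy_params(asg_name, policy_name,
                                        target_cpu_percent=50.0,
                                        warmup_seconds=300):
    """Build keyword arguments for the PutScalingPolicy operation."""
    if not 0 < target_cpu_percent <= 100:
        raise ValueError("target_cpu_percent must be in the range (0, 100]")
    return {
        "AutoScalingGroupName": asg_name,
        "PolicyName": policy_name,
        "PolicyType": "TargetTrackingScaling",
        # Time a new instance needs before its metrics are counted.
        "EstimatedInstanceWarmup": warmup_seconds,
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageCPUUtilization",
            },
            "TargetValue": float(target_cpu_percent),
            # False allows the policy to remove instances as well as add them.
            "DisableScaleIn": False,
        },
    }


# Placeholder group and policy names.
params = build_target_tracking_policy_params("web-asg", "cpu-50-percent")
print(params["TargetTrackingConfiguration"]["TargetValue"])

# With boto3 installed and credentials configured, this could be sent as:
#   boto3.client("autoscaling").put_scaling_policy(**params)
```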

Step 7: Test the Setup

Now that all the necessary configurations are done, we need to test the setup to ensure that it’s working as expected. Start by making sure that traffic is being routed through the load balancer. Visit the DNS name of the load balancer to confirm this is the case.

After verifying that the load balancer is routing traffic, we can test the scaling policies by simulating high traffic on the application. This can be done by using the Apache benchmarking tool (ab) to generate a high number of concurrent requests to the load balancer, while watching the Auto Scaling group's Activity tab to see instances being launched.
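
If the Apache benchmarking tool is not available, a small script can generate similar load. The following is a minimal sketch using only the Python standard library; the function names are invented for this example, and the URL in the usage comment is a placeholder for your load balancer's DNS name. It is a rough traffic generator under these assumptions, not a substitute for a proper benchmarking tool.

```python
import time
import urllib.error
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch_status(url, timeout=5.0):
    """Send one GET request and return the HTTP status code (0 on failure)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # 4xx and 5xx responses still carry a status code
    except OSError:
        return 0  # connection errors and timeouts


def run_load(url, total_requests, concurrency, fetch=fetch_status):
    """Send total_requests GET requests using `concurrency` worker threads."""
    started = time.monotonic()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        statuses = list(pool.map(fetch, [url] * total_requests))
    elapsed = time.monotonic() - started
    ok = sum(1 for status in statuses if 200 <= status < 300)
    return {
        "requests": total_requests,
        "ok": ok,
        "failed": total_requests - ok,
        "seconds": elapsed,
    }


# Example call (placeholder URL; replace with your load balancer's DNS name):
#   print(run_load("http://my-load-balancer.example.com/", 10000, 100))
```

The load needs to be sustained for several minutes, because scaling decisions are based on metrics aggregated over time rather than on momentary spikes.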

If the benchmark indicates that the application has trouble handling peak traffic, we can revisit the scaling policies and adjust them, for example by lowering the target value or raising the group's maximum size, to better suit the specific traffic patterns.

Conclusion

In this tutorial, we learned how to use the Amazon Auto Scaling service to create a highly scalable, highly available application on AWS. We created an Auto Scaling group and a launch configuration to launch instances based on an Amazon Machine Image (AMI). We also created a target group and a load balancer to distribute incoming traffic to the instances in the Auto Scaling group. Finally, we configured a scaling policy to increase or decrease the number of instances based on a chosen metric.

By following this tutorial, you now have a highly-scalable, highly-available application running on AWS that automatically adjusts to changing traffic conditions.