This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Auto Scaling Concepts

Auto Scaling Concepts

This section discusses Auto Scaling concepts and terminology.

1 - Understanding Launch Configurations

The launch configuration defines the settings used by the Eucalyptus instances launched within an Auto Scaling group. This includes the image name, the instance type, key pairs, security groups, and block device mappings. You associate the launch configuration with an Auto Scaling group. Each Auto Scaling group can have one (and only one) associated launch configuration.

Once you’ve created a launch configuration, you can’t change it; you must create a new launch configuration and then associate it with an Auto Scaling group. After you’ve created and attached a new launch configuration to an Auto Scaling group, any new instances will be launched using parameters defined in the new launch configuration; existing instances in the Auto Scaling group are not affected.

For more information, see Creating a Basic Auto Scaling Configuration .

2 - Understanding Auto Scaling Groups

An Auto Scaling group is the central component of Auto Scaling. An Auto Scaling group defines the parameters for the Eucalyptus instances are used for scaling, as well as the minimum, maximum, and (optionally) the desired number of instances to use for Auto Scaling your application. If you don’t specify the desired number of instances in your Auto Scaling group, the default value will be the same as the minimum number of instances defined.

In addition to instance and capacity definitions, each Auto Scaling group must specify one (and only one) launch configuration.

For more information, see Creating a Basic Auto Scaling Configuration .

3 - Understanding Auto Scaling Policies

An Auto Scaling policy defines how to perform scaling actions in response to CloudWatch alarms. Auto scaling policies can either scale-in , which terminates instances in your Auto Scaling group, or scale-out , which will launch new instances in your Auto Scaling group. You can define an Auto Scaling policy based on demand, or based on a fixed schedule.

Demand-Based Auto Scaling Policies

Demand-based Auto Scaling policies scale your application dynamically based on CloudWatch metrics (such as average CPU utilization) gathered from the instances running in your scaling group. For example, you can configure a CloudWatch alarm to monitor the CPU usages of the instances in your Auto Scaling group. A CloudWatch alarm definition includes thresholds - minimum and maximum values for the defined metric - that will cause the alarm to fire.

For example, you can define the lower and upper thresholds of the CloudWatch alarm at 40% and 80% CPU usage. Once you’ve created the CloudWatch alarm, you can create a scale-out policy that launches 10 new instances when the CloudWatch alarm breaches the upper threshold (the average CPU usage is at or above 80%), and a scale-in policy that terminates 10 instances when the CloudWatch alarm breaches the lower threshold (the average CPU usage of the instances in the Auto Scaling group falls below 40%). Your Auto Scaling group will execute the appropriate Auto Scaling policy when it receives the message from the CloudWatch alarm.

Cooldown Period

A cooldown period is the amount of time after an Auto Scaling activity takes place where further Auto Scaling activity is suspended. This is to allow time for the Auto Scaling activities (such as new instance launches or terminations) to fully complete so that resources are not unnecessarily launched or terminated. You can specify this amount of time; if you don’t specify a cooldown period, Auto Scaling uses a default cooldown period of 300 seconds (5 minutes).

For more information, go to Configuring a Demand-Based Scaling Policy .

4 - Understanding Health Checks

Auto scaling periodically performs health checks on the instances in your Auto Scaling group. By default, Auto Scaling group determines the health state of each instance by periodically checking the results of Eucalyptus instance status checks. When Auto Scaling determines that an instance is unhealthy, it terminates that instance and launches a new one.

If your Auto Scaling group is using elastic load balancing, Auto Scaling can determine the health status of the instances by checking the results of both Eucalyptus instance status checks and elastic load balancing instance health.

Auto scaling determines an instance is unhealthy if the calls to either Eucalyptus instance status checks or elastic load balancing instance health status checks return any state other than OK (or  InService ).

If there are multiple elastic load balancers associated with your Auto Scaling group, Auto Scaling will make health check calls to each load balancer. If any of the health check calls return any state other than  InService , that instance will be marked as unhealthy. After Auto Scaling marks an instance as unhealthy, it will remain marked in that state, even if subsequent calls from other load balancers return an  InService  state for the same instance.

For more information, go to Configuring Health Checks .

5 - Understanding Instance Termination Policies

Before Auto Scaling selects an instance to terminate, it first identifies the availability zone used by the group that contains the most instances. If all availability zones have the same number of instances, Auto Scaling selects a random availability zone, and then uses the termination policy to select the instance within that randomly selected availability zone for termination.

By default, Auto Scaling chooses the instance that was launched with the oldest launch configuration. If more than one instance was launched using the oldest launch configuration, Auto Scaling the instance that was created first using the oldest launch configuration.

You can override the default instance termination policy for your Auto Scaling group with one of the following options:

OptionDescription
OldestInstanceThe oldest instance in the Auto Scaling group should be terminated.
NewestInstanceThe newest instance in the Auto Scaling group should be terminated.
OldestLaunchConfigurationThe first instance created using the oldest launch configuration should be terminated.
DefaultUse the default instance termination policy.

You can have more than one termination policy per Auto Scaling group. The termination policies are executed in the order they were created.

For more information, go to Configuring an Instance Termination Policy .