Using CloudWatch
CloudWatch is a Eucalyptus service that enables you to monitor, manage, and publish various metrics, as well as configure alarm actions based on data from metrics. You can use the default metrics that come with Eucalyptus, or you can use your own metrics.
CloudWatch collects raw data from your cloud’s resources and turns it into readable, near real-time metrics. These metrics are recorded for a period of two weeks, which gives you access to historical information and shows how your resources are performing.
To find out more about what CloudWatch is, see CloudWatch Overview.
To find out how to use CloudWatch, see CloudWatch Tasks.
1 - CloudWatch Overview
This section describes the concepts and details you need to understand the CloudWatch service. This section also includes procedures to complete the most common tasks for CloudWatch.
CloudWatch is a Eucalyptus service that collects, aggregates, and dispenses data from your cloud’s resources. This data allows you to make operational and business decisions based on actual performance metrics. You can use CloudWatch to collect metrics about your cloud resources, such as the performance of your instances. You can also publish your own metrics directly to CloudWatch.
CloudWatch monitors the following cloud resources:
- instances
- Elastic Block Store (EBS) volumes
- Auto Scaling instances
- Eucalyptus Load Balancers (ELB)
Alarms
CloudWatch alarms help you make decisions more easily by automatically making changes to the resources you are monitoring, based on rules that you define. For example, you can create alarms that initiate Auto Scaling actions on your behalf. For more information about alarm tasks, see Configuring Alarms.
Common Use Cases
A common use for CloudWatch is to keep your applications and services healthy and running efficiently. For example, CloudWatch can determine that your application runs best when network traffic remains below a certain threshold level on your instances. You can then create an automated procedure to ensure that you always have the right number of instances to match the amount of traffic you have.
Another use for CloudWatch is to diagnose problems by looking at system performance before and after a problem occurs. CloudWatch helps you identify the cause and verify your fix by tracking performance in real time.
1.1 - CloudWatch Concepts
This section describes the terminology and concepts you need in order to understand and use CloudWatch.
Metric
A metric is a time-ordered set of data points. You can get metric data from Eucalyptus cloud resources (like instances or volumes), or you can publish your own set of custom metric data points to CloudWatch. You then retrieve statistics about those data points as an ordered set of time-series data.
Data points represent values of a variable over time. For example, you can get metrics for the CPU usage of a particular instance, or for the latency of an elastic load balancer (ELB).
Each metric is uniquely defined by a name, a namespace, and zero or more dimensions. Each data point has a time stamp and, optionally, a unit of measure. When you request statistics, the returned data stream is identified by namespace, metric name, dimensions, and, optionally, the unit. For more information about Eucalyptus-supported metrics, see Namespaces, Metrics, and Dimensions.
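The identity rule above can be illustrated with a minimal Python sketch. The function name `metric_key` and the tuple layout are hypothetical and chosen only for illustration; they are not part of CloudWatch or Euca2ools.

```python
from typing import Dict, Optional, Tuple

MetricKey = Tuple[str, str, Tuple[Tuple[str, str], ...]]


def metric_key(namespace: str, name: str,
               dimensions: Optional[Dict[str, str]] = None) -> MetricKey:
    """Build a hashable identity from namespace, metric name, and dimensions."""
    # Sort the dimension pairs so that their order does not affect identity.
    dims = tuple(sorted((dimensions or {}).items()))
    return (namespace, name, dims)


# Same name and namespace, different dimension value: two distinct metrics.
first = metric_key("AWS/EC2", "CPUUtilization", {"InstanceId": "i-5431413d"})
second = metric_key("AWS/EC2", "CPUUtilization", {"InstanceId": "i-1d3d4d74"})
# Same name and dimensions, different namespace: also a distinct metric.
third = metric_key("TestService", "CPUUtilization", {"InstanceId": "i-5431413d"})

print(first != second, first != third)
```

Because the key is hashable, it can serve as a dictionary key when grouping data points by metric on the client side.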
CloudWatch stores your metric data for two weeks. You can publish metric data from multiple sources, such as incoming network traffic from dozens of different instances, or requested page views from several different web applications. You can request statistics on metric data points that occur within a specified time window.
Namespace
A namespace is a conceptual container for a collection of metrics. Eucalyptus treats metrics in different namespaces as unique. This means that metrics from different services cannot mistakenly be aggregated into the same statistical set.
Namespace names are strings you define when you create a metric. The names must consist of valid XML characters, typically the alphanumeric characters “0-9A-Za-z” plus “.” (period), “-” (hyphen), “_” (underscore), “/” (slash), “#” (hash), and “:” (colon). All Eucalyptus services that provide CloudWatch data use namespaces that begin with AWS/, such as AWS/EC2 and AWS/ELB. For more information, see Namespaces.
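As a rough client-side check before publishing, the naming rules can be sketched in Python. This is a conservative assumption, not the service's own validation: it accepts only the characters listed above and applies the under-256-character limit that this section states, while CloudWatch itself may accept other valid XML characters. The function name is hypothetical.

```python
import re

# Characters listed in this section: alphanumerics plus . - _ / # :
_NAMESPACE_PATTERN = re.compile(r"[0-9A-Za-z._/#:-]+")


def looks_like_valid_namespace(name: str) -> bool:
    """Return True if the name is non-empty, under 256 characters, and uses only listed characters."""
    return len(name) < 256 and _NAMESPACE_PATTERN.fullmatch(name) is not None


print(looks_like_valid_namespace("AWS/EC2"))
print(looks_like_valid_namespace("Test Service"))  # contains a space
```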
Note
A namespace name must be less than 256 characters. There is no default namespace. You must specify a namespace for each data element you put into CloudWatch.
Dimension
A dimension is a name-value pair that uniquely identifies a metric. A dimension helps you design a conceptual structure for your statistics plan. Because dimensions are part of the unique identifier for a metric, metric name, namespace, and dimension key-value pairs define unique metrics.
You specify dimensions when you create a metric with the euwatch-put-data command. Eucalyptus services that report data to CloudWatch also attach dimensions to each metric. You can use dimensions to filter result sets that CloudWatch queries return. For example, you can get statistics for a specific instance by calling euwatch-get-stats with the InstanceID dimension set to a specific instance ID.
For Eucalyptus metrics, CloudWatch can aggregate data across select dimensions. For example, if you request a metric in the AWS/EC2 namespace and do not specify any dimensions, CloudWatch aggregates all data for the specified metric to create the statistic that you requested. However, CloudWatch does not aggregate across dimensions for custom metrics.
Note
You can assign up to ten dimensions to a metric.
Time Stamp
Each metric data point must be marked with a time stamp. The time stamp can be up to two weeks in the past and up to two hours into the future. If you do not provide a time stamp, CloudWatch creates a time stamp for you based on the time the data element was received.
The time stamp you use in the request must be a dateTime object, with the complete date plus hours, minutes, and seconds. For example: 2007-01-31T23:59:59Z. For more information, go to http://www.w3.org/TR/xmlschema-2/#dateTime. Although it is not required, we recommend you provide the time stamp in the Coordinated Universal Time (UTC or Greenwich Mean Time) time zone. When you retrieve your statistics from CloudWatch, all times reflect the UTC time zone.
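The accepted time stamp window can be checked on the client before publishing. The following Python sketch assumes UTC time stamps in the `2007-01-31T23:59:59Z` form shown above; the function names are hypothetical, and whether the limits are inclusive at the boundary is an assumption.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE = timedelta(weeks=2)      # up to two weeks in the past
MAX_FUTURE = timedelta(hours=2)   # up to two hours into the future


def parse_timestamp(text: str) -> datetime:
    """Parse a UTC dateTime string such as 2007-01-31T23:59:59Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def timestamp_accepted(timestamp: datetime, now: datetime) -> bool:
    """Return True if the time stamp falls inside the accepted window around now."""
    return now - MAX_AGE <= timestamp <= now + MAX_FUTURE


now = parse_timestamp("2007-02-10T00:00:00Z")
print(timestamp_accepted(parse_timestamp("2007-01-31T23:59:59Z"), now))
print(timestamp_accepted(parse_timestamp("2007-01-01T00:00:00Z"), now))
```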
Unit
A unit represents a statistic’s measurement in time or amount. For example, the unit for the instance NetworkIn metric is Bytes because NetworkIn tracks the number of bytes that an instance receives on all network interfaces.
You can also specify a unit when you create a custom metric. Units help provide conceptual meaning to your data. Metric data points you create that specify a unit of measure, such as Percent, will be aggregated separately. The following list provides some of the more common units that CloudWatch supports:
- Seconds
- Bytes
- Bits
- Percent
- Count
- Bytes/Second (bytes per second)
- Bits/Second (bits per second)
- Count/Second (counts per second)
- None (default when no unit is specified)
Though CloudWatch attaches no significance to a unit, other applications can derive semantic information based on the unit you choose. When you publish data without specifying a unit, CloudWatch associates it with the None unit. When you get statistics without specifying a unit, CloudWatch aggregates all data points of the same unit together. If you have two otherwise identical metrics with different units, two separate data streams will be returned, one for each unit.
Statistic
A statistic is a computed aggregation of metric data over a specified period of time. CloudWatch provides statistics based on the metric data points you or Eucalyptus provide. Aggregations are made using the namespace, metric name, dimensions, and the data point unit of measure, within the time period you specify. The following table describes the available statistics.
Statistic | Description |
---|
Minimum | The lowest value observed during the specified period. You can use this value to determine low volumes of activity for your application. |
Maximum | The highest value observed during the specified period. You can use this value to determine high volumes of activity for your application. |
Sum | All values submitted for the matching metric added together. You can use this statistic for determining the total volume of a metric. |
Average | The value of Sum / SampleCount during the specified period. By comparing this statistic with the Minimum and Maximum, you can determine the full scope of a metric and how close the average use is to the Minimum and Maximum. This comparison helps you to know when to increase or decrease your resources as needed. |
SampleCount | The count (number) of data points used for the statistical calculation. |
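The relationship between the five statistics in the table can be sketched in a few lines of Python. The function name is hypothetical; the computation follows the table's definitions, including Average as Sum / SampleCount.

```python
from typing import Dict, Sequence


def compute_statistics(values: Sequence[float]) -> Dict[str, float]:
    """Compute the five statistics from the table for the data points of one period."""
    if not values:
        raise ValueError("at least one data point is required")
    total = sum(values)
    count = len(values)
    return {
        "Minimum": min(values),
        "Maximum": max(values),
        "Sum": total,
        "SampleCount": count,
        "Average": total / count,  # Sum / SampleCount
    }


stats = compute_statistics([2, 4, 5])
print(stats["Sum"], stats["SampleCount"])
```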
Period
A period is the length of time, in seconds, associated with a specific CloudWatch statistic. Each statistic represents an aggregation of the metrics data collected for a specified period of time. You can adjust how the data is aggregated by varying the length of the period. A period can be as short as one minute (60 seconds) or as long as two weeks (1,209,600 seconds).
The values you select for the StartTime and EndTime options determine how many periods CloudWatch returns. For example, if you set Period to 60 seconds and set StartTime and EndTime to span the previous hour, CloudWatch returns an aggregated set of statistics for each minute of that hour. If you want statistics aggregated into ten-minute blocks, set Period to 600. For statistics aggregated over the entire hour, use a Period value of 3600.
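The arithmetic behind these examples is the length of the time window divided by the period. A minimal Python sketch follows, using the period limits given in this section; the function name is hypothetical, and rounding a partial trailing period up is an assumption.

```python
import math
from datetime import datetime, timedelta

MIN_PERIOD_SECONDS = 60         # one minute
MAX_PERIOD_SECONDS = 1209600    # two weeks


def period_count(start: datetime, end: datetime, period_seconds: int) -> int:
    """Return how many periods of the given length cover the window from start to end."""
    if not MIN_PERIOD_SECONDS <= period_seconds <= MAX_PERIOD_SECONDS:
        raise ValueError("period must be between 60 and 1209600 seconds")
    if end <= start:
        raise ValueError("end must be later than start")
    return math.ceil((end - start).total_seconds() / period_seconds)


start = datetime(2016, 12, 14, 23, 0, 0)
end = start + timedelta(hours=1)
for period in (60, 600, 3600):
    print(period, period_count(start, end, period))
```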
Periods are also an important part of the CloudWatch alarms feature. When you create an alarm to monitor a specific metric, you are asking CloudWatch to compare that metric to the threshold value that you supplied. You have control over how CloudWatch makes that comparison. You can specify the period over which the comparison is made, as well as how many consecutive periods the threshold must be breached before you are notified.
Aggregation
CloudWatch aggregates statistics according to a length of time that you set. You can publish as many data points as you want with the same or similar time stamps. CloudWatch aggregates these data points by period length. You can publish data points for a metric that share not only the same time stamp, but also the same namespace and dimensions.
Subsequent calls to euwatch-get-stats return aggregated statistics about those data points. You can even do this in one euwatch-put-data request. CloudWatch accepts multiple data points in the same euwatch-put-data call with the same time stamp. You can also publish multiple data points for the same or different metrics, with any time stamp. The size of a euwatch-put-data request, however, is limited to 8KB for HTTP GET requests and 40KB for HTTP POST requests. You can include a maximum of 20 data points in one PutMetricData request.
For large data sets that would make the use of euwatch-put-data impractical, CloudWatch allows you to insert a pre-aggregated data set called a StatisticSet. With StatisticSets you give CloudWatch the Min, Max, Sum, and SampleCount of a number of data points. A common use case for StatisticSets is when you are collecting data many times in a minute. For example, if you have a metric for the request latency of a server, it doesn’t make sense to do a euwatch-put-data request with every request. We suggest you collect the latency of all hits to that server, aggregate them together once a minute and send that StatisticSet to CloudWatch.
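One way to follow this suggestion is to keep running totals in the application and flush them once a minute. The Python sketch below is only an illustration: the class name `LatencyAggregator` and its methods are hypothetical, and publishing the flushed set is left to the caller.

```python
from typing import Dict, Optional


class LatencyAggregator:
    """Accumulate request latencies and emit one StatisticSet per flush."""

    def __init__(self) -> None:
        self._reset()

    def _reset(self) -> None:
        self._minimum: Optional[float] = None
        self._maximum: Optional[float] = None
        self._sum = 0.0
        self._count = 0

    def record(self, latency_seconds: float) -> None:
        """Fold one observed latency into the running totals."""
        if self._count == 0:
            self._minimum = self._maximum = latency_seconds
        else:
            self._minimum = min(self._minimum, latency_seconds)
            self._maximum = max(self._maximum, latency_seconds)
        self._sum += latency_seconds
        self._count += 1

    def flush(self) -> Optional[Dict[str, float]]:
        """Return the StatisticSet for the interval and start a new one.

        Returns None when no requests were recorded.
        """
        if self._count == 0:
            return None
        result = {
            "Minimum": self._minimum,
            "Maximum": self._maximum,
            "Sum": self._sum,
            "SampleCount": self._count,
        }
        self._reset()
        return result


aggregator = LatencyAggregator()
for latency in (0.12, 0.30, 0.08):
    aggregator.record(latency)
print(aggregator.flush())
```

Calling `flush()` from a once-a-minute timer yields one set per minute regardless of how many requests were served.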
CloudWatch doesn’t differentiate the source of a metric. If you publish a metric with the same namespace and dimensions from different sources, CloudWatch treats this as a single metric. This can be useful for service metrics in a distributed, scaled system. For example, all the hosts in a web server application could publish identical metrics representing the latency of requests they are processing. CloudWatch treats these as a single metric, allowing you to get the statistics for minimum, maximum, average, and sum of all requests across your application.
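The effect of treating several sources as one metric can be modeled by merging their statistic sets. This Python sketch is a simplified model of the behavior described above, not CloudWatch's implementation; the function names and the sample host values are hypothetical.

```python
from typing import Dict, Iterable


def merge_statistic_sets(sets: Iterable[Dict[str, float]]) -> Dict[str, float]:
    """Combine statistic sets published by different sources for the same metric."""
    items = list(sets)
    if not items:
        raise ValueError("at least one statistic set is required")
    return {
        "Minimum": min(s["Minimum"] for s in items),
        "Maximum": max(s["Maximum"] for s in items),
        "Sum": sum(s["Sum"] for s in items),
        "SampleCount": sum(s["SampleCount"] for s in items),
    }


def average(statistic_set: Dict[str, float]) -> float:
    """Average is Sum divided by SampleCount."""
    return statistic_set["Sum"] / statistic_set["SampleCount"]


host_a = {"Minimum": 2, "Maximum": 5, "Sum": 11, "SampleCount": 3}
host_b = {"Minimum": 1, "Maximum": 9, "Sum": 10, "SampleCount": 2}
combined = merge_statistic_sets([host_a, host_b])
print(combined, average(combined))
```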
Alarm
An alarm watches a single metric over a time period you set, and performs one or more actions based on the value of the metric relative to a given threshold over a number of time periods. CloudWatch alarms will not invoke actions just because they are in a particular state. The state must have changed and been maintained for a specified number of periods.
For example, Auto Scaling works with CloudWatch alarms to perform scaling activities. When an Auto Scaling activity reacts to a CloudWatch alarm, the cooldown period is the amount of time after the activity takes place where further Auto Scaling activity is suspended. This is to allow time for the Auto Scaling activities (such as new instance launches or terminations) to fully complete so that resources are not unnecessarily launched or terminated. You can specify this amount of time; if you don’t specify a cooldown period, Auto Scaling uses a default cooldown period of 300 seconds (5 minutes).
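The cooldown rule can be sketched as a simple time comparison. This Python fragment is a simplified model with hypothetical names; it uses the 300-second default mentioned above and plain numeric time stamps in seconds.

```python
DEFAULT_COOLDOWN_SECONDS = 300  # default cooldown of 5 minutes


def in_cooldown(last_activity_time: float, now: float,
                cooldown_seconds: float = DEFAULT_COOLDOWN_SECONDS) -> bool:
    """Return True while further scaling activity should stay suspended."""
    return now - last_activity_time < cooldown_seconds


print(in_cooldown(last_activity_time=1000, now=1200))  # 200 s after the activity
print(in_cooldown(last_activity_time=1000, now=1300))  # 300 s after the activity
```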
After an alarm invokes an action due to a change in state, the alarm continues to invoke the action for every period that the alarm remains in the new state.
An alarm has three possible states:
: The metric is within the defined threshold.
: The metric is outside of the defined threshold.
: The alarm has just started, the metric is not available, or not enough data is available for the metric to determine the alarm state.
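The way a state follows from a threshold and a number of consecutive periods can be sketched in Python. This is a simplified model under stated assumptions, not CloudWatch's evaluation logic: the state names OK, ALARM, and INSUFFICIENT_DATA follow the CloudWatch API naming and are not spelled out in this section, the comparison is fixed to "greater than", and a missing period is represented by `None`.

```python
from typing import Optional, Sequence


def evaluate_alarm(period_values: Sequence[Optional[float]],
                   threshold: float,
                   evaluation_periods: int) -> str:
    """Return the alarm state implied by the most recent periods.

    period_values holds one aggregated value per period, oldest first.
    """
    if evaluation_periods < 1:
        raise ValueError("evaluation_periods must be at least 1")
    recent = list(period_values)[-evaluation_periods:]
    # Too few periods, or a period without data: the state cannot be determined.
    if len(recent) < evaluation_periods or any(v is None for v in recent):
        return "INSUFFICIENT_DATA"
    # The threshold must be breached in every one of the consecutive periods.
    if all(v > threshold for v in recent):
        return "ALARM"
    return "OK"


print(evaluate_alarm([50, 85, 90, 95], threshold=80, evaluation_periods=3))
print(evaluate_alarm([50, 85, 70, 95], threshold=80, evaluation_periods=3))
```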
The following lists some common features of alarms:
You can create up to 5000 alarms per Eucalyptus account. To create or update an alarm, use the command.
You can list any or all of the currently configured alarms, and list any alarms in a particular state using the command.
You can disable and enable alarms by using the and commands
You can test an alarm by setting it to any state using the command. This temporary state change lasts only until the next alarm comparison occurs.
Finally, you can view an alarm’s history using the command. CloudWatch preserves alarm history for two weeks. Each state transition is marked with a unique time stamp. In rare cases, your history might show more than one notification for a state change. The time stamp enables you to confirm unique state changes.
1.2 - Namespaces, Metrics, and Dimensions
This section discusses the namespaces, metrics, and dimensions that CloudWatch supports for Eucalyptus services.
1.2.1 - Namespaces
All Eucalyptus services that provide CloudWatch data use a namespace string, beginning with “AWS/”. This section describes the service namespaces.
The following table lists the namespaces for services that push metric data points to CloudWatch.
Service | Namespace |
---|
Elastic Block Store | AWS/EBS |
Elastic Compute Cloud | AWS/EC2 |
Auto Scaling | AWS/Autoscaling |
Elastic Load Balancing | AWS/ELB |
1.2.2 - Instance Metrics and Dimensions
This section describes the instance metrics and dimensions available to CloudWatch.
Available Metrics for Instances
Metric | Description | Unit |
---|
CPUUtilization | The percentage of allocated EC2 compute units that are currently in use on the instance. This metric identifies the processing power required to run an application upon a selected instance. | Percent |
DiskReadOps | Completed read operations from all ephemeral disks available to the instance (if your instance uses EBS, see EBS Metrics). This metric identifies the rate at which an application reads a disk. This can be used to determine the speed at which an application reads data from a hard disk. | Count |
DiskWriteOps | Completed write operations to all ephemeral disks available to the instance (if your instance uses EBS, see EBS Metrics). This metric identifies the rate at which an application writes to a hard disk. This can be used to determine the speed at which an application saves data to a hard disk. | Count |
DiskReadBytes | Bytes read from all ephemeral disks available to the instance (if your instance uses EBS, see EBS Metrics). This metric is used to determine the volume of the data the application reads from the hard disk of the instance. This can be used to determine the speed of the application. | Bytes |
DiskWriteBytes | Bytes written to all ephemeral disks available to the instance (if your instance uses EBS, see EBS Metrics). This metric is used to determine the volume of the data the application writes onto the hard disk of the instance. This can be used to determine the speed of the application. | Bytes |
NetworkIn | The number of bytes received on all network interfaces by the instance. This metric identifies the volume of incoming network traffic to an application on a single instance. | Bytes |
NetworkOut | The number of bytes sent out on all network interfaces by the instance. This metric identifies the volume of outgoing network traffic to an application on a single instance. | Bytes |
StatusCheckFailed | A combination of StatusCheckFailed_Instance and StatusCheckFailed_System that reports if either of the status checks has failed. Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure. | Count |
StatusCheckFailed_Instance | Reports whether the instance has passed the EC2 instance status check in the last 5 minutes. Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure. | Count |
StatusCheckFailed_System | Reports whether the instance has passed the EC2 system status check in the last 5 minutes. Values for this metric are either 0 (zero) or 1 (one). A zero indicates that the status check passed. A one indicates a status check failure. | Count |
Available Dimensions for Instances
You can filter the instance data using any of the dimensions in the following table.
Dimension | Description |
---|
AutoScalingGroupName | This dimension filters the data you request for all instances in a specified capacity group. An AutoScalingGroup is a collection of instances you define if you’re using the Auto Scaling service. This dimension is available only for instance metrics when the instances are in such an Auto Scaling group. |
ImageId | This dimension filters the data you request for all instances running this Eucalyptus Machine Image (EMI). |
InstanceId | This dimension filters the data you request for the identified instance only. This helps you pinpoint an exact instance from which to monitor data. |
InstanceType | This dimension filters the data you request for all instances running with this specified instance type. This helps you categorize your data by the type of instance running. For example, you might compare data from an m1.small instance and an m1.large instance to determine which has the better business value for your application. |
1.2.3 - EBS Metrics and Dimensions
This section describes the Elastic Block Store (EBS) metrics and dimensions available to CloudWatch.
Available Metrics for EBS
Metric | Description | Unit |
---|
VolumeReadBytes | The total number of bytes read in the period. | Bytes |
VolumeWriteBytes | The total number of bytes written in the period. | Bytes |
VolumeReadOps | The total number of read operations in the period. | Count |
VolumeWriteOps | The total number of write operations in the period. | Count |
VolumeTotalReadTime | The total number of seconds spent by all read operations that completed in the period. If multiple requests are submitted at the same time, this total could be greater than the length of the period. For example, say the period is 5 minutes (300 seconds); if 700 operations completed during that period, and each operation took 1 second, the value would be 700 seconds. | Seconds |
VolumeTotalWriteTime | The total number of seconds spent by all write operations that completed in the period. If multiple requests are submitted at the same time, this total could be greater than the length of the period. For example, say the period is 5 minutes (300 seconds); if 700 operations completed during that period, and each operation took 1 second, the value would be 700 seconds. | Seconds |
VolumeIdleTime | The total number of seconds in the period when no read or write operations were submitted. | Seconds |
VolumeQueueLength | The number of read and write operation requests waiting to be completed in the period. | Count |
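The volume metrics in the table can be combined into derived figures on the client side. The two quantities below, average time per operation and the busy fraction of a period, are not defined in this section; they are assumptions introduced for illustration, as are the function names.

```python
from typing import Optional


def average_operation_time(total_time_seconds: float, operations: int) -> Optional[float]:
    """Average seconds per operation, e.g. VolumeTotalReadTime / VolumeReadOps."""
    if operations == 0:
        return None  # no operations completed in the period
    return total_time_seconds / operations


def busy_fraction(idle_seconds: float, period_seconds: float) -> float:
    """Share of the period in which operations were submitted, from VolumeIdleTime."""
    return (period_seconds - idle_seconds) / period_seconds


# The table's example: 700 operations of 1 second each within a 300-second period.
total_read_time = 700 * 1.0
print(total_read_time > 300)                        # total exceeds the period length
print(average_operation_time(total_read_time, 700))
print(busy_fraction(idle_seconds=240, period_seconds=300))
```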
Available Dimensions for EBS
The only dimension that EBS sends to CloudWatch is the Volume ID. This means that all available statistics are filtered by Volume ID.
1.2.4 - Auto Scaling Metrics and Dimensions
This section discusses the Auto Scaling metrics and dimensions available to CloudWatch.
Available Metrics for Auto Scaling
Metric | Description | Unit |
---|
GroupMinSize | The minimum size of the Auto Scaling group. | Count |
GroupMaxSize | The maximum size of the Auto Scaling group. | Count |
GroupDesiredCapacity | The number of instances that the Auto Scaling group attempts to maintain. | Count |
GroupInServiceInstances | The number of instances that are running as part of the Auto Scaling group. This metric does not include instances that are pending or terminating. | Count |
GroupPendingInstances | The number of instances that are pending. A pending instance is not yet in service. This metric does not include instances that are in service or terminating. | Count |
GroupTerminatingInstances | The number of instances that are in the process of terminating. This metric does not include instances that are in service or pending. | Count |
GroupTotalInstances | The total number of instances in the Auto Scaling group. This metric identifies the number of instances that are in service, pending, and terminating. | Count |
Available Dimensions for Auto Scaling
The only dimension that Auto Scaling sends to CloudWatch is the name of the Auto Scaling group. This means that all available statistics are filtered by Auto Scaling group name.
1.2.5 - ELB Metrics and Dimensions
This section discusses the Elastic Load Balancing (ELB) metrics and dimensions available to CloudWatch.
Available Metrics for ELB
Metric | Description | Unit |
---|
Latency | Time elapsed after the request leaves the load balancer until it receives the corresponding response. Valid Statistics: Minimum, Maximum | |
RequestCount | The number of requests handled by the load balancer. | Count |
HealthyHostCount | The number of healthy instances registered with the load balancer in a specified availability zone. Healthy instances are those that have not failed more health checks than the value of the unhealthy threshold. Constraints: You must provide both LoadBalancerName and AvailabilityZone dimensions for this metric. Valid Statistics: Minimum, Maximum | |
UnHealthyHostCount | The number of unhealthy instances registered with the load balancer. These are instances that have failed more health checks than the value of the unhealthy threshold. Constraints: You must provide both LoadBalancerName and AvailabilityZone dimensions for this metric. Valid Statistics: Minimum, Maximum | |
HTTPCode_ELB_4XX | Count of HTTP response codes generated by ELB that are in the 4xx (client error) series. Valid Statistics: Sum | Count |
HTTPCode_ELB_5XX | Count of HTTP response codes generated by ELB that are in the 5xx (server error) series. ELB can generate 5xx errors if no back-end instances are registered, no back-end instances are healthy, or the request rate exceeds ELB’s current available capacity. This response count does not include any responses that were generated by back-end instances. Valid Statistics: Sum | Count |
HTTPCode_Backend_2XX | Count of HTTP response codes generated by back-end instances that are in the 2xx (success) series. Valid Statistics: Sum | Count |
HTTPCode_Backend_3XX | Count of HTTP response codes generated by back-end instances that are in the 3xx (user action required) series. Valid Statistics: Sum | Count |
HTTPCode_Backend_4XX | Count of HTTP response codes generated by back-end instances that are in the 4xx (client error) series. This response count does not include any responses that were generated by ELB. Valid Statistics: Sum | Count |
HTTPCode_Backend_5XX | Count of HTTP response codes generated by back-end instances that are in the 5xx (server error) series. This response count does not include any responses that were generated by ELB. Valid Statistics: Sum | Count |
Available Dimensions for ELB
You can use the currently available dimensions for ELB to refine the metrics returned by a query. For example, you could use the HealthyHostCount metric with the LoadBalancerName and AvailabilityZone dimensions to get the average number of healthy instances behind the specified load balancer within the specified Availability Zone for a given period of time.
You can aggregate ELB data along any of the dimensions shown in the following table.
Dimension | Description |
---|
LoadBalancerName | Limits the metric data to instances that are connected to the specified load balancer. |
AvailabilityZone | Limits the metric data to load balancers in the specified availability zone. |
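A query that combines both dimensions can be assembled as an argument list for euwatch-get-stats. The Python sketch below only builds the list and does not run the command. The option names are the ones shown in the euwatch-get-stats usage in this document; the helper name, the load balancer name `my-lb`, and the zone name `zone-1` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def get_stats_argv(metric: str, namespace: str, statistics: Sequence[str],
                   dimensions: Optional[Dict[str, str]] = None,
                   period: Optional[int] = None) -> List[str]:
    """Build the argument list for a euwatch-get-stats call (not executed here)."""
    argv = ["euwatch-get-stats",
            "--namespace", namespace,
            "--statistics", ",".join(statistics)]
    if dimensions:
        pairs = ",".join("{}={}".format(key, value) for key, value in dimensions.items())
        argv += ["--dimensions", pairs]
    if period is not None:
        argv += ["--period", str(period)]
    argv.append(metric)  # the metric name is the final positional argument
    return argv


argv = get_stats_argv(
    "HealthyHostCount", "AWS/ELB", ["Average"],
    dimensions={"LoadBalancerName": "my-lb", "AvailabilityZone": "zone-1"},
    period=300,
)
print(" ".join(argv))
```

The resulting list can be passed to `subprocess.run` on a host where Euca2ools is installed and configured.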
2 - CloudWatch Tasks
This section details the tasks you can perform using CloudWatch.
This section expands on the basic concepts presented in the preceding section (see CloudWatch Overview) and includes procedures for using CloudWatch. This section also shows you how to view metrics that Eucalyptus services provide to CloudWatch and how to publish custom metrics with CloudWatch.
2.1 - Configuring Monitoring
This section describes how to enable and disable monitoring for your cloud resources.
2.1.1 - Enable Monitoring
This section describes steps for enabling monitoring on your cloud resources.
To enable monitoring on your resources, follow the steps for your resource.
Enable monitoring for an instance
To enable monitoring for a running instance, enter the following command:
euca-monitor-instances [instance_id]
To enable monitoring when you launch an instance, enter the following command:
euca-run-instances [image_id] -k gsg-keypair --monitor
Enable monitoring for a scaling group
To enable monitoring for an existing Auto Scaling group:
1. Create a launch configuration with the --monitoring-enabled option.
2. Make a euscale-update-auto-scaling-group request to update your Auto Scaling group with the launch configuration you created in the previous step. Auto Scaling will enable monitoring for new instances that it creates.
3. Choose one of the following actions to deal with all existing instances in the Auto Scaling group:
To . . . | Do this . . . |
---|
Preserve existing instances | Make a euca-monitor-instances request for all existing instances to enable monitoring. |
Terminate existing instances | Make a euscale-terminate-instance-in-auto-scaling-group request for all existing instances. Auto Scaling will use the updated launch configuration to create replacement instances with monitoring enabled. |
To enable monitoring when you create a new Auto Scaling group: Create a launch configuration with the --monitoring-enabled option.
Enable monitoring for a load balancer
Elastic Load Balancing (ELB) sends metrics and dimensions for all load balancers to CloudWatch. By default, you do not need to specifically enable monitoring.
Note
ELB only sends CloudWatch metrics when requests are sent through the load balancer. If there are no requests or data for a given metric, ELB does not report to CloudWatch. If there are requests sent through the load balancer, ELB measures and sends metrics for that load balancer in 60-second intervals.
2.1.2 - Disable Monitoring
This section describes steps for disabling monitoring on your cloud resources.
To disable monitoring on your resources, follow the steps for your resource.
Disable monitoring for an instance
To disable monitoring for a running instance, enter the following command:
euca-unmonitor-instances [instance_id]
Disable monitoring for a scaling group
To disable monitoring for an existing Auto Scaling group:
1. Create a launch configuration with the --monitoring-disabled option.
2. Make a euscale-update-auto-scaling-group request to update your Auto Scaling group with the launch configuration you created in the previous step. Auto Scaling will disable monitoring for new instances that it creates.
3. Choose one of the following actions to deal with all existing instances in the Auto Scaling group:
To . . . | Do this . . . |
---|
Preserve existing instances | Make a euca-unmonitor-instances request for all existing instances to disable monitoring. |
Terminate existing instances | Make a euscale-terminate-instance-in-auto-scaling-group request for all existing instances. Auto Scaling will use the updated launch configuration to create replacement instances with monitoring disabled. |
To disable monitoring when you create a new Auto Scaling group: Create a launch configuration with the --monitoring-disabled option.
Disable monitoring for a load balancer
There is no way to disable monitoring for a load balancer.
2.2 - Viewing and Publishing Metrics
This section describes how to view Eucalyptus metrics as well as how to publish your own metrics.
2.2.1 - List Available Metrics
You can list available metrics via Euca2ools.
To list available metrics:
Enter the following command.
euwatch-list-metrics
Eucalyptus returns a listing of all metrics, as shown in the following partial example output:
Metric Name Namespace Dimensions
CPUUtilization AWS/EC2 {InstanceId=i-5431413d}
CPUUtilization AWS/EC2 {InstanceType=m1.medium}
DiskReadBytes AWS/EC2 {InstanceId=i-1d3d4d74}
DiskReadBytes AWS/EC2 {InstanceType=m1.medium}
DiskReadOps AWS/EC2 {InstanceId=i-d3c8baba}
DiskReadOps AWS/EC2 {InstanceType=m1.medium}
DiskWriteBytes AWS/EC2 {InstanceId=i-6732420e}
DiskWriteBytes AWS/EC2 {InstanceType=m1.medium}
DiskWriteOps AWS/EC2 {InstanceId=i-e03d4d89}
DiskWriteOps AWS/EC2 {InstanceType=m1.medium}
NetworkIn AWS/EC2 {InstanceId=i-e0304089}
NetworkIn AWS/EC2 {InstanceType=m1.medium}
NetworkOut AWS/EC2 {InstanceId=i-69334300}
NetworkOut AWS/EC2 {InstanceType=m1.medium}
StatusCheckFailed AWS/EC2 {InstanceId=i-6f8418e6}
StatusCheckFailed AWS/EC2 {InstanceType=m1.medium}
StatusCheckFailed_Instance AWS/EC2 {InstanceId=i-6f8418e6}
StatusCheckFailed_Instance AWS/EC2 {InstanceType=m1.medium}
StatusCheckFailed_System AWS/EC2 {InstanceId=i-6f8418e6}
StatusCheckFailed_System AWS/EC2 {InstanceType=m1.medium}
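For scripting, the listing can be parsed into records. The Python sketch below assumes the layout shown in the sample output above: whitespace-separated metric name and namespace followed by a brace-enclosed, comma-separated dimension list. The real tool's column separators may differ, and the function name is hypothetical.

```python
from typing import Dict, List


def parse_list_metrics(output: str) -> List[Dict[str, object]]:
    """Parse euwatch-list-metrics style output into one record per metric line."""
    records = []
    for raw_line in output.splitlines():
        parts = raw_line.strip().split(None, 2)
        # Skip the header and any line without a {...} dimension field.
        if len(parts) != 3 or not (parts[2].startswith("{") and parts[2].endswith("}")):
            continue
        name, namespace, dims_text = parts
        dimensions = {}
        for item in dims_text[1:-1].split(","):
            item = item.strip()
            if item:
                key, _, value = item.partition("=")
                dimensions[key] = value
        records.append({"name": name, "namespace": namespace, "dimensions": dimensions})
    return records


sample = """Metric Name Namespace Dimensions
CPUUtilization AWS/EC2 {InstanceId=i-5431413d}
CPUUtilization AWS/EC2 {InstanceType=m1.medium}
"""
for record in parse_list_metrics(sample):
    print(record)
```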
2.2.2 - Get Statistics for a Metric
You can get statistics for metrics via Euca2ools.
To get statistics for a metric:
Enter the following command.
euwatch-get-stats -n NAMESPACE -s STAT1,STAT2,...
[--dimensions KEY1=VALUE1,KEY2=VALUE2,...]
[--start-time YYYY-MM-DDThh:mm:ssZ]
[--end-time YYYY-MM-DDThh:mm:ssZ] [--period SECONDS]
[--unit UNIT] [--show-empty-fields] [-U URL]
[--region USER@REGION] [-I KEY_ID] [-S KEY]
[--security-token TOKEN] [--debug] [--debugger]
[--version] [-h]
METRIC
The following example returns the average CPU utilization for the i-c08804a9 instance at one-hour resolution.
euwatch-get-stats --namespace "AWS/EC2" --statistics "Average" \
--dimensions "InstanceId=i-c08804a9" --start-time 2016-12-14T23:00:00.000Z \
--end-time 2016-12-15T23:00:00.000Z --period 3600 CPUUtilization
The following example returns the average, minimum, and maximum CPU utilization for all of your cloud’s instances at one-hour resolution.
euwatch-get-stats --namespace "AWS/EC2" --statistics "Average,Minimum,Maximum" \
--start-time 2016-02-14T23:00:00.000Z --end-time 2016-03-14T23:00:00.000Z \
--period 3600 CPUUtilization
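The --start-time and --end-time values in these examples are aligned to the hour. A small Python helper can produce such a window relative to the current time. This is a sketch under the assumption that the supplied time is already in UTC; the function name is hypothetical, and the output format mirrors the time stamps used in the examples above.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

_FORMAT = "%Y-%m-%dT%H:%M:%S.000Z"


def hourly_window(now_utc: datetime, hours: int) -> Tuple[str, str]:
    """Return (start, end) strings for the last `hours` full hours before now_utc."""
    # Truncate to the top of the hour so the window lines up with 3600-second periods.
    end = now_utc.replace(minute=0, second=0, microsecond=0)
    start = end - timedelta(hours=hours)
    return start.strftime(_FORMAT), end.strftime(_FORMAT)


start_time, end_time = hourly_window(
    datetime(2016, 12, 15, 23, 17, 42, tzinfo=timezone.utc), hours=24)
print(start_time, end_time)
```

For live use, pass `datetime.now(timezone.utc)` as the first argument.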
2.2.3 - Publish Custom Metrics
CloudWatch allows you to publish your own metrics, such as application performance, system health, or customer usage.
Publish a single data point
To publish a single data point for a new or existing metric, call euwatch-put-data with one value and time stamp. For example, the following actions each publish one data point:
euwatch-put-data --metric-name PageViewCount --namespace "TestService" --value 2 --timestamp 2011-03-14T12:00:00.000Z
euwatch-put-data --metric-name PageViewCount --namespace "TestService" --value 4 --timestamp 2011-03-14T12:00:01.000Z
euwatch-put-data --metric-name PageViewCount --namespace "TestService" --value 5 --timestamp 2011-03-14T12:00:02.000Z
You can publish data points with time stamps as granular as one-thousandth of a second. However, CloudWatch aggregates the data to a minimum granularity of 60 seconds, using 60-second boundaries. For example, the PageViewCount metric from the previous examples contains three data points with time stamps just seconds apart. CloudWatch aggregates the three data points because they all fall within the 60-second period that begins at 2011-03-14T12:00:00.000Z and ends at 2011-03-14T12:00:59.999Z.
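To see why the three data points land in the same period, the following sketch maps each time stamp to the start of its 60-second period. It assumes the periods align with whole minutes, as in the example above, and that time stamps use the YYYY-MM-DDThh:mm:ss.sssZ form; minute_bucket is a hypothetical helper name, and this is an illustration rather than how CloudWatch is implemented.

```shell
#!/bin/sh
# minute_bucket is a hypothetical helper, not part of Euca2ools.
# Usage: minute_bucket YYYY-MM-DDThh:mm:ss.sssZ
minute_bucket() {
    # Strip the trailing ":ss.sssZ" and pin the seconds to zero.
    echo "${1%:*}:00.000Z"
}

# The three PageViewCount time stamps from the examples above.
for ts in 2011-03-14T12:00:00.000Z 2011-03-14T12:00:01.000Z 2011-03-14T12:00:02.000Z; do
    echo "$ts -> $(minute_bucket "$ts")"
done
```

All three time stamps map to 2011-03-14T12:00:00.000Z, so they are aggregated together.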
Publish statistic sets
You can also aggregate your data before you publish to CloudWatch. When you have multiple data points per minute, aggregating data minimizes the number of calls to euwatch-put-data. For example, instead of calling euwatch-put-data multiple times for three data points that are within three seconds of each other, you can aggregate the data into a statistic set that you publish with one call:
euwatch-put-data --metric-name PageViewCount --namespace "TestService" -s "Sum=11,Minimum=2,Maximum=5,SampleCount=3" --timestamp 2011-03-14T12:00:00.000Z
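The statistic set in this call is derived from the three individual values published earlier (2, 4, and 5). The following sketch shows one way to compute that string from whole-number data points; stat_set is a hypothetical helper name, not a Euca2ools command.

```shell
#!/bin/sh
# stat_set is a hypothetical helper, not part of Euca2ools.
# Usage: stat_set VALUE [VALUE ...]   (whole numbers only)
# Prints a statistic-set string in the form accepted by the -s option above.
stat_set() {
    sum=0 count=0 min= max=
    for v in "$@"; do
        sum=$((sum + v))
        count=$((count + 1))
        if [ -z "$min" ] || [ "$v" -lt "$min" ]; then min=$v; fi
        if [ -z "$max" ] || [ "$v" -gt "$max" ]; then max=$v; fi
    done
    echo "Sum=$sum,Minimum=$min,Maximum=$max,SampleCount=$count"
}

# The three PageViewCount data points from the single-value examples.
stat_set 2 4 5
```

This prints Sum=11,Minimum=2,Maximum=5,SampleCount=3, which matches the value passed to -s.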
Publish the value zero
When your data is sporadic and some periods have no associated data, you can choose to publish the value zero (0) for those periods or to publish no value at all. You might want to publish zero instead of no value if you use periodic calls to PutMetricData to monitor the health of your application. For example, you can set a CloudWatch alarm to detect when your application fails to publish metrics every five minutes. In that case, the application should publish zeros for periods with no associated data.
You might also publish zeros if you want to track the total number of data points or if you want statistics such as minimum and average to include data points with the value 0.
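As a rough sketch of this pattern, a periodic job can always emit a data point and fall back to zero when nothing was recorded. heartbeat_cmd is a hypothetical helper name; the metric and namespace are reused from the earlier examples, and the sketch assumes the time stamp can be omitted so that the publish time is used. It prints the command rather than running it.

```shell
#!/bin/sh
# heartbeat_cmd is a hypothetical helper, not part of Euca2ools.
# Usage: heartbeat_cmd [COUNT]
# Prints (does not run) the euwatch-put-data call for this interval,
# falling back to 0 when no count was recorded.
heartbeat_cmd() {
    value=${1:-0}
    echo "euwatch-put-data --metric-name PageViewCount --namespace \"TestService\" --value $value"
}

heartbeat_cmd 7    # an interval with activity
heartbeat_cmd      # an idle interval still publishes a data point
```

Because the idle interval still produces a data point, a gap in the metric indicates that the application itself stopped reporting.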
2.2.4 - Modify Metric Polling Timing
You can modify metrics timing and reporting defaults.
When using the default CloudWatch properties, metrics reporting can take around 15 minutes:
- 5 minutes to receive a sensor data point for an instance.
- 5 more minutes to receive a second sensor data point for an instance.
- 1 more minute to calculate the difference between these two and send a single data point to CloudWatch.
- 1 more minute for CloudWatch to put the data in the database, making it available for a call.
- 5 more minutes for info to be available in the database.
Note
The above workflow is sequential and cumulative.
The sensor data point timing values can be shortened by changing variables in the CLC.
Note
These changes will increase network traffic as polling will be done more frequently.
To modify metrics defaults:
Modify the default polling interval CLC variable to a number less than 5.
cloud.monitor.default_poll_interval_mins
This is how often the CLC sends a request to the CC for sensor data. The default value is 5 minutes.
Modify the history size CLC variable to a number less than 5.
cloud.monitor.history_size
This is how many data value samples are sent in each sensor data request. The default value is 5. The sampling interval is either 1 minute (if cloud.monitor.default_poll_interval_mins is 1 minute) or half the value of cloud.monitor.default_poll_interval_mins (if that value is greater than 1 minute). So by default, with a cloud.monitor.default_poll_interval_mins of 5 minutes and a cloud.monitor.history_size of 5, every 5 minutes the CLC asks the CC for the last 5 data points, which should be spaced 2.5 minutes apart (e.g., 2.5 minutes ago, 5 minutes ago, 7.5 minutes ago, and 10 minutes ago).
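The sampling rule described above can be sketched as a small calculation. sample_interval_mins is a hypothetical helper name, not a Eucalyptus command, and the sketch only restates the rule from this section: 1 minute when the polling interval is 1 minute, otherwise half the polling interval.

```shell
#!/bin/sh
# sample_interval_mins is a hypothetical helper, not a Eucalyptus command.
# Usage: sample_interval_mins POLL_INTERVAL_MINS
sample_interval_mins() {
    # 1 minute when polling every minute, otherwise half the polling interval.
    awk -v p="$1" 'BEGIN { if (p <= 1) print 1; else print p / 2 }'
}

echo "poll=5 -> sample every $(sample_interval_mins 5) minutes"
echo "poll=1 -> sample every $(sample_interval_mins 1) minutes"
```

With the default polling interval of 5 minutes, this gives the 2.5-minute spacing used in the example.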
Note
These values may be skewed a bit based on the time the CC uses.
2.3 - Configuring Alarms
Configuring Alarms
This section describes how to create, test, and delete an alarm.
2.3.1 - Create an Alarm
You can create a CloudWatch alarm using a resource’s metric, and then add an action using the action’s dedicated Amazon Resource Name (ARN). You can add the action to any alarm state.
Note
Eucalyptus currently only supports actions for executing Auto Scaling policies.
To create an alarm, perform the following step.
Enter the following command:
euwatch-put-metric-alarm [alarm_name] --unit Percent --namespace "AWS/EC2" \
--dimensions "InstanceId=[instance_id]" --statistic [statistic] --metric-name \
[metric] --comparison-operator [operator] --threshold [value] --period \
[seconds] --evaluation-periods [value] --alarm-actions [action]
For example, the following triggers an Auto Scaling policy if the average CPUUtilization stays below 10 percent for four consecutive 24-hour periods.
euwatch-put-metric-alarm test-Alarm --unit Percent --namespace "AWS/EC2" \
--dimensions "InstanceId=i-abc123" --statistic Average --metric-name CPUUtilization \
--comparison-operator LessThanThreshold --threshold 10 --period 86400 \
--evaluation-periods 4 --alarm-actions \
arn:aws:autoscaling::429942273585:scalingPolicy:12ad560b-58b2-4051-a6d3-80e53b674de4:autoScalingGroupName/testgroup01:policyName/testgroup01-pol01
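To make the interaction between --threshold, --period, and --evaluation-periods more concrete, the following sketch models a LessThanThreshold alarm. It is a simplified illustration, not the service's actual evaluation logic: it assumes the alarm enters the ALARM state only when the statistic is below the threshold in each of the most recent evaluation periods, and alarm_state is a hypothetical helper name.

```shell
#!/bin/sh
# alarm_state is a hypothetical helper that models a LessThanThreshold alarm.
# Usage: alarm_state THRESHOLD EVALUATION_PERIODS AVG [AVG ...]   (oldest first)
alarm_state() {
    threshold=$1 periods=$2
    shift 2
    # Not enough per-period averages to evaluate yet.
    if [ "$#" -lt "$periods" ]; then
        echo INSUFFICIENT_DATA
        return
    fi
    # Keep only the most recent $periods values.
    while [ "$#" -gt "$periods" ]; do shift; done
    for avg in "$@"; do
        # awk handles decimal comparisons; it exits 0 when avg < threshold.
        if ! awk -v a="$avg" -v t="$threshold" 'BEGIN { exit !(a < t) }'; then
            echo OK
            return
        fi
    done
    echo ALARM
}

alarm_state 10 4 8 9 7.5 6     # every period below 10 percent
alarm_state 10 4 8 12 7 6      # one period at 12 percent
```

The first call prints ALARM and the second prints OK, because a single period at or above the threshold is enough to keep this modeled alarm out of the ALARM state.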
2.3.2 - Test an Alarm
You can test a CloudWatch alarm by temporarily changing its state to “ALARM” using the euwatch-set-alarm-state command:
euwatch-set-alarm-state --alarm-name TestAlarm --state ALARM
2.3.3 - Delete an Alarm
To delete an alarm, perform the following step.
Enter the following command:
euwatch-delete-alarms [alarm_name]
For example, to delete an alarm named TestAlarm, enter:
euwatch-delete-alarms TestAlarm