Monday, April 24, 2017

Creating an alarm in CloudWatch for a Linux EC2 instance

AWS CloudWatch does a couple of things as mentioned in the AWS documentation here. But, one of the interesting thing is that it can gather the metrics (CPU, Network, Disk etc) for the different Amazon resources.

In this blog, we will look how to look at the CPU metrics on a Linux EC2 instance and trigger an Alarm with a corresponding action. To make it simple, if the CPU Utilization in the Linux instance goes beyond 50% then we will be notified through an email. We can watch different metrics, we will use CPU Utilization for now. CloudWatch also allows to create custom metrics, like the application response time.

I will go with the assumption that the Linux instance has already been created as shown in this blog. So, here are the steps.

1. In the Linux instance make sure that the detailed monitoring is enabled. If not then enable it from the EC2 management console as shown below.

2. Go to the CloudWatch management console and select Metrics on the left pane. Select EC2 in the the `All Metrics` tab.

3. Again select the `Per-Instance Metrics` from the `All Metrics` tab.

4. Go back to the EC2 management console and get the `Instance Id` as shown below.

5. Go back to the CloudWatch management console. Search for the InstanceId and select CPUUtilization. Now the line graph will be populated as shown below for the CPUUtilization. We should be also able to select multiple metrics at the same time.

6. The graph is shown with a 5 minute granularity. By changing it to minute granularity, we would be getting a much smoother graph. Click on `Graphed Metrics` tab and change the Period to `1 minute`.

7. Now, lets try to create an Alarm. Click on Alarm in the left pane and then on the `Create Alarm` button.

8. Click on the `Per-Instance Metrics` under EC2. Search for the EC2 instance id and select CPUUtilization as shown below.

9. Now that the metric has been selected, we have to define the alarm. Click on `Next`. Specify the Name, Description and CPU % as shown below.

10. When the alarm is breached (CPU > 50%), then the notification has to be done. In the same screen, go a bit down and specify the notification by click on `New list` as shown below.

11. Click on `Create Alarm` in the same screen. To avoid spamming, AWS will send an email with the confirmation link which has to be clicked. Go ahead and check your email for confirmation.

12. Once the link has been clicked in the email, we will get a confirmation as below.

13. Now, we are all setup to receive an email notification when the CPU > 50%. Login to the Linux instance as mentioned in this blog and run the below command to increase the CPU on the Linux instance. The command doesn't do anything useful, but simply increases the CPU usage. There are more ways to increase the CPU usage, more here.
dd if=/dev/urandom | bzip2 -9 >> /dev/null

14. If we go back to the Metrics then we will notice that the CPU has spiked on the Linux instance because of the above command.

15. And the Alarm also moved from the OK status to the ALARM status as shown below.

16. Check your email and there would be a notification from AWS about the breach of the Alarm threshold.

17. Make sure to delete the Alarm and terminate the Linux instance.

The procedure is a bit too long and it took me good amount of time to write this blog, but once you do it a couple of times and understand whats being done, then it would be a piece of cake.

No comments:

Post a Comment