Programmatically setting up missing alerts in your Observability tool

Automated deployment of alerts is a critical aspect of incident management in modern cloud environments. By automatically configuring and deploying alerts based on predefined criteria, organisations can proactively detect and respond to issues, ensuring the reliability and availability of their services.

Deploy Alerts

Note: Complete the steps in "Mapping Resources to a Service" before deploying alerts.

There are two easy ways to set up alerts: Specific Selection and Bulk Selection. Below are the steps to deploy alerts.

Step 1: In the table, check the "Alert Health" column to identify resources with missing alerts. Compare the number of missing alerts with the total number of alerts shown just below it. If no alerts are missing, the count is zero and appears in green. If alerts are missing, the count is greater than zero and appears in red.
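The colour rule in Step 1 can be written as a small check. This is a minimal sketch and not the tool's API: the function name `alert_health_status` and its parameters are hypothetical, introduced only to illustrate the rule.

```python
def alert_health_status(missing: int, total: int) -> str:
    """Return the colour the "Alert Health" column would show.

    Hypothetical helper: zero missing alerts maps to green,
    any missing alert maps to red.
    """
    if missing < 0 or total < 0 or missing > total:
        raise ValueError("expected 0 <= missing <= total")
    return "green" if missing == 0 else "red"


print(alert_health_status(missing=0, total=5))
print(alert_health_status(missing=2, total=5))
```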

Specific Selection

Step 2: After identifying the missing alerts that need to be set up, click on the resource name associated with them.

Step 3: Click "Deploy Alerts" to deploy all the alerts. The alerts are listed below the resource's details in two sections: static alerts and anomaly detection alerts. Each section has two subsections, "Alerts Deployed" and "Alerts Missing". Missing alerts appear under "Alerts Missing", and deployed alerts appear under "Alerts Deployed".

*Learn more about static alerts and anomaly detection alerts → Here.*

Alternatively, click the "Cloud" icon in the "Actions" column on the right side of the table. This deploys all the alerts for that resource directly. Alerts that are not yet deployed appear in blue. Once deployed, they appear in green with a tick mark.

Bulk Selection

Step 1: Select multiple resources with missing alerts by ticking the boxes on the left side of the table.

Step 2: After making your selection, open the "Bulk Actions" menu and click "Deploy Alerts" → "Deploy". This deploys all the alerts for the selected resources within a few moments.
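The selection in Step 1 amounts to picking every row whose missing-alert count is above zero. The following is a minimal sketch, assuming the table rows are available as dictionaries. The keys `name` and `missing_alerts` and the sample rows are hypothetical and do not come from any documented export format.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int]]


def resources_with_missing_alerts(rows: List[Row]) -> List[str]:
    """Return the names of rows that would be ticked for a bulk deploy."""
    return [str(row["name"]) for row in rows if int(row["missing_alerts"]) > 0]


# Illustrative rows only; names and counts are made up for this sketch.
rows = [
    {"name": "queue-a", "missing_alerts": 0},
    {"name": "queue-b", "missing_alerts": 3},
    {"name": "cache-c", "missing_alerts": 1},
]
print(resources_with_missing_alerts(rows))
```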

The "Search" functionality allows users to quickly locate specific resources based on their name. Users can input the name of the resource they are looking for in the search bar to retrieve relevant results.

Additionally, users can further refine their search results using the "Filter By" options. These filters include:

Type: Users can filter resources by their type, such as SQS, EC2, ElastiCache, etc. This helps users narrow down their search to specific types of resources.
Environment: Users can filter resources based on the environment they belong to, such as Production, Development, Staging, etc. This allows users to focus on resources within a specific environment.
Health: Users can filter resources based on their health status, such as Healthy, Unhealthy, Missing Alerts, etc. This helps users identify resources that require attention or investigation.
Application Service: Users can filter resources associated with a particular application service. This helps users find resources that are relevant to a specific application or service.
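The search and "Filter By" behaviour can be modelled as a set of optional predicates applied to each row. This is a sketch under stated assumptions, not the tool's implementation: the `Resource` record, its field names, and `filter_resources` are hypothetical. It also assumes that the name search is a case-insensitive substring match and that active filters combine with a logical AND.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Resource:
    # Hypothetical record; fields mirror the table columns described above.
    name: str
    resource_type: str
    environment: str
    health: str
    application_service: Optional[str] = None


def filter_resources(
    resources: List[Resource],
    name: Optional[str] = None,
    resource_type: Optional[str] = None,
    environment: Optional[str] = None,
    health: Optional[str] = None,
    application_service: Optional[str] = None,
) -> List[Resource]:
    """Keep resources that satisfy every filter that is set (logical AND)."""
    selected = []
    for r in resources:
        # Assumption: search is a case-insensitive substring match on the name.
        if name is not None and name.lower() not in r.name.lower():
            continue
        if resource_type is not None and r.resource_type != resource_type:
            continue
        if environment is not None and r.environment != environment:
            continue
        if health is not None and r.health != health:
            continue
        if (application_service is not None
                and r.application_service != application_service):
            continue
        selected.append(r)
    return selected


# Illustrative inventory; the rows are made up for this sketch.
inventory = [
    Resource("orders-queue", "SQS", "Production", "Missing Alerts", "Orders"),
    Resource("orders-cache", "ElastiCache", "Production", "Healthy", "Orders"),
    Resource("test-queue", "SQS", "Development", "Healthy"),
]
matches = filter_resources(
    inventory, name="queue", resource_type="SQS", environment="Production"
)
print([r.name for r in matches])
```

Leaving a filter unset (`None`) means it does not restrict the result, which matches how an unselected "Filter By" option would behave.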

Moreover, users can refine their search results based on when the resources were created. The "Created in" options include:

1 day: Shows resources created within the last day.
3 days: Shows resources created within the last three days.
7 days: Shows resources created within the last seven days.
30 days: Shows resources created within the last thirty days.
Custom: Allows users to specify a custom date range for resource creation.
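The "Created in" options can be read as time windows that end at the current moment, plus an explicit range for the custom option. The sketch below illustrates that reading. The names `CREATED_IN_PRESETS`, `created_in`, and `created_in_custom` are hypothetical, and treating both range boundaries as inclusive is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Preset labels taken from the list above; the mapping itself is an assumption.
CREATED_IN_PRESETS = {
    "1 day": timedelta(days=1),
    "3 days": timedelta(days=3),
    "7 days": timedelta(days=7),
    "30 days": timedelta(days=30),
}


def created_in(created_at: datetime, preset: str,
               now: Optional[datetime] = None) -> bool:
    """True if created_at falls inside the preset window ending at `now`."""
    now = now or datetime.now(timezone.utc)
    return now - CREATED_IN_PRESETS[preset] <= created_at <= now


def created_in_custom(created_at: datetime, start: datetime,
                      end: datetime) -> bool:
    """True if created_at falls inside a custom, inclusive date range."""
    return start <= created_at <= end


# Fixed timestamps so the example is reproducible.
now = datetime(2024, 5, 10, tzinfo=timezone.utc)
created = datetime(2024, 5, 6, tzinfo=timezone.utc)  # four days before `now`
print(created_in(created, "3 days", now))  # outside the three-day window
print(created_in(created, "7 days", now))  # inside the seven-day window
```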

Action Buttons

In the "Actions" column located on the right side of the table, there are two icons available:

1. Config alerts: This icon enables users to access the configuration settings of the specific resource, facilitating customization and adjustments to the alerts associated with it.

2. Deploy alerts: This icon enables users to deploy alerts directly for the specific resource, streamlining the process of setting up monitoring and alerting for efficient resource management.

Bulk Actions

The bulk actions feature offers users the capability to perform multiple actions simultaneously on selected resources. This functionality significantly enhances efficiency by allowing users to map services, deploy alerts, and rescan resources in one go, rather than executing these actions individually.

Map Service: This action enables users to map services to selected resources, streamlining the process of assigning teams or application services responsible for monitoring and managing those resources.
Deploy Alerts: With this feature, users can deploy alerts directly for multiple selected resources at once. This simplifies the setup of monitoring and alerting configurations across multiple resources, ensuring comprehensive visibility and proactive incident management.
Rescan Resources: This action initiates a rescan of resources within the AWS environment, facilitating the detection of any new resources that may have been added since the last scan. Users are prompted to wait while the system scans the AWS universe for new resources, ensuring that the monitoring system remains up-to-date and comprehensive in its coverage.

Overall, the bulk actions functionality is designed to optimize user workflow, saving time and effort by allowing simultaneous execution of essential tasks across multiple resources.

Example Scenario

The following example describes a resource named "fetch_gcp_resources_dlq" in the AWS environment. Each field shown for the resource is explained below:

Resource Name: "fetch_gcp_resources_dlq" - This is the name of the resource within the AWS environment.
Type: SQS (Simple Queue Service) - Indicates the type of resource, which in this case is an SQS queue.
Environment: PRODUCTION - Specifies the environment in which the resource operates, indicating that it's deployed in a production environment.
Alert Health: 0 missing - Indicates the health status of alerts associated with this resource. In this case, there are no missing alerts, suggesting that all necessary alerts have been set up.
Application Service: NewTestService - Represents the application or service associated with this resource, indicating that it's linked to "NewTestService."
Actions: config alerts, deploy alerts - Specifies available actions for this resource. Users can configure alerts and deploy them using the provided actions.
Cloud Id: https://sqs.ap-south-1.amazonaws.com/016340224242/fetch_gcp_resources_dlq
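The Cloud Id of an SQS resource is its queue URL, which has the form `https://sqs.<region>.amazonaws.com/<account-id>/<queue-name>`. As a rough sketch, the URL can be split into its parts with Python's standard library. The helper name `parse_sqs_queue_url` is hypothetical, the region segment is assumed to read `ap-south-1`, and the function only handles this one URL form.

```python
from urllib.parse import urlparse


def parse_sqs_queue_url(url: str) -> dict:
    """Split an SQS queue URL into region, account id, and queue name.

    Hypothetical helper: only accepts the
    https://sqs.<region>.amazonaws.com/<account-id>/<queue-name> form.
    """
    parsed = urlparse(url)
    host_parts = (parsed.hostname or "").split(".")
    if (len(host_parts) != 4 or host_parts[0] != "sqs"
            or host_parts[2:] != ["amazonaws", "com"]):
        raise ValueError(f"unexpected SQS host: {parsed.hostname!r}")
    path_parts = parsed.path.strip("/").split("/")
    if len(path_parts) != 2 or not all(path_parts):
        raise ValueError(f"unexpected SQS path: {parsed.path!r}")
    account_id, queue_name = path_parts
    return {
        "region": host_parts[1],
        "account_id": account_id,
        "queue_name": queue_name,
    }


# Region segment assumed to be "ap-south-1".
cloud_id = (
    "https://sqs.ap-south-1.amazonaws.com/016340224242/fetch_gcp_resources_dlq"
)
print(parse_sqs_queue_url(cloud_id))
```

The queue name recovered from the URL matches the Resource Name field in the example above.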

Last updated 1 year ago