Alert Noise Reduction & Optimisation

Alert Analytics provides an overview of all alerts in the system. The section contains four cards, each with a distinct purpose and each continuously refreshed with the latest data. All metrics cover the last 30 days. You can also search for alerts and filter them by team using the search filters.

Below are the sections and their purposes:

Alerts based on Integration Type

This section provides a detailed breakdown of alerts categorized by their integration type. The purpose is to understand the distribution and frequency of alerts associated with different integration mechanisms within the system.

Total Alerts: 44

This is the cumulative number of alerts triggered across all integration types.

Explanation of Columns:

Integration Type: The specific type of system integration (e.g., AWS, Azure, etc.)
Occurrence: The number of alerts triggered for each integration type.
Percentage: The proportion of alerts for each integration type relative to the total number of alerts. This is calculated as: Percentage = (Occurrence / Total Alerts) × 100
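
As a worked check of the Percentage column, the minimal sketch below computes each integration type's share of the total. The counts are hypothetical illustration values chosen to sum to the 44 alerts shown above, and the function name percentage_by_type is an assumption, not part of the product.

```python
def percentage_by_type(occurrences: dict[str, int]) -> dict[str, float]:
    """Return each integration type's share of all alerts, in percent."""
    total = sum(occurrences.values())
    if total == 0:
        # No alerts at all: report 0% for every type instead of dividing by zero.
        return {name: 0.0 for name in occurrences}
    return {
        name: round(count / total * 100, 2)
        for name, count in occurrences.items()
    }


# Hypothetical occurrence counts (not real data); they sum to 44.
sample = {"AWS": 22, "Azure": 11, "Webhook": 11}
shares = percentage_by_type(sample)
print(shares)
```

Because every share is a fraction of the same total, the percentages in the card should add up to roughly 100, allowing for rounding.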

Alerts based on Application Service

This section provides a detailed breakdown of alerts categorized by their application service. The purpose is to understand the distribution and frequency of alerts associated with different services within the application.

Total Alerts: 44

This is the cumulative number of alerts triggered across all application services.

Explanation of Columns:

Service: The specific application service (e.g., Payments, Haarvish, etc.)
Occurrences: The number of alerts triggered for each application service.
Percentage: The proportion of alerts for each service relative to the total number of alerts, that is, the service's occurrences divided by the total alerts, multiplied by 100.

Most Frequent Alerts

This section lists the most frequently occurring alerts, showing for each one its integration type, alert name, associated resource, occurrence count, and percentage of the total alerts.

Integration Type: Webhook
Alert: Test Alert: CPU Utilization is High
Resource Name: Test Instance
Occurrence: 5

This alert, related to CPU utilization on the Test Instance resource, is the most frequent among all alerts.
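
The ranking in this card can be thought of as counting alerts that share the same integration type, alert name, and resource. The sketch below illustrates that idea under assumptions: the record fields (integration, alert, resource), the sample records, and the function name most_frequent_alerts are all hypothetical and do not describe how the product computes the card.

```python
from collections import Counter


def most_frequent_alerts(alerts, top_n=3):
    """Rank alerts by how often the same (integration, alert, resource) occurs."""
    counts = Counter(
        (a["integration"], a["alert"], a["resource"]) for a in alerts
    )
    total = sum(counts.values())
    return [
        {
            "integration": key[0],
            "alert": key[1],
            "resource": key[2],
            "occurrence": n,
            "percentage": round(n / total * 100, 2),
        }
        for key, n in counts.most_common(top_n)
    ]


# Hypothetical alert records (not real data): 5 + 3 + 2 = 10 alerts.
cpu = {"integration": "Webhook", "alert": "CPU Utilization is High", "resource": "instance-a"}
disk = {"integration": "AWS", "alert": "Disk Usage is High", "resource": "instance-b"}
mem = {"integration": "Azure", "alert": "Memory Usage is High", "resource": "instance-c"}
top = most_frequent_alerts([cpu] * 5 + [disk] * 3 + [mem] * 2)
print(top[0])
```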

Alerts opened and closed within 5 minutes

Alerts opened and closed within 5 minutes typically indicate transient issues or false positives. Monitor them closely to identify recurring patterns or underlying issues that may need to be addressed for system stability and reliability.

Alert Details

The table below provides details of the alerts opened and closed within 5 minutes.

Explanation of Columns:

Integration: The system integration through which the alert was triggered.
Service: The specific service associated with the alert.
Resource Name: The name of the resource for which the alert has been triggered.
Metric Name: The name of the metric for which the alert was triggered.
Opened At: The timestamp indicating when the alert was opened.
Closed At: The timestamp indicating when the alert was closed.
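
To make the 5-minute criterion concrete, the sketch below filters alert records by the difference between their Opened At and Closed At timestamps. It is illustrative only: the field names opened_at and closed_at, the sample records, and the choice to treat exactly 5 minutes as inside the window are assumptions, not documented product behaviour.

```python
from datetime import datetime, timedelta

# Assumed threshold: "within 5 minutes" is treated as inclusive of exactly 5.
SHORT_LIVED_WINDOW = timedelta(minutes=5)


def short_lived(alerts, window=SHORT_LIVED_WINDOW):
    """Return alerts that were closed no later than `window` after opening."""
    result = []
    for alert in alerts:
        closed = alert.get("closed_at")
        if closed is None:
            continue  # still open, so it cannot count as opened-and-closed
        if timedelta(0) <= closed - alert["opened_at"] <= window:
            result.append(alert)
    return result


# Hypothetical records (not real data).
sample = [
    {"resource": "instance-a",
     "opened_at": datetime(2024, 1, 1, 10, 0),
     "closed_at": datetime(2024, 1, 1, 10, 3)},
    {"resource": "instance-b",
     "opened_at": datetime(2024, 1, 1, 10, 0),
     "closed_at": datetime(2024, 1, 1, 10, 20)},
    {"resource": "instance-c",
     "opened_at": datetime(2024, 1, 1, 10, 0),
     "closed_at": None},
]
flapping = short_lived(sample)
print([a["resource"] for a in flapping])
```

A resource that appears repeatedly in this list is a reasonable candidate for threshold tuning, since such alerts clear before anyone can act on them.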

Last updated 10 months ago