Failure
Failure alerts indicate that the system’s configuration has deviated from its desired state. For example, when replicas are configured in a Redis database, there must be at least one master instance. When there is none, Redis is not operating as configured. These kinds of problems are reported as failure alerts.
Here’s an example of an alert rule to report this failure:

    # Redis Master Missing
    # Note this covers both cluster mode and HA mode, thus we are counting by redis_mode
    - alert: RedisMissingMaster
      expr: |-
        count by (job, service, redis_mode, namespace, asserts_env, asserts_site) (
          redis_instance_info{role="master"}
        ) == 0
      for: 1m
      labels:
        asserts_severity: critical
        asserts_entity_type: Service
        asserts_alert_category: failure

| Asserts Meta Label | Description |
|---|---|
| `asserts_env` | Used by the knowledge graph to identify the environment. All discovered entities and observed metrics are automatically scoped to an environment. |
| `asserts_site` | Used by the knowledge graph to identify the region/site within an environment. For example, you could have a prod environment but multiple regions, such as us-east-1, us-west-2, etc. This label is used to capture the region information. Note that this depends on how environment information is encoded in the metrics. Sometimes, both the environment and the region information may be encoded in a single label value; in such cases, the `asserts_env` label contains that value, and this label may not be present. |
| `asserts_entity_type` | Used by the knowledge graph to identify the level at which the metric is being observed. The `workload`, `service`, and `job` labels are special labels that the knowledge graph uses to identify the Service. These labels are also used to discover the Service entity in the knowledge graph entity model. In this example, while aggregating, these labels are retained, so this metric is observed for the corresponding Service entity. |
| `asserts_severity` | This label is used to indicate the severity of the problem as either warning (yellow) or critical (red). |
| `asserts_alert_category` | The knowledge graph categorizes all alerts into the following categories: Saturation, Amend (configuration changes to the system), Anomaly, Failure, and Error. In this example, the label `asserts_alert_category` is used to categorize this alert as a Failure. |
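
The label conventions in the table can be checked mechanically before a rule is deployed. The following is a minimal sketch, not part of the product: `check_rule_labels` is a hypothetical helper, the rule is represented as a plain Python dictionary rather than parsed from a rules file, and the lowercase category spellings other than `failure` are an assumption based on the example rule. Treating the three static labels from the example as expected on every rule is also an assumption.

```python
# Values taken from the table above. Only "failure" appears verbatim in the
# example rule; the other lowercase category spellings are assumptions.
ALLOWED_SEVERITIES = {"warning", "critical"}
ALLOWED_CATEGORIES = {"saturation", "amend", "anomaly", "failure", "error"}

# Static labels set in the example rule (assumed here to be expected on every rule).
EXPECTED_LABELS = ("asserts_severity", "asserts_entity_type", "asserts_alert_category")


def check_rule_labels(rule):
    """Return a list of problems found in an alert rule's static labels."""
    labels = rule.get("labels", {})
    problems = []
    for name in EXPECTED_LABELS:
        if name not in labels:
            problems.append(f"missing label: {name}")
    severity = labels.get("asserts_severity")
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        problems.append(f"unexpected asserts_severity: {severity}")
    category = labels.get("asserts_alert_category")
    if category is not None and category not in ALLOWED_CATEGORIES:
        problems.append(f"unexpected asserts_alert_category: {category}")
    return problems


# The labels of the RedisMissingMaster rule from the example above.
redis_missing_master = {
    "alert": "RedisMissingMaster",
    "for": "1m",
    "labels": {
        "asserts_severity": "critical",
        "asserts_entity_type": "Service",
        "asserts_alert_category": "failure",
    },
}

print(check_rule_labels(redis_missing_master))  # prints []
```

The `asserts_env` and `asserts_site` labels are not checked here because, in the example, they come from the metric itself through the `count by (...)` clause rather than from the rule's `labels` section.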