<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Examples on Grafana Labs</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/</link><description>Recent content in Examples on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/grafana/v12.4/alerting/examples/index.xml" rel="self" type="application/rss+xml"/><item><title>Example of multi-dimensional alerts on time series data</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/multi-dimensional-alerts/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/multi-dimensional-alerts/</guid><content><![CDATA[&lt;h1 id=&#34;example-of-multi-dimensional-alerts-on-time-series-data&#34;&gt;Example of multi-dimensional alerts on time series data&lt;/h1&gt;
&lt;p&gt;This example shows how a single alert rule can generate multiple alert instances — one for each label set (or time series). This is called &lt;strong&gt;multi-dimensional alerting&lt;/strong&gt;: one alert rule, many alert instances.&lt;/p&gt;
&lt;p&gt;In Prometheus, each unique combination of labels defines a distinct time series. Grafana Alerting uses the same model: each label set is evaluated independently, and a separate alert instance is created for each series.&lt;/p&gt;
&lt;p&gt;This pattern is common in dynamic environments when monitoring a group of components like multiple CPUs, containers, or per-host availability. Instead of defining individual alert rules or aggregated alerts, you alert on &lt;em&gt;each dimension&lt;/em&gt; — so you can detect particular issues and include that level of detail in notifications.&lt;/p&gt;
&lt;p&gt;For example, a query returns one series per CPU:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;&lt;code&gt;cpu&lt;/code&gt; label value&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;CPU percent usage&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;cpu-0&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;95&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;cpu-1&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;30&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;cpu-2&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;With a threshold of &lt;code&gt;&amp;gt; 80&lt;/code&gt;, this would trigger two alert instances: one for &lt;code&gt;cpu-0&lt;/code&gt; and one for &lt;code&gt;cpu-2&lt;/code&gt;.&lt;/p&gt;
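&lt;p&gt;The evaluation model can be sketched in a few lines of Python. This is an illustrative simulation of per-series threshold evaluation, not how Grafana itself is implemented; the series values mirror the table above.&lt;/p&gt;

```python
# Illustrative only: one rule, one alert instance per label set,
# each evaluated independently against the same threshold.
series = [
    ({"cpu": "cpu-0"}, 95),
    ({"cpu": "cpu-1"}, 30),
    ({"cpu": "cpu-2"}, 85),
]

def evaluate(series, threshold=80):
    # Return one (labels, value, state) tuple per series: the alert instances.
    return [
        (labels, value, "Firing" if value > threshold else "Normal")
        for labels, value in series
    ]

for labels, value, state in evaluate(series):
    print(labels, value, state)
```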
&lt;h2 id=&#34;examples-overview&#34;&gt;Examples overview&lt;/h2&gt;
&lt;p&gt;Imagine you want to trigger alerts when CPU usage goes above 80%, and you want to track each CPU core independently.&lt;/p&gt;
&lt;p&gt;You can use a Prometheus query like this:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sum by(cpu) (
  rate(node_cpu_seconds_total{mode!=&amp;#34;idle&amp;#34;}[1m])
)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This query returns the active CPU usage rate per CPU core, averaged over the past minute.&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;CPU core&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Active usage rate&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;cpu-0&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;95&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;cpu-1&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;30&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;cpu-2&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;This produces one series per CPU core.&lt;/p&gt;
&lt;p&gt;When Grafana Alerting evaluates the query, it creates an individual alert instance for each returned series.&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Alert instance&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Value&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;{cpu=&amp;ldquo;cpu-0&amp;rdquo;}&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;95&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;{cpu=&amp;ldquo;cpu-1&amp;rdquo;}&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;30&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;{cpu=&amp;ldquo;cpu-2&amp;rdquo;}&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;With a threshold condition like &lt;code&gt;$A &amp;gt; 80&lt;/code&gt;, Grafana evaluates each instance separately and fires alerts only where the condition is met:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Alert instance&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Value&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;State&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;{cpu=&amp;ldquo;cpu-0&amp;rdquo;}&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;95&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Firing&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;{cpu=&amp;ldquo;cpu-1&amp;rdquo;}&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;30&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Normal&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;{cpu=&amp;ldquo;cpu-2&amp;rdquo;}&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Firing&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;Multi-dimensional alerts help you surface issues on individual components: problems that might be missed when alerting on aggregated data (like total CPU usage).&lt;/p&gt;
&lt;p&gt;Each alert instance targets a specific component, identified by its unique label set. This makes alerts more specific and actionable. For example, you can set a 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/annotation-label/#annotations&#34;&gt;&lt;code&gt;summary&lt;/code&gt; annotation&lt;/a&gt; in your alert rule that identifies the affected CPU:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;High CPU usage on {{$labels.cpu}}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;In the previous example, the two firing alert instances would display summaries indicating the affected CPUs:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;High CPU usage on &lt;code&gt;cpu-0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;High CPU usage on &lt;code&gt;cpu-2&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
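&lt;p&gt;How such a summary expands per instance can be sketched as follows. The placeholder syntax matches the annotation above, but the rendering function is a simplified stand-in for the Go-based templating Grafana actually uses.&lt;/p&gt;

```python
import re

def render(template, labels):
    # Replace {{ $labels.<name> }} placeholders with the instance's
    # label values. Simplified stand-in, illustrative only.
    return re.sub(
        r"\{\{\s*\$labels\.(\w+)\s*\}\}",
        lambda m: labels[m.group(1)],
        template,
    )

summary = "High CPU usage on {{$labels.cpu}}"
for labels in ({"cpu": "cpu-0"}, {"cpu": "cpu-2"}):
    print(render(summary, labels))
```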
&lt;h2 id=&#34;try-it-with-testdata&#34;&gt;Try it with TestData&lt;/h2&gt;
&lt;p&gt;You can quickly experiment with multi-dimensional alerts using the 
    &lt;a href=&#34;/docs/grafana/v12.4/datasources/testdata/&#34;&gt;&lt;strong&gt;TestData&lt;/strong&gt; data source&lt;/a&gt;, which can generate multiple random time series.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Add the &lt;strong&gt;TestData&lt;/strong&gt; data source through the &lt;strong&gt;Connections&lt;/strong&gt; menu.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Go to &lt;strong&gt;Alerting&lt;/strong&gt; and create an alert rule.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;TestData&lt;/strong&gt; as the data source.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Configure the TestData scenario:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scenario: &lt;strong&gt;Random Walk&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;Labels: &lt;code&gt;cpu=cpu-$seriesIndex&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Series count: 3&lt;/li&gt;
&lt;li&gt;Min: 70, Max: 100&lt;/li&gt;
&lt;li&gt;Spread: 2&lt;/li&gt;
&lt;/ul&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link&#34;
           href=&#34;/media/docs/alerting/testdata-random-series-v2.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload &#34;
             data-src=&#34;/media/docs/alerting/testdata-random-series-v2.png&#34;data-srcset=&#34;/media/docs/alerting/testdata-random-series-v2.png?w=320 320w, /media/docs/alerting/testdata-random-series-v2.png?w=550 550w, /media/docs/alerting/testdata-random-series-v2.png?w=750 750w, /media/docs/alerting/testdata-random-series-v2.png?w=900 900w, /media/docs/alerting/testdata-random-series-v2.png?w=1040 1040w, /media/docs/alerting/testdata-random-series-v2.png?w=1240 1240w, /media/docs/alerting/testdata-random-series-v2.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Generating random time series data using the TestData data source&#34;width=&#34;1165&#34;height=&#34;676&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/testdata-random-series-v2.png&#34;
               alt=&#34;Generating random time series data using the TestData data source&#34;width=&#34;1165&#34;height=&#34;676&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;reduce-time-series-data-for-comparison&#34;&gt;Reduce time series data for comparison&lt;/h2&gt;
&lt;p&gt;The example returns three time series, as shown above, with values across the selected time range.&lt;/p&gt;
&lt;p&gt;To alert on each series, you need to reduce the time series to a single value that the alert condition can evaluate and determine the alert instance state.&lt;/p&gt;
&lt;p&gt;Grafana Alerting provides several ways to reduce time series data:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Data source query functions&lt;/strong&gt;. The earlier example used the Prometheus &lt;code&gt;sum&lt;/code&gt; function to sum the rate results by &lt;code&gt;cpu&lt;/code&gt;, producing a single value per CPU core.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Reduce expression&lt;/strong&gt;. In the query and condition section, Grafana provides the &lt;code&gt;Reduce&lt;/code&gt; expression to aggregate time series data.
&lt;ul&gt;
&lt;li&gt;In &lt;strong&gt;Default mode&lt;/strong&gt;, the &lt;strong&gt;When&lt;/strong&gt; input selects a reducer (like &lt;code&gt;last&lt;/code&gt;, &lt;code&gt;mean&lt;/code&gt;, or &lt;code&gt;min&lt;/code&gt;), and the threshold compares that reduced value.&lt;/li&gt;
&lt;li&gt;In &lt;strong&gt;Advanced mode&lt;/strong&gt;, you can add the 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/queries-conditions/#reduce&#34;&gt;&lt;strong&gt;Reduce&lt;/strong&gt; expression&lt;/a&gt; (e.g., &lt;code&gt;last()&lt;/code&gt;, &lt;code&gt;mean()&lt;/code&gt;) before defining the threshold (alert condition).&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
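&lt;p&gt;The reduce-then-threshold flow can be sketched as follows. The reducer names mirror the options listed above; the sample values are made up for illustration.&lt;/p&gt;

```python
def reduce_series(samples, mode="mean"):
    # Collapse a series of samples to the single value the threshold compares.
    reducers = {
        "last": lambda s: s[-1],
        "mean": lambda s: sum(s) / len(s),
        "min": min,
        "max": max,
    }
    return reducers[mode](samples)

samples = {"cpu-0": [92, 95, 98], "cpu-1": [28, 30, 32]}
for cpu, values in samples.items():
    reduced = reduce_series(values, "mean")
    print(cpu, reduced, "Firing" if reduced > 80 else "Normal")
```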
&lt;p&gt;For demo purposes, this example uses the &lt;strong&gt;Advanced mode&lt;/strong&gt; with a &lt;strong&gt;Reduce&lt;/strong&gt; expression:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Toggle &lt;strong&gt;Advanced mode&lt;/strong&gt; in the top right section of the query panel to enable adding additional expressions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add the &lt;strong&gt;Reduce&lt;/strong&gt; expression using a function like &lt;code&gt;mean()&lt;/code&gt; to reduce each time series to a single value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Define the alert condition using a &lt;strong&gt;Threshold&lt;/strong&gt; like &lt;code&gt;$reducer &amp;gt; 80&lt;/code&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Preview&lt;/strong&gt; to evaluate the alert rule.&lt;/p&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link captioned&#34;
           href=&#34;/media/docs/alerting/using-expressions-with-multiple-series.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload mb-0&#34;
             data-src=&#34;/media/docs/alerting/using-expressions-with-multiple-series.png&#34;data-srcset=&#34;/media/docs/alerting/using-expressions-with-multiple-series.png?w=320 320w, /media/docs/alerting/using-expressions-with-multiple-series.png?w=550 550w, /media/docs/alerting/using-expressions-with-multiple-series.png?w=750 750w, /media/docs/alerting/using-expressions-with-multiple-series.png?w=900 900w, /media/docs/alerting/using-expressions-with-multiple-series.png?w=1040 1040w, /media/docs/alerting/using-expressions-with-multiple-series.png?w=1240 1240w, /media/docs/alerting/using-expressions-with-multiple-series.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Alert preview using a Reduce expression and a threshold condition&#34;width=&#34;1049&#34;height=&#34;416&#34;title=&#34;The alert condition evaluates the reduced value for each alert instance and shows whether each instance is Firing or Normal.&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/using-expressions-with-multiple-series.png&#34;
               alt=&#34;Alert preview using a Reduce expression and a threshold condition&#34;width=&#34;1049&#34;height=&#34;416&#34;title=&#34;The alert condition evaluates the reduced value for each alert instance and shows whether each instance is Firing or Normal.&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;figcaption class=&#34;w-100p caption text-gray-13  &#34;&gt;The alert condition evaluates the reduced value for each alert instance and shows whether each instance is Firing or Normal.&lt;/figcaption&gt;&lt;/a&gt;&lt;/figure&gt;


&lt;div class=&#34;admonition admonition-tip&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Tip&lt;/p&gt;&lt;p&gt;You can explore this &lt;strong&gt;&lt;a href=&#34;https://play.grafana.org/alerting/grafana/multi-dimensional-alerts/view?tech=docs&amp;amp;pg=alerting-examples&amp;amp;plcmt=callout-tip&amp;amp;cta=alert-multi-dimensional&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;alerting example in Grafana Play&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Open the example to view alert evaluation results, generated alert instances, the alert history timeline, and alert rule details.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;learn-more&#34;&gt;Learn more&lt;/h2&gt;
&lt;p&gt;This example shows how Grafana Alerting implements a multi-dimensional alerting model (one rule, many alert instances) and why reducing time series data to a single value is required for evaluation.&lt;/p&gt;
&lt;p&gt;For additional learning resources, check out:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started-pt2/&#34;&gt;Get started tutorial – Create multi-dimensional alerts and route them&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/table-data/&#34;&gt;Example of alerting on tabular data&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="example-of-multi-dimensional-alerts-on-time-series-data">Example of multi-dimensional alerts on time series data&lt;/h1>
&lt;p>This example shows how a single alert rule can generate multiple alert instances — one for each label set (or time series). This is called &lt;strong>multi-dimensional alerting&lt;/strong>: one alert rule, many alert instances.&lt;/p></description></item><item><title>Example of alerting on tabular data</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/table-data/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/table-data/</guid><content><![CDATA[&lt;h1 id=&#34;example-of-alerting-on-tabular-data&#34;&gt;Example of alerting on tabular data&lt;/h1&gt;
&lt;p&gt;Not all data sources return time series data. SQL databases, CSV files, and some APIs often return results as rows or arrays of columns or fields — commonly referred to as tabular data.&lt;/p&gt;
&lt;p&gt;This example shows how to create an alert rule using data in table format. Grafana treats each row as a separate alert instance, as long as the data meets the expected format.&lt;/p&gt;
&lt;h2 id=&#34;how-grafana-alerting-evaluates-tabular-data&#34;&gt;How Grafana Alerting evaluates tabular data&lt;/h2&gt;
&lt;p&gt;When a query returns data in table format, Grafana transforms each row into a separate alert instance.&lt;/p&gt;
&lt;p&gt;To evaluate each row (alert instance), it expects:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Exactly one numeric column.&lt;/strong&gt; This is the value used to evaluate the alert condition.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Non-numeric columns.&lt;/strong&gt; These columns define the label set: each column name becomes a label name, and the cell value becomes the label value.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Unique label sets per row.&lt;/strong&gt; Each row must be uniquely identifiable by its labels. This ensures each row represents a distinct alert instance.&lt;/li&gt;
&lt;/ol&gt;


&lt;div class=&#34;admonition admonition-caution&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Caution&lt;/p&gt;&lt;p&gt;All three conditions must be met; otherwise, Grafana can’t evaluate the table data and the rule fails.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;
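&lt;p&gt;The three requirements can be expressed as a small validation sketch. This illustrates the rules above; it is not the actual check Grafana performs, and the function name is made up.&lt;/p&gt;

```python
def validate_table(columns, rows):
    # Rule 1: exactly one numeric column (the value to evaluate).
    numeric = [
        c for c in columns
        if all(isinstance(r[c], (int, float)) for r in rows)
    ]
    if len(numeric) != 1:
        raise ValueError("expected exactly one numeric column")
    # Rule 2: the remaining columns form each row's label set.
    label_cols = [c for c in columns if c != numeric[0]]
    # Rule 3: every row's label set must be unique.
    label_sets = {tuple((c, r[c]) for c in label_cols) for r in rows}
    if len(label_sets) != len(rows):
        raise ValueError("label sets must be unique per row")
    return numeric[0], label_cols

rows = [
    {"Host": "web1", "Disk": "/etc", "PercentFree": 3},
    {"Host": "web2", "Disk": "/var", "PercentFree": 4},
]
print(validate_table(["Host", "Disk", "PercentFree"], rows))
```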

&lt;h2 id=&#34;example-overview&#34;&gt;Example overview&lt;/h2&gt;
&lt;p&gt;Imagine you store disk usage in a &lt;code&gt;DiskSpace&lt;/code&gt; table and you want to trigger alerts when the available space drops below 5%.&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Time&lt;/th&gt;
              &lt;th&gt;Host&lt;/th&gt;
              &lt;th&gt;Disk&lt;/th&gt;
              &lt;th&gt;PercentFree&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;2021-06-07&lt;/td&gt;
              &lt;td&gt;web1&lt;/td&gt;
              &lt;td&gt;/etc&lt;/td&gt;
              &lt;td&gt;3&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;2021-06-07&lt;/td&gt;
              &lt;td&gt;web2&lt;/td&gt;
              &lt;td&gt;/var&lt;/td&gt;
              &lt;td&gt;4&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;2021-06-07&lt;/td&gt;
              &lt;td&gt;web3&lt;/td&gt;
              &lt;td&gt;/var&lt;/td&gt;
              &lt;td&gt;8&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;To calculate the average free space per Host and Disk, you can use &lt;code&gt;$__timeFilter&lt;/code&gt; to filter by time without returning the time column to Grafana:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;SQL&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-sql&#34;&gt;SELECT
  Host,
  Disk,
  AVG(PercentFree) AS PercentFree
FROM DiskSpace
WHERE $__timeFilter(Time)
GROUP BY Host, Disk&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This query returns the following table response:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Host&lt;/th&gt;
              &lt;th&gt;Disk&lt;/th&gt;
              &lt;th&gt;PercentFree&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;web1&lt;/td&gt;
              &lt;td&gt;/etc&lt;/td&gt;
              &lt;td&gt;3&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;web2&lt;/td&gt;
              &lt;td&gt;/var&lt;/td&gt;
              &lt;td&gt;4&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;web3&lt;/td&gt;
              &lt;td&gt;/var&lt;/td&gt;
              &lt;td&gt;8&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;When Alerting evaluates the query response, the data is transformed into three alert instances as previously detailed:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The numeric column becomes the value for the alert condition.&lt;/li&gt;
&lt;li&gt;Additional columns define the label set for each alert instance.&lt;/li&gt;
&lt;/ul&gt;
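&lt;p&gt;This row-to-instance transformation, together with a condition like &lt;code&gt;$A &amp;lt; 5&lt;/code&gt;, can be sketched as follows. The data mirrors the query result above; the logic is illustrative only.&lt;/p&gt;

```python
def to_instances(rows, threshold=5):
    # Non-numeric cells become labels; the numeric cell is the value that
    # the "< threshold" condition evaluates for each instance.
    instances = []
    for row in rows:
        labels = {k: v for k, v in row.items() if isinstance(v, str)}
        value = next(v for v in row.values() if isinstance(v, (int, float)))
        state = "Firing" if value < threshold else "Normal"
        instances.append((labels, value, state))
    return instances

rows = [
    {"Host": "web1", "Disk": "/etc", "PercentFree": 3},
    {"Host": "web2", "Disk": "/var", "PercentFree": 4},
    {"Host": "web3", "Disk": "/var", "PercentFree": 8},
]
for labels, value, state in to_instances(rows):
    print(labels, value, state)
```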
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Alert instance&lt;/th&gt;
              &lt;th&gt;Value&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;{Host=&amp;quot;web1&amp;quot;, Disk=&amp;quot;/etc&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;3&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;{Host=&amp;quot;web2&amp;quot;, Disk=&amp;quot;/var&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;4&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;{Host=&amp;quot;web3&amp;quot;, Disk=&amp;quot;/var&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;8&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;Finally, an alert condition that checks for less than 5% of free space (&lt;code&gt;$A &amp;lt; 5&lt;/code&gt;) would result in two alert instances firing:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th&gt;Alert instance&lt;/th&gt;
              &lt;th&gt;Value&lt;/th&gt;
              &lt;th&gt;State&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;{Host=&amp;quot;web1&amp;quot;, Disk=&amp;quot;/etc&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;3&lt;/td&gt;
              &lt;td&gt;Firing&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;{Host=&amp;quot;web2&amp;quot;, Disk=&amp;quot;/var&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;4&lt;/td&gt;
              &lt;td&gt;Firing&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td&gt;&lt;code&gt;{Host=&amp;quot;web3&amp;quot;, Disk=&amp;quot;/var&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td&gt;8&lt;/td&gt;
              &lt;td&gt;Normal&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;try-it-with-testdata&#34;&gt;Try it with TestData&lt;/h2&gt;
&lt;p&gt;To test this quickly, you can simulate the table using the 
    &lt;a href=&#34;/docs/grafana/v12.4/datasources/testdata/&#34;&gt;&lt;strong&gt;TestData&lt;/strong&gt; data source&lt;/a&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Add the &lt;strong&gt;TestData&lt;/strong&gt; data source through the &lt;strong&gt;Connections&lt;/strong&gt; menu.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Go to &lt;strong&gt;Alerting&lt;/strong&gt; and create an alert rule.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Select &lt;strong&gt;TestData&lt;/strong&gt; as the data source.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;From &lt;strong&gt;Scenario&lt;/strong&gt;, select &lt;strong&gt;CSV Content&lt;/strong&gt; and paste this CSV:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;Bash&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-bash&#34;&gt;host, disk, percentFree
web1, /etc, 3
web2, /var, 4
web3, /var, 8&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set a condition like &lt;code&gt;$A &amp;lt; 5&lt;/code&gt; and &lt;strong&gt;Preview&lt;/strong&gt; the alert.&lt;/p&gt;
&lt;p&gt;Grafana evaluates the table data and fires the first two alert instances.&lt;/p&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link&#34;
           href=&#34;/media/docs/alerting/example-table-data-preview.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload &#34;
             data-src=&#34;/media/docs/alerting/example-table-data-preview.png&#34;data-srcset=&#34;/media/docs/alerting/example-table-data-preview.png?w=320 320w, /media/docs/alerting/example-table-data-preview.png?w=550 550w, /media/docs/alerting/example-table-data-preview.png?w=750 750w, /media/docs/alerting/example-table-data-preview.png?w=900 900w, /media/docs/alerting/example-table-data-preview.png?w=1040 1040w, /media/docs/alerting/example-table-data-preview.png?w=1240 1240w, /media/docs/alerting/example-table-data-preview.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Alert preview with tabular data using the TestData data source&#34;width=&#34;1080&#34;height=&#34;881&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/example-table-data-preview.png&#34;
               alt=&#34;Alert preview with tabular data using the TestData data source&#34;width=&#34;1080&#34;height=&#34;881&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;


&lt;div class=&#34;admonition admonition-tip&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Tip&lt;/p&gt;&lt;p&gt;You can explore this &lt;strong&gt;&lt;a href=&#34;https://play.grafana.org/alerting/grafana/tabular-data/view?tech=docs&amp;amp;pg=alerting-examples&amp;amp;plcmt=callout-tip&amp;amp;cta=alert-tabular-data&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;alerting example in Grafana Play&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Open the example to view alert evaluation results, generated alert instances, the alert history timeline, and alert rule details.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;csv-data-with-infinity&#34;&gt;CSV data with Infinity&lt;/h2&gt;
&lt;p&gt;When the &lt;a href=&#34;/docs/plugins/yesoreyeram-infinity-datasource/latest/csv/&#34;&gt;Infinity plugin fetches CSV data&lt;/a&gt;, all columns are parsed and returned as strings. By default, this causes the query expression to fail in Alerting.&lt;/p&gt;
&lt;p&gt;To make it work, you need to format the CSV data as &lt;a href=&#34;#how-grafana-alerting-evaluates-tabular-data&#34;&gt;expected by Grafana Alerting&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In the query editor, specify the column names and their types to ensure that only one column is treated as a number.&lt;/p&gt;
&lt;figure
    class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
    style=&#34;max-width: 750px;&#34;
    itemprop=&#34;associatedMedia&#34;
    itemscope=&#34;&#34;
    itemtype=&#34;http://schema.org/ImageObject&#34;
  &gt;&lt;a
        class=&#34;lightbox-link&#34;
        href=&#34;/media/docs/alerting/example-table-data-infinity-csv-data.png&#34;
        itemprop=&#34;contentUrl&#34;
      &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
          class=&#34;lazyload &#34;
          data-src=&#34;/media/docs/alerting/example-table-data-infinity-csv-data.png&#34;data-srcset=&#34;/media/docs/alerting/example-table-data-infinity-csv-data.png?w=320 320w, /media/docs/alerting/example-table-data-infinity-csv-data.png?w=550 550w, /media/docs/alerting/example-table-data-infinity-csv-data.png?w=750 750w, /media/docs/alerting/example-table-data-infinity-csv-data.png?w=900 900w, /media/docs/alerting/example-table-data-infinity-csv-data.png?w=1040 1040w, /media/docs/alerting/example-table-data-infinity-csv-data.png?w=1240 1240w, /media/docs/alerting/example-table-data-infinity-csv-data.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Using the Infinity data source plugin to fetch CSV data in Alerting&#34;width=&#34;930&#34;height=&#34;960&#34;/&gt;
        &lt;noscript&gt;
          &lt;img
            src=&#34;/media/docs/alerting/example-table-data-infinity-csv-data.png&#34;
            alt=&#34;Using the Infinity data source plugin to fetch CSV data in Alerting&#34;width=&#34;930&#34;height=&#34;960&#34;/&gt;
        &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;h2 id=&#34;differences-with-time-series-data&#34;&gt;Differences with time series data&lt;/h2&gt;
&lt;p&gt;Working with time series is similar—each series is treated as a separate alert instance, based on its label set.&lt;/p&gt;
&lt;p&gt;The key difference is the data format:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Time series data&lt;/strong&gt; contains multiple values over time, each with its own timestamp.
To evaluate the alert condition, alert rules &lt;strong&gt;must reduce each series to a single number&lt;/strong&gt; using a function like &lt;code&gt;last()&lt;/code&gt;, &lt;code&gt;avg()&lt;/code&gt;, or &lt;code&gt;max()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tabular data&lt;/strong&gt; doesn’t require reduction, as each row contains only a single numeric value used to evaluate the alert condition.&lt;/li&gt;
&lt;/ul&gt;
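The reduction step described above can be sketched in Python (label sets and sample values are hypothetical, and this is an illustration, not Grafana internals): each series is collapsed to a single number with a function such as `last()`, and one alert instance fires per label set that crosses the threshold.

```python
# Sketch of multi-dimensional evaluation: reduce each series to one
# number, then compare that number against the alert threshold.
# Label sets and utilization samples below are hypothetical.
series = {
    ("cpu", "0"): [0.91, 0.94, 0.97],
    ("cpu", "1"): [0.20, 0.25, 0.22],
}

reducers = {
    "last": lambda values: values[-1],
    "avg": lambda values: sum(values) / len(values),
    "max": lambda values: max(values),
}

threshold = 0.9  # fire when the reduced value exceeds 90% utilization
reduce_fn = reducers["last"]

# One alert instance per label set whose reduced value crosses the threshold.
firing = {
    labels: reduce_fn(values)
    for labels, values in series.items()
    if reduce_fn(values) > threshold
}
print(firing)  # only the ("cpu", "0") series fires
```

Tabular data skips the reducer entirely: each row already carries the single numeric value compared against the threshold.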
&lt;p&gt;For comparison, see the 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multi-dimensional time series data example&lt;/a&gt;.&lt;/p&gt;
]]></content><description>&lt;h1 id="example-of-alerting-on-tabular-data">Example of alerting on tabular data&lt;/h1>
&lt;p>Not all data sources return time series data. SQL databases, CSV files, and some APIs often return results as rows or arrays of columns or fields — commonly referred to as tabular data.&lt;/p></description></item><item><title>Trace-based alerts</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/trace-based-alerts/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/trace-based-alerts/</guid><content><![CDATA[&lt;h1 id=&#34;examples-of-trace-based-alerts&#34;&gt;Examples of trace-based alerts&lt;/h1&gt;
&lt;p&gt;Metrics are the foundation of most alerting systems. They are usually the first signal that something is wrong, but they don’t always indicate &lt;em&gt;where&lt;/em&gt; or &lt;em&gt;why&lt;/em&gt; a failure occurs.&lt;/p&gt;
&lt;p&gt;Traces fill that gap by showing the complete path a request takes through your system. They map the workflows across services, indicating where the request slows down or fails.&lt;/p&gt;
&lt;figure
    class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
    style=&#34;max-width: 750px;&#34;
    itemprop=&#34;associatedMedia&#34;
    itemscope=&#34;&#34;
    itemtype=&#34;http://schema.org/ImageObject&#34;
  &gt;&lt;a
        class=&#34;lightbox-link&#34;
        href=&#34;/media/docs/alerting/screenshot-traces-visualization-11.5.png&#34;
        itemprop=&#34;contentUrl&#34;
      &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
          class=&#34;lazyload &#34;
          data-src=&#34;/media/docs/alerting/screenshot-traces-visualization-11.5.png&#34;data-srcset=&#34;/media/docs/alerting/screenshot-traces-visualization-11.5.png?w=320 320w, /media/docs/alerting/screenshot-traces-visualization-11.5.png?w=550 550w, /media/docs/alerting/screenshot-traces-visualization-11.5.png?w=750 750w, /media/docs/alerting/screenshot-traces-visualization-11.5.png?w=900 900w, /media/docs/alerting/screenshot-traces-visualization-11.5.png?w=1040 1040w, /media/docs/alerting/screenshot-traces-visualization-11.5.png?w=1240 1240w, /media/docs/alerting/screenshot-traces-visualization-11.5.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Trace view&#34;width=&#34;804&#34;height=&#34;411&#34;/&gt;
        &lt;noscript&gt;
          &lt;img
            src=&#34;/media/docs/alerting/screenshot-traces-visualization-11.5.png&#34;
            alt=&#34;Trace view&#34;width=&#34;804&#34;height=&#34;411&#34;/&gt;
        &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;p&gt;Traces attribute duration and errors directly to specific services and spans, helping you pinpoint the affected component and its scope. With this additional context, alerting on tracing data can help you &lt;strong&gt;identify root causes faster&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;You can create trace-based alerts in Grafana Alerting using two main approaches:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Querying metrics generated from tracing data.&lt;/li&gt;
&lt;li&gt;Using TraceQL, a query language for traces available in Grafana Tempo.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This guide provides introductory examples and distinct approaches for setting up &lt;strong&gt;trace-based alerts&lt;/strong&gt; in Grafana. Tracing data is commonly collected using &lt;strong&gt;OpenTelemetry (OTel)&lt;/strong&gt; instrumentation. OTel allows you to integrate trace data from a wide range of applications and environments into Grafana.&lt;/p&gt;
&lt;h2 id=&#34;alerting-on-span-metrics&#34;&gt;Alerting on span metrics&lt;/h2&gt;
&lt;p&gt;OpenTelemetry provides processors that convert tracing data into Prometheus-style metrics.&lt;/p&gt;
&lt;p&gt;The &lt;strong&gt;service graph&lt;/strong&gt; and &lt;strong&gt;span metrics&lt;/strong&gt; processors are the standard options in Alloy and Tempo to generate Prometheus metrics from traces. They can generate the rate, error, and duration (RED) metrics from sampled spans.&lt;/p&gt;
&lt;p&gt;You can then create alert rules that query metrics derived from traces.&lt;/p&gt;
&lt;figure
    class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
    style=&#34;max-width: 750px;&#34;
    itemprop=&#34;associatedMedia&#34;
    itemscope=&#34;&#34;
    itemtype=&#34;http://schema.org/ImageObject&#34;
  &gt;&lt;a
        class=&#34;lightbox-link&#34;
        href=&#34;/media/docs/alerting/why-trace-based-metrics.png&#34;
        itemprop=&#34;contentUrl&#34;
      &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
          class=&#34;lazyload &#34;
          data-src=&#34;/media/docs/alerting/why-trace-based-metrics.png&#34;data-srcset=&#34;/media/docs/alerting/why-trace-based-metrics.png?w=320 320w, /media/docs/alerting/why-trace-based-metrics.png?w=550 550w, /media/docs/alerting/why-trace-based-metrics.png?w=750 750w, /media/docs/alerting/why-trace-based-metrics.png?w=900 900w, /media/docs/alerting/why-trace-based-metrics.png?w=1040 1040w, /media/docs/alerting/why-trace-based-metrics.png?w=1240 1240w, /media/docs/alerting/why-trace-based-metrics.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Why metrics if you have traces?&#34;width=&#34;2158&#34;height=&#34;942&#34;/&gt;
        &lt;noscript&gt;
          &lt;img
            src=&#34;/media/docs/alerting/why-trace-based-metrics.png&#34;
            alt=&#34;Why metrics if you have traces?&#34;width=&#34;2158&#34;height=&#34;942&#34;/&gt;
        &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/metrics-from-traces/service_graphs/&#34;&gt;Service graph metrics&lt;/a&gt; focus on inter-service communication and dependency health. They measure the calls between services, helping Grafana to infer the service topology. However, they measure only the interaction between two services—they don’t include the internal processing time of the client service.&lt;/p&gt;
&lt;p&gt;You can use service graph metrics to detect infrastructure issues such as network degradation or service mesh problems.&lt;/p&gt;
&lt;p&gt;For trace-based alerts, we recommend using &lt;a href=&#34;/docs/tempo/latest/metrics-from-traces/span-metrics/&#34;&gt;span metrics&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Span metrics&lt;/strong&gt; measure the total processing time of a service request: capturing what happens inside the service, not just the communication between services. They include the time spent on internal processing and waiting on downstream calls, providing an &lt;strong&gt;end-to-end picture of service performance&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Depending on which generator you use, the following span metrics are produced:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Span metrics generator&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Metric name&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Prometheus metric type&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;a href=&#34;/docs/alloy/latest/reference/components/otelcol/otelcol.connector.spanmetrics/&#34;&gt;Alloy&lt;/a&gt; and &lt;a href=&#34;https://github.com/open-telemetry/opentelemetry-collector-contrib/tree/main/connector/spanmetricsconnector&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OTEL span metrics connector&lt;/a&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;traces_span_metrics_calls_total&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Counter&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Total count of the span&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;traces_span_metrics_duration_seconds&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Histogram (native or classic)&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Duration of the span&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;a href=&#34;/docs/tempo/latest/metrics-from-traces/span-metrics/span-metrics-metrics-generator/&#34;&gt;Tempo&lt;/a&gt; and &lt;a href=&#34;/docs/grafana-cloud/monitor-applications/application-observability/setup/metrics-labels/&#34;&gt;Grafana Cloud Application Observability&lt;/a&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;traces_spanmetrics_calls_total&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Counter&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Total count of the span&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;traces_spanmetrics_latency&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Histogram (native or classic)&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Duration of the span&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;traces_spanmetrics_size_total&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Counter&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Total size of spans ingested&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;By default, each metric includes the following labels: &lt;code&gt;service&lt;/code&gt;, &lt;code&gt;span_name&lt;/code&gt;, &lt;code&gt;span_kind&lt;/code&gt;, &lt;code&gt;status_code&lt;/code&gt;, &lt;code&gt;status_message&lt;/code&gt;, &lt;code&gt;job&lt;/code&gt;, and &lt;code&gt;instance&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;In the metrics generator, you can customize how traces are converted into metrics by configuring histograms, exemplars, metric dimensions, and other options.&lt;/p&gt;
&lt;p&gt;The following examples assume that span metrics have already been generated using one of these options or an alternative.&lt;/p&gt;
&lt;h3 id=&#34;detect-slow-span-operations&#34;&gt;Detect slow span operations&lt;/h3&gt;
&lt;p&gt;This example shows how to define an alert rule that detects when operations handled by a service become slow.&lt;/p&gt;
&lt;p&gt;Before looking at the query, it’s useful to review a few &lt;a href=&#34;/docs/tempo/latest/introduction/trace-structure/&#34;&gt;trace elements&lt;/a&gt; that shape how it works:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A trace represents a single request or transaction as it flows through multiple spans and services. A span refers to a specific operation within a service.&lt;/li&gt;
&lt;li&gt;Each span includes the operation name (&lt;code&gt;span_name&lt;/code&gt;) and its duration (the metric value), as well as additional fields like &lt;a href=&#34;https://opentelemetry.io/docs/concepts/signals/traces/#span-status&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;span status&lt;/a&gt; (&lt;code&gt;status_code&lt;/code&gt;) and &lt;a href=&#34;https://opentelemetry.io/docs/concepts/signals/traces/#span-kind&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;span kind&lt;/a&gt; (&lt;code&gt;span_kind&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;A server span represents work performed on the receiving side of a request, while a client span represents the outbound call (parent span) waiting for a response (client → server).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To detect slow inbound operations within a specific service, you can define an alert rule that detects when the percentile latency of server spans exceeds a threshold. For example:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Detect when the 95th percentile latency of requests (excluding errors) exceeds 2 seconds.&lt;/em&gt;&lt;/p&gt;
&lt;h4 id=&#34;using-native-histograms&#34;&gt;Using native histograms&lt;/h4&gt;
&lt;p&gt;The following PromQL query uses the &lt;code&gt;traces_span_metrics_duration_seconds&lt;/code&gt; native histogram metric to define the alert rule query.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;histogram_quantile(0.95,
 sum by (span_name) (
   rate(traces_span_metrics_duration_seconds{
     service_name=&amp;#34;&amp;lt;SERVICE_NAME&amp;gt;&amp;#34;,
     span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;,
     status_code!=&amp;#34;STATUS_CODE_ERROR&amp;#34;
   }[10m])
 )
) &amp;gt; 2&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Here’s the query breakdown:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;traces_span_metrics_duration_seconds&lt;/code&gt;
It’s a native histogram produced from spans using Alloy or the OTEL collector. The metric is filtered by:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;service_name=&amp;quot;&amp;lt;SERVICE_NAME&amp;gt;&amp;quot;&lt;/code&gt; targets a particular service.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;span_kind=&amp;quot;SPAN_KIND_SERVER&amp;quot;&lt;/code&gt; selects spans handling inbound requests.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;status_code!=&amp;quot;STATUS_CODE_ERROR&amp;quot;&lt;/code&gt; excludes spans that ended with errors.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;You should query &lt;code&gt;traces_spanmetrics_latency&lt;/code&gt; when using other span metric generators.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;rate(...[10m])&lt;/code&gt;
Converts the histogram into a per-second histogram over the last 10 minutes (the distribution of spans per second during that period).
This makes the time window explicit and ensures latencies can be calculated over the last 10 minutes using &lt;code&gt;histogram_*&lt;/code&gt; functions.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;sum by (span_name)( … )&lt;/code&gt;
Merges all series that share the same &lt;code&gt;span_name&lt;/code&gt;. This creates a &lt;a href=&#34;/docs/grafana/latest/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multidimensional alert&lt;/a&gt; that generates one alert instance per span name (operation).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;histogram_quantile(0.95, ...)&lt;/code&gt;
Calculates p95 latency from the histogram after applying the rate.
The query runs as an &lt;strong&gt;instant Prometheus query&lt;/strong&gt;, returning a single value for the 10-minute window.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;gt; 2&lt;/code&gt;
Defines the threshold condition. It returns only series whose p95 latency exceeds 2 seconds.
Alternatively, you can set this threshold as a Grafana Alerting expression in the UI, as shown in the following screenshot.&lt;/p&gt;
&lt;figure
      class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
      style=&#34;max-width: 750px;&#34;
      itemprop=&#34;associatedMedia&#34;
      itemscope=&#34;&#34;
      itemtype=&#34;http://schema.org/ImageObject&#34;
    &gt;&lt;a
          class=&#34;lightbox-link captioned&#34;
          href=&#34;/media/docs/alerting/trace-based-alertrule-screenshot.png&#34;
          itemprop=&#34;contentUrl&#34;
        &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
            class=&#34;lazyload mb-0&#34;
            data-src=&#34;/media/docs/alerting/trace-based-alertrule-screenshot.png&#34;data-srcset=&#34;/media/docs/alerting/trace-based-alertrule-screenshot.png?w=320 320w, /media/docs/alerting/trace-based-alertrule-screenshot.png?w=550 550w, /media/docs/alerting/trace-based-alertrule-screenshot.png?w=750 750w, /media/docs/alerting/trace-based-alertrule-screenshot.png?w=900 900w, /media/docs/alerting/trace-based-alertrule-screenshot.png?w=1040 1040w, /media/docs/alerting/trace-based-alertrule-screenshot.png?w=1240 1240w, /media/docs/alerting/trace-based-alertrule-screenshot.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Alert rule querying span metrics and using threshold expression&#34;width=&#34;1289&#34;height=&#34;937&#34;title=&#34;Alert rule querying span metrics and using threshold expression&#34;/&gt;
          &lt;noscript&gt;
            &lt;img
              src=&#34;/media/docs/alerting/trace-based-alertrule-screenshot.png&#34;
              alt=&#34;Alert rule querying span metrics and using threshold expression&#34;width=&#34;1289&#34;height=&#34;937&#34;title=&#34;Alert rule querying span metrics and using threshold expression&#34;/&gt;
          &lt;/noscript&gt;&lt;/div&gt;&lt;figcaption class=&#34;w-100p caption text-gray-13  &#34;&gt;Alert rule querying span metrics and using threshold expression&lt;/figcaption&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;using-classic-histograms&#34;&gt;Using classic histograms&lt;/h4&gt;
&lt;p&gt;Native histograms have only been stable in Prometheus since v3.8.0, so your span metric generator may still create classic histograms for latency span metrics, either &lt;code&gt;traces_span_metrics_duration_seconds&lt;/code&gt; or &lt;code&gt;traces_spanmetrics_latency&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When using classic histograms, the underlying metric is the same, but the exposition format changes. A classic histogram represents the distribution with fixed buckets and exposes three metrics:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;_bucket&lt;/code&gt;: cumulative buckets of the observations.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;_sum&lt;/code&gt;: total sum of all observed values.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;_count&lt;/code&gt;: count of observed values.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;To calculate percentiles accurately around a particular threshold (for example, &lt;code&gt;2s&lt;/code&gt;), you have to configure the classic histogram with an explicit bucket at that boundary, such as:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;[&amp;#34;100ms&amp;#34;, &amp;#34;250ms&amp;#34;, &amp;#34;1s&amp;#34;, &amp;#34;2s&amp;#34;, &amp;#34;5s&amp;#34;]&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;otelcol.connector.spanmetrics&lt;/code&gt; component configures the buckets using the &lt;a href=&#34;/docs/alloy/latest/reference/components/otelcol/otelcol.connector.spanmetrics/#explicit&#34;&gt;&lt;code&gt;explicit&lt;/code&gt; block&lt;/a&gt;. The metrics-generator in Tempo configures them with the &lt;a href=&#34;/docs/tempo/latest/configuration/#metrics-generator&#34;&gt;&lt;code&gt;span_metrics.histogram_buckets&lt;/code&gt; setting&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s the equivalent PromQL for classic histograms:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;histogram_quantile(0.95,
 sum by (span_name, le) (
   rate(traces_span_metrics_duration_seconds_bucket{
     service_name=&amp;#34;&amp;lt;SERVICE_NAME&amp;gt;&amp;#34;,
     span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;,
     status_code!=&amp;#34;STATUS_CODE_ERROR&amp;#34;
   }[10m])
 )
) &amp;gt; 2&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Key differences compared with the native histograms example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;You must configure a histogram bucket matching the desired threshold (for example, &lt;code&gt;2s&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;You must query the &lt;code&gt;_bucket&lt;/code&gt; metric, not the base metric.&lt;/li&gt;
&lt;li&gt;You must include &lt;code&gt;le&lt;/code&gt; in the &lt;code&gt;sum by (…)&lt;/code&gt; grouping for &lt;code&gt;histogram_quantile&lt;/code&gt; calculation.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Everything else remains the same.&lt;/p&gt;
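To see why a bucket boundary at the threshold matters, here is a small Python sketch of the linear interpolation that `histogram_quantile` applies to cumulative classic-histogram buckets. The bucket bounds and cumulative counts are hypothetical, chosen to match the example bucket layout above:

```python
# Estimate a quantile from cumulative classic-histogram buckets using
# linear interpolation inside the bucket that contains the target rank.
# Bucket upper bounds (seconds) and cumulative counts are hypothetical.
buckets = [(0.1, 40), (0.25, 70), (1.0, 90), (2.0, 96), (5.0, 100)]

def quantile(q, buckets):
    total = buckets[-1][1]
    rank = q * total
    lower_bound, lower_count = 0.0, 0
    for upper_bound, count in buckets:
        if rank <= count:
            # Interpolate linearly inside the bucket holding the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return buckets[-1][0]

p95 = quantile(0.95, buckets)
# rank 95 falls in the (1.0, 2.0] bucket, so the p95 estimate is
# 1.0 + (95 - 90) / (96 - 90) * (2.0 - 1.0), roughly 1.83 seconds.
```

Without a bucket boundary at `2s`, the interpolation would spread the estimate across a wider bucket, making the `> 2` comparison much less precise.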


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;The alert rules in these examples create &lt;a href=&#34;/docs/grafana/latest/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multi-dimensional alerts&lt;/a&gt;: one alert instance for each distinct span name.&lt;/p&gt;
&lt;p&gt;Dynamic span routes such as &lt;code&gt;/product/1234&lt;/code&gt; can create separate metric dimensions and alerts for each unique span, which can significantly impact metric costs and performance for large volumes.&lt;/p&gt;
&lt;p&gt;To prevent high-cardinality data, normalize dynamic routes like &lt;code&gt;/product/{id}&lt;/code&gt; using semantic attributes such as &lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/registry/attributes/http/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;&lt;code&gt;http.route&lt;/code&gt;&lt;/a&gt; and &lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/registry/attributes/url/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;&lt;code&gt;url.template&lt;/code&gt;&lt;/a&gt;, and limit dimensions to low-cardinality fields such as &lt;code&gt;service_name&lt;/code&gt;, &lt;code&gt;status_code&lt;/code&gt;, or &lt;code&gt;http_method&lt;/code&gt;.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;h3 id=&#34;detect-high-error-rate&#34;&gt;Detect high error rate&lt;/h3&gt;
&lt;p&gt;This example defines an alert rule that detects when the error rate for any operation exceeds 20%. You can use error rate alerts like this to identify increases in request errors, such as 5xx responses or internal failures.&lt;/p&gt;
&lt;p&gt;The following query calculates the fraction of failed server spans for each service and operation.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;(
  sum by (service, span_name) (
    rate(traces_span_metrics_calls_total{
      span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;,
      status_code=&amp;#34;STATUS_CODE_ERROR&amp;#34;
    }[10m])
  )
/
  sum by (service, span_name) (
    rate(traces_span_metrics_calls_total{
      span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;
    }[10m])
  )
) &amp;gt; 0.2&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Here’s the query breakdown:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;traces_span_metrics_calls_total&lt;/code&gt;
A counter metric produced from spans that tracks the number of completed span operations.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;span_kind=&amp;quot;SPAN_KIND_SERVER&amp;quot;&lt;/code&gt; selects spans handling inbound requests.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;status_code=&amp;quot;STATUS_CODE_ERROR&amp;quot;&lt;/code&gt; selects only spans that ended in error.&lt;/li&gt;
&lt;li&gt;Omitting the &lt;code&gt;status_code&lt;/code&gt; filter in the denominator includes all spans, returning the total span count.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;em&gt;Check whether your metric generator instead creates the &lt;code&gt;traces_spanmetrics_calls_total&lt;/code&gt; metric, and adjust the metric name.&lt;/em&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;rate(...[10m])&lt;/code&gt;
Converts the cumulative counter into a per-second rate over the last 10 minutes (the number of spans per second during that period).
This makes the time window explicit and ensures the error rate is calculated over the last 10 minutes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;sum by (service, span_name)( … )&lt;/code&gt;
Aggregates per service and operation, creating one alert instance for each &lt;code&gt;(service, span_name)&lt;/code&gt; combination.
This is a &lt;a href=&#34;/docs/grafana/latest/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multidimensional alert&lt;/a&gt; that applies to all services, helping identify which service and corresponding operation is failing.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;sum by () (...) / sum by () (...)&lt;/code&gt;
Divides failed spans by total spans to calculate the error rate per operation.
The result is a ratio between &lt;code&gt;0&lt;/code&gt; and &lt;code&gt;1&lt;/code&gt;, where &lt;code&gt;1&lt;/code&gt; means all operations failed.
The query runs as an &lt;strong&gt;instant Prometheus query&lt;/strong&gt;, returning a single value for the 10-minute window.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;gt; 0.2&lt;/code&gt;
Defines the threshold condition. It returns only series whose error rate is higher than 20% of spans.
Alternatively, you can set this threshold as a Grafana Alerting expression in the UI.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
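The arithmetic behind this condition can be sketched in Python; the per-operation span counts for the 10-minute window are hypothetical:

```python
# Error rate per (service, span_name): failed spans divided by total
# spans, with one alert instance per combination above the threshold.
# The services, span names, and counts below are made-up examples.
spans = {
    # (service, span_name): (error_spans, total_spans)
    ("checkout", "POST /orders"): (120, 400),
    ("checkout", "GET /orders"): (10, 500),
}

threshold = 0.2  # fire above a 20% error rate

firing = {
    key: errors / total
    for key, (errors, total) in spans.items()
    if errors / total > threshold
}
print(firing)  # only POST /orders fires: 120 / 400 = 0.3
```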
&lt;h3 id=&#34;enable-traffic-guardrails&#34;&gt;Enable traffic guardrails&lt;/h3&gt;
&lt;p&gt;When traffic is very low, even a single slow or failing request can trigger an alert.&lt;/p&gt;
&lt;p&gt;To avoid these types of false positives during low-traffic periods, you can include a &lt;strong&gt;minimum traffic condition&lt;/strong&gt; in your alert rule queries. For example:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum by (service, span_name)(
  increase(traces_span_metrics_calls_total{
    span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;
  }[10m])
) &amp;gt; 300&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This query returns only the &lt;code&gt;(service, span_name)&lt;/code&gt; combinations that handled more than 300 requests in the 10-minute period.&lt;/p&gt;
&lt;p&gt;This minimum level of traffic helps prevent false positives, ensuring the alert evaluates a significant number of spans before triggering.&lt;/p&gt;
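&lt;p&gt;Because &lt;code&gt;increase(x[10m])&lt;/code&gt; is equivalent to &lt;code&gt;rate(x[10m]) * 600&lt;/code&gt;, the same guardrail can also be written with &lt;code&gt;rate()&lt;/code&gt;, for consistency with the error-rate query:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-promql&#34;&gt;sum by (service, span_name) (
  rate(traces_span_metrics_calls_total{
    span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;
  }[10m])
) * 600 &amp;gt; 300&lt;/code&gt;&lt;/pre&gt;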
&lt;p&gt;You can combine this traffic condition with the &lt;strong&gt;error-rate&lt;/strong&gt; query to ensure alerts fire only when both conditions are met:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;((
  sum by (service, span_name) (
    rate(traces_span_metrics_calls_total{
      span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;,
      status_code=&amp;#34;STATUS_CODE_ERROR&amp;#34;
    }[10m])
  )
/
  sum by (service, span_name) (
    rate(traces_span_metrics_calls_total{
      span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;
    }[10m])
  )
) &amp;gt; 0.2)
and
(
  sum by (service, span_name) (
    increase(traces_span_metrics_calls_total{
      span_kind=&amp;#34;SPAN_KIND_SERVER&amp;#34;
    }[10m])
  ) &amp;gt; 300
)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;For a given span, the alert fires when:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;error rate exceeds 20%&lt;/strong&gt; over the last 10 minutes.&lt;/li&gt;
&lt;li&gt;The span &lt;strong&gt;handled at least 300 requests&lt;/strong&gt; over the last 10 minutes.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong&gt;Alternatively&lt;/strong&gt;, you can split the alert into separate queries and combine them using a math expression as the threshold. In the example below, &lt;code&gt;$ErrorRateCondition&lt;/code&gt; is the Grafana reference for the error-rate query, and &lt;code&gt;$TrafficCondition&lt;/code&gt; is the reference for the traffic query.&lt;/p&gt;
&lt;figure
    class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
    style=&#34;max-width: 500px;&#34;
    itemprop=&#34;associatedMedia&#34;
    itemscope=&#34;&#34;
    itemtype=&#34;http://schema.org/ImageObject&#34;
  &gt;&lt;a
        class=&#34;lightbox-link&#34;
        href=&#34;/media/docs/alerting/traffic-guardrail-with-separate-queries.png&#34;
        itemprop=&#34;contentUrl&#34;
      &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
          class=&#34;lazyload &#34;
          data-src=&#34;/media/docs/alerting/traffic-guardrail-with-separate-queries.png&#34;data-srcset=&#34;/media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=320 320w, /media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=550 550w, /media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=750 750w, /media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=900 900w, /media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=1040 1040w, /media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=1240 1240w, /media/docs/alerting/traffic-guardrail-with-separate-queries.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Alert rule with threshold based on two queries&#34;width=&#34;658&#34;height=&#34;250&#34;/&gt;
        &lt;noscript&gt;
          &lt;img
            src=&#34;/media/docs/alerting/traffic-guardrail-with-separate-queries.png&#34;
            alt=&#34;Alert rule with threshold based on two queries&#34;width=&#34;658&#34;height=&#34;250&#34;/&gt;
        &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;p&gt;In this case, you must ensure both queries group by the same labels.&lt;/p&gt;
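&lt;p&gt;A sketch of the combined Math expression, assuming each referenced query returns the raw ratio and count without thresholds:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-promql&#34;&gt;($ErrorRateCondition &amp;gt; 0.2) &amp;amp;&amp;amp; ($TrafficCondition &amp;gt; 300)&lt;/code&gt;&lt;/pre&gt;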
&lt;p&gt;The advantage of this approach is that you can observe the results of both independent queries. You can then access the query results through the &lt;a href=&#34;/docs/grafana/latest/alerting/alerting-rules/templates/reference/#values&#34;&gt;&lt;code&gt;$values&lt;/code&gt; variable&lt;/a&gt; and display them in notifications or use them in custom labels.&lt;/p&gt;
&lt;p&gt;A potential drawback of splitting queries is that each query runs separately. This increases backend load and can affect query performance, especially in environments with a large number of active alerts.&lt;/p&gt;
&lt;p&gt;You can apply this traffic guardrail pattern to any alert rule.&lt;/p&gt;
&lt;h3 id=&#34;consider-sampling&#34;&gt;Consider sampling&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/set-up-for-tracing/instrument-send/set-up-collector/tail-sampling/&#34;&gt;Sampling&lt;/a&gt; is a technique that reduces the number of collected spans to save costs. There are two main strategies, which can be combined:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Head sampling&lt;/strong&gt;: The decision to record or drop a span is made when the trace begins. The condition can be configured probabilistically (a percentage of traces) or by filtering out certain operations.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Tail sampling&lt;/strong&gt;: The decision is made after the trace completes. This allows sampling more interesting operations, such as slow or failing requests.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;With &lt;strong&gt;head sampling&lt;/strong&gt;, alerting on span metrics should be done with caution, since span metrics will represent only a subset of all traces.&lt;/p&gt;
&lt;p&gt;With &lt;strong&gt;tail sampling&lt;/strong&gt;, it’s important to generate span metrics before a sampling decision is made. &lt;a href=&#34;/docs/grafana-cloud/adaptive-telemetry/adaptive-traces/&#34;&gt;Grafana Cloud Adaptive Traces&lt;/a&gt; handles this automatically. With Alloy or the OpenTelemetry Collector, make sure the SpanMetrics connector runs before the filtering or &lt;a href=&#34;/docs/alloy/latest/reference/components/otelcol/otelcol.processor.tail_sampling/&#34;&gt;tail sampling processor&lt;/a&gt;.&lt;/p&gt;
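&lt;p&gt;In Alloy, the ordering can be expressed by fanning traces out to the SpanMetrics connector alongside the tail sampler. The following is a minimal sketch; the component labels and surrounding pipeline are assumptions to adapt to your setup:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-alloy&#34;&gt;otelcol.receiver.otlp &amp;#34;default&amp;#34; {
  grpc {}
  output {
    // Send traces to the spanmetrics connector and the tail sampler in parallel,
    // so span metrics reflect all traces, not only the sampled subset.
    traces = [
      otelcol.connector.spanmetrics.default.input,
      otelcol.processor.tail_sampling.default.input,
    ]
  }
}&lt;/code&gt;&lt;/pre&gt;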
&lt;h2 id=&#34;using-traceql&#34;&gt;Using TraceQL&lt;/h2&gt;
&lt;p&gt;TraceQL is a query language for searching and filtering traces in Grafana Tempo, which uses a syntax similar to &lt;code&gt;PromQL&lt;/code&gt; and &lt;code&gt;LogQL&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;With TraceQL, you can skip converting tracing data into span metrics and query raw trace data directly. It provides more flexible filtering based on the trace structure, attributes, or resource metadata, and it can detect issues faster because it doesn&amp;rsquo;t wait for metric generation.&lt;/p&gt;
&lt;p&gt;TraceQL isn&amp;rsquo;t suitable for all scenarios. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Inadequate for long-term analysis&lt;/strong&gt;
Trace data has a significantly shorter retention period than metrics. For historical monitoring, convert key tracing data into metrics so that important signals persist beyond the trace retention window.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Inadequate for alerting after sampling&lt;/strong&gt;
TraceQL can only query traces that are actually stored in Tempo. If sampling drops a large portion of traces, TraceQL-based alerts may miss real issues. Refer to &lt;a href=&#34;#consider-sampling&#34;&gt;consider sampling&lt;/a&gt; for guidance on how to generate span metrics before sampling.&lt;/p&gt;


&lt;div class=&#34;admonition admonition-caution&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Caution&lt;/p&gt;&lt;p&gt;TraceQL alerting is available in Grafana v12.1 or higher, supported as an &lt;a href=&#34;/docs/release-life-cycle/&#34;&gt;experimental feature&lt;/a&gt;.
Engineering and on-call support isn&amp;rsquo;t available. Documentation is either limited or not provided outside of code comments. No SLA is provided.&lt;/p&gt;
&lt;p&gt;While TraceQL can be powerful for exploring and detecting issues directly from trace data, &lt;strong&gt;alerting with TraceQL shouldn&amp;rsquo;t be used in production environments yet&lt;/strong&gt;. For now, use it only for testing and experimentation.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following example demonstrates how to recreate the previous &lt;strong&gt;alert rule that detected slow span operations&lt;/strong&gt; using TraceQL.&lt;/p&gt;
&lt;p&gt;Follow these steps to create the alert:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Enable TraceQL alerting&lt;/p&gt;
&lt;p&gt;To use TraceQL in alerts, you must enable the &lt;a href=&#34;/docs/grafana/latest/setup-grafana/configure-grafana/#feature_toggles&#34;&gt;&lt;strong&gt;&lt;code&gt;tempoAlerting&lt;/code&gt;&lt;/strong&gt; feature flag in your Grafana configuration&lt;/a&gt;.&lt;/p&gt;
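&lt;p&gt;In self-managed Grafana, this might look like the following in your configuration file (a sketch; use whichever configuration method applies to your deployment):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-ini&#34;&gt;[feature_toggles]
tempoAlerting = true&lt;/code&gt;&lt;/pre&gt;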
&lt;p&gt;If you use Grafana Cloud, contact Support to enable TraceQL alerting.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Configure the alert query&lt;/p&gt;
&lt;p&gt;In your alert rule, select the &lt;strong&gt;Tempo&lt;/strong&gt; data source, then convert the original PromQL query into the equivalent TraceQL query:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;traceql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-traceql&#34;&gt;{status != error &amp;amp;&amp;amp; kind = server &amp;amp;&amp;amp; .service.name = &amp;#34;&amp;lt;SERVICE_NAME&amp;gt;&amp;#34;}
| quantile_over_time(duration, .95) by (name)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;For a given service, this query calculates the &lt;strong&gt;p95 latency&lt;/strong&gt; for all server spans, excluding errors, and groups them by span name.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Configure the time range&lt;/p&gt;
&lt;p&gt;Currently, TraceQL alerting supports only range queries.
To define the time window, set the query time range to &lt;strong&gt;the last 10 minutes&lt;/strong&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;From: &lt;code&gt;now-10m&lt;/code&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;To: &lt;code&gt;now&lt;/code&gt;&lt;/p&gt;
&lt;figure
         class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
         style=&#34;max-width: 750px;&#34;
         itemprop=&#34;associatedMedia&#34;
         itemscope=&#34;&#34;
         itemtype=&#34;http://schema.org/ImageObject&#34;
       &gt;&lt;a
             class=&#34;lightbox-link&#34;
             href=&#34;/media/docs/alerting/traceql-alert-configure-time-range.png&#34;
             itemprop=&#34;contentUrl&#34;
           &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
               class=&#34;lazyload &#34;
               data-src=&#34;/media/docs/alerting/traceql-alert-configure-time-range.png&#34;data-srcset=&#34;/media/docs/alerting/traceql-alert-configure-time-range.png?w=320 320w, /media/docs/alerting/traceql-alert-configure-time-range.png?w=550 550w, /media/docs/alerting/traceql-alert-configure-time-range.png?w=750 750w, /media/docs/alerting/traceql-alert-configure-time-range.png?w=900 900w, /media/docs/alerting/traceql-alert-configure-time-range.png?w=1040 1040w, /media/docs/alerting/traceql-alert-configure-time-range.png?w=1240 1240w, /media/docs/alerting/traceql-alert-configure-time-range.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Time range configuration for TraceQL alert rule&#34;width=&#34;933&#34;height=&#34;579&#34;/&gt;
             &lt;noscript&gt;
               &lt;img
                 src=&#34;/media/docs/alerting/traceql-alert-configure-time-range.png&#34;
                 alt=&#34;Time range configuration for TraceQL alert rule&#34;width=&#34;933&#34;height=&#34;579&#34;/&gt;
             &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add a reducer expression.&lt;/p&gt;
&lt;p&gt;Range queries return time series data, not a single value. The alert rule must then &lt;strong&gt;reduce&lt;/strong&gt; time series data to a single numeric value before comparing it against a threshold.&lt;/p&gt;
&lt;p&gt;Add a &lt;strong&gt;Reduce&lt;/strong&gt; expression to convert the query results into a single value.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set the threshold condition.&lt;/p&gt;
&lt;p&gt;Create a &lt;strong&gt;Threshold&lt;/strong&gt; expression to fire when the p95 latency exceeds 2 seconds: &lt;strong&gt;$B &amp;gt; 2&lt;/strong&gt;.&lt;/p&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link&#34;
           href=&#34;/media/docs/alerting/traceql-alert-configure-threshold.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload &#34;
             data-src=&#34;/media/docs/alerting/traceql-alert-configure-threshold.png&#34;data-srcset=&#34;/media/docs/alerting/traceql-alert-configure-threshold.png?w=320 320w, /media/docs/alerting/traceql-alert-configure-threshold.png?w=550 550w, /media/docs/alerting/traceql-alert-configure-threshold.png?w=750 750w, /media/docs/alerting/traceql-alert-configure-threshold.png?w=900 900w, /media/docs/alerting/traceql-alert-configure-threshold.png?w=1040 1040w, /media/docs/alerting/traceql-alert-configure-threshold.png?w=1240 1240w, /media/docs/alerting/traceql-alert-configure-threshold.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Alert rule configuration showing reducer and threshold expressions for TraceQL query&#34;width=&#34;939&#34;height=&#34;321&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/traceql-alert-configure-threshold.png&#34;
               alt=&#34;Alert rule configuration showing reducer and threshold expressions for TraceQL query&#34;width=&#34;939&#34;height=&#34;321&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;This final alert detects when the p95 latency of the server spans for a particular service (excluding errors) exceeds 2 seconds, using raw trace data instead of span metrics.&lt;/p&gt;
&lt;h2 id=&#34;additional-resources&#34;&gt;Additional resources&lt;/h2&gt;
&lt;p&gt;To explore related topics and expand the examples in this guide, see the following resources:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/introduction/trace-structure/&#34;&gt;Trace structure&lt;/a&gt;: Learn how traces and spans are structured.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/&#34;&gt;Grafana Tempo documentation&lt;/a&gt;: Full reference for Grafana’s open source tracing backend.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/metrics-from-traces/span-metrics/span-metrics-metrics-generator/&#34;&gt;Span metrics using the metrics generator in Tempo&lt;/a&gt;: Generate span metrics directly from traces with Tempo’s built-in metrics generator.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/metrics-from-traces/span-metrics/span-metrics-alloy/&#34;&gt;Span metrics using Grafana Alloy&lt;/a&gt;: Configure Alloy to export span metrics from OpenTelemetry (OTel) traces.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/grafana/latest/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;Multi-dimensional alerts&lt;/a&gt;: Learn how to trigger multiple alert instances per alert rule like in these examples.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/grafana-cloud/alerting-and-irm/slo/&#34;&gt;Grafana SLO documentation&lt;/a&gt;: Use span metrics to define Service Level Objectives (SLOs) in Grafana.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;a href=&#34;/docs/tempo/latest/set-up-for-tracing/instrument-send/set-up-collector/tail-sampling/#sampling&#34;&gt;Trace sampling&lt;/a&gt;: Explore sampling strategies and configuration in Grafana Tempo.&lt;/p&gt;


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;OpenTelemetry instrumentations can record metrics independently of spans.&lt;/p&gt;
&lt;p&gt;These &lt;a href=&#34;https://opentelemetry.io/docs/specs/semconv/general/metrics/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;OTel metrics&lt;/a&gt; are not derived from traces and are not affected by sampling. They can serve as an alternative to span-derived metrics.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="examples-of-trace-based-alerts">Examples of trace-based alerts&lt;/h1>
&lt;p>Metrics are the foundation of most alerting systems. They are usually the first signal that something is wrong, but they don’t always indicate &lt;em>where&lt;/em> or &lt;em>why&lt;/em> a failure occurs.&lt;/p></description></item><item><title>Example of dynamic labels in alert instances</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/dynamic-labels/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/dynamic-labels/</guid><content><![CDATA[&lt;h1 id=&#34;example-of-dynamic-labels-in-alert-instances&#34;&gt;Example of dynamic labels in alert instances&lt;/h1&gt;
&lt;p&gt;Labels are essential for scaling your alerting setup. They define metadata like &lt;code&gt;severity&lt;/code&gt;, &lt;code&gt;team&lt;/code&gt;, &lt;code&gt;category&lt;/code&gt;, or &lt;code&gt;environment&lt;/code&gt;, which you can use for alert routing.&lt;/p&gt;
&lt;p&gt;A label like &lt;code&gt;severity=&amp;quot;critical&amp;quot;&lt;/code&gt; can be set statically in the alert rule configuration, or dynamically based on a query value such as the current free disk space. Dynamic labels &lt;strong&gt;adjust label values at runtime&lt;/strong&gt;, allowing you to reuse the same alert rule across different scenarios.&lt;/p&gt;
&lt;p&gt;This example shows how to define dynamic labels based on query values, along with key behavior to keep in mind when using them.&lt;/p&gt;
&lt;p&gt;First, it&amp;rsquo;s important to understand how Grafana Alerting treats 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/annotation-label/#labels&#34;&gt;labels&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;alert-instances-are-defined-by-labels&#34;&gt;Alert instances are defined by labels&lt;/h2&gt;
&lt;p&gt;Each alert rule creates a separate alert instance for every unique combination of labels.&lt;/p&gt;
&lt;p&gt;This is called 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multi-dimensional alerts&lt;/a&gt;: one rule, many instances—&lt;strong&gt;one per unique label set&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;For example, a rule that queries CPU usage per host might return multiple series (or dimensions):&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, instance=&amp;quot;prod-server-1&amp;quot; }&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, instance=&amp;quot;prod-server-2&amp;quot; }&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, instance=&amp;quot;prod-server-3&amp;quot; }&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Each unique label combination defines a distinct alert instance, with its own evaluation state and potential notifications.&lt;/p&gt;
&lt;p&gt;The full label set of an alert instance can include:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Labels from the query result (e.g., &lt;code&gt;instance&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Auto-generated labels (e.g., &lt;code&gt;alertname&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;User-defined labels from the rule configuration&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;user-defined-labels&#34;&gt;User-defined labels&lt;/h2&gt;
&lt;p&gt;As shown earlier, alert instances automatically include labels from the query result, such as &lt;code&gt;instance&lt;/code&gt; or &lt;code&gt;job&lt;/code&gt;. To add more context or control alert routing, you can define &lt;em&gt;user-defined labels&lt;/em&gt; in the alert rule configuration:&lt;/p&gt;
&lt;figure
    class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
    style=&#34;max-width: 750px;&#34;
    itemprop=&#34;associatedMedia&#34;
    itemscope=&#34;&#34;
    itemtype=&#34;http://schema.org/ImageObject&#34;
  &gt;&lt;a
        class=&#34;lightbox-link&#34;
        href=&#34;/media/docs/alerting/example-dynamic-labels-edit-labels-v3.png&#34;
        itemprop=&#34;contentUrl&#34;
      &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
          class=&#34;lazyload &#34;
          data-src=&#34;/media/docs/alerting/example-dynamic-labels-edit-labels-v3.png&#34;data-srcset=&#34;/media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=320 320w, /media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=550 550w, /media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=750 750w, /media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=900 900w, /media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=1040 1040w, /media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=1240 1240w, /media/docs/alerting/example-dynamic-labels-edit-labels-v3.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Edit labels UI in the alert rule configuration.&#34;width=&#34;933&#34;height=&#34;380&#34;/&gt;
        &lt;noscript&gt;
          &lt;img
            src=&#34;/media/docs/alerting/example-dynamic-labels-edit-labels-v3.png&#34;
            alt=&#34;Edit labels UI in the alert rule configuration.&#34;width=&#34;933&#34;height=&#34;380&#34;/&gt;
        &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;p&gt;User-defined labels can be either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Fixed labels&lt;/strong&gt;: These have the same value for every alert instance. They are often used to include common metadata, such as team ownership.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Templated labels&lt;/strong&gt;: These calculate their values based on the query result at evaluation time.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;templated-labels&#34;&gt;Templated labels&lt;/h2&gt;
&lt;p&gt;Templated labels evaluate their values dynamically, based on the query result. This allows the label value to vary per alert instance.&lt;/p&gt;
&lt;p&gt;Use templated labels to inject additional context into alerts. To learn about syntax and use cases, refer to 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/alerting-rules/templates/&#34;&gt;Template annotations and labels&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;You can define templated labels that produce either:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;A fixed value per alert instance.&lt;/li&gt;
&lt;li&gt;A dynamic value per alert instance that changes based on the last query result.&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;fixed-values-per-alert-instance&#34;&gt;Fixed values per alert instance&lt;/h3&gt;
&lt;p&gt;You can use a known label value to enrich the alert with additional metadata not present in existing labels. For example, you can map the &lt;code&gt;instance&lt;/code&gt; label to an &lt;code&gt;env&lt;/code&gt; label that represents the deployment environment:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;Go&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-go&#34;&gt;{{- if eq $labels.instance &amp;#34;prod-server-1&amp;#34; -}}production
{{- else if eq $labels.instance &amp;#34;stag-server-1&amp;#34; -}}staging
{{- else -}}development
{{- end -}}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This produces alert instances like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, instance=&amp;quot;prod-server-1&amp;quot;, env=&amp;quot;production&amp;quot;}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, instance=&amp;quot;stag-server-1&amp;quot;, env=&amp;quot;staging&amp;quot;}&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In this example, the &lt;code&gt;env&lt;/code&gt; label is fixed for each alert instance and does not change during its lifecycle.&lt;/p&gt;
&lt;h3 id=&#34;dynamic-values-per-alert-instance&#34;&gt;Dynamic values per alert instance&lt;/h3&gt;
&lt;p&gt;You can define a label whose value depends on the numeric result of a query—mapping it to a predefined set of options. This is useful for representing &lt;code&gt;severity&lt;/code&gt; levels within a single alert rule.&lt;/p&gt;
&lt;p&gt;Instead of defining three separate rules like:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;CPU ≥ 90&lt;/em&gt; → &lt;code&gt;severity=critical&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;CPU ≥ 80&lt;/em&gt; → &lt;code&gt;severity=warning&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;CPU ≥ 70&lt;/em&gt; → &lt;code&gt;severity=minor&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can define a single rule and assign &lt;code&gt;severity&lt;/code&gt; dynamically using a template:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;Go&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-go&#34;&gt;{{/* $values.B.Value refers to the numeric result from query B */}}
{{- if gt $values.B.Value 90.0 -}}critical
{{- else if gt $values.B.Value 80.0 -}}warning
{{- else if gt $values.B.Value 70.0 -}}minor
{{- else -}}none
{{- end -}}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This pattern lets you express multiple alerting scenarios in a single rule, while still routing based on the &lt;code&gt;severity&lt;/code&gt; label value.&lt;/p&gt;
&lt;h2 id=&#34;example-overview&#34;&gt;Example overview&lt;/h2&gt;
&lt;p&gt;In the previous severity template, you can set the alert condition to &lt;code&gt;$B &amp;gt; 70&lt;/code&gt; to prevent firing when &lt;code&gt;severity=none&lt;/code&gt;, and then use the &lt;code&gt;severity&lt;/code&gt; label to route distinct alert instances to different contact points.&lt;/p&gt;
&lt;p&gt;For example, configure a 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/notifications/notification-policies/&#34;&gt;notification policy&lt;/a&gt; that matches &lt;code&gt;alertname=&amp;quot;ServerHighCPU&amp;quot;&lt;/code&gt; with the following child policies:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;severity=critical&lt;/code&gt; → escalate to an incident response and management solution (IRM).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;severity=warning&lt;/code&gt; → send to the team&amp;rsquo;s Slack channel.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;severity=minor&lt;/code&gt; → send to a non-urgent queue or log-only dashboard.&lt;/li&gt;
&lt;/ul&gt;
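&lt;p&gt;If you provision notification policies from files, the routing above might be sketched as follows. The receiver names are placeholders for your own contact points:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;apiVersion: 1
policies:
  - orgId: 1
    receiver: default-contact
    routes:
      - object_matchers:
          - [&amp;#34;alertname&amp;#34;, &amp;#34;=&amp;#34;, &amp;#34;ServerHighCPU&amp;#34;]
        routes:
          - receiver: irm-escalation
            object_matchers:
              - [&amp;#34;severity&amp;#34;, &amp;#34;=&amp;#34;, &amp;#34;critical&amp;#34;]
          - receiver: team-slack
            object_matchers:
              - [&amp;#34;severity&amp;#34;, &amp;#34;=&amp;#34;, &amp;#34;warning&amp;#34;]
          - receiver: non-urgent-queue
            object_matchers:
              - [&amp;#34;severity&amp;#34;, &amp;#34;=&amp;#34;, &amp;#34;minor&amp;#34;]&lt;/code&gt;&lt;/pre&gt;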
&lt;p&gt;The resulting alerting flow might look like this:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Time&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;$B query&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Alert instance&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Routed to&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t1&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;65&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, severity=&amp;quot;none&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;Not firing&lt;/code&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t2&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;75&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, severity=&amp;quot;minor&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Non-urgent queue&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t3&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, severity=&amp;quot;warning&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Team Slack channel&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t4&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;95&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{alertname=&amp;quot;ServerHighCPU&amp;quot;, severity=&amp;quot;critical&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;IRM escalation chain&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;This alerting setup allows you to:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Use a single rule for multiple severity levels.&lt;/li&gt;
&lt;li&gt;Route alerts dynamically using the label value.&lt;/li&gt;
&lt;li&gt;Simplify alert rule maintenance and avoid duplication.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;However, dynamic labels can introduce unexpected behavior when label values change. The next section explains this.&lt;/p&gt;
&lt;h2 id=&#34;caveat-a-label-change-affects-a-distinct-alert-instance&#34;&gt;Caveat: a label change affects a distinct alert instance&lt;/h2&gt;
&lt;p&gt;Remember: &lt;strong&gt;alert instances are defined by their labels&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;If a dynamic label changes between evaluations, the new value defines a separate alert instance.&lt;/p&gt;
&lt;p&gt;Here&amp;rsquo;s what happens if &lt;code&gt;severity&lt;/code&gt; changes from &lt;code&gt;minor&lt;/code&gt; to &lt;code&gt;warning&lt;/code&gt;:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;The instance with &lt;code&gt;severity=&amp;quot;minor&amp;quot;&lt;/code&gt; disappears → it becomes a missing series.&lt;/li&gt;
&lt;li&gt;A new instance with &lt;code&gt;severity=&amp;quot;warning&amp;quot;&lt;/code&gt; appears → it starts from scratch.&lt;/li&gt;
&lt;li&gt;After two evaluations without data, the &lt;code&gt;minor&lt;/code&gt; instance is &lt;strong&gt;resolved and evicted&lt;/strong&gt;.&lt;/li&gt;
&lt;/ol&gt;
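&lt;p&gt;The identity rule behind these steps can be sketched in Python. This is an illustration of the behavior described above, not Grafana internals: instances are keyed by their full label set, so a changed label value creates a new key instead of updating the old one.&lt;/p&gt;

```python
# Illustrative sketch: an alert instance is identified by its full label set,
# so a changed label value produces a *new* instance rather than updating
# the old one. The old instance stops receiving data and eventually resolves.
instances = {}

def evaluate(labels: dict) -> tuple:
    key = tuple(sorted(labels.items()))  # the label set *is* the identity
    instances[key] = "firing"
    return key

k1 = evaluate({"alertname": "ServerHighCPU", "severity": "minor"})
k2 = evaluate({"alertname": "ServerHighCPU", "severity": "warning"})

# Two distinct instances now exist; the "minor" one is no longer updated
# and is later resolved and evicted as a missing series.
print(len(instances))  # 2
```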
&lt;p&gt;Here’s a sequence example:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Time&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Query value&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Instance &lt;code&gt;severity=&amp;quot;none&amp;quot;&lt;/code&gt;&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Instance &lt;code&gt;severity=&amp;quot;minor&amp;quot;&lt;/code&gt;&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Instance &lt;code&gt;severity=&amp;quot;warning&amp;quot;&lt;/code&gt;&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t0&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t1&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;75&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;🔴 📩&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t2&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;⚠️ MissingSeries&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;🔴 📩&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t3&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;85&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;⚠️ MissingSeries&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;🔴&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t4&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;50&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;🟢&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;📩 Resolved and evicted&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;⚠️ MissingSeries&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t5&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;50&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;🟢&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;⚠️ MissingSeries&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;t6&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;50&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;🟢&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;📩 Resolved and evicted&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;Learn more about this behavior in 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/&#34;&gt;Stale alert instances&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;In this example, the &lt;code&gt;minor&lt;/code&gt; and &lt;code&gt;warning&lt;/code&gt; alerts likely represent the same underlying issue, but Grafana treats them as distinct alert instances. As a result, this scenario generates two firing notifications and two resolved notifications, one for each instance.&lt;/p&gt;
&lt;p&gt;This behavior is important to keep in mind when dynamic label values change frequently.&lt;/p&gt;
&lt;p&gt;It can cause alert instances to fire and resolve in short intervals, resulting in &lt;strong&gt;noisy and confusing notifications&lt;/strong&gt;.&lt;/p&gt;
&lt;h2 id=&#34;try-it-with-testdata&#34;&gt;Try it with TestData&lt;/h2&gt;
&lt;p&gt;You can replicate this scenario using the 
    &lt;a href=&#34;/docs/grafana/v12.4/datasources/testdata/&#34;&gt;TestData data source&lt;/a&gt; to simulate an unstable signal—like monitoring a noisy sensor.&lt;/p&gt;
&lt;p&gt;This setup reproduces label flapping and shows how dynamic label values affect alert instance behavior.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Add the &lt;strong&gt;TestData&lt;/strong&gt; data source through the &lt;strong&gt;Connections&lt;/strong&gt; menu.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create an alert rule.&lt;/p&gt;
&lt;p&gt;Navigate to &lt;strong&gt;Alerting&lt;/strong&gt; → &lt;strong&gt;Alert rules&lt;/strong&gt; and click &lt;strong&gt;New alert rule&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Simulate a query (&lt;code&gt;$A&lt;/code&gt;) that returns a noisy signal.&lt;/p&gt;
&lt;p&gt;Select &lt;strong&gt;TestData&lt;/strong&gt; as the data source and configure the scenario.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Scenario: Random Walk&lt;/li&gt;
&lt;li&gt;Series count: 1&lt;/li&gt;
&lt;li&gt;Start value: 51&lt;/li&gt;
&lt;li&gt;Min: 50, Max: 100&lt;/li&gt;
&lt;li&gt;Spread: 100 (ensures large changes between consecutive data points)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add an expression.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Type: Reduce&lt;/li&gt;
&lt;li&gt;Input: A&lt;/li&gt;
&lt;li&gt;Function: Last (to get the most recent value)&lt;/li&gt;
&lt;li&gt;Name: B&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Define the alert condition.&lt;/p&gt;
&lt;p&gt;Use a threshold like &lt;code&gt;$B &amp;gt;= 50&lt;/code&gt; (it always fires).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Click &lt;strong&gt;Edit Labels&lt;/strong&gt; to add a dynamic label.&lt;/p&gt;
&lt;p&gt;Create a new label &lt;code&gt;severity&lt;/code&gt; and set its value to the following:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;Go&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-go&#34;&gt;{{/* $values.B.Value refers to the numeric result from query B */}}
{{- if gt $values.B.Value 90.0 -}}P1
{{- else if gt $values.B.Value 80.0 -}}P2
{{- else if gt $values.B.Value 70.0 -}}P3
{{- else if gt $values.B.Value 60.0 -}}P4
{{- else if gt $values.B.Value 50.0 -}}P5
{{- else -}}none
{{- end -}}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Set evaluation behavior.&lt;/p&gt;
&lt;p&gt;Set a short evaluation interval (e.g., &lt;code&gt;10s&lt;/code&gt;) to quickly observe label flapping and alert instance transitions in the history.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Preview alert routing to verify the label template.&lt;/p&gt;
&lt;p&gt;In &lt;strong&gt;Configure notifications&lt;/strong&gt;, toggle &lt;strong&gt;Advanced options&lt;/strong&gt;.&lt;br /&gt;
Click &lt;strong&gt;Preview routing&lt;/strong&gt; and check the value of the &lt;code&gt;severity&lt;/code&gt; label:&lt;/p&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link captioned&#34;
           href=&#34;/media/docs/alerting/example-dynamic-labels-preview-label.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload mb-0&#34;
             data-src=&#34;/media/docs/alerting/example-dynamic-labels-preview-label.png&#34;data-srcset=&#34;/media/docs/alerting/example-dynamic-labels-preview-label.png?w=320 320w, /media/docs/alerting/example-dynamic-labels-preview-label.png?w=550 550w, /media/docs/alerting/example-dynamic-labels-preview-label.png?w=750 750w, /media/docs/alerting/example-dynamic-labels-preview-label.png?w=900 900w, /media/docs/alerting/example-dynamic-labels-preview-label.png?w=1040 1040w, /media/docs/alerting/example-dynamic-labels-preview-label.png?w=1240 1240w, /media/docs/alerting/example-dynamic-labels-preview-label.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Preview routing multiple times to verify how label values change over time.&#34;width=&#34;1007&#34;height=&#34;298&#34;title=&#34;Preview routing multiple times to verify how label values change over time.&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/example-dynamic-labels-preview-label.png&#34;
               alt=&#34;Preview routing multiple times to verify how label values change over time.&#34;width=&#34;1007&#34;height=&#34;298&#34;title=&#34;Preview routing multiple times to verify how label values change over time.&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;figcaption class=&#34;w-100p caption text-gray-13  &#34;&gt;Preview routing multiple times to verify how label values change over time.&lt;/figcaption&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Observe alert state changes.&lt;/p&gt;
&lt;p&gt;Click &lt;strong&gt;Save rule and exit&lt;/strong&gt;, and open the 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/monitor-status/view-alert-state-history/&#34;&gt;alert history view&lt;/a&gt; to see how changes in &lt;code&gt;severity&lt;/code&gt; affect the state of distinct alert instances.&lt;/p&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link captioned&#34;
           href=&#34;/media/docs/alerting/example-dynamic-labels-alert-history-page.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload mb-0&#34;
             data-src=&#34;/media/docs/alerting/example-dynamic-labels-alert-history-page.png&#34;data-srcset=&#34;/media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=320 320w, /media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=550 550w, /media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=750 750w, /media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=900 900w, /media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=1040 1040w, /media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=1240 1240w, /media/docs/alerting/example-dynamic-labels-alert-history-page.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;You can find multiple transitions over time as the label value fluctuates.&#34;width=&#34;810&#34;height=&#34;419&#34;title=&#34;You can find multiple transitions over time as the label value fluctuates.&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/example-dynamic-labels-alert-history-page.png&#34;
               alt=&#34;You can find multiple transitions over time as the label value fluctuates.&#34;width=&#34;810&#34;height=&#34;419&#34;title=&#34;You can find multiple transitions over time as the label value fluctuates.&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;figcaption class=&#34;w-100p caption text-gray-13  &#34;&gt;You can find multiple transitions over time as the label value fluctuates.&lt;/figcaption&gt;&lt;/a&gt;&lt;/figure&gt;


&lt;div class=&#34;admonition admonition-tip&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Tip&lt;/p&gt;&lt;p&gt;You can explore this &lt;strong&gt;&lt;a href=&#34;https://play.grafana.org/alerting/grafana/dynamic-label/view?tech=docs&amp;amp;pg=alerting-examples&amp;amp;plcmt=callout-tip&amp;amp;cta=alert-dynamic-labels&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;alerting example in Grafana Play&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Open the example to view alert evaluation results, generated alert instances, the alert history timeline, and alert rule details.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;/li&gt;
&lt;/ol&gt;
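&lt;p&gt;The label template in step 6 maps the most recent query value to a severity band. The same banding logic, sketched in Python for clarity (the thresholds mirror the template; this is an illustration, not how Grafana evaluates templates):&lt;/p&gt;

```python
# Illustrative sketch of the severity banding used by the label template.
# Narrow 10-point bands on a noisy signal make the severity label change
# on almost every evaluation, which is exactly what this example provokes.
def severity(value: float) -> str:
    if value > 90: return "P1"
    if value > 80: return "P2"
    if value > 70: return "P3"
    if value > 60: return "P4"
    if value > 50: return "P5"
    return "none"

print(severity(85))  # P2
print(severity(42))  # none
```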
&lt;h2 id=&#34;considerations&#34;&gt;Considerations&lt;/h2&gt;
&lt;p&gt;Dynamic labels let you reuse a single alert rule across multiple escalation scenarios—but they also introduce complexity. When the label value depends on a noisy metric and changes frequently, it can lead to flapping alert instances and excessive notifications.&lt;/p&gt;
&lt;p&gt;These alerts often require tuning to stay reliable and benefit from continuous review. To get the most out of this pattern, consider the following:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Tune evaluation settings and queries for stability&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Increase the 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rule-evaluation/&#34;&gt;evaluation interval and pending period&lt;/a&gt; to reduce the frequency of state changes. Additionally, consider smoothing metrics with functions like &lt;code&gt;avg_over_time&lt;/code&gt; to reduce flapping.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use wider threshold bands&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Define broader ranges in your label template logic to prevent label switching caused by small value changes.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Disable resolved notifications&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;When labels change frequently and alerts resolve quickly, you can reduce the number of notifications by disabling resolved notifications at the contact point.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Disable the Missing series evaluations setting&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/&#34;&gt;Missing series evaluations setting&lt;/a&gt; (default: 2) defines how many intervals without data are allowed before resolving an instance. Consider disabling it if it&amp;rsquo;s unnecessary for your use case, as it can complicate alert troubleshooting.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preserve context across related alerts&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Ensure alert metadata includes enough information to help correlate related alerts during investigation.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Use separate alert rules and static labels when simpler&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;In some cases, defining separate rules with static labels may be easier to manage than one complex dynamic rule. This also allows you to customize alert queries for each specific case.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;learn-more&#34;&gt;Learn more&lt;/h2&gt;
&lt;p&gt;Here&amp;rsquo;s a list of additional resources related to this example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;Multi-dimensional alerting example&lt;/a&gt; – Explore how Grafana creates separate alert instances for each unique set of labels.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/annotation-label/#labels&#34;&gt;Labels&lt;/a&gt; – Learn about the different types of labels and how they define alert instances.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/alerting-rules/templates/&#34;&gt;Template labels in alert rules&lt;/a&gt; – Use templating to set label values dynamically based on query results.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rule-evaluation/stale-alert-instances/&#34;&gt;Stale alert instances&lt;/a&gt; – Understand how Grafana resolves and removes stale alert instances.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/missing-data/&#34;&gt;Handle missing data&lt;/a&gt; – Learn how Grafana distinguishes between missing series and &lt;code&gt;NoData&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/notifications/notification-policies/&#34;&gt;Notification policies and routing&lt;/a&gt; – Create multiple notification policies to route alerts based on label values like &lt;code&gt;severity&lt;/code&gt; or &lt;code&gt;team&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://play.grafana.org/alerting/grafana/dynamic-label/view?tech=docs&amp;amp;pg=alerting-examples&amp;amp;plcmt=learn-more&amp;amp;cta=alert-dynamic-labels&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Dynamic label example in Grafana Play&lt;/a&gt; - View this example in Grafana Play to explore alert instances and state transitions with dynamic labels.&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="example-of-dynamic-labels-in-alert-instances">Example of dynamic labels in alert instances&lt;/h1>
&lt;p>Labels are essential for scaling your alerting setup. They define metadata like &lt;code>severity&lt;/code>, &lt;code>team&lt;/code>, &lt;code>category&lt;/code>, or &lt;code>environment&lt;/code>, which you can use for alert routing.&lt;/p></description></item><item><title>Example of dynamic thresholds per dimension</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/dynamic-thresholds/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/dynamic-thresholds/</guid><content><![CDATA[&lt;h1 id=&#34;example-of-dynamic-thresholds-per-dimension&#34;&gt;Example of dynamic thresholds per dimension&lt;/h1&gt;
&lt;p&gt;In Grafana Alerting, each alert rule supports only one condition expression.&lt;/p&gt;
&lt;p&gt;That&amp;rsquo;s enough in many cases—most alerts use a fixed numeric threshold like &lt;code&gt;latency &amp;gt; 3s&lt;/code&gt; or &lt;code&gt;error_rate &amp;gt; 5%&lt;/code&gt; to determine their state.&lt;/p&gt;
&lt;p&gt;As your alerting setup grows, you may find that different targets require different threshold values.&lt;/p&gt;
&lt;p&gt;Instead of duplicating alert rules, you can assign a &lt;strong&gt;different threshold value to each target&lt;/strong&gt;—while keeping the same condition. This simplifies alert maintenance.&lt;/p&gt;
&lt;p&gt;This example shows how to do that using 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multi-dimensional alerts&lt;/a&gt; and a 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/queries-conditions/#math&#34;&gt;Math expression&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;example-overview&#34;&gt;Example overview&lt;/h2&gt;
&lt;p&gt;You&amp;rsquo;re monitoring latency across multiple API services. Initially, you want to get alerted if the 95th percentile latency (&lt;code&gt;p95_api_latency&lt;/code&gt;) exceeds 3 seconds, so your alert rule uses a single static threshold:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;p95_api_latency &amp;gt; 3&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;But the team quickly finds that some services require stricter thresholds. For example, latency for payment APIs should stay under 1.5s, while background jobs can tolerate up to 5s. The team establishes different thresholds per service:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;p95_api_latency{service=&amp;quot;checkout-api&amp;quot;}&lt;/code&gt;: must stay under &lt;code&gt;1.5s&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;p95_api_latency{service=&amp;quot;auth-api&amp;quot;}&lt;/code&gt;: also strict, &lt;code&gt;1.5s&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;p95_api_latency{service=&amp;quot;catalog-api&amp;quot;}&lt;/code&gt;: less critical, &lt;code&gt;3s&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;p95_api_latency{service=&amp;quot;async-tasks&amp;quot;}&lt;/code&gt;: background jobs can tolerate up to &lt;code&gt;5s&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You want to avoid creating one alert rule per service—this is harder to maintain.&lt;/p&gt;
&lt;p&gt;In Grafana Alerting, you can define one alert rule that monitors multiple similar components, as in this scenario. This is called 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multi-dimensional alerts&lt;/a&gt;: one alert rule, many alert instances—&lt;strong&gt;one per unique label set&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;But there&amp;rsquo;s an issue: Grafana supports only &lt;strong&gt;one alert condition per rule&lt;/strong&gt;.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;One alert rule
├─ One condition (e.g., $A &amp;gt; 3)
│  └─ Applies to all returned series in $A
│     ├─ {service=&amp;#34;checkout-api&amp;#34;}
│     ├─ {service=&amp;#34;auth-api&amp;#34;}
│     ├─ {service=&amp;#34;catalog-api&amp;#34;}
│     └─ {service=&amp;#34;async-tasks&amp;#34;}&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;To evaluate per-service thresholds, you need a distinct threshold value for each returned series.&lt;/p&gt;
&lt;h2 id=&#34;dynamic-thresholds-using-a-math-expression&#34;&gt;Dynamic thresholds using a Math expression&lt;/h2&gt;
&lt;p&gt;You can create a dynamic alert condition by operating on two queries with a 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/queries-conditions/#math&#34;&gt;Math expression&lt;/a&gt;.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;$A&lt;/code&gt; for query results (e.g., &lt;code&gt;p95_api_latency&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$B&lt;/code&gt; for per-service thresholds (from CSV data or another query).&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$A &amp;gt; $B&lt;/code&gt; is the &lt;em&gt;Math&lt;/em&gt; expression that defines the alert condition.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Grafana evaluates the &lt;em&gt;Math&lt;/em&gt; expression &lt;strong&gt;per series&lt;/strong&gt;, by joining series from &lt;code&gt;$A&lt;/code&gt; and &lt;code&gt;$B&lt;/code&gt; based on their shared labels before applying the expression.&lt;/p&gt;
&lt;p&gt;Here’s an example of an arithmetic operation:&lt;/p&gt;


&lt;div data-shared=&#34;alerts/math-example.md&#34;&gt;
            &lt;ul&gt;
&lt;li&gt;&lt;code&gt;$A&lt;/code&gt; returns series &lt;code&gt;{host=&amp;quot;web01&amp;quot;} 30&lt;/code&gt; and &lt;code&gt;{host=&amp;quot;web02&amp;quot;} 20&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$B&lt;/code&gt; returns series &lt;code&gt;{host=&amp;quot;web01&amp;quot;} 10&lt;/code&gt; and &lt;code&gt;{host=&amp;quot;web02&amp;quot;} 0&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$A &#43; $B&lt;/code&gt; returns &lt;code&gt;{host=&amp;quot;web01&amp;quot;} 40&lt;/code&gt; and &lt;code&gt;{host=&amp;quot;web02&amp;quot;} 20&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
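&lt;p&gt;The join-then-operate behavior above can be sketched in Python. This is an illustration of the matching rule, not Grafana&amp;rsquo;s implementation: series from both queries are paired by their shared labels, and the operation is applied pairwise.&lt;/p&gt;

```python
# Illustrative sketch: a Math expression joins series from two queries by
# their label sets, then applies the operation to each matched pair.
# Series with no match in the other query are dropped from the union.
def math_op(a: dict, b: dict, op):
    # a and b map a frozenset of (label, value) pairs -> numeric value
    return {labels: op(a[labels], b[labels]) for labels in a.keys() & b.keys()}

A = {frozenset({("host", "web01")}): 30, frozenset({("host", "web02")}): 20}
B = {frozenset({("host", "web01")}): 10, frozenset({("host", "web02")}): 0}

result = math_op(A, B, lambda x, y: x + y)  # $A + $B
print(result[frozenset({("host", "web01")})])  # 40
print(result[frozenset({("host", "web02")})])  # 20
```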

        
&lt;p&gt;In practice, you must align your threshold input with the label sets returned by your alert query.&lt;/p&gt;
&lt;p&gt;The following table illustrates how a per-service threshold is evaluated in the previous example:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;$A: p95 latency query&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;$B: threshold value&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;$C: $A&amp;gt;$B&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;State&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;checkout-api&amp;quot;} 3&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;checkout-api&amp;quot;} 1.5&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;checkout-api&amp;quot;} 1&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Firing&lt;/strong&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;auth-api&amp;quot;} 1&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;auth-api&amp;quot;} 1.5&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;auth-api&amp;quot;} 0&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Normal&lt;/strong&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;catalog-api&amp;quot;} 2&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;catalog-api&amp;quot;} 3&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;catalog-api&amp;quot;} 0&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Normal&lt;/strong&gt;&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;async-tasks&amp;quot;} 3&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;async-tasks&amp;quot;} 5&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{service=&amp;quot;async-tasks&amp;quot;} 0&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Normal&lt;/strong&gt;&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;In this example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;$A&lt;/code&gt; comes from the &lt;code&gt;p95_api_latency&lt;/code&gt; query.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$B&lt;/code&gt; is manually defined with a threshold value for each series in &lt;code&gt;$A&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;The alert condition compares &lt;code&gt;$A&amp;gt;$B&lt;/code&gt; using a &lt;em&gt;Math&lt;/em&gt; relational operator (e.g., &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;&amp;gt;=&lt;/code&gt;, &lt;code&gt;&amp;lt;=&lt;/code&gt;, &lt;code&gt;==&lt;/code&gt;, &lt;code&gt;!=&lt;/code&gt;) that joins series by matching labels.&lt;/li&gt;
&lt;li&gt;Grafana evaluates the alert condition and sets the firing state where the condition is true.&lt;/li&gt;
&lt;/ul&gt;
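&lt;p&gt;A minimal Python sketch of how the per-service condition resolves. The latency and threshold values mirror the example above; the result is illustrative only, not Grafana&amp;rsquo;s evaluation engine.&lt;/p&gt;

```python
# Illustrative sketch: evaluate $A > $B per series. A result of 1 means the
# alert condition is met (Firing); 0 means it is not (Normal).
latency   = {"checkout-api": 3, "auth-api": 1, "catalog-api": 2, "async-tasks": 3}
threshold = {"checkout-api": 1.5, "auth-api": 1.5, "catalog-api": 3, "async-tasks": 5}

condition = {svc: int(latency[svc] > threshold[svc]) for svc in latency}
state = {svc: ("Firing" if met else "Normal") for svc, met in condition.items()}

print(state["checkout-api"])  # Firing
print(state["auth-api"])      # Normal
```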
&lt;p&gt;The &lt;em&gt;Math&lt;/em&gt; expression works as long as each series in &lt;code&gt;$A&lt;/code&gt; can be matched with exactly one series in &lt;code&gt;$B&lt;/code&gt;; the label sets must align to produce a one-to-one match between the two queries.&lt;/p&gt;


&lt;div class=&#34;admonition admonition-caution&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Caution&lt;/p&gt;&lt;p&gt;If a series in one query doesn’t match any series in the other, it’s excluded from the result and a warning message is displayed:&lt;/p&gt;
&lt;p&gt;&lt;em&gt;1 items &lt;strong&gt;dropped from union(s)&lt;/strong&gt;: [&amp;quot;$A &amp;gt; $B&amp;quot;: ($B: {service=payment-api})]&lt;/em&gt;&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Labels in both series don’t need to be identical&lt;/strong&gt;. If one label set is a subset of the other, the series can still join. For example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;$A&lt;/code&gt; returns series &lt;code&gt;{host=&amp;quot;web01&amp;quot;, job=&amp;quot;event&amp;quot;}&lt;/code&gt; 30 and &lt;code&gt;{host=&amp;quot;web02&amp;quot;, job=&amp;quot;event&amp;quot;}&lt;/code&gt; 20.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$B&lt;/code&gt; returns series &lt;code&gt;{host=&amp;quot;web01&amp;quot;}&lt;/code&gt; 10 and &lt;code&gt;{host=&amp;quot;web02&amp;quot;}&lt;/code&gt; 0.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;$A&lt;/code&gt; &#43; &lt;code&gt;$B&lt;/code&gt; returns &lt;code&gt;{host=&amp;quot;web01&amp;quot;, job=&amp;quot;event&amp;quot;}&lt;/code&gt; 40 and &lt;code&gt;{host=&amp;quot;web02&amp;quot;, job=&amp;quot;event&amp;quot;}&lt;/code&gt; 20.&lt;/li&gt;
&lt;/ul&gt;
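&lt;p&gt;The same join applies to the alert condition. As an illustration using the series above, the &lt;em&gt;Math&lt;/em&gt; expression &lt;code&gt;$A &amp;gt; $B&lt;/code&gt; compares each matched pair and returns &lt;code&gt;1&lt;/code&gt; (true) or &lt;code&gt;0&lt;/code&gt; (false) per series:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-none&#34;&gt;$A &amp;gt; $B
{host=&amp;#34;web01&amp;#34;, job=&amp;#34;event&amp;#34;} 1   (30 &amp;gt; 10)
{host=&amp;#34;web02&amp;#34;, job=&amp;#34;event&amp;#34;} 1   (20 &amp;gt; 0)&lt;/code&gt;&lt;/pre&gt;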
&lt;h2 id=&#34;try-it-with-testdata&#34;&gt;Try it with TestData&lt;/h2&gt;
&lt;p&gt;You can use the 
    &lt;a href=&#34;/docs/grafana/v12.4/datasources/testdata/&#34;&gt;TestData data source&lt;/a&gt; to replicate this example:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Add the &lt;strong&gt;TestData&lt;/strong&gt; data source through the &lt;strong&gt;Connections&lt;/strong&gt; menu.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Create an alert rule.&lt;/p&gt;
&lt;p&gt;Navigate to &lt;strong&gt;Alerting&lt;/strong&gt; → &lt;strong&gt;Alert rules&lt;/strong&gt; and click &lt;strong&gt;New alert rule&lt;/strong&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Simulate a query (&lt;code&gt;$A&lt;/code&gt;) that returns latencies for each service.&lt;/p&gt;
&lt;p&gt;Select &lt;strong&gt;TestData&lt;/strong&gt; as the data source and configure the scenario.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Scenario: Random Walk&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Alias: latency&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Labels: service=api-$seriesIndex&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Series count: 4&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Start value: 1&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Min: 1, Max: 4&lt;/p&gt;
&lt;p&gt;This uses &lt;code&gt;$seriesIndex&lt;/code&gt; to assign unique service labels: &lt;code&gt;api-0&lt;/code&gt;, &lt;code&gt;api-1&lt;/code&gt;, etc.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link&#34;
           href=&#34;/media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload &#34;
             data-src=&#34;/media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png&#34;data-srcset=&#34;/media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=320 320w, /media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=550 550w, /media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=750 750w, /media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=900 900w, /media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=1040 1040w, /media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=1240 1240w, /media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;TestData data source returns 4 series to simulate latencies for distinct API services.&#34;width=&#34;2330&#34;height=&#34;1348&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/example-dynamic-thresholds-latency-series-v2.png&#34;
               alt=&#34;TestData data source returns 4 series to simulate latencies for distinct API services.&#34;width=&#34;2330&#34;height=&#34;1348&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;/a&gt;&lt;/figure&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Define per-service thresholds with static data.&lt;/p&gt;
&lt;p&gt;Add a new query (&lt;code&gt;$B&lt;/code&gt;) and select &lt;strong&gt;TestData&lt;/strong&gt; as the data source.&lt;/p&gt;
&lt;p&gt;From &lt;strong&gt;Scenario&lt;/strong&gt;, select &lt;strong&gt;CSV Content&lt;/strong&gt; and paste this CSV:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt; service,value
 api-0,1.5
 api-1,1.5
 api-2,3
 api-3,5&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;service&lt;/code&gt; column must match the labels from &lt;code&gt;$A&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;value&lt;/code&gt; column is a numeric value used for the alert comparison.&lt;/p&gt;
&lt;p&gt;For details on CSV format requirements, see 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/table-data/&#34;&gt;table data examples&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add a new &lt;strong&gt;Reduce&lt;/strong&gt; expression (&lt;code&gt;$C&lt;/code&gt;).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Type: Reduce&lt;/li&gt;
&lt;li&gt;Input: A&lt;/li&gt;
&lt;li&gt;Function: Mean&lt;/li&gt;
&lt;li&gt;Name: C&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This calculates the average latency for each service: &lt;code&gt;api-0&lt;/code&gt;, &lt;code&gt;api-1&lt;/code&gt;, etc.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Add a new &lt;strong&gt;Math&lt;/strong&gt; expression.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Type: Math&lt;/li&gt;
&lt;li&gt;Expression: &lt;code&gt;$C &amp;gt; $B&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;Set this expression as the &lt;strong&gt;alert condition&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This fires if the average latency (&lt;code&gt;$C&lt;/code&gt;) exceeds the threshold from &lt;code&gt;$B&lt;/code&gt; for any service.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Preview&lt;/strong&gt; the alert.&lt;/p&gt;
&lt;figure
       class=&#34;figure-wrapper figure-wrapper__lightbox w-100p &#34;
       style=&#34;max-width: 750px;&#34;
       itemprop=&#34;associatedMedia&#34;
       itemscope=&#34;&#34;
       itemtype=&#34;http://schema.org/ImageObject&#34;
     &gt;&lt;a
           class=&#34;lightbox-link captioned&#34;
           href=&#34;/media/docs/alerting/example-dynamic-thresholds-preview-v3.png&#34;
           itemprop=&#34;contentUrl&#34;
         &gt;&lt;div class=&#34;img-wrapper w-100p h-auto&#34;&gt;&lt;img
             class=&#34;lazyload mb-0&#34;
             data-src=&#34;/media/docs/alerting/example-dynamic-thresholds-preview-v3.png&#34;data-srcset=&#34;/media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=320 320w, /media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=550 550w, /media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=750 750w, /media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=900 900w, /media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=1040 1040w, /media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=1240 1240w, /media/docs/alerting/example-dynamic-thresholds-preview-v3.png?w=1920 1920w&#34;data-sizes=&#34;auto&#34;alt=&#34;Alert preview evaluating multiple series with distinct threshold values&#34;width=&#34;1175&#34;height=&#34;873&#34;title=&#34;Alert preview evaluating multiple series with distinct threshold values&#34;/&gt;
           &lt;noscript&gt;
             &lt;img
               src=&#34;/media/docs/alerting/example-dynamic-thresholds-preview-v3.png&#34;
               alt=&#34;Alert preview evaluating multiple series with distinct threshold values&#34;width=&#34;1175&#34;height=&#34;873&#34;title=&#34;Alert preview evaluating multiple series with distinct threshold values&#34;/&gt;
           &lt;/noscript&gt;&lt;/div&gt;&lt;figcaption class=&#34;w-100p caption text-gray-13  &#34;&gt;Alert preview evaluating multiple series with distinct threshold values&lt;/figcaption&gt;&lt;/a&gt;&lt;/figure&gt;


&lt;div class=&#34;admonition admonition-tip&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Tip&lt;/p&gt;&lt;p&gt;You can explore this &lt;strong&gt;&lt;a href=&#34;https://play.grafana.org/alerting/grafana/dynamic-thresholds/view?tech=docs&amp;amp;pg=alerting-examples&amp;amp;plcmt=callout-tip&amp;amp;cta=alert-dynamic-thresholds&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;alerting example in Grafana Play&lt;/a&gt;&lt;/strong&gt;.&lt;/p&gt;
&lt;p&gt;Open the example to view alert evaluation results, generated alert instances, the alert history timeline, and alert rule details.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;other-use-cases&#34;&gt;Other use cases&lt;/h2&gt;
&lt;p&gt;This example showed how to build a single alert rule with different thresholds per series using 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;multi-dimensional alerts&lt;/a&gt; and 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/fundamentals/alert-rules/queries-conditions/#math&#34;&gt;Math expressions&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This approach scales well when monitoring similar components with distinct reliability goals.&lt;/p&gt;
&lt;p&gt;By aligning series from two queries, you can apply a dynamic threshold—one value per label set—without duplicating rules.&lt;/p&gt;
&lt;p&gt;While this example uses static CSV content to define thresholds, the same technique works in other scenarios:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Dynamic thresholds from queries or recording rules&lt;/strong&gt;: Fetch threshold values from a real-time query, or from 
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/alerting-rules/create-recording-rules/&#34;&gt;custom recording rules&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Combine multiple conditions&lt;/strong&gt;: Build more advanced threshold logic by combining multiple conditions—such as latency, error rate, or traffic volume.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;For example, you can define a PromQL expression that sets a latency threshold which adjusts based on traffic—allowing higher response times during periods of high load.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;(
  // Fires when p95 latency &amp;gt; 2s during usual traffic (≤ 1000 req/s)
  service:latency:p95 &amp;gt; 2 and service:request_rate:rate1m &amp;lt;= 1000
)
or
(
  // Fires when p95 latency &amp;gt; 4s during high traffic (&amp;gt; 1000 req/s)
  service:latency:p95 &amp;gt; 4 and service:request_rate:rate1m &amp;gt; 1000
)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
]]></content><description>&lt;h1 id="example-of-dynamic-thresholds-per-dimension">Example of dynamic thresholds per dimension&lt;/h1>
&lt;p>In Grafana Alerting, each alert rule supports only one condition expression.&lt;/p>
&lt;p>That&amp;rsquo;s enough in many cases—most alerts use a fixed numeric threshold like &lt;code>latency &amp;gt; 3s&lt;/code> or &lt;code>error_rate &amp;gt; 5%&lt;/code> to determine their state.&lt;/p></description></item><item><title>Examples of high-cardinality alerts</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/high-cardinality-alerts/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/high-cardinality-alerts/</guid><content><![CDATA[&lt;h1 id=&#34;examples-of-high-cardinality-alerts&#34;&gt;Examples of high-cardinality alerts&lt;/h1&gt;
&lt;p&gt;In Prometheus and Mimir, metrics are stored as time series, where each unique set of labels defines a distinct series.&lt;/p&gt;
&lt;p&gt;A large number of unique series (&lt;em&gt;high cardinality&lt;/em&gt;) can overload your metrics backend, slow down dashboard and alert queries, and quickly increase your observability costs or exceed the limits of your Grafana Cloud plan.&lt;/p&gt;
&lt;p&gt;These examples show how to detect and alert on early signs of high cardinality:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Total active series near limits&lt;/strong&gt;: detect when your Prometheus, Mimir, or Grafana Cloud Metrics instance approaches soft or hard limits.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Series increase per metric or scope:&lt;/strong&gt; fine-tune detection to identify growth in a particular metric or domain.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Sudden series growth&lt;/strong&gt;: detect runaway cardinality increases caused by misconfigured exporters or new deployments.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;High ingestion rate&lt;/strong&gt;: detect when too many samples per second are being ingested, even if the total series count is stable.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Use these alert patterns to act on high-cardinality growth, and consider implementing &lt;a href=&#34;/docs/grafana-cloud/adaptive-telemetry/adaptive-metrics/introduction/&#34;&gt;Adaptive Metrics recommendations&lt;/a&gt; to keep your observability costs under control.&lt;/p&gt;
&lt;h2 id=&#34;choose-metrics-to-monitor-active-series&#34;&gt;Choose metrics to monitor active series&lt;/h2&gt;
&lt;p&gt;First, identify which metric reports the number of active time series.&lt;/p&gt;
&lt;p&gt;Prometheus, Mimir, and Grafana Cloud expose this information differently:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Environment&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Metric&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Prometheus&lt;/strong&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;prometheus_tsdb_head_series&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Reports the number of active series currently stored in memory (the head block) of a single Prometheus instance. It includes series that have stopped receiving samples for up to one hour.&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Grafana Cloud&lt;/strong&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;grafanacloud_instance_active_series&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Tracks the number of &lt;a href=&#34;/docs/grafana-cloud/cost-management-and-billing/manage-invoices/understand-your-invoice/metrics-invoice/&#34;&gt;active series in your Grafana Cloud Metrics backend (Mimir&lt;/a&gt;).&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;strong&gt;Prometheus or Mimir&lt;/strong&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;count({__name__!=&amp;quot;&amp;quot;})&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Counts the number of series with recent samples by scanning the TSDB index. This query is expensive and should be exposed through a recording rule.&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;h2 id=&#34;detect-total-active-series-near-limits&#34;&gt;Detect total active series near limits&lt;/h2&gt;
&lt;p&gt;A high number of active series increases memory usage and can impact performance. Grafana Cloud enforces usage limits to prevent your instance from running into these performance issues.&lt;/p&gt;
&lt;p&gt;In Prometheus, you can alert when the total number of active series exceeds a threshold:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;prometheus_tsdb_head_series &amp;gt; 1.5e6&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This fires when the instance exceeds 1.5 million active series.&lt;br /&gt;
Adjust the threshold based on the capacity of your Prometheus host and observed load.&lt;/p&gt;
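&lt;p&gt;In a Prometheus-style rule file, the same check can be expressed as an alerting rule. The following is a sketch; the rule name, &lt;code&gt;for&lt;/code&gt; duration, and severity label are illustrative:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  - name: cardinality
    rules:
      - alert: TooManyActiveSeries
        expr: prometheus_tsdb_head_series &amp;gt; 1.5e6
        # Require the condition to hold for 15 minutes to avoid flapping
        for: 15m
        labels:
          severity: warning&lt;/code&gt;&lt;/pre&gt;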
&lt;p&gt;In Grafana Cloud, use the &lt;code&gt;grafanacloud_instance_active_series&lt;/code&gt; metric to monitor active series across your managed Mimir backend:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;grafanacloud_instance_active_series &amp;gt; 1.5e6&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Grafana Cloud also provides account-level limits through the &lt;code&gt;grafanacloud_instance_metrics_limits&lt;/code&gt; metric.&lt;/p&gt;
&lt;p&gt;For more robust alerting, you can compare your current usage to the &lt;code&gt;max_global_series_per_user&lt;/code&gt; limit:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;(
  grafanacloud_instance_active_series
  / on (id)
    grafanacloud_instance_metrics_limits{limit_name=&amp;#34;max_global_series_per_user&amp;#34;}
)
* on (id) group_left(name) grafanacloud_instance_info
&amp;gt; 0.9&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;grafanacloud_instance_active_series&lt;/code&gt;&lt;br /&gt;
Returns the current number of active series for each Mimir (Prometheus) instance (&lt;code&gt;id&lt;/code&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;/ on (id) grafanacloud_instance_metrics_limits{limit_name=&amp;quot;max_global_series_per_user&amp;quot;}&lt;/code&gt;&lt;br /&gt;
Divides current usage by the account limit to calculate a utilization ratio between 0 and 1 (where &lt;code&gt;1&lt;/code&gt; means the limit is reached).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;* on (id) group_left(name) grafanacloud_instance_info&lt;/code&gt;&lt;br /&gt;
Joins instance metadata to display the instance name.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;gt; 0.9&lt;/code&gt;&lt;br /&gt;
Defines the threshold condition to fire when usage exceeds 90% of the limit.&lt;br /&gt;
Adjust this value according to your alert goal. Alternatively, you can set the threshold as a Grafana Alerting expression in the UI.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
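&lt;p&gt;As a Prometheus-style alerting rule, the utilization check might look like this (a sketch; the rule name, duration, and annotation wording are illustrative):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  - name: series-limits
    rules:
      - alert: ActiveSeriesNearLimit
        expr: |
          (
            grafanacloud_instance_active_series
            / on (id)
              grafanacloud_instance_metrics_limits{limit_name=&amp;#34;max_global_series_per_user&amp;#34;}
          )
          * on (id) group_left(name) grafanacloud_instance_info
          &amp;gt; 0.9
        for: 10m
        annotations:
          # $value holds the utilization ratio computed by the expression
          summary: &amp;#34;{{ $labels.name }} is at {{ $value | humanizePercentage }} of its series limit&amp;#34;&lt;/code&gt;&lt;/pre&gt;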
&lt;h2 id=&#34;detect-high-cardinality-per-metric&#34;&gt;Detect high-cardinality per metric&lt;/h2&gt;
&lt;p&gt;Instead of monitoring the total number of active series, you can fine-tune alerts to detect high cardinality within a specific scope — for example, by filtering on certain namespaces, services, or metrics known to generate many label combinations.&lt;/p&gt;
&lt;p&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;Multi-dimensional alerts&lt;/a&gt; let you evaluate each metric independently, so you can identify which metric is responsible for the label explosion instead of only tracking the overall total.&lt;/p&gt;
&lt;p&gt;You can apply label filters, or use &lt;code&gt;{__name__=~&amp;quot;regex&amp;quot;}&lt;/code&gt; to select specific metrics. Then, use &lt;code&gt;count by (__name__)&lt;/code&gt; to group results per metric name.&lt;/p&gt;
&lt;p&gt;Because the &lt;code&gt;__name__&lt;/code&gt; selector scans the entire TSDB index, it’s recommended to evaluate this query through a &lt;strong&gt;recording rule&lt;/strong&gt;:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;# Only HTTP/RPC-style metrics
active_series_per_metric:http_rpc =
label_replace(
  count by (__name__) ({__name__=~&amp;#34;http_.*|rpc_.*&amp;#34;}),
  &amp;#34;metric&amp;#34;, &amp;#34;$1&amp;#34;, &amp;#34;__name__&amp;#34;, &amp;#34;(.*)&amp;#34;
)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This recording rule stores the number of active series per metric name.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;count by (__name__) ({__name__=~&amp;quot;http_.*|rpc_.*&amp;quot;})&lt;/code&gt;&lt;br /&gt;
Counts the number of active series per metric matching the &lt;code&gt;http_.*&lt;/code&gt; or &lt;code&gt;rpc_.*&lt;/code&gt; regex.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;label_replace(..., &amp;quot;metric&amp;quot;, &amp;quot;$1&amp;quot;, &amp;quot;__name__&amp;quot;, &amp;quot;(.*)&amp;quot;)&lt;/code&gt;&lt;br /&gt;
Copies the metric name into a new label called &lt;code&gt;metric&lt;/code&gt;.&lt;br /&gt;
This enables generating one alert instance per metric because &lt;code&gt;__name__&lt;/code&gt; is not treated as a regular label.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Adjust the threshold and recording rule scope based on the label usage and normal behavior of your observed metrics.&lt;/p&gt;
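&lt;p&gt;In Prometheus or Mimir, the expression can be registered in a rule file. The following sketch assumes a 5-minute evaluation interval; adjust it to your environment:&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-yaml&#34;&gt;groups:
  - name: cardinality-recording
    interval: 5m
    rules:
      - record: active_series_per_metric:http_rpc
        expr: |
          label_replace(
            count by (__name__) ({__name__=~&amp;#34;http_.*|rpc_.*&amp;#34;}),
            &amp;#34;metric&amp;#34;, &amp;#34;$1&amp;#34;, &amp;#34;__name__&amp;#34;, &amp;#34;(.*)&amp;#34;
          )&lt;/code&gt;&lt;/pre&gt;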
&lt;p&gt;After the recording rule is available, you can define this multi-dimensional alert rule as follows:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;active_series_per_metric:http_rpc &amp;gt; 100&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Grafana Alerting evaluates each row (or time series) returned by the &lt;code&gt;active_series_per_metric:http_rpc&lt;/code&gt; recording rule as a separate alert instance, producing independent alert instance states:&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;button-div&#34;&gt;
      &lt;button class=&#34;expand-table-btn&#34;&gt;Expand table&lt;/button&gt;
    &lt;/div&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Alert instance&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Value&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;State&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{metric=&amp;quot;http_requests_total&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;320&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Firing&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{metric=&amp;quot;rpc_client_calls_total&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;45&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Normal&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;&lt;code&gt;{metric=&amp;quot;rpc_server_errors_total&amp;quot;}&lt;/code&gt;&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;110&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Firing&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;&lt;p&gt;Each metric name (&lt;code&gt;__name__&lt;/code&gt;) becomes a separate alert instance, so you immediately see which metric exceeds the expected limit.&lt;/p&gt;
&lt;h2 id=&#34;detect-sudden-cardinality-growth&#34;&gt;Detect sudden cardinality growth&lt;/h2&gt;
&lt;p&gt;Even if the number of active series stays within safe limits, a sudden increase can signal a misbehaving exporter, a new deployment, or an unexpected label explosion. Detecting these spikes early can help you prevent potential issues, or simply flag deployment changes that might need adjustment.&lt;/p&gt;
&lt;p&gt;You can use any of the metrics from the previous examples to track short-term changes in the total active series:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;shell&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;delta(active_series_metric[10m]) &amp;gt; 1000&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This alert fires when the number of active series increases by more than &lt;strong&gt;1000&lt;/strong&gt; within the last 10 minutes.&lt;br /&gt;
Adjust the time window (for example, &lt;code&gt;[5m]&lt;/code&gt; or &lt;code&gt;[30m]&lt;/code&gt;) and threshold to match your environment’s normal variability.&lt;/p&gt;
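&lt;p&gt;An absolute delta can be noisy across instances of different sizes. As an alternative sketch, you can alert on relative growth by comparing the current value to an earlier one with &lt;code&gt;offset&lt;/code&gt; (here using &lt;code&gt;prometheus_tsdb_head_series&lt;/code&gt; as the metric):&lt;/p&gt;
&lt;pre&gt;&lt;code class=&#34;language-shell&#34;&gt;# Fires when active series grew by more than 20% over the last hour
prometheus_tsdb_head_series
  / (prometheus_tsdb_head_series offset 1h)
&amp;gt; 1.2&lt;/code&gt;&lt;/pre&gt;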
&lt;h2 id=&#34;detect-high-ingestion-rate&#34;&gt;Detect high ingestion rate&lt;/h2&gt;
&lt;p&gt;Even if label cardinality remains under control, a high ingestion rate can affect Prometheus performance or increase observability costs.&lt;/p&gt;
&lt;p&gt;In Prometheus, this usually happens when scrapes occur too frequently or when exporters generate large numbers of samples in short intervals.&lt;/p&gt;
&lt;p&gt;To find an appropriate alert threshold, start by monitoring normal ingestion peaks, then set the threshold below the point where scrapes or WAL operations begin to slow down.&lt;/p&gt;
&lt;p&gt;In Prometheus, use the &lt;code&gt;prometheus_tsdb_head_samples_appended_total&lt;/code&gt; counter to measure the rate at which samples are appended to storage:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;rate(prometheus_tsdb_head_samples_appended_total[10m]) &amp;gt; 1e5&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The alert rule query returns the average ingestion rate per second over the last 10 minutes and fires when the value exceeds 100,000 samples per second.&lt;/p&gt;
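&lt;p&gt;To estimate a suitable threshold, you can first chart the historical peak of the same expression. The following query is an illustrative sketch: adjust the 7-day lookback and the subquery resolution to match your retention and scrape interval.&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;max_over_time(rate(prometheus_tsdb_head_samples_appended_total[10m])[7d:10m])&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;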
&lt;p&gt;In Grafana Cloud, use the &lt;code&gt;grafanacloud_instance_samples_per_second&lt;/code&gt; metric to monitor total ingestion rate of your Mimir instances:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;grafanacloud_instance_samples_per_second
  * on (id) group_left(name) grafanacloud_instance_info
&amp;gt; 1e5&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The &lt;code&gt;on (id) group_left(name)&lt;/code&gt; join copies the &lt;code&gt;name&lt;/code&gt; label from &lt;code&gt;grafanacloud_instance_info&lt;/code&gt; onto each result, so notifications identify the affected instance by name.&lt;/p&gt;
&lt;p&gt;Grafana Cloud metrics limits are also based on &lt;a href=&#34;/docs/grafana-cloud/cost-management-and-billing/manage-invoices/understand-your-invoice/metrics-invoice/&#34;&gt;data points per minute (DPM)&lt;/a&gt;: the number of samples sent per minute across all your active series.&lt;/p&gt;
&lt;p&gt;To monitor when your actual data-point rate approaches your DPM limit, you can compare total ingestion to your plan’s DPM limit:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;(
  (grafanacloud_instance_samples_per_second * 60)
  / grafanacloud_org_metrics_included_dpm_per_series
)
&amp;gt; 0.9&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;(grafanacloud_instance_samples_per_second * 60)&lt;/code&gt;&lt;br /&gt;
Converts the ingestion rate from data points per second to data points per minute (DPM).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;/ grafanacloud_org_metrics_included_dpm_per_series&lt;/code&gt;&lt;br /&gt;
Divides current DPM usage by the DPM limit to calculate a utilization ratio between 0 and 1 (where &lt;code&gt;1&lt;/code&gt; means the limit is reached).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;&amp;gt; 0.9&lt;/code&gt;&lt;br /&gt;
Defines the threshold condition to fire when the usage exceeds 90% of the DPM limit.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This alert helps you detect when your organization’s ingestion rate is rising toward its limits. Use ingestion rate alerts to detect workload spikes, exporter misconfigurations, or rapid increases in ingestion volume and cost.&lt;/p&gt;
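&lt;p&gt;When the utilization alert fires, a follow-up query can show which instances contribute most to ingestion. This is an illustrative sketch that reuses the same Grafana Cloud metrics as above:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-shell&#34;&gt;topk(5,
  grafanacloud_instance_samples_per_second
    * on (id) group_left(name) grafanacloud_instance_info
)&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;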
&lt;h2 id=&#34;learn-more&#34;&gt;Learn more&lt;/h2&gt;
&lt;p&gt;Here’s a list of additional resources related to this example:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/grafana/v12.4/alerting/best-practices/multi-dimensional-alerts/&#34;&gt;Multi-dimensional alerting example&lt;/a&gt; – Learn how Grafana creates separate alert instances for each unique label set.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/docs/grafana-cloud/cost-management-and-billing/manage-invoices/understand-your-invoice/metrics-invoice/&#34;&gt;Understand Grafana Cloud active series and DPM&lt;/a&gt; – See how active series and data points per minute (DPM) are used to calculate metrics usage in Grafana Cloud.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/docs/grafana-cloud/cost-management-and-billing/usage-cost-alerts/&#34;&gt;Create Grafana Cloud usage alerts&lt;/a&gt; – Set up alerts when your usage or costs approach your predefined limits.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/docs/mimir/latest/manage/run-production-environment/planning-capacity/&#34;&gt;Plan capacity for Mimir&lt;/a&gt; – Learn how to plan ingestion rate and memory capacity for Mimir or Prometheus environments.&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/docs/grafana-cloud/adaptive-telemetry/adaptive-metrics/introduction/&#34;&gt;Adaptive Metrics recommendations&lt;/a&gt; – Use Adaptive Metrics to automatically reduce high-cardinality metrics and control observability costs.&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="examples-of-high-cardinality-alerts">Examples of high-cardinality alerts&lt;/h1>
&lt;p>In Prometheus and Mimir, metrics are stored as time series, where each unique set of labels defines a distinct series.&lt;/p>
&lt;p>A large number of unique series (&lt;em>high cardinality&lt;/em>) can overload your metrics backend, slow down dashboard and alert queries, and quickly increase your observability costs or exceed the limits of your Grafana Cloud plan.&lt;/p></description></item><item><title>Grafana Alerting tutorials</title><link>https://grafana.com/docs/grafana/v12.4/alerting/examples/tutorials/</link><pubDate>Fri, 03 Apr 2026 12:35:46 -0500</pubDate><guid>https://grafana.com/docs/grafana/v12.4/alerting/examples/tutorials/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-alerting-tutorials&#34;&gt;Grafana Alerting tutorials&lt;/h1&gt;
&lt;p&gt;This section provides step-by-step tutorials to help you learn Grafana Alerting and explore key features through practical, easy-to-follow examples.&lt;/p&gt;
&lt;h2 id=&#34;get-started-with-grafana-alerting&#34;&gt;Get started with Grafana Alerting&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started/&#34;&gt;Create and receive your first alert&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started-pt2/&#34;&gt;Create multi-dimensional alerts and route them&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started-pt3/&#34;&gt;Group alert notifications&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started-pt4/&#34;&gt;Template your alert notifications&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;additional-tutorials&#34;&gt;Additional tutorials&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started-pt5/&#34;&gt;Route alerts using dynamic labels&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/alerting-get-started-pt6/&#34;&gt;Link alerts to visualizations&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/create-alerts-with-logs/&#34;&gt;Create alerts with log data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;/tutorials/create-alerts-from-flux-queries/&#34;&gt;Create alerts with InfluxDB and Flux queries&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
]]></content><description>&lt;h1 id="grafana-alerting-tutorials">Grafana Alerting tutorials&lt;/h1>
&lt;p>This section provides step-by-step tutorials to help you learn Grafana Alerting and explore key features through practical, easy-to-follow examples.&lt;/p>
&lt;h2 id="get-started-with-grafana-alerting">Get started with Grafana Alerting&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="/tutorials/alerting-get-started/">Create and receive your first alert&lt;/a>&lt;/li>
&lt;li>&lt;a href="/tutorials/alerting-get-started-pt2/">Create multi-dimensional alerts and route them&lt;/a>&lt;/li>
&lt;li>&lt;a href="/tutorials/alerting-get-started-pt3/">Group alert notifications&lt;/a>&lt;/li>
&lt;li>&lt;a href="/tutorials/alerting-get-started-pt4/">Template your alert notifications&lt;/a>&lt;/li>
&lt;/ul>
&lt;h2 id="additional-tutorials">Additional tutorials&lt;/h2>
&lt;ul>
&lt;li>&lt;a href="/tutorials/alerting-get-started-pt5/">Route alerts using dynamic labels&lt;/a>&lt;/li>
&lt;li>&lt;a href="/tutorials/alerting-get-started-pt6/">Link alerts to visualizations&lt;/a>&lt;/li>
&lt;li>&lt;a href="/tutorials/create-alerts-with-logs/">Create alerts with log data&lt;/a>&lt;/li>
&lt;li>&lt;a href="/tutorials/create-alerts-from-flux-queries/">Create alerts with InfluxDB and Flux queries&lt;/a>&lt;/li>
&lt;/ul></description></item></channel></rss>