<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Grafana Mimir advanced architecture on Grafana Labs</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/</link><description>Recent content in Grafana Mimir advanced architecture on Grafana Labs</description><generator>Hugo -- gohugo.io</generator><language>en</language><atom:link href="/docs/mimir/v3.1.x/references/architecture/index.xml" rel="self" type="application/rss+xml"/><item><title>Grafana Mimir deployment modes</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/deployment-modes/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/deployment-modes/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-deployment-modes&#34;&gt;Grafana Mimir deployment modes&lt;/h1&gt;
&lt;p&gt;Grafana Mimir offers two deployment modes to accommodate different operational requirements and scale needs. Choose the deployment mode that best fits your use case:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Monolithic mode: Run all components in a single process for simple deployments.&lt;/li&gt;
&lt;li&gt;Microservices mode: Deploy components separately for maximum scalability and flexibility.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Configure the deployment mode using the &lt;code&gt;-target&lt;/code&gt; parameter, which you can set via CLI flag or YAML configuration.&lt;/p&gt;
&lt;h2 id=&#34;about-monolithic-mode&#34;&gt;About monolithic mode&lt;/h2&gt;
&lt;p&gt;Monolithic mode runs all required components in a single process. This mode is ideal for getting started or running Grafana Mimir in a development environment.&lt;/p&gt;
&lt;p&gt;To enable monolithic mode, set &lt;code&gt;-target=all&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To see the complete list of components that run in monolithic mode, use the &lt;code&gt;-modules&lt;/code&gt; flag:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-bash&#34;&gt;./mimir -modules&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;This diagram shows how Mimir works in monolithic mode:&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;monolithic-mode.svg&#34;
  alt=&#34;Mimir&amp;rsquo;s monolithic mode&#34;/&gt;&lt;/p&gt;
&lt;h3 id=&#34;scale-monolithic-mode&#34;&gt;Scale monolithic mode&lt;/h3&gt;
&lt;p&gt;You can horizontally scale monolithic mode by deploying multiple Mimir binaries with &lt;code&gt;-target=all&lt;/code&gt;. This approach, shown in the following diagram, provides high availability and increased scale without the configuration complexity of &lt;a href=&#34;#about-microservices-mode&#34;&gt;microservices mode&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;scaled-monolithic-mode.svg&#34;
  alt=&#34;Mimir&amp;rsquo;s horizontally scaled monolithic mode&#34;/&gt;&lt;/p&gt;


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;Because monolithic mode requires scaling all Grafana Mimir components together, this deployment mode isn&amp;rsquo;t recommended for large-scale deployments.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;h2 id=&#34;about-microservices-mode&#34;&gt;About microservices mode&lt;/h2&gt;
&lt;p&gt;Microservices mode deploys each component in a separate process, which enables independent scaling and creates granular failure domains. Microservices mode is recommended for production environments.&lt;/p&gt;


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;Even though the read path (query-frontend, querier, and store-gateway) runs separately from the write path (distributor and ingester), a healthy ring is typically required for successful queries. If the write components (distributor or ingester) are unavailable or unhealthy, the ring health check may fail, causing read queries to fail. Complete isolation of read versus write availability requires careful configuration of ring settings and failure tolerance.&lt;/p&gt;
&lt;p&gt;Specifically, the querier consults the 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/hash-ring/&#34;&gt;hash ring&lt;/a&gt; to discover ingesters before reading recent data from them. If ingesters are unhealthy or unavailable, the ring reflects that state and the querier may be unable to satisfy queries for recent data. Achieving true read/write isolation requires zone-aware replication and careful ring configuration so that the loss of write-path components does not reduce the ring below the quorum needed for reads. For more information, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/configure/configure-zone-aware-replication/&#34;&gt;Configuring zone-aware replication&lt;/a&gt;.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;p&gt;The following diagrams show how Mimir works in microservices mode using ingest storage and classic architectures. For more information about the two supported architectures in Grafana Mimir, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/get-started/about-grafana-mimir-architecture/&#34;&gt;Grafana Mimir architecture&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Ingest storage architecture:&lt;/p&gt;
&lt;div align=&#34;center&#34;&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;/media/docs/mimir/ingest-storage-overview.png&#34;
  alt=&#34;Ingest storage architecture diagram&#34; width=&#34;835&#34;
     height=&#34;556&#34;/&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;Classic architecture:&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;microservices-mode.svg&#34;
  alt=&#34;Mimir&amp;rsquo;s microservices mode&#34;/&gt;&lt;/p&gt;
&lt;p&gt;In microservices mode, each Grafana Mimir process is invoked with its &lt;code&gt;-target&lt;/code&gt; parameter set to a specific Grafana Mimir component (for example, &lt;code&gt;-target=ingester&lt;/code&gt; or &lt;code&gt;-target=distributor&lt;/code&gt;). To get a working Grafana Mimir instance, you must deploy every required component. For more information about each of the Grafana Mimir components, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/&#34;&gt;Grafana Mimir advanced architecture&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;To deploy Grafana Mimir in microservices mode, use &lt;a href=&#34;https://kubernetes.io/&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Kubernetes&lt;/a&gt; and the &lt;a href=&#34;/docs/helm-charts/mimir-distributed/latest/&#34;&gt;mimir-distributed Helm chart&lt;/a&gt;.&lt;/p&gt;
]]></content><description>&lt;h1 id="grafana-mimir-deployment-modes">Grafana Mimir deployment modes&lt;/h1>
&lt;p>Grafana Mimir offers two deployment modes to accommodate different operational requirements and scale needs. Choose the deployment mode that best fits your use case:&lt;/p></description></item><item><title>About Grafana Mimir network ports</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/ports/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/ports/</guid><content><![CDATA[&lt;h1 id=&#34;about-grafana-mimir-network-ports&#34;&gt;About Grafana Mimir network ports&lt;/h1&gt;
&lt;p&gt;Grafana Mimir uses several network ports for communication between its internal components, for communication with external services such as Prometheus and Grafana, and for overall cluster operation. Correct port configuration is essential for setting up your Mimir cluster, configuring firewalls, and securing communication between Mimir components and integrated tools.&lt;/p&gt;
&lt;p&gt;The ports required to run Grafana Mimir can vary slightly depending on your deployment mode and whether you&amp;rsquo;re using additional components like Grafana or a load balancer.&lt;/p&gt;
&lt;p&gt;The following table shows the default ports that are fundamental to operating Mimir, whether in a monolithic or distributed setup. You can update these values in your Mimir configuration.&lt;/p&gt;
&lt;section class=&#34;expand-table-wrapper&#34;&gt;&lt;div class=&#34;responsive-table-wrapper&#34;&gt;
    &lt;table&gt;
      &lt;thead&gt;
          &lt;tr&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Port&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Function&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Related components&lt;/th&gt;
              &lt;th style=&#34;text-align: left&#34;&gt;Description&lt;/th&gt;
          &lt;/tr&gt;
      &lt;/thead&gt;
      &lt;tbody&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;8080&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;HTTP API / remote write&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;All Mimir components&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;This is the main entry point for Prometheus to remote-write metrics to Mimir through the Distributor and for Grafana and Prometheus to query data through the Querier or Query-frontend. This port is not typically exposed, as Grafana Mimir generally runs behind an Nginx proxy, the GEM gateway, or Kubernetes ingress.&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;9095&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Internal gRPC communication&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;All Mimir components&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Used for high-performance communication between different Mimir components, such as Distributor to Ingester, or Querier to Ingester. This communication is essential for distributed deployments.&lt;/td&gt;
          &lt;/tr&gt;
          &lt;tr&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;7946&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Memberlist / Gossip&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;All Mimir components&lt;/td&gt;
              &lt;td style=&#34;text-align: left&#34;&gt;Used for service discovery and maintaining the consistent hash ring that allows Mimir components to find and communicate with each other. This process is critical for high availability and scaling.&lt;/td&gt;
          &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/div&gt;
&lt;/section&gt;]]></content><description>&lt;h1 id="about-grafana-mimir-network-ports">About Grafana Mimir network ports&lt;/h1>
&lt;p>Grafana Mimir uses various network ports to facilitate communication between its internal components, external services like Prometheus and Grafana, and for overall cluster operation. Proper port configuration is crucial for setting up your Mimir cluster, configuring firewalls, and ensuring secure communication between Mimir components and integrated tools.&lt;/p></description></item><item><title>Grafana Mimir components</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/components/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/components/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-components&#34;&gt;Grafana Mimir components&lt;/h1&gt;
&lt;p&gt;Grafana Mimir includes a set of components that interact to form a cluster.&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/compactor/&#34;&gt;Compactor&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/distributor/&#34;&gt;Distributor&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/ingester/&#34;&gt;Ingester&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/querier/&#34;&gt;Querier&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/query-frontend/&#34;&gt;Query-frontend&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/query-scheduler/&#34;&gt;Query-scheduler&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/store-gateway/&#34;&gt;Store-gateway&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/alertmanager/&#34;&gt;(Optional) Alertmanager&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/overrides-exporter/&#34;&gt;(Optional) Overrides-exporter&lt;/a&gt;&lt;/li&gt;&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/ruler/&#34;&gt;(Optional) Ruler&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;
]]></content><description>&lt;h1 id="grafana-mimir-components">Grafana Mimir components&lt;/h1>
&lt;p>Grafana Mimir includes a set of components that interact to form a cluster.&lt;/p>
&lt;ul>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/compactor/">Compactor&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/distributor/">Distributor&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/ingester/">Ingester&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/querier/">Querier&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/query-frontend/">Query-frontend&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/query-scheduler/">Query-scheduler&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/store-gateway/">Store-gateway&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/alertmanager/">(Optional) Alertmanager&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/overrides-exporter/">(Optional) Overrides-exporter&lt;/a>&lt;/li>&lt;li>
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/ruler/">(Optional) Ruler&lt;/a>&lt;/li>&lt;/ul></description></item><item><title>Grafana Mimir binary index-header</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/binary-index-header/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/binary-index-header/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-binary-index-header&#34;&gt;Grafana Mimir binary index-header&lt;/h1&gt;
&lt;p&gt;To query series inside blocks from object storage, the &lt;a href=&#34;../components/store-gateway/&#34;&gt;store-gateway&lt;/a&gt; must obtain information about each block index.
To obtain the required information, the store-gateway builds an index-header for each block and stores it on local disk.&lt;/p&gt;
&lt;p&gt;The store-gateway uses &lt;code&gt;GET&lt;/code&gt; byte-range requests to build the index-header, which contains specific sections of the block&amp;rsquo;s index. The store-gateway uses the index-header at query time.&lt;/p&gt;
&lt;p&gt;Because downloading specific sections of the original block&amp;rsquo;s index is a computationally inexpensive operation, the index-header is not uploaded to the object storage.
If the index-header is not available on local disk, for example on a new store-gateway instance or on an instance that restarts without a persistent disk during a rolling update, the store-gateway rebuilds the index-header from the original block&amp;rsquo;s index.&lt;/p&gt;
&lt;h2 id=&#34;format-version-1&#34;&gt;Format (version 1)&lt;/h2&gt;
&lt;p&gt;The index-header is a subset of the block index and contains:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md#symbol-table&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Symbol Table&lt;/a&gt;: Used to unintern string values&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://github.com/prometheus/prometheus/blob/master/tsdb/docs/format/index.md#postings-offset-table&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Posting Offset Table&lt;/a&gt;: Used to look up postings&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The following example shows the format of the index-header file that is located in each block&amp;rsquo;s store-gateway local directory. The file ends with a table of contents (TOC) that serves as an entry point into the index.&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;┌─────────────────────────────┬───────────────────────────────┐
│    magic(0xBAAAD792) &amp;lt;4b&amp;gt;   │      version(1) &amp;lt;1 byte&amp;gt;      │
├─────────────────────────────┼───────────────────────────────┤
│  index version(2) &amp;lt;1 byte&amp;gt;  │ index PostingOffsetTable &amp;lt;8b&amp;gt; │
├─────────────────────────────┴───────────────────────────────┤
│ ┌─────────────────────────────────────────────────────────┐ │
│ │      Symbol Table (exact copy from original index)      │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │      Posting Offset Table (exact copy from index)       │ │
│ ├─────────────────────────────────────────────────────────┤ │
│ │                          TOC                            │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
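&lt;p&gt;As a minimal sketch of the layout shown above, the following Python snippet decodes the fixed-size fields at the start of an index-header: the magic number, the index-header version, the index version, and the 8-byte Posting Offset Table field. The names &lt;code&gt;PREAMBLE&lt;/code&gt; and &lt;code&gt;parse_preamble&lt;/code&gt; are hypothetical, big-endian byte order is an assumption, and the sample bytes are synthetic rather than read from a real file.&lt;/p&gt;

```python
import struct

# Field sizes follow the diagram: magic (4 bytes), index-header version
# (1 byte), index version (1 byte), Posting Offset Table field (8 bytes).
# Big-endian byte order is an assumption of this sketch.
PREAMBLE = struct.Struct(">IBBQ")
MAGIC = 0xBAAAD792


def parse_preamble(data: bytes) -> dict:
    """Decode the fixed-size fields at the start of an index-header."""
    head = data[:PREAMBLE.size]
    if len(head) != PREAMBLE.size:
        raise ValueError("input is shorter than the fixed-size preamble")
    magic, version, index_version, postings_table = PREAMBLE.unpack(head)
    if magic != MAGIC:
        raise ValueError(f"unexpected magic number: {magic:#x}")
    return {
        "version": version,
        "index_version": index_version,
        "postings_table": postings_table,
    }


# Synthetic bytes for illustration only, not a real index-header file.
sample = PREAMBLE.pack(MAGIC, 1, 2, 4096)
info = parse_preamble(sample)
print(info)
```

&lt;p&gt;Validating the magic number first is a cheap way to reject a file that is not an index-header before interpreting the remaining fields.&lt;/p&gt;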
]]></content><description>&lt;h1 id="grafana-mimir-binary-index-header">Grafana Mimir binary index-header&lt;/h1>
&lt;p>To query series inside blocks from object storage, the &lt;a href="../components/store-gateway/">store-gateway&lt;/a> must obtain information about each block index.
To obtain the required information, the store-gateway builds an index-header for each block and stores it on local disk.&lt;/p></description></item><item><title>Grafana Mimir bucket index</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/bucket-index/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/bucket-index/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-bucket-index&#34;&gt;Grafana Mimir bucket index&lt;/h1&gt;
&lt;p&gt;The bucket index is a per-tenant file that contains the list of blocks and block deletion marks in the storage. The bucket index is stored in the backend object storage, is periodically updated by the compactor, and used by queriers, store-gateways, and rulers (in &lt;a href=&#34;../components/ruler/#internal&#34;&gt;internal&lt;/a&gt; operational mode) to discover blocks in the storage.&lt;/p&gt;
&lt;h2 id=&#34;benefits&#34;&gt;Benefits&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;../components/querier/&#34;&gt;querier&lt;/a&gt;, &lt;a href=&#34;../components/store-gateway/&#34;&gt;store-gateway&lt;/a&gt; and &lt;a href=&#34;../components/ruler/&#34;&gt;ruler&lt;/a&gt; must have an almost&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt; up-to-date view of the storage bucket, in order to find the right blocks to look up at query time (querier) and to load a block&amp;rsquo;s &lt;a href=&#34;../binary-index-header/&#34;&gt;index-header&lt;/a&gt; (store-gateway).
Because of this, they need to periodically scan the bucket to look for new blocks uploaded by ingesters or compactors, and blocks deleted (or marked for deletion) by compactors.&lt;/p&gt;
&lt;p&gt;When the bucket index is enabled, the querier, store-gateway, and ruler periodically look up the per-tenant bucket index instead of scanning the bucket via &lt;code&gt;list objects&lt;/code&gt; operations.&lt;/p&gt;
&lt;p&gt;This provides the following benefits:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Reduced number of API calls to the object storage by querier and store-gateway&lt;/li&gt;
&lt;li&gt;No &amp;ldquo;list objects&amp;rdquo; storage API calls performed by querier and store-gateway&lt;/li&gt;
&lt;li&gt;The &lt;a href=&#34;../components/querier/&#34;&gt;querier&lt;/a&gt; is ready to serve queries immediately after startup because it doesn&amp;rsquo;t need to run an initial bucket scan&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;structure-of-the-index&#34;&gt;Structure of the index&lt;/h2&gt;
&lt;p&gt;The &lt;code&gt;bucket-index.json.gz&lt;/code&gt; contains:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;blocks&lt;/code&gt;&lt;/strong&gt;&lt;br /&gt;
List of complete blocks of a tenant, including blocks marked for deletion. Partial blocks are excluded from the index.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;block_deletion_marks&lt;/code&gt;&lt;/strong&gt;&lt;br /&gt;
List of block deletion marks.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;&lt;code&gt;updated_at&lt;/code&gt;&lt;/strong&gt;&lt;br /&gt;
A Unix timestamp, with precision in seconds, of the last time the index was updated and written to the storage.&lt;/li&gt;
&lt;/ul&gt;
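&lt;p&gt;The following Python sketch illustrates how a consumer might decode a bucket index and list the blocks that have no deletion mark. Only the three top-level fields come from the description above. The per-entry &lt;code&gt;block_id&lt;/code&gt; key, the helper names, and the sample data are assumptions made for illustration.&lt;/p&gt;

```python
import gzip
import json


def load_bucket_index(raw: bytes) -> dict:
    """Decompress and parse the bytes of a bucket-index.json.gz object."""
    return json.loads(gzip.decompress(raw))


def live_block_ids(index: dict) -> list:
    """Return the IDs of blocks that have no deletion mark."""
    marked = {mark["block_id"] for mark in index["block_deletion_marks"]}
    return [b["block_id"] for b in index["blocks"] if b["block_id"] not in marked]


# Synthetic index for illustration. The block IDs are made up, and the
# per-entry "block_id" key is an assumption of this sketch.
sample = {
    "blocks": [{"block_id": "block-a"}, {"block_id": "block-b"}],
    "block_deletion_marks": [{"block_id": "block-a"}],
    "updated_at": 1700000000,
}
raw = gzip.compress(json.dumps(sample).encode("utf-8"))

index = load_bucket_index(raw)
print(live_block_ids(index))
```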
&lt;h2 id=&#34;how-it-gets-updated&#34;&gt;How it gets updated&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;../components/compactor/&#34;&gt;compactor&lt;/a&gt; periodically scans the bucket and uploads an updated bucket index to the storage.
You can configure the frequency with which the bucket index is updated via &lt;code&gt;-compactor.cleanup-interval&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;The use of the bucket index is optional, but the index is built and updated by the compactor even if &lt;code&gt;-blocks-storage.bucket-store.bucket-index.enabled=false&lt;/code&gt;.
This behavior ensures that the bucket index for any tenant exists and that query result consistency is guaranteed if a Grafana Mimir cluster operator enables the bucket index in a live cluster.
The overhead introduced by keeping the bucket index updated is not significant.&lt;/p&gt;
&lt;h2 id=&#34;how-its-used-by-the-querier&#34;&gt;How it&amp;rsquo;s used by the querier&lt;/h2&gt;
&lt;p&gt;At query time the &lt;a href=&#34;../components/querier/&#34;&gt;querier&lt;/a&gt; and &lt;a href=&#34;../components/ruler/&#34;&gt;ruler&lt;/a&gt; determine whether the bucket index for the tenant has already been loaded to memory.
If not, the querier and ruler download it from the storage and cache it.&lt;/p&gt;
&lt;p&gt;Because the bucket index is a small file, lazily downloading it doesn&amp;rsquo;t have a significant impact on first query performance, but it does allow a querier to get up and running without pre-downloading every tenant&amp;rsquo;s bucket index.
In addition, if the &lt;a href=&#34;../components/querier/#metadata-cache&#34;&gt;metadata cache&lt;/a&gt; is enabled, the bucket index is cached for a short time in a shared cache, which reduces latency and the number of API calls to the object storage when multiple queriers and rulers fetch the same tenant&amp;rsquo;s bucket index within a short time.&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;bucket-index-querier-workflow.png&#34;
  alt=&#34;Querier - Bucket index&#34;/&gt;&lt;/p&gt;
&lt;!-- Diagram source at https://docs.google.com/presentation/d/1bHp8_zcoWCYoNU2AhO2lSagQyuIrghkCncViSqn14cU/edit --&gt;
&lt;p&gt;While a bucket index is held in memory, a background process periodically refreshes it, so subsequent queries from the same tenant to the same querier instance use the cached, periodically updated bucket index.&lt;/p&gt;
&lt;p&gt;The following configuration options determine bucket index update intervals:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;-blocks-storage.bucket-store.sync-interval&lt;/code&gt;&lt;br /&gt;
This option configures how frequently a cached bucket index is refreshed.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-blocks-storage.bucket-store.bucket-index.update-on-error-interval&lt;/code&gt;&lt;br /&gt;
If downloading a bucket index fails, the failure is cached for a short time so that the backend storage doesn&amp;rsquo;t experience a large volume of storage requests.
This option configures how frequently the bucket store retries loading a bucket index that previously failed to load.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If a bucket index is unused for the amount of time configured via &lt;code&gt;-blocks-storage.bucket-store.bucket-index.idle-timeout&lt;/code&gt; (for example, if a querier instance is not receiving any queries from the tenant), the querier removes it from memory and stops updating it at regular intervals.
This is useful for tenants that are resharded to different queriers when &lt;a href=&#34;../../../configure/configure-shuffle-sharding/&#34;&gt;shuffle sharding&lt;/a&gt; is enabled.&lt;/p&gt;
&lt;p&gt;At query time the querier and ruler determine how old a bucket index is based on its &lt;code&gt;updated_at&lt;/code&gt; field.
The query fails if the bucket index is older than the period configured via &lt;code&gt;-blocks-storage.bucket-store.bucket-index.max-stale-period&lt;/code&gt;.
This circuit breaker ensures queriers and rulers do not return any partial query results due to a stale view over the long-term storage.&lt;/p&gt;
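&lt;p&gt;The staleness check described above can be sketched in Python as follows. The function name &lt;code&gt;check_fresh&lt;/code&gt; is hypothetical, and the 3600-second limit is an illustrative value rather than a Mimir default.&lt;/p&gt;

```python
import time
from typing import Optional


def check_fresh(updated_at: int, max_stale_seconds: float,
                now: Optional[float] = None) -> float:
    """Return the bucket index age in seconds, or raise if it is too old."""
    current = time.time() if now is None else now
    age = current - updated_at
    if age > max_stale_seconds:
        # Failing the query avoids returning partial results that are
        # based on an outdated view of the long-term storage.
        raise RuntimeError(f"bucket index is too old: {age:.0f}s")
    return age


# Illustrative values: an index updated 600 seconds before "now".
age = check_fresh(updated_at=1000, max_stale_seconds=3600, now=1600)
print(age)
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; explicitly keeps the check deterministic in tests. When it is omitted, the sketch falls back to the current system time.&lt;/p&gt;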
&lt;h2 id=&#34;how-its-used-by-the-store-gateway&#34;&gt;How it&amp;rsquo;s used by the store-gateway&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;../components/store-gateway/&#34;&gt;store-gateway&lt;/a&gt;, at startup and periodically, fetches the bucket index for each tenant that belongs to its shard, and uses it as the source of truth for the blocks and deletion marks in the storage. This removes the need to periodically scan the bucket to discover blocks belonging to its shard.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr /&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;Ingesters regularly add new blocks to the bucket as they offload data to long-term storage,
and compactors subsequently compact these blocks and mark the original blocks for deletion.
Actual deletion happens after the delay configured via &lt;code&gt;-compactor.deletion-delay&lt;/code&gt;.
An attempt to fetch a deleted block causes the query to fail.
Therefore, in this context, an &lt;em&gt;almost up-to-date&lt;/em&gt; view is a view that’s outdated by less than the value of &lt;code&gt;-compactor.deletion-delay&lt;/code&gt;.&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
]]></content><description>&lt;h1 id="grafana-mimir-bucket-index">Grafana Mimir bucket index&lt;/h1>
&lt;p>The bucket index is a per-tenant file that contains the list of blocks and block deletion marks in the storage. The bucket index is stored in the backend object storage, is periodically updated by the compactor, and used by queriers, store-gateways, and rulers (in &lt;a href="../components/ruler/#internal">internal&lt;/a> operational mode) to discover blocks in the storage.&lt;/p></description></item><item><title>Grafana Mimir hash rings</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/hash-ring/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/hash-ring/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-hash-rings&#34;&gt;Grafana Mimir hash rings&lt;/h1&gt;
&lt;p&gt;Hash rings are a distributed &lt;a href=&#34;https://en.wikipedia.org/wiki/Consistent_hashing&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;consistent hashing scheme&lt;/a&gt; that Grafana Mimir uses for sharding, replication, and service discovery.&lt;/p&gt;
&lt;p&gt;The following Mimir features are built on top of hash rings:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Service discovery: Instances can discover each other by looking up which peers are registered in the ring.&lt;/li&gt;
&lt;li&gt;Health check: Instances periodically send a heartbeat to the ring to signal that they are healthy. An instance is considered unhealthy if it misses heartbeats for a configured period.&lt;/li&gt;
&lt;li&gt;Zone-aware replication: Optionally replicate data across failure domains for high availability. For more information, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/configure/configure-zone-aware-replication/&#34;&gt;Configure zone-aware replication&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Shuffle sharding: Optionally limit the blast radius of failures in a multi-tenant cluster by isolating tenants. For more information, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/configure/configure-shuffle-sharding/&#34;&gt;Configure shuffle sharding&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;how-the-hash-ring-is-used-for-sharding&#34;&gt;How the hash ring is used for sharding&lt;/h2&gt;
&lt;p&gt;The primary use of hash rings in Mimir is to consistently shard data, such as time series, and workloads, such as compaction jobs, without a central coordinator or single point of failure.&lt;/p&gt;
&lt;p&gt;Each of the following Mimir components joins its own dedicated hash ring for sharding:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/ingester/&#34;&gt;Ingesters&lt;/a&gt;: Shard and replicate series.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/compactor/&#34;&gt;Compactors&lt;/a&gt;: Shard compaction jobs.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/store-gateway/&#34;&gt;Store-gateways&lt;/a&gt;: Shard blocks to query from long-term storage.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/ruler/&#34;&gt;(Optional) Rulers&lt;/a&gt;: Shard rule groups to evaluate.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/alertmanager/&#34;&gt;(Optional) Alertmanagers&lt;/a&gt;: Shard tenants.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A hash ring is a data structure that represents the data space as 32-bit unsigned integers.
Each instance of a Mimir component owns a set of token ranges that define which portion of the data space it is responsible for.&lt;/p&gt;
&lt;p&gt;The data or workload to be sharded is hashed using a function that returns a 32-bit unsigned integer, called a token.
The instance that owns that token handles the data.&lt;/p&gt;
&lt;p&gt;When an instance starts, it generates a fixed number of tokens and registers them in the ring.
A token is owned by the instance that registered the smallest token greater than the token being looked up. The lookup wraps around to zero after &lt;code&gt;(2^32)-1&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Hash rings provide consistent hashing.
When an instance joins or leaves the ring, only a small, bounded portion of data moves.
On average, only &lt;code&gt;n/m&lt;/code&gt; tokens move, where &lt;code&gt;n&lt;/code&gt; is the total number of tokens (32-bit unsigned integer) and &lt;code&gt;m&lt;/code&gt; is the number of instances that are registered in the ring.&lt;/p&gt;
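&lt;p&gt;The lookup rule described above can be sketched as follows. This is a minimal illustration, not the Mimir implementation: the &lt;code&gt;HashRing&lt;/code&gt; class, its method names, and the instance names are all assumptions made for the example.&lt;/p&gt;

```python
import bisect

TOKEN_SPACE = 2**32  # tokens are 32-bit unsigned integers


class HashRing:
    """Toy ring (hypothetical): maps each registered token to its owning instance."""

    def __init__(self):
        self._tokens = []   # sorted list of registered tokens
        self._owners = {}   # token -> instance name

    def register(self, instance, tokens):
        for token in tokens:
            self._owners[token] = instance
            bisect.insort(self._tokens, token)

    def lookup(self, token):
        # Find the smallest registered token strictly greater than the
        # lookup token, wrapping around to the first token at the end.
        index = bisect.bisect_right(self._tokens, token % TOKEN_SPACE)
        if index == len(self._tokens):
            index = 0
        return self._owners[self._tokens[index]]


ring = HashRing()
ring.register("instance-a", [1_000, 3_000_000_000])
ring.register("instance-b", [2_000_000_000, 4_000_000_000])

print(ring.lookup(500))            # instance-a
print(ring.lookup(1_000))          # instance-b (owner token must be strictly greater)
print(ring.lookup(4_100_000_000))  # instance-a (wraps around)
```

&lt;p&gt;Keeping the tokens sorted makes each lookup a binary search, which is one reasonable way to implement the rule under these assumptions.&lt;/p&gt;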
&lt;h2 id=&#34;how-series-sharding-works&#34;&gt;How series sharding works&lt;/h2&gt;
&lt;p&gt;The most important hash ring in Grafana Mimir is the one used to shard series.
The implementation details depend on the configured architecture.&lt;/p&gt;
&lt;h3 id=&#34;series-sharding-in-ingest-storage-architecture&#34;&gt;Series sharding in ingest storage architecture&lt;/h3&gt;


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;This guidance applies to ingest storage architecture. For more information about the supported architectures in Grafana Mimir, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/get-started/about-grafana-mimir-architecture/&#34;&gt;Grafana Mimir architecture&lt;/a&gt;.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;p&gt;In ingest storage architecture, 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/distributor/&#34;&gt;distributors&lt;/a&gt; shard incoming series across Kafka partitions.
Each series is assigned to a single Kafka partition.
Replication is handled by Kafka.&lt;/p&gt;
&lt;p&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/ingester/&#34;&gt;Ingesters&lt;/a&gt; own Kafka partitions, consuming the series written to the partitions they own and making those series available for querying.
Each ingester owns one partition, but multiple ingesters can own the same partition for high availability.&lt;/p&gt;
&lt;p&gt;Series sharding in ingest storage architecture relies on two hash rings that work together:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Partitions ring&lt;/li&gt;
&lt;li&gt;Ingesters ring&lt;/li&gt;
&lt;/ul&gt;
&lt;h4 id=&#34;write-path&#34;&gt;Write path&lt;/h4&gt;
&lt;p&gt;The partitions ring is the source of truth for the Kafka partitions that Grafana Mimir currently uses.
Each partition owns a range of tokens used to shard series among partitions and includes the unique identifiers of the ingesters that own that partition.&lt;/p&gt;
&lt;p&gt;When a distributor receives a write request containing series data:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It hashes each series using the &lt;code&gt;fnv32a&lt;/code&gt; hashing function.&lt;/li&gt;
&lt;li&gt;It looks up the resulting token in the partitions ring to determine the Kafka partition for that series.&lt;/li&gt;
&lt;li&gt;It writes the series to the matching Kafka partition.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;A write request is considered successful when all series in the request are successfully committed to Kafka.&lt;/p&gt;
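&lt;p&gt;The write-path steps above can be sketched as follows. The &lt;code&gt;fnv32a&lt;/code&gt; function is the standard 32-bit FNV-1a hash. Everything else is an assumption for illustration: the &lt;code&gt;series_key&lt;/code&gt; serialization is hypothetical (the exact bytes that Mimir hashes are not described here), and the partition tokens are made up.&lt;/p&gt;

```python
import bisect

FNV32_OFFSET_BASIS = 0x811C9DC5
FNV32_PRIME = 0x01000193


def fnv32a(data: bytes) -> int:
    """32-bit FNV-1a: XOR each byte into the hash, then multiply by the prime."""
    value = FNV32_OFFSET_BASIS
    for byte in data:
        value ^= byte
        value = (value * FNV32_PRIME) % 2**32
    return value


def series_key(labels: dict) -> bytes:
    # Hypothetical serialization: sorted name=value pairs joined by commas.
    return ",".join(f"{k}={v}" for k, v in sorted(labels.items())).encode()


def partition_for(token, ring_tokens, owners):
    # Smallest registered token greater than the lookup token, with wrap-around.
    index = bisect.bisect_right(ring_tokens, token)
    return owners[ring_tokens[index % len(ring_tokens)]]


# Made-up partitions ring: four partitions, one token each.
ring_tokens = [2**30, 2**31, 3 * 2**30, 2**32 - 1]
owners = {2**30: 0, 2**31: 1, 3 * 2**30: 2, 2**32 - 1: 3}

labels = {"__name__": "cpu_seconds_total", "instance": "1.1.1.1"}
token = fnv32a(series_key(labels))
print(partition_for(token, ring_tokens, owners))
```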
&lt;h4 id=&#34;read-path&#34;&gt;Read path&lt;/h4&gt;
&lt;p&gt;The ingesters ring is the source of truth for all ingesters currently running in the Grafana Mimir cluster and is used for service discovery.
Each ingester registers itself in the ring and periodically updates its heartbeat.&lt;/p&gt;
&lt;p&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/querier/&#34;&gt;Queriers&lt;/a&gt; watch the ingesters ring to identify healthy ingesters and their IP addresses. When a querier receives a query:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;It looks up the partitions ring to find which partitions contain the relevant data.&lt;/li&gt;
&lt;li&gt;It looks up the ingesters ring to find which ingesters own those partitions.&lt;/li&gt;
&lt;li&gt;It fetches the matching series by contacting the ingesters that own the partitions.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;In ingest storage architecture, consistency is guaranteed with a quorum of 1.
Each partition needs to be queried only once.
If multiple ingesters own the same partition, the querier fetches data from only one of the healthy ingesters for that partition.&lt;/p&gt;
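&lt;p&gt;As a rough sketch of the read path, the following function picks one healthy owner for each partition. The function name and data shapes are hypothetical, and choosing the first healthy candidate is an arbitrary assumption. The text above only states that the querier fetches from one of the healthy ingesters.&lt;/p&gt;

```python
def select_ingesters(partitions, partition_owners, healthy):
    """Pick exactly one healthy owner per partition (quorum of 1)."""
    selected = {}
    for partition in partitions:
        candidates = [i for i in partition_owners[partition] if i in healthy]
        if not candidates:
            raise RuntimeError(f"no healthy ingester owns partition {partition}")
        selected[partition] = candidates[0]
    return selected


# Made-up ring contents: two ingesters own each partition.
partition_owners = {
    0: ["ingester-zone-a-0", "ingester-zone-b-0"],
    1: ["ingester-zone-a-1", "ingester-zone-b-1"],
}
healthy = {"ingester-zone-b-0", "ingester-zone-a-1", "ingester-zone-b-1"}

print(select_ingesters([0, 1], partition_owners, healthy))
```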
&lt;h4 id=&#34;partitions-ring-lifecycle&#34;&gt;Partitions ring lifecycle&lt;/h4&gt;
&lt;p&gt;A partition in the ring can be in one of the following states:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Pending&lt;/code&gt;: No writes or reads are allowed.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Active&lt;/code&gt;: The partition is in read-write mode.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Inactive&lt;/code&gt;: The partition is in read-only mode.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Partitions are not live components and cannot register themselves in the ring.
Their lifecycle is managed by ingesters.
Each ingester manages the lifecycle of the partition it owns.&lt;/p&gt;
&lt;p&gt;When ingesters are scaled out, new partitions are added to the ring.
When ingesters are scaled in, their partitions are removed from the ring through a &lt;a href=&#34;#partition-decommissioning-and-downscaling&#34;&gt;decommissioning procedure&lt;/a&gt;.&lt;/p&gt;
&lt;h5 id=&#34;partition-creation-and-activation&#34;&gt;Partition creation and activation&lt;/h5&gt;
&lt;p&gt;When an ingester starts up, it checks whether the partition it owns already exists in the ring.
If the partition does not exist, the ingester creates it in the &lt;code&gt;Pending&lt;/code&gt; state and adds itself as the partition owner.&lt;/p&gt;
&lt;p&gt;This is the initial state for a new partition, allowing time for additional ingesters to join as owners and for ring changes to propagate across instances.
While a partition is in the &lt;code&gt;Pending&lt;/code&gt; state, distributors cannot write to it, and queriers cannot read from it.&lt;/p&gt;
&lt;p&gt;After the partition has at least one owner and remains in &lt;code&gt;Pending&lt;/code&gt; for longer than a configured grace period, the ingester transitions it to the &lt;code&gt;Active&lt;/code&gt; state.
When a partition is &lt;code&gt;Active&lt;/code&gt;, distributors can write to it, and queriers must read from it.
This is the normal operational state of a partition.&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;partitions-lifecycle-ingester-start.png&#34;
  alt=&#34;Partitions lifecycle - How it works when an ingester starts&#34;/&gt;&lt;/p&gt;
&lt;h5 id=&#34;partition-decommissioning-and-downscaling&#34;&gt;Partition decommissioning and downscaling&lt;/h5&gt;
&lt;p&gt;Grafana&amp;rsquo;s &lt;a href=&#34;https://github.com/grafana/rollout-operator&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Kubernetes Rollout Operator&lt;/a&gt; manages partition and ingester downscaling.&lt;/p&gt;
&lt;p&gt;When an ingester is marked for termination due to a downscaling event, the rollout operator invokes the &amp;ldquo;prepare delayed downscale endpoint&amp;rdquo; API exposed by the ingester.
This API switches the partition from &lt;code&gt;Active&lt;/code&gt; to &lt;code&gt;Inactive&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When a partition is &lt;code&gt;Inactive&lt;/code&gt;, distributors can no longer write to it, but queriers must still read from it.
The partition remains in this state until it is safe to stop querying the ingester, specifically, when the data has become available for querying from long-term object storage.&lt;/p&gt;
&lt;p&gt;Once the grace period passes, the rollout operator invokes a second API exposed by the ingester, the &amp;ldquo;prepare shutdown endpoint&amp;rdquo;.
This API removes the ingester as a partition owner from the ring.
If the partition has no remaining owners, it is then removed from the ring entirely.&lt;/p&gt;
&lt;p&gt;Finally, the rollout operator terminates the ingester pod, completing the safe downscaling procedure.&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;partitions-lifecycle-ingester-stop.png&#34;
  alt=&#34;Partitions lifecycle - How it works when an ingester stops&#34;/&gt;&lt;/p&gt;
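&lt;p&gt;The partition lifecycle described in this section can be modeled as a small state machine. The following sketch is illustrative only. The function names mirror the steps in the text but are hypothetical, the ring is a plain dictionary, and timestamps are plain numbers.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PENDING = "Pending"
    ACTIVE = "Active"
    INACTIVE = "Inactive"


@dataclass
class Partition:
    state: State = State.PENDING
    state_since: float = 0.0
    owners: set = field(default_factory=set)

    def writable(self):
        # Distributors write only to Active partitions.
        return self.state is State.ACTIVE

    def readable(self):
        # Queriers read from Active and Inactive partitions.
        return self.state in (State.ACTIVE, State.INACTIVE)


def on_ingester_start(ring, partition_id, ingester, now):
    # Create the partition as Pending if it is missing, then add the owner.
    partition = ring.setdefault(partition_id, Partition(state_since=now))
    partition.owners.add(ingester)


def maybe_activate(partition, now, grace_period):
    waited = now - partition.state_since
    if partition.state is State.PENDING and partition.owners and waited > grace_period:
        partition.state, partition.state_since = State.ACTIVE, now


def prepare_delayed_downscale(partition, now):
    if partition.state is State.ACTIVE:
        partition.state, partition.state_since = State.INACTIVE, now


def prepare_shutdown(ring, partition_id, ingester):
    # Remove the owner, and drop the partition when no owners remain.
    partition = ring[partition_id]
    partition.owners.discard(ingester)
    if not partition.owners:
        del ring[partition_id]


ring = {}
on_ingester_start(ring, 0, "ingester-0", now=0.0)
maybe_activate(ring[0], now=5.0, grace_period=10.0)   # still Pending
maybe_activate(ring[0], now=11.0, grace_period=10.0)  # becomes Active
prepare_delayed_downscale(ring[0], now=20.0)          # becomes Inactive
```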
&lt;h3 id=&#34;series-sharding-in-classic-architecture&#34;&gt;Series sharding in classic architecture&lt;/h3&gt;


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;This guidance applies to classic architecture. For more information about the supported architectures in Grafana Mimir, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/get-started/about-grafana-mimir-architecture/&#34;&gt;Grafana Mimir architecture&lt;/a&gt;.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;p&gt;In classic architecture, 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/distributor/&#34;&gt;distributors&lt;/a&gt; shard and replicate the incoming series among 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/ingester/&#34;&gt;ingesters&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Each ingester joins the ingesters hash ring and owns a subset of token ranges.
When a distributor receives a write request containing series data, it hashes each series using the &lt;code&gt;fnv32a&lt;/code&gt; hashing function.
It then looks up the resulting token in the ingesters hash ring to find the authoritative owner and replicates the series to the next &lt;code&gt;RF - 1&lt;/code&gt; ingesters in the ring (where &lt;code&gt;RF&lt;/code&gt; is the replication factor, &lt;code&gt;3&lt;/code&gt; by default).&lt;/p&gt;
&lt;p&gt;The distributor then writes the series to the &lt;code&gt;RF&lt;/code&gt; ingesters that own it.
A write request is considered successful when each series is written to a quorum of ingesters.
With a replication factor of 3, a quorum is reached when at least 2 ingesters successfully receive each series.&lt;/p&gt;
&lt;p&gt;To illustrate, consider four ingesters and a token space from &lt;code&gt;0&lt;/code&gt; to &lt;code&gt;9&lt;/code&gt;:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Ingester #1 is registered in the ring with the token &lt;code&gt;2&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Ingester #2 is registered in the ring with the token &lt;code&gt;4&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Ingester #3 is registered in the ring with the token &lt;code&gt;6&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Ingester #4 is registered in the ring with the token &lt;code&gt;9&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;A distributor receives an incoming sample for the series &lt;code&gt;{__name__=&amp;quot;cpu_seconds_total&amp;quot;,instance=&amp;quot;1.1.1.1&amp;quot;}&lt;/code&gt;.
It hashes the series’ labels, and the result of the hashing function is the token &lt;code&gt;3&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;To find which ingester owns token &lt;code&gt;3&lt;/code&gt;, the distributor looks up the token &lt;code&gt;3&lt;/code&gt; in the ingesters ring and finds the ingester that is registered with the smallest token larger than &lt;code&gt;3&lt;/code&gt;.
Ingester #2, which is registered with token &lt;code&gt;4&lt;/code&gt;, is the authoritative owner of the series &lt;code&gt;{__name__=&amp;quot;cpu_seconds_total&amp;quot;,instance=&amp;quot;1.1.1.1&amp;quot;}&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;classic-hash-ring-without-replication.png&#34;
  alt=&#34;Hash ring without replication&#34;/&gt;&lt;/p&gt;
&lt;p&gt;By default, Grafana Mimir replicates each series to three ingesters.
After finding the authoritative owner of the series, the distributor continues to walk the ring clockwise to find the remaining two instances where the series should be replicated.
In the example that follows, the series is replicated to &lt;code&gt;Ingester #3&lt;/code&gt; and &lt;code&gt;Ingester #4&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;classic-hash-ring-with-replication.png&#34;
  alt=&#34;Hash ring with replication&#34;/&gt;&lt;/p&gt;
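&lt;p&gt;The worked example above, with tokens &lt;code&gt;2&lt;/code&gt;, &lt;code&gt;4&lt;/code&gt;, &lt;code&gt;6&lt;/code&gt;, and &lt;code&gt;9&lt;/code&gt;, can be reproduced with a short sketch. The function names are hypothetical. Two details are assumptions rather than statements from this page: the quorum is computed as &lt;code&gt;RF // 2 + 1&lt;/code&gt;, which matches the stated quorum of 2 for a replication factor of 3, and an ingester that already appears in the replica set is skipped.&lt;/p&gt;

```python
import bisect


def replicas_for(token, ring, replication_factor=3):
    """Return the authoritative owner followed by the next RF - 1 ingesters."""
    tokens = sorted(ring)
    start = bisect.bisect_right(tokens, token)  # smallest token larger than lookup
    replicas = []
    for offset in range(len(tokens)):
        ingester = ring[tokens[(start + offset) % len(tokens)]]  # walk clockwise
        if ingester not in replicas:
            replicas.append(ingester)
        if len(replicas) == replication_factor:
            break
    return replicas


def quorum(replication_factor):
    # Assumed majority quorum.
    return replication_factor // 2 + 1


ring = {2: "ingester-1", 4: "ingester-2", 6: "ingester-3", 9: "ingester-4"}

print(replicas_for(3, ring))  # ['ingester-2', 'ingester-3', 'ingester-4']
print(quorum(3))              # 2
```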
&lt;h2 id=&#34;how-the-hash-ring-is-used-for-service-discovery&#34;&gt;How the hash ring is used for service discovery&lt;/h2&gt;
&lt;p&gt;Grafana Mimir also uses the ring for built-in service discovery.
Since instances register themselves in their ring and periodically send heartbeats, it&amp;rsquo;s convenient to use the hash ring for internal service discovery as well.&lt;/p&gt;
&lt;p&gt;When the hash ring is used exclusively for service discovery, rather than sharding, instances don&amp;rsquo;t register tokens in the ring.
Instead, they only register their presence and periodically update a heartbeat timestamp.
When other instances need to find the healthy instances of a given component, they look up the ring to find the instances that have successfully updated the heartbeat timestamp in the ring.&lt;/p&gt;
&lt;p&gt;The Grafana Mimir components using the ring for service discovery or coordination are:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/distributor/&#34;&gt;Distributors&lt;/a&gt;: Enforce global rate limits as local limits by dividing the global limit by the number of healthy distributor instances. For more information, refer to 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/distributor/#rate-limiting&#34;&gt;Rate limiting&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/query-scheduler/&#34;&gt;Query-schedulers&lt;/a&gt;: Allow query-frontends and queriers to discover available schedulers.&lt;/li&gt;
&lt;li&gt;
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/overrides-exporter/&#34;&gt;(Optional) Overrides-exporters&lt;/a&gt;: Self-elect a leader among replicas to export high-cardinality metrics. No strict leader election is required.&lt;/li&gt;
&lt;/ul&gt;
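&lt;p&gt;As an illustration of heartbeat-based discovery and of the distributor rate-limit calculation, consider the following sketch. The helper names, the timestamps, and the &lt;code&gt;heartbeat_timeout&lt;/code&gt; parameter are assumptions made for the example and are not Mimir configuration options.&lt;/p&gt;

```python
def healthy_instances(heartbeats, now, heartbeat_timeout):
    """Return the instances whose last heartbeat is recent enough."""
    return sorted(
        name for name, last_seen in heartbeats.items()
        if now - last_seen <= heartbeat_timeout
    )


def local_rate_limit(global_limit, healthy_count):
    # Each distributor enforces an equal share of the global limit.
    if healthy_count == 0:
        raise ValueError("no healthy distributors in the ring")
    return global_limit / healthy_count


# Made-up heartbeat timestamps, in seconds.
heartbeats = {"distributor-1": 100.0, "distributor-2": 95.0, "distributor-3": 10.0}
healthy = healthy_instances(heartbeats, now=105.0, heartbeat_timeout=60.0)

print(healthy)                                  # ['distributor-1', 'distributor-2']
print(local_rate_limit(10_000, len(healthy)))   # 5000.0
```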
&lt;h2 id=&#34;share-a-hash-ring-between-grafana-mimir-instances&#34;&gt;Share a hash ring between Grafana Mimir instances&lt;/h2&gt;
&lt;p&gt;Hash ring data structures need to be shared between Grafana Mimir instances.
To propagate changes to a given hash ring, Grafana Mimir uses a key-value store.
You can configure the key-value store independently for the hash rings of different components.&lt;/p&gt;
&lt;p&gt;For more information, refer to &lt;a href=&#34;../key-value-store/&#34;&gt;Grafana Mimir key-value store&lt;/a&gt;.&lt;/p&gt;
]]></content><description>&lt;h1 id="grafana-mimir-hash-rings">Grafana Mimir hash rings&lt;/h1>
&lt;p>Hash rings are a distributed &lt;a href="https://en.wikipedia.org/wiki/Consistent_hashing" target="_blank" rel="noopener noreferrer">consistent hashing scheme&lt;/a> that Grafana Mimir uses for sharding, replication, and service discovery.&lt;/p>
&lt;p>The following Mimir features are built on top of hash rings:&lt;/p></description></item><item><title>Grafana Mimir key-value store</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/key-value-store/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/key-value-store/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-key-value-store&#34;&gt;Grafana Mimir key-value store&lt;/h1&gt;
&lt;p&gt;A key-value (KV) store is a database that stores data indexed by key.
Grafana Mimir requires a key-value store for the following features:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;../hash-ring/&#34;&gt;Hash ring&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;../../../configure/configure-high-availability-deduplication/&#34;&gt;(Optional) Distributor high-availability tracker&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;supported-key-value-store-backends&#34;&gt;Supported key-value store backends&lt;/h2&gt;
&lt;p&gt;Grafana Mimir supports the following key-value (KV) store backends:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Gossip-based &lt;a href=&#34;https://github.com/hashicorp/memberlist&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;memberlist&lt;/a&gt; protocol (default)&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://www.consul.io&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Consul&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://etcd.io&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Etcd&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;gossip-based-memberlist-protocol-default&#34;&gt;Gossip-based memberlist protocol (default)&lt;/h3&gt;
&lt;p&gt;By default, Grafana Mimir instances use a gossip-based protocol to join a memberlist cluster.
The data is shared between the instances using peer-to-peer communication and no external dependency is required.&lt;/p&gt;
&lt;p&gt;We recommend that you use memberlist to run Grafana Mimir.&lt;/p&gt;
&lt;p&gt;To configure memberlist, refer to &lt;a href=&#34;../../../configure/configure-hash-rings/&#34;&gt;configuring hash rings&lt;/a&gt;.&lt;/p&gt;


&lt;div class=&#34;admonition admonition-note&#34;&gt;&lt;blockquote&gt;&lt;p class=&#34;title text-uppercase&#34;&gt;Note&lt;/p&gt;&lt;p&gt;The Gossip-based memberlist protocol isn&amp;rsquo;t supported for the &lt;a href=&#34;../../../configure/configure-high-availability-deduplication/&#34;&gt;optional distributor high-availability tracker&lt;/a&gt;.&lt;/p&gt;&lt;/blockquote&gt;&lt;/div&gt;

&lt;h3 id=&#34;consul&#34;&gt;Consul&lt;/h3&gt;
&lt;p&gt;Grafana Mimir supports &lt;a href=&#34;https://www.consul.io&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Consul&lt;/a&gt; as a backend KV store.
If you want to use Consul, you must install it. The Grafana Mimir installation does not include Consul.&lt;/p&gt;
&lt;p&gt;To configure Consul, refer to &lt;a href=&#34;../../../configure/configure-hash-rings/&#34;&gt;configuring hash rings&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;etcd&#34;&gt;Etcd&lt;/h3&gt;
&lt;p&gt;Grafana Mimir supports &lt;a href=&#34;https://etcd.io&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;etcd&lt;/a&gt; as a backend KV store.
If you want to use etcd, you must install it. The Grafana Mimir installation does not include etcd.&lt;/p&gt;
&lt;p&gt;To configure etcd, refer to &lt;a href=&#34;../../../configure/configure-hash-rings/&#34;&gt;configuring hash rings&lt;/a&gt;.&lt;/p&gt;
]]></content><description>&lt;h1 id="grafana-mimir-key-value-store">Grafana Mimir key-value store&lt;/h1>
&lt;p>A key-value (KV) store is a database that stores data indexed by key.
Grafana Mimir requires a key-value store for the following features:&lt;/p></description></item><item><title>Grafana Mimir memberlist and gossip protocol</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/memberlist-and-the-gossip-protocol/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/memberlist-and-the-gossip-protocol/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-memberlist-and-gossip-protocol&#34;&gt;Grafana Mimir memberlist and gossip protocol&lt;/h1&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/hashicorp/memberlist&#34; target=&#34;_blank&#34; rel=&#34;noopener noreferrer&#34;&gt;Memberlist&lt;/a&gt; is a Go library that manages cluster membership, node failure detection, and message passing using a gossip-based protocol.
Memberlist is eventually consistent, and it partially tolerates network partitions by attempting to communicate with potentially dead nodes through multiple routes.&lt;/p&gt;
&lt;p&gt;By default, Grafana Mimir uses memberlist to implement a &lt;a href=&#34;../key-value-store/&#34;&gt;key-value (KV) store&lt;/a&gt; to share the &lt;a href=&#34;../hash-ring/&#34;&gt;hash ring&lt;/a&gt; data structures between instances.&lt;/p&gt;
&lt;p&gt;When using a memberlist-based KV store, each instance maintains a copy of the hash rings.
Each Mimir instance updates a hash ring locally and uses memberlist to propagate the changes to other instances.
Updates generated locally and updates received from other instances are merged together to form the current state of the ring on the instance.&lt;/p&gt;
&lt;p&gt;To configure memberlist, refer to &lt;a href=&#34;../../../configure/configure-hash-rings/&#34;&gt;configuring hash rings&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;how-memberlist-propagates-hash-ring-changes&#34;&gt;How memberlist propagates hash ring changes&lt;/h2&gt;
&lt;p&gt;When using a memberlist-based KV store, every Grafana Mimir instance propagates the hash ring data structures to other instances using the following techniques:&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Propagating only the differences introduced in recent changes.&lt;/li&gt;
&lt;li&gt;Propagating the full hash ring data structure.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;Every &lt;code&gt;-memberlist.gossip-interval&lt;/code&gt;, an instance randomly selects a subset of the Grafana Mimir cluster instances, with the subset size configured by &lt;code&gt;-memberlist.gossip-nodes&lt;/code&gt;, and sends the latest changes to the selected instances.
This operation is performed frequently and it&amp;rsquo;s the primary technique used to propagate changes.&lt;/p&gt;
&lt;p&gt;In addition, every &lt;code&gt;-memberlist.pullpush-interval&lt;/code&gt; an instance randomly selects another instance in the Grafana Mimir cluster and transfers the full content of the KV store, including all hash rings (unless &lt;code&gt;-memberlist.pullpush-interval&lt;/code&gt; is zero, which disables this behavior).
After this operation is complete, the two instances have the same KV store content.
This operation is computationally more expensive, and as a result, it&amp;rsquo;s performed less frequently. The operation ensures that the hash rings periodically reconcile to a common state.&lt;/p&gt;
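&lt;p&gt;The two propagation techniques can be illustrated with a heavily simplified model in which each instance stores one heartbeat timestamp per ring member and the newest timestamp wins. This merge rule and the function names are assumptions for the example. The real ring state and merge logic are more involved.&lt;/p&gt;

```python
def merge(local, incoming):
    """Merge incoming entries into local state and return what changed."""
    changes = {}
    for instance, heartbeat in incoming.items():
        if heartbeat > local.get(instance, float("-inf")):
            local[instance] = heartbeat
            changes[instance] = heartbeat
    return changes


def push_pull(a, b):
    # Full state exchange: afterwards both instances hold the same content.
    merge(a, b)
    merge(b, a)


node_a = {"ingester-1": 10, "ingester-2": 7}
node_b = {"ingester-2": 9, "ingester-3": 4}

# Gossip: node_a sends only a recent change to node_b.
delta = merge(node_b, {"ingester-1": 10})

# Push/pull: the two nodes reconcile their full state.
push_pull(node_a, node_b)
print(node_a == node_b)
```

&lt;p&gt;In this model, the value returned by &lt;code&gt;merge&lt;/code&gt; is the set of differences that an instance could forward in its next gossip round.&lt;/p&gt;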
]]></content><description>&lt;h1 id="grafana-mimir-memberlist-and-gossip-protocol">Grafana Mimir memberlist and gossip protocol&lt;/h1>
&lt;p>&lt;a href="https://github.com/hashicorp/memberlist" target="_blank" rel="noopener noreferrer">Memberlist&lt;/a> is a Go library that manages cluster membership, node failure detection, and message passing using a gossip-based protocol.
Memberlist is eventually consistent and network partitions are partially tolerated by attempting to communicate to potentially dead nodes through multiple routes.&lt;/p></description></item><item><title>Grafana Mimir query sharding</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/query-sharding/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/query-sharding/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-query-sharding&#34;&gt;Grafana Mimir query sharding&lt;/h1&gt;
&lt;p&gt;Mimir includes the ability to run a single query across multiple machines. This is
achieved by breaking the dataset into smaller pieces. These smaller pieces are
called shards. Each shard then gets queried in a partial query, and those
partial queries are distributed by the query-frontend to run on different
queriers in parallel. The results of those partial queries are aggregated by the
query-frontend to return the full query result.&lt;/p&gt;
&lt;p&gt;Query sharding is applied on the &lt;a href=&#34;../../http-api/#instant-query&#34;&gt;&lt;code&gt;query&lt;/code&gt;&lt;/a&gt;
and &lt;a href=&#34;../../http-api/#range-query&#34;&gt;&lt;code&gt;query_range&lt;/code&gt;&lt;/a&gt; APIs only.&lt;/p&gt;
&lt;h2 id=&#34;query-sharding-at-glance&#34;&gt;Query sharding at a glance&lt;/h2&gt;
&lt;p&gt;Not all queries are shardable. Even when the full query is not shardable, the inner
parts of the query might still be shardable.&lt;/p&gt;
&lt;p&gt;In particular, associative aggregations (like &lt;code&gt;sum&lt;/code&gt;, &lt;code&gt;min&lt;/code&gt;, &lt;code&gt;max&lt;/code&gt;, &lt;code&gt;count&lt;/code&gt;,
&lt;code&gt;avg&lt;/code&gt;) are shardable, while some query functions (like &lt;code&gt;absent&lt;/code&gt;, &lt;code&gt;absent_over_time&lt;/code&gt;,
&lt;code&gt;histogram_quantile&lt;/code&gt;, &lt;code&gt;sort_desc&lt;/code&gt;, &lt;code&gt;sort&lt;/code&gt;) are not.&lt;/p&gt;
&lt;p&gt;The following examples use a shard count of
&lt;code&gt;3&lt;/code&gt;. All the partial queries that include a label selector &lt;code&gt;__query_shard__&lt;/code&gt;
are executed in parallel. The &lt;code&gt;concat()&lt;/code&gt; annotation is used to show when partial
query results are concatenated/merged by the query-frontend.&lt;/p&gt;
&lt;h3 id=&#34;example-1-full-query-is-shardable&#34;&gt;Example 1: Full query is shardable&lt;/h3&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(rate(metric[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Is executed as (assuming a shard count of 3):&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(
  concat(
    sum(rate(metric{__query_shard__=&amp;#34;1_of_3&amp;#34;}[1m]))
    sum(rate(metric{__query_shard__=&amp;#34;2_of_3&amp;#34;}[1m]))
    sum(rate(metric{__query_shard__=&amp;#34;3_of_3&amp;#34;}[1m]))
  )
)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
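&lt;p&gt;To make the shape of the rewritten query concrete, the following sketch builds the sharded form of a &lt;code&gt;sum(rate(...))&lt;/code&gt; query with string templates. It only mirrors the notation used in these examples and makes no claim about how the query-frontend performs the rewrite internally. The function names are hypothetical.&lt;/p&gt;

```python
def shard_selectors(metric, total_shards):
    """Build one selector per shard, numbered from 1 to total_shards."""
    return [
        f'{metric}{{__query_shard__="{i}_of_{total_shards}"}}'
        for i in range(1, total_shards + 1)
    ]


def sharded_sum_rate(metric, window, total_shards):
    # One partial query per shard, merged by concat() and summed again.
    partials = [
        f"sum(rate({selector}[{window}]))"
        for selector in shard_selectors(metric, total_shards)
    ]
    body = "\n".join(f"    {partial}" for partial in partials)
    return f"sum(\n  concat(\n{body}\n  )\n)"


print(sharded_sum_rate("metric", "1m", 3))
```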
&lt;h3 id=&#34;example-2-inner-part-is-shardable&#34;&gt;Example 2: Inner part is shardable&lt;/h3&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;histogram_quantile(0.99, sum by(le) (rate(metric[1m])))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Is executed as (assuming a shard count of 3):&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;histogram_quantile(0.99, sum by(le) (
  concat(
    sum by(le) (rate(metric{__query_shard__=&amp;#34;1_of_3&amp;#34;}[1m]))
    sum by(le) (rate(metric{__query_shard__=&amp;#34;2_of_3&amp;#34;}[1m]))
    sum by(le) (rate(metric{__query_shard__=&amp;#34;3_of_3&amp;#34;}[1m]))
  )
))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;h3 id=&#34;example-3-query-with-two-shardable-portions&#34;&gt;Example 3: Query with two shardable portions&lt;/h3&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(rate(failed[1m])) / sum(rate(total[1m]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;Is executed as (assuming a shard count of 3):&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(
  concat(
    sum (rate(failed{__query_shard__=&amp;#34;1_of_3&amp;#34;}[1m]))
    sum (rate(failed{__query_shard__=&amp;#34;2_of_3&amp;#34;}[1m]))
    sum (rate(failed{__query_shard__=&amp;#34;3_of_3&amp;#34;}[1m]))
  )
)
/
sum(
  concat(
    sum (rate(total{__query_shard__=&amp;#34;1_of_3&amp;#34;}[1m]))
    sum (rate(total{__query_shard__=&amp;#34;2_of_3&amp;#34;}[1m]))
    sum (rate(total{__query_shard__=&amp;#34;3_of_3&amp;#34;}[1m]))
  )
)&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;&lt;img
  class=&#34;lazyload d-inline-block&#34;
  data-src=&#34;query-sharding.png&#34;
  alt=&#34;Flow of a query with two shardable portions&#34;/&gt;&lt;/p&gt;
&lt;h2 id=&#34;how-to-enable-query-sharding&#34;&gt;How to enable query sharding&lt;/h2&gt;
&lt;p&gt;To enable query sharding, opt in by setting
&lt;code&gt;-query-frontend.parallelize-shardable-queries&lt;/code&gt; to &lt;code&gt;true&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;Each shardable portion of a query is split into
&lt;code&gt;-query-frontend.query-sharding-total-shards&lt;/code&gt; partial queries. If a query has multiple
inner portions that can be sharded, each portion is sharded
&lt;code&gt;-query-frontend.query-sharding-total-shards&lt;/code&gt; times. In some cases, this could lead to
an explosion of queries. For this reason, a hard limit caps the total number of
partial queries that a single input query can generate. The limit defaults to 128,
and you can change it by setting
&lt;code&gt;-query-frontend.query-sharding-max-sharded-queries&lt;/code&gt;.&lt;/p&gt;
&lt;p&gt;When you run a query over a large time range and
&lt;code&gt;-query-frontend.split-queries-by-interval&lt;/code&gt; is enabled, the
&lt;code&gt;-query-frontend.query-sharding-max-sharded-queries&lt;/code&gt; limit applies to the total
number of queries after they have been split first by time and then by shard.&lt;/p&gt;
&lt;p&gt;As an example, if &lt;code&gt;-query-frontend.query-sharding-max-sharded-queries=128&lt;/code&gt; and
&lt;code&gt;-query-frontend.split-queries-by-interval=24h&lt;/code&gt;, and you run a query over 8 days, each
daily query is limited to 128 / 8 = 16 partial queries.&lt;/p&gt;
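&lt;p&gt;The arithmetic in this example can be sketched as follows. This is a minimal illustration, not the query-frontend implementation: the function name and its parameters are hypothetical, and it assumes the limit is shared evenly between time splits and that the query range divides cleanly into split intervals.&lt;/p&gt;

```python
import math


def per_split_shard_budget(max_sharded_queries, query_range_hours, split_interval_hours):
    """Return how many partial queries each time split may use (illustrative only)."""
    # Time splitting happens first, so count the splits the range is cut into.
    splits = max(1, math.ceil(query_range_hours / split_interval_hours))
    # The overall limit is shared by all splits; each one gets an equal slice.
    return max_sharded_queries // splits


# 8 days split by 24h with a limit of 128 partial queries.
print(per_split_shard_budget(128, 8 * 24, 24))
```

&lt;p&gt;A real query whose start and end are not aligned to the split interval can produce one more split than this sketch counts, which lowers the budget of each split further.&lt;/p&gt;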
&lt;p&gt;After you enable query sharding in a microservices deployment, the
query-frontends start aggregating the results of the partial queries. It is
therefore important to configure the following PromQL engine parameters on the
query-frontend as well:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;-querier.max-concurrent&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-querier.timeout&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-querier.max-samples&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-querier.default-evaluation-interval&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;-querier.lookback-delta&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;operational-considerations&#34;&gt;Operational considerations&lt;/h2&gt;
&lt;p&gt;Splitting a single query into sharded queries increases the quantity of queries
that must be processed. Parallelization decreases the query processing time,
but increases the load on querier components and their underlying data stores
(ingesters for recent data and store-gateway for historic data). The
caching layer for chunks and indexes will also experience an increased load.&lt;/p&gt;
&lt;p&gt;We also recommend increasing the maximum number of queries scheduled in
parallel by the query-frontend: multiply the previously set value of
&lt;code&gt;-querier.max-query-parallelism&lt;/code&gt; by
&lt;code&gt;-query-frontend.query-sharding-total-shards&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;cardinality-estimation-for-query-sharding-experimental&#34;&gt;Cardinality estimation for query sharding (experimental)&lt;/h2&gt;
&lt;p&gt;When the number of parallel sharded queries increases, so does the load on the queriers and their dependencies. Therefore, to balance the tradeoff, shard queries only as much as necessary.
Queries that return more series, such as those that are of high cardinality, need to fetch more data and should therefore be split into a larger number of shards.
Queries that return few or no series should be executed with fewer or no shards at all.
When determining the number of shards to use for a given query, the sharding logic can optionally take into account the cardinality (number of series) observed during previous executions of the same query for similar time ranges.&lt;/p&gt;
&lt;p&gt;To enable this feature, set &lt;code&gt;-query-frontend.query-sharding-target-series-per-shard&lt;/code&gt; to a value representing roughly how many series each shard should fetch, and configure the results cache via the &lt;code&gt;-query-frontend.results-cache.*&lt;/code&gt; flags.
This is necessary even when results caching is disabled, as the estimates are stored in the same cache that&amp;rsquo;s used for query result caching.
The value that you set for this flag is one of several parameters that the sharding logic uses to determine the appropriate number of shards for a query.
Therefore, the value is not strictly enforced in all cases, and the actual number of series fetched per shard might exceed it.
This is likely to happen in cases where the cardinality of a query changes rapidly within a short period of time.&lt;/p&gt;
&lt;p&gt;Estimates for query cardinality are only ever used to reduce the number of shards compared to the case when cardinality estimation is disabled.
Other parameters that limit the total number of shards, such as &lt;code&gt;-query-frontend.query-sharding-total-shards&lt;/code&gt;, still provide an upper bound for the number of shards, even when cardinality estimation is enabled and the estimate suggests a higher number of shards.&lt;/p&gt;
&lt;p&gt;The histogram metric &lt;code&gt;cortex_query_frontend_cardinality_estimation_difference&lt;/code&gt; tracks the difference between the estimated and actual number of series fetched.&lt;/p&gt;
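&lt;p&gt;The relationship between the estimate, the target, and the upper bound can be sketched as follows. This is a simplified model under stated assumptions, not the actual sharding logic, which takes further parameters into account: the function name is hypothetical, and the sketch assumes that a missing estimate or a target of &lt;code&gt;0&lt;/code&gt; leaves the configured shard count unchanged.&lt;/p&gt;

```python
import math


def shards_for_query(estimated_series, target_series_per_shard, total_shards):
    """Pick a shard count from a cardinality estimate (illustrative only)."""
    # Assumption: with no estimate or no target, use the configured shard count.
    if estimated_series is None or target_series_per_shard == 0:
        return total_shards
    wanted = math.ceil(estimated_series / target_series_per_shard)
    # The estimate can only reduce the shard count; total_shards stays the upper bound.
    return min(total_shards, wanted)


print(shards_for_query(10_000, 2_500, 16))
print(shards_for_query(1_000_000, 2_500, 16))
```

&lt;p&gt;In this model, a query estimated at 10,000 series with a target of 2,500 series per shard uses fewer shards than the configured 16, while a query estimated at 1,000,000 series is still capped at 16.&lt;/p&gt;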
&lt;h2 id=&#34;verification&#34;&gt;Verification&lt;/h2&gt;
&lt;h3 id=&#34;query-statistics&#34;&gt;Query statistics&lt;/h3&gt;
&lt;p&gt;The query statistics logged by the query-frontend let you check whether query sharding was
used for an individual query. The field &lt;code&gt;sharded_queries&lt;/code&gt; contains the number
of partial queries executed in parallel.&lt;/p&gt;
&lt;p&gt;When &lt;code&gt;sharded_queries&lt;/code&gt; is &lt;code&gt;0&lt;/code&gt;, either the query is not shardable or query
sharding is disabled for the cluster or tenant. This is a log line of an
unshardable query:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sharded_queries=0  param_query=&amp;#34;absent(up{job=\&amp;#34;my-service\&amp;#34;})&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;When &lt;code&gt;sharded_queries&lt;/code&gt; matches the configured shard count, query sharding is
operational and the query has only a single leg (assuming time splitting is
disabled or the query doesn&amp;rsquo;t span across multiple days). The following log
line represents that case with a shard count of &lt;code&gt;16&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sharded_queries=16 query=&amp;#34;sum(rate(prometheus_engine_queries[5m]))&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;When &lt;code&gt;sharded_queries&lt;/code&gt; is a multiple of the configured shard count, query
sharding is operational and the query has multiple legs (assuming time
splitting is disabled or the query doesn&amp;rsquo;t span across multiple days). The
following log line shows a query with two legs and with a configured shard
count of &lt;code&gt;16&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&#34;code-snippet code-snippet__mini&#34;&gt;&lt;div class=&#34;lang-toolbar__mini&#34;&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet code-snippet__border&#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-none&#34;&gt;sharded_queries=32 query=&amp;#34;sum(rate(prometheus_engine_queries{engine=\&amp;#34;ruler\&amp;#34;}[5m]))/sum(rate(prometheus_engine_queries[5m]))&amp;#34;&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
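&lt;p&gt;The three cases above can be told apart with a small script. This is a minimal sketch that assumes the log line contains a &lt;code&gt;sharded_queries=N&lt;/code&gt; field as in the examples and that time splitting does not affect the count. The &lt;code&gt;classify&lt;/code&gt; function and its return strings are hypothetical.&lt;/p&gt;

```python
import re


def classify(log_line, shard_count):
    """Describe how a query was sharded, based on its sharded_queries field."""
    match = re.search(r"sharded_queries=(\d+)", log_line)
    if match is None:
        return "no sharded_queries field"
    sharded = int(match.group(1))
    if sharded == 0:
        # Either the query is not shardable or sharding is disabled.
        return "not sharded"
    legs, remainder = divmod(sharded, shard_count)
    if remainder == 0:
        # One leg per shardable portion of the query.
        return f"sharded, {legs} leg(s)"
    return "sharded, not a multiple of the shard count"


print(classify('sharded_queries=32 query="sum(rate(a[5m]))/sum(rate(b[5m]))"', 16))
```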
&lt;p&gt;The query-frontend also exposes metrics, which can be useful to understand the
query workload&amp;rsquo;s parallelism as a whole.&lt;/p&gt;
&lt;p&gt;You can run the following query to get the ratio of queries which have been successfully sharded:&lt;/p&gt;

&lt;div class=&#34;code-snippet &#34;&gt;&lt;div class=&#34;lang-toolbar&#34;&gt;
    &lt;span class=&#34;lang-toolbar__item lang-toolbar__item-active&#34;&gt;promql&lt;/span&gt;
    &lt;span class=&#34;code-clipboard&#34;&gt;
      &lt;button x-data=&#34;app_code_snippet()&#34; x-init=&#34;init()&#34; @click=&#34;copy()&#34;&gt;
        &lt;img class=&#34;code-clipboard__icon&#34; src=&#34;/media/images/icons/icon-copy-small-2.svg&#34; alt=&#34;Copy code to clipboard&#34; width=&#34;14&#34; height=&#34;13&#34;&gt;
        &lt;span&gt;Copy&lt;/span&gt;
      &lt;/button&gt;
    &lt;/span&gt;
    &lt;div class=&#34;lang-toolbar__border&#34;&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;div class=&#34;code-snippet &#34;&gt;
    &lt;pre data-expanded=&#34;false&#34;&gt;&lt;code class=&#34;language-promql&#34;&gt;sum(rate(cortex_frontend_query_sharding_rewrites_succeeded_total[$__rate_interval])) /
sum(rate(cortex_frontend_query_sharding_rewrites_attempted_total[$__rate_interval]))&lt;/code&gt;&lt;/pre&gt;
  &lt;/div&gt;
&lt;/div&gt;
&lt;p&gt;The histogram &lt;code&gt;cortex_frontend_sharded_queries_per_query&lt;/code&gt; shows
how many sharded subqueries are generated per query.&lt;/p&gt;
]]></content><description>&lt;h1 id="grafana-mimir-query-sharding">Grafana Mimir query sharding&lt;/h1>
&lt;p>Mimir includes the ability to run a single query across multiple machines. This is
achieved by breaking the dataset into smaller pieces. These smaller pieces are
called shards. Each shard then gets queried in a partial query, and those
partial queries are distributed by the query-frontend to run on different
queriers in parallel. The results of those partial queries are aggregated by the
query-frontend to return the full query result.&lt;/p></description></item><item><title>Grafana Mimir query engine</title><link>https://grafana.com/docs/mimir/v3.1.x/references/architecture/mimir-query-engine/</link><pubDate>Wed, 03 Jun 2026 09:01:40 +0200</pubDate><guid>https://grafana.com/docs/mimir/v3.1.x/references/architecture/mimir-query-engine/</guid><content><![CDATA[&lt;h1 id=&#34;grafana-mimir-query-engine&#34;&gt;Grafana Mimir query engine&lt;/h1&gt;
&lt;p&gt;The Mimir Query Engine (MQE) is an alternative to Prometheus&amp;rsquo; query engine.
You can use it in 
    &lt;a href=&#34;/docs/mimir/v3.1.x/references/architecture/components/querier/&#34;&gt;queriers&lt;/a&gt;
to evaluate PromQL queries.&lt;/p&gt;
&lt;p&gt;MQE produces equivalent results to Prometheus&amp;rsquo; engine, generally uses less memory and CPU
than Prometheus&amp;rsquo; engine, and evaluates queries at least as fast, if not faster.
It supports all stable PromQL features and transparently falls back to Prometheus&#39;
engine for queries that use unsupported features.&lt;/p&gt;
&lt;h2 id=&#34;how-to-enable-mqe&#34;&gt;How to enable MQE&lt;/h2&gt;
&lt;p&gt;MQE is enabled by default. To disable it, either set the
&lt;code&gt;-querier.query-engine=prometheus&lt;/code&gt; CLI flag on queriers or set the equivalent YAML
configuration file option.&lt;/p&gt;
&lt;h2 id=&#34;fallback-to-prometheus-engine&#34;&gt;Fallback to Prometheus&amp;rsquo; engine&lt;/h2&gt;
&lt;p&gt;By default, MQE falls back to Prometheus&amp;rsquo; engine for any queries that use unsupported
features.&lt;/p&gt;
&lt;p&gt;To disable this behavior, either set the &lt;code&gt;-querier.enable-query-engine-fallback=false&lt;/code&gt;
CLI flag on queriers, or set the equivalent YAML configuration file option. If fallback
is disabled and MQE receives a query it does not support, then the query fails.&lt;/p&gt;
&lt;p&gt;To force a query supported by MQE to use Prometheus&amp;rsquo; engine, add the
&lt;code&gt;X-Mimir-Force-Prometheus-Engine: true&lt;/code&gt; HTTP header to the query request. This header only
has an effect if fallback is enabled.&lt;/p&gt;
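&lt;p&gt;As an illustration, the following sketch builds an instant query request that carries this header, using only the Python standard library. The host name is a placeholder, and the &lt;code&gt;/prometheus/api/v1/query&lt;/code&gt; path assumes the default Prometheus HTTP prefix. The helper function is hypothetical. The sketch only constructs the request and does not send it.&lt;/p&gt;

```python
import urllib.parse
import urllib.request


def build_forced_prometheus_query(base_url, promql):
    """Build a query request that asks for evaluation by the Prometheus engine."""
    params = urllib.parse.urlencode({"query": promql})
    # Assumption: the query API is served under the default /prometheus prefix.
    url = f"{base_url}/prometheus/api/v1/query?{params}"
    return urllib.request.Request(
        url, headers={"X-Mimir-Force-Prometheus-Engine": "true"}
    )


# Placeholder host; replace it with the address of your own deployment.
request = build_forced_prometheus_query("http://mimir.example.internal:8080", "sum(up)")
print(request.full_url)
print(request.header_items())
```

&lt;p&gt;To send the request, pass it to &lt;code&gt;urllib.request.urlopen&lt;/code&gt;. A multi-tenant deployment typically also expects a tenant header on each request, which this sketch leaves out.&lt;/p&gt;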
&lt;h2 id=&#34;query-memory-consumption-limit&#34;&gt;Query memory consumption limit&lt;/h2&gt;
&lt;p&gt;MQE supports enforcing a per-query memory consumption limit. This allows you to ensure that
a single memory-hungry query cannot monopolize a large proportion of available memory in a
querier, or cause it to exhaust all available memory and crash.&lt;/p&gt;
&lt;p&gt;While evaluating a query, MQE estimates the memory consumed by the query, such as memory used
for the final result and any intermediate calculations, and stops the query with an

    &lt;a href=&#34;/docs/mimir/v3.1.x/manage/mimir-runbooks/#err-mimir-max-estimated-memory-consumption-per-query&#34;&gt;&lt;code&gt;err-mimir-max-estimated-memory-consumption-per-query&lt;/code&gt;&lt;/a&gt;
error if the estimate exceeds the configured limit.&lt;/p&gt;
&lt;p&gt;The estimate is based on the memory consumed by samples currently held in memory for query
evaluation. This includes both raw samples decoded from chunks, and samples held in memory as
intermediate results of calculations or as the final result. It also includes some other large
sources of memory consumption for intermediate results.&lt;/p&gt;
&lt;p&gt;This estimate has the following limitations:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;It doesn&amp;rsquo;t consider the memory consumed by series labels.&lt;/li&gt;
&lt;li&gt;It doesn&amp;rsquo;t consider the memory consumed by chunks that are currently in memory.
However, the maximum chunks and maximum chunks bytes limits continue to be enforced.&lt;/li&gt;
&lt;li&gt;It makes an assumption about the memory consumed by each native histogram, rather than
accurately calculating the memory consumed by each histogram.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;By default, no limit is enforced. To configure the default limit for all tenants, set either
the &lt;code&gt;-querier.max-estimated-memory-consumption-per-query&lt;/code&gt; CLI flag, or set the
equivalent YAML configuration file option. You can override this default limit on a per-tenant
basis by setting &lt;code&gt;max_estimated_memory_consumption_per_query&lt;/code&gt; for that tenant. Setting the
limit to 0 disables it.&lt;/p&gt;
&lt;p&gt;The limit is not enforced for queries that run through Prometheus&amp;rsquo; engine, and setting the limit
has no impact if MQE is disabled or if the query falls back to Prometheus&amp;rsquo; engine.&lt;/p&gt;
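&lt;p&gt;The enforcement described in this section can be modeled roughly as a running counter that is checked against the limit. This is a simplified sketch, not the MQE implementation: the class and exception names are hypothetical, and a real engine tracks memory at a much finer granularity.&lt;/p&gt;

```python
class EstimatedMemoryLimitExceeded(Exception):
    """Raised when the estimated memory use of a query goes over the limit."""


class QueryMemoryEstimate:
    def __init__(self, limit_bytes):
        # A limit of 0 disables enforcement, as described above.
        self.limit_bytes = limit_bytes
        self.current_bytes = 0

    def increase(self, num_bytes):
        """Account for newly held samples and stop the query if over the limit."""
        self.current_bytes += num_bytes
        if self.limit_bytes != 0 and self.current_bytes > self.limit_bytes:
            raise EstimatedMemoryLimitExceeded(
                f"estimate {self.current_bytes} B exceeds limit {self.limit_bytes} B"
            )

    def decrease(self, num_bytes):
        """Release memory once intermediate results are no longer held."""
        self.current_bytes -= num_bytes


estimate = QueryMemoryEstimate(limit_bytes=100)
estimate.increase(60)   # raw samples decoded from chunks
estimate.decrease(20)   # an intermediate result is dropped
estimate.increase(60)   # reaches exactly 100 bytes, which is still allowed
```

&lt;p&gt;In this sketch, one more call to &lt;code&gt;increase&lt;/code&gt; with any positive value raises the exception, which corresponds to the query being stopped with an error.&lt;/p&gt;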
&lt;h2 id=&#34;known-differences-compared-to-prometheus-engine&#34;&gt;Known differences compared to Prometheus&amp;rsquo; engine&lt;/h2&gt;
&lt;p&gt;The following are known differences between MQE and Prometheus&amp;rsquo; engine:&lt;/p&gt;
&lt;h3 id=&#34;binary-operations-that-produce-no-series&#34;&gt;Binary operations that produce no series&lt;/h3&gt;
&lt;p&gt;When MQE evaluates a binary operation (such as &lt;code&gt;&#43;&lt;/code&gt;, &lt;code&gt;-&lt;/code&gt;, &lt;code&gt;/&lt;/code&gt;, &lt;code&gt;and&lt;/code&gt;, &lt;code&gt;or&lt;/code&gt;, etc.), it checks if the binary operation will produce no series, or if some series from one side of the operation can be skipped, based on the series labels on both sides.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;If MQE can determine that a binary operation produces no series based on the series labels on both sides, it skips evaluating both sides.
For example, if the query is &lt;code&gt;foo / bar&lt;/code&gt;, and &lt;code&gt;foo&lt;/code&gt; selects a single series &lt;code&gt;foo{env=&amp;quot;test&amp;quot;}&lt;/code&gt;, and
&lt;code&gt;bar&lt;/code&gt; selects a single series &lt;code&gt;bar{env=&amp;quot;prod&amp;quot;}&lt;/code&gt;, then the query cannot produce any series and so the
data for each side is not evaluated.&lt;/li&gt;
&lt;li&gt;If MQE can determine that some series on one side will not match anything on the other side, it will skip evaluating the series that do not match the other side.
For example, if the query is &lt;code&gt;foo / on (env) bar&lt;/code&gt;, and &lt;code&gt;foo&lt;/code&gt; has series &lt;code&gt;foo{env=&amp;quot;1&amp;quot;, region=&amp;quot;a&amp;quot;}&lt;/code&gt; and &lt;code&gt;foo{env=&amp;quot;2&amp;quot;, region=&amp;quot;a&amp;quot;}&lt;/code&gt; and &lt;code&gt;bar&lt;/code&gt; has &lt;code&gt;bar{env=&amp;quot;1&amp;quot;, cluster=&amp;quot;x&amp;quot;}&lt;/code&gt;, &lt;code&gt;bar{env=&amp;quot;3&amp;quot;, cluster=&amp;quot;x&amp;quot;}&lt;/code&gt; and &lt;code&gt;bar{env=&amp;quot;3&amp;quot;, cluster=&amp;quot;y&amp;quot;}&lt;/code&gt;,
MQE will ignore the &lt;code&gt;env=&amp;quot;3&amp;quot;&lt;/code&gt; series from &lt;code&gt;bar&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
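&lt;p&gt;The label-based check in the two cases above can be sketched as follows. This is a simplified model for arithmetic operators with one-to-one matching, not the MQE implementation: the function names are hypothetical, series are represented as plain dictionaries of labels, and the sketch filters both sides even though the example above only mentions skipping series from &lt;code&gt;bar&lt;/code&gt;.&lt;/p&gt;

```python
def match_key(labels, on=None):
    """Return the labels a series is matched on, as a hashable key."""
    if on is not None:
        # With on(...), only the listed labels take part in matching.
        return tuple((name, labels.get(name, "")) for name in sorted(on))
    # Otherwise, every label except the metric name takes part.
    return tuple(sorted((k, v) for k, v in labels.items() if k != "__name__"))


def series_worth_evaluating(left, right, on=None):
    """Keep only the series whose match key also appears on the other side."""
    left_keys = {match_key(s, on) for s in left}
    right_keys = {match_key(s, on) for s in right}
    shared = left_keys.intersection(right_keys)
    keep_left = [s for s in left if match_key(s, on) in shared]
    keep_right = [s for s in right if match_key(s, on) in shared]
    return keep_left, keep_right


foo = [
    {"__name__": "foo", "env": "1", "region": "a"},
    {"__name__": "foo", "env": "2", "region": "a"},
]
bar = [
    {"__name__": "bar", "env": "1", "cluster": "x"},
    {"__name__": "bar", "env": "3", "cluster": "x"},
    {"__name__": "bar", "env": "3", "cluster": "y"},
]
keep_foo, keep_bar = series_worth_evaluating(foo, bar, on=["env"])
print(keep_bar)
```

&lt;p&gt;Both &lt;code&gt;env=&amp;quot;3&amp;quot;&lt;/code&gt; series are dropped from &lt;code&gt;bar&lt;/code&gt; because no series in &lt;code&gt;foo&lt;/code&gt; shares that label value. When both returned lists are empty, as in the &lt;code&gt;foo / bar&lt;/code&gt; example, no sample data needs to be read at all.&lt;/p&gt;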
&lt;p&gt;This has some noticeable side effects, including:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;aborted stream because query was cancelled: context canceled: query execution finished&lt;/code&gt; might be
logged by the querier, as streaming data from ingesters and store-gateways is aborted without reading
all the data, as it isn&amp;rsquo;t needed.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Some annotations aren&amp;rsquo;t emitted. For example, if the query above was
&lt;code&gt;rate(foo[1m]) / sum(rate(bar[1m]))&lt;/code&gt;, Prometheus&amp;rsquo; engine emits annotations such as
&lt;code&gt;metric might not be a counter, name does not end in _total/_sum/_count/_bucket: &amp;quot;foo&amp;quot;&lt;/code&gt; and
&lt;code&gt;metric might not be a counter, name does not end in _total/_sum/_count/_bucket: &amp;quot;bar&amp;quot;&lt;/code&gt;. In contrast,
MQE doesn&amp;rsquo;t emit these annotations, as they are only emitted during the evaluation of the series data,
and not during the evaluation of the series labels.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;code&gt;found duplicate series for the match group&lt;/code&gt; errors aren&amp;rsquo;t returned by MQE if a match group has no
series on one side but multiple series on the other side and those series have samples that conflict
with each other.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h3 id=&#34;topk-and-bottomk&#34;&gt;&lt;code&gt;topk&lt;/code&gt; and &lt;code&gt;bottomk&lt;/code&gt;&lt;/h3&gt;
&lt;p&gt;MQE and Prometheus&amp;rsquo; engine produce different results for queries that use &lt;code&gt;topk&lt;/code&gt;
and &lt;code&gt;bottomk&lt;/code&gt; if different series have samples with the same values. Prometheus&amp;rsquo; engine does not
have deterministic behavior in this case and selects different series on each evaluation of the
query. MQE&amp;rsquo;s implementation differs from Prometheus&amp;rsquo; engine, which can also lead to different results.&lt;/p&gt;
]]></content><description>&lt;h1 id="grafana-mimir-query-engine">Grafana Mimir query engine&lt;/h1>
&lt;p>The Mimir Query Engine (MQE) is an alternative to Prometheus&amp;rsquo; query engine.
You can use it in
&lt;a href="/docs/mimir/v3.1.x/references/architecture/components/querier/">queriers&lt;/a>
to evaluate PromQL queries.&lt;/p></description></item></channel></rss>