---
title: "Grafana Mimir for Long-Term Prometheus Storage"
description: "Reference for Grafana Mimir architecture, deployment modes, tenant isolation, remote_write configuration, retention policies, and performance tuning. Covers distributors, ingesters, store-gateway, compactor, and practical setup examples for production long-term metrics storage."
url: https://agent-zone.ai/knowledge/observability/mimir-deep-dive/
section: knowledge
date: 2026-02-22
categories: ["observability"]
tags: ["mimir","prometheus","long-term-storage","grafana","metrics","remote-write","tsdb","object-storage","multi-tenancy"]
skills: ["mimir-deployment","prometheus-remote-write","metrics-storage-architecture","tenant-isolation-configuration","performance-tuning"]
tools: ["mimir","prometheus","grafana","helm","kubectl","s3","gcs","minio"]
levels: ["intermediate","advanced"]
word_count: 1852
formats:
  json: https://agent-zone.ai/knowledge/observability/mimir-deep-dive/index.json
  html: https://agent-zone.ai/knowledge/observability/mimir-deep-dive/?format=html
  api: https://api.agent-zone.ai/api/v1/knowledge/search?q=Grafana+Mimir+for+Long-Term+Prometheus+Storage
---


# Grafana Mimir for Long-Term Prometheus Storage

Prometheus stores metrics on local disk with a practical retention limit of weeks to a few months. Beyond that, you need a long-term storage solution. Grafana Mimir is a horizontally scalable, multi-tenant time series database designed for exactly this purpose. It is API-compatible with Prometheus -- Grafana queries Mimir using the same PromQL, and Prometheus pushes data to Mimir via remote_write.

Mimir is the successor to Cortex. Grafana Labs forked Cortex, rewrote significant portions for performance, and released Mimir under the AGPLv3 license. If you see references to Cortex architecture, the concepts map directly to Mimir with improvements.

## Architecture Overview

Mimir splits the metrics pipeline into specialized components. Each component can be scaled independently based on the bottleneck.

```
Write:  Prometheus --> [Distributor] --> [Ingester] --> Object Storage (S3/GCS/MinIO)

Read:   Grafana <-- [Query Frontend] <-- [Querier] <-- [Ingester]       (recent data)
                                                   <-- [Store-Gateway] <-- Object Storage (historical blocks)

Maint:  [Compactor] <--> Object Storage (merge blocks, enforce retention)
```

### Distributor

The distributor is the entry point for all incoming metrics. It receives samples via the Prometheus remote_write API, validates them (label names, sample timestamps, series limits), and distributes them to ingesters using consistent hashing.

Key responsibilities:
- **Validation**: Rejects samples with invalid label names, too many labels, labels exceeding length limits, or timestamps too far in the past or future.
- **HA deduplication**: When running multiple Prometheus replicas for high availability, the distributor deduplicates samples from replicas so only one copy is stored.
- **Sharding**: Uses a hash ring (stored in Consul, etcd, or memberlist) to determine which ingesters should receive each series.
- **Replication**: Writes each series to `replication_factor` ingesters (default 3) for durability.

```yaml
# Distributor configuration
distributor:
  ring:
    kvstore:
      store: memberlist   # No external dependency for ring
  ha_tracker:
    enable_ha_tracker: true
    kvstore:
      store: memberlist
    # Prometheus instances with the same cluster/replica labels
    # are treated as HA pairs
```
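
For HA deduplication to work, each Prometheus replica must carry a shared cluster label and a unique replica label. By default the tracker looks at the `cluster` and `__replica__` external labels (configurable via `ha_cluster_label` and `ha_replica_label`); a sketch of the Prometheus side:

```yaml
# prometheus.yaml on replica 0 -- the cluster label is shared by both
# replicas, while __replica__ is unique per instance. Mimir elects one
# replica and silently drops samples from the other.
global:
  external_labels:
    cluster: prod-us-east       # same on every replica in the pair
    __replica__: prometheus-0   # prometheus-1 on the second replica
```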

The distributor is stateless and can be scaled horizontally behind a load balancer. It is CPU-bound during high-ingestion periods due to validation and hashing.

### Ingester

Ingesters hold recent data in memory and periodically flush it to long-term object storage as TSDB blocks. Each ingester owns a portion of the hash ring and is responsible for the series assigned to it.

Key responsibilities:
- **In-memory storage**: Recent samples (last ~2 hours) are stored in an in-memory TSDB head block with a write-ahead log (WAL) for crash recovery.
- **Block flushing**: Every 2 hours, the head block is compacted into an immutable TSDB block and uploaded to object storage.
- **Replication**: Each series is replicated across `replication_factor` ingesters. During reads, the querier deduplicates data from replicas.

```yaml
# Ingester configuration
ingester:
  ring:
    replication_factor: 3
    kvstore:
      store: memberlist

# TSDB settings live under the top-level blocks_storage section,
# not under ingester
blocks_storage:
  tsdb:
    dir: /data/ingester/tsdb   # The WAL is kept under this directory
    block_ranges_period: [2h]
    retention_period: 13h      # Keep recent blocks locally for fast queries
```

Ingesters are the most resource-intensive component. They are memory-bound (holding active series in RAM) and require persistent storage for the WAL. Plan for roughly 0.5-1GB of memory per 100,000 active series (5-10KB per series) as a starting estimate.

**Scaling ingesters:** Adding or removing ingesters triggers ring rebalancing. Mimir handles this gracefully -- the leaving ingester flushes its data, and the joining ingester starts receiving new writes. However, rebalancing creates temporary increased load. Scale ingesters during low-traffic periods when possible.
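
A hypothetical scale-up, assuming the resource and service names created by the `mimir-distributed` Helm chart in the `mimir` namespace -- adjust for your deployment:

```bash
# Add two ingesters; the new pods join the ring and start taking writes
kubectl -n mimir scale statefulset mimir-ingester --replicas=8

# Watch the ring stabilize -- Mimir components serve ring status pages;
# the distributor exposes the ingester ring at /ingester/ring
kubectl -n mimir port-forward svc/mimir-distributor 8080:8080 &
curl -s http://localhost:8080/ingester/ring
```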

### Store-Gateway

The store-gateway provides efficient access to historical blocks in object storage. It downloads block index files and caches them locally, enabling fast series lookups without scanning entire blocks.

```yaml
# Store-gateway configuration
store_gateway:
  sharding_ring:
    replication_factor: 3
    kvstore:
      store: memberlist

# Bucket store settings (sync dir, index and chunk caches) live under
# the top-level blocks_storage section
blocks_storage:
  bucket_store:
    sync_dir: /data/store-gateway/sync
    index_cache:
      backend: memcached
      memcached:
        addresses: memcached.mimir.svc:11211
    chunks_cache:
      backend: memcached
      memcached:
        addresses: memcached.mimir.svc:11211
```

The store-gateway shards blocks across instances using a hash ring, so each instance only downloads and caches a subset of blocks. Caching (index headers in memory, full index and chunks in Memcached) is critical for query performance.

### Querier

The querier executes PromQL queries by combining data from two sources: ingesters (for recent, not-yet-flushed data) and store-gateways (for historical blocks in object storage).

```yaml
# Querier configuration
querier:
  # Only contact ingesters for data within this window
  query_ingesters_within: 13h

# max_query_lookback is a per-tenant limit, not a querier setting
limits:
  # How far back in time queries are allowed to look
  max_query_lookback: 30d
```

The `query_ingesters_within` setting is important. It tells the querier to only contact ingesters for data within the last 13 hours (matching the ingester retention). Data older than this is served entirely from store-gateways, reducing ingester query load.

### Query Frontend

The query frontend sits in front of queriers and optimizes query execution. It is optional but strongly recommended for production.

Key optimizations:
- **Splitting**: Splits long-range queries into smaller time ranges that can be executed in parallel.
- **Caching**: Caches query results so repeated dashboard loads do not re-execute queries.
- **Queuing**: Queues requests and distributes them fairly across queriers, preventing a single heavy query from consuming all querier resources.

```yaml
# Query-frontend configuration (the config section is named `frontend`)
frontend:
  # Cache query results
  cache_results: true
  results_cache:
    backend: memcached
    memcached:
      addresses: memcached.mimir.svc:11211
      max_item_size: 10485760  # 10MB

# Query splitting is a per-tenant limit
limits:
  # Split long-range queries by day for parallel execution
  split_queries_by_interval: 24h
```

### Compactor

The compactor runs periodically to merge small TSDB blocks in object storage into larger, more efficient blocks. It also handles retention enforcement by deleting blocks that exceed the configured retention period.

```yaml
# Compactor configuration
compactor:
  data_dir: /data/compactor
  compaction_interval: 1h
  # Retention
  deletion_delay: 12h   # Delay deletion to allow queries to complete
  tenant_cleanup_delay: 6h
```

The compactor shards work across replicas using a hash ring, so a given tenant's compaction runs on one instance at a time. It requires temporary local disk space to download, merge, and re-upload blocks. Plan for local storage equal to 2-3x the size of your largest tenant's uncompacted blocks.
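
One way to watch compactor health is to alert on the time since its last successful run; Mimir exposes this via the `cortex_compactor_last_successful_run_timestamp_seconds` metric (Mimir keeps the `cortex_` prefix for compatibility):

```promql
# Seconds since the compactor last completed successfully.
# Alert if this exceeds a few compaction intervals.
time() - cortex_compactor_last_successful_run_timestamp_seconds
```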

## Deployment Modes

### Monolithic

All components run in a single binary/process. Suitable for development, testing, and small production deployments handling up to a few million active series.

```bash
# Run monolithic Mimir
mimir -target=all \
  -blocks-storage.backend=s3 \
  -blocks-storage.s3.endpoint=s3.amazonaws.com \
  -blocks-storage.s3.bucket-name=mimir-blocks \
  -blocks-storage.s3.region=us-east-1
```

Helm deployment in monolithic mode:

```yaml
# mimir-values.yaml (monolithic)
mimir:
  structuredConfig:
    common:
      storage:
        backend: s3
        s3:
          endpoint: s3.amazonaws.com
          bucket_name: mimir-blocks
          region: us-east-1
    blocks_storage:
      storage_prefix: blocks
    limits:
      max_global_series_per_user: 1500000
deploymentMode: SingleBinary
singleBinary:
  replicas: 1
  resources:
    requests:
      cpu: "2"
      memory: 8Gi
```

### Microservices

Each component runs as a separate Deployment or StatefulSet. Required for large-scale deployments (tens of millions of active series, hundreds of Prometheus instances).

```bash
helm repo add grafana https://grafana.github.io/helm-charts
helm install mimir grafana/mimir-distributed \
  --namespace mimir --create-namespace \
  -f mimir-values.yaml
```

```yaml
# mimir-values.yaml (microservices)
mimir:
  structuredConfig:
    common:
      storage:
        backend: s3
        s3:
          endpoint: s3.amazonaws.com
          bucket_name: mimir-blocks
          region: us-east-1
    limits:
      max_global_series_per_user: 5000000
      ingestion_rate: 200000
      ingestion_burst_size: 400000

distributor:
  replicas: 3
  resources:
    requests:
      cpu: "1"
      memory: 2Gi

ingester:
  replicas: 6
  persistentVolume:
    enabled: true
    size: 50Gi
  resources:
    requests:
      cpu: "2"
      memory: 8Gi

store_gateway:
  replicas: 3
  persistentVolume:
    enabled: true
    size: 20Gi

querier:
  replicas: 4
  resources:
    requests:
      cpu: "1"
      memory: 4Gi

query_frontend:
  replicas: 2

compactor:
  replicas: 1
  persistentVolume:
    enabled: true
    size: 100Gi
```

## Tenant Isolation

Mimir is natively multi-tenant. Each tenant gets isolated storage and configurable limits. Tenants are identified by the `X-Scope-OrgID` header on write and read requests.

### Prometheus Configuration per Tenant

```yaml
# prometheus.yaml for tenant "team-platform"
remote_write:
  - url: http://mimir-distributor.mimir.svc:8080/api/v1/push
    headers:
      X-Scope-OrgID: team-platform
    queue_config:
      max_samples_per_send: 1000
      max_shards: 10
      capacity: 2500
```
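
On the read side, the same header selects the tenant. A hypothetical query through Mimir's Prometheus-compatible API (the `/prometheus` HTTP prefix is Mimir's default; the service name assumes the `mimir-distributed` Helm chart):

```bash
curl -s \
  -H "X-Scope-OrgID: team-platform" \
  "http://mimir-query-frontend.mimir.svc:8080/prometheus/api/v1/query" \
  --data-urlencode "query=up"
```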

### Per-Tenant Limits

Configure limits per tenant to prevent one team from consuming all resources:

```yaml
# Mimir runtime configuration (overrides.yaml)
overrides:
  team-platform:
    max_global_series_per_user: 2000000
    ingestion_rate: 100000
    ingestion_burst_size: 200000
    max_query_lookback: 90d
    max_query_parallelism: 16

  team-frontend:
    max_global_series_per_user: 500000
    ingestion_rate: 25000
    ingestion_burst_size: 50000
    max_query_lookback: 30d
    max_query_parallelism: 8
```

Mimir polls the runtime configuration file for changes (every 10 seconds by default), so updating the ConfigMap is enough; no restart or signal is required. The read-only `/runtime_config` HTTP endpoint shows the configuration each component currently has loaded.
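
To confirm which limits are active, query the endpoint directly (service name assumed from the Helm chart):

```bash
# Dump the runtime configuration the component currently has loaded
curl -s http://mimir-distributor.mimir.svc:8080/runtime_config

# Show only the values that differ from the defaults
curl -s "http://mimir-distributor.mimir.svc:8080/runtime_config?mode=diff"
```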

## Remote Write Configuration

### Prometheus remote_write Tuning

The default Prometheus remote_write configuration is conservative. For production, tune these parameters:

```yaml
remote_write:
  - url: http://mimir-distributor.mimir.svc:8080/api/v1/push
    headers:
      X-Scope-OrgID: my-tenant
    queue_config:
      capacity: 10000            # Buffer size before blocking
      max_shards: 30             # Parallel senders (increase for high-volume)
      min_shards: 1
      max_samples_per_send: 2000 # Batch size per request
      batch_send_deadline: 5s    # Max time to wait for a full batch
      min_backoff: 30ms
      max_backoff: 5s
    metadata_config:
      send: true
      send_interval: 1m
```

**Key tuning knobs:**

- **max_shards**: Increase when remote_write is falling behind (check `prometheus_remote_storage_samples_pending`). Each shard is a parallel sender.
- **capacity**: The per-shard queue size. When the queue is full, reading from the WAL blocks, so increase it (alongside `max_shards`) if remote_write falls behind during bursts. Watch `prometheus_remote_storage_samples_dropped_total` for samples dropped outright.
- **max_samples_per_send**: Larger batches are more efficient but increase latency. 1000-2000 is the sweet spot.

### Monitoring remote_write Health

Essential Prometheus metrics to monitor:

```promql
# Samples pending in the remote write queue (should stay low)
prometheus_remote_storage_samples_pending

# Samples failed to send (should be zero)
rate(prometheus_remote_storage_samples_failed_total[5m])

# Samples successfully sent per second
rate(prometheus_remote_storage_samples_total[5m])

# Highest timestamp successfully sent (for lag calculation)
prometheus_remote_storage_queue_highest_sent_timestamp_seconds
```
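
A derived lag signal is often more actionable than the raw timestamps; one common formulation subtracts the highest acknowledged timestamp from the highest timestamp appended locally:

```promql
# Remote write lag in seconds per remote (should stay near zero);
# sustained growth means remote_write cannot keep up with ingestion
prometheus_remote_storage_highest_timestamp_in_seconds
  - on(job, instance) group_right
prometheus_remote_storage_queue_highest_sent_timestamp_seconds
```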

## Retention Policies

Configure retention at two levels:

### Block-Level Retention (Compactor)

The compactor enforces retention by deleting blocks older than the configured period:

```yaml
limits:
  compactor_blocks_retention_period: 365d  # Keep metrics for 1 year
```

Per-tenant retention:

```yaml
overrides:
  team-platform:
    compactor_blocks_retention_period: 365d
  team-staging:
    compactor_blocks_retention_period: 30d
```

### Object Storage Lifecycle (Belt and Suspenders)

As a safety net, configure object storage lifecycle rules to delete objects older than your longest retention period plus a buffer:

```hcl
# Terraform: S3 lifecycle rule
resource "aws_s3_bucket_lifecycle_configuration" "mimir" {
  bucket = aws_s3_bucket.mimir_blocks.id

  rule {
    id     = "expire-old-blocks"
    status = "Enabled"
    expiration {
      days = 400  # 365 + 35 day buffer
    }
  }
}
```

## Performance Tuning

### Ingester Memory

The primary scaling dimension. Each active series consumes approximately 5-10KB of memory in the ingester.

```
Active series:            2,000,000
Memory per series:        ~8KB (average)
Raw memory (one copy):    ~16GB
Replication factor:       3  -> ~48GB cluster-wide
Ingesters:                6
Memory per ingester:      ~48GB / 6 = ~8GB
```
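
The same arithmetic as a script, handy for what-if sizing (the ~8KB figure is the estimate above, not a measured value):

```shell
# Per-ingester memory estimate from active series, bytes per series,
# replication factor, and ingester count (decimal GB)
series=2000000
bytes_per_series=8192         # ~8KB average, per the estimate above
replication_factor=3
ingesters=6

total_bytes=$(( series * bytes_per_series * replication_factor ))
per_ingester_gb=$(( total_bytes / ingesters / 1000000000 ))
echo "Memory per ingester: ~${per_ingester_gb}GB"
```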

Monitor with:

```promql
# Active series per ingester
cortex_ingester_memory_series

# Memory used by ingesters
process_resident_memory_bytes{container="ingester"}
```

### Store-Gateway Caching

Without caching, every query to historical data requires downloading index files from object storage. This is slow and expensive.

**Index cache**: Stores series index lookups. Size it to hold the index for your most-queried time ranges. Start with 1GB per million series per day of retention.

**Chunks cache**: Stores actual sample data. Size it to hold the chunks for your most common queries. Start with 2GB per million series per frequently-queried day.

Use Memcached for both caches in production:

```yaml
memcached:
  replicas: 3
  resources:
    requests:
      cpu: "500m"
      memory: 4Gi
  maxItemMemory: 3072  # MB, leave room for overhead
```

### Query Performance

Slow queries are usually caused by high cardinality (too many series matching a selector) or long time ranges without splitting.

- Enable query splitting in the query frontend (`split_queries_by_interval: 24h`).
- Set `max_query_parallelism` per tenant to limit resource consumption per query.
- Use `max_fetched_series_per_query` to prevent runaway queries from loading millions of series.

```yaml
limits:
  max_fetched_series_per_query: 100000
  max_fetched_chunks_per_query: 2000000
  max_query_parallelism: 16
  max_query_length: 30d  # Maximum time range per query
```

## Practical Checklist for Agents

When deploying or operating Mimir:

1. **Start monolithic** for clusters with fewer than 1 million active series. Move to microservices when scaling demands it.
2. **Use memberlist** for the hash ring instead of Consul or etcd. It eliminates an external dependency and works well up to dozens of ring members.
3. **Set per-tenant limits** from day one. A single team sending high-cardinality metrics can degrade the entire cluster.
4. **Monitor ingester memory** closely. OOM-killed ingesters lose in-flight data between WAL checkpoints. Set memory requests conservatively with headroom.
5. **Deploy Memcached** for index and chunks caching. Without caching, historical queries are unacceptably slow.
6. **Tune remote_write** on the Prometheus side. The default settings are too conservative for most production workloads. Increase `max_shards` and `capacity` if samples are pending or dropping.
7. **Configure retention** in both Mimir (compactor) and object storage (lifecycle rules) as defense in depth.
8. **Test failover** by killing an ingester and verifying that queries still return correct data (due to replication) and that the replacement ingester catches up via WAL replay.
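
Item 8 can be scripted as a rough drill (names assume the `mimir-distributed` Helm chart; adjust for your deployment):

```bash
# 1. Kill one ingester
kubectl -n mimir delete pod mimir-ingester-0

# 2. While it restarts, queries should still succeed: with
#    replication_factor=3, two replicas remain for every series
curl -s -H "X-Scope-OrgID: team-platform" \
  "http://mimir-query-frontend.mimir.svc:8080/prometheus/api/v1/query" \
  --data-urlencode "query=up"

# 3. Confirm the replacement pod replayed its WAL on startup
#    (rough filter; exact log lines vary by version)
kubectl -n mimir logs mimir-ingester-0 | grep -i wal
```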

