Prometheus Architecture Deep Dive

Pull-Based Scraping Model#

Prometheus pulls metrics from targets rather than having targets push metrics to it. At each scrape interval (commonly 15s, set via scrape_interval in the global config or per scrape job; the built-in default is 1m), Prometheus sends an HTTP GET to each target's metrics endpoint. The target responds with all of its current metric values in the Prometheus exposition format.
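A minimal scrape configuration, with a placeholder job name and target address, looks roughly like this:

# prometheus.yml
global:
  scrape_interval: 15s              # how often every target is pulled
scrape_configs:
  - job_name: "api"
    static_configs:
      - targets: ["api.example.internal:8080"]   # scraped at /metrics by default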

This pull model has concrete advantages. Prometheus controls the scrape rate, so a misbehaving target cannot flood the system. You can scrape a target from your laptop with curl http://target:8080/metrics to see exactly what Prometheus sees. Targets that go down are detected within one scrape interval, because the failed scrape sets that target's up series to 0.
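For reference, the response from a metrics endpoint uses the plain-text exposition format. An illustrative fragment looks like this:

# HELP http_requests_total Total number of HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="GET",status="200"} 1027
http_requests_total{method="POST",status="500"} 3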

PromQL Essentials: Practical Query Patterns

Instant Vectors vs Range Vectors#

An instant vector returns one sample per time series at a single point in time. A range vector returns multiple samples per time series over a time window.

# Instant vector: current value of each series
http_requests_total{job="api"}

# Range vector: last 5 minutes of samples for each series
http_requests_total{job="api"}[5m]

You cannot graph a range vector directly. Functions like rate() and increase() consume a range vector and return an instant vector, which Grafana can then plot.
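For example, wrapping the range vector above in rate() or increase() produces instant vectors that can be graphed:

# Per-second request rate over the trailing 5 minutes (instant vector)
rate(http_requests_total{job="api"}[5m])

# Total requests added over the last 5 minutes (instant vector)
increase(http_requests_total{job="api"}[5m])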

Setting Up Full Observability from Scratch: Metrics, Logs, Traces, and Alerting

Setting Up Full Observability from Scratch#

This operational sequence deploys a complete observability stack on Kubernetes: metrics (Prometheus + Grafana), logs (Loki + Promtail), traces (Tempo + OpenTelemetry), and alerting (Alertmanager). Each phase is self-contained with verification steps. Complete them in order – later phases depend on earlier infrastructure.

Prerequisite: a running Kubernetes cluster with Helm installed and a monitoring namespace created.

kubectl create namespace monitoring --dry-run=client -o yaml | kubectl apply -f -
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo add grafana https://grafana.github.io/helm-charts
helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update

Phase 1 – Metrics (Prometheus + Grafana)#

Metrics are the foundation. Logging and tracing integrations all route through Grafana, so this phase must be solid before continuing.
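One way to cover this phase, assuming the prometheus-community repo added above, is the kube-prometheus-stack chart, which bundles Prometheus, Alertmanager, and Grafana (the release name and default values below are illustrative – tune them for your cluster):

# Install Prometheus, Alertmanager, and Grafana from a single chart
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring

# Verify everything is Running before moving on
kubectl get pods -n monitoring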

Time-Series Database Selection and Operations

Time-Series Database Selection and Operations#

Time-series databases optimize for a specific access pattern: high-volume writes of timestamped data points, queries that aggregate over time ranges, and automatic expiration of old data. Choosing the right one depends on your data model, query patterns, retention requirements, and operational constraints.

When You Need a Time-Series Database#

A dedicated time-series database is justified when you have high write throughput (thousands to millions of data points per second), queries that are predominantly time-range aggregations, and data that has a defined retention period. Common use cases: infrastructure metrics, application performance monitoring, IoT sensor data, financial tick data, and log analytics.

Writing Effective Prometheus Alerting Rules

Rule Syntax#

Alerting rules live in rule files loaded by Prometheus. Each rule has an expression, an optional for duration, labels, and annotations.

groups:
  - name: example
    rules:
      - alert: HighErrorRate
        expr: job:http_errors:ratio5m > 0.05
        for: 5m
        labels:
          severity: critical
          team: backend
        annotations:
          summary: "Error rate above 5% for {{ $labels.job }}"
          description: "Current error rate is {{ $value | humanizePercentage }}"
          runbook_url: "https://wiki.internal/runbooks/high-error-rate"

The for duration is critical. Without it, the alert fires on the first evaluation where the expression is true – a single bad scrape is enough. With for: 5m, the condition must be continuously true across all evaluations for 5 minutes before the alert fires. During this window the alert is in the pending state.
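The expr above references a recording rule rather than a raw query. A sketch of how job:http_errors:ratio5m might be defined – assuming a counter named http_requests_total with a status label, neither of which is shown here – is:

groups:
  - name: recording
    rules:
      - record: job:http_errors:ratio5m
        expr: |
          sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
            /
          sum by (job) (rate(http_requests_total[5m]))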

Advanced PromQL: Performance, Cardinality, and Complex Query Patterns

Cardinality Explosion#

Cardinality is the number of unique time series Prometheus tracks. Every unique combination of metric name and label key-value pairs creates a separate series. A metric with 3 labels, each having 100 possible values, generates up to 1,000,000 series. In practice, cardinality explosions are the single most common way to kill a Prometheus instance.

The usual culprits are labels containing user IDs, request paths with embedded IDs (like /api/users/a3f7b2c1), session tokens, trace IDs, or any unbounded value set. A seemingly innocent label like path on an HTTP metric becomes catastrophic when your API has RESTful routes with UUIDs in the path.
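One common mitigation is to normalize such labels at scrape time with metric_relabel_configs. A sketch, assuming a path label containing UUID segments (the label name and regex are illustrative):

scrape_configs:
  - job_name: "api"
    static_configs:
      - targets: ["api.example.internal:8080"]
    metric_relabel_configs:
      # Collapse UUID path segments into a placeholder before ingestion
      - source_labels: [path]
        regex: "(.*)/[0-9a-fA-F-]{36}(.*)"
        target_label: path
        replacement: "${1}/:id${2}"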

Canary Deployments Deep Dive: Argo Rollouts, Flagger, and Metrics-Based Progressive Delivery

Canary Deployments Deep Dive#

A canary deployment sends a small percentage of traffic to a new version of your application while the majority continues hitting the stable version. You monitor the canary for errors, latency regressions, and business metric anomalies. If the canary is healthy, you gradually increase its traffic share until it handles 100%. If something is wrong, you roll back with minimal user impact.

Why Canary Over Rolling Update#

A standard Kubernetes rolling update replaces pods one by one until all pods run the new version. The problem is timing. By the time you notice a bug in your monitoring dashboards, the rolling update may have already replaced most or all pods. Every user is now hitting the broken version.
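As a rough sketch of how the staged traffic shift is expressed with Argo Rollouts (one of the tools named in the title – the image tag, weights, and pause durations below are illustrative):

apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: api
spec:
  replicas: 5
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: registry.example.com/api:v2   # hypothetical new version
  strategy:
    canary:
      steps:
        - setWeight: 10            # 10% of traffic to the canary
        - pause: {duration: 10m}   # watch errors and latency before continuing
        - setWeight: 50
        - pause: {duration: 10m}
        # after the final step the Rollout promotes the canary to 100%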

Long-Term Metrics Storage: Thanos vs Grafana Mimir vs VictoriaMetrics

The Retention Problem#

Prometheus stores metrics on local disk with a default retention of 15 days. Most production teams extend this to 30 or 90 days, but local storage has hard limits. A single Prometheus instance cannot scale disk beyond the node it runs on. It provides no high availability – if the instance goes down, you lose scraping and query access. And each Prometheus instance only sees its own targets, so there is no unified view across clusters or regions.
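For context, local retention is controlled by a single flag, and the systems compared here are typically fed either by a sidecar (Thanos) or by Prometheus remote_write (Mimir, VictoriaMetrics, Thanos Receive). A minimal sketch, with a placeholder endpoint URL:

# Extend local retention – still bounded by the node's disk
prometheus --storage.tsdb.retention.time=90d

# prometheus.yml – ship a copy of every sample to long-term storage
remote_write:
  - url: http://metrics-gateway.monitoring.svc:8080/api/v1/push   # placeholder endpoint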

Monitoring Prometheus Itself: Capacity Planning, Self-Monitoring, and Scaling

Why Monitor Your Monitoring#

If Prometheus runs out of memory and crashes, you lose all alerting. If its disk fills up, it stops ingesting and you have a blind spot that may last hours before anyone notices. If scrapes start timing out, metrics go stale and alerts based on rate() produce no data (which means they silently stop firing rather than triggering). Prometheus must be the most reliably monitored component in your stack.
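A few self-monitoring expressions worth watching – a sketch built on Prometheus's own metrics (the job label and thresholds are illustrative):

# Prometheus itself is unreachable
up{job="prometheus"} == 0

# Active series in the head block – a leading indicator of memory pressure
prometheus_tsdb_head_series

# Notifications to Alertmanager are being dropped
rate(prometheus_notifications_dropped_total[5m]) > 0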

Prometheus Cardinality Management: Detecting, Preventing, and Reducing High-Cardinality Metrics

What Cardinality Means#

In Prometheus, cardinality is the number of unique time series. Every unique combination of metric name and label key-value pairs constitutes one series. The metric http_requests_total{method="GET", path="/api/users", status="200"} is one series. Change any label value and you get a different series. http_requests_total{method="POST", path="/api/users", status="201"} is a second series.

A single metric name can produce thousands or millions of series depending on its labels. A metric with no labels is exactly one series. A metric with one label that has 10 possible values is 10 series. A metric with three labels, each having 100 possible values, is up to 1,000,000 series (100 x 100 x 100), though in practice not every combination occurs.
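Two queries make these counts visible on a live server (the .+ matcher selects every series, so they can be expensive on large instances):

# Total number of active series
count({__name__=~".+"})

# The ten metric names contributing the most series
topk(10, count by (__name__)({__name__=~".+"}))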