Prometheus Architecture Deep Dive

Pull-Based Scraping Model

Prometheus pulls metrics from targets rather than having targets push metrics to it. On every scrape interval (commonly set to 15s in the global config; Prometheus's built-in default for scrape_interval is 1m), Prometheus sends an HTTP GET to each target's metrics endpoint. The target responds with all of its current metric values in the Prometheus exposition format.
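
As a minimal sketch of that configuration (the job name and target address below are placeholders, not from any real deployment), the interval is set once in the global block and each job lists the endpoints to pull from:

```yaml
# prometheus.yml -- minimal scrape configuration sketch
global:
  scrape_interval: 15s        # how often each target is scraped
  scrape_timeout: 10s         # a scrape slower than this counts as failed

scrape_configs:
  - job_name: "example-app"                  # hypothetical job name
    metrics_path: /metrics                   # the default path
    static_configs:
      - targets: ["app-1.internal:8080"]     # hypothetical target address
```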

This pull model has concrete advantages. Prometheus controls the scrape rate, so a misbehaving target cannot flood the system. You can fetch a target's metrics from your laptop with curl http://target:8080/metrics and see exactly what Prometheus sees. And a target that goes down is detected on its next scrape: the scrape fails and the target's up series drops to 0.
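
For reference, the body returned by a /metrics endpoint is plain text in the exposition format: HELP and TYPE comment lines followed by one sample per line. The metric names and values here are illustrative:

```text
# HELP http_requests_total Total number of HTTP requests handled.
# TYPE http_requests_total counter
http_requests_total{method="get",code="200"} 1027
http_requests_total{method="post",code="500"} 3

# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 4.5e+07
```

Because Prometheus records up as 0 for any target whose scrape fails, a PromQL expression as simple as up == 0 is the usual way to alert on targets that have stopped responding.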

Long-Term Metrics Storage: Thanos vs Grafana Mimir vs VictoriaMetrics

The Retention Problem

Prometheus stores metrics on local disk with a default retention of 15 days. Most production teams extend this to 30 or 90 days, but local storage has hard limits. A single Prometheus instance cannot scale disk beyond the node it runs on. It provides no high availability – if the instance goes down, you lose scraping and query access. And each Prometheus instance only sees its own targets, so there is no unified view across clusters or regions.
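
To make those limits concrete: local retention is controlled by flags on the Prometheus process, and the long-term systems compared below ingest data either via remote_write (the usual path for Grafana Mimir and VictoriaMetrics) or via a sidecar that ships TSDB blocks to object storage (the usual path for Thanos). A rough sketch, with a placeholder endpoint and label values:

```yaml
# Flags on the Prometheus process (not part of prometheus.yml):
#   --storage.tsdb.retention.time=90d     # keep 90 days on local disk
#   --storage.tsdb.retention.size=500GB   # or cap by disk usage, whichever is hit first

# prometheus.yml: stream samples to a long-term store as they are scraped
global:
  external_labels:
    cluster: "prod-eu-1"      # placeholder; lets a global store distinguish clusters and replicas

remote_write:
  - url: "https://metrics-lts.example.internal/api/v1/push"   # placeholder long-term store endpoint
```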