ArgoCD Multi-Cluster Management: Hub-Spoke Patterns, Cluster Registration, and Fleet Operations

ArgoCD Multi-Cluster Management

A single ArgoCD instance can manage deployments across dozens of Kubernetes clusters. This is one of ArgoCD’s strongest features and the standard approach for organizations with multiple environments, regions, or cloud providers.

Hub-Spoke Architecture

The standard multi-cluster pattern runs ArgoCD on one “hub” cluster that deploys to multiple “spoke” clusters:

Hub Cluster (management)
├── ArgoCD control plane
├── Application/ApplicationSet definitions
├── RBAC policies
└── Cluster credentials (Secrets)
    │
    ├──→ Spoke Cluster: dev (us-east-1)
    ├──→ Spoke Cluster: staging (us-west-2)
    ├──→ Spoke Cluster: prod-us (us-east-1)
    ├──→ Spoke Cluster: prod-eu (eu-west-1)
    └──→ Spoke Cluster: prod-apac (ap-southeast-1)

ArgoCD on the hub cluster connects to each spoke cluster’s API server to apply manifests and check health. The spoke clusters do not need ArgoCD installed.
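
Registration amounts to handing the hub credentials for each spoke. As a minimal sketch, assuming your kubeconfig already has a context for the spoke (the context and cluster names below are placeholders) and you are logged in to the hub's ArgoCD API, the CLI does the work:

# Register a spoke cluster from the hub. "prod-us" is a placeholder
# kubeconfig context name; ArgoCD creates a service account on the
# spoke and stores its credentials as a Secret on the hub.
argocd cluster add prod-us --name prod-us

# Confirm the hub can see and reach the spoke.
argocd cluster list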

GitOps for Kubernetes: Patterns, Tools, and Workflow Design

GitOps for Kubernetes

GitOps is a deployment model where git is the source of truth for your cluster’s desired state. A controller running inside the cluster watches a git repository and continuously reconciles the live state to match what is declared in git. When you want to change something, you commit to git. The controller detects the change and applies it.

This replaces kubectl apply from laptops and CI pipelines with a pull-based model where the cluster pulls its own configuration. The benefits are an audit trail in git history, easy rollback via git revert, and drift detection when someone makes manual changes.
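
In practice a deployment looks like an ordinary commit. The sketch below is illustrative, not from the article: the repository name, file path, and commit message are placeholders, and it assumes a reconciling controller (ArgoCD, Flux, or similar) is already watching the repository.

# Changes flow through git, not kubectl. Edit the manifest in the
# config repo, commit, and push; the in-cluster controller picks up
# the new commit and applies it.
cd gitops-config                        # clone of the config repository
vim apps/payments/deployment.yaml       # bump the image tag
git commit -am "payments: deploy new image tag"
git push origin main

# Rollback is the same motion in reverse:
git revert HEAD
git push origin main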

Long-Term Metrics Storage: Thanos vs Grafana Mimir vs VictoriaMetrics

The Retention Problem

Prometheus stores metrics on local disk with a default retention of 15 days. Most production teams extend this to 30 or 90 days, but local storage has hard limits. A single Prometheus instance cannot scale disk beyond the node it runs on. It provides no high availability – if the instance goes down, you lose scraping and query access. And each Prometheus instance only sees its own targets, so there is no unified view across clusters or regions.
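
For reference, retention is controlled by startup flags rather than the config file. A sketch with illustrative values is below; however far you raise these limits, the data still lives on the disk of a single node.

# Extend local retention by time and/or size. Both limits remain
# bounded by the disk attached to this one instance.
prometheus \
  --config.file=/etc/prometheus/prometheus.yml \
  --storage.tsdb.path=/prometheus \
  --storage.tsdb.retention.time=90d \
  --storage.tsdb.retention.size=500GB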

Multi-Cluster Kubernetes: Architecture, Networking, and Management Patterns

Multi-Cluster Kubernetes

A single Kubernetes cluster is a single blast radius. A bad deployment, a control plane failure, a misconfigured admission webhook – any of these can take down everything. Multi-cluster is not about complexity for its own sake. It is about isolation, resilience, and operating workloads that span regions, regulations, or teams.

Why Multi-Cluster

Blast radius isolation. A cluster-wide failure (etcd corruption, bad admission webhook, API server overload) only affects one cluster. Critical workloads in another cluster are untouched.

Multi-Cluster Emulation with Minikube Profiles

Multi-Cluster Emulation with Minikube Profiles

Production infrastructure rarely runs on a single cluster. You have staging, production, maybe a dedicated cluster for CI or data workloads. Minikube profiles let you run multiple independent Kubernetes clusters on one machine, each with its own version, resources, and addons. This is how you test multi-cluster workflows without cloud accounts.
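
As a quick sketch (the profile names, Kubernetes versions, and resource sizes below are arbitrary), two profiles give you two fully separate clusters, each with its own kubeconfig context:

# Two independent clusters on one machine, one per profile.
minikube start -p staging --kubernetes-version=v1.29.0 --cpus=2 --memory=4096
minikube start -p prod --kubernetes-version=v1.30.0 --cpus=4 --memory=8192

# Each profile gets a kubeconfig context named after it.
kubectl config use-context staging
kubectl get nodes

# See every profile and its status.
minikube profile list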

What Profiles Are

A minikube profile is a fully independent cluster. Each profile has its own: