Crossplane for Platform Abstractions

What Crossplane Does#

Crossplane extends Kubernetes to provision and manage cloud infrastructure using the Kubernetes API. Instead of writing Terraform and running apply, you write Kubernetes manifests and kubectl apply them. Crossplane controllers reconcile the desired state with the actual cloud resources.

The real value is not replacing Terraform — it is building abstractions. Platform teams define custom resource types (like DatabaseClaim) that developers consume without knowing whether they are getting RDS, CloudSQL, or Azure Database. The composition layer maps the simple claim to the actual cloud resources.
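
Here is a sketch of what that developer-facing surface can look like. The group, kind, and parameter names (database.example.org, DatabaseClaim, engine, size) are hypothetical: a platform team would define them in a CompositeResourceDefinition and map them to cloud resources in a Composition.

apiVersion: database.example.org/v1alpha1
kind: DatabaseClaim
metadata:
  name: orders-db
  namespace: team-orders
spec:
  compositionSelector:
    matchLabels:
      provider: aws        # which Composition backs this claim is a platform decision
  parameters:
    engine: postgres       # developer intent; the Composition picks RDS, CloudSQL, etc.
    size: small            # translated by the Composition into instance class and storage

Developers kubectl apply this claim like any other manifest; swapping the backing Composition changes the underlying infrastructure without touching the claim.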

Blue-Green Deployments: Traffic Switching, Database Compatibility, and Rollback Strategies

Blue-Green Deployments#

A blue-green deployment runs two identical production environments. One (blue) serves live traffic. The other (green) is idle or running the new version. When the green environment passes validation, you switch traffic from blue to green. If something goes wrong, you switch back. The old environment stays running until you are confident the new version is stable.

The fundamental advantage over rolling updates is atomicity. Traffic switches from 100% old to 100% new in a single operation. There is no period where some users see the old version and others see the new one.
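
On Kubernetes, the simplest version of that switch is a Service whose selector pins the live color. A minimal sketch, assuming both Deployments label their pods with app: checkout plus a version label (all names here are illustrative):

apiVersion: v1
kind: Service
metadata:
  name: checkout
spec:
  selector:
    app: checkout
    version: blue        # change to "green" to cut all traffic over in one update
  ports:
    - port: 80
      targetPort: 8080

Patching version from blue to green is the cutover; patching it back is the rollback, which works precisely because the blue pods are still running.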

Self-Hosted CI Runners at Scale: GitHub Actions Runner Controller, GitLab Runners on K8s, and Autoscaling

Self-Hosted CI Runners at Scale#

GitHub-hosted and GitLab SaaS runners work until they do not. You hit their limits when you need private network access to deploy to internal infrastructure, when you need specific hardware like GPUs or ARM64 machines, when compliance rules prohibit running code on shared infrastructure, or when you are burning thousands of dollars per month on hosted runner minutes.

Self-hosted runners solve these problems but introduce operational complexity: you now own runner provisioning, scaling, security, image updates, and cost management.
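
To make the Kubernetes side concrete, here is a minimal sketch using the legacy actions-runner-controller CRDs; the repository name and thresholds are placeholders, and newer ARC releases use a different runner scale set API installed via Helm:

apiVersion: actions.summerwind.dev/v1alpha1
kind: RunnerDeployment
metadata:
  name: ci-runners
spec:
  template:
    spec:
      repository: example-org/example-repo   # placeholder; can be organization-scoped instead
---
apiVersion: actions.summerwind.dev/v1alpha1
kind: HorizontalRunnerAutoscaler
metadata:
  name: ci-runners-autoscaler
spec:
  scaleTargetRef:
    name: ci-runners
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: PercentageRunnersBusy
      scaleUpThreshold: "0.75"     # add runners once 75% are busy
      scaleDownThreshold: "0.25"   # shed runners once utilization drops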

ArgoCD Multi-Cluster Management: Hub-Spoke Patterns, Cluster Registration, and Fleet Operations

ArgoCD Multi-Cluster Management#

A single ArgoCD instance can manage deployments across dozens of Kubernetes clusters. This is one of ArgoCD’s strongest features and the standard approach for organizations with multiple environments, regions, or cloud providers.

Hub-Spoke Architecture#

The standard multi-cluster pattern runs ArgoCD on one “hub” cluster that deploys to multiple “spoke” clusters:

Hub Cluster (management)
├── ArgoCD control plane
├── Application/ApplicationSet definitions
├── RBAC policies
└── Cluster credentials (Secrets)
    │
    ├──→ Spoke Cluster: dev (us-east-1)
    ├──→ Spoke Cluster: staging (us-west-2)
    ├──→ Spoke Cluster: prod-us (us-east-1)
    ├──→ Spoke Cluster: prod-eu (eu-west-1)
    └──→ Spoke Cluster: prod-apac (ap-southeast-1)

ArgoCD on the hub cluster connects to each spoke cluster’s API server to apply manifests and check health. The spoke clusters do not need ArgoCD installed.
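
Spoke registration is typically done with argocd cluster add <context>, which provisions a service account on the spoke and stores its credentials on the hub as a Secret shaped like this (the server URL and credential values are placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: cluster-prod-us
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster   # tells ArgoCD this Secret is a cluster
type: Opaque
stringData:
  name: prod-us
  server: https://prod-us.example.com:6443    # spoke API server endpoint (placeholder)
  config: |
    {
      "bearerToken": "<service-account-token>",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64-encoded CA certificate>"
      }
    }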

ArgoCD Setup and Basics: Installation, CLI, First Application, and Sync Policies

ArgoCD Setup and Basics#

ArgoCD is a declarative GitOps continuous delivery tool for Kubernetes. It watches Git repositories and ensures your cluster state matches what is declared in those repos. When someone changes a manifest in Git, ArgoCD detects the drift and either alerts you or automatically applies the change.

Installation with Plain Manifests#

The fastest path to a running ArgoCD instance:

kubectl create namespace argocd
kubectl apply -n argocd -f https://raw.githubusercontent.com/argoproj/argo-cd/stable/manifests/install.yaml

This installs the full ArgoCD stack: API server, repo server, application controller, Redis, and Dex (for SSO). For an installation that only requires namespace-level privileges rather than cluster-wide RBAC, use namespace-install.yaml instead; for a minimal install without the UI and SSO, use core-install.yaml.
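
With the control plane running, a first Application is a single manifest; the repo URL, path, and namespace below are placeholders:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: guestbook
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example-org/example-repo.git   # placeholder repo
    targetRevision: main
    path: manifests/guestbook
  destination:
    server: https://kubernetes.default.svc   # the cluster ArgoCD itself runs in
    namespace: guestbook
  syncPolicy:
    automated:
      prune: true      # delete resources removed from Git
      selfHeal: true   # revert manual drift in the cluster

The syncPolicy block is what turns detection into action: without automated, ArgoCD only reports drift; with it, prune and selfHeal let it delete resources removed from Git and revert manual changes in the cluster.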

Chaos Engineering: From First Experiment to Mature Practice

Why Break Things on Purpose#

Production systems fail in ways that testing environments never reveal. Database connection pool exhaustion under load, a cascading timeout across three services, a DNS cache that masks a routing change until it expires – these failures only surface when real conditions collide in ways nobody predicted. Chaos engineering is the discipline of deliberately injecting failures into a system to discover weaknesses before they cause outages.

Cloud-Native vs Portable Infrastructure: A Decision Framework

Cloud-Native vs Portable Infrastructure#

Every infrastructure decision sits on a spectrum between portability and fidelity. On one end, you have generic Kubernetes running on minikube or kind – it works everywhere, costs nothing, and captures the behavior of the Kubernetes API itself. On the other end, you have cloud-native managed services – EKS with IRSA and ALB Ingress Controller, GKE with Workload Identity and Cloud Load Balancing, AKS with Azure AD Pod Identity and Azure Load Balancer. These capture the behavior of the actual platform your workloads will run on.

CockroachDB Setup and Architecture

Architecture: What CockroachDB Actually Does Under the Hood#

CockroachDB is a distributed SQL database that stores data across multiple nodes while presenting a single logical database to clients. Understanding three concepts is essential before deploying it.

Ranges. All data is stored in key-value pairs, sorted by key. CockroachDB splits this sorted keyspace into contiguous chunks called ranges, each targeting 512 MiB by default. Every SQL table, index, and system table maps to one or more ranges. When a range grows beyond the threshold, it splits automatically.
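
As an illustration (the split keys and sizes here are invented), a users table holding roughly 1.5 GiB would map onto three ranges of the sorted keyspace:

/Table/users
├── range 1: [/users/1,     /users/41000)   ~512 MiB
├── range 2: [/users/41000, /users/87000)   ~512 MiB
└── range 3: [/users/87000, /users/max]     ~490 MiB

Writing a few hundred more mebibytes into range 3 would push it past the 512 MiB threshold and trigger another automatic split.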

Devcontainer Sandbox Templates: Zero-Cost Validation Environments for Infrastructure Development

Devcontainer Sandbox Templates#

Devcontainers provide disposable, reproducible development environments that run in a container. You define the tools, extensions, and configuration in a .devcontainer/ directory, and any compatible host – GitHub Codespaces, Gitpod, VS Code with Docker, or the devcontainer CLI – builds and launches the environment from that definition.

For infrastructure validation, devcontainers solve a specific problem: giving every developer and every CI run the exact same set of tools at the exact same versions, without requiring them to install anything on their local machine. A Kubernetes devcontainer includes kind, kubectl, helm, and kustomize at pinned versions. A Terraform devcontainer includes terraform, tflint, checkov, and cloud CLIs. The environment is ready to use the moment it starts.
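
A minimal sketch of such a definition. The feature identifiers follow the public devcontainers feature registry, but the exact IDs, options, and pinned versions here are illustrative and worth verifying against that registry (a kind feature, for example, would come from a community-maintained namespace):

{
  "name": "infra-sandbox",
  "image": "mcr.microsoft.com/devcontainers/base:ubuntu",
  "features": {
    "ghcr.io/devcontainers/features/kubectl-helm-minikube:1": {
      "version": "1.29.0",
      "helm": "3.14.0"
    },
    "ghcr.io/devcontainers/features/terraform:1": {
      "version": "1.7.5",
      "tflint": "0.50.3"
    }
  },
  "postCreateCommand": "kubectl version --client && terraform version"
}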

Distributed Tracing in Practice

Trace, Span, and Context#

A trace represents a single request flowing through a distributed system. It is identified by a 128-bit trace ID. A span represents one unit of work within that trace – an HTTP handler, a database query, a message publish. Each span has a name, start time, duration, status, attributes (key-value pairs), and events (timestamped annotations). Spans form a tree: every span except the root has a parent span ID.
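
As a made-up example, a single checkout request might produce a trace shaped like this (names and durations invented for illustration):

trace 4bf92f3577b34da6a3ce929d0e0e4736 (POST /checkout)
└── span: HTTP POST /checkout            240 ms  (root, no parent)
    ├── span: SELECT FROM carts           35 ms  (parent: root)
    ├── span: HTTP GET inventory-svc      80 ms  (parent: root)
    │   └── span: SELECT FROM stock       22 ms  (parent: GET inventory-svc)
    └── span: publish order.created        5 ms  (parent: root)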