Developer Self-Service Workflows

The Cost of Not Having Self-Service#

A developer needs a PostgreSQL database. They file a ticket. It sits in a backlog for two days. A DBA provisions it, sends credentials via Slack DM. Elapsed time: 3 days. Actual need: 5 minutes of configuration. Multiply across every database, cache, queue, and namespace, and manual provisioning becomes the single largest drag on velocity. Self-service lets developers provision pre-approved resources directly, within guardrails the platform team defines.

Schema Evolution and Compatibility

Schema Evolution and Compatibility#

Every service contract changes over time. New fields get added, old fields get removed, types change. In a monolith, you update the schema and redeploy. In microservices, producers and consumers deploy independently. A schema change that breaks consumers causes production failures. Schema evolution rules and tooling exist to prevent this.

Compatibility Modes#

There are four compatibility modes. Understanding them is essential for operating any schema registry.

Game Day and Tabletop Exercise Planning

Why Run Exercises#

Runbooks that have never been tested are fiction. Failover procedures that have never been executed are hopes. Game days and tabletop exercises convert assumptions about system resilience into verified facts – or reveal that those assumptions were wrong before a real incident does.

The value is not just finding technical gaps. Exercises expose process gaps: unclear escalation paths, missing permissions, outdated contact lists, communication breakdowns between teams. These are invisible until a simulated failure forces people to actually follow the documented procedure.

Distributed Data Consistency Patterns

Distributed Data Consistency Patterns#

In a monolith with a single database, you get ACID transactions. In microservices, each service owns its database. Cross-service consistency requires explicit patterns because distributed ACID transactions are impractical in most real systems.

CAP Theorem: Practical Implications#

The CAP theorem states that a distributed system can provide at most two of three guarantees: Consistency, Availability, and Partition tolerance. Since network partitions are inevitable, you choose between consistency and availability during partitions.

Reliability Review Process

Why Regular Reviews Matter#

Reliability does not improve by accident. Without a structured review cadence, teams operate on vibes – “things feel okay” or “we’ve been having a lot of incidents lately.” Reliability reviews replace gut feelings with data. They surface slow-burning problems before they become outages, hold teams accountable for improvement actions, and create a shared understanding of system health across engineering and leadership.

Weekly Reliability Review#

The weekly review is a 30-minute tactical meeting focused on what happened this week and what needs attention next week. Attendees: on-call engineers, team leads, SRE. Keep it tight.

Service Decomposition Anti-Patterns

Service Decomposition Anti-Patterns#

Splitting a monolith into microservices is a common architectural goal. But bad decomposition creates systems that are harder to operate than the monolith they replaced. These anti-patterns are disturbingly common and often unrecognized until the team is deep in operational pain.

The Distributed Monolith#

The distributed monolith looks like microservices from the outside – separate repositories, separate deployments, separate CI pipelines – but behaves like a monolith at runtime. Services cannot be deployed independently because they are tightly coupled.

Multiple Temporal Servers on Minikube: Multi-Cluster Setup

Multiple Temporal Servers on Minikube#

Running two independent Temporal Server instances locally lets you develop and test cross-cluster patterns – worker bridges, namespace replication, and multi-region failover – without cloud infrastructure. This article walks through deploying two Temporal clusters on minikube using profiles and connecting them over Docker networking.

All configuration files and Makefile targets reference the companion repository at github.com/statherm/temporal-examples in the multi-cluster/ directory.

Why Multiple Clusters?#

A single Temporal cluster handles most use cases. You need multiple clusters when:

Platform Engineering Maturity Model

Why a Maturity Model#

Platform engineering investments fail when organizations skip levels. A team that cannot maintain shared Terraform modules reliably has no business building a self-service portal. The maturity model provides an honest assessment of where you are and what must be true before advancing.

This is not a five-year roadmap. Some organizations reach Level 2 and stay there — it serves their needs. The model helps you identify what level you need, what level you are at, and what is blocking progress.

Crossplane for Platform Abstractions

What Crossplane Does#

Crossplane extends Kubernetes to provision and manage cloud infrastructure using the Kubernetes API. Instead of writing Terraform and running apply, you write Kubernetes manifests and kubectl apply them. Crossplane controllers reconcile the desired state with the actual cloud resources.

The real value is not replacing Terraform — it is building abstractions. Platform teams define custom resource types (like DatabaseClaim) that developers consume without knowing whether they are getting RDS, CloudSQL, or Azure Database. The composition layer maps the simple claim to the actual cloud resources.

Temporal Cross-Cluster Communication: Architecture and Patterns

Temporal Cross-Cluster Communication#

When you operate multiple Temporal clusters – whether for regional deployment, compliance isolation, or blast radius reduction – workflows in one cluster eventually need to trigger work in another. This article examines three architectural approaches for cross-cluster communication, their tradeoffs, and guidance on choosing the right one for your situation.

This is an architecture guide. It establishes the concepts and patterns. The next article, Building a Worker Bridge, provides the full implementation.