CQRS and Event Sourcing

CQRS and Event Sourcing#

CQRS (Command Query Responsibility Segregation) and event sourcing are frequently discussed together but are independent patterns. You can use either one without the other. Understanding each separately is essential before combining them.

CQRS: Separate Read and Write Models#

Most applications use the same data model for reads and writes. The same orders table serves the API that creates orders and the API that lists them. This works until read and write requirements diverge significantly.

Platform Team Structure and Operating Model

Why the Operating Model Matters#

The platform team’s operating model determines whether the platform becomes a force multiplier or a bottleneck. A ticket-driven, gatekeeper-oriented team produces a platform developers route around. A product-oriented, self-service team produces a platform developers adopt voluntarily. Organizational structure shapes developer experience more than technology choices.

Team Topologies and Interaction Modes#

The Team Topologies framework (Skelton & Pais) defines four team types relevant to platform engineering:

Production Readiness Reviews

Sre

Why Services Need a Gate Before Production#

Every production outage caused by a service that launched without monitoring, without runbooks, without capacity planning, without anyone knowing who owns it at 3 AM – every one of those was preventable. A production readiness review is the gate between “it works on my machine” and “it is ready for real users.” Google formalized this as the PRR process. You do not need Google-scale infrastructure to benefit from it.

Azure DevOps Pipelines: YAML Pipelines, Templates, Service Connections, and AKS Integration

Azure DevOps Pipelines#

Azure DevOps Pipelines uses YAML files stored in your repository to define build and deployment workflows. The pipeline model has three levels: stages contain jobs, jobs contain steps. This hierarchy maps directly to how you think about CI/CD – build stage, test stage, deploy-to-staging stage, deploy-to-production stage – with each stage containing one or more parallel jobs.

Pipeline Structure#

A complete pipeline in azure-pipelines.yml:

trigger:
  branches:
    include:
      - main
      - release/*
  paths:
    exclude:
      - docs/**
      - README.md

pool:
  vmImage: 'ubuntu-latest'

variables:
  - group: common-vars
  - name: buildConfiguration
    value: 'Release'

stages:
  - stage: Build
    jobs:
      - job: BuildApp
        steps:
          - task: GoTool@0
            inputs:
              version: '1.22'
          - script: |
              go build -o $(Build.ArtifactStagingDirectory)/myapp ./cmd/myapp
            displayName: 'Build binary'
          - publish: $(Build.ArtifactStagingDirectory)
            artifact: drop

  - stage: Test
    dependsOn: Build
    jobs:
      - job: UnitTests
        steps:
          - task: GoTool@0
            inputs:
              version: '1.22'
          - script: go test ./... -v -coverprofile=coverage.out
            displayName: 'Run tests'
          - task: PublishCodeCoverageResults@2
            inputs:
              summaryFileLocation: coverage.out
              codecoverageTool: 'Cobertura'

  - stage: DeployStaging
    dependsOn: Test
    condition: and(succeeded(), eq(variables['Build.SourceBranch'], 'refs/heads/main'))
    jobs:
      - deployment: DeployToStaging
        environment: staging
        strategy:
          runOnce:
            deploy:
              steps:
                - download: current
                  artifact: drop
                - script: echo "Deploying to staging"

trigger controls which branches and paths trigger the pipeline. dependsOn creates stage ordering. condition adds logic – succeeded() checks the previous stage passed, and you can combine it with variable checks to restrict certain stages to specific branches.

Contract Testing for Microservices

Contract Testing for Microservices#

Integration tests verify that services work together, but they require all services to be running. In a system with 20 services, running full integration tests is slow, brittle, and blocks deployments. Contract testing solves this by verifying service interactions independently – each service tests against a contract instead of against a live instance of every dependency.

The Problem with Integration and E2E Tests#

A traditional integration test for “order service calls payment service” requires both services running, along with their databases, message brokers, and any other dependencies. Problems:

Service Catalog Management and Design

Why a Service Catalog Exists#

A service catalog answers: “What do we have, who owns it, and what state is it in?” Without one, this information lives in tribal knowledge and stale wiki pages. When an incident hits at 3 AM, the on-call engineer needs to know who owns the failing service, what it depends on, and where to find the runbook. The catalog provides this in seconds.

The catalog is also the foundation for other platform capabilities. Golden paths register outputs in it. Scorecards evaluate catalog entities. Self-service workflows provision resources linked to catalog entries.

SLO Practical Implementation Guide

Sre

From Theory to Running SLOs#

Every SRE resource explains what SLOs are. Few explain how to actually implement them from scratch – the Prometheus queries, the error budget math, the alerting rules, and the conversations with product managers when the budget runs out. This guide covers all of it.

Step 1: Choose Your SLIs#

SLIs must measure what users experience. Internal metrics like CPU usage or queue depth are useful for debugging but are not SLIs because users do not care about your CPU – they care whether the page loaded.

AWS CodePipeline and CodeBuild: Pipeline Structure, ECR Integration, ECS/EKS Deployments, and Cross-Account Patterns

AWS CodePipeline and CodeBuild#

AWS CodePipeline orchestrates CI/CD workflows as a series of stages. CodeBuild executes the actual build and test commands. Together they provide a fully managed pipeline that integrates natively with S3, ECR, ECS, EKS, Lambda, and CloudFormation. No servers to manage, no agents to maintain – but the trade-off is less flexibility than self-hosted systems and tighter coupling to the AWS ecosystem.

Pipeline Structure#

A CodePipeline has stages, and each stage has actions. Actions can run in parallel or sequentially within a stage. The most common pattern is Source -> Build -> Deploy:

Developer Self-Service Workflows

The Cost of Not Having Self-Service#

A developer needs a PostgreSQL database. They file a ticket. It sits in a backlog for two days. A DBA provisions it, sends credentials via Slack DM. Elapsed time: 3 days. Actual need: 5 minutes of configuration. Multiply across every database, cache, queue, and namespace, and manual provisioning becomes the single largest drag on velocity. Self-service lets developers provision pre-approved resources directly, within guardrails the platform team defines.

Schema Evolution and Compatibility

Schema Evolution and Compatibility#

Every service contract changes over time. New fields get added, old fields get removed, types change. In a monolith, you update the schema and redeploy. In microservices, producers and consumers deploy independently. A schema change that breaks consumers causes production failures. Schema evolution rules and tooling exist to prevent this.

Compatibility Modes#

There are four compatibility modes. Understanding them is essential for operating any schema registry.