Rate Limiting Implementation Patterns

Rate Limiting Implementation Patterns#

Rate limiting controls how many requests a client can make within a time period. It protects services from overload, ensures fair usage across clients, prevents abuse, and provides a mechanism for graceful degradation under load. Every production API needs rate limiting at some layer.

Algorithm Comparison#

Fixed Window#

The simplest algorithm. Divide time into fixed windows (e.g., 1-minute intervals) and count requests per window. When the count exceeds the limit, reject requests until the next window starts.

RBAC Patterns: Practical Access Control for Kubernetes

RBAC Patterns#

Kubernetes RBAC controls who can do what to which resources. It is built on four objects: Roles, ClusterRoles, RoleBindings, and ClusterRoleBindings. Getting RBAC right means understanding how these four pieces compose and knowing the common patterns that cover 90% of real-world needs.

The Four RBAC Objects#

Role – Defines permissions within a single namespace. Lists API groups, resources, and allowed verbs.

ClusterRole – Defines permissions cluster-wide or for non-namespaced resources (nodes, persistent volumes, namespaces themselves).

Real User Monitoring (RUM) and Frontend Observability: Core Web Vitals, Error Tracking, and Session Replay

What Real User Monitoring Measures#

Real User Monitoring (RUM) collects performance and behavior data from actual users interacting with your application in their real browsers, on their real networks, with their real hardware. Unlike synthetic monitoring, which tests a controlled scenario from a known location, RUM captures the full spectrum of user experience – including the user on a slow 3G connection in rural Brazil using a 4-year-old phone.

RUM answers questions that no amount of server-side monitoring can: How fast does the page actually load for users? Which JavaScript errors are users hitting in production? Where do users abandon a workflow? Which geographic regions experience worse performance?

Refactoring Terraform: When and How to Restructure Growing Infrastructure Code

Refactoring Terraform#

Terraform configurations grow organically. A project starts with 10 resources in one directory. Six months later it has 80 resources, 3 levels of modules, and a state file that takes 2 minutes to plan. Changes feel risky because everything is interconnected. New team members (or agents) cannot understand the structure without reading every file.

Refactoring addresses this — but Terraform refactoring is harder than code refactoring because the state file maps resource addresses to real infrastructure. Rename a resource and Terraform thinks you want to destroy the old one and create a new one. Move a resource into a module and Terraform plans to recreate it. Every structural change requires corresponding state manipulation.

Regulatory Compliance Frameworks: HIPAA, FedRAMP, ITAR, and SOX Technical Controls

Regulatory Compliance Frameworks#

Regulatory compliance translates legal requirements into technical controls. Understanding which regulations apply to your system and mapping them to infrastructure and application design is a core engineering responsibility in regulated industries.

This guide covers four major frameworks and their practical implications for software architecture. These are not exhaustive compliance guides — they map the most impactful technical controls for each framework.

HIPAA (Health Insurance Portability and Accountability Act)#

HIPAA applies to organizations handling Protected Health Information (PHI) — any data that can identify a patient and relates to their health condition, treatment, or payment.

Release Management Patterns: Versioning, Changelog Generation, Branching, Rollbacks, and Progressive Rollouts

Release Management Patterns#

Releasing software is more than merging to main and deploying. A disciplined release process ensures that every version is identifiable, every change is documented, every deployment is reversible, and failures are contained before they reach all users. This operational sequence walks through each phase of a production release workflow.

Phase 1 – Semantic Versioning#

Step 1: Adopt Semantic Versioning#

Semantic versioning (semver) communicates the impact of changes through the version number itself: MAJOR.MINOR.PATCH.

Resource Requests and Limits: CPU, Memory, QoS, and OOMKilled Debugging

Resource Requests and Limits#

Requests and limits control how Kubernetes schedules pods and enforces resource usage. Getting them wrong leads to pods that get evicted, throttled to a crawl, or that starve other workloads on the same node.

Requests vs Limits#

Requests are what the scheduler uses for placement. When you request 500m CPU and 256Mi memory, Kubernetes finds a node with that much allocatable capacity. The request is a guarantee – the kubelet reserves those resources for your container.

Running Kafka on Kubernetes with Strimzi

Running Kafka on Kubernetes with Strimzi#

Running Kafka on Kubernetes without an operator is painful. You need StatefulSets, headless Services, init containers for broker ID assignment, and careful handling of storage and networking. Strimzi eliminates most of this by managing the entire Kafka lifecycle through Custom Resource Definitions.

Installing Strimzi#

# Option 1: Helm
helm repo add strimzi https://strimzi.io/charts
helm install strimzi strimzi/strimzi-kafka-operator \
  --namespace kafka \
  --create-namespace

# Option 2: Direct YAML install
kubectl create namespace kafka
kubectl apply -f https://strimzi.io/install/latest?namespace=kafka -n kafka

Verify the operator is running:

Running Redis on Kubernetes

Running Redis on Kubernetes#

Redis on Kubernetes ranges from dead simple (single pod for caching) to operationally complex (Redis Cluster with persistence). The right choice depends on whether you need data durability, high availability, or just a fast throwaway cache.

Single-Instance Redis with Persistence#

For development or small workloads, a single Redis Deployment with a PVC is enough:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: redis
spec:
  replicas: 1
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      containers:
      - name: redis
        image: redis:7-alpine
        command: ["redis-server", "--appendonly", "yes", "--maxmemory", "256mb", "--maxmemory-policy", "allkeys-lru"]
        ports:
        - containerPort: 6379
        volumeMounts:
        - name: redis-data
          mountPath: /data
        resources:
          requests:
            cpu: 100m
            memory: 300Mi
          limits:
            cpu: 500m
            memory: 350Mi
      volumes:
      - name: redis-data
        persistentVolumeClaim:
          claimName: redis-data
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: redis-data
spec:
  accessModes: ["ReadWriteOnce"]
  resources:
    requests:
      storage: 5Gi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  selector:
    app: redis
  ports:
  - port: 6379
    targetPort: 6379

Set the memory limit in Redis (--maxmemory) lower than the container memory limit. If Redis uses 350Mi and the container limit is 350Mi, the kernel OOM-kills the process during background save operations when Redis forks and temporarily doubles its memory usage. A safe ratio: set maxmemory to 60-75% of the container memory limit.

Running Terraform in CI/CD Pipelines

The Core Pattern#

The standard CI/CD pattern for Terraform: run terraform plan on every pull request and post the output as a PR comment. Run terraform apply only when the PR merges to main. This gives reviewers visibility into what will change before approving.

GitHub Actions Workflow#

name: Terraform
on:
  pull_request:
    paths: ["infra/**"]
  push:
    branches: [main]
    paths: ["infra/**"]

permissions:
  id-token: write    # OIDC
  contents: read
  pull-requests: write  # PR comments

jobs:
  terraform:
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infra
    steps:
      - uses: actions/checkout@v4

      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/terraform-ci
          aws-region: us-east-1

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0
          terraform_wrapper: true  # captures stdout for PR comments

      - name: Terraform Init
        run: terraform init -input=false

      - name: Terraform Format Check
        run: terraform fmt -check -recursive

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        id: plan
        run: terraform plan -input=false -no-color -out=tfplan
        continue-on-error: true

      - name: Post Plan to PR
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const plan = `${{ steps.plan.outputs.stdout }}`;
            const truncated = plan.length > 60000
              ? plan.substring(0, 60000) + "\n\n... truncated ..."
              : plan;
            await github.rest.issues.createComment({
              owner: context.repo.owner,
              repo: context.repo.repo,
              issue_number: context.issue.number,
              body: `#### Terraform Plan\n\`\`\`\n${truncated}\n\`\`\``
            });

      - name: Plan Status
        if: steps.plan.outcome == 'failure'
        run: exit 1

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        run: terraform apply -input=false tfplan

The plan step uses continue-on-error: true so the PR comment step still runs. A separate step checks the actual plan outcome. Apply only runs on pushes to main.