Introduction to Temporal: Durable Execution for Distributed Systems

Introduction to Temporal

Temporal is a durable execution engine. You write workflows as ordinary code – if/else, loops, function calls – and Temporal guarantees that code runs to completion even when processes crash, machines fail, or deployments happen mid-execution. It eliminates the need to build retry logic, state machines, and recovery mechanisms by hand.

This article introduces the core concepts, architecture, and use cases. It is the first in a series that takes you from zero to running production workflows on Kubernetes.

DeepSeek V4 Operational Quirks: Pro vs Flash, Reasoning Echo, and the Discount Cliff

DeepSeek V4 Operational Quirks

DeepSeek V4 ships two models behind one OpenAI-compatible API: V4-Pro (reasoning) at $1.74/M input / $3.48/M output and V4-Flash (chat) at $0.28/M input / $1.10/M output. Until 2026-05-31 V4-Pro carries a 75% discount, putting it at $0.435/M input — cheap enough to use as a heavy-tier coding model. After that, the cost steps up 4×.
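The discount arithmetic above can be checked with a few lines of Go. This is a minimal sketch: the prices and the 75% figure come from the paragraph above, while the function names and the monthly token volume are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// List prices in USD per million input tokens, copied from the text above.
const (
	proInputList   = 1.74
	flashInputList = 0.28
	promoDiscount  = 0.75 // 75% off V4-Pro until 2026-05-31
)

// discounted applies a fractional discount (0.75 means 75% off) to a list price.
func discounted(list, discount float64) float64 {
	return list * (1 - discount)
}

// inputCostUSD prices a number of input tokens at a per-million-token rate.
func inputCostUSD(tokens int, perMillion float64) float64 {
	return float64(tokens) / 1e6 * perMillion
}

func main() {
	promo := discounted(proInputList, promoDiscount)
	fmt.Printf("V4-Pro promo input price: $%.3f/M\n", promo)
	fmt.Printf("step-up when the promo ends: %.1fx\n", proInputList/promo)

	// Hypothetical monthly volume, used only to show the size of the cliff.
	const monthlyInputTokens = 500_000_000
	fmt.Printf("monthly input cost, promo: $%.2f\n", inputCostUSD(monthlyInputTokens, promo))
	fmt.Printf("monthly input cost, list:  $%.2f\n", inputCostUSD(monthlyInputTokens, proInputList))
	fmt.Printf("monthly input cost, Flash: $%.2f\n", inputCostUSD(monthlyInputTokens, flashInputList))
}
```

Only input pricing is modeled here, because the text states the discounted input price but not a discounted output price.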

The two models live on the same endpoint but want very different things. V4-Pro behaves like a reasoning model (thin prompts, reasoning_content echo required, tool_choice restrictions). V4-Flash behaves like a chat model (rich prompts win dramatically; rejects nothing). Confuse them and your matrix lights up red.

LLM Adapter Audit Checklist: 10 Bugs That Hide in OpenAI-Compatible Providers

LLM Adapter Audit Checklist

When you wrap an OpenAI-compatible LLM provider (Moonshot, DeepSeek, xAI, Together, Fireworks, OpenRouter, vLLM, anything else that exposes POST /v1/chat/completions) in a Go HTTP client, the same ten bug classes show up. They all silently degrade or break the agent — none of them crash loudly. Each was observed in production across at least one of xAI, DeepSeek, or Moonshot during a two-week audit period.

This checklist is the audit. Run it against any new adapter before shipping. Each entry is Symptom → Cause → Fix with a code shape you can grep your repo for.

Moonshot Kimi K2.6 Operational Quirks: What Breaks in Production

Moonshot Kimi K2.6 Operational Quirks

Kimi K2.6 is one of the cheapest competent reasoning models — $0.95/M input cache-miss, $0.16/M cache-hit, $4.00/M output, 256K context. It is also one of the most opinionated. Half of what works on OpenAI breaks here, and the failures are silent: empty content, mid-reasoning truncation, 400 errors that don’t mention the actual problem, and a cache key parameter that makes cost go up instead of down.

OFAT Matrix LLM Tuning: A Methodology for Picking Sampling Params, Tool Configs, and Prompts Without Guessing

OFAT Matrix LLM Tuning

When a new provider or model lands and you have to decide what temperature, max_tokens, tool_choice, prompt-shape, and turn budget to ship in production, the default is to pick by hunch. Read the model card, copy a partner adapter’s defaults, ship. A week later you find out reasoning_effort=high doubled cost for no quality gain, max_tokens=2048 silently truncated half your tier-3 runs, and the “prompt-rich” pattern you copied from grok-4.3 actively hurts kimi.
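The alternative to picking by hunch is to vary one factor at a time against a fixed baseline. The sketch below shows one way to enumerate such cells in Go. The `Config` type, the `ofatCells` helper, and every level value are assumptions made for illustration; they are not the matrix definitions used in the runs described here.

```go
package main

import (
	"fmt"
	"sort"
)

// Config maps a factor name (temperature, max_tokens, ...) to its chosen level.
type Config map[string]string

// clone copies a Config so that cells never share underlying map storage.
func clone(c Config) Config {
	out := make(Config, len(c))
	for k, v := range c {
		out[k] = v
	}
	return out
}

// ofatCells returns the baseline followed by one cell per alternative level.
// Every non-baseline cell differs from the baseline in exactly one factor.
func ofatCells(baseline Config, alternatives map[string][]string) []Config {
	cells := []Config{clone(baseline)}

	// Sort factor names so the cell order is stable across runs.
	factors := make([]string, 0, len(alternatives))
	for f := range alternatives {
		factors = append(factors, f)
	}
	sort.Strings(factors)

	for _, f := range factors {
		for _, level := range alternatives[f] {
			if level == baseline[f] {
				continue // identical to the baseline cell, skip it
			}
			cell := clone(baseline)
			cell[f] = level
			cells = append(cells, cell)
		}
	}
	return cells
}

func main() {
	// Hypothetical baseline and levels, for illustration only.
	baseline := Config{"temperature": "0.7", "max_tokens": "2048", "tool_choice": "auto"}
	alternatives := map[string][]string{
		"temperature": {"0.2", "1.0"},
		"max_tokens":  {"8192"},
		"tool_choice": {"required"},
	}
	for i, cell := range ofatCells(baseline, alternatives) {
		fmt.Println(i, cell)
	}
}
```

The cell count grows with the sum of the alternative levels rather than their product, which is what keeps an OFAT matrix cheap enough to repeat each cell several times.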

Reasoning-Model Tuning Asymmetry: Why Thin Prompts Beat Rich Prompts (and When They Don't)

Reasoning-Model Tuning Asymmetry

Practitioners assume “better prompt = better output”. For one model class, that assumption is correct. For the other, the same prompt makes things measurably worse. This article documents the asymmetry, names the dividing line, and gives you a 4-cell test to confirm it on your own canary before you commit to a prompt.

The asymmetry is empirical, not theoretical. It shows up cleanly across four independent OFAT (one-factor-at-a-time) matrices run between 2026-05-18 and 2026-05-20: sonnet POC, grok matrix v1+v2, deepseek matrix v1, kimi matrix v1.

xAI Grok Operational Quirks: Error Shapes, Rate-Limit HTML, and Per-Model Tool Surfaces

xAI Grok Operational Quirks

xAI’s Grok API is OpenAI-compatible on paper. In practice it has more wire-format edge cases than any other provider in production: error responses change shape, rate-limit pages come back as HTML, assistant turns reject missing fields with HTTP 422, and the two flagship models (grok-4.3 and grok-4.20-reasoning) have incompatible parameter sets. Wrap it carelessly and the adapter crashes the conversation mid-turn.

This page is the production-confirmed quirks list, each as Symptom → Cause → Fix → Verify. Numbers come from two OFAT matrix runs (15 cells × N=3 baseline, 3 cells × N=5 validation) on api.x.ai and the heavy-tier POC. Full synthesis: ~/.claude/projects/-Users-mstather/memory/project_xai_adapter_wireerror_bug_2026_05_19.md and project_grok_matrix_v1_2026_05_19.md.

Temporal Namespaces and Task Queues: Organizing Workflows

Temporal Namespaces and Task Queues

Namespaces and task queues are Temporal’s two primary organizational mechanisms. Namespaces provide isolation – separate history, retention, and access. Task queues route work to specific workers. Together, they determine where workflows run and how long their history is kept.

For the underlying architecture, see Introduction to Temporal.

Namespaces

A namespace is a logical isolation boundary. Every workflow belongs to exactly one namespace. Namespaces provide history isolation (workflows cannot see across boundaries), independent retention policies, per-namespace search attributes, and scoped access control.

Your First Temporal Workflow in Go: DI, Idempotency, and the Worker Pattern

Your First Temporal Workflow in Go

This article establishes the patterns used throughout the Temporal series: dependency injection for testable activities, idempotency for safe retries, and a clean worker binary. Every subsequent article builds on these foundations.

All code lives in the companion repo at github.com/statherm/temporal-examples. For background, see Introduction to Temporal and Namespaces and Task Queues.

Project Structure

The companion repo organizes code by domain:

temporal-examples/
  cmd/worker/main.go         # Worker binary
  cmd/starter/main.go        # Workflow starter CLI
  internal/container/
    activities.go            # Activity implementations with DI
    workflow.go              # Workflow definitions
    types.go                 # Interfaces and types
  Makefile

Workflows and Activities

A workflow is a deterministic function that orchestrates work. It takes workflow.Context, must not perform side effects, and dispatches work through activities. Activities use standard context.Context and perform real I/O:

Testing Temporal Workflows: Unit Tests, Integration Tests, and the Test Environment

Testing Temporal Workflows

Temporal workflows have a property that most distributed systems lack: determinism. A workflow function, given the same inputs and the same sequence of activity results, will always produce the same output. This makes workflows far more testable than you might expect for code that orchestrates long-running, multi-step processes.

Activities are the opposite. They talk to databases, call APIs, read files, and produce side effects. You do not want your unit tests doing any of that. The testing strategy follows directly: test workflows by mocking their activities, and test activities by injecting mock dependencies.