Autonomy Tiers and Escalation as Runtime Contracts, Not Prompt Instructions

An agent is dispatched on a task it cannot complete. The spec is broken. The dependency is missing. The credentials are wrong. What happens next determines whether you have an autonomous fleet or a fleet that quietly fails.

The most common answer, instructing the agent in its prompt to “ask for help if stuck,” does not survive contact with production. Agents keep grinding and produce broken work, or output text that looks like a question but never reaches a human, or politely “complete” the task by writing nothing and reporting success. None of these failure modes is visible from the outside until the dashboards have been lying for hours.
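One way to picture escalation as a runtime contract is a structured result that the dispatcher validates before it accepts a task as finished. The following is a minimal sketch, not the article's implementation: `Outcome`, `TaskResult`, `ContractViolation`, and `accept` are hypothetical names, and the "success requires at least one artifact" rule is an assumed policy.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    DONE = "done"
    ESCALATED = "escalated"
    FAILED = "failed"


@dataclass
class TaskResult:
    outcome: Outcome
    artifacts: list[str]  # IDs or paths of work products (hypothetical field)
    reason: str = ""      # expected whenever the task did not finish


class ContractViolation(Exception):
    """Raised when an agent's report does not satisfy the runtime contract."""


def accept(result: TaskResult) -> TaskResult:
    # "Done" with nothing produced is the silent-success failure mode.
    if result.outcome is Outcome.DONE and not result.artifacts:
        raise ContractViolation("reported success but produced no artifacts")
    # An escalation or failure must say why, so it can be routed to a human.
    if result.outcome is not Outcome.DONE and not result.reason.strip():
        raise ContractViolation("escalation or failure without a reason")
    return result
```

Because the check runs in the dispatcher rather than in the prompt, an agent that writes nothing and reports success is rejected at the boundary instead of showing up as a green dashboard entry.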

Heterogeneous A/B/C/D Pool Dispatch: Real Model Comparison Without an Eval Harness

You need to know whether model-X is worth deploying for your real workload. The benchmarks suggest yes, but benchmarks are static and your workload is not. The standard answer, building an eval harness, runs into two structural problems: harnesses are expensive to build well, and they tend to overfit to the inputs you remembered to include in the corpus, missing the production failure modes you discover only later.
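As a rough illustration of comparing models on live work instead of a fixed corpus, the sketch below routes each task to one of four pools and tallies outcomes per pool. It is an assumption-laden toy: the pool names, the hash-based `assign_pool`, and the `Scoreboard` class are hypothetical and not taken from the article.

```python
import hashlib
from collections import defaultdict

POOLS = ("A", "B", "C", "D")  # hypothetical pools, one model per pool


def assign_pool(task_id: str, pools: tuple[str, ...] = POOLS) -> str:
    # A stable hash means a retried task lands in the same pool.
    digest = hashlib.sha256(task_id.encode("utf-8")).digest()
    return pools[int.from_bytes(digest[:8], "big") % len(pools)]


class Scoreboard:
    """Tallies real task outcomes per pool."""

    def __init__(self) -> None:
        self._counts = defaultdict(lambda: {"ok": 0, "total": 0})

    def record(self, pool: str, succeeded: bool) -> None:
        self._counts[pool]["total"] += 1
        if succeeded:
            self._counts[pool]["ok"] += 1

    def success_rate(self, pool: str) -> float:
        counts = self._counts[pool]
        return counts["ok"] / counts["total"] if counts["total"] else 0.0
```

Hashing the task ID is only one possible assignment rule; weighted or round-robin assignment would fit the same shape.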

Agentic Workflow Patterns: Plan-Execute-Observe Loops, ReAct, and Task Decomposition

Agentic Workflow Patterns

An agent without a workflow pattern is a chatbot. What separates an agent from a single-turn LLM call is the loop: observe the environment, reason about what to do, act, observe the result, and decide whether to continue. The loop structure determines everything – how the agent plans, how it recovers from errors, when it stops, and whether it can handle tasks that take minutes or hours.
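The observe, reason, act cycle described above can be sketched as a small driver function. This is a minimal illustration under stated assumptions: `run_agent` and its callback parameters are hypothetical names, `decide` returning `None` is the assumed stop signal, and `max_steps` is an assumed safety bound.

```python
from typing import Callable, Optional

Observation = str
Action = str


def run_agent(
    observe: Callable[[], Observation],
    decide: Callable[[Observation, list], Optional[Action]],
    act: Callable[[Action], None],
    max_steps: int = 10,
) -> list[tuple[Observation, Action]]:
    history: list[tuple[Observation, Action]] = []
    for _ in range(max_steps):          # hard cap so the loop always ends
        observation = observe()         # observe the environment
        action = decide(observation, history)  # reason about what to do
        if action is None:              # the agent decides it is finished
            break
        act(action)                     # act; the next pass observes the result
        history.append((observation, action))
    return history


# Toy usage: increment a counter until it reaches 3.
state = {"count": 0}
history = run_agent(
    observe=lambda: f"count={state['count']}",
    decide=lambda obs, hist: "increment" if state["count"] < 3 else None,
    act=lambda action: state.update(count=state["count"] + 1),
)
```

The step cap and the explicit stop signal are the two places where "when it stops" is decided; everything else about the pattern lives inside `decide`.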

Human-in-the-Loop Patterns: Approval Gates, Escalation, and Progressive Autonomy

Human-in-the-Loop Patterns

The most common failure mode in agent-driven work is not a wrong answer – it is a correct action taken without permission. An agent that deletes a file to “clean up,” force-pushes a branch to “fix history,” or restarts a service to “apply changes” can cause more damage in one unauthorized action than a dozen wrong answers.

Human-in-the-loop design is not about limiting agent capability. It is about matching autonomy to risk. Safe, reversible actions should proceed without interruption. Dangerous, irreversible actions should require explicit approval. The challenge is building this classification into the workflow without turning every action into a confirmation dialog.
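A minimal sketch of matching autonomy to risk is shown below. The action names, the `RISK_BY_ACTION` table, and the `execute` helper are hypothetical, and treating unknown actions as dangerous is an assumed default rather than something the article prescribes.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    SAFE = "safe"            # reversible, proceeds without interruption
    DANGEROUS = "dangerous"  # irreversible, needs explicit approval


# Hypothetical classification table; a real one would be project-specific.
RISK_BY_ACTION = {
    "read_file": Risk.SAFE,
    "run_tests": Risk.SAFE,
    "delete_file": Risk.DANGEROUS,
    "force_push": Risk.DANGEROUS,
    "restart_service": Risk.DANGEROUS,
}


def classify(action: str) -> Risk:
    # Assumption: anything not explicitly listed is treated as dangerous.
    return RISK_BY_ACTION.get(action, Risk.DANGEROUS)


def execute(
    action: str,
    run: Callable[[str], None],
    approve: Callable[[str], bool],
) -> str:
    # Only dangerous actions reach the human; safe ones never prompt.
    if classify(action) is Risk.DANGEROUS and not approve(action):
        return "denied"
    run(action)
    return "executed"
```

Keeping the classification in a table rather than in the prompt means the gate applies even when the agent is confident the action is harmless.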

MCP Server Development: Building Servers from Scratch

MCP Server Development

This reference covers building MCP servers from scratch – the server lifecycle, defining tools with proper JSON Schema, exposing resources, choosing transports, handling errors, and testing the result. If you want to understand when to use MCP versus alternatives, see the companion article on MCP Server Patterns. This article focuses on how to build one.

Server Lifecycle

An MCP server goes through four phases: initialization, capability negotiation, operation, and shutdown.
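The four phases can be pictured as a small state machine. The sketch below is illustrative only and does not use any MCP SDK: `Phase`, `ALLOWED`, and `Lifecycle` are hypothetical names, and allowing shutdown from every earlier phase is an assumption.

```python
from enum import Enum


class Phase(Enum):
    INITIALIZATION = "initialization"
    CAPABILITY_NEGOTIATION = "capability_negotiation"
    OPERATION = "operation"
    SHUTDOWN = "shutdown"


# Assumed transition table: phases run in order, and shutdown may be
# entered from any earlier phase (for example, when negotiation fails).
ALLOWED = {
    Phase.INITIALIZATION: {Phase.CAPABILITY_NEGOTIATION, Phase.SHUTDOWN},
    Phase.CAPABILITY_NEGOTIATION: {Phase.OPERATION, Phase.SHUTDOWN},
    Phase.OPERATION: {Phase.SHUTDOWN},
    Phase.SHUTDOWN: set(),
}


class Lifecycle:
    def __init__(self) -> None:
        self.phase = Phase.INITIALIZATION

    def advance(self, target: Phase) -> None:
        # Reject out-of-order transitions, such as serving tool calls
        # before capabilities have been negotiated.
        if target not in ALLOWED[self.phase]:
            raise ValueError(f"cannot go from {self.phase.name} to {target.name}")
        self.phase = target
```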

Multi-Agent Coordination: Patterns for Dividing and Conquering Infrastructure Tasks

Multi-Agent Coordination

A single agent can read files, call APIs, and reason about results. But some tasks are too broad, too slow, or too dangerous for one agent to handle alone. Debugging a production outage might require one agent analyzing logs, another checking infrastructure state, and a third reviewing recent deployments – simultaneously. Multi-agent coordination is how you split work across agents without them stepping on each other.

The hard part is not spawning multiple agents. The hard part is deciding which coordination pattern fits the task, how agents share information, and what happens when they disagree.
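The outage example, with three agents working simultaneously, corresponds to a fan-out and gather pattern. The sketch below uses `asyncio` with stub coroutines standing in for real agents; the function names, the returned dictionaries, and the timeout value are all hypothetical placeholders.

```python
import asyncio


# Stub agents: each would call a model or tool in a real system.
async def analyze_logs() -> dict:
    return {"agent": "logs", "finding": "stub"}


async def check_infra() -> dict:
    return {"agent": "infra", "finding": "stub"}


async def review_deploys() -> dict:
    return {"agent": "deploys", "finding": "stub"}


async def investigate(agents, timeout: float = 5.0):
    # Fan out: run every agent concurrently, each with its own deadline.
    tasks = [asyncio.wait_for(agent(), timeout) for agent in agents]
    # return_exceptions=True keeps one failing agent from cancelling the rest.
    results = await asyncio.gather(*tasks, return_exceptions=True)
    findings, failures = [], []
    for agent, result in zip(agents, results):
        if isinstance(result, Exception):
            failures.append((agent.__name__, repr(result)))
        else:
            findings.append(result)
    return findings, failures


findings, failures = asyncio.run(
    investigate([analyze_logs, check_infra, review_deploys])
)
```

This covers only the spawning and collection step. How the findings are shared and what happens when they conflict is the harder design question and is not addressed by this sketch.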

Tool Use Patterns: Choosing, Chaining, and Validating Agent Tools

Tool Use Patterns

An agent with access to 30 tools is not automatically more capable than one with 5. What matters is how it selects, sequences, and validates tool use. Poor tool use wastes tokens, introduces latency, and produces wrong results that look right.

Choosing the Right Tool

When multiple tools could handle a task, the agent must pick the best one. This is harder than it sounds because tool descriptions are imperfect and tasks are ambiguous.
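To make the ambiguity problem concrete, here is a deliberately naive sketch that ranks tools by word overlap between the task and each description and refuses to choose when the top two candidates are too close. In practice the model itself usually makes this choice; `pick_tool`, the `min_margin` parameter, and the scoring rule are hypothetical and exist only for illustration.

```python
import re
from typing import Optional


def _words(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def pick_tool(
    task: str, tools: dict[str, str], min_margin: int = 1
) -> Optional[str]:
    """Return the best-matching tool name, or None when the choice is unclear."""
    task_words = _words(task)
    scored = sorted(
        ((len(task_words & _words(desc)), name) for name, desc in tools.items()),
        reverse=True,
    )
    if not scored or scored[0][0] == 0:
        return None  # nothing matches the task at all
    if len(scored) > 1 and scored[0][0] - scored[1][0] < min_margin:
        return None  # top two candidates are too close to call
    return scored[0][1]
```

Returning `None` instead of guessing gives the caller a point at which to ask for clarification or fall back to a default.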

MCP Server Patterns: Building Tools for AI Agents

MCP Server Patterns

Model Context Protocol (MCP) is Anthropic’s open standard for connecting AI agents to external tools and data. Instead of every agent framework inventing its own tool integration format, MCP provides a single protocol that any agent can speak.

An agent that supports MCP can discover tools at runtime, understand their inputs and outputs, and invoke them – without hardcoded integration code for each tool.
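The discover-then-invoke flow can be sketched with an in-memory registry. This is not the MCP SDK or its wire format: `Tool`, `ToolRegistry`, `list_tools`, and `call_tool` are hypothetical names, and the only validation shown is a check of the schema's `required` keys.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str
    input_schema: dict[str, Any]  # JSON Schema describing the arguments
    handler: Callable[..., Any]


class ToolRegistry:
    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def list_tools(self) -> list[dict[str, Any]]:
        # What an agent would see at discovery time: metadata only.
        return [
            {"name": t.name, "description": t.description,
             "inputSchema": t.input_schema}
            for t in self._tools.values()
        ]

    def call_tool(self, name: str, arguments: dict[str, Any]) -> Any:
        tool = self._tools[name]  # raises KeyError for an unknown tool
        required = tool.input_schema.get("required", [])
        missing = [key for key in required if key not in arguments]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        return tool.handler(**arguments)
```

The agent needs no code specific to any one tool: it reads the list, picks a name, and passes arguments that match the advertised schema.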

Server Structure: Three Primitives

An MCP server exposes three types of capabilities: