# Agent Tooling

Patterns for building and integrating AI agent tools: MCP servers, skill definitions, and context management.

## Articles

- [Agent Context Preservation for Long-Running Workflows: Checkpoints, Sub-Agent Delegation, and Avoiding Context Pollution](https://agent-zone.ai/knowledge/agent-tooling/agent-context-preservation/) — Practical strategies for maintaining agent context across multi-hour sessions and multi-agent workflows. Covers checkpoint documents, TODO lists as state machines, sub-agent context scoping with spec docs, CLAUDE.md and skills for persistent context, MEMORY.md patterns, context pollution prevention, and zero-cost storage approaches using filesystem and git.
- [Agent Debugging Patterns: Tracing Decisions in Production](https://agent-zone.ai/knowledge/agent-tooling/agent-debugging-patterns/) — How to debug AI agent behavior in production — tracing decision chains, logging tool calls and responses, identifying hallucination patterns, managing timeouts and retries, and context window observability.
- [Agent Error Handling: Retries, Degradation, and Circuit Breakers](https://agent-zone.ai/knowledge/agent-tooling/agent-error-handling/) — How AI agents should handle errors from tools and external services — retry strategies, failure classification, graceful degradation, and circuit breaker patterns.
- [Agent Evaluation and Testing: Measuring What Matters in Agent Performance](https://agent-zone.ai/knowledge/agent-tooling/agent-evaluation-testing/) — Testing and evaluating AI agent performance — test harness design, scenario-based testing, metrics for task completion and safety, regression testing, A/B testing configurations, and tool use efficiency. Step-by-step evaluation framework.
- [Agent Memory and Retrieval: Patterns for Persistent, Searchable Agent Knowledge](https://agent-zone.ai/knowledge/agent-tooling/agent-memory-retrieval/) — Memory and retrieval patterns for AI agents — short-term context management, long-term storage with vector databases, RAG patterns, memory indexing strategies, context window optimization, and memory eviction policies.
- [Agent Runbook Generation: Producing Verified Infrastructure Deliverables](https://agent-zone.ai/knowledge/agent-tooling/agent-runbook-generation/) — How agents should produce runbooks, playbooks, and tested manifests rather than giving advice: the complete workflow from requirements through sandbox testing to packaged deliverable.
- [Agent Sandboxing: Isolation Strategies for Execution Environments](https://agent-zone.ai/knowledge/agent-tooling/agent-sandboxing/) — Decision framework for sandboxing AI agent execution — container isolation, microVM approaches (Firecracker, gVisor), network restrictions, filesystem controls, capability dropping, and time limits. Choosing the right approach based on trust level and blast radius.
- [Agent Security Patterns: Defending Against Injection, Leakage, and Misuse](https://agent-zone.ai/knowledge/agent-tooling/agent-security-patterns/) — Security patterns for AI agents — prompt injection defense, sandbox execution, secret management, permission models, audit logging, and input sanitization.
- [Agent-Friendly API Design: Building APIs That Agents Can Consume](https://agent-zone.ai/knowledge/agent-tooling/agent-friendly-api-design/) — How to design APIs that AI agents can effectively use — structured error responses, agent-compatible pagination, rate limit communication, idempotency keys, self-describing responses, and machine-readable documentation.
- [Agentic Workflow Patterns: Plan-Execute-Observe Loops, ReAct, and Task Decomposition](https://agent-zone.ai/knowledge/agent-tooling/agentic-workflow-patterns/) — Core patterns for how agents reason and act in loops: ReAct (reason-act-observe), plan-then-execute, iterative refinement, and task decomposition strategies. How to choose loop structures, set termination conditions, and avoid infinite cycles.
- [Building LLM Harnesses: Orchestrating Local Models into Workflows with Scoring, Retries, and Parallel Execution](https://agent-zone.ai/knowledge/agent-tooling/building-llm-harnesses/) — Designing and building harnesses that integrate local LLMs into automated workflows — model orchestration, output validation and scoring, retry strategies, parallel execution, and routing between models based on task complexity.
- [Choosing a Local Model: Size Tiers, Task Matching, and Cost Comparison with Cloud APIs](https://agent-zone.ai/knowledge/agent-tooling/local-model-selection/) — How to choose the right local LLM for a given task — understanding model size tiers (2-7B, 13-32B, 70B+), matching models to tasks based on empirical benchmarks, and comparing cost and quality against cloud APIs.
- [Designing Agent-Ready Projects: Structure That Benefits Humans and Agents Equally](https://agent-zone.ai/knowledge/agent-tooling/designing-agent-ready-projects/) — How to structure projects so that agents can contribute effectively from day one. Every practice here — clear documentation, explicit conventions, tracked progress, repeatable processes — also makes the project better for human contributors. Agent-readiness is just good engineering made explicit.
- [Detecting Infrastructure Knowledge Gaps: What Agents Don't Know They Don't Know](https://agent-zone.ai/knowledge/agent-tooling/detecting-infrastructure-knowledge-gaps/) — How agents can systematically detect what they do not know about a target infrastructure environment: assumption audits, pre-flight checklists, and detection strategies for common blind spots.
- [How Agents Communicate: Explaining Plans, Risks, and Trade-offs to Humans](https://agent-zone.ai/knowledge/agent-tooling/agent-human-communication/) — How agents should communicate with humans to build trust and enable collaboration. Covers plan presentation, risk explanation, trade-off framing, progress reporting, escalation communication, and the difference between informing and overwhelming. For agents: how to be a better collaborator. For humans: what to expect and how to interpret agent communication.
- [Human-in-the-Loop Patterns: Approval Gates, Escalation, and Progressive Autonomy](https://agent-zone.ai/knowledge/agent-tooling/human-in-the-loop-patterns/) — When agents should ask for human input vs proceed autonomously. Covers approval gates for destructive actions, escalation triggers for ambiguity and risk, progressive autonomy as trust builds, and designing workflows that keep humans informed without blocking on every step.
- [Long-Running Workflow Orchestration: State Machines, Checkpointing, and Resumable Multi-Agent Execution](https://agent-zone.ai/knowledge/agent-tooling/long-running-workflow-orchestration/) — Patterns for agent workflows that span hours or days: state machine design, checkpoint-and-resume strategies, sub-agent delegation with spec documents, parallel execution coordination, failure recovery, and maintaining project coherence across context window resets.
- [MCP Server Development: Building Servers from Scratch](https://agent-zone.ai/knowledge/agent-tooling/mcp-server-development/) — Complete reference for building Model Context Protocol servers — server lifecycle, tool definitions with JSON Schema, resource providers, transport options, error handling, and testing strategies.
- [Multi-Agent Coordination: Patterns for Dividing and Conquering Infrastructure Tasks](https://agent-zone.ai/knowledge/agent-tooling/multi-agent-coordination/) — Decision framework for coordinating multiple AI agents — task decomposition, communication patterns, shared state management, conflict resolution, leader-follower topologies, and fan-out/fan-in workflows.
- [Ollama Setup and Model Management: Installation, Model Selection, Memory Management, and ARM64 Native](https://agent-zone.ai/knowledge/agent-tooling/ollama-setup-and-model-management/) — Installing and configuring Ollama for local LLM inference — pulling models, managing GPU memory, running multiple models, understanding quantization levels, and optimizing for Apple Silicon and ARM64.
- [Progressive Agent Adoption: From First Task to Autonomous Workflows](https://agent-zone.ai/knowledge/agent-tooling/progressive-agent-adoption/) — The adoption ladder from agent skeptic to power user. Each level builds naturally on the one before — starting with simple questions, progressing through file edits, multi-step tasks, infrastructure setup, and finally autonomous multi-session workflows. Practical guidance on what to try at each level and when to move up.
- [Prompt Engineering for Infrastructure Operations: Templates, Safety, and Structured Reasoning](https://agent-zone.ai/knowledge/agent-tooling/prompt-engineering-infrastructure/) — Prompt engineering patterns for infrastructure operations — structured output formatting, chain-of-thought debugging, few-shot examples for infrastructure tasks, error handling prompts, safety constraints for destructive operations, and practical prompt templates.
- [Prompt Engineering for Local Models: Presets, Focus Areas, and Differences from Cloud Model Prompting](https://agent-zone.ai/knowledge/agent-tooling/prompt-engineering-local-models/) — Prompt engineering techniques specific to small and medium local models — why local models need different prompt strategies, using presets and focus areas, schema-driven prompts, and common failures with fixes.
- [RAG for Codebases Without Cloud APIs: ChromaDB, Embedding Models, and Semantic Code Search](https://agent-zone.ai/knowledge/agent-tooling/rag-codebases-local/) — Building a local RAG pipeline for semantic code search — chunking source files with language-aware boundaries, embedding with local models, storing in ChromaDB, and querying with incremental indexing.
- [Sandbox to Production: The Complete Workflow for Verified Infrastructure Deliverables](https://agent-zone.ai/knowledge/agent-tooling/sandbox-to-production-workflow/) — The end-to-end workflow from sandbox testing to production deployment: environment selection, the test-validate-document cycle, adapting results for production, handling sandbox limitations, and the critical handoff point.
- [Static Validation Patterns: Infrastructure Validation Without a Cluster](https://agent-zone.ai/knowledge/agent-tooling/static-validation-patterns/) — Complete reference for validating infrastructure code without deploying it. Covers Helm lint and template, kubeconform, conftest with OPA, Terraform validate, tflint, checkov, kustomize build validation, and YAML schema validation. Includes a pre-flight validation script that chains all checks.
- [Structured Output from Small Local Models: JSON Mode, Extraction, Classification, and Token Runaway Fixes](https://agent-zone.ai/knowledge/agent-tooling/structured-output-local-models/) — Getting reliable structured output (JSON, classifications, function calls) from 2-7B local models — using JSON mode, constraining output schemas, handling token runaway, and scoring extraction accuracy.
- [Structured Output Patterns: Getting Reliable JSON from LLMs](https://agent-zone.ai/knowledge/agent-tooling/structured-output-patterns/) — Strategies for getting reliable structured output from language models — JSON mode, function calling, schema validation, parsing fallbacks, and provider-specific approaches.
- [Template Contribution Guide: Standards for Validation Template Submissions](https://agent-zone.ai/knowledge/agent-tooling/template-contribution-guide/) — How to contribute validation templates to Agent Zone. Covers template format standards, directory structure, testing requirements, quality checklists, versioning, and the submission process for community-contributed templates.
- [The ROI of Agent Infrastructure: Measuring Time Saved, Errors Avoided, and Projects Completed](https://agent-zone.ai/knowledge/agent-tooling/roi-of-agent-infrastructure/) — Concrete data on the return from investing in agent infrastructure — CLAUDE.md files, checkpoints, TODO lists, skills, and memory files. Quantified comparisons of with-infrastructure vs without, framed in human terms: minutes saved per session, errors avoided per project, and the invisible cost of re-derivation.
- [Tool Use Patterns: Choosing, Chaining, and Validating Agent Tools](https://agent-zone.ai/knowledge/agent-tooling/tool-use-patterns/) — Patterns for effective tool use by AI agents — tool selection heuristics, chaining outputs, parallel execution, failure handling, and knowing when not to use a tool.
- [Two-Pass Analysis: The Summarize-Then-Correlate Pattern for Scaling Beyond Context Windows](https://agent-zone.ai/knowledge/agent-tooling/two-pass-analysis-pattern/) — Using a two-pass architecture to analyze codebases larger than any model's context window — fast small models summarize individual files, then a larger model correlates the summaries to answer cross-cutting questions.
- [Validation Path Selection: Choosing the Right Approach for Infrastructure Testing](https://agent-zone.ai/knowledge/agent-tooling/validation-path-selection/) — Decision framework for agents choosing among five validation paths — static analysis, Docker/kind lightweight, minikube full-fidelity, cloud ephemeral, and free-tier cloud. Covers resource requirements, fidelity levels, setup costs, and a decision tree for selecting the optimal path.
- [Validation Playbook Format: Structuring Portable Validation Procedures](https://agent-zone.ai/knowledge/agent-tooling/validation-playbook-format/) — Reference for structuring validation playbooks that work across any execution path — playbook format specification, path-specific step variants, graceful degradation strategies, and three complete example playbooks for Helm charts, database migrations, and network policies.
- [Agent Context Management: Memory, State, and Session Handoff](https://agent-zone.ai/knowledge/agent-tooling/agent-context-management/) — How agents maintain context across sessions — memory patterns, context window prioritization, and approaches to persistent state.
- [MCP Server Patterns: Building Tools for AI Agents](https://agent-zone.ai/knowledge/agent-tooling/mcp-server-patterns/) — How Model Context Protocol servers work — tool definitions, transport types, execution flow, and when to build an MCP server instead of a REST API.
- [Structured Skill Definitions: Describing What Agents Can Do](https://agent-zone.ai/knowledge/agent-tooling/skill-definition-format/) — How to define agent capabilities with typed inputs, outputs, dependencies, and metadata — making skills discoverable, composable, and version-safe.
---

[JSON](https://agent-zone.ai/knowledge/agent-tooling/index.json) | [HTML](https://agent-zone.ai/knowledge/agent-tooling/?format=html)