Agent Memory and Retrieval: Patterns for Persistent, Searchable Agent Knowledge

Agent Memory and Retrieval#

An agent without memory repeats mistakes, forgets context, and relearns the same facts every session. An agent with too much memory wastes context window tokens on irrelevant history and retrieves noise instead of signal. Effective memory sits between these extremes – storing what matters, retrieving what is relevant, and forgetting what is stale.

This reference covers the concrete patterns for building agent memory systems, from simple file-based approaches to production-grade retrieval pipelines.

Agent Runbook Generation: Producing Verified Infrastructure Deliverables

Agent Runbook Generation#

An agent that says “you should probably add a readiness probe to your deployment” is giving advice. An agent that hands you a tested manifest with the readiness probe configured, verified against a real cluster, with rollback steps if the probe misconfigures – that agent is producing a deliverable. The difference matters.

The core thesis of infrastructure agent work is that the output is always a deliverable – a runbook, playbook, tested manifest, or validated configuration – never a direct action on someone else’s systems. This article covers the complete workflow for generating those deliverables: understanding requirements, planning steps, executing in a sandbox, capturing what worked, and packaging the result.

Agent Sandboxing: Isolation Strategies for Execution Environments

Agent Sandboxing#

An AI agent that can execute code, run shell commands, or call APIs needs a sandbox. Without one, a single bad tool call – whether from a bug, a hallucination, or a prompt injection attack – can read secrets, modify production data, or pivot to other systems. This article is a decision framework for choosing the right sandboxing strategy based on your trust level, threat model, and performance requirements.

Agent Security Patterns: Defending Against Injection, Leakage, and Misuse

Agent Security Patterns#

An AI agent with tool access is a program that can read files, call APIs, execute code, and modify systems – driven by natural language input. Every classic security concern applies, plus new attack surfaces unique to LLM-powered systems. This article covers practical defenses, not theoretical risks.

Prompt Injection Defense#

Prompt injection is the most agent-specific security threat. An attacker embeds instructions in data the agent processes – a file, a web page, an API response – and the agent follows those instructions as if they came from the user.

Agent-Friendly API Design: Building APIs That Agents Can Consume

Agent-Friendly API Design#

Most APIs are designed for human developers who read documentation, interpret ambiguous error messages, and adapt their approach based on experience. Agents do not have these skills. They parse structured responses, follow explicit instructions, and fail on ambiguity. An API that is pleasant for humans to use may be impossible for an agent to use reliably.

This reference covers practical patterns for designing APIs – or modifying existing ones – so that agents can consume them effectively.

Agent-Oriented Terraform: Linear Patterns for Machine-Managed Infrastructure

Agent-Oriented Terraform#

Most Terraform code is written by humans for humans. It favors abstraction, DRY principles, and deep module nesting — patterns that make sense when a human maintains a mental model of the codebase. Agents do not maintain mental models. They read code fresh each time, trace references to resolve dependencies, and reason about the full resource graph in a single context window.

The patterns that make Terraform elegant for humans make it expensive for agents. Deep module nesting multiplies the files an agent must read. Variable threading through three layers of modules hides dependencies behind indirection. Complex for_each over maps of objects creates resources that are invisible until runtime. The agent spends most of its context on navigation, not comprehension.

Agentic Workflow Patterns: Plan-Execute-Observe Loops, ReAct, and Task Decomposition

Agentic Workflow Patterns#

An agent without a workflow pattern is a chatbot. What separates an agent from a single-turn LLM call is the loop: observe the environment, reason about what to do, act, observe the result, and decide whether to continue. The loop structure determines everything – how the agent plans, how it recovers from errors, when it stops, and whether it can handle tasks that take minutes or hours.

AKS Identity and Security: Entra ID, Workload Identity, and Policy

AKS Identity and Security#

AKS identity operates at three levels: who can access the cluster API (authentication), what they can do inside it (authorization), and how pods authenticate to Azure services (workload identity). Each level has Azure-specific mechanisms that replace or extend vanilla Kubernetes patterns.

Entra ID Integration (Azure AD)#

AKS supports two Entra ID integration modes.

AKS-managed Azure AD: Enable with --enable-aad at cluster creation. AKS handles the app registrations and token validation. This is the recommended approach.

AKS Networking and Ingress Deep Dive

AKS Networking and Ingress#

AKS networking involves three layers: how pods communicate (CNI plugin), how traffic enters the cluster (load balancers and ingress controllers), and how the cluster connects to other Azure resources (VNet integration, private endpoints). Each layer has Azure-specific behavior that differs from generic Kubernetes.

Azure Load Balancer for Services#

When you create a Service of type LoadBalancer in AKS, Azure provisions a Standard SKU Azure Load Balancer. AKS manages the load balancer rules and health probes automatically.

AKS Setup and Configuration: Clusters, Node Pools, and Networking

AKS Setup and Configuration#

Azure Kubernetes Service handles the control plane for you – you pay nothing for it. What you configure is node pools, networking, identity, and add-ons. Getting these right at cluster creation matters because several choices (networking model, managed identity) cannot be changed later without rebuilding the cluster.

Creating a Cluster with az CLI#

The minimal command that produces a production-usable cluster:

az aks create \
  --resource-group myapp-rg \
  --name myapp-aks \
  --location eastus2 \
  --node-count 3 \
  --node-vm-size Standard_D4s_v5 \
  --network-plugin azure \
  --network-plugin-mode overlay \
  --vnet-subnet-id /subscriptions/<sub>/resourceGroups/myapp-rg/providers/Microsoft.Network/virtualNetworks/myapp-vnet/subnets/aks-subnet \
  --enable-managed-identity \
  --enable-aad \
  --aad-admin-group-object-ids <admin-group-id> \
  --generate-ssh-keys \
  --tier standard

Key flags: --network-plugin azure --network-plugin-mode overlay gives you Azure CNI Overlay, which avoids the IP exhaustion problems of classic Azure CNI. --tier standard enables the financially-backed SLA and uptime guarantees (the free tier has no SLA). --enable-aad integrates Entra ID (formerly Azure AD) for authentication.