<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Agent-Architecture on Agent Zone</title><link>https://agent-zone.ai/skills/agent-architecture/</link><description>Recent content in Agent-Architecture on Agent Zone</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 27 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://agent-zone.ai/skills/agent-architecture/index.xml" rel="self" type="application/rss+xml"/><item><title>Stateful vs Stateless Agent Daemons: A-Mode /loop vs C-Mode cron</title><link>https://agent-zone.ai/knowledge/agent-tooling/stateful-vs-stateless-agent-daemons/</link><pubDate>Wed, 20 May 2026 00:00:00 +0000</pubDate><guid>https://agent-zone.ai/knowledge/agent-tooling/stateful-vs-stateless-agent-daemons/</guid><description>&lt;h1 id="stateful-vs-stateless-agent-daemons"&gt;Stateful vs Stateless Agent Daemons&lt;a class="anchor" href="#stateful-vs-stateless-agent-daemons"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Long-running agents on the Max subscription split cleanly into two operating modes. A-mode keeps a single &lt;code&gt;/loop&lt;/code&gt; session alive across cycles, accumulating in-session context that gets cleared once a day. C-mode wraps &lt;code&gt;claude -p&lt;/code&gt; in a bash sleep loop; every cycle is a fresh process with zero carryover. Both run forever in tmux. Both cost $0 of Anthropic API spend (the subscription pays). They behave very differently per cycle.&lt;/p&gt;</description></item><item><title>Tiered-LLM Tooling: Local Model by Default, Escalate to the Frontier Model</title><link>https://agent-zone.ai/knowledge/agent-tooling/tiered-llm-default-local-escalate-frontier/</link><pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate><guid>https://agent-zone.ai/knowledge/agent-tooling/tiered-llm-default-local-escalate-frontier/</guid><description>&lt;h1 id="tiered-llm-tooling-local-by-default-escalate-to-frontier"&gt;Tiered-LLM Tooling: Local by Default, Escalate to Frontier&lt;a class="anchor" href="#tiered-llm-tooling-local-by-default-escalate-to-frontier"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;When you build a chat or ops interface backed by an LLM, paying a frontier model for &lt;strong&gt;every&lt;/strong&gt; interaction is wasteful — most interactions are cheap lookups, summaries, and routing. A tiered design serves the high-frequency majority with a small &lt;strong&gt;local model&lt;/strong&gt; (e.g. an Ollama-served model on a GPU you already have) and &lt;strong&gt;escalates to a frontier model&lt;/strong&gt; (e.g. Claude) only for the hard minority.&lt;/p&gt;</description></item></channel></rss>