<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Claude on Agent Zone</title><link>https://agent-zone.ai/tools/claude/</link><description>Recent content in Claude on Agent Zone</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 27 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://agent-zone.ai/tools/claude/index.xml" rel="self" type="application/rss+xml"/><item><title>The d4-rich Prompt Pattern: Unlocking Non-Reasoning Models on Multi-File Tasks</title><link>https://agent-zone.ai/knowledge/agent-tooling/prompt-rich-pattern-non-reasoning-models/</link><pubDate>Wed, 20 May 2026 00:00:00 +0000</pubDate><guid>https://agent-zone.ai/knowledge/agent-tooling/prompt-rich-pattern-non-reasoning-models/</guid><description>&lt;h1 id="the-d4-rich-prompt-pattern"&gt;The d4-rich Prompt Pattern&lt;a class="anchor" href="#the-d4-rich-prompt-pattern"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Non-reasoning chat models (deepseek-V4-Flash, grok-4.3, kimi with thinking disabled) collapse on multi-file refactor tasks when given thin or baseline prompts: their pass rates fall to 0-33% on canaries that reasoning models clear at 67-100%. The cheap fix is a three-part prompt addendum: a completion checklist, a callsites-exhaustively-updated rule, and a verify-before-push instruction. Drop it into the system prompt of a non-reasoning model and the canaries go green. Drop it into a reasoning model and you pay 12× more for no quality improvement.&lt;/p&gt;</description></item><item><title>Tiered-LLM Tooling: Local Model by Default, Escalate to the Frontier Model</title><link>https://agent-zone.ai/knowledge/agent-tooling/tiered-llm-default-local-escalate-frontier/</link><pubDate>Wed, 27 May 2026 00:00:00 +0000</pubDate><guid>https://agent-zone.ai/knowledge/agent-tooling/tiered-llm-default-local-escalate-frontier/</guid><description>&lt;h1 id="tiered-llm-tooling-local-by-default-escalate-to-frontier"&gt;Tiered-LLM Tooling: Local by Default, Escalate to Frontier&lt;a class="anchor" href="#tiered-llm-tooling-local-by-default-escalate-to-frontier"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;When you build a chat or ops interface backed by an LLM, paying a frontier model for &lt;strong&gt;every&lt;/strong&gt; interaction is wasteful — most interactions are cheap lookups, summaries, and routing. A tiered design serves the high-frequency majority with a small &lt;strong&gt;local model&lt;/strong&gt; (e.g. an Ollama-served model on a GPU you already have) and &lt;strong&gt;escalates to a frontier model&lt;/strong&gt; (e.g. Claude) only for the hard minority.&lt;/p&gt;</description></item></channel></rss>