<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Cost-Modeling on Agent Zone</title><link>https://agent-zone.ai/tags/cost-modeling/</link><description>Recent content in Cost-Modeling on Agent Zone</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Wed, 20 May 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://agent-zone.ai/tags/cost-modeling/index.xml" rel="self" type="application/rss+xml"/><item><title>DeepSeek V4 Operational Quirks: Pro vs Flash, Reasoning Echo, and the Discount Cliff</title><link>https://agent-zone.ai/knowledge/agent-tooling/deepseek-v4-operational-quirks/</link><pubDate>Wed, 20 May 2026 00:00:00 +0000</pubDate><guid>https://agent-zone.ai/knowledge/agent-tooling/deepseek-v4-operational-quirks/</guid><description>&lt;h1 id="deepseek-v4-operational-quirks"&gt;DeepSeek V4 Operational Quirks&lt;a class="anchor" href="#deepseek-v4-operational-quirks"&gt;#&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;DeepSeek V4 ships two models behind one OpenAI-compatible API: V4-Pro (reasoning) at $1.74/M input / $3.48/M output and V4-Flash (chat) at $0.28/M input / $1.10/M output. Until 2026-05-31 V4-Pro carries a 75% discount, putting it at $0.435/M input — cheap enough to use as a heavy-tier coding model. After that, the cost steps up 4×.&lt;/p&gt;
&lt;p&gt;The two models live on the same endpoint but want very different things. V4-Pro behaves like a reasoning model (thin prompts, reasoning_content echo required, tool_choice restrictions). V4-Flash behaves like a chat model (rich prompts win dramatically; rejects nothing). Confuse them and your matrix lights up red.&lt;/p&gt;</description></item></channel></rss>