DeepSeek V4 Operational Quirks: Pro vs Flash, Reasoning Echo, and the Discount Cliff

Wed, 20 May 2026 00:00:00 +0000

DeepSeek V4 Operational Quirks#

DeepSeek V4 ships two models behind one OpenAI-compatible API: V4-Pro (reasoning) at $1.74/M input / $3.48/M output and V4-Flash (chat) at $0.28/M input / $1.10/M output. Until 2026-05-31 V4-Pro carries a 75% discount, putting it at $0.435/M input — cheap enough to use as a heavy-tier coding model. After that, the cost steps up 4×.

The two models live on the same endpoint but want very different things. V4-Pro behaves like a reasoning model (thin prompts, reasoning_content echo required, tool_choice restrictions). V4-Flash behaves like a chat model (rich prompts win dramatically; rejects nothing). Confuse them and your matrix lights up red.

Cost-Modeling on Agent Zone

DeepSeek V4 Operational Quirks: Pro vs Flash, Reasoning Echo, and the Discount Cliff

DeepSeek V4 Operational Quirks#