On May 25, 2026, DeepSeek made the 75% introductory discount on V4-Pro permanent. The model launched April 24 at $1.74 per million input tokens; the promotional rate of $0.435/M input and $0.87/M output is now the standard price. Cache hits dropped to $0.003625/M — effectively free for repeat context.
On SWE-bench Verified, Opus 4.7 (Adaptive) holds the Anthropic lead at 87.6%. DeepSeek V4-Pro-Max scores 80.6%: a 7-point gap, not parity. But with V4-Pro-Max at roughly 1/14th of Opus 4.7's input price and standard V4-Pro at 1/34th, the question for most coding workflows isn't "is it as good?" but "is it good enough for the 90% of tasks that don't need Opus-grade reasoning?" For indie devs running $100-200/month Claude bills on agentic loops, that reframe changes the math.
Policy Diff
Vendor: DeepSeek
Change: V4-Pro API pricing permanently reduced to 25% of launch price. No longer promotional.
Effective: May 25, 2026. DeepSeek's pricing page still describes it as a "75% discount promotion" ending 2026/05/31, but Engadget and The Next Web report the post-promo price will match the current discounted rate — effectively locking in 1/4 pricing permanently.
Affected: Anyone paying per-token for coding models. Claude, GPT, and Gemini users now have a benchmark-competitive alternative at a fraction of the cost.
V4-Flash unchanged: $0.14/M input, $0.28/M output. Already the cheapest small model on the market.
Sources: Simon Willison's V4 overview (launch-day analysis). HN discussion (614 points, 544 comments as of this issue).
Price Watch
The permanent discount changes the cost landscape. Token Tape mapped the per-million-token rates across the models that matter for agentic coding work.
| Model | Input $/M | Output $/M | SWE-bench Verified | Context |
|---|---|---|---|---|
| Claude Opus 4.7 (Adaptive) | $15.00 | $75.00 | 87.6% | 200K |
| DeepSeek V4-Pro-Max | ~$1.09 | ~$1.09 | 80.6% | 1M |
| Claude Opus 4.6 | — | — | 80.8% | 200K |
| DeepSeek V4-Pro | $0.435 | $0.87 | 73.6% | 1M |
| DeepSeek V4-Flash | $0.14 | $0.28 | — | 1M |
| Claude Sonnet 4.6 | $3.00 | $15.00 | 79.6% | 200K |
| GPT-5.4 | $2.50 | $10.00 | — | 128K |
V4-Pro-Max at 80.6% sits level with Opus 4.6 (80.8%) and 7 points behind Opus 4.7, the current Anthropic leader, at roughly 1/14th of Opus 4.7's input cost. Standard V4-Pro at 73.6% is 14 points back, but at 1/34th the input cost. The trade-off is real: the gap exists, and it shows on complex multi-file architectural tasks. But for the single-file edits, test generation, and boilerplate-heavy coding work that make up most agentic sessions, the gap is less visible than the price difference.
SWE-bench scores from BenchLM.ai leaderboard, May 2026.
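The cost ratios quoted above can be reproduced from the table. The sketch below is only arithmetic on the listed prices; the `PRICES` dictionary and the `price_ratio` helper are illustrative names, and the V4-Pro-Max figures are the approximate values from the table.

```python
# Per-million-token prices in USD, copied from the Price Watch table.
# V4-Pro-Max values are approximate in the table itself.
PRICES = {
    "Claude Opus 4.7 (Adaptive)": {"input": 15.00, "output": 75.00},
    "Claude Sonnet 4.6": {"input": 3.00, "output": 15.00},
    "GPT-5.4": {"input": 2.50, "output": 10.00},
    "DeepSeek V4-Pro-Max": {"input": 1.09, "output": 1.09},
    "DeepSeek V4-Pro": {"input": 0.435, "output": 0.87},
    "DeepSeek V4-Flash": {"input": 0.14, "output": 0.28},
}


def price_ratio(expensive: str, cheap: str, kind: str = "input") -> float:
    """Return how many times more `expensive` costs than `cheap` per token."""
    return PRICES[expensive][kind] / PRICES[cheap][kind]


if __name__ == "__main__":
    baseline = "Claude Opus 4.7 (Adaptive)"
    for model in PRICES:
        if model == baseline:
            continue
        ratio_in = price_ratio(baseline, model, "input")
        ratio_out = price_ratio(baseline, model, "output")
        print(f"{model}: input 1/{ratio_in:.1f}, output 1/{ratio_out:.1f} of {baseline}")
```

The input ratios come out near 34 for V4-Pro and near 14 for V4-Pro-Max. On the output side the spread is wider (about 86x for V4-Pro), which matters for agentic loops that generate a lot of code.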
One HN commenter reported running V4-Flash through Claude Code's Anthropic-compatible endpoint at $1 USD/week for 3 hours/day of coding. That's anecdotal and reflects one user's workflow, but directionally it illustrates the cost floor V4-Flash enables: about $52 a year, well below the annual cost of a Claude Pro subscription.
The trap to watch: DeepSeek's models are strong on benchmarks but handle complex multi-file architectural reasoning differently than Opus. The hybrid pattern — V4-Pro for the volume of coding work, Opus for the 10% that needs it — is what the HN thread converges on. That's the defensible position, not an all-or-nothing switch.
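One way to picture the hybrid pattern is a small routing function that sends most tasks to V4-Pro and escalates the rest. This is a minimal sketch, not anything DeepSeek or the HN thread prescribes: the `Task` fields, the three-file threshold, and the `ESCALATION_MODEL` label are all assumptions made for illustration.

```python
from dataclasses import dataclass

DEFAULT_MODEL = "deepseek-v4-pro"
# Placeholder label for the Opus tier; not a verified API model identifier.
ESCALATION_MODEL = "claude-opus-4.7"


@dataclass
class Task:
    description: str
    files_touched: int
    architectural: bool = False  # set by the developer or a triage step


def pick_model(task: Task, max_files_for_default: int = 3) -> str:
    """Route a task to the cheap default unless it looks Opus-grade.

    The heuristic is hypothetical: escalate when the task is flagged as
    architectural or spans more files than the threshold allows.
    """
    if task.architectural or task.files_touched > max_files_for_default:
        return ESCALATION_MODEL
    return DEFAULT_MODEL


if __name__ == "__main__":
    queue = [
        Task("generate unit tests for parser.py", files_touched=1),
        Task("rename a config key across the repo", files_touched=5),
        Task("split the monolith into services", files_touched=12, architectural=True),
    ]
    for task in queue:
        print(f"{pick_model(task):>16}  {task.description}")
```

In practice the threshold would be tuned against your own failure cases: if V4-Pro keeps needing retries on a class of task, that class belongs on the escalation side.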
Harness Compatibility
DeepSeek V4 support across coding harnesses, as of this issue.
Cursor: Native support. V4-Pro available in the model picker.
Windsurf: Native support. Added V4-Pro to the model picker on April 25, one day after launch.
Aider: Native support. V4-Pro works out of the box.
Claude Code: Works via DeepSeek's Anthropic-compatible API endpoint. Set `ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic` and use model name `deepseek-v4-pro` or `deepseek-v4-flash`. DeepSeek's setup guide documents this explicitly.
OpenClaw: Native support per DeepSeek's documentation.
Codex CLI: Not supported. OpenAI ecosystem only.
Gemini CLI: Not supported. Google ecosystem only.
DeepSeek Code Harness (announced, not available): On May 20, DeepSeek researcher Deli Chen posted job listings for a Harness Product Manager and Harness R&D Engineer. The framing: "Model + Harness = Agent." No release date, no beta — this is a hiring announcement, not a product launch. Token Tape will track it as it materializes.
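For the Claude Code entry above, the configuration amounts to a handful of environment variables. The sketch below builds that environment in Python. Only `ANTHROPIC_BASE_URL` and the two model names come from this issue; `ANTHROPIC_AUTH_TOKEN` and `ANTHROPIC_MODEL` are assumed variable names, so check DeepSeek's setup guide before relying on them.

```python
import os
from typing import Dict, Optional

# Endpoint and model names as given in the Claude Code entry.
DEEPSEEK_ANTHROPIC_BASE_URL = "https://api.deepseek.com/anthropic"
SUPPORTED_MODELS = {"deepseek-v4-pro", "deepseek-v4-flash"}


def claude_code_env(
    model: str, api_key: str, base: Optional[Dict[str, str]] = None
) -> Dict[str, str]:
    """Return a copy of the environment pointing Claude Code at DeepSeek."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {sorted(SUPPORTED_MODELS)}")
    env = dict(os.environ if base is None else base)
    env["ANTHROPIC_BASE_URL"] = DEEPSEEK_ANTHROPIC_BASE_URL
    # Assumption: these two variable names are not confirmed by this issue.
    env["ANTHROPIC_AUTH_TOKEN"] = api_key
    env["ANTHROPIC_MODEL"] = model
    return env


if __name__ == "__main__":
    env = claude_code_env("deepseek-v4-pro", api_key="sk-your-key-here", base={})
    for name, value in sorted(env.items()):
        shown = "***" if name == "ANTHROPIC_AUTH_TOKEN" else value
        print(f"{name}={shown}")
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when launching the harness, which keeps the override scoped to one process instead of your whole shell.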
This Week's Switch
Switch: DeepSeek V4-Pro via Cursor or Aider for the bulk of coding work.
Why: V4-Pro handles the volume of coding work at $0.435/M input. Opus 4.7 stays in the stack for architectural decisions and complex multi-file reasoning where the 7-14 point SWE-bench gap shows. The math doesn't justify running every task through a $15/M model.
Link: cursor.com (Cursor) / aider.chat (Aider)
The hybrid stack is the defensible play. V4-Pro handles the volume work where the benchmark gap doesn't materially affect output quality. Opus handles the architectural and multi-file reasoning tasks where it does. Monthly bill for an indie dev running this pattern: $20-50 instead of $100-200.
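To sanity-check a bill like that against your own usage, a blended-cost estimate is enough. In the sketch below, the prices are the Opus 4.7 and V4-Pro rates from Price Watch. The monthly token volumes and the 10% Opus share are hypothetical inputs, and cache-hit pricing is ignored.

```python
# USD per million tokens, from the Price Watch table.
OPUS = {"input": 15.00, "output": 75.00}
V4_PRO = {"input": 0.435, "output": 0.87}


def model_cost(prices: dict, input_m: float, output_m: float) -> float:
    """Cost in USD for the given millions of input and output tokens."""
    return input_m * prices["input"] + output_m * prices["output"]


def hybrid_cost(input_m: float, output_m: float, opus_share: float = 0.10) -> float:
    """Blend Opus and V4-Pro, sending `opus_share` of all tokens to Opus."""
    if not 0.0 <= opus_share <= 1.0:
        raise ValueError("opus_share must be between 0 and 1")
    opus_part = model_cost(OPUS, input_m * opus_share, output_m * opus_share)
    v4_part = model_cost(V4_PRO, input_m * (1 - opus_share), output_m * (1 - opus_share))
    return opus_part + v4_part


if __name__ == "__main__":
    # Hypothetical month: 7M input tokens and 1.2M output tokens.
    input_m, output_m = 7.0, 1.2
    print(f"All Opus:     ${hybrid_cost(input_m, output_m, opus_share=1.0):.2f}")
    print(f"90/10 hybrid: ${hybrid_cost(input_m, output_m, opus_share=0.10):.2f}")
```

With those illustrative volumes, the all-Opus month comes to $195.00 and the 90/10 split to about $23.18. Almost all of the hybrid bill is still the Opus slice, so the escalation rate is the number to watch.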