On May 25, 2026, DeepSeek made the 75% introductory discount on V4-Pro permanent. The model launched April 24 at $1.74 per million input tokens; the promotional rate of $0.435/M input and $0.87/M output is now the standard price. Cache hits dropped to $0.003625/M — effectively free for repeat context.

On SWE-bench Verified, Opus 4.7 (Adaptive) holds the Anthropic lead at 87.6%. DeepSeek V4-Pro-Max scores 80.6% — a 7-point gap, not parity. But at 1/34th the input cost, the question for most coding workflows isn't "is it as good?" but "is it good enough for the 90% of tasks that don't need Opus-grade reasoning?" For indie devs running $100-200/month Claude bills on agentic loops, that reframe changes the math.

Policy Diff

  • Vendor: DeepSeek

  • Change: V4-Pro API pricing permanently reduced to 25% of launch price. No longer promotional.

  • Effective: May 25, 2026. DeepSeek's pricing page still describes it as a "75% discount promotion" ending 2026/05/31, but Engadget and The Next Web report the post-promo price will match the current discounted rate — effectively locking in 1/4 pricing permanently.

  • Affected: Anyone paying per-token for coding models. Claude, GPT, and Gemini users now have a benchmark-competitive alternative at a fraction of the cost.

  • V4-Flash unchanged: $0.14/M input, $0.28/M output. Already the cheapest small model on the market.

  • Sources: Simon Willison's V4 overview (launch-day analysis). HN discussion (614 points, 544 comments as of this issue).

Price Watch

The permanent discount changes the cost landscape. Token Tape mapped the per-million-token rates across the models that matter for agentic coding work.

Model

Input $/M

Output $/M

SWE-bench Verified

Context

Claude Opus 4.7 (Adaptive)

$15.00

$75.00

87.6%

200K

DeepSeek V4-Pro-Max

~$1.09

~$1.09

80.6%

1M

Claude Opus 4.6

80.8%

200K

DeepSeek V4-Pro

$0.435

$0.87

73.6%

1M

DeepSeek V4-Flash

$0.14

$0.28

1M

Claude Sonnet 4.6

$3.00

$15.00

79.6%

200K

GPT-5.4

$2.50

$10.00

128K

V4-Pro-Max at 80.6% sits right next to Opus 4.6 and a little behind Opus 4.7 on SWE-bench — 7 points behind the current Anthropic leader — but at roughly 1/14th the input cost. Standard V4-Pro at 73.6% is further back but at 1/34th the input cost. The trade-off is real: the gap exists, and it shows on complex multi-file architectural tasks. But for the volume of single-file edits, test generation, and boilerplate-heavy coding work that makes up most agentic sessions, the gap is less visible than the price difference.

SWE-bench scores from BenchLM.ai leaderboard, May 2026.

One HN commenter reported running V4-Flash through Claude Code's Anthropic-compatible endpoint at $1 USD/week for 3 hours/day of coding. That's anecdotal and per one user's workflow, but directionally it illustrates the cost floor V4-Flash enables: approximately what a Claude Pro subscription costs per year.

The trap to watch: DeepSeek's models are strong on benchmarks but handle complex multi-file architectural reasoning differently than Opus. The hybrid pattern — V4-Pro for the volume of coding work, Opus for the 10% that needs it — is what the HN thread converges on. That's the defensible position, not an all-or-nothing switch.

Harness Compatibility

DeepSeek V4 support across coding harnesses, as of this issue.

  • Cursor: Native support. V4-Pro available in the model picker.

  • Windsurf: Native support. Added V4-Pro to the model picker on April 25, one day after launch.

  • Aider: Native support. V4-Pro works out of the box.

  • Claude Code: Works via DeepSeek's Anthropic-compatible API endpoint. Set ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic and use model name deepseek-v4-pro or deepseek-v4-flash. DeepSeek's setup guide documents this explicitly.

  • OpenClaw: Native support per DeepSeek's documentation.

  • Codex CLI: Not supported. OpenAI ecosystem only.

  • Gemini CLI: Not supported. Google ecosystem only.

DeepSeek Code Harness (announced, not available): On May 20, DeepSeek researcher Deli Chen posted job listings for a Harness Product Manager and Harness R&D Engineer. The framing: "Model + Harness = Agent." No release date, no beta — this is a hiring announcement, not a product launch. Token Tape will track it as it materializes.

This Week's Switch

Switch: DeepSeek V4-Pro via Cursor or Aider for the bulk of coding work. Why: V4-Pro handles the volume of coding work at $0.435/M input. Opus 4.7 stays in the stack for architectural decisions and complex multi-file reasoning where the 7-14 point SWE-bench gap shows. The math doesn't justify running every task through a $15/M model. Link: cursor.com (Cursor) / aider.chat (Aider)

The hybrid stack is the defensible play. V4-Pro handles the volume work where the benchmark gap doesn't materially affect output quality. Opus handles the architectural and multi-file reasoning tasks where it does. Monthly bill for an indie dev running this pattern: $20-50 instead of $100-200.

Keep Reading