TOKEN TAPE

On May 25, 2026, DeepSeek made the 75% introductory discount on V4-Pro permanent. The model launched April 24 at $1.74 per million input tokens; the promotional rate of $0.435/M input and $0.87/M output is now the standard price. Cache hits dropped to $0.003625/M — effectively free for repeat context.

On SWE-bench Verified, Opus 4.7 (Adaptive) holds the Anthropic lead at 87.6%. DeepSeek V4-Pro-Max scores 80.6% — a 7-point gap, not parity. But at 1/34th the input cost, the question for most coding workflows isn't "is it as good?" but "is it good enough for the 90% of tasks that don't need Opus-grade reasoning?" For indie devs running $100-200/month Claude bills on agentic loops, that reframe changes the math.

Policy Diff

Vendor: DeepSeek
Change: V4-Pro API pricing permanently reduced to 25% of launch price. No longer promotional.
Effective: May 25, 2026. DeepSeek's pricing page still describes it as a "75% discount promotion" ending 2026/05/31, but Engadget and The Next Web report the post-promo price will match the current discounted rate — effectively locking in 1/4 pricing permanently.
Affected: Anyone paying per-token for coding models. Claude, GPT, and Gemini users now have a benchmark-competitive alternative at a fraction of the cost.
V4-Flash unchanged: $0.14/M input, $0.28/M output. Already the cheapest small model on the market.
Sources: Simon Willison's V4 overview (launch-day analysis). HN discussion (614 points, 544 comments as of this issue).

Price Watch

The permanent discount changes the cost landscape. Token Tape mapped the per-million-token rates across the models that matter for agentic coding work.

Model	Input $/M	Output $/M	SWE-bench Verified	Context
Claude Opus 4.7 (Adaptive)	$15.00	$75.00	87.6%	200K
DeepSeek V4-Pro-Max	~$1.09	~$1.09	80.6%	1M
Claude Opus 4.6	—	—	80.8%	200K
DeepSeek V4-Pro	$0.435	$0.87	73.6%	1M
DeepSeek V4-Flash	$0.14	$0.28	—	1M
Claude Sonnet 4.6	$3.00	$15.00	79.6%	200K
GPT-5.4	$2.50	$10.00	—	128K

V4-Pro-Max at 80.6% sits right next to Opus 4.6 and a little behind Opus 4.7 on SWE-bench — 7 points behind the current Anthropic leader — but at roughly 1/14th the input cost. Standard V4-Pro at 73.6% is further back but at 1/34th the input cost. The trade-off is real: the gap exists, and it shows on complex multi-file architectural tasks. But for the volume of single-file edits, test generation, and boilerplate-heavy coding work that makes up most agentic sessions, the gap is less visible than the price difference.

SWE-bench scores from BenchLM.ai leaderboard, May 2026.

One HN commenter reported running V4-Flash through Claude Code's Anthropic-compatible endpoint at $1 USD/week for 3 hours/day of coding. That's anecdotal and per one user's workflow, but directionally it illustrates the cost floor V4-Flash enables: approximately what a Claude Pro subscription costs per year.

The trap to watch: DeepSeek's models are strong on benchmarks but handle complex multi-file architectural reasoning differently than Opus. The hybrid pattern — V4-Pro for the volume of coding work, Opus for the 10% that needs it — is what the HN thread converges on. That's the defensible position, not an all-or-nothing switch.

Harness Compatibility

DeepSeek V4 support across coding harnesses, as of this issue.

Cursor: Native support. V4-Pro available in the model picker.
Windsurf: Native support. Added V4-Pro to the model picker on April 25, one day after launch.
Aider: Native support. V4-Pro works out of the box.
Claude Code: Works via DeepSeek's Anthropic-compatible API endpoint. Set ANTHROPIC_BASE_URL=https://api.deepseek.com/anthropic and use model name deepseek-v4-pro or deepseek-v4-flash. DeepSeek's setup guide documents this explicitly.
OpenClaw: Native support per DeepSeek's documentation.
Codex CLI: Not supported. OpenAI ecosystem only.
Gemini CLI: Not supported. Google ecosystem only.

DeepSeek Code Harness (announced, not available): On May 20, DeepSeek researcher Deli Chen posted job listings for a Harness Product Manager and Harness R&D Engineer. The framing: "Model + Harness = Agent." No release date, no beta — this is a hiring announcement, not a product launch. Token Tape will track it as it materializes.

This Week's Switch

❝

Switch: DeepSeek V4-Pro via Cursor or Aider for the bulk of coding work. Why: V4-Pro handles the volume of coding work at $0.435/M input. Opus 4.7 stays in the stack for architectural decisions and complex multi-file reasoning where the 7-14 point SWE-bench gap shows. The math doesn't justify running every task through a $15/M model. Link: cursor.com (Cursor) / aider.chat (Aider)

The hybrid stack is the defensible play. V4-Pro handles the volume work where the benchmark gap doesn't materially affect output quality. Opus handles the architectural and multi-file reasoning tasks where it does. Monthly bill for an indie dev running this pattern: $20-50 instead of $100-200.

TOKEN TAPE

Policy Diff

Price Watch

Harness Compatibility

This Week's Switch

Keep Reading

Token Tape