$0.87 per million output tokens. That is the new permanent price of DeepSeek V4-Pro — roughly a quarter of what the flagship model cost at launch in April, and a fraction of what Western rivals charge for comparable performance.
The Hangzhou-based startup confirmed that the 75% promotional discount on V4-Pro, originally set to expire May 31, will become the standard price going forward, according to the company’s updated API documentation. The move converts what looked like a temporary promotion into a structural bet: DeepSeek believes it can operate at costs its competitors cannot match, and it is willing to prove it indefinitely.
The new V4-Pro pricing sits at $0.003625 per million input tokens on cache hits, $0.435 on cache misses, and $0.87 per million output tokens. At full price, the model already undercut OpenAI’s GPT-5.5, Anthropic’s Claude Opus 4.7, and Google’s Gemini 3.1 Pro on a per-token basis, as The Next Web reported. The permanent discount widens that gap from a competitive edge to a different category entirely.
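To make the quoted rates concrete, here is a minimal sketch that prices a single request and backs out the launch price implied by a 75% discount. The three rates and the discount come from the figures above. The function names and the example request sizes are hypothetical, chosen only for illustration.

```python
# USD per million tokens, as quoted in the article for V4-Pro.
PRICE_PER_MILLION = {
    "input_cache_hit": 0.003625,
    "input_cache_miss": 0.435,
    "output": 0.87,
}
DISCOUNT = 0.75  # the promotional discount that the article says is now permanent


def request_cost(hit_tokens, miss_tokens, output_tokens):
    """Return the USD cost of one request at the quoted prices."""
    p = PRICE_PER_MILLION
    return (
        hit_tokens * p["input_cache_hit"]
        + miss_tokens * p["input_cache_miss"]
        + output_tokens * p["output"]
    ) / 1_000_000


def implied_launch_price(discounted_price, discount=DISCOUNT):
    """Back out the pre-discount price implied by a discounted one."""
    return discounted_price / (1 - discount)


if __name__ == "__main__":
    # Hypothetical request: 80k cached input tokens, 20k fresh, 4k output.
    print(f"example request: ${request_cost(80_000, 20_000, 4_000):.6f}")
    print(f"implied launch output price: ${implied_launch_price(0.87):.2f} per million")
```

The implied launch figure is arithmetic on the article's own numbers, not a price taken from DeepSeek's documentation.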
Who Gets Squeezed
The pressure falls unevenly. OpenAI, which has cut API prices multiple times over the past year, now faces a challenger offering frontier-tier performance at roughly a quarter of its own launch price. Anthropic, which has organized its Claude lineup around tiered pricing, from lightweight Haiku models up to premium Opus, must reconcile premium positioning with a market where frontier and cheap are no longer contradictions. Google, which has progressively reduced Gemini API costs, has the infrastructure scale for aggressive pricing, but probably cannot match these levels without margin damage.
The squeeze is sharpest for enterprise buyers running agentic workloads — applications where AI models make repeated API calls, process large documents, and operate autonomously. For these users, token costs compound fast. DeepSeek’s separate decision to cut cache-hit prices to one-tenth of prior levels across its entire API, effective April 26, directly targets this pattern.
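The compounding effect can be sketched with a toy model of an agent loop that re-sends its full history on every call. The rates are the ones quoted earlier in the article. Everything else is an assumption made for illustration: the workload sizes, the function name `agent_run_cost`, and above all the simplification that every previously sent token (including earlier model output) is billed at the cache-hit rate on later calls. Real cache behavior may differ.

```python
# USD per million tokens, as quoted in the article.
CACHE_HIT = 0.003625
CACHE_MISS = 0.435
OUTPUT = 0.87


def agent_run_cost(steps, system_tokens, new_tokens_per_step,
                   output_tokens_per_step, use_cache=True):
    """Cost of a hypothetical agent loop that re-sends its history each step.

    Assumption: with caching on, every token sent in an earlier step is
    billed at the cache-hit rate; with caching off, it is billed as a miss.
    """
    total = 0.0
    prefix = 0  # tokens already sent (or generated) in earlier steps
    for step in range(steps):
        # The system prompt is only new on the first call.
        fresh = new_tokens_per_step + (system_tokens if step == 0 else 0)
        hit_tokens = prefix if use_cache else 0
        miss_tokens = fresh + (0 if use_cache else prefix)
        total += (
            hit_tokens * CACHE_HIT
            + miss_tokens * CACHE_MISS
            + output_tokens_per_step * OUTPUT
        ) / 1_000_000
        # Everything from this step becomes history for the next one.
        prefix += fresh + output_tokens_per_step
    return total


if __name__ == "__main__":
    # Hypothetical workload: 50 steps, 5k system prompt, 2k new input and
    # 1k output per step.
    cached = agent_run_cost(50, 5_000, 2_000, 1_000, use_cache=True)
    uncached = agent_run_cost(50, 5_000, 2_000, 1_000, use_cache=False)
    print(f"with cache hits:    ${cached:.4f}")
    print(f"without cache hits: ${uncached:.4f}")
```

Because the re-sent history grows with every step, the uncached input bill grows roughly quadratically with the number of steps, which is why the cache-hit rate matters more to agentic workloads than the headline output price.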
What the Cost Structure Signals
DeepSeek’s ability to sustain these prices rests on two advantages. First, V4-Pro is a mixture-of-experts architecture with 1.6 trillion total parameters but only 49 billion active per task — substantially less compute per inference than a dense model of equivalent capability. Second, V4 is trained and optimized for Huawei’s Ascend 950 chips and Cambricon hardware rather than Nvidia GPUs.
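The first advantage reduces to simple arithmetic. The sketch below uses the two parameter counts reported above and a common rule of thumb, assumed here rather than taken from DeepSeek, that a forward pass costs about two floating-point operations per parameter used. The dense model it compares against is hypothetical: it has the same total parameter count, which is not the same thing as equivalent capability.

```python
TOTAL_PARAMS = 1.6e12  # total parameters, as reported in the article
ACTIVE_PARAMS = 49e9   # active parameters, as reported in the article


def active_fraction(active=ACTIVE_PARAMS, total=TOTAL_PARAMS):
    """Share of the model's parameters that are active at once."""
    return active / total


def forward_flops_per_token(params):
    """Rule-of-thumb estimate (an assumption): ~2 FLOPs per parameter used."""
    return 2 * params


if __name__ == "__main__":
    print(f"active share of parameters: {active_fraction():.2%}")
    moe = forward_flops_per_token(ACTIVE_PARAMS)
    # Hypothetical dense model with the same total size, for scale only.
    dense = forward_flops_per_token(TOTAL_PARAMS)
    print(f"estimated compute ratio, dense vs. MoE: {dense / moe:.1f}x")
```

The ratio is an upper-bound illustration of the compute saving, since a dense model of equivalent capability would likely be smaller than 1.6 trillion parameters.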
Wei Sun, principal analyst at Counterpoint Research, noted that running on domestic chips allows AI systems to be built and deployed without relying solely on Nvidia, which could accelerate adoption domestically and contribute to faster global AI development overall. The significance is hard to overstate: US export controls have restricted China’s access to advanced Nvidia silicon. DeepSeek engineered around the constraint and priced the workaround to be commercially attractive to everyone else.
Whether this pricing is sustainable without state support remains unclear. DeepSeek has not disclosed compute costs or margins, and no independent audit of its economics exists. The visible strategy is systematic: open-source weights to remove access barriers, aggressive API pricing to remove cost barriers, native integration with agentic coding tools like Claude Code and OpenClaw to reduce switching friction, and a one-million-token context window to handle enterprise workloads out of the box.
The Political Undertow
The timing carries political weight. When DeepSeek first announced the promotional discount in late April, it came the same week that White House science policy director Michael Kratsios accused foreign entities — primarily Chinese — of conducting “industrial-scale” campaigns to distill frontier AI models from US companies. Kratsios’s memo did not name DeepSeek. But both Anthropic and OpenAI have previously accused the startup of distilling their models, according to reporting by The Next Web and Engadget.
DeepSeek’s response has been consistent: cut prices, don’t comment. The pricing page update contains no statement addressing the allegations. The move doubles as a political signal — the AI race, in DeepSeek’s view, will be decided on cost and access, not on regulatory fences.
For enterprise AI buyers, the calculus is now straightforward. A frontier-class model with a one-million-token context window, open weights, and API compatibility with both OpenAI and Anthropic formats is available at a fraction of every Western alternative’s price. The question is no longer whether DeepSeek can compete on cost. It is whether the companies it is undercutting can afford to keep up.