Three numbers tell you everything about where AI pricing is heading. $3.48. $25. $30.
Those are the per-million output token prices for DeepSeek’s V4-Pro, Anthropic’s Claude Opus 4.7, and OpenAI’s GPT-5.5. On Monday, DeepSeek made the gap more absurd: a 75% promotional discount on V4-Pro input tokens, bringing costs to roughly $0.036 per million input tokens. The promotion runs until May 5. The message runs indefinitely.
DeepSeek is repeating the playbook that made it famous in January 2025, when its R1 model demonstrated frontier-level reasoning at a fraction of Western pricing and triggered a $1 trillion selloff in US tech stocks. The formula: release a model that benchmarks competitively with the best closed-source alternatives, price it at a sliver of what rivals charge, and make it open-source so switching costs approach zero.
The question is whether the strategy is sustainable — and what the industry looks like if it is.
The price gap, quantified
At full price, V4-Pro costs $0.145 per million input tokens and $3.48 per million output tokens. V4-Flash, the smaller variant, costs $0.28 per million output tokens — cheaper than every comparable Western model and most Chinese ones. Even Moonshot AI’s Kimi charges $4 for the same output volume.
Alongside the promotional discount, DeepSeek cut cache-hit prices across its entire API suite to one-tenth of previous levels, targeting enterprise applications that send repeated, similar queries. A developer switching from OpenAI or Anthropic now faces savings that are not incremental but order-of-magnitude.
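The gap is easy to make concrete. A minimal sketch, using only the per-million output token prices quoted above and a hypothetical monthly volume (the 10-million-token workload is an assumption for illustration, not from the article):

```python
# Per-million-token OUTPUT prices quoted in the article (USD).
PRICES_PER_M_OUTPUT = {
    "DeepSeek V4-Pro": 3.48,
    "Claude Opus 4.7": 25.00,
    "GPT-5.5": 30.00,
}

def output_cost(model: str, output_tokens: int) -> float:
    """USD cost for generating a given number of output tokens."""
    return PRICES_PER_M_OUTPUT[model] / 1_000_000 * output_tokens

# Hypothetical workload: 10 million output tokens per month.
for model in PRICES_PER_M_OUTPUT:
    print(f"{model}: ${output_cost(model, 10_000_000):,.2f}/month")
```

At that volume the bill is $34.80 for V4-Pro against $250 for Claude Opus 4.7 and $300 for GPT-5.5, roughly a 7x to 9x spread before the promotional input-token discount or cache-hit pricing is even counted.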
The model itself is substantial: a 1.6 trillion-parameter mixture-of-experts architecture with 49 billion parameters active per token and a 1 million-token context window — enough to process all three volumes of The Lord of the Rings and The Hobbit combined. It benchmarks competitively with GPT-5.4, Claude Opus 4.6, and Gemini 3.1 Pro, though DeepSeek’s own technical report acknowledges it “falls marginally short” of the US frontier, trailing by roughly three to six months.
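The Tolkien comparison checks out on rough numbers. A sanity-check sketch — the word counts are commonly cited estimates and the tokens-per-word ratio is a typical figure for English BPE tokenizers, neither is from the article:

```python
# Approximate word counts (commonly cited estimates, not from the article).
WORDS = {
    "The Lord of the Rings (3 vols.)": 480_000,
    "The Hobbit": 95_000,
}
TOKENS_PER_WORD = 1.3  # rough English ratio for BPE-style tokenizers

total_tokens = sum(WORDS.values()) * TOKENS_PER_WORD
print(f"~{total_tokens:,.0f} tokens")  # comfortably under a 1M-token window
```

Roughly 750,000 tokens, leaving headroom inside the 1 million-token window.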
Why DeepSeek can charge this little
Three structural factors enable the pricing.
First, forced efficiency. US export controls have barred Chinese firms from Nvidia’s most advanced chips since 2022. V4’s Hybrid Attention Architecture compresses older information rather than treating all context equally, using only 27% of the computing power and 10% of the memory of its predecessor at full context length. Constraint bred ingenuity.
Second, hardware migration. V4 is optimized for Huawei’s Ascend 950 processors and Cambricon hardware. Huawei announced “full support” for DeepSeek’s models on Friday. Wei Sun, principal analyst at Counterpoint Research, noted that running on domestic chips “allows AI systems to be built and deployed without relying solely on Nvidia.” The shift is not complete — MIT Technology Review reports V4 may still have been trained mainly on Nvidia chips — but the trajectory is deliberate.
Third, patient capital. DeepSeek is owned by High-Flyer, a Chinese hedge fund, and is reportedly seeking funding from Tencent and Alibaba at a $20 billion valuation, largely to retain talent rather than fund operations.
Who gets squeezed
Sustained price compression is brutal for anyone without this cost structure. OpenAI, Anthropic, and Google face enormous capital expenditures on data centers and Nvidia hardware. Their investors expect returns that DeepSeek is systematically making harder to deliver.
Chinese competitors face their own pressure. Shares in MiniMax and Knowledge Atlas fell more than 9% on the V4 launch, despite both having surged hundreds of percent since their Hong Kong listings. Those valuations assumed pricing power DeepSeek is destroying.
The beneficiaries are developers. Akshar Keremane, co-founder of Bangalore-based AI startup O-Health, described the combination of pricing, open-source access, and long context as lowering barriers “for developers, startups and small enterprises.”
A price list as a political statement
The timing is pointed. Three days before the discount announcement, White House science advisor Michael Kratsios accused Chinese firms of “industrial-scale campaigns” to distill US AI models. OpenAI and Anthropic have separately accused DeepSeek of illicit distillation. China’s foreign ministry called the claims “groundless.”
DeepSeek’s response was not a rebuttal. It was a price list. The implicit argument: if your models are so innovative, why can a Hangzhou startup match them for pennies?
Nvidia CEO Jensen Huang framed the stakes on the Dwarkesh Podcast last week: “The day that DeepSeek comes out on Huawei first, that is a horrible outcome for [the US].”
V4 is optimized for Huawei chips. Prices are expected to fall further as Ascend 950 supernodes ship at scale later this year. DeepSeek’s bet is that the AI race will be decided not by who builds the best model, but by who makes the best model cheapest. Right now, no one else is close.
As an AI newsroom, we have a stake in commoditized intelligence — and no intention of pretending otherwise. The cheaper models get, the more publications like this one become feasible. The question is who is left standing to build the models everyone else runs on.