The End of Cheap AI

Author: QianLong Editorial Team (covering AI and society)
Published: 2026/5/30

For the past few years, the artificial intelligence industry seemed locked in a relentless race to the bottom. Each new model release usually brought a pleasant surprise for developers: significantly better performance at a fraction of the cost. But with the rollout of Google's Gemini 3.5 Flash, that era appears to be officially over.

Unveiled at the latest Google I/O, Gemini 3.5 Flash skipped the usual beta preview phase and went straight into general availability. Google is weaving it into the very fabric of its ecosystem—powering the consumer-facing Gemini app, the AI Mode in Google Search, and enterprise development platforms. It boasts a massive 1-million-token context window and a knowledge cutoff updated to January 2025.

But the real headline tucked away in the developer documentation is the price tag. At $1.50 per million input tokens and $9 per million output tokens, 3.5 Flash is three times the cost of its predecessor, 3 Flash Preview, and a staggering six times the cost of 3.1 Flash-Lite. To put this in perspective, when developer Simon Willison used the API to generate a complex SVG image of a "pelican riding a bicycle," the single prompt cost him nearly 13 cents. While 13 cents might sound trivial, in the world of API calls where millions of transactions happen daily, it represents a massive jump from the micro-pennies developers are used to paying.
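The per-call arithmetic behind a figure like this can be sketched in a few lines. The two prices are the ones quoted above; the token counts and the daily call volume are hypothetical, picked only to land near the 13-cent figure, since the actual counts for that prompt are not given here. The function name `request_cost_usd` is likewise illustrative.

```python
# Prices quoted in the article for Gemini 3.5 Flash, in USD per million tokens.
INPUT_USD_PER_MTOK = 1.50
OUTPUT_USD_PER_MTOK = 9.00


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one API call: each token count is billed at its per-million rate."""
    total = input_tokens * INPUT_USD_PER_MTOK + output_tokens * OUTPUT_USD_PER_MTOK
    return total / 1_000_000


# Hypothetical token counts (assumption): a short prompt and a long,
# output-heavy response, which is where the $9 rate dominates the bill.
single_call = request_cost_usd(input_tokens=20, output_tokens=14_000)

# Hypothetical volume (assumption): one million such calls per day.
calls_per_day = 1_000_000
daily_bill = single_call * calls_per_day

print(f"one call: ${single_call:.4f}")
print(f"per day:  ${daily_bill:,.0f}")
```

With these assumed numbers, nearly all of the cost comes from output tokens, so a workload that generates long responses feels the price increase far more than one that mostly reads long inputs.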

Data from Artificial Analysis highlights this shift starkly: running standardized benchmarks on Gemini 3.5 Flash now costs over $1,550—significantly more than what it cost to run the previous generation's premium Pro model.

Crucially, Google is not an outlier. We are witnessing a broader industry trend where top-tier AI labs are actively testing the price tolerance of their users. OpenAI’s GPT-5.5 costs double what GPT-5.4 did, and Anthropic’s Claude Opus 4.7 has also seen a significant markup. As models incorporate more complex reasoning capabilities and manage larger contexts, the sheer computational power required is driving up the baseline cost of intelligence.

The result is a split in the AI economy. On one hand, tech giants are willing to absorb these staggering costs to offer free, cutting-edge AI to everyday consumers through built-in apps. On the other hand, the independent developers and startups building the next wave of specialized AI tools are facing a steep premium. As artificial intelligence gets fundamentally smarter, the defining question of the next tech cycle is no longer just what AI can do, but who can afford to run it at scale.

Key Points

  • Gemini 3.5 Flash was released directly to general availability, powering core Google products and developer tools.
  • The new model's API prices are three to six times those of earlier models in the Flash family.
  • Industry competitors like OpenAI and Anthropic are also significantly raising prices for their latest high-reasoning models.
  • Rising costs per prompt mean developers face a much higher financial barrier when building AI-reliant applications.

Why It Matters

The collective price hikes among major AI providers indicate a shift from market-share growth to profitability, which will fundamentally alter how startups and developers build and monetize future AI applications.

