The End of Cheap AI

For the past few years, the artificial intelligence industry seemed locked in a relentless race to the bottom on price. Each new model release usually brought a pleasant surprise for developers: significantly better performance at a fraction of the cost. With the rollout of Google's Gemini 3.5 Flash, that era appears to be over.
Unveiled at the latest Google I/O, Gemini 3.5 Flash skipped the usual beta preview phase and went straight into general availability. Google is weaving it into the very fabric of its ecosystem—powering the consumer-facing Gemini app, the AI Mode in Google Search, and enterprise development platforms. It boasts a massive 1-million-token context window and a knowledge cutoff updated to January 2025.
But the real headline, tucked away in the developer documentation, is the price tag. At $1.50 per million input tokens and $9 per million output tokens, 3.5 Flash costs three times as much as its predecessor, 3 Flash Preview, and six times as much as 3.1 Flash-Lite. To put this in perspective, when developer Simon Willison used the API to generate a complex SVG image of a "pelican riding a bicycle," the single prompt cost him nearly 13 cents. That might sound trivial, but for applications that make millions of API calls a day, it is a large jump from the fractions of a cent per call that developers are used to paying.
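To see how per-token prices turn into a bill, here is a minimal sketch of the arithmetic in Python. The two prices are the ones quoted above. The workload (one million requests a day, 500 input tokens and 300 output tokens each) is a hypothetical figure chosen only for illustration, and the comparison assumes the "three times" multiple applies equally to input and output prices, which the figures above do not confirm.

```python
# Per-million-token prices quoted in the article (USD).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00


def request_cost(input_tokens, output_tokens,
                 input_price=INPUT_PRICE_PER_M,
                 output_price=OUTPUT_PRICE_PER_M):
    """Return the USD cost of one request at per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


# Hypothetical workload, not taken from the article.
REQUESTS_PER_DAY = 1_000_000
INPUT_TOKENS = 500
OUTPUT_TOKENS = 300

per_request = request_cost(INPUT_TOKENS, OUTPUT_TOKENS)
per_day = per_request * REQUESTS_PER_DAY

# Assumption: the 3x multiple applies evenly to input and output prices.
older_per_day = REQUESTS_PER_DAY * request_cost(
    INPUT_TOKENS, OUTPUT_TOKENS,
    INPUT_PRICE_PER_M / 3, OUTPUT_PRICE_PER_M / 3,
)

print(f"per request: ${per_request:.5f}")
print(f"per day at the quoted prices: ${per_day:,.2f}")
print(f"per day at one third of those prices: ${older_per_day:,.2f}")
```

Under these assumptions a single request costs about a third of a cent, but the daily total comes to roughly $3,450, against roughly $1,150 at one third of the price. The same arithmetic suggests that a bill of nearly 13 cents corresponds to something like 14,000 output tokens, if output accounts for almost all of the cost.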
Data from Artificial Analysis highlights this shift starkly: running standardized benchmarks on Gemini 3.5 Flash now costs over $1,550—significantly more than what it cost to run the previous generation's premium Pro model.
Crucially, Google is not an outlier. We are witnessing a broader industry trend where top-tier AI labs are actively testing the price tolerance of their users. OpenAI’s GPT-5.5 costs double what GPT-5.4 did, and Anthropic’s Claude Opus 4.7 has also seen a significant markup. As models incorporate more complex reasoning capabilities and manage larger contexts, the sheer computational power required is driving up the baseline cost of intelligence.
The result is a split in the AI economy. On one hand, tech giants are willing to absorb these costs to offer free, cutting-edge AI to everyday consumers through built-in apps. On the other hand, the independent developers and startups building the next wave of specialized AI tools are facing a steep premium. As artificial intelligence gets more capable, the defining question of the next tech cycle is no longer just what AI can do, but who can afford to run it at scale.
Key Points
- Gemini 3.5 Flash was released directly to general availability, powering core Google products and developer tools.
- The new model's API pricing is three to six times that of earlier models in the Flash family.
- Industry competitors like OpenAI and Anthropic are also significantly raising prices for their latest high-reasoning models.
- Rising costs per prompt mean developers face a much higher financial barrier when building AI-reliant applications.
Why It Matters
The collective price hikes among major AI providers indicate a shift from market-share growth to profitability, which will fundamentally alter how startups and developers build and monetize future AI applications.
Sources:
- Gemini 3.5 Flash: more expensive, but Google plan to use it for everything — Simon Willison's Weblog