Google’s New AI Model Is “Cheaper” — Just Not Cheaper Than the One It Replaced | AI Business Radar

Google says Gemini 3.5 Flash is half the price of frontier models. It is. It’s also three times the price of the Gemini model it replaces.

Google launched Gemini 3.5 Flash at Google I/O yesterday. The on-stage pitch: “frontier-level capabilities at less than half the price.” The comparison is against the most expensive competitor models, such as GPT-5 and Claude Opus. Against those models, the claim holds.

Against Gemini 3 Flash, the model Malaysian developers may currently be running, the numbers look different. Gemini 3 Flash was priced at USD 0.50 per million input tokens and USD 3.00 per million output tokens. Gemini 3.5 Flash costs USD 1.50 per million input tokens and USD 9.00 per million output tokens. That is a three-fold increase in base price on both input and output. For agentic applications, where AI models take multiple processing steps and consume significantly more tokens per query, the real-world cost difference runs to 5.5x, driven by both higher token prices and additional agentic turns.
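The arithmetic behind those multiples can be checked in a few lines. The sketch below uses the prices quoted above; the monthly token volumes are hypothetical, and it assumes the 5.5x figure is simply the price multiplier times a token-volume multiplier, which the source numbers suggest but do not state.

```python
# Prices in USD per million tokens, as quoted in this article.
GEMINI_3_FLASH = {"input": 0.50, "output": 3.00}
GEMINI_3_5_FLASH = {"input": 1.50, "output": 9.00}


def cost_usd(prices, input_tokens, output_tokens):
    """Cost in USD for a token volume at per-million-token prices."""
    return (input_tokens * prices["input"] + output_tokens * prices["output"]) / 1_000_000


def price_ratios(old, new):
    """New price divided by old price, for each token type."""
    return {kind: new[kind] / old[kind] for kind in old}


ratios = price_ratios(GEMINI_3_FLASH, GEMINI_3_5_FLASH)

# Assumption: reported 5.5x = price multiplier x token-volume multiplier.
REPORTED_AGENTIC_MULTIPLIER = 5.5
implied_token_growth = REPORTED_AGENTIC_MULTIPLIER / ratios["input"]

# Hypothetical monthly workload, for illustration only.
INPUT_TOKENS = 100_000_000
OUTPUT_TOKENS = 20_000_000

old_cost = cost_usd(GEMINI_3_FLASH, INPUT_TOKENS, OUTPUT_TOKENS)
same_usage_cost = cost_usd(GEMINI_3_5_FLASH, INPUT_TOKENS, OUTPUT_TOKENS)
agentic_cost = cost_usd(
    GEMINI_3_5_FLASH,
    INPUT_TOKENS * implied_token_growth,
    OUTPUT_TOKENS * implied_token_growth,
)

print(f"Price ratios: {ratios}")
print(f"Implied token growth: {implied_token_growth:.2f}x")
print(f"Old: USD {old_cost:.2f}, same usage: USD {same_usage_cost:.2f}, "
      f"agentic: USD {agentic_cost:.2f}")
```

Under these assumptions, a workload that cost about USD 110 a month on Gemini 3 Flash costs about USD 330 at the same usage on 3.5 Flash, and about USD 605 if token consumption grows by roughly 1.8x. Replace the token volumes with your own billing data before drawing conclusions.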

The benchmark performance is genuinely better. That is not in dispute. What matters for Malaysian businesses is whether the improvement justifies the cost jump — and exactly which model you are comparing against.

Who this really matters to:

→ Malaysian software companies and startups building AI features on the Google API: if your product uses Gemini Flash at scale, the move from 3 Flash to 3.5 Flash triples your base AI cost before accounting for higher token consumption from agentic behaviour.

→ Malaysian enterprises running AI agent workflows on Google Cloud: Google cited a figure of over USD 1 billion in enterprise savings. That figure applies to companies running at the scale of roughly one trillion tokens per day, and most Malaysian enterprises are not at that scale.

→ Malaysian digital agencies managing client AI implementations: when a client asks whether the upgrade is worth it, the honest answer requires knowing whether the task actually benefits from 3.5 Flash capability or whether 3 Flash was performing adequately for that use case.

→ Malaysian CTOs reviewing AI infrastructure cost projections: pricing comparisons in AI are almost always made against competitor pricing, not predecessor pricing. Both comparisons are relevant. Only one appears in the press release.

MULTIPLE PERSPECTIVES

The marketing language of AI pricing is consistently built around one reference point: the most expensive comparable competitors. “Half the price of frontier models” is accurate. It is also selective. The more operationally relevant comparison for most businesses is against the previous version they were running. On that comparison, the cost structure changed significantly.

This pattern is not unique to Google. Many major AI model releases in the past two years have been marketed as cheaper than the most expensive alternatives while being more expensive than the previous generation of the same model. The improvement in capability is real. The “cheaper” framing depends entirely on which model you are comparing against. For Malaysian businesses that have run AI tools at stable pricing for the past 12 months, the relevant question is not “cheaper than GPT-5” but “what happens to our cost structure if we upgrade to this version.”

The deeper pattern worth understanding: AI pricing is not stable. The market is maturing, model quality is increasing, and pricing is adjusting to reflect that. The window of very cheap AI API access — where frontier capabilities were underpriced relative to the value they created — is closing. Malaysian businesses building AI-dependent products should model cost scenarios where API pricing increases by 2–5x over the next 24 months. That is not pessimism. It is what the pricing history since 2023 actually shows.
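A minimal sketch of that scenario modelling follows. The current monthly spend is a hypothetical placeholder, and the calculation assumes usage stays flat while only the price changes.

```python
def scenario_spend(current_monthly_usd, multipliers=(2, 3, 4, 5)):
    """Monthly API spend under each price multiplier, with usage held constant."""
    return {m: current_monthly_usd * m for m in multipliers}


def extra_annual_cost(current_monthly_usd, multiplier):
    """Additional yearly spend once the multiplier is fully in effect."""
    return current_monthly_usd * (multiplier - 1) * 12


# Hypothetical current spend in USD per month; substitute your own figure.
CURRENT_MONTHLY_USD = 2_000

scenarios = scenario_spend(CURRENT_MONTHLY_USD)
for multiplier, monthly in scenarios.items():
    extra = extra_annual_cost(CURRENT_MONTHLY_USD, multiplier)
    print(f"{multiplier}x: USD {monthly:,} per month, USD {extra:,} extra per year")
```

The point of the exercise is less the exact numbers than seeing which multiplier first makes the budget uncomfortable.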

If your business built something on a specific AI model’s pricing — do you know what happens to the unit economics if the price triples?

If the product still works at 3x cost — your margins have room and you priced for the value, not the model cost.

If 3x cost breaks the business case — the dependency on a specific price point is a business risk, not just a technology question. The right time to model that scenario is before the price changes, not after.
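The 3x question can be put to a simple unit-economics check. This is a sketch with made-up per-customer figures, not a pricing model: revenue, AI cost, and other costs are all hypothetical, and it assumes AI cost scales linearly with the price multiplier.

```python
def gross_margin(revenue, ai_cost, other_costs, price_multiplier=1.0):
    """Margin as a fraction of revenue after scaling the AI cost."""
    return (revenue - other_costs - ai_cost * price_multiplier) / revenue


def break_even_multiplier(revenue, ai_cost, other_costs):
    """AI price multiplier at which the margin reaches zero."""
    return (revenue - other_costs) / ai_cost


# Hypothetical monthly figures per customer, in USD.
REVENUE = 50.0
AI_COST = 6.0
OTHER_COSTS = 20.0

margin_today = gross_margin(REVENUE, AI_COST, OTHER_COSTS)
margin_at_3x = gross_margin(REVENUE, AI_COST, OTHER_COSTS, price_multiplier=3.0)
limit = break_even_multiplier(REVENUE, AI_COST, OTHER_COSTS)

print(f"Margin today: {margin_today:.0%}")
print(f"Margin at 3x AI cost: {margin_at_3x:.0%}")
print(f"Margin hits zero at {limit:.1f}x")
```

With these example figures the margin falls from 48% to 24% at 3x and reaches zero at 5x. A product whose break-even multiplier sits below 3 is the case where the price point itself is the business risk.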

Cheaper than today’s frontier models is not the same as cheaper than yesterday’s version.
