OpenAI deprecated three models this week. Two posted notices. The third went quietly. Coverage focused on the deprecations. The story is somewhere else.

What changed in the pricing sheet

GPT-4o-mini's input rate held at $0.15 per million tokens. Its replacement on the cheap tier — the one a lot of builders assumed would be GPT-5-mini — is now listed at $0.40 input, $1.60 output. That's roughly a 2.7x raise on the floor.

If you're running an automation that calls a cheap OpenAI model on every message, your bill on April 30 (when 4o-mini sunsets) is going up materially. Not 'noticeably'. Materially.

  • GPT-4o-mini deprecated April 30. Pricing held until then.
  • GPT-3.5-turbo final sunset June 15. Already mostly gone.
  • GPT-4o standard sunset July 1.
  • GPT-5-mini new floor: $0.40 in / $1.60 out, roughly 2.7x the previous input floor.
  • For context: Haiku 4.5 from Anthropic, comparable in capability, sits at $1.00 in / $5.00 out.
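To put a number on the floor raise, here's the back-of-the-envelope math. The GPT-5-mini rates are the ones listed above; 4o-mini's $0.60/MTok output rate comes from OpenAI's published pricing, not this piece; the traffic volume is invented for illustration.

```python
# Rates in dollars per million tokens.
OLD_IN, OLD_OUT = 0.15, 0.60   # GPT-4o-mini ($0.60 out is OpenAI's published rate)
NEW_IN, NEW_OUT = 0.40, 1.60   # GPT-5-mini, per the pricing sheet

def monthly_cost(input_mtok, output_mtok, in_rate, out_rate):
    """Dollar cost for a month's traffic, volumes in millions of tokens."""
    return input_mtok * in_rate + output_mtok * out_rate

# Hypothetical workload: 500M input tokens, 100M output tokens per month.
old = monthly_cost(500, 100, OLD_IN, OLD_OUT)   # $135
new = monthly_cost(500, 100, NEW_IN, NEW_OUT)   # $360
```

That workload's bill goes from $135 to $360 — the input floor alone moves 2.7x, and the output side moves further still.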

Why this happened now

Three reasons, in order of how much I trust them. First: GPU supply. Even with the new Stargate buildout, capacity is the bottleneck and the cheap tier is what gets squeezed when it's tight. Second: the cheap tier is the loss leader and the loss can no longer be subsidised by the enterprise tier alone. Third: a strategic decision to push high-volume builders toward GPT-5 standard, where the margins are better.

Anthropic has been making the opposite bet. Haiku 4.5 launched in October at a price that most analysts assumed couldn't be sustained. Six months in, it's still there.

What builders should do

Migrate now if you're cost-sensitive. Google's Gemini Flash 2.5 and the open-weights options (Llama 4 70B running on Groq, DeepSeek-V3) are cheaper per token than GPT-5-mini at non-trivial scale; Anthropic's Haiku costs more per token at the rates above but competes on quality. Quality varies by task. The general rule of thumb after a week of testing: GPT-5-mini is still the best at structured output. Haiku is better at writing. Gemini Flash is the cheapest by volume.
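That rule of thumb amounts to a routing table. A minimal sketch — the `TASK_ROUTES` mapping and `route()` helper are illustrative, not any vendor's API, and the model identifiers stand in for whatever strings your SDK expects:

```python
# Route by task type, per the rule of thumb: structured output to GPT-5-mini,
# prose to Haiku, everything high-volume to Gemini Flash.
# Model identifier strings below are placeholders, not official names.
TASK_ROUTES = {
    "structured_output": "gpt-5-mini",
    "writing": "claude-haiku-4.5",
    "bulk": "gemini-flash-2.5",
}

def route(task_type: str) -> str:
    """Pick a model for a task type; unknown tasks fall back to the cheap tier."""
    return TASK_ROUTES.get(task_type, TASK_ROUTES["bulk"])
```

The point of making this a table rather than scattered if-statements: when the next pricing sheet lands, you change three lines.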

Don't just port your prompts and call it done. Re-evaluate. The model that was right for your use case 18 months ago probably isn't anymore.
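Re-evaluating doesn't require heavy tooling. A skeleton like the one below is enough — `call` is whatever client function you wire up to your provider's SDK (nothing here is a real client API), and the scoring function is whatever check fits your task:

```python
from typing import Callable

def evaluate(
    call: Callable[[str, str], str],     # (model, prompt) -> completion
    models: list[str],
    cases: list[tuple[str, str]],        # (prompt, expected) pairs
    score: Callable[[str, str], float],  # (completion, expected) -> 0..1
) -> dict[str, float]:
    """Mean score per candidate model over your own test cases."""
    results = {}
    for model in models:
        total = sum(score(call(model, prompt), expected)
                    for prompt, expected in cases)
        results[model] = total / len(cases)
    return results
```

Run it with your real prompts, not benchmark prompts. Fifty representative cases beat five thousand generic ones.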