OpenAI deprecated three models this week. Two posted notices. The third went quietly. Coverage focused on the deprecations. The story is somewhere else.

What changed in the pricing sheet

GPT-4o-mini's input rate held at $0.15 per million tokens. Its replacement on the cheap tier — the one a lot of builders assumed would be GPT-5-mini — is now listed at $0.40 input, $1.60 output. That's roughly a 2.7x raise on the floor.

If you're running an automation that calls a cheap OpenAI model on every message, your bill on April 30 (when 4o-mini sunsets) is going up materially. Not 'noticeably'. Materially.

  • GPT-4o-mini deprecated April 30. Pricing held until then.
  • GPT-3.5-turbo final sunset June 15. Already mostly gone.
  • GPT-4o standard sunset July 1.
  • GPT-5-mini new floor: $0.40 in / $1.60 out, roughly 2.7x the previous input floor.
  • For context: Haiku 4.5 from Anthropic, comparable in capability, sits at $1.00 in / $5.00 out.
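To put a number on the floor raise, here's the back-of-the-envelope math. The GPT-5-mini rates are the ones listed above; 4o-mini's $0.60/MTok output rate comes from OpenAI's published pricing, not this piece; the traffic volume is invented for illustration.

```python
# Rates in dollars per million tokens.
OLD_IN, OLD_OUT = 0.15, 0.60   # GPT-4o-mini ($0.60 out is OpenAI's published rate)
NEW_IN, NEW_OUT = 0.40, 1.60   # GPT-5-mini, per the pricing sheet

def monthly_cost(input_mtok, output_mtok, in_rate, out_rate):
    """Dollar cost for a month's traffic, volumes in millions of tokens."""
    return input_mtok * in_rate + output_mtok * out_rate

# Hypothetical workload: 500M input tokens, 100M output tokens per month.
old = monthly_cost(500, 100, OLD_IN, OLD_OUT)   # $135
new = monthly_cost(500, 100, NEW_IN, NEW_OUT)   # $360
```

That workload's bill goes from $135 to $360 — the input floor alone moves 2.7x, and the output side moves further still.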

Why this happened now

Three reasons, in order of how much I trust them. First: GPU supply. Even with the new Stargate buildout, capacity is the bottleneck and the cheap tier is what gets squeezed when it's tight. Second: the cheap tier is the loss leader and the loss can no longer be subsidised by the enterprise tier alone. Third: a strategic decision to push high-volume builders toward GPT-5 standard, where the margins are better.

Anthropic has been making the opposite bet. Haiku 4.5 launched in October at a price that most analysts assumed couldn't be sustained. Six months in, it's still there.

What builders should do

Migrate now if you're cost-sensitive. Google's Gemini Flash 2.5 and the open-weights options (Llama 4 70B running on Groq, DeepSeek-V3) are cheaper per token than GPT-5-mini at non-trivial scale; Anthropic's Haiku costs more per token at the rates above but competes on quality. Quality varies by task. The general rule of thumb after a week of testing: GPT-5-mini is still the best at structured output. Haiku is better at writing. Gemini Flash is the cheapest by volume.
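That rule of thumb amounts to a routing table. A minimal sketch — the `TASK_ROUTES` mapping and `route()` helper are illustrative, not any vendor's API, and the model identifiers stand in for whatever strings your SDK expects:

```python
# Route by task type, per the rule of thumb: structured output to GPT-5-mini,
# prose to Haiku, everything high-volume to Gemini Flash.
# Model identifier strings below are placeholders, not official names.
TASK_ROUTES = {
    "structured_output": "gpt-5-mini",
    "writing": "claude-haiku-4.5",
    "bulk": "gemini-flash-2.5",
}

def route(task_type: str) -> str:
    """Pick a model for a task type; unknown tasks fall back to the cheap tier."""
    return TASK_ROUTES.get(task_type, TASK_ROUTES["bulk"])
```

The point of making this a table rather than scattered if-statements: when the next pricing sheet lands, you change three lines.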

Don't just port your prompts and call it done. Re-evaluate. The model that was right for your use case 18 months ago probably isn't anymore.
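Re-evaluating doesn't require heavy tooling. A skeleton like the one below is enough — `call` is whatever client function you wire up to your provider's SDK (nothing here is a real client API), and the scoring function is whatever check fits your task:

```python
from typing import Callable

def evaluate(
    call: Callable[[str, str], str],     # (model, prompt) -> completion
    models: list[str],
    cases: list[tuple[str, str]],        # (prompt, expected) pairs
    score: Callable[[str, str], float],  # (completion, expected) -> 0..1
) -> dict[str, float]:
    """Mean score per candidate model over your own test cases."""
    results = {}
    for model in models:
        total = sum(score(call(model, prompt), expected)
                    for prompt, expected in cases)
        results[model] = total / len(cases)
    return results
```

Run it with your real prompts, not benchmark prompts. Fifty representative cases beat five thousand generic ones.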