3 min read
Gemini 2.5 vs GPT‑4o
**Last verified:** 19 July 2025

I re‑checked the original comparison of Gemini 2.5 to GPT‑4o. A couple of numbers have shifted, mostly pricing, so here is a clear, single‑page reference you can drop into the article or link as a footnote.
2 · Latest official sources
| Model / Doc | Link | Doc timestamp |
|---|---|---|
| Gemini 2.5 Pro – Vertex AI model card | docs.google.com (Model Card) | 11 Jul 2025 |
| Gemini 2.5 Flash – Vertex AI model card | docs.google.com (Model Card) | 10 Jul 2025 |
| Gemini thinking‑budget guide | Google API docs | 14 Jun 2025 |
| Gemini pricing table | AI Studio pricing | 09 Jul 2025 |
| OpenAI GPT‑4o pricing | openai.com/pricing | 05 Jun 2025 |
3 · Pricing changes since the original post
| Tier | Old (May 2025) | Current (19 Jul 2025) | Δ |
|---|---|---|---|
| Gemini 2.5 Flash input | $0.15 / M tokens | $0.10 / M | ↓ 33 % |
| Gemini 2.5 Flash output | $0.40 / M | $0.40 / M | — |
| Gemini 2.5 Pro input (≤ 200 k) | $1.25 / M | $1.25 / M | — |
| Gemini 2.5 Pro output (≤ 200 k) | $10 / M | $10 / M | — |
| Context‑cache price | — | $0.31 / M (≤ 200 k) | new |
| Flash‑Lite tier | — | $0.10 / M in · $0.40 / M out | new |
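To see what the input‑price cut means on a real bill, here is a quick back‑of‑envelope check. The per‑token rates come straight from the table above; the monthly token volumes are invented purely for illustration.

```python
# Back-of-envelope cost check for the new Flash pricing.
# Rates are from the table above; the traffic numbers are hypothetical.

FLASH_INPUT_OLD = 0.15 / 1_000_000   # $ per input token, May 2025
FLASH_INPUT_NEW = 0.10 / 1_000_000   # $ per input token, July 2025
FLASH_OUTPUT    = 0.40 / 1_000_000   # $ per output token, unchanged

def monthly_cost(input_tokens: int, output_tokens: int, input_rate: float) -> float:
    """Total dollars for one month of Flash traffic."""
    return input_tokens * input_rate + output_tokens * FLASH_OUTPUT

# Hypothetical workload: 500M input tokens, 50M output tokens per month.
inp, out = 500_000_000, 50_000_000
old = monthly_cost(inp, out, FLASH_INPUT_OLD)   # $95.00
new = monthly_cost(inp, out, FLASH_INPUT_NEW)   # $70.00
print(f"old: ${old:.2f}  new: ${new:.2f}  saved: ${old - new:.2f}")
```

At that (made‑up) volume the cut is worth about $25 a month; scale the token counts to your own traffic and the saving scales linearly, since only the input line changed.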
4 · Capabilities & context windows (unchanged)
| Feature | Gemini 2.5 Pro | Gemini 2.5 Flash | GPT‑4o |
|---|---|---|---|
| Max context | 1 M tokens | 1 M tokens | 128 k tokens |
| Thinking‑budget knob | ✓ | — | — |
| Vision + audio | Good (slower) | Vision only (fast) | Real‑time, best‑in‑class |
No change since the original post; a minimal example of the thinking‑budget knob follows.
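The thinking‑budget knob is the one row in the table with no GPT‑4o equivalent, so here is a minimal sketch of what it looks like in practice. It assumes the google‑genai Python SDK and an API key in the environment; parameter names follow the thinking‑budget guide linked in the sources table, so double‑check there before copying.

```python
# Minimal sketch: capping Gemini 2.5 Pro's reasoning tokens with the
# thinking-budget knob. Assumes the google-genai Python SDK
# (pip install google-genai) and GEMINI_API_KEY set in the environment.
from google import genai
from google.genai import types

client = genai.Client()  # picks up the API key from the environment

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Summarise the trade-offs between a 1M and a 128k context window.",
    config=types.GenerateContentConfig(
        # Upper bound on tokens the model may spend "thinking" before answering.
        thinking_config=types.ThinkingConfig(thinking_budget=1024)
    ),
)
print(response.text)
```

Lower budgets trade answer quality for latency and cost; the guide linked above covers the supported ranges per model.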
5 · Benchmarks & latency snapshots
- Gemini 2.5 Pro still scores ≈ 85 ± 1 % MMLU, narrowly behind GPT‑4o (≈ 86 %).
- Independent latency tests report 0.26–0.32 s TTFT for Flash, well within the “sub‑second” claim (a quick way to reproduce this is sketched after this list).
- GPT‑4o latency remains industry‑leading for multimodal I/O.
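If you want to reproduce the TTFT figure yourself, the sketch below times the first streamed chunk from Flash. It assumes the google‑genai Python SDK; network path and region will dominate the result, so treat the number as directional rather than a benchmark.

```python
# Rough time-to-first-token check for Gemini 2.5 Flash.
# Assumes the google-genai Python SDK and GEMINI_API_KEY in the environment.
import time
from google import genai

client = genai.Client()

start = time.perf_counter()
stream = client.models.generate_content_stream(
    model="gemini-2.5-flash",
    contents="Reply with a single short sentence.",
)
first_chunk = next(iter(stream))           # blocks until the first token arrives
ttft = time.perf_counter() - start
print(f"TTFT: {ttft:.2f}s, first chunk: {first_chunk.text!r}")
```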
6 · What actually changed?
- Flash input cost cut from $0.15 → $0.10 per M tokens.
- Flash‑Lite SKU introduced (same pricing as new Flash input, slightly lower weight limit).
- Context‑cache billing line added to Google docs.
- Minor wording: Google now labels Pro the “most advanced reasoning Gemini model.”
Everything else (context limits, thinking tokens, latency, benchmark scores, GPT‑4o pricing) remains unchanged.
7 · Bottom line
Only three edits needed to keep your comparison current:
- Update the Flash input price to $0.10 / M.
- Add a footnote that Flash‑Lite is now available (same price, lower latency).
- Mention the context‑cache add‑on if your readers care about serving cost at scale (a rough cost sketch follows below).
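For readers who do care about serving cost at scale, here is a rough sketch of what the cache line item can do to a bill. It assumes a Pro‑style workload that re‑sends the same 150 k‑token context on every call; the $1.25 / M input rate and $0.31 / M cache rate come from the tables above, while the call volume is made up and any per‑hour cache‑storage fee is ignored.

```python
# Back-of-envelope for the context-cache line item.
# Rates come from the pricing tables above; the workload is hypothetical
# and cache storage fees are deliberately left out of this sketch.

PRO_INPUT  = 1.25 / 1_000_000   # $ per input token (<= 200k tier)
CACHE_READ = 0.31 / 1_000_000   # $ per cached token read

context_tokens = 150_000        # same context re-sent on every call
calls_per_day  = 2_000          # hypothetical traffic

without_cache = context_tokens * calls_per_day * PRO_INPUT    # ~$375 / day
with_cache    = context_tokens * calls_per_day * CACHE_READ   # ~$93 / day
print(f"no cache: ${without_cache:.0f}/day  cached: ${with_cache:.0f}/day")
```

The bigger and more repetitive the shared context, the more the cache rate matters; for small or constantly changing prompts it is a rounding error.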
👨‍💻
Ryan Katayi
Full-stack developer who turns coffee into code. Building things that make the web a better place, one commit at a time.