LLM Token Cost Calculator

Compare Claude, ChatGPT, and Gemini API costs at any volume. Same prompt shape, three price tags.

UPDATED · 2026-04-25

Estimate API spend per request

Enter token counts for a single request. The cache-rate slider represents the fraction of input tokens that are stable prefixes (system prompts, long documents); only the Claude models get a prompt-caching discount here. Cards sort automatically from cheapest to most expensive.

MODEL               PROVIDER    INPUT / OUTPUT PER MTOK    PER REQUEST    PER 1M REQUESTS
GPT-4o mini         OpenAI      $0.15 / $0.60              $0.00          $0
Gemini 2.5 Flash    Google      $0.30 / $2.50              $0.00          $0
GPT-5 mini          OpenAI      $0.25 / $2.00              $0.00          $0
Claude Haiku 4.5    Anthropic   $1.00 / $5.00              $0.00          $0
Gemini 2.5 Pro      Google      $1.25 / $5.00              $0.00          $0
GPT-5               OpenAI      $1.25 / $10.00             $0.00          $0
Claude Sonnet 4.6   Anthropic   $3.00 / $15.00             $0.00          $0
Claude Opus 4.7     Anthropic   $5.00 / $25.00             $0.00          $0

Cache math: when the cache slider is non-zero, that fraction of input tokens is billed at 10% of the model's input rate (the standard prompt-caching read multiplier). The other models in the list don't get a cache discount here — partly because their caching APIs are priced and exposed differently, partly to keep this view honest. For the deeper Claude prompt-caching breakdown (write multipliers, TTLs, Batch API stacking), see the full Claude review.
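The cache math above can be sketched in a few lines. This is a minimal illustration, not the calculator's actual source: the price table mirrors the rates listed on this page, the 10% cache-read multiplier follows the note above, and the `supports_cache_discount` flag is an assumption that only the Claude models get the discount in this view.

```python
PRICES = {  # model: (input $/MTok, output $/MTok, supports_cache_discount)
    "GPT-4o mini":       (0.15, 0.60, False),
    "Gemini 2.5 Flash":  (0.30, 2.50, False),
    "GPT-5 mini":        (0.25, 2.00, False),
    "Claude Haiku 4.5":  (1.00, 5.00, True),
    "Gemini 2.5 Pro":    (1.25, 5.00, False),
    "GPT-5":             (1.25, 10.00, False),
    "Claude Sonnet 4.6": (3.00, 15.00, True),
    "Claude Opus 4.7":   (5.00, 25.00, True),
}

CACHE_READ_MULTIPLIER = 0.10  # cached input tokens billed at 10% of the input rate

def per_request_cost(model, input_tokens, output_tokens, cache_rate=0.0):
    """USD cost for one request; cache_rate is the cached fraction of input tokens."""
    in_rate, out_rate, cacheable = PRICES[model]
    cached = input_tokens * cache_rate if cacheable else 0
    uncached = input_tokens - cached
    input_cost = (uncached * in_rate + cached * in_rate * CACHE_READ_MULTIPLIER) / 1e6
    output_cost = output_tokens * out_rate / 1e6
    return input_cost + output_cost

# Sort cheapest → most expensive for a 2,000-in / 500-out request, 50% cached:
for cost, model in sorted((per_request_cost(m, 2000, 500, 0.5), m) for m in PRICES):
    print(f"{model:18s} ${cost:.6f} per request  (${cost * 1e6:,.0f} per 1M requests)")
```

Note how the discount only moves the input side of the bill: at a 50% cache rate, Claude Haiku 4.5's input cost drops from $0.002 to $0.0011 per 2,000-token request, while its $5.00/MTok output rate is unaffected.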

Building an LLM-powered system? We can scope it.

BOOK A CALL → SEE LLM REVIEWS →