[ TOOLS / EMBED API ]

Embeddings API pricing. Every embedding model, verified $/1M tokens.

Eighteen vector / embedding models priced live from OpenAI, Voyage AI, Cohere, Jina, Mistral, Google Vertex, AWS Bedrock, Nomic, DeepInfra, and Together. Pick the cheapest model that hits your retrieval bar — at your real monthly token volume and re-index cadence.

UPDATED · 2026-04-26 RATES · VERIFIED FROM EACH PROVIDER FREE · NO SIGN-UP

Compare embedding API costs

INTERACTIVE · 18 MODELS · TOKEN-BASED

TOKENS EMBEDDED PER MONTH 100M tokens

RE-EMBEDDING FREQUENCY

CAPABILITY FILTER

DIM PREFERENCE

Why frequency matters. Static (1×) is a one-shot ingest of your corpus. Weekly (52×) is a mid-frequency RAG re-index — typical for support docs, product catalogs, or knowledge bases that change. Daily (365×) is the news / social / agent-memory tier. The cheapest model on a static index is often not the cheapest at daily cadence; volumes compound.

Capability filter. MULTIMODAL = image + text in one vector space (Cohere Embed v4, Voyage Multimodal 3, Jina v4, Nomic Vision). CODE-OPTIMIZED = tuned for source-code retrieval (Voyage Code 3, Codestral Embed). LONG-CONTEXT = >8K input tokens (Voyage 3 family, Cohere Embed v4, M2-BERT 32K).

CHEAPEST AT YOUR VOLUME

—

BEST RETRIEVAL PICK

—

MODEL VENDOR MONTHLY $/1M TOK DIMS CTX CAVEAT

Methodology. Prices pulled 2026-04-26 from platform.openai.com, docs.voyageai.com, cohere.com, jina.ai, mistral.ai, cloud.google.com, aws.amazon.com, atlas.nomic.ai, deepinfra.com, and together.ai. Embedding APIs charge per input token only — no output tokens, no per-image fee unless multimodal. Monthly cost = (tokens × frequency × $/M tokens). The frequency multiplier matters more than most teams budget for: a 100M-token corpus re-embedded daily is 36.5B tokens/yr, which separates the $0.02/M and $0.18/M models into a different budget tier entirely. Output dimension affects downstream vector DB storage and query latency — Pinecone, Weaviate, Qdrant all bill on vector count × dim — but that cost is out of scope for this calc; favor compact dims (256–1024) where retrieval quality holds. A few notes: Hugging Face Inference Endpoints bill per-second of compute, not per-token, so they're not directly comparable and excluded; Voyage's first 200M tokens/account are free for v3-large / v4 family; Cohere Embed v4 lists a separate $0.47/M for image tokens which we flag in caveats. Always verify with the vendor before committing. Cheapest is not best. At RAG scale the recall delta between text-embedding-3-small and Voyage 3 large is real — pick on quality, then optimize cost.

Sizing a build? We can scope the whole thing.

BOOK A CALL → SEE TOOL REVIEWS →

Embeddings API pricing. Every embedding model, verified $/1M tokens.

Compare embedding API costs

ChatGPT (OpenAI)

Claude (Anthropic)

Gemini (Google)

Perplexity

Sizing a build? We can scope the whole thing.