Free tool, runs in your browser

LLM API Cost Calculator

Work out what Claude, GPT or Gemini will cost at your real usage. Pick a model, set your monthly volume and average tokens, and compare every model at the same workload. Prices are editable so the numbers stay honest as providers update them.

cost-calc.app

The defaults are editable; verify them against each provider's official pricing page. They are list prices as of June 2026, in USD per million tokens. If a provider changes its pricing, just type the new numbers in.

results.log

> Claude Sonnet 4.6 @ 10,000 req/mo

Monthly cost: $105.00
Annual cost: $1,260.00
Cost per request: $0.01 (rounded from $0.0105)

Estimates only. Real bills depend on caching, batching and how your prompts are built.

model-comparison.tbl
Monthly and annual cost of every model at the same usage
Model                          $/1M in  $/1M out  Per request  Monthly  Annual
Claude Fable 5                 $10.00   $50.00    $0.04        $350.00  $4,200.00
Claude Opus 4.8                $5.00    $25.00    $0.02        $175.00  $2,100.00
Claude Sonnet 4.6              $3.00    $15.00    $0.01        $105.00  $1,260.00
Claude Haiku 4.5 (★ Cheapest)  $1.00    $5.00     $0.0035      $35.00   $420.00
GPT-5.5                        $5.00    $30.00    $0.02        $200.00  $2,400.00
GPT-5.4                        $2.50    $15.00    $0.01        $100.00  $1,200.00
GPT-5.2                        $1.75    $14.00    $0.00875     $87.50   $1,050.00
Gemini 3.1 Pro                 $2.00    $12.00    $0.008       $80.00   $960.00
Gemini 3.5 Flash               $1.50    $9.00     $0.006       $60.00   $720.00

readme.txt

How costs are calculated

The formula is simple. For each request: cost = (input tokens ÷ 1,000,000 × input price) + (output tokens ÷ 1,000,000 × output price). Monthly cost is that figure multiplied by your requests per month, and annual cost is monthly × 12.
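
For anyone who wants to check the arithmetic outside the browser, here is a minimal Python sketch of the formula above. The names (`Price`, `cost_per_request`, `monthly_cost`) are mine for illustration and not the calculator's actual code. The workload of 1,000 input and 500 output tokens per request is also an assumption; it happens to reproduce the $105.00 monthly figure shown for Claude Sonnet 4.6 at 10,000 requests.

```python
from dataclasses import dataclass

TOKENS_PER_MILLION = 1_000_000


@dataclass(frozen=True)
class Price:
    """List price in USD per million tokens (hypothetical helper type)."""
    input_per_m: float
    output_per_m: float


def cost_per_request(price: Price, input_tokens: int, output_tokens: int) -> float:
    """(input tokens / 1M x input price) + (output tokens / 1M x output price)."""
    return (
        input_tokens / TOKENS_PER_MILLION * price.input_per_m
        + output_tokens / TOKENS_PER_MILLION * price.output_per_m
    )


def monthly_cost(price: Price, input_tokens: int, output_tokens: int,
                 requests_per_month: int) -> float:
    """Per-request cost multiplied by requests per month."""
    return cost_per_request(price, input_tokens, output_tokens) * requests_per_month


def annual_cost(monthly: float) -> float:
    """Annual cost is monthly x 12."""
    return monthly * 12


# Assumed workload: 1,000 input tokens, 500 output tokens, 10,000 requests/month.
sonnet = Price(input_per_m=3.00, output_per_m=15.00)
per_request = cost_per_request(sonnet, 1_000, 500)
monthly = monthly_cost(sonnet, 1_000, 500, 10_000)

print(f"Cost per request: ${per_request:.4f}")        # $0.0105
print(f"Monthly cost:     ${monthly:,.2f}")            # $105.00
print(f"Annual cost:      ${annual_cost(monthly):,.2f}")  # $1,260.00
```

Swapping in another row's prices with the same workload should give that row's monthly and annual figures from the comparison table.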

Treat the result as a ceiling rather than a forecast. In production, prompt caching and batch processing routinely cut bills by 50 to 90 percent, and routing simpler requests to a cheaper model saves even more. Getting that architecture right is the kind of thing I help clients with, and it is usually worth far more than haggling over which provider is a few cents cheaper per million tokens.
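
To get a feel for how those levers move the number, here is a rough Python sketch. Everything in it is an assumption for illustration: the function names, the discount factors (cached input billed at 10 percent of list, batched requests at 50 percent), the shares of traffic affected, and the idea that the two discounts simply multiply. Real discounts, cache-write surcharges and stacking rules differ by provider, so check their pricing pages before trusting any output.

```python
import math


def effective_monthly_cost(
    input_price: float,          # USD per 1M input tokens
    output_price: float,         # USD per 1M output tokens
    input_tokens: int,
    output_tokens: int,
    requests_per_month: int,
    cached_input_share: float = 0.0,   # fraction of input tokens read from cache (assumed)
    cached_price_factor: float = 1.0,  # cached input price as a fraction of list (assumed)
    batched_share: float = 0.0,        # fraction of requests sent via batch (assumed)
    batch_price_factor: float = 1.0,   # batch price as a fraction of list (assumed)
) -> float:
    """Monthly cost after hypothetical caching and batching discounts."""
    # Blend cached and uncached input tokens into one effective input price.
    blended_input_price = input_price * (
        cached_input_share * cached_price_factor + (1 - cached_input_share)
    )
    per_request = (
        input_tokens / 1_000_000 * blended_input_price
        + output_tokens / 1_000_000 * output_price
    )
    # Blend batched and real-time requests (assumes discounts multiply).
    request_factor = batched_share * batch_price_factor + (1 - batched_share)
    return per_request * requests_per_month * request_factor


def routed_monthly_cost(routes: list[tuple[float, float]]) -> float:
    """Blend monthly costs across models; routes are (traffic share, monthly cost)."""
    total_share = sum(share for share, _ in routes)
    if not math.isclose(total_share, 1.0):
        raise ValueError("traffic shares must sum to 1")
    return sum(share * cost for share, cost in routes)


# Sonnet list prices with an assumed workload of 1,000 in / 500 out, 10,000 req/mo.
baseline = effective_monthly_cost(3.00, 15.00, 1_000, 500, 10_000)
discounted = effective_monthly_cost(
    3.00, 15.00, 1_000, 500, 10_000,
    cached_input_share=0.8, cached_price_factor=0.1,
    batched_share=0.5, batch_price_factor=0.5,
)
# Hypothetical routing: 70% of traffic to Haiku ($35/mo), 30% stays on Sonnet ($105/mo).
routed = routed_monthly_cost([(0.7, 35.00), (0.3, 105.00)])

print(f"Baseline:   ${baseline:.2f}")
print(f"Discounted: ${discounted:.2f}")
print(f"Routed:     ${routed:.2f}")
```

With this particular workload the output tokens make up most of the cost, so caching the input on its own moves the total less than you might expect; how much each lever saves depends heavily on the shape of your prompts.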

faq.txt
  • How accurate are the prices in this calculator?
    The defaults are the published list prices from each provider as of June 2026, in USD per million tokens. Providers change pricing without much notice, so every price field is editable: check the official pricing page and type in the current numbers if they have drifted.
  • What is a token?
    A token is the unit LLM providers bill by. In English text, one token is roughly 4 characters or about three quarters of a word, so 1,000 tokens is around 750 words. Your prompt (input tokens) and the model's response (output tokens) are billed at different rates, which is why the calculator asks for both.
  • Which model should I pick for business work?
    For most business workloads I recommend Claude, usually Sonnet: it is strong at instruction following, writing and tool use, and the price sits in a sensible middle. Haiku is the pick for very high-volume, simpler tasks, and Opus for the hardest reasoning work. GPT and Gemini models are genuinely good too and win on some tasks, so for anything high-stakes it is worth a small bake-off on your real data.
  • How can I cut my LLM API bill?
    The big levers are prompt caching (reusing a long shared prompt prefix at a discount), batch processing for anything that does not need a real-time answer, routing easy requests to a cheaper model, and trimming bloated prompts. Combined, these commonly cut bills by 50 to 90 percent. This is exactly the kind of work I do with clients.
  • Does this calculator send my numbers anywhere?
    No. All of the maths runs in your browser. Nothing you type is stored, logged or sent to a server.
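
If you do not know your average token counts yet, the rule of thumb from the token question above (about 4 characters, or three quarters of a word, per token in English) can be turned into a quick estimate. This is a rough Python sketch with function names of my own choosing; it is a heuristic, not a tokenizer, and real counts vary by model, language and content, so use the provider's own token counter for anything that matters.

```python
CHARS_PER_TOKEN = 4       # rough English average, per the rule of thumb above
WORDS_PER_TOKEN = 0.75    # about three quarters of a word per token


def estimate_tokens_from_chars(text: str) -> int:
    """Approximate token count from character length (heuristic only)."""
    return round(len(text) / CHARS_PER_TOKEN)


def estimate_tokens_from_words(word_count: int) -> int:
    """Approximate token count from a word count (heuristic only)."""
    return round(word_count / WORDS_PER_TOKEN)


print(estimate_tokens_from_words(750))          # 1000
print(estimate_tokens_from_chars("a" * 4_000))  # 1000
```

The estimates are good enough to fill in the calculator's token fields for a first pass, then replace them with measured averages from real traffic.
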
next-step.app

Want these numbers 50 to 90 percent lower?

I help businesses pick the right model, then cut the bill with caching, batching and smart routing before a single line of integration code is written.