Inference cost
The per-request cost of running an LLM, measured in tokens; input and output tokens are priced separately, with output tokens typically several times more expensive.
Claude Haiku is priced at roughly $1 per million input tokens and $5 per million output tokens; Sonnet is $3 and $15. A single SumTube summary uses roughly 5K–15K input and 500–1,500 output tokens, so even the worst case on Sonnet is about $0.07 (15K × $3/1M + 1.5K × $15/1M), and with prompt caching discounting the repeated input tokens a typical request lands under a few US cents.
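The arithmetic above can be sketched as a small cost calculator. The prices and token counts come from this entry; the 90% discount on cached input tokens is an assumption for illustration, and actual provider pricing may differ.

```python
# USD per 1M tokens, from the figures in this entry.
PRICES = {
    "haiku": {"input": 1.00, "output": 5.00},
    "sonnet": {"input": 3.00, "output": 15.00},
}

def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_fraction: float = 0.0,
                 cache_discount: float = 0.9) -> float:
    """Estimate the USD cost of one request.

    cached_fraction: share of input tokens served from the prompt cache.
    cache_discount: price reduction on cached input tokens (assumed 90%).
    """
    p = PRICES[model]
    # Cached input tokens are billed at a fraction of the base input price.
    effective_input = input_tokens * (1 - cached_fraction * cache_discount)
    return (effective_input * p["input"] + output_tokens * p["output"]) / 1_000_000

# Worst-case SumTube summary on Sonnet, no caching:
print(request_cost("sonnet", 15_000, 1_500))   # 0.0675 → about 7 cents

# Same request on Haiku with 80% of the input cached:
print(round(request_cost("haiku", 15_000, 1_500, cached_fraction=0.8), 5))
```

The same worst-case request drops to roughly a cent on Haiku once most of the prompt is cached, which is where the "few cents per request" figure comes from.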