TUTORIAL · 5 min read

Track token cost across models, traces, and days

The Currai team, Product · May 15, 2026

Most teams discover the cost of an LLM feature from a billing alert, not a dashboard. The fix is boring and effective: record token usage on every model call, and let cost be computed and rolled up for you. Then "what does this feature cost?" has an answer you can read rather than estimate.

Pass usage on every generation

A generation without usage is a generation you can't cost. Most SDKs hand you the token counts in the response — pass them straight through.

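from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# `messages` is your chat history; `gen` is the generation you opened before this call.
completion = client.chat.completions.create(model="gpt-4o-mini", messages=messages)

# Close the generation with the model output and the token counts from the response.
gen.end(
    output=completion.choices[0].message.content,
    usage={
        "input": completion.usage.prompt_tokens,
        "output": completion.usage.completion_tokens,
        "total": completion.usage.total_tokens,
        "unit": "TOKENS",
    },
)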

Currai applies a built-in, editable model price table to those counts, so each generation carries a dollar cost — not just a token count.
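
To make the arithmetic concrete, here is a minimal sketch of what a price table does with those counts: tokens times a per-token rate, summed over input and output. The model names, the prices, and the `generation_cost` helper are hypothetical placeholders for illustration, not Currai's actual table or API.

```python
from decimal import Decimal

# Hypothetical price table in USD per 1M tokens. Placeholder figures, not real rates.
PRICES_PER_MILLION = {
    "example-small": {"input": Decimal("0.50"), "output": Decimal("1.50")},
    "example-large": {"input": Decimal("5.00"), "output": Decimal("15.00")},
}


def generation_cost(model: str, usage: dict) -> Decimal:
    """Return the USD cost of one generation from its token usage."""
    price = PRICES_PER_MILLION[model]
    # Input and output tokens are priced separately, then scaled from per-million.
    weighted = usage["input"] * price["input"] + usage["output"] * price["output"]
    return weighted / Decimal(1_000_000)


# Same usage shape as the dict passed to gen.end() above.
usage = {"input": 1200, "output": 300, "total": 1500, "unit": "TOKENS"}
cost = generation_cost("example-small", usage)
print(f"${cost}")
```

`Decimal` is used here so that many small per-call costs add up without floating-point drift; whether that matters depends on how finely you reconcile against your invoice.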

Roll-ups you get for free

Once usage is flowing, the aggregates assemble themselves:

  • Per trace — what one answer cost, end to end, across every model call in it.
  • Per model — where your spend actually lives, so you know which call to move to a cheaper model.
  • Per user — which customers are expensive, useful for pricing and for spotting abuse.
  • Per day — the trend line that tells you a change made things better or worse.
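
The four roll-ups above are the same operation with a different grouping key. The sketch below shows that idea on in-memory records; the `Generation` record and `roll_up` helper are assumptions made for illustration and do not describe how Currai stores or aggregates data.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Generation:
    """Hypothetical flattened record: one costed model call."""
    trace_id: str
    model: str
    user_id: str
    day: date
    cost_usd: float


def roll_up(generations, key):
    """Sum cost_usd over generations, grouped by the named attribute."""
    totals = defaultdict(float)
    for g in generations:
        totals[getattr(g, key)] += g.cost_usd
    return dict(totals)


generations = [
    Generation("t1", "model-a", "u1", date(2026, 5, 14), 0.25),
    Generation("t1", "model-b", "u1", date(2026, 5, 14), 0.5),
    Generation("t2", "model-a", "u2", date(2026, 5, 15), 0.125),
]

per_trace = roll_up(generations, "trace_id")
per_model = roll_up(generations, "model")
per_user = roll_up(generations, "user_id")
per_day = roll_up(generations, "day")
```

Because every roll-up reads the same per-generation cost, a call that is missing usage is missing from all four views at once.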

Find the expensive call, not just the expensive day

A daily total tells you spend went up; it doesn't tell you why. Because cost lives on the trace, you can sort traces by cost and open the most expensive ones directly. Nine times out of ten it's a single runaway pattern — a prompt that grew a giant retrieved context, or a retry loop that called the model five times.
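
If you export trace summaries and want to triage them yourself, the sort-then-inspect step might look like the sketch below. The `TraceSummary` fields and the thresholds are hypothetical; tune them to your own traffic.

```python
from dataclasses import dataclass


@dataclass
class TraceSummary:
    """Hypothetical per-trace summary exported from your tracing data."""
    trace_id: str
    cost_usd: float
    model_calls: int
    max_input_tokens: int


def most_expensive(traces, n=3):
    """Return the n traces with the highest cost, most expensive first."""
    return sorted(traces, key=lambda t: t.cost_usd, reverse=True)[:n]


def likely_cause(trace, max_calls=3, max_input_tokens=20_000):
    """Guess which runaway pattern a trace matches. Thresholds are illustrative."""
    if trace.model_calls > max_calls:
        return "repeated model calls (possible retry loop)"
    if trace.max_input_tokens > max_input_tokens:
        return "oversized prompt (possible runaway retrieved context)"
    return "no obvious pattern"


traces = [
    TraceSummary("t1", 0.02, 1, 1_500),
    TraceSummary("t2", 0.90, 5, 2_000),
    TraceSummary("t3", 1.40, 1, 60_000),
]

for trace in most_expensive(traces, n=2):
    print(trace.trace_id, trace.cost_usd, likely_cause(trace))
```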

Cost is a quality signal too

Cheaper isn't automatically better, but a sudden cost drop can mean the model started truncating, and a spike can mean a prompt is leaking context it shouldn't. Watching cost next to your quality scores turns the dollar figure into an early warning system — not just a line item you reconcile at the end of the month.
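
One simple way to turn that into an alert is to compare each day's cost with a trailing average and flag large moves in either direction. This is a sketch under assumed inputs (a plain list of daily totals) with an illustrative window and threshold, not a built-in Currai feature.

```python
from statistics import mean


def cost_alerts(daily_costs, window=7, threshold=0.5):
    """Flag days whose cost moves more than `threshold` (as a fraction)
    away from the mean of the preceding `window` days."""
    alerts = []
    for i in range(window, len(daily_costs)):
        baseline = mean(daily_costs[i - window:i])
        if baseline == 0:
            continue  # no meaningful baseline to compare against
        change = (daily_costs[i] - baseline) / baseline
        if abs(change) > threshold:
            alerts.append((i, "spike" if change > 0 else "drop", change))
    return alerts


# Eight flat days, then a sharp drop, then a sharp spike.
daily_costs = [10.0] * 8 + [4.0, 20.0]
alerts = cost_alerts(daily_costs)
for day_index, kind, change in alerts:
    print(f"day {day_index}: {kind} of {change:+.0%} vs trailing average")
```

A flagged drop is worth checking against that day's quality scores before you celebrate it, and a flagged spike is a cue to open the day's most expensive traces.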