TUTORIAL4 min read

Set budgets and alerts for runaway LLM cost

The Currai team, Product — Mar 18, 2026

LLM cost doesn't creep — it spikes. A retry loop that calls the model five times, a prompt that accidentally includes a megabyte of context, a new feature that ten thousand users hit at once. Each can multiply your bill overnight, and the worst way to find out is the monthly invoice. Budgets and alerts move that discovery forward to the moment it starts.

Cost has to be measured before it can be alerted on

You can't alert on a number you don't have. The prerequisite is token usage on every generation, which is what makes per-trace, per-model, and per-day cost real.

gen.end(
    output=reply,
    usage={"input": 312, "output": 88, "unit": "TOKENS"},
)
# cost is computed and rolled up per trace, model, user, and day

Once that's flowing, a budget is just a threshold on a roll-up you already have.

Alert on the leading indicators, not the total

A monthly total is a lagging indicator — by the time it's high, the money's spent. Watch the things that move first:

Cost per day crossing a fraction of your monthly budget — the trend that predicts the invoice.
Cost per trace spiking above its normal band — a single trace that costs 100x the median is almost always a bug.
Tokens per request climbing over time — the slow creep of prompts getting fatter as features pile on.

From alert to root cause in one click

An alert that just says "cost is high" sends you hunting. An alert tied to your traces sends you to the evidence: sort the day's traces by cost, open the most expensive ones, and the runaway pattern is right there — the retry loop, the fat context, the model that should have been the cheaper one. Cost living on the trace is what turns "the bill went up" into "this prompt did it."

Make the budget a product decision

Budgets aren't just guardrails against bugs; they're how you keep a feature's unit economics honest. Tag traces by feature and the cost roll-up tells you what each one costs to run. When a feature's cost outgrows its value, you'll know early enough to fix the prompt, switch the model, or change the pricing — instead of discovering it a quarter too late.