Notes on LLM observability

GUIDE · 6 min read

What is LLM observability?

LLM observability is the practice of capturing every prompt, completion, token, and tool call so you can explain what your model did and why. Here's what it means and where to start.

The Currai team · May 27, 2026

ENGINEERING · 5 min read

Tracing vs logging: why print() debugging fails for LLM apps

Logs tell you a line ran. Traces tell you what the model saw, said, and cost across the whole request. Here's why LLM apps need tracing, not more print statements.

The Currai team · May 22, 2026

COMPANY · 4 min read

Why we built Currai

LLM apps fail in ways your APM never sees. We built Currai so you can watch every prompt, token, and tool call the way you already watch latency and errors.

The Currai team · May 20, 2026

TUTORIAL · 6 min read

Debug a slow RAG pipeline with nested traces

A RAG answer that takes four seconds could be slow retrieval, a fat prompt, or the model itself. Nested traces tell you which one — here's how to find the bottleneck.

The Currai team · May 18, 2026

TUTORIAL · 5 min read

Track token cost across models, traces, and days

Pass token usage on every generation and Currai turns it into cost — rolled up per trace, model, user, and day. Stop guessing what an LLM feature costs.

The Currai team · May 15, 2026

TUTORIAL · 5 min read

Trace your first LLM call in 5 lines

Install the SDK, paste your keys, wrap a call. A walkthrough of going from print() debugging to a real trace you can replay.

The Currai team · May 12, 2026

TUTORIAL · 4 min read

Group multi-turn conversations with sessions and users

One conversation is many traces. Pass a session id to stitch every turn into one thread, and a user id to slice cost, latency, and volume by the people using your app.

The Currai team · May 9, 2026

DEEP DIVE · 6 min read

Traces, spans, and generations: the LLM data model

Trace, span, generation — three nouns that cover everything an LLM app does. Understand the data model and your instrumentation stops being guesswork.

The Currai team · May 6, 2026

PRODUCT · 3 min read

Why we bill on processed bytes, not trace count

Per-trace pricing punishes you for instrumenting well. We price on the data Currai actually processes, so big traces and tiny traces cost what they are.

The Currai team · May 4, 2026

TUTORIAL · 4 min read

Migrate from Langfuse to Currai in one line

Currai is byte-compatible with the Langfuse SDKs. If you're already instrumented, you migrate by changing the host — your spans, exporters, and trace code keep working.

The Currai team · May 1, 2026

GUIDE · 6 min read

Instrument LLM apps with OpenTelemetry

OpenTelemetry is the open standard for traces. Currai ingests OTLP spans, so you can use the collector and instrumentation you already run and still get LLM-aware views.

The Currai team · Apr 28, 2026

TUTORIAL · 5 min read

Measure latency and time-to-first-token

Total latency hides the metric users actually feel — time to first token. Here's how to capture both on every generation and find what's making your LLM app feel slow.

The Currai team · Apr 10, 2026

DEEP DIVE · 6 min read

Observability for AI agents and tool calls

Agents loop, call tools, and call themselves — a single request can be dozens of model calls. Here's how to trace agent runs so you can see exactly where one went off the rails.

The Currai team · Apr 5, 2026

GUIDE · 5 min read

Managed vs self-hosted LLM observability

Observability data is high-volume and append-heavy — the classic case for running ClickHouse yourself. Here's the real trade-off between self-hosting and a managed backend.

The Currai team · Mar 30, 2026

BEST PRACTICES · 5 min read

Sampling and PII redaction for production tracing

Production traces carry prompts full of user data and arrive at full traffic volume. Here's how to sample for cost and redact for privacy without losing the traces you need.

The Currai team · Mar 24, 2026

TUTORIAL · 4 min read

Set budgets and alerts for runaway LLM cost

A single runaway prompt or retry loop can 10x your bill overnight. Here's how to turn the cost data on your traces into budgets and alerts that warn you before the invoice does.

The Currai team · Mar 18, 2026

ENGINEERING · 5 min read

LLM observability beyond Python and TypeScript

Your LLM calls don't all live in Python. Currai ingests traces over plain HTTP and OTLP, so any language that can make a request can send traces — here's how.

The Currai team · Mar 12, 2026

TUTORIAL · 6 min read

Trace a multi-turn chatbot end to end

A chatbot that works for one message often breaks by message seven. Here's how to trace a full conversation — sessions, history, and cost — so you can debug the whole thread.

The Currai team · Mar 6, 2026