Every AI Call Can Teach You Something.

Trace LLM calls. Test prompts. Ship better AI.

Get Started

Currai has a 7-day free trial.

Traces

Every trace-create event lands here in real time.

Timestamp | Name | Level | Env | User | Session | Input | Tags | Latency | Obs | Tokens | Output | Cost
Jun 23, 03:18 PM | chat-turn | OK | production | user-184 | sess-7b2 | summarize refund policy | support, prod | 842 ms | 2 | 1.3k | The refund window is 30 days... | $0.0018
Jun 23, 03:16 PM | rag-answer | WARNING | production | user-092 | sess-a19 | compare enterprise plans | rag, pricing | 2.4 s | 4 | 3.8k | The Pro tier includes... | $0.0069
Jun 23, 03:14 PM | openai.chat.completions | OK | staging | qa-12 | sess-41f | draft onboarding reply | email | 690 ms | 1 | 884 | Welcome aboard. Here are... | $0.0009
Jun 23, 03:11 PM | support-agent | ERROR | production | user-318 | sess-c03 | change my billing email | agent, tool | 5.8 s | 5 (1 err) | 5.6k | tool timeout: crm.updateUser | $0.0112
Jun 23, 03:08 PM | eval-run | OK | production | system | eval-22 | score answer relevance | evals | 1.1 s | 3 | 2.1k | score: 0.91 | $0.0034

Ship quality AI at scale

01
Observability

Capture every LLM call, tool execution, and retrieval step in hierarchical traces. Filter by user, session, latency, cost, or custom metadata.

02
Evaluation

Evaluate outputs with LLM judges, custom heuristics, or human review. Run evaluations on production traffic or prompt experiments.

03
Prompt Management

Manage prompts outside your codebase with one-click deployments and rollbacks. Collaborate on prompt improvements with your entire team.

Everything You Need to Improve Your AI

From production traces to prompt experiments, Currai gives your team one place to measure, test, and improve every AI interaction.

When you need to

Understand why an AI response failed

Without Currai

Search disconnected logs and try to reconstruct what happened.

With Currai

1. Opening the complete production trace
2. Following retrieval and tool calls
3. Inspecting the prompt, model output, latency, and error

The retrieval step timed out after 4.2s, leaving the model without context. The failing span and its inputs are ready to inspect.
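
To make the idea of a hierarchical trace concrete, here is a minimal sketch of walking a trace tree to find the failing span. The `Span` shape, field names, and sample values are assumptions for illustration only; they are not Currai's schema or SDK.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Span:
    """One step in a trace (hypothetical shape, not Currai's schema)."""
    name: str
    kind: str                      # e.g. "retrieval", "generation", "tool"
    duration_s: float
    error: Optional[str] = None
    children: List["Span"] = field(default_factory=list)


def walk(span: Span, depth: int = 0) -> Iterator[Tuple[int, Span]]:
    """Yield every span in depth-first order together with its nesting depth."""
    yield depth, span
    for child in span.children:
        yield from walk(child, depth + 1)


def first_failure(root: Span) -> Optional[Span]:
    """Return the first span, in depth-first order, that recorded an error."""
    for _, span in walk(root):
        if span.error is not None:
            return span
    return None


# Illustrative data: a retrieval step that timed out before the model call.
trace = Span("rag-answer", "trace", 4.9, children=[
    Span("retrieve-docs", "retrieval", 4.2, error="timeout"),
    Span("llm-call", "generation", 0.7),
])

failing = first_failure(trace)
for depth, span in walk(trace):
    marker = "  <-- failed" if span is failing else ""
    print("  " * depth + f"{span.name} ({span.kind}, {span.duration_s}s){marker}")
```

The depth-first walk keeps parent and child steps together, which is what makes a failing retrieval step easy to spot next to the generation that depended on it.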

When you need to

Measure the quality of production responses

Without Currai

Manually review a small sample and rely on intuition.

With Currai

1. Selecting recent production traces
2. Running LLM judges and heuristic evaluators
3. Grouping low-scoring responses by failure reason

Evaluation complete: 91% passed. The 23 low-scoring responses are grouped and ready for review.
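
As a rough illustration of heuristic evaluators and grouping by failure reason, the sketch below runs two simple checks over a handful of outputs. The evaluator names, the refusal phrase, and the sample outputs are all hypothetical; a real setup would combine checks like these with LLM judges or human review.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Optional, Tuple

# An evaluator returns None when the output passes, or a short failure reason.
Evaluator = Callable[[str], Optional[str]]


def not_empty(output: str) -> Optional[str]:
    return None if output.strip() else "empty output"


def no_refusal(output: str) -> Optional[str]:
    return "refusal" if "i cannot help" in output.lower() else None


def evaluate(outputs: List[str],
             evaluators: List[Evaluator]) -> Tuple[float, Dict[str, List[str]]]:
    """Return the pass rate and the failing outputs grouped by first failure reason."""
    failures: Dict[str, List[str]] = defaultdict(list)
    passed = 0
    for output in outputs:
        reasons = [r for r in (ev(output) for ev in evaluators) if r is not None]
        if reasons:
            failures[reasons[0]].append(output)
        else:
            passed += 1
    pass_rate = passed / len(outputs) if outputs else 0.0
    return pass_rate, dict(failures)


# Made-up sample outputs for illustration.
outputs = [
    "The refund window is 30 days.",
    "",
    "I cannot help with that.",
    "Welcome aboard.",
]
pass_rate, failures = evaluate(outputs, [not_empty, no_refusal])
print(f"{pass_rate:.0%} passed")
for reason, items in failures.items():
    print(reason, len(items))
```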

When you need to

Know which prompt performs better

Without Currai

Deploy a new prompt and hope the results improve.

With Currai

1. Splitting production traffic between prompt versions
2. Measuring quality, latency, tokens, and cost
3. Comparing results on real user requests

Version B improved quality by 18% and reduced token usage by 12%. It is ready to promote.
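
One common way to split traffic between two prompt versions is deterministic hash bucketing, sketched below. This is a generic technique, not a description of how Currai assigns versions; the function name `assign_version` and the experiment label are assumptions.

```python
import hashlib


def assign_version(user_id: str, experiment: str, share_b: float = 0.5) -> str:
    """Deterministically assign a user to prompt version "A" or "B".

    Hashing the experiment name with the user id gives each user a stable
    bucket in [0, 1), so the same user always sees the same version.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "B" if bucket < share_b else "A"


# Illustrative usage with made-up user ids.
assignments = [assign_version(f"user-{i}", "refund-prompt-test") for i in range(1000)]
print("share on B:", assignments.count("B") / len(assignments))
```

Because assignment depends only on the user id and experiment name, no per-user state has to be stored, and quality, latency, and token metrics can later be grouped by the assigned version.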

When you need to

Update a prompt without redeploying your app

Without Currai

Edit hard-coded prompts, open a pull request, and redeploy.

With Currai

1. Creating a new version in the prompt registry
2. Previewing the compiled prompt with variables
3. Publishing with a complete version history

Version 12 is live. Previous versions remain available for an instant rollback.

When you need to

Test a prompt before sending it to production

Without Currai

Copy inputs between scripts and compare outputs manually.

With Currai

1. Replaying representative production inputs
2. Comparing prompts and models side by side
3. Scoring every output with the same evaluators

The strongest prompt and model combination is identified using real inputs and consistent scores.

When you need to

Find what is making your AI slow or expensive

Without Currai

See one total duration and a monthly provider bill.

With Currai

1. Breaking down latency across generations and spans
2. Comparing token usage and cost by model and prompt
3. Filtering expensive traces by user, session, and environment

Repeated tool calls account for 31% of cost, while retrieval adds 5.1s to the slowest requests.
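
A breakdown like this comes down to grouping span-level numbers by span type. The sketch below assumes spans are available as plain dictionaries with `kind`, `latency_s`, and `cost_usd` fields; those field names and the sample values are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def share_by_kind(spans: List[dict], key: str) -> Dict[str, float]:
    """Return each span kind's share of the total for `key` (values sum to 1)."""
    totals: Dict[str, float] = defaultdict(float)
    for span in spans:
        totals[span["kind"]] += span[key]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {}
    return {kind: value / grand_total for kind, value in totals.items()}


# Made-up spans from a single slow request.
spans = [
    {"kind": "retrieval", "latency_s": 5.1, "cost_usd": 0.0},
    {"kind": "generation", "latency_s": 0.7, "cost_usd": 0.0069},
    {"kind": "tool", "latency_s": 0.1, "cost_usd": 0.0016},
    {"kind": "tool", "latency_s": 0.1, "cost_usd": 0.0015},
]

cost_share = share_by_kind(spans, "cost_usd")
latency_share = share_by_kind(spans, "latency_s")
for kind in cost_share:
    print(f"{kind}: {cost_share[kind]:.0%} of cost, {latency_share[kind]:.0%} of latency")
```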

Launch, observe, improve — repeat.

(and better!)

Runner seamlessly integrates with the tools you already rely on, streamlining your workflow and ensuring that tasks are completed efficiently and effectively. It takes care of the details so you can focus on what truly matters.

1

Concurrent Runners

Run multiple tasks in parallel. Draft follow-ups while pulling analytics while updating your CRM. Receipts and timestamps for everything.

2

Local + Cloud

Works across your local machine and cloud services. Manages files, apps, and workflows wherever they live. Your data stays yours.

3

Memory Across Sessions

Runner remembers what matters across sessions: your contacts, your preferences, your unfinished work. Context that compounds over time.

Connects to the stack you already use

  • Model Providers: OpenAI, Mistral, Anthropic, Gemini, Perplexity, Qwen
  • Developer Tools: Cursor, GitHub Copilot, OpenCode, VS Code
  • No Code: Lovable, Replit, Bolt
  • Frameworks: Vercel AI SDK
  • Native: TypeScript, Python, OpenTelemetry
  • Other: Exa

Pricing that tracks real volume

Starter

Free to get started.

$0/mo

50 MB included

3-day retention

  • Drop-in Python & TypeScript SDKs
  • Full traces, tokens & cost in one view
  • Langfuse & OpenTelemetry compatible

Pro

For teams shipping to production.

$8/mo

2 GB included

14-day retention

  • Everything in Starter
  • Run evals and A/B test prompt versions in production
  • Sessions & users roll-ups
  • Cost, token & latency dashboards

Business

Higher volume, longer history.

$20/mo

4 GB included

30-day retention

  • Everything in Pro
  • Hosted ingestion, storage & dashboards
  • Priority support

Need a custom plan?

Higher volume, longer retention, or specific terms — tell us about your usage and we'll tailor a plan to fit.

Contact us

Questions, answered

What is Currai?
Currai is observability for LLM apps. It traces every prompt, token, and tool call your app makes so you can debug, measure, and ship with confidence — full traces, token usage, and cost in one view. It also supports LLM evals, prompt A/B testing, and OpenTelemetry/Langfuse-compatible ingestion.
How long does it take to get my first trace?
About five minutes. Install the SDK with pip install currai or npm i currai, paste your public / secret key pair, and wrap a single LLM call. There's no agent to deploy and no collector to run — the first request you make shows up in the dashboard right away.
Which languages and frameworks do you support?
Currai ships first-class Python and TypeScript SDKs, and it's byte-compatible with the Langfuse SDKs and OpenTelemetry. If you're already instrumented, you migrate by changing a single host line — your existing trace code, spans, and exporters keep working.
Can Currai replace Langfuse or Braintrust?
Currai is an option for teams comparing Langfuse alternatives or Braintrust alternatives for hosted LLM observability, production traces, evals, prompt A/B tests, token cost tracking, and OpenTelemetry-compatible ingestion. Test your exact workflow before switching.
Will tracing slow down my app?
No. Traces are batched and flushed in the background, so instrumentation never blocks a request. In short-lived processes you call flush() (or flush_async()) before exit to make sure everything is sent.
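
For readers curious what "batched and flushed in the background" usually means, here is a generic sketch of the pattern using only the Python standard library. It is not Currai's SDK: `BatchingSender`, `record`, and the `send` callback are hypothetical names, and only the `flush()` idea mirrors the answer above.

```python
import queue
import threading
from typing import Any, Callable, List


class BatchingSender:
    """Collect events on a queue and send them in batches from a worker thread."""

    def __init__(self, send: Callable[[List[Any]], None],
                 batch_size: int = 10, interval_s: float = 0.5) -> None:
        self._send = send
        self._batch_size = batch_size
        self._interval_s = interval_s
        self._queue: "queue.Queue[Any]" = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def record(self, event: Any) -> None:
        """Enqueue and return immediately; the caller never waits on the network."""
        self._queue.put(event)

    def flush(self) -> None:
        """Block until every queued event has been handed to send()."""
        self._queue.join()

    def _run(self) -> None:
        while True:
            batch = [self._queue.get()]  # wait for at least one event
            try:
                while len(batch) < self._batch_size:
                    batch.append(self._queue.get(timeout=self._interval_s))
            except queue.Empty:
                pass  # interval elapsed; send what we have
            try:
                self._send(batch)
            except Exception:
                pass  # a real client would retry or log; this sketch drops the batch
            finally:
                for _ in batch:
                    self._queue.task_done()


# Usage: a list stands in for the network call.
sent: List[List[int]] = []
sender = BatchingSender(sent.append, batch_size=3, interval_s=0.05)
for i in range(7):
    sender.record(i)
sender.flush()  # in a short-lived script, call this before exiting
print(sent)
```

Because the worker is a daemon thread, events still in the queue are lost if the process exits without flushing, which is why short-lived processes need the explicit call.
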
How is pricing calculated?
Usage is billed on the data Currai actually processes — measured in processed bytes — not on a flat per-trace fee. Large traces and tiny traces are priced for what they are, so your cost tracks real volume instead of row counts.
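
To get a feel for byte-based pricing, you can estimate volume from the serialized size of a typical trace. The sketch below rests on two assumptions that the page does not confirm: that processed bytes are close to the UTF-8 size of the JSON payload, and that 1 MB means 1,000,000 bytes. The function names and sample trace are illustrative.

```python
import json


def payload_bytes(trace: dict) -> int:
    """Approximate size of one trace as compact UTF-8 JSON (an assumption)."""
    text = json.dumps(trace, separators=(",", ":"), ensure_ascii=False)
    return len(text.encode("utf-8"))


def monthly_megabytes(avg_trace_bytes: float, traces_per_day: int, days: int = 30) -> float:
    """Estimated monthly volume in MB, taking 1 MB = 1,000,000 bytes (an assumption)."""
    return avg_trace_bytes * traces_per_day * days / 1_000_000


# Made-up example trace.
example = {
    "name": "chat-turn",
    "input": "summarize refund policy",
    "output": "The refund window is 30 days...",
    "tags": ["support", "prod"],
}
size = payload_bytes(example)
print(size, "bytes per trace")
print(monthly_megabytes(size, traces_per_day=1000), "MB per month")
```

Comparing the estimate with a plan's included volume (50 MB, 2 GB, or 4 GB above) shows why long inputs and outputs matter more than the raw number of traces.
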
How long is my data retained?
Retention follows your plan, so you keep traces for as long as your plan's window allows. You can export or delete your data on demand at any time.
Do I have to run any infrastructure?
None. Currai hosts ingestion, columnar storage, and dashboards for you — there's no ClickHouse to babysit. It scales with your traffic so you watch data, not infrastructure.