Quick start
Installation
Within the monorepo
External (CodeArtifact)
With pip:
.env:
pyproject.toml:
Lifecycle
| Function | Description |
|---|---|
| configure(run_id, phase, ...) | Initialize tracing. Safe to call multiple times. Auto-configures from env vars. |
| trace_run(meta) | Context manager for a rollout. Creates root span, writes trace summary on exit. Returns trace_id. |
| flush(timeout=5.0) | Flush pending spans to exporters. |
| shutdown() | Flush and release resources. Call at process exit. |
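
The call order implied by the table can be sketched as follows. The four functions below are hypothetical stand-ins that only mirror the documented signatures, so the example runs without the real package; the real functions export spans rather than record their names. It also assumes that trace_run yields the trace ID to the `with` block, and the run ID and datapoint ID are made-up values.

```python
import uuid
from contextlib import contextmanager

# Hypothetical stand-ins mirroring the documented signatures. They only
# record the order in which they are called.
calls = []

def configure(run_id=None, phase=None, **kwargs):
    calls.append("configure")

@contextmanager
def trace_run(meta):
    # Assumption: a caller-supplied trace_id is reused, otherwise one is generated.
    trace_id = meta.get("trace_id") or uuid.uuid4().hex
    calls.append("trace_run:enter")
    try:
        yield trace_id
    finally:
        # The documented trace_run writes the trace summary at this point.
        calls.append("trace_run:exit")

def flush(timeout=5.0):
    calls.append("flush")

def shutdown():
    flush()
    calls.append("shutdown")

def run_rollout(datapoint_id):
    configure(run_id="run-123", phase="eval")  # made-up run ID
    with trace_run({"datapoint_id": datapoint_id, "phase": "eval"}) as trace_id:
        pass  # rollout work goes here
    return trace_id

trace_id = run_rollout("dp-1")
shutdown()  # once, at process exit
```

Using try/finally in the stand-in reflects the documented behavior that the summary is written on exit, including when the rollout raises.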
trace_run metadata
| Key | Required | Description |
|---|---|---|
| datapoint_id | Yes | Input identifier |
| phase | Yes | "train", "eval", or "prod" |
| run_id | No | Kestrel run ID |
| step_number | No | Training/eval step (omit for prod) |
| env_name | No | Environment name |
| model_name | No | Model identifier |
| grader_name | No | Grader name (eval/train only) |
| trace_id | No | Manual trace ID for multi-request continuity |
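
A quick pre-flight check of a metadata dict against the table might look like the sketch below. validate_trace_meta is a hypothetical helper and not part of the library. It only encodes the required keys and the allowed phase values listed above, and whether the library itself validates metadata is not stated here.

```python
REQUIRED_KEYS = ("datapoint_id", "phase")
VALID_PHASES = ("train", "eval", "prod")

def validate_trace_meta(meta):
    """Return a list of problems; an empty list means the dict matches the table."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in meta:
            problems.append(f"missing required key: {key}")
    phase = meta.get("phase")
    if phase is not None and phase not in VALID_PHASES:
        problems.append(f"invalid phase: {phase!r}")
    return problems

ok = validate_trace_meta({"datapoint_id": "dp-1", "phase": "eval"})
bad = validate_trace_meta({"phase": "test"})
```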
Metadata and status
| Function | Description |
|---|---|
| add_meta(data) | Add trace-level metadata (stored in entity_metadata table) |
| add_tags(*tags) | Add filterable string tags |
| set_status(status, error_message) | Set rollout outcome |
| set_error(message) | Mark current span as errored |
Status values: PENDING, COMPLETED, TRUNCATED, ABORTED, ENV_SETUP_ERROR, STEP_ERROR, GRADER_TIMEOUT, GRADER_ERROR, TOOL_ERROR, ERROR
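
When status names arrive as strings (for example from a config file), it can help to reject typos before calling set_status. The sketch below mirrors the ten names above in a local enum. RolloutStatus and parse_status are hypothetical names, and whether the library exposes statuses as an enum or as plain strings is not stated here.

```python
from enum import Enum

class RolloutStatus(str, Enum):
    """Local mirror of the documented status names (not the library's own type)."""
    PENDING = "PENDING"
    COMPLETED = "COMPLETED"
    TRUNCATED = "TRUNCATED"
    ABORTED = "ABORTED"
    ENV_SETUP_ERROR = "ENV_SETUP_ERROR"
    STEP_ERROR = "STEP_ERROR"
    GRADER_TIMEOUT = "GRADER_TIMEOUT"
    GRADER_ERROR = "GRADER_ERROR"
    TOOL_ERROR = "TOOL_ERROR"
    ERROR = "ERROR"

def parse_status(name):
    """Look up a status by name, raising ValueError for anything not in the list."""
    try:
        return RolloutStatus[name]
    except KeyError:
        raise ValueError(f"unknown status: {name!r}") from None

status = parse_status("GRADER_TIMEOUT")
```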
Turn index management
Turn and intra-turn indices are managed automatically by log(Message(...)), but can be controlled manually:
| Function | Description |
|---|---|
| increment_turn() | Advance to next turn (resets intra to 0) |
| increment_intra() | Advance intra-turn index |
| set_turn_index(n) | Set turn index directly |
| set_intra_index(n) | Set intra index directly |
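
The index semantics in the table can be modeled with a small counter. This is a toy model of the described behavior, not the library's implementation. It assumes both indices start at 0 and that set_turn_index leaves the intra index unchanged; neither is stated above.

```python
class TurnIndex:
    """Toy model of the turn / intra-turn counters described in the table."""

    def __init__(self):
        # Assumption: both indices start at 0.
        self.turn = 0
        self.intra = 0

    def increment_turn(self):
        # Advancing the turn resets the intra-turn index, as documented.
        self.turn += 1
        self.intra = 0

    def increment_intra(self):
        self.intra += 1

    def set_turn_index(self, n):
        # Assumption: setting the turn directly does not touch the intra index.
        self.turn = n

    def set_intra_index(self, n):
        self.intra = n

idx = TurnIndex()
idx.increment_intra()
idx.increment_intra()
after_two_intra = (idx.turn, idx.intra)
idx.increment_turn()
after_next_turn = (idx.turn, idx.intra)
```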
Eval/train vs production
| | Eval / train | Production |
|---|---|---|
| Phase | "eval" or "train" | "prod" |
| Run ID | Unique per eval run (from Kestrel) | Must equal project name |
| Step number | Groups traces into training steps | Not used |
| Kestrel registration | Runs, run steps, and datapoints | Only datapoints |
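
The production-specific rows can be turned into a small guard. check_prod_rules is a hypothetical helper and project_name is an assumed input; the sketch only encodes the two constraints from the table (run ID equals the project name, no step number) and does not claim the library enforces them this way.

```python
def check_prod_rules(phase, run_id, project_name, step_number=None):
    """Return a list of violations of the production rules; empty if none apply."""
    if phase != "prod":
        return []  # the rules below only concern production traces
    problems = []
    if run_id != project_name:
        problems.append("run_id must equal the project name in prod")
    if step_number is not None:
        problems.append("step_number is not used in prod")
    return problems

prod_ok = check_prod_rules("prod", run_id="my-project", project_name="my-project")
prod_bad = check_prod_rules("prod", run_id="run-123", project_name="my-project", step_number=4)
```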
Environment variables
| Variable | Description |
|---|---|
| CLICKHOUSE_URL | ClickHouse HTTP URL |
| CLICKHOUSE_DATABASE | ClickHouse database name |
| CLICKHOUSE_USER | ClickHouse username |
| CLICKHOUSE_PASSWORD | ClickHouse password |
| KESTREL_API_URL | Kestrel server URL |
| KESTREL_API_KEY | Kestrel API key |
| LANGFUSE_PUBLIC_KEY | Optional: Langfuse public key |
| LANGFUSE_SECRET_KEY | Optional: Langfuse secret key |
| LANGFUSE_HOST | Optional: Langfuse host (default: https://us.cloud.langfuse.com) |
| CARIBOU_TRACE_DEBUG | Optional: enable console span logging |
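
A startup check for missing variables could look like the sketch below. It assumes that every variable not marked Optional in the table is required, which the table implies but does not state. missing_vars is a hypothetical helper that takes any mapping, so it can be given os.environ in real use.

```python
# Assumption: variables not marked "Optional" in the table are required.
REQUIRED_VARS = (
    "CLICKHOUSE_URL",
    "CLICKHOUSE_DATABASE",
    "CLICKHOUSE_USER",
    "CLICKHOUSE_PASSWORD",
    "KESTREL_API_URL",
    "KESTREL_API_KEY",
)

def missing_vars(env):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]

# Example with a plain dict; pass os.environ in a real process.
example_env = {"CLICKHOUSE_URL": "http://localhost:8123", "KESTREL_API_KEY": ""}
missing = missing_vars(example_env)
```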
Langfuse integration
Set the Langfuse credentials to enable OTLP export to Langfuse, which runs alongside the ClickHouse export. Exported spans carry GenAI attributes (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, etc.).
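
How the Langfuse variables might be resolved is sketched below. langfuse_settings is a hypothetical helper, and the rule that both keys must be set before export is enabled is an assumption. The default host is the one listed in the environment variable table.

```python
DEFAULT_LANGFUSE_HOST = "https://us.cloud.langfuse.com"

def langfuse_settings(env):
    """Return Langfuse settings from env, or None when credentials are incomplete."""
    public_key = env.get("LANGFUSE_PUBLIC_KEY")
    secret_key = env.get("LANGFUSE_SECRET_KEY")
    # Assumption: export is enabled only when both keys are present.
    if not (public_key and secret_key):
        return None
    return {
        "public_key": public_key,
        "secret_key": secret_key,
        "host": env.get("LANGFUSE_HOST") or DEFAULT_LANGFUSE_HOST,
    }

# Placeholder credentials for illustration; pass os.environ in a real process.
settings = langfuse_settings(
    {"LANGFUSE_PUBLIC_KEY": "pk-example", "LANGFUSE_SECRET_KEY": "sk-example"}
)
```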