Message, ToolExecution, and Event primitives as eval/train, with a few key differences in how you configure and register traces.
Key differences from eval/train
| | Eval / train | Production |
|---|---|---|
| Phase | "eval" or "train" | "prod" |
| Run ID | Unique per eval run (from Kestrel) | Set to the project name |
| Step number | Groups traces into training steps | Not used |
| Grading | Score each rollout with Grade | Typically not used |
| Kestrel registration | Runs, run steps, and datapoints | Project only (optionally datapoints) |
Setup
Configure Caribou once at application startup. Set run_id to your project name and phase to "prod".
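As a rough illustration of these two settings, the sketch below uses a hypothetical `TraceConfig` dataclass and `production_config` helper as stand-ins. Caribou's actual configuration call is not reproduced here and may look different; only the names `run_id` and `phase` and the phase values come from this page.

```python
from dataclasses import dataclass

# Phase values named in the comparison table above.
VALID_PHASES = ("eval", "train", "prod")


@dataclass(frozen=True)
class TraceConfig:
    """Hypothetical stand-in for the tracing configuration (not Caribou's real API)."""

    run_id: str
    phase: str

    def __post_init__(self) -> None:
        # Reject anything other than the three phases this page mentions.
        if self.phase not in VALID_PHASES:
            raise ValueError(f"unknown phase: {self.phase!r}")


def production_config(project_name: str) -> TraceConfig:
    """Build production settings: run_id is the project name, phase is "prod"."""
    return TraceConfig(run_id=project_name, phase="prod")


# Built once at application startup and reused for every request.
CONFIG = production_config("my-project")
```

Because the run ID is the project name rather than a per-run identifier, every production trace in this sketch shares the same `run_id`.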
Per-request tracing
Wrap each incoming request in trace_run. Use a request ID or session ID as the datapoint_id.
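The per-request pattern can be sketched as follows. The `trace_run` defined here is a hypothetical in-memory stand-in written for this example, so its signature and the shape of the trace object are assumptions, not Caribou's real API; `handle_request` and `RECORDED_TRACES` are likewise illustrative names.

```python
import uuid
from contextlib import contextmanager

# In-memory list standing in for the tracing backend (illustration only).
RECORDED_TRACES = []


@contextmanager
def trace_run(datapoint_id):
    """Hypothetical stand-in for trace_run; the real signature may differ."""
    trace = {"datapoint_id": datapoint_id, "events": []}
    try:
        yield trace
    finally:
        # Record the trace whether or not the request succeeded.
        RECORDED_TRACES.append(trace)


def handle_request(text, request_id=None):
    """Handle one incoming request inside its own trace."""
    # Fall back to a fresh UUID when the caller supplies no request ID.
    request_id = request_id or str(uuid.uuid4())
    with trace_run(datapoint_id=request_id) as trace:
        reply = text.upper()  # placeholder for the real application logic
        trace["events"].append({"input": text, "output": reply})
        return reply
```

Each call produces exactly one trace keyed by its own `datapoint_id`, so individual requests can be looked up later by the ID the application already logs.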
Multi-turn sessions
For multi-turn conversations that span multiple requests, pass a consistent trace_id in the metadata to maintain continuity.
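One way to keep the trace_id stable across turns is to look it up by session, as in the sketch below. The `metadata` keyword on the stand-in `trace_run`, the `SESSION_TRACE_IDS` store, and the helper names are all assumptions made for illustration; the only detail taken from this page is that `trace_id` travels in the metadata.

```python
import uuid
from contextlib import contextmanager

RECORDED_TRACES = []    # stand-in for the tracing backend
SESSION_TRACE_IDS = {}  # session_id -> trace_id; hypothetical in-memory store


@contextmanager
def trace_run(datapoint_id, metadata=None):
    """Hypothetical stand-in for trace_run; the real parameters may differ."""
    RECORDED_TRACES.append(
        {"datapoint_id": datapoint_id, "metadata": dict(metadata or {})}
    )
    yield


def trace_id_for(session_id):
    """Return the session's trace_id, creating it on the first turn."""
    return SESSION_TRACE_IDS.setdefault(session_id, uuid.uuid4().hex)


def handle_turn(session_id, request_id, text):
    """Trace one turn; every turn of a session carries the same trace_id."""
    metadata = {"trace_id": trace_id_for(session_id)}
    with trace_run(datapoint_id=request_id, metadata=metadata):
        return f"echo: {text}"
```

In a multi-process deployment the mapping would need to live in shared storage, or the trace_id could be derived directly from the session ID.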
Metadata and tags
Attach metadata and tags the same way as eval/train. These are searchable in Kestrel.
Error handling
Exceptions inside trace_run are automatically captured: the trace status is set to ERROR and the traceback is recorded. You can also set errors explicitly.
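Both paths can be sketched with a small stand-in. The `Trace` class, its `set_error` method, and the choice to re-raise after capture are assumptions for illustration; Caribou's real trace object and its explicit-error call may be named and behave differently.

```python
import traceback
from contextlib import contextmanager


class Trace:
    """Hypothetical trace object; field and method names are illustrative."""

    def __init__(self):
        self.status = "OK"
        self.error = None

    def set_error(self, message):
        """Mark the trace as failed without raising."""
        self.status = "ERROR"
        self.error = message


@contextmanager
def trace_run(datapoint_id):
    """Stand-in for trace_run that mimics the automatic capture described above."""
    trace = Trace()
    try:
        yield trace
    except Exception:
        # Automatic capture: record the traceback, then re-raise (assumed behaviour).
        trace.set_error(traceback.format_exc())
        raise


# Automatic: an exception inside the block marks the trace as ERROR.
try:
    with trace_run("req-1") as failed:
        raise ValueError("bad input")
except ValueError:
    pass

# Explicit: a failure the application handled itself is still recorded.
with trace_run("req-2") as degraded:
    degraded.set_error("upstream returned 503")
```

Setting the error explicitly is useful when the request returns a fallback response instead of raising, since no exception ever reaches the context manager in that case.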
Shutdown
Call shutdown() on application exit to flush remaining spans.
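A minimal sketch of wiring the flush to process exit, using Python's standard `atexit` module. The `shutdown` function below is a hypothetical stand-in that moves spans between two in-memory lists; in a real application the call would go to Caribou's own `shutdown()`, whose import path is not shown here.

```python
import atexit

PENDING_SPANS = ["span-1", "span-2"]  # stand-in for spans buffered in memory
EXPORTED_SPANS = []                   # stand-in for the tracing backend


def shutdown():
    """Hypothetical stand-in for shutdown(): flush buffered spans; safe to call twice."""
    EXPORTED_SPANS.extend(PENDING_SPANS)
    PENDING_SPANS.clear()


# atexit runs registered callbacks on normal interpreter exit.
atexit.register(shutdown)
```

`atexit` callbacks do not run if the process is killed by an unhandled signal or exits through `os._exit()`, so long-running servers may prefer to call `shutdown()` from their framework's own shutdown hook.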