Production logging uses the same Message, ToolExecution, and Event primitives as eval/train, with a few key differences in how you configure and register traces.

Key differences from eval/train

|                      | Eval / train                        | Production                         |
|----------------------|-------------------------------------|------------------------------------|
| Phase                | "eval" or "train"                   | "prod"                             |
| Run ID               | Unique per eval run (from Kestrel)  | Set to the project name            |
| Step number          | Groups traces into training steps   | Not used                           |
| Grading              | Score each rollout with Grade       | Typically not used                 |
| Kestrel registration | Runs, run steps, and datapoints     | Project only (optionally datapoints) |

Setup

Configure Caribou once at application startup. Set run_id to your project name and phase to "prod":
import caribou

PROJECT_NAME = "my-project"
caribou.configure(run_id=PROJECT_NAME, phase="prod")

kestrel = caribou.get_kestrel_client()
if kestrel.enabled:
    kestrel.get_or_create_project(name=PROJECT_NAME)
You don’t create runs or run steps in production — all traces are grouped under the project name.

Per-request tracing

Wrap each incoming request in trace_run. Use a request ID or session ID as the datapoint_id:
from caribou import Message, ToolExecution

async def handle_request(request_id: str, user_input: str):
    with caribou.trace_run({
        "datapoint_id": request_id,
        "phase": "prod",
        "model_name": "gpt-4o",
    }) as trace_id:
        caribou.log(Message(role="user", content=user_input))

        completion = await get_completion(user_input)
        caribou.log(Message(
            role="assistant",
            content=completion["content"],
            tool_calls=completion.get("tool_calls"),
            model="gpt-4o",
            provider="openai",
            input_tokens=completion["usage"]["input"],
            output_tokens=completion["usage"]["output"],
            cost_usd=completion["usage"]["cost"],
        ))

        if completion.get("tool_calls"):
            for tc in completion["tool_calls"]:
                result = await execute_tool(tc)
                caribou.log(ToolExecution(
                    name=tc["function"]["name"],
                    call_id=tc["id"],
                    arguments=tc["function"]["arguments"],
                    result=result,
                    requestor="agent",
                ))

        caribou.set_status(caribou.RolloutStatus.COMPLETED)

    caribou.flush()

Multi-turn sessions

For multi-turn conversations that span multiple requests, pass a consistent trace_id in the metadata to maintain continuity:
with caribou.trace_run({
    "datapoint_id": session_id,
    "phase": "prod",
    "trace_id": f"trc_{session_id}",
    "model_name": "gpt-4o",
}) as trace_id:
    # ...
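Since every entry point must derive the trace ID the same way, the `trc_` prefix convention above can be factored into a small helper. A minimal sketch (`make_trace_id` is a hypothetical name, not part of the caribou API):

```python
def make_trace_id(session_id: str) -> str:
    # Deterministic: the same session always maps to the same trace ID,
    # so traces stay continuous across requests.
    return f"trc_{session_id}"
```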

Metadata and tags

Attach metadata and tags the same way as eval/train. These are searchable in Kestrel.
with caribou.trace_run({...}) as trace_id:
    # ... agent execution ...
    caribou.add_meta({"customer_id": "cust_123", "latency_ms": 450})
    caribou.add_tags("api_v2", "premium_tier")
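A value like `latency_ms` above can be measured with a monotonic timer around the agent call. A minimal sketch (`time.perf_counter` is stdlib; the surrounding names are illustrative):

```python
import time

start = time.perf_counter()
# ... agent execution ...
latency_ms = int((time.perf_counter() - start) * 1000)
```

The result can then be passed to `caribou.add_meta({"latency_ms": latency_ms})` as shown above.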

Error handling

Exceptions inside trace_run are automatically captured — the trace status is set to ERROR and the traceback is recorded. You can also set errors explicitly:
with caribou.trace_run({...}) as trace_id:
    try:
        result = await risky_operation()
    except TimeoutError:
        caribou.set_status(caribou.RolloutStatus.TRUNCATED)
        caribou.set_error("Operation timed out after 30s")

Shutdown

Call shutdown() on application exit to flush remaining spans:
caribou.shutdown()
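One way to guarantee the call runs on normal interpreter exit is the stdlib `atexit` hook. A minimal sketch (assumes a clean process exit; signal handling and web-framework lifespan hooks are out of scope):

```python
import atexit
import caribou

# Flush remaining spans when the interpreter exits normally.
atexit.register(caribou.shutdown)
```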

Complete example

import caribou
from caribou import Message, ToolExecution

PROJECT_NAME = "my-agent"
caribou.configure(run_id=PROJECT_NAME, phase="prod")

kestrel = caribou.get_kestrel_client()
if kestrel.enabled:
    kestrel.get_or_create_project(name=PROJECT_NAME)


async def handle_request(request_id: str, messages: list[dict]):
    with caribou.trace_run({
        "datapoint_id": request_id,
        "phase": "prod",
        "model_name": "gpt-4o",
    }) as trace_id:
        for msg in messages:
            caribou.log(Message(role=msg["role"], content=msg["content"]))

        completion = await get_completion(messages)
        caribou.log(Message(
            role="assistant",
            content=completion["content"],
            model="gpt-4o",
            input_tokens=completion["usage"]["input"],
            output_tokens=completion["usage"]["output"],
            cost_usd=completion["usage"]["cost"],
        ))

        caribou.add_meta({"request_id": request_id})
        caribou.set_status(caribou.RolloutStatus.COMPLETED)

    caribou.flush()