TraceContext wraps a trace and its spans, providing typed methods for accessing messages, tool calls, grader results, and more. It’s the main way to work with trace data in Manta.

Creating a TraceContext

from manta import TraceContext
from manta.data import MantaService

service = MantaService()
traces = service.load_traces(run_id="run_xxx")
spans_by_trace = service.load_spans_for_traces([t.trace_id for t in traces])

for trace in traces:
    ctx = TraceContext(trace, span_loader=lambda tid: spans_by_trace.get(tid, []))
You can also pass a MantaService directly — useful if your analysis needs to make additional queries:
ctx = TraceContext(trace, span_loader=..., service=service)
ctx.service.load_traces(run_id=ctx.run_id)  # additional queries

Trace properties

ctx.trace_id          # str — unique trace ID
ctx.run_id            # str — parent run ID
ctx.step_number       # int — training/eval step
ctx.phase             # str — "train", "eval", or "prod"
ctx.datapoint_id      # str — parent datapoint ID

ctx.trace.score       # float | None — grader score
ctx.trace.status      # str — "completed", "error", etc.
ctx.trace.cost_usd    # float — total cost
ctx.trace.duration_ms # int — trace duration
ctx.trace.turn_count  # int — number of turns
ctx.trace.tokens_input   # int
ctx.trace.tokens_output  # int
ctx.trace.model_name     # str
ctx.trace.env_name       # str
ctx.trace.meta           # dict — trace metadata
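These properties make per-run aggregation straightforward. As a minimal sketch, here is a helper that totals cost and tokens across traces; plain dicts stand in for Trace objects so the example runs without Manta, but the field names match those documented above:

```python
def summarize_traces(traces):
    """Aggregate cost and token totals over trace-like records.

    Each record exposes cost_usd, tokens_input, and tokens_output,
    mirroring the trace properties documented above.
    """
    total_cost = sum(t["cost_usd"] for t in traces)
    total_tokens = sum(t["tokens_input"] + t["tokens_output"] for t in traces)
    return {"cost_usd": round(total_cost, 4), "tokens": total_tokens}

traces = [
    {"cost_usd": 0.012, "tokens_input": 900, "tokens_output": 150},
    {"cost_usd": 0.030, "tokens_input": 2100, "tokens_output": 400},
]
print(summarize_traces(traces))  # {'cost_usd': 0.042, 'tokens': 3550}
```

With real data you would feed `ctx.trace` values in place of the dicts.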

Messages

ctx.messages() returns MessageData objects. Filter by role:
all_messages = ctx.messages()
assistant_msgs = ctx.messages(role="assistant")
user_msgs = ctx.messages(role="user")
Each MessageData has:
| Property | Type | Description |
| --- | --- | --- |
| `role` | `str` | `"user"`, `"assistant"`, `"system"`, or `"tool"` |
| `content` | `str` | Text content |
| `tool_calls` | `list[dict]` | Function calls in this message |
| `is_empty` | `bool` | No content and no tool calls |
| `has_tool_calls` | `bool` | Contains tool calls |
for msg in ctx.messages(role="assistant"):
    if msg.is_empty:
        print("Empty assistant message")
    if msg.has_tool_calls:
        print(f"Called: {msg.tool_call_names()}")
Convenience shortcuts:
ctx.assistant_messages()  # same as ctx.messages(role="assistant")
ctx.user_messages()       # same as ctx.messages(role="user")
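A common analysis on top of these filters is measuring how often the model produces empty turns. A standalone sketch, using message dicts as stand-ins for MessageData (the `role`/`content`/`tool_calls` fields mirror the table above):

```python
def empty_assistant_rate(messages):
    """Fraction of assistant messages with no content and no tool calls,
    matching the is_empty semantics documented above."""
    assistants = [m for m in messages if m["role"] == "assistant"]
    if not assistants:
        return 0.0
    empty = sum(1 for m in assistants if not m["content"] and not m["tool_calls"])
    return empty / len(assistants)

msgs = [
    {"role": "user", "content": "hi", "tool_calls": []},
    {"role": "assistant", "content": "", "tool_calls": []},
    {"role": "assistant", "content": "done", "tool_calls": []},
]
print(empty_assistant_rate(msgs))  # 0.5
```

Against a real trace, you would iterate `ctx.assistant_messages()` and use `msg.is_empty` directly.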

Tool calls

ctx.tools() returns ToolData objects. Filter by name, requestor, or error status:
all_tools = ctx.tools()
bash_calls = ctx.tools(name="bash")
failed = ctx.tools(errors_only=True)
agent_tools = ctx.tools(requestor="agent")
Each ToolData has:
| Property | Type | Description |
| --- | --- | --- |
| `name` | `str` | Tool name |
| `call_id` | `str` | Unique call ID |
| `arguments` | `str` | Raw JSON arguments |
| `arguments_dict` | `dict` | Parsed arguments |
| `result` | `str` | Tool output |
| `error` | `bool` | Whether the call errored |
| `requestor` | `str` | `"agent"`, `"user"`, or `"system"` |
Convenience shortcuts:
ctx.tool_names()    # list[str] — names of all tools used
ctx.failed_tools()  # list[ToolData] — tools that errored
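One thing these shortcuts enable is a quick failure breakdown per tool. A hedged sketch, with plain dicts standing in for ToolData (only the documented `name` and `error` fields are assumed):

```python
from collections import Counter

def tool_failure_counts(tools):
    """Count errored calls per tool name, using the ToolData-style
    name/error fields documented above."""
    return dict(Counter(t["name"] for t in tools if t["error"]))

calls = [
    {"name": "bash", "error": True},
    {"name": "bash", "error": False},
    {"name": "edit", "error": True},
]
print(tool_failure_counts(calls))  # {'bash': 1, 'edit': 1}
```

With Manta objects, `tool_failure_counts(ctx.tools())` would work the same way with attribute access instead of dict lookups.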

Grader results

ctx.grader_result() returns the aggregate grading result with individual criteria populated:
result = ctx.grader_result()
if result:
    result.score       # float | None — aggregate score
    result.passed      # bool | None
    result.grader_name # str
    result.reasoning   # str
    result.criteria    # list[GraderCriterionData]
Each criterion:
| Property | Type | Description |
| --- | --- | --- |
| `criterion_name` | `str` | e.g. `"correctness"`, `"db_state"` |
| `passed` | `bool` | Whether this criterion passed |
| `score` | `float \| None` | Criterion score |
| `reasoning` | `str` | Grader reasoning |
result = ctx.grader_result()
if result:
    for c in result.criteria:
        print(f"{c.criterion_name}: passed={c.passed}")
Query criteria directly:
correctness = ctx.grader_criteria(criterion="correctness")
all_criteria = ctx.grader_criteria(grader="my-grader")
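Criterion-level results are useful for spotting which criteria fail most often across a run. A minimal sketch, assuming criterion dicts with the documented `criterion_name` and `passed` fields in place of GraderCriterionData:

```python
from collections import defaultdict

def criterion_pass_rates(criteria):
    """Pass rate per criterion name across many traces."""
    totals = defaultdict(lambda: [0, 0])  # name -> [passed, seen]
    for c in criteria:
        totals[c["criterion_name"]][1] += 1
        if c["passed"]:
            totals[c["criterion_name"]][0] += 1
    return {name: p / n for name, (p, n) in totals.items()}

criteria = [
    {"criterion_name": "correctness", "passed": True},
    {"criterion_name": "correctness", "passed": False},
    {"criterion_name": "db_state", "passed": True},
]
print(criterion_pass_rates(criteria))  # {'correctness': 0.5, 'db_state': 1.0}
```

In practice you would collect `ctx.grader_criteria()` from each trace into one list before aggregating.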

LLM calls

ctx.llm_calls() returns raw LLM invocations:
for llm in ctx.llm_calls():
    llm.system          # str — provider ("openai", "anthropic")
    llm.request_model   # str — model name
    llm.input_tokens    # int
    llm.output_tokens   # int
    llm.cost_usd        # float
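These per-call fields make cost attribution easy. A standalone sketch that groups cost by model, with dicts standing in for the LLM-call objects (only the documented `request_model` and `cost_usd` fields are assumed):

```python
from collections import defaultdict

def cost_by_model(llm_calls):
    """Sum cost_usd per request_model, mirroring the fields above."""
    costs = defaultdict(float)
    for call in llm_calls:
        costs[call["request_model"]] += call["cost_usd"]
    return dict(costs)

calls = [
    {"request_model": "gpt-4o", "cost_usd": 0.02},
    {"request_model": "gpt-4o", "cost_usd": 0.01},
    {"request_model": "claude-3-5-sonnet", "cost_usd": 0.05},
]
print(cost_by_model(calls))
```

Summing over `ctx.llm_calls()` for every trace in a run gives a run-level cost breakdown.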

Conversation

ctx.conversation() returns the full conversation as a flat list of dicts — convenient for sending to an LLM:
conversation = ctx.conversation()
# [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}, ...]

content = "\n".join(f"{m['role']}: {m['content']}" for m in conversation)
Tool spans appear as {"role": "tool", "content": "input: {...}\noutput: ..."}.

Utility methods

ctx.has_errors()       # bool — trace failed or any span errored
ctx.has_tool_errors()  # bool — any tool call errored
ctx.total_tokens()     # int — tokens_input + tokens_output
ctx.span_tokens()      # tuple[int, int] — (input, output) from LLM spans
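These checks compose naturally into triage logic. As a hedged sketch, a hypothetical `needs_review` predicate over values you would read off a TraceContext (a plain dict stands in here; the threshold values are illustrative, not part of Manta):

```python
def needs_review(summary):
    """Flag a trace for manual review.

    `summary` holds values read from a TraceContext: the result of
    has_errors(), the trace score, and total_tokens().
    """
    if summary["has_errors"]:
        return True
    score = summary["score"]
    if score is not None and score < 0.5:  # illustrative threshold
        return True
    return summary["total_tokens"] > 100_000  # illustrative budget

print(needs_review({"has_errors": False, "score": 0.3, "total_tokens": 500}))  # True
print(needs_review({"has_errors": False, "score": 0.9, "total_tokens": 500}))  # False
```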