MantaService is the entry point for querying trace data from ClickHouse. It manages the connection, applies filters, and returns typed TraceRow and SpanRow objects.

Creating a service

from manta.data import MantaService

service = MantaService()
By default, MantaService reads connection details from environment variables. You can also pass them explicitly:
service = MantaService(
    url="https://your-clickhouse-host:8443",
    user="default",
    password="...",
    database="default",
)

Loading traces

traces = service.load_traces(run_id="run_xxx")
Filter by phase, step, datapoint, status, or specific trace IDs:
eval_traces = service.load_traces(run_id="run_xxx", phase="eval")
step_traces = service.load_traces(run_id="run_xxx", step_numbers=[0, 1, 2])
dp_traces = service.load_traces(run_id="run_xxx", datapoint_ids=["dp_abc"])
specific = service.load_traces(run_id="run_xxx", trace_ids=["trc_abc", "trc_def"])
Control ordering and pagination (the example shows the pagination parameters, limit and offset):
recent = service.load_traces(
    run_id="run_xxx",
    limit=100,
    offset=0,
)
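To read more rows than a single call returns, limit and offset can be advanced in a loop. The following is a minimal sketch, not part of MantaService: iter_pages and fake_fetch are hypothetical helpers, and the sketch assumes that a page shorter than the requested limit marks the end of the results and that row order is stable between calls.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_pages(
    fetch_page: Callable[[int, int], List[T]],
    page_size: int = 100,
) -> Iterator[T]:
    """Yield rows page by page until a short (or empty) page is returned.

    fetch_page is any callable taking (limit, offset) and returning a list.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        yield from page
        if len(page) < page_size:
            break  # assumption: a short page means there are no more rows
        offset += page_size


# Stand-in data source so the sketch runs without a database.
_FAKE_ROWS = [f"trc_{i}" for i in range(250)]


def fake_fetch(limit: int, offset: int) -> List[str]:
    return _FAKE_ROWS[offset:offset + limit]


all_rows = list(iter_pages(fake_fetch, page_size=100))
print(len(all_rows))  # 250
```

With a real service, fetch_page could be a small wrapper such as lambda limit, offset: service.load_traces(run_id="run_xxx", limit=limit, offset=offset).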

Lightweight ID queries

If you only need trace IDs (e.g. for orchestration), use get_trace_ids():
ids = service.get_trace_ids(run_id="run_xxx", phase="eval", limit=1000)

Entity counts

Count entities at each level without loading full rows:
from manta.schema import EntityLevel

counts = service.count_entities_by_level(
    run_id="run_xxx",
    levels=[EntityLevel.TRACE, EntityLevel.DATAPOINT, EntityLevel.STEP],
)
# {EntityLevel.TRACE: 5000, EntityLevel.DATAPOINT: 200, EntityLevel.STEP: 10}
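One possible use of these counts is to plan paginated loads before fetching any rows. The sketch below is an illustration only: Level is a hypothetical stand-in for EntityLevel so the code runs on its own, and plan_offsets is a helper invented for this example.

```python
from enum import Enum
from typing import Dict, List


class Level(Enum):
    """Hypothetical stand-in for manta.schema.EntityLevel."""
    TRACE = "trace"
    DATAPOINT = "datapoint"
    STEP = "step"


def plan_offsets(total_rows: int, page_size: int) -> List[int]:
    """Return the offset of every page needed to cover total_rows."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return list(range(0, max(total_rows, 0), page_size))


# Same shape as the result shown above, keyed by the stand-in enum.
counts: Dict[Level, int] = {Level.TRACE: 5000, Level.DATAPOINT: 200, Level.STEP: 10}

offsets = plan_offsets(counts[Level.TRACE], page_size=1000)
print(offsets)  # [0, 1000, 2000, 3000, 4000]
```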

Datapoint and step queries

Query distinct datapoints or steps for a run:
from manta.data.query import DatapointQuery, StepQuery

dp_ids = service.get_datapoint_ids(DatapointQuery(run_id="run_xxx", phase="eval"))
steps = service.get_step_identifiers(StepQuery(run_id="run_xxx"))
# steps = [("eval", 0), ("eval", 1), ("train", 0), ...]
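The (phase, step) tuples are often easier to work with once grouped by phase. This is a small sketch in plain Python that assumes only the tuple shape shown in the comment above; group_steps_by_phase is a hypothetical helper, not a MantaService method.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_steps_by_phase(steps: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group (phase, step_number) tuples into {phase: sorted step numbers}."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for phase, step_number in steps:
        grouped[phase].append(step_number)
    return {phase: sorted(numbers) for phase, numbers in grouped.items()}


# Sample data in the shape returned by get_step_identifiers().
steps = [("eval", 0), ("eval", 1), ("train", 0)]
steps_by_phase = group_steps_by_phase(steps)
print(steps_by_phase)  # {'eval': [0, 1], 'train': [0]}
```

Each entry could then feed a load_traces() call using the phase and step_numbers arguments, assuming the two filters can be combined in one call.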

Loading spans

Load spans grouped by trace ID:
spans_by_trace = service.load_spans_for_traces(
    trace_ids=["trc_abc", "trc_def"],
)
# {"trc_abc": [SpanRow, ...], "trc_def": [SpanRow, ...]}
Filter by span kind to load only what you need:
spans_by_trace = service.load_spans_for_traces(
    trace_ids=trace_ids,
    span_kinds=["message", "tool"],
)
Load all spans for an entire run:
spans_by_trace = service.load_spans_for_run(
    run_id="run_xxx",
    span_kinds=["message", "tool", "grader"],
)
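When the list of trace IDs is long, it may help to request spans in batches and merge the resulting dictionaries. The sketch below shows one way to do that; chunked, load_in_batches, and fake_load are hypothetical helpers, and the batch size of 500 is an arbitrary choice rather than a documented limit.

```python
from typing import Callable, Dict, Iterator, List, Sequence


def chunked(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Split items into consecutive lists of at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def load_in_batches(
    trace_ids: Sequence[str],
    load_batch: Callable[[List[str]], Dict[str, list]],
    batch_size: int = 500,
) -> Dict[str, list]:
    """Call load_batch once per chunk and merge the per-trace results."""
    merged: Dict[str, list] = {}
    for batch in chunked(trace_ids, batch_size):
        # Trace IDs are assumed unique, so batches never overwrite each other.
        merged.update(load_batch(batch))
    return merged


# Stand-in loader so the sketch runs without a database.
def fake_load(batch: List[str]) -> Dict[str, list]:
    return {trace_id: [f"span-of-{trace_id}"] for trace_id in batch}


spans = load_in_batches([f"trc_{i}" for i in range(1200)], fake_load, batch_size=500)
print(len(spans))  # 1200
```

With a real service, load_batch could be lambda batch: service.load_spans_for_traces(trace_ids=batch).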

Span filters

The underlying SpanFilter supports additional filtering:
Filter        What it does
span_kinds    Filter by kind: "message", "tool", "grader", "llm", "system"
roles         Filter messages by role: "user", "assistant", "system", "tool"
tool_names    Filter tool spans by name
requestor     Filter tools by requestor: "agent", "user", "system"
grader_names  Filter grader spans by grader name
errors_only   Only return spans with errors

Configuration

Environment variables

Variable                 Default                Description
CLICKHOUSE_URL           http://localhost:8123  ClickHouse HTTP endpoint
CLICKHOUSE_USER          default                ClickHouse user
CLICKHOUSE_PASSWORD      (none)                 ClickHouse password
CLICKHOUSE_DATABASE      default                ClickHouse database
CLICKHOUSE_TRACES_TABLE  traces                 Traces table name
CLICKHOUSE_SPANS_TABLE   spans                  Spans table name
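The fallback behavior described by these variables can be sketched in plain Python. This is not how MantaService is implemented internally; resolve_settings is a hypothetical helper, and the empty-string default for CLICKHOUSE_PASSWORD is an assumption, since no default is listed for it.

```python
import os
from typing import Dict, Mapping, Optional

# Defaults copied from the environment variable list above.
_DEFAULTS: Dict[str, str] = {
    "CLICKHOUSE_URL": "http://localhost:8123",
    "CLICKHOUSE_USER": "default",
    "CLICKHOUSE_PASSWORD": "",  # assumption: no default is documented
    "CLICKHOUSE_DATABASE": "default",
    "CLICKHOUSE_TRACES_TABLE": "traces",
    "CLICKHOUSE_SPANS_TABLE": "spans",
}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return each setting from env, falling back to the listed default."""
    source = os.environ if env is None else env
    return {name: source.get(name, default) for name, default in _DEFAULTS.items()}


# Override only the URL; every other setting falls back to its default.
settings = resolve_settings({"CLICKHOUSE_URL": "https://your-clickhouse-host:8443"})
print(settings["CLICKHOUSE_URL"])           # https://your-clickhouse-host:8443
print(settings["CLICKHOUSE_TRACES_TABLE"])  # traces
```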