Consolidate trace-to-invocation conversion #76
Pull request overview
This PR moves trace-to-invocation conversion and trace metadata extraction fully into the Python backend, introducing a new conversion API endpoint and updating the frontend to consume backend-produced invocations consistently across uploads, inspector, and eval set building.
Changes:
- Added `POST /api/convert` to convert uploaded trace files into invocations + extracted metadata in a unified response.
- Updated the UI upload/evalset/inspector flows to call the backend conversion endpoint and to stop re-converting traces client-side.
- Removed the TypeScript trace conversion module and introduced a smaller `trace-helpers` utility module for editor-related span traversal.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| ui/src/lib/types.ts | Adds TypeScript types for the new trace conversion API response payload. |
| ui/src/lib/trace-patcher.ts | Repoints editor utilities to trace-helpers after removing the converter module. |
| ui/src/lib/trace-metadata.ts | Removes client-side metadata extraction logic; keeps the TraceMetadata type only. |
| ui/src/lib/trace-helpers.ts | Introduces shared trace helper utilities (format detection, span traversal) used by the editor. |
| ui/src/lib/trace-converter.ts | Deletes the client-side trace-to-invocation converter implementation. |
| ui/src/lib/evalset-builder.ts | Switches eval set generation to accept backend-produced invocations rather than raw traces. |
| ui/src/context/TraceProvider.tsx | Updates trace file ingestion to call convertTraces() and store backend metadata + invocations in state. |
| ui/src/components/upload/TraceEditorDrawer.tsx | Loads invocations via backend conversion while still parsing traces locally for edit mapping. |
| ui/src/components/inspector/InspectorView.tsx | Uses invocations already in state rather than re-parsing/re-converting trace files on mount. |
| ui/src/components/inspector/InspectorHeader.tsx | Creates eval sets directly from stored invocations instead of reloading/parsing traces. |
| ui/src/components/dashboard/TraceCard.tsx | Same eval set creation change as inspector header (use stored invocations). |
| ui/src/components/builder/TraceUploadZone.tsx | Generates eval sets from backend conversion results; expands accepted file extensions to .jsonl. |
| ui/src/api/client.ts | Adds convertTraces() client for POST /api/convert (multipart upload). |
| src/agentevals/api/routes.py | Implements the new /convert endpoint (file validation, loading, conversion, metadata extraction). |
| src/agentevals/api/models.py | Adds response models for trace conversion metadata and entries. |
```typescript
export function detectTraceFormat(trace: Trace): 'adk' | 'genai' {
  const check = (spans: Span[]): 'adk' | 'genai' | null => {
    let hasGenai = false;
    for (const span of spans) {
      if (span.tags['otel.scope.name'] === ADK_SCOPE) {
        return 'adk';
      }
      if (!hasGenai && (span.tags['gen_ai.request.model'] || span.tags['gen_ai.input.messages'])) {
        hasGenai = true;
      }
    }
    return hasGenai ? 'genai' : null;
  };
```
`detectTraceFormat` marks a trace as GenAI only if it finds `gen_ai.request.model` or `gen_ai.input.messages`. Other parts of the UI (and the removed converter) treat `gen_ai.system` as a GenAI indicator (see `trace-patcher.ts`), so traces that only emit `gen_ai.system` will be misclassified as ADK and the editor mappings will be wrong. Please include `gen_ai.system` (and ideally the other input-message variants the patcher supports, e.g. `gen_ai.prompt` / `gen_ai.request.messages`) in the detection logic to avoid this regression.
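A minimal sketch of a broadened check, assuming the attribute list from the review comment; the `Span` shape and the `ADK_SCOPE` value here are stand-ins, not the real definitions in `trace-helpers.ts`:

```typescript
// Sketch only: attribute names beyond the first two are taken from the
// review comment (what trace-patcher.ts supports); verify against the patcher.
const GENAI_INDICATOR_ATTRS = [
  'gen_ai.request.model',
  'gen_ai.input.messages',
  'gen_ai.system',
  'gen_ai.prompt',
  'gen_ai.request.messages',
];

// Minimal stand-ins for the real types/constants in trace-helpers.ts.
interface Span { tags: Record<string, string>; children: Span[] }
const ADK_SCOPE = 'adk-scope-placeholder'; // hypothetical value, illustration only

function detectFormat(spans: Span[]): 'adk' | 'genai' | null {
  let hasGenai = false;
  for (const span of spans) {
    if (span.tags['otel.scope.name'] === ADK_SCOPE) {
      return 'adk';
    }
    // Any of the GenAI indicator attributes marks the trace as GenAI.
    if (!hasGenai && GENAI_INDICATOR_ATTRS.some((a) => span.tags[a] !== undefined)) {
      hasGenai = true;
    }
  }
  return hasGenai ? 'genai' : null;
}
```

Keeping the indicator list in one exported constant would also let the patcher import it, so the two sides cannot drift apart again.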
```typescript
export function findDescendantLLMSpans(root: Span): Span[] {
  const results: Span[] = [];
  const queue = [root];

  while (queue.length > 0) {
    const span = queue.shift()!;
    if (span.tags['gen_ai.request.model'] || span.tags['gen_ai.input.messages']) {
      results.push(span);
    }
    queue.push(...span.children);
  }
```
`findDescendantLLMSpans` only considers spans with `gen_ai.request.model` or `gen_ai.input.messages` as LLM spans. `trace-patcher` relies on this to locate the first/last LLM span for editing, but it also keys off `gen_ai.system` and supports `gen_ai.prompt` / `gen_ai.request.messages` for input; a GenAI trace that lacks `gen_ai.input.messages` will produce zero LLM spans and no edit mappings. Align this predicate with the set of attributes used elsewhere (at least `gen_ai.system`, and optionally `gen_ai.prompt` / `gen_ai.request.messages`).
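One way to align the predicate is to factor it into a single helper that both the detector and the traversal use; a sketch, with the attribute set taken from the review comment and a stand-in `Span` shape:

```typescript
interface Span { tags: Record<string, string>; children: Span[] }

// Single source of truth for "is this an LLM span?". The attributes beyond
// the first two are the ones the review says trace-patcher already supports.
const LLM_SPAN_ATTRS = [
  'gen_ai.request.model',
  'gen_ai.input.messages',
  'gen_ai.system',
  'gen_ai.prompt',
  'gen_ai.request.messages',
];

function isLLMSpan(span: Span): boolean {
  return LLM_SPAN_ATTRS.some((attr) => span.tags[attr] !== undefined);
}

function findDescendantLLMSpans(root: Span): Span[] {
  const results: Span[] = [];
  const queue: Span[] = [root];
  while (queue.length > 0) {
    const span = queue.shift()!; // non-null: loop guard ensures queue is non-empty
    if (isLLMSpan(span)) {
      results.push(span);
    }
    queue.push(...span.children); // breadth-first over the whole subtree
  }
  return results;
}
```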
```python
for trace_file in trace_files:
    if not trace_file.filename:
        continue

    if not (trace_file.filename.endswith(".json") or trace_file.filename.endswith(".jsonl")):
        raise HTTPException(
            status_code=400,
            detail=f"Invalid file extension for {trace_file.filename}. Only .json and .jsonl files are allowed.",
        )

    trace_path = os.path.join(temp_dir, trace_file.filename)
    with open(trace_path, "wb") as f:  # noqa: ASYNC230
        content = await trace_file.read()

        if len(content) > 10 * 1024 * 1024:
            raise HTTPException(
                status_code=400,
                detail=f"File {trace_file.filename} exceeds 10MB limit",
            )

        f.write(content)
    trace_paths.append(trace_path)
```
`trace_file.filename` is used directly in `os.path.join(temp_dir, trace_file.filename)`. A crafted filename containing path separators (e.g. `../`) can escape the temp dir, and duplicate filenames will overwrite each other. Please sanitize with `os.path.basename(...)` (or generate a UUID-based temp filename) before joining, and ensure names are unique within the batch.
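A minimal sketch of the suggested sanitization, combining both fixes (the helper name is hypothetical, not part of the PR):

```python
import os
import uuid


def safe_temp_path(temp_dir: str, filename: str) -> str:
    """Sanitize an uploaded filename before joining with the temp dir.

    os.path.basename strips any directory components, so crafted names
    like '../escape.json' cannot leave temp_dir; the UUID prefix makes
    each path unique, so duplicate filenames in one batch cannot
    overwrite each other.
    """
    base = os.path.basename(filename)
    return os.path.join(temp_dir, f"{uuid.uuid4().hex}_{base}")
```

The endpoint would then write to `safe_temp_path(temp_dir, trace_file.filename)` instead of joining the raw filename, while still reporting the original name back to the client.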
```python
fmt = trace_format
if not fmt:
    if trace_paths[0].endswith(".jsonl"):
        fmt = "otlp-json"
    else:
        fmt = "jaeger-json"

loader = get_loader(fmt)
all_traces = []
trace_to_filename: dict[str, str] = {}
for path in trace_paths:
    try:
        traces = loader.load(path)
        filename = os.path.basename(path)
        for t in traces:
            trace_to_filename[t.trace_id] = filename
        all_traces.extend(traces)
    except Exception as exc:
        logger.warning(f"Failed to load trace file '{path}': {exc}")
```
The endpoint infers a single `fmt` from only the first uploaded file's extension, then applies that loader to all files. If a user uploads a mix of `.json` and `.jsonl`, the non-matching files will fail `loader.load(...)` and be silently skipped (only a server log warning), which is confusing for clients. Consider either (a) validating that all files share the same extension/format, (b) selecting the loader per file, and/or (c) surfacing per-file load failures in the response warnings/errors so the UI can notify the user.
```python
if not (trace_file.filename.endswith(".json") or trace_file.filename.endswith(".jsonl")):
    raise HTTPException(
        status_code=400,
        detail=f"Invalid file extension for {trace_file.filename}. Only .json and .jsonl files are allowed.",
    )
```
The extension check is case-sensitive (`endswith('.json')` / `endswith('.jsonl')`), so valid files like `TRACE.JSON` will be rejected even though other code (e.g. the `_session_name_from_filename` regex) treats extensions case-insensitively. Normalizing with `filename.lower()` (or using a case-insensitive regex) would make the API more robust.
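A one-line fix, sketched as a helper (the function name is hypothetical); note that `str.endswith` accepts a tuple, which also collapses the two checks:

```python
def has_allowed_extension(filename: str) -> bool:
    """Case-insensitive extension check, so 'TRACE.JSON' passes like 'trace.json'."""
    return filename.lower().endswith((".json", ".jsonl"))
```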
This PR eliminates the duplicated trace-to-invocation conversion logic between the Python backend and TypeScript frontend, making the backend the single source of truth.
The new `POST /api/convert` endpoint accepts trace files, runs the backend converter + metadata extraction, and returns invocations + metadata in a unified format. The frontend calls this for all file uploads and eval set building, and the inspector reads invocations from state instead of re-converting. Everything now behaves consistently with streaming sessions, which were already using backend invocations.