Consolidate text extraction into a single source of truth #79

krisztianfekete merged 2 commits into main from
Conversation
Pull request overview
This PR consolidates user/assistant text extraction logic by routing converters through shared “extract from span attributes” helpers, aiming to reduce duplicated parsing across ADK and GenAI trace formats.
Changes:
- Updated GenAI converter to use `extract_user_text_from_attrs()`/`extract_agent_response_from_attrs()` instead of inline message parsing.
- Updated ADK converter to prefer the shared extraction helpers before falling back to direct `llm_request`/`llm_response` parsing.
- Adjusted shared extraction to prefer the last assistant message in `gen_ai.output.messages`.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/agentevals/genai_converter.py` | Replaces inline user/assistant message parsing with shared extraction helpers. |
| `src/agentevals/extraction.py` | Tweaks agent-response extraction to scan output messages in reverse (prefer latest). |
| `src/agentevals/converter.py` | Uses shared extraction helpers for ADK conversion prior to legacy parsing fallbacks. |
```python
def _extract_user_content(first_call_llm: Span) -> genai_types.Content:
    """Extract user input from the first call_llm span's llm_request tag."""
    text = extract_user_text_from_attrs(first_call_llm.tags)
    if text:
        return genai_types.Content(
            role="user",
            parts=[genai_types.Part(text=text)],
        )
```
The docstring says user input is extracted from the first call_llm span’s llm_request tag, but the function now first uses extract_user_text_from_attrs() (which may read either ADK llm_request or gen_ai.input.messages). Please update the docstring to reflect the new “span attributes / shared extractor” behavior to avoid misleading future readers.
```diff
 messages_raw = attrs.get(OTEL_GENAI_OUTPUT_MESSAGES)
 if messages_raw:
     messages = parse_json_attr(messages_raw, "gen_ai.output.messages")
     if isinstance(messages, list):
-        for msg in messages:
+        for msg in reversed(messages):
             if isinstance(msg, dict) and msg.get("role") in ASSISTANT_ROLES:
                 text = extract_text_from_message(msg)
                 if text:
```
extract_agent_response_from_attrs() now iterates over gen_ai.output.messages in reverse order (preferring the last assistant/model message). There isn’t a unit test covering the multi-message case in tests/test_extraction.py, so this behavior change could regress unnoticed. Please add a test asserting the last assistant message with text is selected when multiple assistant messages are present.
```python
text = extract_user_text_from_attrs(llm_span.tags)
if text:
    return text
raise ValueError(f"LLM span {llm_span.span_id}: no user message found in span attributes")
```
The raised ValueError here is now fairly generic (“no user message found in span attributes”), but extraction checks multiple possible sources (e.g., gen_ai.input.messages and ADK llm_request). Consider including the specific attribute keys checked (or at least gen_ai.input.messages) in the error message to make trace/debugging failures more actionable.
Suggested change:

```diff
-raise ValueError(f"LLM span {llm_span.span_id}: no user message found in span attributes")
+raise ValueError(
+    f"LLM span {llm_span.span_id}: no user message found in span attributes; "
+    f"expected user message in attribute {OTEL_GENAI_INPUT_MESSAGES!r}"
+)
```
As title says.