fix: detect agent event type from GenAI semantic conventions #292

Open
devin-ai-integration[bot] wants to merge 4 commits into federated-sdk-release-candidate from devin/1772445758-fix-agent-type-detection

@devin-ai-integration devin-ai-integration bot commented Mar 2, 2026

fix: detect agent event type from GenAI semantic conventions

Summary

Adds agent-specific event type detection (Priority 4.5) in _detect_event_type before the general pattern matcher runs.

Problem: pydantic-ai agent run spans carry both gen_ai.agent.name and gen_ai.request.model. The existing Priority 5 pattern matcher sees gen_ai.request.model first and classifies agent spans as "model" — which is incorrect.

Fix: Before pattern matching, check for gen_ai.agent.name or gen_ai.operation.name=agent. If either is present, return "agent" immediately. Both checks include isinstance(..., str) guards to safely handle non-string attribute values from OTEL.
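
A minimal sketch of the check described above, not the SDK's actual code. `detect_agent_type` is a hypothetical standalone helper, `attributes` is assumed to be a plain mapping of span attributes, and the attribute values are made up. It returns `"agent"` when either condition holds and `None` to signal that the caller should fall through to the general pattern matcher.

```python
from typing import Any, Mapping, Optional


def detect_agent_type(attributes: Mapping[str, Any]) -> Optional[str]:
    """Hypothetical stand-in for the Priority 4.5 check; not the SDK's code."""
    agent_name = attributes.get("gen_ai.agent.name")
    operation_name = attributes.get("gen_ai.operation.name")

    # Only a non-empty string counts as an agent name.
    has_agent_name = isinstance(agent_name, str) and bool(agent_name)
    # The operation name is compared case-insensitively.
    is_agent_operation = (
        isinstance(operation_name, str) and operation_name.lower() == "agent"
    )

    if has_agent_name or is_agent_operation:
        return "agent"
    return None  # caller falls through to the general pattern matcher


# A span carrying both attributes, as described for pydantic-ai agent runs.
# The attribute values here are invented for illustration.
agent_run = {
    "gen_ai.agent.name": "support_agent",
    "gen_ai.request.model": "example-model",
}
model_call = {"gen_ai.request.model": "example-model"}

print(detect_agent_type(agent_run))   # agent
print(detect_agent_type(model_call))  # None
```

Because the check runs before pattern matching, the presence of `gen_ai.request.model` on the agent span no longer decides the outcome.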

Updates since last revision

  • Added isinstance(agent_name, str) type guard to match the existing isinstance(operation_name, str) guard — prevents truthy non-string values (e.g. int, dict) from incorrectly triggering agent detection.
  • Added 7 unit tests in TestHoneyHiveSpanProcessorAgentDetection covering:
    • gen_ai.agent.name string → "agent"
    • gen_ai.operation.name=agent → "agent" (case-insensitive)
    • Agent name takes priority over gen_ai.request.model
    • Only gen_ai.request.model (no agent name) → falls through to pattern matching
    • Non-string gen_ai.agent.name (e.g. int) → ignored, falls through
    • Empty string gen_ai.agent.name → ignored, falls through
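
The cases listed above can be sketched as a small table-driven check. This is an illustration only: it does not reproduce the tests in `TestHoneyHiveSpanProcessorAgentDetection`, and `detect_agent_type` is a hypothetical stand-in that returns `None` where the real method would fall through to pattern matching.

```python
from typing import Any, Dict, List, Optional, Tuple


def detect_agent_type(attributes: Dict[str, Any]) -> Optional[str]:
    """Hypothetical stand-in for the agent check under test."""
    agent_name = attributes.get("gen_ai.agent.name")
    operation_name = attributes.get("gen_ai.operation.name")
    if (isinstance(agent_name, str) and agent_name) or (
        isinstance(operation_name, str) and operation_name.lower() == "agent"
    ):
        return "agent"
    return None


# (attributes, expected result); None means "falls through".
CASES: List[Tuple[Dict[str, Any], Optional[str]]] = [
    ({"gen_ai.agent.name": "planner"}, "agent"),
    ({"gen_ai.operation.name": "agent"}, "agent"),
    ({"gen_ai.operation.name": "Agent"}, "agent"),
    (
        {"gen_ai.agent.name": "planner", "gen_ai.request.model": "example-model"},
        "agent",
    ),
    ({"gen_ai.request.model": "example-model"}, None),
    ({"gen_ai.agent.name": 42}, None),
    ({"gen_ai.agent.name": ""}, None),
]

for attrs, expected in CASES:
    assert detect_agent_type(attrs) == expected, attrs
```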

Review & Testing Checklist for Human

  • Verify "agent" is a recognized event type in the backend. The ingestion service's event_type.go must handle this (see companion hive-kube PR #2517). If the backend doesn't recognize "agent", this will cause downstream issues.
  • Verify that gen_ai.agent.name is NOT set on child model call spans (only on the agent span itself). If it's inherited by child spans, this would misclassify model calls as agents.
  • Test with actual pydantic-ai spans: Run the pydantic-ai example and verify that:
    • Agent run spans → honeyhive_event_type=agent
    • Model call spans within the agent → honeyhive_event_type=model
    • Tool call spans → honeyhive_event_type=tool

Notes

This PR addresses Issue #5 from the pydantic-ai tracing pipeline investigation. The companion PR in hive-kube (#2517) addresses the backend-side agent type detection and other pipeline issues. The hive-kube PR should be merged first since the backend needs to recognize the agent event type before the SDK change takes effect.

Devin Session: https://app.devin.ai/sessions/ad3780b7b3f548bfaa530b8fd8533e16
Requested by: unknown ()


Issue #5: Agent runs with gen_ai.agent.name were misclassified as 'model'
because the pattern matcher found gen_ai.request.model first.

Added Priority 4.5 check in _detect_event_type: if gen_ai.agent.name is
present or gen_ai.operation.name='agent', classify as 'agent' type before
falling through to general pattern matching.

Co-Authored-By: unknown <>
@devin-ai-integration devin-ai-integration bot requested a review from a team March 2, 2026 10:21
@devin-ai-integration

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

snyk-io bot commented Mar 2, 2026

Snyk checks have passed. No issues have been found so far.

| Scanner | Critical | High | Medium | Low | Total (0) |
| Open Source Security | 0 | 0 | 0 | 0 | 0 issues |
| Licenses | 0 | 0 | 0 | 0 | 0 issues |

claude bot commented Mar 2, 2026

Claude finished @devin-ai-integration[bot]'s task in 2m 23s.


Review of PR #292 — Complete

  • Gather context and read changed files
  • Understand the existing _detect_event_type priority chain
  • Review the PR changes for correctness and edge cases
  • Post review feedback

Summary

Reviewed the new Priority 4.5 agent detection logic in _detect_event_type. The fix correctly addresses the pydantic-ai misclassification problem. Posted 2 inline comments and 1 top-level comment covering:

  1. Inconsistency: OpenInference maps AGENT → "chain" (Priority 4) but this new code returns "agent" (Priority 4.5). Needs clarification on whether this is intentional or if the OpenInference mapping should be updated.
  2. Type safety: The agent_name truthiness check should use isinstance(agent_name, str) for consistency with the operation_name check.
  3. Missing tests: No unit tests for the new detection path. Listed 5 specific test cases that should be added.
  4. Backend dependency: The hive-kube companion PR should land before/with this to avoid "agent" being an unrecognized event type.

github-actions bot commented Mar 2, 2026

📚 Documentation preview built — Download artifact

Review instructions & validation status

How to Review

  1. Download the artifact from the link above
  2. Extract the files
  3. Open index.html in your browser

Validation Status

  • API validation: ✅ Passed
  • Build process: ✅ Successful
  • Import tests: ✅ All imports working

operation_name,
)
return "agent"

Inconsistency with OpenInference mapping: The OPENINFERENCE_TO_HONEYHIVE mapping at Priority 4 (line 1040) maps "AGENT" → "chain", but this new Priority 4.5 block returns "agent". This means the same conceptual operation (an agent run) gets classified differently depending on which instrumentation library produced the span:

  • OpenInference agent span → "chain"
  • GenAI semantic conventions agent span → "agent"

Is this intentional? If so, it should be documented clearly. If "agent" is the desired canonical type going forward, consider updating the OpenInference mapping too ("AGENT": "agent") for consistency.
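
To make the divergence concrete, here is a small sketch of the two detection paths side by side. It is an assumption-laden illustration, not the SDK's code: the mapping contains only the one entry quoted above, `classify` is a hypothetical function, and reading the span kind from the `openinference.span.kind` attribute is assumed rather than confirmed from the diff.

```python
from typing import Any, Mapping, Optional

# Only the entry quoted in this review; the real mapping has more keys.
OPENINFERENCE_TO_HONEYHIVE = {"AGENT": "chain"}


def classify(attributes: Mapping[str, Any]) -> Optional[str]:
    """Hypothetical two-step classifier mirroring Priority 4 and 4.5."""
    # Priority 4: OpenInference span kind mapping.
    span_kind = attributes.get("openinference.span.kind")
    if isinstance(span_kind, str) and span_kind in OPENINFERENCE_TO_HONEYHIVE:
        return OPENINFERENCE_TO_HONEYHIVE[span_kind]

    # Priority 4.5: GenAI semantic conventions.
    agent_name = attributes.get("gen_ai.agent.name")
    if isinstance(agent_name, str) and agent_name:
        return "agent"
    return None


# The same conceptual operation, reported by two instrumentation styles.
print(classify({"openinference.span.kind": "AGENT"}))  # chain
print(classify({"gen_ai.agent.name": "planner"}))       # agent
```

Changing the mapping entry to `"AGENT": "agent"` would make both calls agree, which is the unification suggested above.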


Missing unit tests: The PR description itself suggests adding a unit test for this Priority 4.5 path. Given that this is a published SDK, please add tests covering:

  1. Span with gen_ai.agent.name set → returns "agent"
  2. Span with gen_ai.operation.name = "agent" → returns "agent"
  3. Span with gen_ai.operation.name = "Agent" (case variation) → returns "agent"
  4. Span with both gen_ai.agent.name AND gen_ai.request.model → returns "agent" (not "model")
  5. Span with only gen_ai.request.model (no agent attrs) → falls through to pattern matching

if agent_name or (
    isinstance(operation_name, str) and operation_name.lower() == "agent"
):
    self._safe_log(

Truthy check on agent_name may be too broad: attributes.get("gen_ai.agent.name") could return non-string truthy values (e.g., an integer, a list) from OTel attributes. While unlikely in practice, for defensive coding in a published SDK, consider mirroring the isinstance guard used for operation_name:

Suggested change
self._safe_log(
if (isinstance(agent_name, str) and agent_name) or (
isinstance(operation_name, str)
and operation_name.lower() == "agent"
):
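
As a rough illustration of why the guard matters, the snippet below compares a plain truthiness check with the `isinstance`-guarded one on a few sample values. Both helper names are hypothetical and the sample values are invented; the only point is which inputs each check accepts.

```python
from typing import Any


def loose_check(agent_name: Any) -> bool:
    """Truthiness only, as in the original condition."""
    return bool(agent_name)


def guarded_check(agent_name: Any) -> bool:
    """Non-empty string only, as in the suggested condition."""
    return isinstance(agent_name, str) and bool(agent_name)


samples = ["planner", "", 7, ["planner"], None]
results = [(value, loose_check(value), guarded_check(value)) for value in samples]

for value, loose, guarded in results:
    print(f"{value!r:12} loose={loose!s:5} guarded={guarded}")
```

The two checks agree on `"planner"`, `""`, and `None`; they differ on `7` and `["planner"]`, which the loose check accepts and the guarded check rejects.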

claude bot commented Mar 2, 2026

Overall Review

What this PR does well:

  • Correctly identifies the root cause: pydantic-ai agent spans carry both gen_ai.agent.name and gen_ai.request.model, causing misclassification
  • Good defensive coding with isinstance check on operation_name and case-insensitive comparison
  • Proper placement in the priority chain (after OpenInference, before general pattern matching)
  • Wrapped in the existing try/except so failures degrade gracefully

Key concerns:

  1. Backend readiness for "agent" event type: The PR description itself flags this as a risk. Shipping this SDK change before the backend recognizes "agent" could cause silent data loss or processing errors. The companion hive-kube PR should land first (or simultaneously).

  2. Inconsistent agent classification: The OpenInference mapping (Priority 4, line 1040) maps "AGENT" → "chain", while this new code returns "agent". Same concept, different output depending on instrumentation library. This should be intentional and documented, or unified.

  3. No unit tests included: The checklist in the PR body calls for tests, but none are included. For a published SDK, the new detection path needs test coverage before merge.

Documentation

No public API changes, so no doc updates needed. However, if "agent" becomes a new recognized event type, the tracing documentation should eventually list it alongside "model", "tool", and "chain".

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.

agent_name,
operation_name,
)
return "agent"

🔴 New agent detection returns unrecognized event type "agent" instead of "chain"

The new Priority 4.5 agent detection returns "agent" as the event type, but "agent" is not a valid event type in the HoneyHive system.

Root Cause and Impact

The valid event types are {"model", "tool", "chain"} as defined in src/honeyhive/tracer/core/operations.py:913 and the EventType enum at src/honeyhive/models/__init__.py:13-29 (which lists model, tool, chain, session, generic — no agent).

Critically, the existing OpenInference mapping just 18 lines above at span_processor.py:1040 already establishes the convention for agent operations:

"AGENT": "chain",  # Agent operations (map to chain)

This means agent spans detected via OpenInference's span.kind=AGENT are correctly mapped to "chain", but agent spans detected via GenAI semantic conventions (gen_ai.agent.name or gen_ai.operation.name=agent) are incorrectly mapped to "agent".

In OTLP export mode (the active code path), the value "agent" is set directly on the span attribute honeyhive_event_type at span_processor.py:692 and sent to the backend without any normalization. The _normalize_event_type method at operations.py:910-923 that would have caught this (by falling back to "tool") is only invoked in the deprecated client mode path.
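
The difference between the two paths can be sketched as follows. This is a simplified model built only from the description above: the valid set and the `"tool"` fallback come from this comment, while the function names and the returned dictionaries are hypothetical and do not reflect the SDK's real signatures.

```python
VALID_EVENT_TYPES = {"model", "tool", "chain"}


def normalize_event_type(event_type: str) -> str:
    """Fall back to "tool" for unrecognized values, as described above."""
    return event_type if event_type in VALID_EVENT_TYPES else "tool"


def client_mode_payload(detected: str) -> dict:
    # Deprecated client mode: the detected value is normalized first.
    return {"event_type": normalize_event_type(detected)}


def otlp_mode_attributes(detected: str) -> dict:
    # OTLP export mode: the detected value is set on the span as-is.
    return {"honeyhive_event_type": detected}


print(client_mode_payload("agent"))   # {'event_type': 'tool'}
print(otlp_mode_attributes("agent"))  # {'honeyhive_event_type': 'agent'}
```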

Impact: Agent spans from GenAI-convention instrumentors (e.g., AWS Strands) will be sent to the HoneyHive backend with an unrecognized "agent" event type, likely causing incorrect categorization or processing errors on the backend.

Suggested change
return "agent"
return "chain"

Contributor Author

This is intentional, not a bug. The companion PR https://github.com/honeyhiveai/hive-kube/pull/2517 adds backend support for the "agent" event type in the ingestion pipeline (event_type.go). The SDK sets honeyhive_event_type as a hint attribute on the OTLP span, and the backend's ingestion service performs the actual event type classification.

The EventType enum in the SDK is used for the deprecated client-mode path, not the active OTLP export path. In OTLP mode, the value flows as a string attribute and is interpreted by the backend, which now recognizes "agent" as a valid type.

Returning "chain" here would defeat the purpose of Issue #2 (agent runs incorrectly typed as chain instead of agent).

…5 agent detection

- Add isinstance(agent_name, str) guard to match operation_name check (type safety)
- Add 7 unit tests for Priority 4.5 agent detection:
  * gen_ai.agent.name → agent
  * gen_ai.operation.name=agent → agent
  * Case-insensitive operation_name (Agent) → agent
  * Both agent_name + request_model → agent (not model)
  * Only request_model → falls through to pattern matching
  * Non-string agent_name (int) → ignored, falls through
  * Empty string agent_name → ignored, falls through

Co-Authored-By: unknown <>