fix: detect agent event type from GenAI semantic conventions #292
Issue #5: Agent runs with gen_ai.agent.name were misclassified as 'model' because the pattern matcher found gen_ai.request.model first. Added Priority 4.5 check in _detect_event_type: if gen_ai.agent.name is present or gen_ai.operation.name='agent', classify as 'agent' type before falling through to general pattern matching.

Co-Authored-By: unknown <>
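The misclassification and the fix can be illustrated with a minimal, self-contained sketch. The function names and the one-rule pattern matcher (including its "tool" default) are hypothetical stand-ins, not the SDK's actual `_detect_event_type`; only the attribute keys are taken from this PR.

```python
from typing import Any, Mapping


def match_patterns(attributes: Mapping[str, Any]) -> str:
    """Hypothetical stand-in for the general pattern matcher (Priority 5)."""
    if "gen_ai.request.model" in attributes:
        return "model"
    return "tool"  # assumed default, for illustration only


def detect_before_fix(attributes: Mapping[str, Any]) -> str:
    # With no agent check, any span carrying gen_ai.request.model is "model".
    return match_patterns(attributes)


def detect_after_fix(attributes: Mapping[str, Any]) -> str:
    # Priority 4.5: agent attributes take precedence over pattern matching.
    agent_name = attributes.get("gen_ai.agent.name")
    operation_name = attributes.get("gen_ai.operation.name")
    if agent_name or (
        isinstance(operation_name, str) and operation_name.lower() == "agent"
    ):
        return "agent"
    return match_patterns(attributes)


# An agent run span that carries both attributes, as described in Issue #5.
agent_span = {
    "gen_ai.agent.name": "researcher",
    "gen_ai.request.model": "some-model",
}
print(detect_before_fix(agent_span))
print(detect_after_fix(agent_span))
```

Under these assumptions the first call reports "model" and the second reports "agent", while a span with only gen_ai.request.model still falls through to the pattern matcher.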
Claude finished @devin-ai-integration[bot]'s task in 2m 23s
Review of PR #292: Complete
Summary
Reviewed the new Priority 4.5 agent detection logic in
            operation_name,
        )
        return "agent"
Inconsistency with OpenInference mapping: The OPENINFERENCE_TO_HONEYHIVE mapping at Priority 4 (line 1040) maps "AGENT" → "chain", but this new Priority 4.5 block returns "agent". This means the same conceptual operation (an agent run) gets classified differently depending on which instrumentation library produced the span:
- OpenInference agent span → "chain"
- GenAI semantic conventions agent span → "agent"
Is this intentional? If so, it should be documented clearly. If "agent" is the desired canonical type going forward, consider updating the OpenInference mapping too ("AGENT": "agent") for consistency.
Missing unit tests: The PR description itself suggests adding a unit test for this Priority 4.5 path. Given that this is a published SDK, please add tests covering:
- Span with gen_ai.agent.name set → returns "agent"
- Span with gen_ai.operation.name = "agent" → returns "agent"
- Span with gen_ai.operation.name = "Agent" (case variation) → returns "agent"
- Span with both gen_ai.agent.name AND gen_ai.request.model → returns "agent" (not "model")
- Span with only gen_ai.request.model (no agent attrs) → falls through to pattern matching
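The cases listed above could be covered along these lines. This is only a sketch: `detect_event_type` is a hypothetical, simplified stand-in for the span processor's `_detect_event_type` (its real signature and pattern matcher are not shown in this thread), and the test class name is illustrative.

```python
import unittest
from typing import Any, Mapping


def detect_event_type(attributes: Mapping[str, Any]) -> str:
    """Hypothetical stand-in: Priority 4.5 check plus a one-rule fall-through."""
    agent_name = attributes.get("gen_ai.agent.name")
    operation_name = attributes.get("gen_ai.operation.name")
    if agent_name or (
        isinstance(operation_name, str) and operation_name.lower() == "agent"
    ):
        return "agent"
    # Simplified fall-through; the real pattern matcher has more rules.
    return "model" if "gen_ai.request.model" in attributes else "tool"


class AgentDetectionSketchTest(unittest.TestCase):
    def test_agent_name_set(self) -> None:
        self.assertEqual(detect_event_type({"gen_ai.agent.name": "planner"}), "agent")

    def test_operation_name_agent(self) -> None:
        self.assertEqual(detect_event_type({"gen_ai.operation.name": "agent"}), "agent")

    def test_operation_name_case_variation(self) -> None:
        self.assertEqual(detect_event_type({"gen_ai.operation.name": "Agent"}), "agent")

    def test_agent_name_wins_over_request_model(self) -> None:
        attrs = {"gen_ai.agent.name": "planner", "gen_ai.request.model": "m"}
        self.assertEqual(detect_event_type(attrs), "agent")

    def test_only_request_model_falls_through(self) -> None:
        self.assertEqual(detect_event_type({"gen_ai.request.model": "m"}), "model")


# Run the suite explicitly so the snippet also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(AgentDetectionSketchTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

In the real SDK these tests would call the span processor method rather than a free function, so the fixtures would differ.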
        if agent_name or (
            isinstance(operation_name, str) and operation_name.lower() == "agent"
        ):
            self._safe_log(
Truthy check on agent_name may be too broad: attributes.get("gen_ai.agent.name") could return non-string truthy values (e.g., an integer, a list) from OTel attributes. While unlikely in practice, for defensive coding in a published SDK, consider mirroring the isinstance guard used for operation_name:
Suggested change:
    -     self._safe_log(
    +     if (isinstance(agent_name, str) and agent_name) or (
    +         isinstance(operation_name, str)
    +         and operation_name.lower() == "agent"
    +     ):
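To make the concern concrete, the following sketch compares the plain truthy check with the guarded version from the suggestion. The two predicate names are hypothetical; only the boolean expressions mirror the code under review.

```python
from typing import Any


def is_agent_loose(agent_name: Any, operation_name: Any) -> bool:
    # Truthy check: any non-empty value of any type counts as an agent name.
    return bool(agent_name) or (
        isinstance(operation_name, str) and operation_name.lower() == "agent"
    )


def is_agent_strict(agent_name: Any, operation_name: Any) -> bool:
    # Guarded check: only a non-empty string counts as an agent name.
    return (isinstance(agent_name, str) and bool(agent_name)) or (
        isinstance(operation_name, str) and operation_name.lower() == "agent"
    )


# Compare both predicates on non-string, empty, and valid agent names.
for value in (42, ["a"], "", "planner"):
    print(repr(value), is_agent_loose(value, None), is_agent_strict(value, None))
```

The two predicates agree on string inputs and differ only for truthy non-string values such as an integer or a list, which the guarded version ignores.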
Overall Review
What this PR does well:
Key concerns:
Documentation
No public API changes, so no doc updates needed. However, if
            agent_name,
            operation_name,
        )
        return "agent"
🔴 New agent detection returns unrecognized event type "agent" instead of "chain"
The new Priority 4.5 agent detection returns "agent" as the event type, but "agent" is not a valid event type in the HoneyHive system.
Root Cause and Impact
The valid event types are {"model", "tool", "chain"} as defined in src/honeyhive/tracer/core/operations.py:913 and the EventType enum at src/honeyhive/models/__init__.py:13-29 (which lists model, tool, chain, session, generic — no agent).
Critically, the existing OpenInference mapping just 18 lines above at span_processor.py:1040 already establishes the convention for agent operations:
    "AGENT": "chain",  # Agent operations (map to chain)
This means agent spans detected via OpenInference's span.kind=AGENT are correctly mapped to "chain", but agent spans detected via GenAI semantic conventions (gen_ai.agent.name or gen_ai.operation.name=agent) are incorrectly mapped to "agent".
In OTLP export mode (the active code path), the value "agent" is set directly on the span attribute honeyhive_event_type at span_processor.py:692 and sent to the backend without any normalization. The _normalize_event_type method at operations.py:910-923 that would have caught this (by falling back to "tool") is only invoked in the deprecated client mode path.
Impact: Agent spans from GenAI-convention instrumentors (e.g., AWS Strands) will be sent to the HoneyHive backend with an unrecognized "agent" event type, likely causing incorrect categorization or processing errors on the backend.
Suggested change:
    -     return "agent"
    +     return "chain"
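The difference between the two export paths described in this comment can be sketched as follows. Everything here is an assumption built from the comment's wording: `normalize_event_type`, `export_client_mode`, and `export_otlp_mode` are hypothetical names, and the valid-type set is the one the comment quotes.

```python
from typing import Dict

# Valid event types as quoted in the review comment above.
VALID_EVENT_TYPES = {"model", "tool", "chain"}


def normalize_event_type(event_type: str) -> str:
    """Hypothetical analogue of the client-mode fallback to "tool"."""
    return event_type if event_type in VALID_EVENT_TYPES else "tool"


def export_client_mode(detected: str) -> str:
    # Deprecated path: unrecognized values are coerced before sending.
    return normalize_event_type(detected)


def export_otlp_mode(detected: str) -> Dict[str, str]:
    # Active path: the detected value is set on the span attribute unchanged.
    return {"honeyhive_event_type": detected}


print(export_client_mode("agent"))
print(export_otlp_mode("agent"))
```

Under these assumptions, "agent" would be coerced to "tool" in client mode but reach the backend unchanged in OTLP mode, so the outcome depends on whether the backend recognizes it.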
This is intentional, not a bug. The companion PR https://github.com/honeyhiveai/hive-kube/pull/2517 adds backend support for the "agent" event type in the ingestion pipeline (event_type.go). The SDK sets honeyhive_event_type as a hint attribute on the OTLP span, and the backend's ingestion service performs the actual event type classification.
The EventType enum in the SDK is used for the deprecated client-mode path, not the active OTLP export path. In OTLP mode, the value flows as a string attribute and is interpreted by the backend, which now recognizes "agent" as a valid type.
Returning "chain" here would defeat the purpose of Issue #2 (agent runs incorrectly typed as chain instead of agent).
…5 agent detection
- Add isinstance(agent_name, str) guard to match operation_name check (type safety)
- Add 7 unit tests for Priority 4.5 agent detection:
  * gen_ai.agent.name → agent
  * gen_ai.operation.name=agent → agent
  * Case-insensitive operation_name (Agent) → agent
  * Both agent_name + request_model → agent (not model)
  * Only request_model → falls through to pattern matching
  * Non-string agent_name (int) → ignored, falls through
  * Empty string agent_name → ignored, falls through

Co-Authored-By: unknown <>
fix: detect agent event type from GenAI semantic conventions
Summary
Adds agent-specific event type detection (Priority 4.5) in _detect_event_type before the general pattern matcher runs.

Problem: pydantic-ai agent run spans carry both gen_ai.agent.name and gen_ai.request.model. The existing Priority 5 pattern matcher sees gen_ai.request.model first and classifies agent spans as "model", which is incorrect.

Fix: Before pattern matching, check for gen_ai.agent.name or gen_ai.operation.name=agent. If either is present, return "agent" immediately. Both checks include isinstance(..., str) guards to safely handle non-string attribute values from OTEL.

Updates since last revision
- Added isinstance(agent_name, str) type guard to match the existing isinstance(operation_name, str) guard. This prevents truthy non-string values (e.g. int, dict) from incorrectly triggering agent detection.
- Added unit tests in TestHoneyHiveSpanProcessorAgentDetection covering:
  - gen_ai.agent.name string → "agent"
  - gen_ai.operation.name=agent → "agent" (case-insensitive)
  - Both gen_ai.agent.name and gen_ai.request.model → "agent" (not "model")
  - Only gen_ai.request.model (no agent name) → falls through to pattern matching
  - Non-string gen_ai.agent.name (e.g. int) → ignored, falls through
  - Empty-string gen_ai.agent.name → ignored, falls through

Review & Testing Checklist for Human
- "agent" is a recognized event type in the backend. The ingestion service's event_type.go must handle this (see companion hive-kube PR #2517). If the backend doesn't recognize "agent", this will cause downstream issues.
- gen_ai.agent.name is NOT set on child model call spans (only on the agent span itself). If it's inherited by child spans, this would misclassify model calls as agents.
- honeyhive_event_type=agent
- honeyhive_event_type=model
- honeyhive_event_type=tool

Notes
This PR addresses Issue #5 from the pydantic-ai tracing pipeline investigation. The companion PR in hive-kube (#2517) addresses the backend-side agent type detection and other pipeline issues. The hive-kube PR should be merged first since the backend needs to recognize the agent event type before the SDK change takes effect.

Devin Session: https://app.devin.ai/sessions/ad3780b7b3f548bfaa530b8fd8533e16