Log Pydantic AI tool definitions to Langfuse #86

Open
rafacm wants to merge 2 commits into main from claude/add-tools-langfuse-8ygHf

Owner

@rafacm rafacm commented Mar 23, 2026

Summary

  • Add _get_tool_definitions() and _log_tool_definitions() helpers to episodes/agents/agent.py that extract tool schemas from the Pydantic AI agent and log them as a recovery-tool-definitions Langfuse event inside the recovery trace
  • Pydantic AI's OTel instrumentation captures conversation flow but omits the tools JSON schemas sent to the model — this fills that gap so tool definitions are visible in Langfuse alongside the LLM request traces
  • Add 6 unit tests covering schema extraction and Langfuse event creation
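
The logging half of the summary above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `log_tool_definitions`, `RecordingClient`, and `FailingClient` are hypothetical names, and the `create_event(name=..., metadata=...)` method on the client is an assumed interface standing in for whatever Langfuse SDK call the real helper uses. The one behaviour taken from the PR is that the event is named `recovery-tool-definitions` and that a tracing failure should not break the agent run.

```python
import logging
from typing import Any

logger = logging.getLogger(__name__)

EVENT_NAME = "recovery-tool-definitions"  # event name taken from the PR summary


def log_tool_definitions(client: Any, tool_defs: list[dict[str, Any]], episode_id: int) -> bool:
    """Send tool definitions to a tracing client without ever raising.

    `client` is a hypothetical object exposing `create_event(name=..., metadata=...)`.
    Returns True if the event was sent, False if it was skipped or failed.
    """
    if not tool_defs:
        # Nothing to log, e.g. schema extraction failed upstream.
        return False
    try:
        client.create_event(
            name=EVENT_NAME,
            metadata={
                "episode_id": episode_id,
                "tool_count": len(tool_defs),
                "tools": tool_defs,
            },
        )
    except Exception:
        # Observability must not take down the recovery run.
        logger.debug("Failed to log tool definitions", exc_info=True)
        return False
    return True


class RecordingClient:
    """Stand-in client that records events in memory (illustration only)."""

    def __init__(self) -> None:
        self.events: list[tuple[str, dict[str, Any]]] = []

    def create_event(self, *, name: str, metadata: dict[str, Any]) -> None:
        self.events.append((name, metadata))


class FailingClient:
    """Stand-in client whose backend is always unavailable (illustration only)."""

    def create_event(self, *, name: str, metadata: dict[str, Any]) -> None:
        raise RuntimeError("tracing backend unavailable")
```

Returning a boolean rather than raising keeps the caller's control flow unchanged whether or not tracing is configured, which matches the error-handling tests mentioned in the review.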

Test plan

  • uv run python manage.py test episodes.tests.test_agent_langfuse — all 6 tests pass
  • Trigger a recovery agent run (e.g. via a scraping failure) and verify the recovery-tool-definitions event appears in the Langfuse trace with full JSON schemas for all 11 tools

Documentation

  • [Plan](doc/plans/2026-03-23-add-tools-langfuse.md)
  • [Feature](doc/features/2026-03-23-add-tools-langfuse.md)
  • [Planning session](doc/sessions/2026-03-23-add-tools-langfuse-planning-session.md)
  • [Implementation session](doc/sessions/2026-03-23-add-tools-langfuse-implementation-session.md)

Pydantic AI's OTel instrumentation captures conversation flow but not
the tools JSON schemas sent to the model. Add _get_tool_definitions()
and _log_tool_definitions() helpers that extract tool schemas from the
agent's _function_toolset and log them as a recovery-tool-definitions
Langfuse event inside the recovery trace.

https://claude.ai/code/session_01TQ9vfjdH5MLG6Dy9ivRuAS

Copilot AI left a comment

Pull request overview

Adds explicit Langfuse event logging for the recovery agent’s Pydantic AI tool JSON schemas so they’re visible in Langfuse traces (since Pydantic AI OTel spans don’t include the tools array).

Changes:

  • Add _get_tool_definitions() / _log_tool_definitions() helpers and emit a recovery-tool-definitions Langfuse event within the recovery trace context.
  • Add unit tests covering tool schema extraction and Langfuse event creation behavior.
  • Add plan/feature/session documentation and a changelog entry for the feature.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| episodes/agents/agent.py | Extract tool schemas from the Pydantic AI agent and log them as a Langfuse event inside propagate_attributes. |
| episodes/tests/test_agent_langfuse.py | New tests for tool definition extraction + Langfuse event creation/error handling. |
| doc/sessions/2026-03-23-add-tools-langfuse-planning-session.md | Planning transcript for the change. |
| doc/sessions/2026-03-23-add-tools-langfuse-implementation-session.md | Implementation transcript for the change. |
| doc/plans/2026-03-23-add-tools-langfuse.md | Plan document describing problem/approach/steps. |
| doc/features/2026-03-23-add-tools-langfuse.md | Feature doc describing the implemented behavior and verification steps. |
| CHANGELOG.md | Add entry describing the new Langfuse tool-definition logging. |

Comment on lines +4 to +13
import os
import unittest
from unittest.mock import MagicMock, patch

# Ensure Django settings are configured before importing app code.
os.environ.setdefault("DJANGO_SETTINGS_MODULE", "ragtime.settings")

import django
django.setup()

Copilot AI Mar 23, 2026

This test module sets DJANGO_SETTINGS_MODULE and calls django.setup() at import time. Under Django’s test runner (manage.py test) settings are already configured and Django is already set up, so this is redundant and introduces import-time side effects that can make the test suite harder to run under other runners/tools. Prefer relying on the Django test runner setup (or following the existing pattern in episodes/tests/test_agent_tools.py: try/except ImportError → SkipTest) and remove the manual environment + setup calls from the module top-level.

Suggested change
- import os
- import unittest
- from unittest.mock import MagicMock, patch
- # Ensure Django settings are configured before importing app code.
- os.environ.setdefault("DJANGO_SETTINGS_MODULE", "ragtime.settings")
- import django
- django.setup()
+ import unittest
+ from unittest.mock import MagicMock, patch

Owner Author

Fixed in 9d40b88. Removed manual os.environ/django.setup() and switched to the try/except ImportError pattern used by test_agent_tools.py. All imports (including Agent, _get_tool_definitions, _log_tool_definitions) are now at module level inside the try block.
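
For readers unfamiliar with the try/except ImportError → SkipTest pattern mentioned above, here is a small self-contained sketch. The contents of test_agent_tools.py are not shown in this thread, so this is an assumption about the general shape of the pattern rather than a copy of it. `import_or_skip` is a hypothetical helper; in a real test module the same try/except usually sits directly at module top level around the imports.

```python
import importlib
import unittest
from types import ModuleType


def import_or_skip(module_name: str) -> ModuleType:
    """Import a module, or skip the calling test module if it is missing.

    Raising unittest.SkipTest during import makes the unittest loader
    report the whole module as skipped instead of as an import error.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise unittest.SkipTest(f"{module_name} is not available: {exc}") from exc
```

A test module could then start with something like `agent_module = import_or_skip("episodes.agents.agent")`, leaving settings configuration to the test runner instead of calling django.setup() by hand.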

Comment on lines +221 to +244
def _get_tool_definitions(agent):
    """Extract tool definitions from the agent's registered function tools.

    Returns a list of dicts with ``name``, ``description``, and
    ``parameters`` (JSON schema) for each tool (the same information
    that Pydantic AI sends to the model in the ``tools`` array).
    """
    try:
        toolset = agent._function_toolset
        return [
            {
                "name": tool.name,
                "description": tool.description,
                "parameters": tool.function_schema.json_schema,
            }
            for tool in toolset.tools.values()
        ]
    except Exception:
        logger.debug("Failed to extract tool definitions from agent", exc_info=True)
        return []


def _log_tool_definitions(agent, episode_id):
    """Log the agent's tool definitions as a Langfuse event.

Copilot AI Mar 23, 2026

New helper functions are missing type annotations, while the surrounding code in this module uses typed signatures (e.g., _build_agent(), _get_system_prompt(), _attach_screenshots()). Adding parameter/return type hints for _get_tool_definitions() and _log_tool_definitions() would keep the file consistent and make the logged payload shape clearer (e.g., episode_id as int, _log_tool_definitions returning None, and the tool-def list element types).

Owner Author

Fixed in 9d40b88. Added type annotations: _get_tool_definitions(agent: Agent) -> list[dict] and _log_tool_definitions(agent: Agent, episode_id: int) -> None.
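
To make the logged payload shape concrete, the extraction can be exercised without Pydantic AI installed. The sketch below is an illustration under stated assumptions: it types the agent as `Any` instead of `Agent` so that it runs standalone, `ToolDefinition` is a hypothetical TypedDict that is stricter than the `list[dict]` annotation actually added in 9d40b88, and the `fetch_page` tool and its schema are made up. The attribute path (`_function_toolset.tools`, `function_schema.json_schema`) is the one shown in the diff above.

```python
from types import SimpleNamespace
from typing import Any, Optional, TypedDict


class ToolDefinition(TypedDict):
    """Shape of one logged tool definition (hypothetical, for illustration)."""

    name: str
    description: Optional[str]
    parameters: dict[str, Any]  # JSON schema of the tool's arguments


def get_tool_definitions(agent: Any) -> list[ToolDefinition]:
    """Extract tool definitions, returning [] if the agent's internals differ."""
    try:
        return [
            {
                "name": tool.name,
                "description": tool.description,
                "parameters": tool.function_schema.json_schema,
            }
            for tool in agent._function_toolset.tools.values()
        ]
    except Exception:
        # _function_toolset is a private attribute, so fail soft if it moves.
        return []


def make_stub_agent() -> SimpleNamespace:
    """Build a stand-in agent with one made-up tool (not a real Pydantic AI Agent)."""
    schema = {
        "type": "object",
        "properties": {"url": {"type": "string"}},
        "required": ["url"],
    }
    tool = SimpleNamespace(
        name="fetch_page",
        description="Fetch a URL.",
        function_schema=SimpleNamespace(json_schema=schema),
    )
    return SimpleNamespace(_function_toolset=SimpleNamespace(tools={"fetch_page": tool}))
```

Calling `get_tool_definitions(make_stub_agent())` yields a one-element list, while passing an object without `_function_toolset` yields an empty list instead of raising.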


**Date:** 2026-03-23

**Session ID:** 01TQ9vfjdH5MLG6Dy9ivRuAS

Copilot AI Mar 23, 2026

AGENTS.md requires session transcripts to include the actual Claude Code session UUID (example given is a UUID format). The value here (01TQ9vfjdH5MLG6Dy9ivRuAS) doesn’t match that format; please replace it with the real Claude Code session UUID for this transcript (or use unavailable only if it truly can’t be recovered).

Suggested change
- **Session ID:** 01TQ9vfjdH5MLG6Dy9ivRuAS
+ **Session ID:** unavailable

Owner Author

Fixed in 9d40b88. Replaced with unavailable — the real UUID could not be recovered.
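
A check like the one below could catch non-UUID session IDs before they reach a transcript. It is only a sketch: `is_canonical_uuid` is a hypothetical helper, and AGENTS.md may define the required format differently from the canonical 8-4-4-4-12 hexadecimal form assumed here.

```python
import uuid


def is_canonical_uuid(value: str) -> bool:
    """Return True if value is a UUID in canonical hyphenated form."""
    try:
        # uuid.UUID also accepts braces and un-hyphenated hex, so compare
        # against the normalized string to insist on the canonical layout.
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False
```

With this check, `01TQ9vfjdH5MLG6Dy9ivRuAS` is rejected because it is neither 32 hexadecimal digits long nor hyphenated.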


**Date:** 2026-03-23

**Session ID:** 01TQ9vfjdH5MLG6Dy9ivRuAS

Copilot AI Mar 23, 2026

AGENTS.md requires session transcripts to include the actual Claude Code session UUID (example given is a UUID format). The value here (01TQ9vfjdH5MLG6Dy9ivRuAS) doesn’t match that format; please replace it with the real Claude Code session UUID for this transcript (or use unavailable only if it truly can’t be recovered).

Suggested change
- **Session ID:** 01TQ9vfjdH5MLG6Dy9ivRuAS
+ **Session ID:** unavailable

Owner Author

Fixed in 9d40b88. Replaced with unavailable — same as above.

…ns, fix session IDs

- Remove manual django.setup() from test module; use try/except ImportError pattern
- Add type annotations to _get_tool_definitions() and _log_tool_definitions()
- Replace non-UUID session IDs with 'unavailable' in session transcripts

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>