
fix(agent): use full tool schema for DeepSeek V4 #7862

Closed

imwyvern wants to merge 1 commit into AstrBotDevs:master from imwyvern:clawoss/fix/deepseek-v4-tool-schema

Conversation


@imwyvern imwyvern commented Apr 28, 2026

Fixes #7855

DeepSeek V4 rejects the skills-like light tool schema, which causes AstrBot to fall back as if the model did not support function calling at all. For DeepSeek V4 models, the local agent runner now uses the full tool schema instead of mutating func_tool into the light schema.

Added a regression test that verifies deepseek-v4-flash keeps the original full tool set when skills_like is selected.

Modifications / 改动点

  • Normalize skills_like to full for deepseek-v4-flash and deepseek-v4-pro.

  • Keep existing skills_like behavior unchanged for other models.

  • Add coverage for the DeepSeek V4 schema-mode override.

  • This is NOT a breaking change. / 这不是一个破坏性变更。

Screenshots or Test Results / 运行截图或测试结果

uv run pytest -q tests/test_tool_loop_agent_runner.py::test_deepseek_v4_uses_full_tool_schema_instead_of_skills_like tests/test_tool_loop_agent_runner.py::test_skills_like_requery_passes_extra_user_content_parts
# 2 passed, 3 warnings in 11.65s

uv run ruff check astrbot/core/agent/runners/tool_loop_agent_runner.py tests/test_tool_loop_agent_runner.py
# All checks passed!

Checklist / 检查清单

  • 😊 If there are new features added in the PR, I have discussed it with the authors through issues/emails, etc.
    / 如果 PR 中有新加入的功能,已经通过 Issue / 邮件等方式和作者讨论过。

  • 👀 My changes have been well-tested, and "Verification Steps" and "Screenshots" have been provided above.
    / 我的更改经过了良好的测试,并已在上方提供了“验证步骤”和“运行截图”

  • 🤓 I have ensured that no new dependencies are introduced, OR if new dependencies are introduced, they have been added to the appropriate locations in requirements.txt and pyproject.toml.
    / 我确保没有引入新依赖库,或者引入了新依赖库的同时将其添加到 requirements.txt 和 pyproject.toml 文件相应位置。

  • 😮 My changes do not introduce malicious code.
    / 我的更改没有引入恶意代码。

Summary by Sourcery

Normalize tool schema mode handling for DeepSeek V4 models to ensure compatible tool invocation while preserving behavior for other providers.

Enhancements:

  • Add model-aware normalization of the skills_like tool schema mode, forcing DeepSeek V4 models to use the full tool schema with logging for this override.

Tests:

  • Add regression test ensuring DeepSeek V4 models retain the full tool schema when skills_like is requested and that internal state is updated accordingly.

@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Apr 28, 2026
Contributor

@sourcery-ai sourcery-ai Bot left a comment


Hey - I've found 1 issue, and left some high level feedback:

  • The DeepSeek-specific handling in _normalize_tool_schema_mode hardcodes model names in the runner; consider centralizing model capability logic (e.g., in the provider or a capability map) so adding or changing DeepSeek variants doesn’t require touching agent runner internals.
  • The info-level log in _normalize_tool_schema_mode may be noisy if many requests are made with DeepSeek V4 models; consider downgrading to debug or adding a one-time warning mechanism if this is expected to be a common code path.


Comment on lines +1286 to +1290:

```python
    req = ProviderRequest(
        prompt="test",
        func_tool=tool_set,
        contexts=[],
        model="deepseek-v4-flash",
```
Contributor

suggestion (testing): Consider adding a test where the model name comes from the provider instead of the request.

This path supports models from both request.model and provider.get_model(), but this test only covers the former. Please also cover the case where request.model is None and MockProvider.get_model() returns a DeepSeek V4 name (either via a separate test or parametrization) to ensure normalization works in that scenario.

Suggested implementation:

```python
from unittest.mock import AsyncMock, Mock


@pytest.mark.asyncio
async def test_deepseek_v4_uses_full_tool_schema_instead_of_skills_like():
    provider = MockProvider()
    tool = FunctionTool(
        name="test_tool",
        description="测试",
        parameters={"type": "object", "properties": {"query": {"type": "string"}}},
        handler=AsyncMock(),
    )
    tool_set = ToolSet(tools=[tool])
    req = ProviderRequest(
        prompt="test",
        func_tool=tool_set,
        contexts=[],
        model="deepseek-v4-flash",
    )
    runner = ToolLoopAgentRunner()

    await runner.reset(
        provider=provider,
        request=req,
        run_context=ContextWrapper(context=None),
        tool_executor=cast(Any, MockToolExecutor()),
        agent_hooks=MockHooks(),
        tool_schema_mode="skills_like",
    )


@pytest.mark.asyncio
async def test_deepseek_v4_uses_full_tool_schema_when_model_from_provider():
    provider = MockProvider()
    # Ensure provider.get_model returns a DeepSeek V4 model name when request.model is None
    provider.get_model = Mock(return_value="deepseek-v4-flash")

    tool = FunctionTool(
        name="test_tool",
        description="测试",
        parameters={"type": "object", "properties": {"query": {"type": "string"}}},
        handler=AsyncMock(),
    )
    tool_set = ToolSet(tools=[tool])
    req = ProviderRequest(
        prompt="test",
        func_tool=tool_set,
        contexts=[],
        model=None,
    )
    runner = ToolLoopAgentRunner()

    await runner.reset(
        provider=provider,
        request=req,
        run_context=ContextWrapper(context=None),
        tool_executor=cast(Any, MockToolExecutor()),
        agent_hooks=MockHooks(),
        tool_schema_mode="skills_like",
    )
```

@dosubot dosubot Bot added the area:provider The bug / feature is about AI Provider, Models, LLM Agent, LLM Agent Runner. label Apr 28, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces a normalization step for tool schema modes in the ToolLoopAgentRunner, specifically forcing the 'full' schema for DeepSeek V4 models which do not support the 'skills-like' mode. A new test case has been added to verify this logic. The review feedback suggests using a prefix-based check for the model name to improve robustness against future model variants within the same family.

Comment on lines +338 to +339:

```python
if model_name not in {"deepseek-v4-flash", "deepseek-v4-pro"}:
    return tool_schema_mode
```
Contributor

medium

Instead of hardcoding specific model variants, consider using a prefix check for deepseek-v4. This makes the logic more robust against future model releases within the same family (e.g., deepseek-v4-chat or simply deepseek-v4) that likely share the same tool schema limitations.

Suggested change:

```diff
-if model_name not in {"deepseek-v4-flash", "deepseek-v4-pro"}:
-    return tool_schema_mode
+if not model_name.startswith("deepseek-v4"):
+    return tool_schema_mode
```


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 2edcd3f61d



```python
model = (request.model or provider.get_model() or "").lower().strip()
model_name = model.rsplit("/", 1)[-1]
if model_name not in {"deepseek-v4-flash", "deepseek-v4-pro"}:
```

P2: Handle DeepSeek V4 model aliases in schema normalization

The override only triggers when model_name is exactly deepseek-v4-flash or deepseek-v4-pro, so DeepSeek V4 identifiers with valid suffixes/prefix variants (for example provider-qualified or tier-suffixed IDs like deepseek-v4-flash:free) will miss this branch and remain in skills_like mode. In that case the runner still sends light tool schemas and can hit the same function-calling rejection this fix is meant to prevent.
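
One way to address the alias concern above is to strip both the provider prefix and any tier suffix before a prefix match. This is an illustrative sketch only, written under the assumption that aliases take the form `provider/name:tier` (e.g. `deepseek-v4-flash:free`); it is not the project's code.

```python
def is_deepseek_v4(model: str) -> bool:
    """Match DeepSeek V4 IDs, including provider-qualified or tier-suffixed aliases."""
    name = model.lower().strip().rsplit("/", 1)[-1]  # drop "provider/" prefix
    name = name.split(":", 1)[0]                     # drop ":free"-style tier suffix
    return name.startswith("deepseek-v4")
```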


@Soulter Soulter closed this in 6b36e1a Apr 28, 2026
LIghtJUNction pushed a commit that referenced this pull request Apr 28, 2026


Development

Successfully merging this pull request may close these issues.

[Bug] DeepSeek V4 model cannot call tools
