11/21 #7

Open

Cursx wants to merge 15 commits into main from 11/21
Conversation


@Cursx (Owner) commented Nov 21, 2025

Important

  1. Make sure you have read our contribution guidelines
  2. Ensure there is an associated issue and you have been assigned to it
  3. Use the correct syntax to link this PR: Fixes #<issue number>.

Summary

Add a "return direct" mechanism, similar to LangChain's, to the agent module to control whether the agent skips the LLM and directly outputs the tool's results.

Screenshots

Before After
... ...

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

@gemini-code-assist

Summary of Changes

Hello @Cursx, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a 'return direct' mechanism, inspired by Langchain, into the agent modules. This enhancement allows agents to bypass further Large Language Model (LLM) processing and directly output the results of a tool invocation as the final answer. This streamlines agent execution for specific scenarios where a tool's output is definitive and does not require additional LLM reasoning, improving efficiency and control over agent behavior.

Highlights

  • Direct Return Mechanism: Introduced a 'return direct' mechanism, similar to Langchain, allowing agents to bypass further LLM processing and directly output tool results.
  • Agent Runner Modifications: Modified both CoT (Chain-of-Thought) and Function Call agent runners to detect and act upon the 'return direct' flag, streamlining execution.
  • Tool Invoke Metadata Enhancement: Added an extra dictionary field to ToolInvokeMeta to flexibly carry additional metadata, specifically used for the return_direct signal.
  • Tool Engine Integration: Implemented logic in the tool engine to detect the return_direct signal from internal variable messages within tool responses and pass it to the agent runners.
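Conceptually, the mechanism threads a return_direct flag from the tool invocation through to the agent runner via the new extra dict on ToolInvokeMeta. A minimal, self-contained sketch of that flow (the class and function bodies here are illustrative stand-ins, not Dify's actual code):

```python
from dataclasses import dataclass, field
from typing import Any


# Stand-in for Dify's ToolInvokeMeta; the PR adds an `extra` dict so
# arbitrary metadata -- here the return_direct signal -- can ride along.
@dataclass
class ToolInvokeMeta:
    tool_name: str
    extra: dict[str, Any] = field(default_factory=dict)


def invoke_tool(tool_name: str, output: str, return_direct: bool) -> tuple[str, ToolInvokeMeta]:
    """Tool-engine side: record the tool's return_direct wish in meta.extra."""
    meta = ToolInvokeMeta(tool_name=tool_name)
    if return_direct:
        meta.extra["return_direct"] = True
    return output, meta


def agent_step(tool_name: str, output: str, return_direct: bool) -> str:
    """Agent-runner side: skip the LLM when the flag is set."""
    response, meta = invoke_tool(tool_name, output, return_direct)
    if meta.extra.get("return_direct"):
        return response  # the tool output is the final answer
    return f"LLM summary of: {response}"  # normal path: feed back to the LLM
```

With return_direct=True the tool output is returned as-is; otherwise the response would continue into the usual LLM reasoning loop.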

@gemini-code-assist left a comment
Code Review

This pull request introduces a 'return direct' mechanism, allowing an agent's tool to bypass subsequent LLM calls and return its result directly. The implementation correctly adds the necessary logic to the agent runners and tool engine. However, I've identified a few areas for improvement. There are several redundant try-except blocks that can be removed to make the code cleaner and more Pythonic. More importantly, there's a bug in the function-calling agent runner where agent thoughts for direct returns are logged with empty tool names and inputs, which should be fixed to ensure proper tracing and debugging.

Comment thread api/core/agent/fc_agent_runner.py Outdated
Comment thread api/core/agent/cot_agent_runner.py Outdated
Comment thread api/core/agent/fc_agent_runner.py Outdated
Comment thread api/core/tools/tool_engine.py Outdated
@gemini-code-assist left a comment

Code Review

This pull request introduces a return_direct mechanism for tools within the agent module, enabling them to return results directly and skip subsequent LLM interactions. The changes are well-distributed across the agent runners, tool entities, and the tool engine. My review focuses on enhancing code quality and correctness. I've identified several instances of redundant try-except blocks that could be simplified for better readability and to prevent masking potential errors. Additionally, I've found a potential logic issue in tool_engine.py where the return_direct flag might be incorrectly overridden, and I have proposed a solution. Overall, the implementation aligns with the pull request's objective, and addressing the suggested improvements will contribute to a more robust and maintainable codebase.
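The override issue flagged here has a common shape: when the engine scans several tool messages, a plain assignment lets a later message without the flag clobber an earlier True. A hypothetical sketch of the pitfall and the fix (the message shape is assumed for illustration; it is not Dify's actual message type):

```python
def detect_return_direct(messages: list[dict]) -> bool:
    """Scan tool messages for a return_direct signal without letting a
    later message reset a flag an earlier message already set."""
    return_direct = False
    for msg in messages:
        # Buggy variant: return_direct = bool(msg.get("return_direct"))
        # would be overwritten back to False by any later plain message.
        return_direct = return_direct or bool(msg.get("return_direct"))
    return return_direct
```

Accumulating with `or` makes the flag sticky for the rest of the scan, which matches the intent that any tool message requesting a direct return should win.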

Comment thread api/core/tools/tool_engine.py Outdated
Comment thread api/core/agent/cot_agent_runner.py Outdated
Comment thread api/core/agent/fc_agent_runner.py Outdated
Cursx and others added 3 commits November 21, 2025 16:10
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Cursx commented Nov 21, 2025

/gemini review

@gemini-code-assist left a comment

Code Review

This pull request introduces a "return direct" mechanism for tools within agents, allowing a tool's output to be returned directly to the user, bypassing subsequent LLM calls. This is implemented by extending ToolInvokeMeta with an extra dictionary to carry the return_direct flag. The tool engine is updated to detect this flag from a special VARIABLE tool message. Both CoT and Function Calling agent runners are modified to handle this flag, terminating the agent loop and returning the tool's output when appropriate. The implementation is logical and effectively achieves the goal. I have provided a couple of suggestions to enhance exception handling and improve code structure for better maintainability.

Comment on lines +280 to +322
if direct_flag:
    # save agent thought for this tool call
    self.save_agent_thought(
        agent_thought_id=agent_thought_id,
        tool_name=tool_call_name,
        tool_input=tool_call_args,
        thought="",
        tool_invoke_meta={tool_call_name: tool_invoke_meta.to_dict()},
        observation={tool_call_name: tool_invoke_response},
        answer=str(tool_invoke_response or ""),
        messages_ids=message_file_ids,
    )
    self.queue_manager.publish(
        QueueAgentThoughtEvent(agent_thought_id=agent_thought_id), PublishFrom.APPLICATION_MANAGER
    )

    # publish end event immediately and return
    final_answer = str(tool_invoke_response or "")
    llm_final_usage = llm_usage.get("usage") or LLMUsage.empty_usage()
    self.queue_manager.publish(
        QueueMessageEndEvent(
            llm_result=LLMResult(
                model=model_instance.model,
                prompt_messages=prompt_messages,
                message=AssistantPromptMessage(content=final_answer),
                usage=llm_final_usage,
                system_fingerprint="",
            )
        ),
        PublishFrom.APPLICATION_MANAGER,
    )

    yield LLMResultChunk(
        model=model_instance.model,
        prompt_messages=prompt_messages,
        system_fingerprint="",
        delta=LLMResultChunkDelta(
            index=0,
            message=AssistantPromptMessage(content=final_answer),
            usage=llm_final_usage,
        ),
    )
    return

medium

This if direct_flag: block is quite large and contains complex logic for handling the direct return. This makes the for loop harder to read and understand. To improve readability and maintainability, I suggest extracting this block of code into a separate private helper method, for example, _handle_direct_return(...). This helper method could encapsulate the logic for saving the agent thought, publishing events, and yielding the final result chunk.

Comment thread api/core/tools/tool_engine.py Outdated

Cursx commented Nov 21, 2025

/gemini review

@gemini-code-assist left a comment

Code Review

This pull request introduces a 'return direct' mechanism for agents, allowing tools to return a final result directly without further LLM interaction. The implementation looks good overall. I've identified a significant code duplication and a related bug in fc_agent_runner.py where the final answer was not being yielded in the normal execution path. I've suggested a refactoring to address both issues. I also found a minor style issue in tool_engine.py.

Comment on lines +280 to +322
if direct_flag:
    # save agent thought for this tool call
    self.save_agent_thought(
        agent_thought_id=agent_thought_id,
        tool_name=tool_call_name,
        tool_input=tool_call_args,
        thought="",
        tool_invoke_meta={tool_call_name: tool_invoke_meta.to_dict()},
        observation={tool_call_name: tool_invoke_response},
        answer=str(tool_invoke_response or ""),
        messages_ids=message_file_ids,
    )
    self.queue_manager.publish(
        QueueAgentThoughtEvent(agent_thought_id=agent_thought_id), PublishFrom.APPLICATION_MANAGER
    )

    # publish end event immediately and return
    final_answer = str(tool_invoke_response or "")
    llm_final_usage = llm_usage.get("usage") or LLMUsage.empty_usage()
    self.queue_manager.publish(
        QueueMessageEndEvent(
            llm_result=LLMResult(
                model=model_instance.model,
                prompt_messages=prompt_messages,
                message=AssistantPromptMessage(content=final_answer),
                usage=llm_final_usage,
                system_fingerprint="",
            )
        ),
        PublishFrom.APPLICATION_MANAGER,
    )

    yield LLMResultChunk(
        model=model_instance.model,
        prompt_messages=prompt_messages,
        system_fingerprint="",
        delta=LLMResultChunkDelta(
            index=0,
            message=AssistantPromptMessage(content=final_answer),
            usage=llm_final_usage,
        ),
    )
    return

high

This block introduces a large amount of duplicated logic for finalizing the agent run. The logic for publishing the QueueMessageEndEvent and yielding the final LLMResultChunk is very similar to what should happen at the end of the run method.

Currently, the original code path at the end of the method (lines 352-363) seems to be missing a yield for the final answer, which is a bug. This new direct_flag path correctly yields the result.

To improve maintainability and fix the bug in the original path, I recommend extracting this finalization logic into a private helper method. This helper could then be called here, and also at the end of the run method to ensure consistent behavior and a single source of truth for ending the agent execution.

A refactoring would look something like this:

  1. Create a private helper method _yield_final_answer(...) that contains the logic from lines 296-321.
  2. Call it here: yield from self._yield_final_answer(...) and then return.
  3. Replace lines 352-363 with a call to the same helper method: yield from self._yield_final_answer(...).

This would fix the bug and remove the code duplication.
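The shape of the proposed refactor can be sketched with simplified stand-ins (the names _yield_final_answer and run come from the review; the bodies here are illustrative, not Dify's actual code):

```python
def _yield_final_answer(final_answer: str, usage: dict):
    # In the real runner this helper would also publish the
    # QueueMessageEndEvent; here it just yields the final chunk.
    yield {"message": final_answer, "usage": dict(usage)}


def run(steps):
    """Toy agent loop: `steps` is a list of (text, direct_flag) pairs."""
    usage = {"tokens": 0}
    for text, direct in steps:
        usage["tokens"] += 1
        if direct:
            # direct-return branch delegates to the shared helper
            yield from _yield_final_answer(text, usage)
            return
        yield {"delta": text}
    # the normal end-of-run path reuses the same helper, so the final
    # answer is always yielded exactly once -- a single source of truth
    yield from _yield_final_answer("final", usage)
```

Because both exit paths go through one generator helper, the missing-yield bug in the normal path disappears along with the duplication.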

Comment thread api/core/tools/tool_engine.py Outdated
Cursx and others added 2 commits November 21, 2025 16:37
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>

Cursx commented Nov 21, 2025

/gemini review

@gemini-code-assist left a comment

Code Review

This pull request introduces a "return direct" mechanism for tools in the agent module. This allows a tool to specify that its output should be the final answer, skipping any further LLM processing steps. The changes are implemented across both CoT and Function Calling agent runners. The implementation involves adding an extra dictionary to ToolInvokeMeta to carry the return_direct flag, modifying the ToolEngine to populate this flag from tool messages, and updating the agent runners to handle this flag and terminate execution early if set. The changes look good and the refactoring in fc_agent_runner.py to extract _yield_final_answer is a nice improvement for code clarity and reuse. I have one suggestion to improve the robustness of the return_direct flag detection.

Comment thread api/core/tools/tool_engine.py Outdated

Cursx commented Nov 21, 2025

/gemini review

1 similar comment

@gemini-code-assist left a comment

Code Review

This pull request introduces a 'return direct' mechanism for agents, allowing tool results to be returned immediately without further LLM interaction. The implementation is well-structured across the CoT and function-calling agent runners, with necessary modifications to tool entities and the tool engine. My feedback focuses on improving code clarity and robustness in a few areas to enhance maintainability.

Comment thread api/core/agent/cot_agent_runner.py Outdated
Comment thread api/core/agent/fc_agent_runner.py Outdated
Comment thread api/core/tools/tool_engine.py Outdated
@gemini-code-assist left a comment

Code Review

This pull request introduces a 'return direct' mechanism for tools, allowing them to bypass subsequent LLM calls and return their results directly. This is a valuable feature that aligns with capabilities in other agent frameworks. The implementation is well-executed across the cot_agent_runner and fc_agent_runner, using a new extra field in ToolInvokeMeta to signal the direct return. The logic in both runners correctly handles this signal to either terminate the execution loop or continue iteration. I also appreciate the refactoring in fc_agent_runner.py which introduces the _yield_final_answer method, improving code clarity and reducing duplication. Overall, the changes are solid. I have one minor suggestion for improving type hinting.


def _yield_final_answer(
    self,
    prompt_messages: list,

medium

For better type safety and code clarity, it's recommended to use a more specific type hint for prompt_messages. Instead of list, please use list[PromptMessage].

Suggested change
-    prompt_messages: list,
+    prompt_messages: list[PromptMessage],

Cursx and others added 3 commits November 21, 2025 17:23
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@gemini-code-assist left a comment

Code Review

This pull request introduces a 'return direct' mechanism for tools within the agent module, allowing a tool's output to be returned as the final answer, bypassing subsequent LLM calls. The implementation correctly modifies the tool invocation metadata and updates both the Chain-of-Thought and Function-Calling agent runners to handle this new logic. A notable improvement is the refactoring of the final answer yielding process in the FcAgentRunner into a reusable method, which cleans up the code.

My review has identified a high-severity issue where the agent's thought process is lost when a direct return occurs in the FcAgentRunner. I've also included a medium-severity suggestion to improve type hinting for better code maintainability. Overall, the changes are well-structured to support the new feature.

Comment thread api/core/agent/fc_agent_runner.py
Comment thread api/core/agent/fc_agent_runner.py
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>