
feat: support advanced non-incremental output #63

Merged — kevinlin09 merged 12 commits into dashscope:develop from kevinlin09:feat/advanced_incremental_output on Nov 5, 2025

Conversation

@kevinlin09
Collaborator

No description provided.

@gemini-code-assist

Summary of Changes

Hello @kevinlin09, I'm Gemini Code Assist [1]! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the API client's streaming capabilities by providing a unified interface for receiving non-incremental output, even when the underlying API delivers responses in incremental chunks. This is achieved through an intelligent internal merging mechanism that reconstructs complete responses from partial data, thereby simplifying client-side integration. The changes also include specific handling for models that do not support this merging behavior, ensuring robust and predictable functionality across diverse model types.

Highlights

  • Non-Incremental Streaming Output: Introduced a feature allowing users to receive consolidated, non-incremental output even when using streaming mode (stream=True) by setting incremental_output=False.
  • Internal Response Merging Logic: Implemented internal mechanisms to accumulate and merge incremental response chunks (for content and tool calls) into a single, complete response before yielding it to the user, simplifying client-side consumption.
  • Model-Specific Output Handling: Added a utility (ParamUtil) to identify specific models (e.g., 'tts', 'omni', 'qwen-deep-research') that do not support non-incremental output merging, ensuring correct and predictable behavior for these exceptions.
  • Asynchronous API Enhancements: Updated asynchronous API calls (AioGeneration, AioMultiModalConversation) to correctly handle AsyncGenerator return types and seamlessly integrate the new merging logic.
  • Comprehensive Test Coverage: Added and updated numerous test cases across both synchronous and asynchronous generation and multimodal conversation APIs to validate the new non-incremental streaming output feature for various response types, including text, tool calls, reasoning, and search information.
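The merging behavior described in these highlights can be sketched in isolation. The chunk dicts below are simplified stand-ins for the SDK's streaming response objects, not its actual types:

```python
def merge_chunks(chunks):
    """Accumulate incremental streaming chunks into one complete response.

    Each chunk is a simplified dict such as {"content": "...",
    "finish_reason": ...} -- an illustrative stand-in for the SDK's
    response objects.
    """
    merged = {"content": "", "finish_reason": None}
    for chunk in chunks:
        merged["content"] += chunk.get("content", "")
        if chunk.get("finish_reason"):
            merged["finish_reason"] = chunk["finish_reason"]
    return merged


chunks = [
    {"content": "Hello, "},
    {"content": "world"},
    {"content": "!", "finish_reason": "stop"},
]
print(merge_chunks(chunks))  # {'content': 'Hello, world!', 'finish_reason': 'stop'}
```

With `stream=True` and `incremental_output=False`, the SDK performs this kind of accumulation internally and yields only complete responses to the caller.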
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page; here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

| Feature | Command | Description |
| --- | --- | --- |
| Code Review | /gemini review | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | /gemini summary | Provides a summary of the current pull request in its current state. |
| Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in pull request comments and review comments. |
| Help | /gemini help | Displays a list of available commands. |

Customization

To customize the Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, and code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces support for non-incremental streaming output by internally merging incremental chunks. The implementation is a good step forward, but there are a few areas for improvement regarding code duplication, a potential bug in the merging logic, and a minor inefficiency. I've left specific comments on these points. Additionally, the new sample files are a great way to demonstrate the new functionality.

Comment on lines +440 to +444
existing_call.update({k: v for k, v in current_call.items()
                      if k != 'function' and v})
if 'function' in current_call and current_call['function']:
    existing_call['function'].update({k: v for k, v in current_call['function'].items()
                                      if k not in ['arguments', 'name'] and v})


high

The use of and v in the dictionary comprehensions for updating existing_call will cause issues when a field's value is intentionally falsy (e.g., False, 0, '', []). These updates will be skipped, potentially leaving stale data in the accumulated response. This same issue exists in dashscope/aigc/multimodal_conversation.py.

Please remove the and v check to ensure all updates are applied correctly.
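A small standalone repro of the problem (the dicts are illustrative, not the SDK's actual payloads):

```python
# Illustrative dicts, not the SDK's actual payloads.
existing_call_buggy = {"id": "call_1"}
existing_call_fixed = {"id": "call_1"}
current_call = {"id": "", "extra": 0}

# With the `and v` guard, intentionally falsy values ('' and 0) are
# silently dropped, so stale data survives:
existing_call_buggy.update({k: v for k, v in current_call.items()
                            if k != "function" and v})

# Without the guard, every update is applied:
existing_call_fixed.update({k: v for k, v in current_call.items()
                            if k != "function"})

print(existing_call_buggy)  # {'id': 'call_1'} -- stale id kept, 'extra' lost
print(existing_call_fixed)  # {'id': '', 'extra': 0}
```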

Comment on lines +373 to +419
if 'tool_calls' in choice.message and choice.message.tool_calls:
    current_tool_calls = choice.message.tool_calls

    # For each current tool call, accumulate its arguments
    for current_call in current_tool_calls:
        if isinstance(current_call, dict) and 'index' in current_call:
            idx = current_call['index']

            # Find existing accumulated call with same index
            existing_call = None
            for acc_call in accumulated_data[choice_idx]['tool_calls']:
                if (isinstance(acc_call, dict) and
                        acc_call.get('index') == idx):
                    existing_call = acc_call
                    break

            if existing_call:
                # Accumulate function fields from current call
                if ('function' in current_call and
                        current_call['function']):
                    if 'function' not in existing_call:
                        existing_call['function'] = {}

                    # Accumulate function.name
                    if 'name' in current_call['function']:
                        if 'name' not in existing_call['function']:
                            existing_call['function']['name'] = ''
                        existing_call['function']['name'] += current_call['function']['name']

                    # Accumulate function.arguments
                    if 'arguments' in current_call['function']:
                        if 'arguments' not in existing_call['function']:
                            existing_call['function']['arguments'] = ''
                        existing_call['function']['arguments'] += current_call['function']['arguments']

                # Update other fields with latest values
                existing_call.update({k: v for k, v in current_call.items()
                                      if k != 'function' and v})
                if 'function' in current_call and current_call['function']:
                    existing_call['function'].update({k: v for k, v in current_call['function'].items()
                                                      if k not in ['arguments', 'name'] and v})
            else:
                # Add new tool call
                accumulated_data[choice_idx]['tool_calls'].append(dict(current_call))

    # Update choice with accumulated tool_calls
    choice.message.tool_calls = accumulated_data[choice_idx]['tool_calls']


high

The logic for accumulating tool_calls in this function is identical to the implementation in _merge_single_response in dashscope/aigc/generation.py. Duplicating complex logic like this increases the maintenance burden and the risk of introducing inconsistencies. This logic should be extracted into a shared utility function that both modules can use. The other issues I've pointed out in the generation.py implementation (inefficient lookup and a potential bug with and v) also apply here.
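One way to factor out the shared logic, as a rough sketch (the helper name and its placement, e.g. in a common utils module, are hypothetical, not the PR's actual code):

```python
def merge_tool_call(existing, current):
    """Merge one incremental tool-call chunk into the accumulated call.

    String fields under 'function' ('name', 'arguments') are concatenated;
    everything else is overwritten with the latest value.
    """
    func = current.get("function") or {}
    if func:
        acc = existing.setdefault("function", {})
        for field in ("name", "arguments"):
            if field in func:
                acc[field] = acc.get(field, "") + func[field]
        # Non-accumulated function fields take the latest value
        acc.update({k: v for k, v in func.items()
                    if k not in ("name", "arguments")})
    # Top-level fields other than 'function' take the latest value
    existing.update({k: v for k, v in current.items() if k != "function"})
    return existing


existing = {"index": 0, "function": {"name": "get_", "arguments": '{"ci'}}
merge_tool_call(existing, {"index": 0,
                           "function": {"name": "time", "arguments": 'ty": "SF"}'}})
print(existing["function"])  # {'name': 'get_time', 'arguments': '{"city": "SF"}'}
```

Both generation.py and multimodal_conversation.py could then import this single helper instead of duplicating the accumulation loop.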

Comment on lines +340 to +347
is_stream = parameters.get('stream', False)
# Check if we need to merge incremental output
is_incremental_output = kwargs.get('incremental_output', None)
to_merge_incremental_output = False
if (ParamUtil.should_modify_incremental_output(model) and
        is_stream and is_incremental_output is False):
    to_merge_incremental_output = True
    parameters['incremental_output'] = True


medium

This block of code to determine if response merging is needed is identical to the logic in the synchronous Generation.call method (lines 142-149). To improve maintainability and reduce code duplication, consider extracting this logic into a shared private helper function that both call and acall can use.

Comment on lines +414 to +418
for acc_call in accumulated_data[choice_idx]['tool_calls']:
    if (isinstance(acc_call, dict) and
            acc_call.get('index') == idx):
        existing_call = acc_call
        break


medium

The loop to find an existing_call by its index performs a linear scan, which has O(n) complexity where n is the number of tool calls. For a large number of tool calls, this could become inefficient. Consider restructuring accumulated_data[choice_idx]['tool_calls'] to be a dictionary mapping the tool call index to the tool call object. This would allow for an O(1) lookup.
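A sketch of that restructuring (the dict shapes are illustrative):

```python
# Key accumulated tool calls by their 'index' so lookup is O(1)
# instead of a linear scan over a list.
accumulated_calls = {}  # tool-call index -> accumulated tool-call dict


def accumulate(current_call):
    idx = current_call["index"]
    existing = accumulated_calls.get(idx)  # O(1) lookup
    if existing is None:
        # Shallow copy is enough for this sketch; a deep copy would
        # avoid mutating the caller's chunk.
        accumulated_calls[idx] = dict(current_call)
        return
    func = current_call.get("function") or {}
    acc = existing.setdefault("function", {})
    for field in ("name", "arguments"):
        if field in func:
            acc[field] = acc.get(field, "") + func[field]


accumulate({"index": 0, "function": {"name": "f", "arguments": "{"}})
accumulate({"index": 0, "function": {"arguments": "}"}})

# Restore list form, ordered by index, when yielding to the caller:
tool_calls = [accumulated_calls[i] for i in sorted(accumulated_calls)]
```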

Comment on lines +117 to +118
if (ParamUtil.should_modify_incremental_output(model) and
        is_stream and is_incremental_output is not None and is_incremental_output is False):


medium

The condition is_incremental_output is not None and is_incremental_output is False is unnecessarily verbose. The is not None check is redundant because if is_incremental_output is False, it cannot be None. This can be simplified. This same redundant check is also present in AioMultiModalConversation.call on lines 277-278.

Suggested change

- if (ParamUtil.should_modify_incremental_output(model) and
-         is_stream and is_incremental_output is not None and is_incremental_output is False):
+ if (ParamUtil.should_modify_incremental_output(model) and
+         is_stream and is_incremental_output is False):

@kevinlin09 kevinlin09 merged commit aafb74d into dashscope:develop Nov 5, 2025