Support wan2.5 #65

Merged
kevinlin09 merged 3 commits into dashscope:main from mose-x:support_wan2.5
Oct 20, 2025

@mose-zm mose-zm commented Oct 20, 2025

Add input.audio_url to support wan2.5 video generation

@gemini-code-assist

Summary of Changes

Hello @mose-zm, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed.

This pull request significantly enhances the video synthesis capabilities by integrating audio input. It introduces a dedicated audio_url parameter within the VideoSynthesis functionality, allowing users to provide an audio source alongside other inputs for generating videos, particularly for the wan2.5 model. This addition broadens the scope of video generation to include synchronized audio, making the output more dynamic and complete.

Highlights

  • Audio URL Support: Introduced a new audio_url parameter across the VideoSynthesis class methods (call, _get_input, async_call) to enable audio input for video generation, specifically supporting the wan2.5 model.
  • Audio Input Processing: The _get_input method now includes logic to process the provided audio_url, handling potential local file uploads before passing the URL to the video synthesis model.
  • New Test Case: A new sample file, samples/test_video_synthesis.py, has been added to demonstrate the usage of the audio_url parameter with the wan2.5-t2v-preview model, including a Chinese prompt and an example audio URL.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds support for an audio_url parameter to the video synthesis API, allowing users to provide an audio track for video generation. The changes are applied consistently across both synchronous and asynchronous methods, and a new sample script is added to demonstrate the feature. My review focuses on improving documentation, code maintainability, and adherence to Python best practices in the new sample code.

negative_prompt (str): The negative prompt is the opposite of the prompt meaning.
template (str): LoRa input, such as gufeng, katong, etc.
img_url (str): The input image url, Generate the URL of the image referenced by the video.
audio_url (str): The input audio url

medium

The docstring for audio_url is a bit brief. For better clarity and consistency with other parameters like img_url, consider providing a more descriptive explanation of its purpose. Also, this docstring is missing a period at the end. This feedback also applies to the audio_url docstrings on lines 209, 378, and 432.

Suggested change
- audio_url (str): The input audio url
+ audio_url (str): The input audio URL, used for video generation.

Comment on lines +132 to +137
        if audio_url is not None and audio_url:
            is_upload, res_audio_url = check_and_upload_local(
                model, audio_url, api_key)
            if is_upload:
                has_upload = True
            inputs['audio_url'] = res_audio_url

medium

This logic for handling audio_url is very similar to the logic for other URL-based parameters like img_url, head_frame, etc., leading to code duplication. This can make the code harder to maintain and prone to errors if changes are not applied consistently across all blocks.

To improve maintainability, consider refactoring this repeated logic into a private helper method. For example:

@classmethod
def _maybe_upload_and_set_input(cls, inputs, key, url, model, api_key):
    if not url:
        return False

    is_upload, res_url = check_and_upload_local(model, url, api_key)
    inputs[key] = res_url
    return is_upload

Then you could use it like this:

if cls._maybe_upload_and_set_input(inputs, 'audio_url', audio_url, model, api_key):
    has_upload = True

This would make the _get_input method much cleaner and easier to read. Also, the check if audio_url is not None and audio_url: can be simplified to if audio_url:.
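The refactor suggested above can be sketched in a self-contained form. This is only an illustration under stated assumptions: `check_and_upload_local` here is a stand-in stub (the real SDK function performs an actual upload), and the `file://` convention, the `oss://uploaded/` prefix, and the `build_inputs` wrapper are hypothetical names that do not appear in the pull request.

```python
from typing import Dict, Optional, Tuple


def check_and_upload_local(model: str, url: str,
                           api_key: Optional[str]) -> Tuple[bool, str]:
    """Stub standing in for the SDK helper (assumption, not the real code).

    Pretends that any 'file://' path is uploaded and rewritten to a
    remote URL; anything else is returned unchanged.
    """
    if url.startswith('file://'):
        return True, 'oss://uploaded/' + url.rsplit('/', 1)[-1]
    return False, url


def maybe_upload_and_set_input(inputs: Dict[str, str], key: str,
                               url: Optional[str], model: str,
                               api_key: Optional[str]) -> bool:
    """Store url under key, uploading it first when it is a local file.

    Returns True only when an upload actually happened.
    """
    if not url:  # covers both None and the empty string
        return False
    is_upload, res_url = check_and_upload_local(model, url, api_key)
    inputs[key] = res_url
    return is_upload


def build_inputs(model: str, api_key: Optional[str],
                 **urls: Optional[str]) -> Tuple[Dict[str, str], bool]:
    """Hypothetical wrapper: apply the helper to every URL-style parameter."""
    inputs: Dict[str, str] = {}
    has_upload = False
    for key, url in urls.items():
        if maybe_upload_and_set_input(inputs, key, url, model, api_key):
            has_upload = True
    return inputs, has_upload
```

With a helper of this shape, adding another URL-style parameter would need only one more keyword argument instead of another copy of the upload block.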

        audio_url=audio_url)
    if rsp.status_code == HTTPStatus.OK:
        print('response: %s' % rsp)

medium

For better readability and consistency with modern Python practices, consider using an f-string for string formatting.

Suggested change
- print('response: %s' % rsp)
+ print(f'response: {rsp}')

Comment on lines +20 to +21
        print('sync_call Failed, status_code: %s, code: %s, message: %s' %
              (rsp.status_code, rsp.code, rsp.message))

medium

For better readability and consistency with modern Python practices, consider using an f-string for string formatting. It also allows you to write this on a single line.

        print(f'sync_call Failed, status_code: {rsp.status_code}, code: {rsp.code}, message: {rsp.message}')
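As a quick check that the two formatting styles are interchangeable here, the sketch below builds the same message both ways. The `rsp` object is a hypothetical stand-in built with `SimpleNamespace`, with made-up field values; only the attribute names come from the sample. The f-string is split across adjacent literals, which is one way to keep the line within a typical length limit rather than writing it on a single long line.

```python
from types import SimpleNamespace

# Hypothetical response object; only the attribute names come from the sample.
rsp = SimpleNamespace(status_code=400, code='InvalidParameter',
                      message='audio_url is not reachable')

# Original %-style formatting.
old_style = ('sync_call Failed, status_code: %s, code: %s, message: %s' %
             (rsp.status_code, rsp.code, rsp.message))

# f-string version, split over adjacent literals to keep lines short.
new_style = (f'sync_call Failed, status_code: {rsp.status_code}, '
             f'code: {rsp.code}, message: {rsp.message}')

print(old_style == new_style)
```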



if __name__ == '__main__':
    simple_call()
(No newline at end of file)

medium

It's a standard convention to end files with a single newline character. Please add one here.

    simple_call()
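One way to check for and repair a missing final newline is sketched below. The function names are hypothetical and not part of the pull request; in practice an editor setting or a linter would usually handle this.

```python
from pathlib import Path


def ends_with_newline(path: Path) -> bool:
    """Return True if the file is empty or its last byte is a newline."""
    data = path.read_bytes()
    return not data or data.endswith(b'\n')


def ensure_trailing_newline(path: Path) -> bool:
    """Append a newline when one is missing; return True if the file changed."""
    if ends_with_newline(path):
        return False
    with path.open('ab') as handle:
        handle.write(b'\n')
    return True
```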

@kevinlin09 kevinlin09 merged commit c4542fa into dashscope:main Oct 20, 2025
@mose-zm mose-zm deleted the support_wan2.5 branch October 21, 2025 01:51