@zhang-prog
Contributor

No description provided.

root and others added 2 commits September 1, 2025 23:32
paddle-bot bot commented Sep 1, 2025

Thanks for your contribution!

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for a new multimodal model called "QF VL" (QFVLForConditionalGeneration) to the FastDeploy framework. The implementation includes comprehensive multimodal processing capabilities for handling both images and videos alongside text inputs.

Key changes include:

  • Addition of the QF VL model architecture with SigLIP vision transformer backbone
  • Implementation of multimodal input processing pipeline for text, images, and videos
  • Integration of the new model type into the existing model registry and preprocessing workflow

Reviewed Changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| `fastdeploy/worker/gpu_model_runner.py` | Adds QF model-specific RoPE embedding handling and vision feature extraction |
| `fastdeploy/multimodal/utils.py` | Removes unused import and adds formatting cleanup |
| `fastdeploy/multimodal/registry.py` | Registers the new QFVLForConditionalGeneration model type |
| `fastdeploy/model_executor/models/qf_vl/siglip.py` | Implements SigLIP vision transformer with rotary embeddings and attention mechanisms |
| `fastdeploy/model_executor/models/qf_vl/qf_vl.py` | Defines the main QF VL model architecture and weight management |
| `fastdeploy/model_executor/models/qf_vl/projector.py` | Implements vision-to-text feature projection layer |
| `fastdeploy/model_executor/models/qf_vl/config.py` | Defines configuration classes for the QF VL model |
| `fastdeploy/model_executor/models/qf_vl/__init__.py` | Module initialization file |
| `fastdeploy/input/qf_vl_processor/qf_vl_processor.py` | Main processor for handling QF VL multimodal inputs |
| `fastdeploy/input/qf_vl_processor/process.py` | Core data processing logic for tokenization and multimodal handling |
| `fastdeploy/input/qf_vl_processor/image_processor.py` | Image and video preprocessing implementation |
| `fastdeploy/input/qf_vl_processor/__init__.py` | Module initialization for QF VL processor |
| `fastdeploy/input/preprocess.py` | Integrates QF VL processor into the preprocessing pipeline |

Comments suppressed due to low confidence (1)

fastdeploy/input/qf_vl_processor/process.py:1

  • np.concatenate accepts a dtype keyword argument only in NumPy 1.20 and later. If older NumPy versions must be supported, use np.concatenate([...]).astype(np.int64) instead.
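
As a point of reference for the dtype question, here is a minimal sketch comparing the two spellings. The arrays `a` and `b` are made-up stand-ins, not values from the PR, and the keyword form assumes NumPy 1.20 or later.

```python
import numpy as np

# Hypothetical inputs standing in for the token-id arrays in process.py.
a = np.array([1, 2], dtype=np.int32)
b = np.array([3], dtype=np.int32)

# Keyword form: the dtype argument was added to np.concatenate in NumPy 1.20.
via_keyword = np.concatenate([a, b], dtype=np.int64)

# Portable form: concatenate first, then cast. Works on older NumPy as well.
via_astype = np.concatenate([a, b]).astype(np.int64)
```

Both results hold the same values as int64, so the choice mainly depends on the minimum NumPy version the project supports.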
"""

Comment on lines 500 to 501
)
else:

Copilot AI Sep 2, 2025

The conditional logic for QF model handling should include the original code inside the else block. The current structure suggests the else block is empty, which may break existing functionality for non-QF models.

Copilot uses AI. Check for mistakes.
patch_embeds = self.patch_embedding(pixel_values.to(dtype=target_dtype)) # shape = [*, width, grid, grid]
embeddings = patch_embeds.flatten(-2).squeeze(-1)
embeddings = rearrange(embeddings, "(b l) d -> b l d", b=batch_size, l=squence_len)
# todo: not dubug

Copilot AI Sep 2, 2025

The comment contains a typo. 'dubug' should be 'debug'.

Suggested change
- # todo: not dubug
+ # todo: not debug

**({extra.value: True} if extra else {}),
}

if "lm_head.weight" or "" in weight_name:

Copilot AI Sep 2, 2025

This condition will always evaluate to True because 'lm_head.weight' is a non-empty string. The intended logic appears to be checking if weight_name contains 'lm_head.weight' or is empty.

Suggested change
- if "lm_head.weight" or "" in weight_name:
+ if "lm_head.weight" in weight_name or weight_name == "":
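
To make the operator-precedence issue concrete, here is a small sketch comparing the original condition with the suggested one. The helper names and the sample weight names are hypothetical and chosen only for illustration; they are not taken from the PR.

```python
def matches_buggy(weight_name: str) -> bool:
    # Python parses this as: "lm_head.weight" or ("" in weight_name).
    # The left operand is a non-empty string, so the result is always truthy.
    return bool("lm_head.weight" or "" in weight_name)


def matches_fixed(weight_name: str) -> bool:
    # Suggested form: test membership explicitly, then the empty-name case.
    return "lm_head.weight" in weight_name or weight_name == ""


# Hypothetical weight names for illustration.
names = ["lm_head.weight", "model.layers.0.mlp.up_proj.weight", ""]
buggy = [matches_buggy(n) for n in names]
fixed = [matches_fixed(n) for n in names]
```

With these inputs, `buggy` is `[True, True, True]` while `fixed` is `[True, False, True]`. Note that `"" in weight_name` is also true for every string, so the original right-hand operand could never act as a filter either.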

tool_parser_obj=None,
):
"""
Initialize QwenVLProcessor instance.

Copilot AI Sep 2, 2025

The docstring incorrectly refers to 'QwenVLProcessor' instead of 'QFVLProcessor'.

Suggested change
- Initialize QwenVLProcessor instance.
+ Initialize QFVLProcessor instance.

@@ -0,0 +1,107 @@
"""
# Copyright (c) 2024 PaddlePaddle Authors. All Rights Reserved.
Collaborator

The copyright year should be 2025.

@zhang-prog zhang-prog closed this Jan 21, 2026
@zhang-prog
Contributor Author

In #4396
