@kevincheng2 kevincheng2 commented Nov 10, 2025

Motivation

Support asynchronous download of multimodal features.

Modifications

Usage or Command

Optimize communication speed (not strictly required, but recommended):
export FD_ENABLE_E2W_TENSOR_CONVERT=1

Add the following argument at startup:

  python -m fastdeploy.entrypoints.openai.api_server \
       ...
       --enable-async-download-features

Accuracy Tests

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [[FDConfig],[APIServer],[Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If there are none, state the reason in this PR.
  • Provide accuracy results.
  • If the current PR is submitting to the release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

paddle-bot bot commented Nov 10, 2025

Thanks for your contribution!

rainyfly previously approved these changes Nov 12, 2025
lizhenyun01 previously approved these changes Nov 12, 2025
Copilot AI left a comment

Pull Request Overview

This PR adds asynchronous download capabilities for multimodal features (image, video, audio) from Baidu Object Storage (BOS). The implementation uses a thread pool executor to download features asynchronously during request scheduling, preventing blocking operations on the main scheduler thread. The feature is controlled by the --enable-async-download-features CLI flag.
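
The non-blocking pattern described above can be illustrated with a minimal sketch. It is not the code from this PR: `fake_download`, `schedule`, the `bos://bucket/...` paths, and the request ids are hypothetical stand-ins, and only the standard-library `concurrent.futures` API is assumed. The point is that the scheduling loop only polls `Future.done()` and never waits on unfinished downloads.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor


def fake_download(url: str) -> bytes:
    """Hypothetical stand-in for a blocking download from object storage."""
    time.sleep(0.05)  # simulate network latency
    return f"features-for-{url}".encode()


def schedule(pending: dict[str, Future]) -> tuple[list[str], list[str]]:
    """Split request ids into ready and still-downloading in a single pass."""
    ready, waiting = [], []
    for rid, fut in pending.items():
        # done() returns immediately, so the scheduler thread is never blocked.
        (ready if fut.done() else waiting).append(rid)
    return ready, waiting


executor = ThreadPoolExecutor(max_workers=4)
pending = {
    rid: executor.submit(fake_download, f"bos://bucket/{rid}")
    for rid in ("req-0", "req-1")
}

# Requests whose downloads are unfinished are skipped and retried next round.
while True:
    ready, waiting = schedule(pending)
    if not waiting:
        break
    time.sleep(0.01)

# All futures are complete here, so result() returns without waiting.
results = {rid: pending[rid].result() for rid in ready}
executor.shutdown()
```

In a real scheduler the loop would keep serving other requests between polls instead of sleeping.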

Key Changes:

  • Added download_from_bos() utility function for downloading pickled feature data from BOS
  • Implemented async preprocessing in the resource manager with thread pool executor for non-blocking downloads
  • Enhanced tensor conversion to handle lists of features and bidirectional numpy/tensor conversion
  • Refactored error handling to support async download failures with proper error codes
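
As a rough illustration of the first item, the sketch below fetches pickled bytes and deserializes them. The real signature of `download_from_bos()` is not shown in this conversation, so `load_features`, the `fetch` callable, and the in-memory `store` are assumptions made purely for the example.

```python
import pickle
from typing import Any, Callable


def load_features(fetch: Callable[[str], bytes], path: str) -> Any:
    """Fetch raw bytes via `fetch` and unpickle them.

    `fetch` is a hypothetical callable standing in for a real storage client.
    """
    raw = fetch(path)
    # pickle.loads can execute arbitrary code; this assumes trusted contents.
    return pickle.loads(raw)


# In-memory stand-in for object storage, for demonstration only.
store = {
    "bos://bucket/img_0": pickle.dumps({"shape": [2, 3], "data": [0.1] * 6}),
}

features = load_features(store.__getitem__, "bos://bucket/img_0")
```

Passing the fetcher as a parameter keeps the deserialization step testable without network access.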

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 16 comments.

| File | Description |
| --- | --- |
| fastdeploy/utils.py | Added download_from_bos() function to download and deserialize features from BOS using pickle |
| fastdeploy/inter_communicator/engine_worker_queue.py | Added to_numpy() method for engine-to-worker tensor conversion and enhanced to_tensor() to handle feature lists |
| fastdeploy/engine/sched/resource_manager_v1.py | Implemented async preprocessing pipeline with thread pool executor, download orchestration, and skip/error request handling in scheduler |
| fastdeploy/engine/request.py | Added async_process_futures, error_message, and error_code fields to support async preprocessing |
| fastdeploy/engine/common_engine.py | Refactored error response handling into _send_error_response() method and integrated async error handling in scheduler loop |
| fastdeploy/engine/args_utils.py | Added --enable-async-download-features CLI argument |
| fastdeploy/config.py | Added enable_async_download_features configuration flag to ParallelConfig |
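
To make the role of the new request fields concrete, here is a small sketch of how a failed download might be recorded on a request. Only the field names `async_process_futures`, `error_message`, and `error_code` come from the summary above. The `Request` dataclass, `collect`, `failing_download`, and the placeholder code `500` are hypothetical and do not reflect the actual FastDeploy classes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Request:
    """Simplified stand-in; only the three fields named in the PR are modeled."""
    request_id: str
    async_process_futures: list = field(default_factory=list)
    error_message: Optional[str] = None
    error_code: Optional[int] = None


def failing_download(path: str) -> bytes:
    """Hypothetical download that always fails, to exercise the error path."""
    raise ConnectionError(f"cannot reach {path}")


def collect(request: Request, error_code: int = 500) -> bool:
    """Record the first failure on the request; return True if all succeeded."""
    for fut in request.async_process_futures:
        exc = fut.exception()  # waits for completion, returns None on success
        if exc is not None:
            request.error_message = str(exc)
            request.error_code = error_code  # placeholder value, not from the PR
            return False
    return True


with ThreadPoolExecutor(max_workers=2) as pool:
    req = Request("req-0")
    req.async_process_futures.append(
        pool.submit(failing_download, "bos://bucket/x")
    )
    ok = collect(req)
```

A caller could then check `ok` and forward `req.error_code` and `req.error_message` to whatever sends the error response.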

@kevincheng2 kevincheng2 merged commit 3ce2c8f into PaddlePaddle:feature/experimental_feature_20250908 Nov 18, 2025
14 checks passed
Deleter-D pushed a commit to Deleter-D/FastDeploy that referenced this pull request Nov 26, 2025
* add async download

* update code

* fix bug

* update code

* update code

* fix bugs

* update code

---------

Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>
@kevincheng2 kevincheng2 deleted the preprocess_download branch January 19, 2026 03:46