
@kevincheng2
Collaborator

@kevincheng2 kevincheng2 commented Nov 3, 2025

Motivation

Support EPLB (Expert Parallelism Load Balancer).

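To balance load across the experts in the MoE layers, shared experts and heavily loaded fine-grained experts are replicated on multiple GPUs in the cluster, so that more GPUs can process the hot data (the tokens routed to shared experts).

EPLB balances load across GPUs by replicating heavily loaded experts (the Redundant Experts Strategy) and heuristically adjusting how experts are assigned. This addresses the waste of compute resources caused by uneven expert load under expert parallelism. A hierarchical load-balancing strategy can also be used in the prefill stage, where the expert-parallel size is smaller.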
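As a rough illustration of the redundant-experts idea described above, here is a minimal sketch. It is not the implementation in this PR: the function names `replicate_experts` and `place_replicas`, the load numbers, and the greedy heuristic are all assumptions chosen for clarity.

```python
import heapq


def replicate_experts(loads, num_redundant):
    """Hypothetical helper: start with one replica per expert, then give each
    redundant slot to the expert with the highest per-replica load."""
    replicas = [1] * len(loads)
    # heapq is a min-heap, so store negated per-replica load.
    heap = [(-load, e) for e, load in enumerate(loads)]
    heapq.heapify(heap)
    for _ in range(num_redundant):
        _, e = heapq.heappop(heap)
        replicas[e] += 1
        heapq.heappush(heap, (-loads[e] / replicas[e], e))
    return replicas


def place_replicas(loads, replicas, num_gpus):
    """Hypothetical helper: greedy placement, heaviest replica first, onto the
    least-loaded GPU that still has a free slot."""
    items = [
        (loads[e] / replicas[e], e)
        for e in range(len(loads))
        for _ in range(replicas[e])
    ]
    items.sort(key=lambda item: -item[0])
    slots_per_gpu = -(-len(items) // num_gpus)  # ceiling division
    gpu_load = [0.0] * num_gpus
    gpu_experts = [[] for _ in range(num_gpus)]
    for load, e in items:
        free = [g for g in range(num_gpus) if len(gpu_experts[g]) < slots_per_gpu]
        g = min(free, key=lambda idx: gpu_load[idx])
        gpu_experts[g].append(e)
        gpu_load[g] += load
    return gpu_experts, gpu_load


# Made-up per-expert loads: expert 0 is "hot".
loads = [8, 2, 1, 1]
replicas = replicate_experts(loads, num_redundant=2)
gpu_experts, gpu_load = place_replicas(loads, replicas, num_gpus=2)
print(replicas, gpu_experts, gpu_load)
```

With these made-up loads, expert 0 receives both redundant slots, so its traffic is split across three replicas instead of saturating a single GPU. The sketch does not stop two replicas of the same expert from landing on one GPU, and it ignores the hierarchical (node-aware) variant mentioned above.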

Cherry-picked from #4599 and #4782.

Modifications

Usage or Command

Usage:

python -m fastdeploy.entrypoints.openai.api_server \
     ...
     --enable-eplb \
     --eplb-config '{"redundant_experts_num": 8, "redundant_expert_async_load_model_shmem_size_gb": 10}'
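When the server is launched from a script, the JSON passed to `--eplb-config` can be generated rather than hand-quoted. The sketch below only uses the two flags and two config keys shown above; the other server arguments (elided as `...` in the command) are omitted here and would still need to be supplied.

```python
import json
import shlex

# Keys and values copied from the command above.
eplb_config = {
    "redundant_experts_num": 8,
    "redundant_expert_async_load_model_shmem_size_gb": 10,
}

args = [
    "python", "-m", "fastdeploy.entrypoints.openai.api_server",
    # ... other server arguments go here (elided in the PR description)
    "--enable-eplb",
    "--eplb-config", json.dumps(eplb_config),
]

# shlex.join quotes the JSON so it survives a POSIX shell intact.
command = shlex.join(args)
print(command)
```

Passing `args` directly to `subprocess.run` would avoid shell quoting altogether; the joined string is only useful for logging or copy-pasting.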

Accuracy Tests

Checklist

  • Add at least one tag to the PR title.
    • Tag list: [FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
    • You can add new tags based on the PR content, but the semantics must be clear.
  • Format your code and run pre-commit before committing.
  • Add unit tests. If no unit tests are added, explain why in this PR.
  • Provide accuracy results.
  • If this PR targets the release branch, make sure it has first been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.

@paddle-bot

paddle-bot bot commented Nov 3, 2025

Thanks for your contribution!

@Jiang-Jia-Jun Jiang-Jia-Jun merged commit 3dbe559 into PaddlePaddle:feature/experimental_feature_20250908 Nov 7, 2025
14 of 16 checks passed
@xiaoxiaohehe001 xiaoxiaohehe001 mentioned this pull request Nov 10, 2025
5 tasks
Deleter-D pushed a commit to Deleter-D/FastDeploy that referenced this pull request Nov 26, 2025
* support eplb for ep

* update code

* update code

* update code

* update code

* update code

* update code

* update code

* update code

* update code
@kevincheng2 kevincheng2 deleted the 0908_eplb branch January 19, 2026 03:46