[XPU] [Optimization] [EP] EP communication optimization. #5145
Thanks for your contribution!
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@            Coverage Diff             @@
##             develop    #5145   +/-   ##
==========================================
  Coverage           ?   59.80%
==========================================
  Files              ?      325
  Lines              ?    40203
  Branches           ?     6084
==========================================
  Hits               ?    24042
  Misses             ?    14272
  Partials           ?     1889
57bf185 to 5fbe284
b5f1dad to f7bc8c8
zhupengyang left a comment
LGTM
Motivation
Implement low-latency communication operators for pure decode (D) requests and high-throughput communication operators for prefill (P) requests in centralized inference scenarios.
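The split described above can be sketched as a small selection routine. This is only an illustration under stated assumptions, not the code in this PR: `RequestPhase`, `CommMode`, `Request`, and `select_comm_mode` are hypothetical names, and sending mixed or empty batches down the high-throughput path is an assumption the description does not confirm.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable


class RequestPhase(Enum):
    """Hypothetical labels for the two request kinds named in the PR."""
    PREFILL = "P"
    DECODE = "D"


class CommMode(Enum):
    """Hypothetical names for the two communication operator variants."""
    LOW_LATENCY = "low_latency"
    HIGH_THROUGHPUT = "high_throughput"


@dataclass(frozen=True)
class Request:
    request_id: str
    phase: RequestPhase


def select_comm_mode(batch: Iterable[Request]) -> CommMode:
    """Choose low-latency only when the batch is non-empty and pure decode.

    Assumption: any batch containing a prefill request (and an empty batch)
    falls back to the high-throughput operators.
    """
    requests = list(batch)
    if requests and all(r.phase is RequestPhase.DECODE for r in requests):
        return CommMode.LOW_LATENCY
    return CommMode.HIGH_THROUGHPUT
```

For example, `select_comm_mode([Request("r0", RequestPhase.DECODE)])` returns `CommMode.LOW_LATENCY`, while a batch that includes any prefill request returns `CommMode.HIGH_THROUGHPUT`.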
Modifications
Usage or Command
export MOE_FFN_USE_DENSE_INPUT=1
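The PR does not show how the variable is consumed. As a rough sketch, a process could check such a flag as follows; the helper name `dense_input_enabled` is hypothetical, and treating only the string "1" as enabled (with unset meaning disabled) is an assumption.

```python
import os
from typing import Mapping, Optional


def dense_input_enabled(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when MOE_FFN_USE_DENSE_INPUT is set to "1".

    Assumption: any other value, or an unset variable, means disabled.
    The env mapping can be injected for testing; it defaults to os.environ.
    """
    source = os.environ if env is None else env
    return source.get("MOE_FFN_USE_DENSE_INPUT", "0") == "1"
```

Because the lookup happens at call time, the variable needs to be exported before the serving process starts (or set in `os.environ` before the check runs).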
Accuracy Tests
Checklist
[FDConfig], [APIServer], [Engine], [Scheduler], [PD Disaggregation], [Executor], [Graph Optimization], [Speculative Decoding], [RL], [Models], [Quantization], [Loader], [OP], [KVCache], [DataProcessor], [BugFix], [Docs], [CI], [Optimization], [Feature], [Benchmark], [Others], [XPU], [HPU], [GCU], [DCU], [Iluvatar], [Metax]
… pre-commit before commit.
… release branch, make sure the PR has been submitted to the develop branch, then cherry-pick it to the release branch with the [Cherry-Pick] PR tag.