[AMD][MI30X] Update Qwen3.5 perf #986
Conversation
Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes are similar to the official vLLM recipes and/or the SGLang cookbook. If they are not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class so that the entire ML community can benefit from your hard work! Thank you
bump @zhentaocc
…ormance
- Added new config keys for Qwen3.5 BF16 and FP8 benchmarks on MI300X and MI325X.
- Updated Docker image to lmsysorg/sglang:v0.5.10rc0-rocm720-mi30x for better compatibility.
- Enhanced benchmark scripts with additional parameters for context length and prefill tokens.
- Adjusted memory fraction settings and added new flags for server launch to optimize performance.
…d FP8 configurations on MI300X and MI325X to streamline server launch commands.
…end instead of 'triton' for BF16 and FP8 configurations on MI300X and MI325X.
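The knobs touched in the commits above (context length, prefill tokens, memory fraction, attention backend) map onto `sglang.launch_server` flags. Below is a minimal sketch of such a launch command, assuming a single 8-GPU node; the model path and every flag value are illustrative placeholders, not the values actually used in this PR's benchmark configs.

```shell
# Hypothetical SGLang server launch on MI300X/MI325X (values are illustrative):
#   --context-length       caps the maximum sequence length per request
#   --chunked-prefill-size bounds prefill tokens processed per scheduler step
#   --mem-fraction-static  fraction of GPU memory reserved for weights + KV cache
#   --attention-backend    selects the attention kernel backend instead of 'triton'
python3 -m sglang.launch_server \
  --model-path <path-to-Qwen3.5-checkpoint> \
  --tp 8 \
  --context-length 32768 \
  --chunked-prefill-size 8192 \
  --mem-fraction-static 0.85 \
  --attention-backend aiter
```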
Force-pushed from e307c2d to 2c17191
…rsion for MI300X and MI325X setups. Changed image tag from 'v0.5.10rc0-rocm720-mi30x' to 'v0.5.10-rocm720-mi30x' for consistency and reliability.
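With the image pinned to the stable `v0.5.10-rocm720-mi30x` tag, a container can be brought up with the usual ROCm device flags. This is a hedged sketch of a typical ROCm `docker run` invocation, not the exact options used by this repo's CI.

```shell
# Hypothetical pull-and-run of the pinned image on a ROCm host.
# --device/--group-add expose the AMD GPU devices to the container;
# a large shared-memory segment is commonly needed for multi-GPU serving.
docker pull lmsysorg/sglang:v0.5.10-rocm720-mi30x
docker run -it --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --ipc=host --shm-size 16g \
  lmsysorg/sglang:v0.5.10-rocm720-mi30x
```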
@chunfangamd @zhentaocc This looks good to me; what is the holdup?

e2e Test: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24170176668
Co-authored-by: @chunfangamd @1am9trash