[Feat][Plugin] Enable DeepSeek-V3.2 for vLLM-ATOM Plugin #494

Open
kliuae-amd wants to merge 36 commits into main from kliuae/plugin_deepseekv32

@kliuae-amd kliuae-amd commented Apr 6, 2026

Motivation

This PR is a follow-up to #399 and adds DeepSeek-V3.2 support to ATOM's vLLM plugin mode.

Technical Details

Test Plan

Accuracy test with lm_eval

Model: deepseek-ai/DeepSeek-V3.2

Server command (run once with --kv-cache-dtype auto for the bf16 KV cache results and once with --kv-cache-dtype fp8 for the fp8 KV cache results):

ATOM_DISABLE_VLLM_PLUGIN=0 \
ATOM_DISABLE_VLLM_PLUGIN_ATTENTION=0 \
vllm serve deepseek-ai/DeepSeek-V3.2 \
  -tp 8 \
  --gpu-memory-utilization 0.8 \
  --no-enable-prefix-caching \
  --disable-uvicorn-access-log \
  --trust-remote-code \
  --compilation-config '{"cudagraph_mode": "FULL_AND_PIECEWISE"}' \
  --kv-cache-dtype auto \
  --block-size 1
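
Before starting a full lm_eval run, it can help to confirm that the server answers on the completions endpoint. The following is a minimal sketch, not part of this PR: it assumes the server listens on `localhost:8000` (the same `base_url` used by the lm_eval command) and that the response follows the usual OpenAI-compatible `choices[0].text` layout. The helper names `build_request` and `smoke_test` are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: the same base_url that the lm_eval command points at.
BASE_URL = "http://localhost:8000/v1/completions"
MODEL = "deepseek-ai/DeepSeek-V3.2"


def build_request(prompt, max_tokens=32):
    """Build a POST request for the OpenAI-compatible completions endpoint."""
    payload = {
        "model": MODEL,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0.0,  # greedy decoding keeps the smoke test repeatable
    }
    return urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(prompt="2 + 2 ="):
    """Send one prompt to the running server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt), timeout=120) as resp:
        body = json.load(resp)
    return body["choices"][0]["text"]
```

With the server up, calling `print(smoke_test())` should print a short completion; a connection error usually means the server has not finished loading the model.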

Test Result

lm_eval command

lm_eval --model local-completions \
  --model_args model=deepseek-ai/DeepSeek-V3.2,base_url=http://localhost:8000/v1/completions \
  --batch_size 100 \
  --tasks gsm8k \
  --num_fewshot 20

vLLM plugin, bf16 kv cache

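| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|-------|---------|--------|--------|--------|-------|--------|
| gsm8k | 3 | flexible-extract | 20 | exact_match | 0.9545 | ± 0.0057 |
| | | strict-match | 20 | exact_match | 0.9545 | ± 0.0057 |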

vLLM plugin, fp8 kv cache

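| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|-------|---------|--------|--------|--------|-------|--------|
| gsm8k | 3 | flexible-extract | 20 | exact_match | 0.9431 | ± 0.0064 |
| | | strict-match | 20 | exact_match | 0.9393 | ± 0.0066 |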
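
One rough way to read the gap between the bf16 and fp8 KV cache scores is to compare it with the reported standard errors. The sketch below is an illustration rather than part of the PR's test plan: it treats the two runs as independent samples, which is a simplifying assumption because both runs score the same gsm8k questions. The helper name `z_score` is hypothetical.

```python
import math


def z_score(value_a, stderr_a, value_b, stderr_b):
    """Difference between two scores in units of their combined standard error.

    Assumes the two measurements are independent (a simplification here).
    """
    return (value_a - value_b) / math.sqrt(stderr_a**2 + stderr_b**2)


# (bf16 value, bf16 stderr, fp8 value, fp8 stderr), copied from the results above.
results = {
    "flexible-extract": (0.9545, 0.0057, 0.9431, 0.0064),
    "strict-match": (0.9545, 0.0057, 0.9393, 0.0066),
}

for name, (bf16, bf16_se, fp8, fp8_se) in results.items():
    print(f"{name}: z = {z_score(bf16, bf16_se, fp8, fp8_se):.2f}")
```

Under that assumption both filters come out below the common 1.96 threshold (roughly 1.3 and 1.7), so the fp8 drop is not clearly distinguishable from run-to-run noise on this benchmark alone.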

Performance on MI355X, TP8

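| ISL/OSL | Concurrency | KV Cache | vLLM Plugin Req/s | ATOM Req/s | vLLM Plugin over ATOM (Req/s) | vLLM Plugin Total tok/s | ATOM Total tok/s | vLLM Plugin over ATOM (tok/s) |
|---------|-------------|----------|-------------------|------------|-------------------------------|-------------------------|------------------|-------------------------------|
| 1k/1k | 128 | fp8 | 3.51 | 3.80 | -7.63% | 7194.55 | 7786.85 | -7.61% |
| 1k/1k | 128 | bf16 | 3.47 | 3.58 | -3.07% | 7099.09 | 7326.96 | -3.11% |
| 1k/1k | 64 | fp8 | 2.28 | 2.32 | -1.72% | 4677.17 | 4744.63 | -1.42% |
| 1k/1k | 64 | bf16 | 2.26 | 2.22 | +1.80% | 4621.56 | 4538.04 | +1.84% |
| 8k/1k | 128 | fp8 | 1.42 | 1.59 | -10.69% | 13077.77 | 14614.80 | -10.52% |
| 8k/1k | 128 | bf16 | 1.39 | 1.03 | +34.95% | 12831.04 | 9505.53 | +34.99% |
| 8k/1k | 64 | fp8 | 1.16 | 1.22 | -4.92% | 10683.58 | 11288.62 | -5.36% |
| 8k/1k | 64 | bf16 | 1.16 | 0.86 | +34.88% | 10731.82 | 7962.13 | +34.79% |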
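
The "vLLM Plugin over ATOM" columns appear to be the relative difference of the plugin figure against the ATOM figure, i.e. (plugin / ATOM - 1) × 100. The short sketch below reproduces a few table entries under that reading; the function name `relative_delta_pct` is hypothetical and not taken from the PR.

```python
def relative_delta_pct(plugin, atom):
    """Relative difference of the plugin figure over the ATOM baseline, in percent."""
    return (plugin / atom - 1.0) * 100.0


# (ISL/OSL, concurrency, KV cache, plugin Req/s, ATOM Req/s), copied from the table.
rows = [
    ("1k/1k", 128, "fp8", 3.51, 3.80),
    ("8k/1k", 128, "bf16", 1.39, 1.03),
    ("8k/1k", 64, "bf16", 1.16, 0.86),
]

for isl_osl, concurrency, kv, plugin, atom in rows:
    delta = relative_delta_pct(plugin, atom)
    print(f"{isl_osl} c={concurrency} {kv}: {delta:+.2f}%")
```

Negative values mean the plugin path is slower than native ATOM for that configuration; positive values mean it is faster.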

Submission Checklist

kliuae and others added 30 commits March 20, 2026 07:01
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
@kliuae-amd kliuae-amd changed the title from "[Feat][Plugin] Enable DeepSeek-V3.2 for vLLM OOT Plugin" to "[Feat][Plugin] Enable DeepSeek-V3.2 for vLLM-ATOM Plugin" Apr 15, 2026
@kliuae-amd kliuae-amd marked this pull request as ready for review April 15, 2026 09:04
wuhuikx and others added 5 commits April 17, 2026 12:29
Signed-off-by: kliuae <kuanfu.liu@embeddedllm.com>
3 participants