[Bug] vLLM+ATOM_OOT (gpt-oss-120b) server crashes for particular sequence lengths #623

@divakar-amd

Description

I'm noticing that the server runs fine with `--random-input-len 1024 --random-output-len 1024 --max-concurrency 8` but crashes with `--random-input-len` **4096** `--random-output-len 1024 --max-concurrency 8`.

Error:

(EngineCore pid=34835) File "/app/ATOM/atom/plugin/attention.py", line 349, in build
(EngineCore pid=34835) query_lens_cpu[num_decodes + num_extends :].max().item()
(EngineCore pid=34835) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore pid=34835) RuntimeError: max(): Expected reduction dim to be specified for input.numel() == 0. Specify the reduction dim with the 'dim' argument.
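For context, the crash happens because the slice `query_lens_cpu[num_decodes + num_extends :]` can be empty for some batch compositions, and calling `.max()` on an empty tensor raises. A pure-Python analogue of the failing line (function name and the `default=0` guard are hypothetical, not ATOM's actual fix):

```python
def max_prefill_query_len(query_lens, num_decodes, num_extends):
    # Mirrors the slice in atom/plugin/attention.py line 349: take the
    # query lengths past the decode and extend requests.
    tail = query_lens[num_decodes + num_extends:]
    # When every request in the batch is a decode/extend, `tail` is empty
    # and a bare max(tail) would raise, just like tensor.max() does in the
    # traceback above. `default=0` sidesteps that in this sketch.
    return max(tail, default=0)

print(max_prefill_query_len([8, 8, 8], 2, 1))          # 0 (empty tail, no crash)
print(max_prefill_query_len([8, 8, 512, 4096], 1, 1))  # 4096
```

In the real torch code, the equivalent guard would be checking `numel() > 0` on the slice before reducing; with 4096-token inputs and concurrency 8 the scheduler apparently produces batches where that slice is empty.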

Docker used:
rocm/atom-dev:vllm-latest
ATOM commit: 58af3e4
vLLM commit: 0.19.1.dev0+g2a69949bd.d20260420.rocm722

Machine used: mi355

logs_client.txt
logs_server.txt

Server Launch cmd:

export ATOM_ENABLE_QK_NORM_ROPE_CACHE_QUANT_FUSION=1
export VLLM_ROCM_USE_AITER=1

vllm serve /data/models/gpt-oss-120b/ -tp 1 --disable-uvicorn-access-log --no-enable-prefix-caching --port 8004 --kv-cache-dtype=fp8

Client Launch cmd:

vllm bench serve --model /data/models/gpt-oss-120b/  --dataset-name random --random-input-len 4096 --random-output-len 1024 --max-concurrency 8 --num-prompts 80 --percentile-metrics ttft,tpot,itl,e2el --metric-percentiles 99 --ignore-eos --temperature 0 --seed 0 --trust-remote-code

Metadata

Labels

ATOM (added to frameworks-internal ATOM GitHub board)
