Fix Qwen3.5 model config type error#655

Open
zovonoir wants to merge 2 commits into ROCm:main from zovonoir:fix-model-config
Conversation

@zovonoir

@zovonoir zovonoir commented Apr 28, 2026

Fix Qwen3.5 RoPE validation ignore keys type

Summary

This PR fixes a compatibility issue in the Qwen3.5 and Qwen3.5-MoE model configs when running ATOM with newer versions of transformers.

The affected configs set ignore_keys_at_rope_validation as a list, but the latest transformers RoPE validation code treats this field as a set and performs a set union with {"partial_rotary_factor"}. This causes ATOM standalone mode to fail with:

TypeError: unsupported operand type(s) for |: 'list' and 'set'
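The failure can be reproduced in isolation, without transformers or ATOM installed, because Python's `|` operator is simply not defined between a `list` and a `set` (a minimal sketch; the key names mirror the affected config):

```python
# Minimal illustration of the failure mode: the union operator `|`
# is not defined between a list and a set in Python.
ignore_keys = ["mrope_section", "mrope_interleaved"]  # list, as in the old config

try:
    ignore_keys | {"partial_rotary_factor"}  # what newer transformers does
except TypeError as e:
    print(e)  # unsupported operand type(s) for |: 'list' and 'set'
```

Note that `set.union()` would accept any iterable, but the operator form used by transformers requires both operands to be sets.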

Upstream Context

This failure was exposed by Hugging Face Transformers PR #41250, which introduced stricter config attribute validation and updated the RoPE validation path to union ignore_keys_at_rope_validation with {"partial_rotary_factor"}.

The upstream change is reasonable because the RoPE validation code now expects ignore_keys_at_rope_validation to behave like a set. Instead of changing or pinning transformers, this PR fixes ATOM's Qwen3.5 configs to provide the expected type.

Root Cause

atom/model_config/qwen3_5.py and atom/model_config/qwen3_5_moe.py initialized:

kwargs["ignore_keys_at_rope_validation"] = [
    "mrope_section",
    "mrope_interleaved",
]

In newer transformers, modeling_rope_utils.py later executes:

self.ignore_keys_at_rope_validation = self.ignore_keys_at_rope_validation | {"partial_rotary_factor"}

That operation requires self.ignore_keys_at_rope_validation to be a set-like value.

Fix

Change both Qwen3.5 config definitions from list literals to set literals:

kwargs["ignore_keys_at_rope_validation"] = {
    "mrope_section",
    "mrope_interleaved",
}

This preserves the same ignored RoPE validation keys while matching the type expected by transformers.
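The fix can be sanity-checked in isolation (a sketch mirroring the config values, not ATOM code): with a set literal, the union transformers performs succeeds and yields exactly the original ignored keys plus `partial_rotary_factor`:

```python
# With a set literal, the union performed by transformers succeeds
# and all of the originally ignored keys are preserved.
ignore_keys = {"mrope_section", "mrope_interleaved"}  # set, as in the fix
merged = ignore_keys | {"partial_rotary_factor"}
print(merged == {"mrope_section", "mrope_interleaved", "partial_rotary_factor"})  # True
```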

Reproduction

The issue can be reproduced with the standalone ATOM path using the following quick script:

IMAGE="rocm/atom-dev:latest"
CONTAINER_NAME="repro_atom_rope_bug"
MODEL_PATH="/.cache/huggingface/Qwen3.5-27B"
HOST_MODEL_MOUNT_PATH="/raid/models"

echo "=== ATOM Qwen3.5 rope validation bug reproducer ==="
echo "Image: ${IMAGE}"
echo ""

docker rm -f "${CONTAINER_NAME}" >/dev/null 2>&1 || true

echo ">>> Starting container..."
docker run --rm \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video --ipc=host --network host \
  -e HIP_VISIBLE_DEVICES=0 \
  -e CUDA_VISIBLE_DEVICES=0 \
  -e HF_HOME=/.cache/huggingface/ \
  --mount type=bind,src=${HOST_MODEL_MOUNT_PATH},dst=/.cache/huggingface/ \
  --name "${CONTAINER_NAME}" \
  "${IMAGE}" \
  python3 -c "
from atom.model_config.qwen3_5 import Qwen3_5TextConfig
import json

with open('${MODEL_PATH}/config.json') as f:
    cfg = json.load(f)
text_config_dict = cfg['text_config']

print('>>> Attempting to create Qwen3_5TextConfig...')
import transformers
print(f'>>> transformers version: {transformers.__version__}')

config = Qwen3_5TextConfig(**text_config_dict)
print('>>> Success (bug not triggered)')
" 2>&1

With the list-based ignore_keys_at_rope_validation, the config construction fails with:

TypeError: unsupported operand type(s) for |: 'list' and 'set'

The reproducer also documents the impact scope: this affects ATOM standalone mode (atom.entrypoints.openai_server) and does not affect the SGLang plugin mode, where SGLang handles config parsing.

Validation

  • Confirmed that the list-based pattern appeared only in the two affected Qwen3.5 config files.
  • Ran syntax compilation for the updated files:
python -m py_compile atom/model_config/qwen3_5.py atom/model_config/qwen3_5_moe.py

Copilot AI review requested due to automatic review settings April 28, 2026 09:32

Copilot AI left a comment

Pull request overview

This PR fixes a compatibility break between ATOM’s Qwen3.5/Qwen3.5-MoE model configs and newer transformers RoPE validation logic by ensuring ignore_keys_at_rope_validation is provided as a set (so set-union operations in transformers don’t raise a TypeError).

Changes:

  • Update Qwen3.5 text config to set ignore_keys_at_rope_validation as a set instead of a list.
  • Update Qwen3.5-MoE text config to set ignore_keys_at_rope_validation as a set instead of a list.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
atom/model_config/qwen3_5.py Switch ignore_keys_at_rope_validation from list to set to satisfy transformers RoPE validation expectations.
atom/model_config/qwen3_5_moe.py Switch ignore_keys_at_rope_validation from list to set to satisfy transformers RoPE validation expectations.


@zovonoir zovonoir self-assigned this Apr 28, 2026
@valarLip valarLip requested a review from ganyi1996ppo April 28, 2026 16:07
@zovonoir

Hi @valarLip @ganyi1996ppo, the CI checks have failed again on re-run. The consistently failing tests (gpt-oss-120b in both ATOM Test and ATOM vLLM Test) and the stuck job (DeepSeek-R1-FP8 TP4, pending for over an hour) appear unrelated to this PR's change (the list→set fix for ignore_keys_at_rope_validation). The other previously failing tests (Meta-Llama-3-8B-Instruct, Qwen3.5-35B-A3B-FP8 TP2) passed on re-run. Could you confirm whether these are known flaky tests and assist with merging? Thanks!

