[Feature Request] Add support for Qwen3.5/3.6 MoE (qwen3_5_moe) architecture #929

@shrey2003

Is your feature request related to a problem? Please describe.

Attempting to export Qwen/Qwen3.6-35B-A3B (and other Qwen3.5/3.6 MoE variants) using QEfficient.cloud.export fails at the config loading stage because qwen3_5_moe is not a recognized architecture in the transformers version that QEfficient currently pins:

KeyError: 'qwen3_5_moe'
ValueError: The checkpoint you are trying to load has model type `qwen3_5_moe`
but Transformers does not recognize this architecture.

The qwen3_5_moe architecture is only available in transformers>=5.3.0, but QEfficient's current dependency tree pins transformers<5.x. As a result, Qwen3.5/3.6 MoE models cannot be exported or compiled for AIC100 without breaking QEfficient's own internal imports.
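For anyone triaging this, here is a minimal sketch of a pre-flight check that reports whether the installed transformers is new enough before attempting an export. It uses only the standard library. The 5.3.0 threshold is taken from the claim above, and the helper names (`parse_version`, `meets_minimum`, `installed_transformers_version`) are hypothetical; they are not part of QEfficient or transformers.

```python
from importlib import metadata


def parse_version(text):
    """Return the leading numeric release components of a version string.

    Parsing stops at the first component that does not start with a digit,
    so "4.55.0.dev0" -> (4, 55, 0) and "5.3.0rc1" -> (5, 3, 0).
    """
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="5.3.0"):
    """True if `installed` is at least `minimum` (numeric release parts only)."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so "5.4" compares like "5.4.0".
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def installed_transformers_version():
    """Return the installed transformers version string, or None if absent."""
    try:
        return metadata.version("transformers")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_transformers_version()
    if version is None:
        print("transformers is not installed")
    elif meets_minimum(version):
        print(f"transformers {version}: new enough for qwen3_5_moe (per this issue)")
    else:
        print(f"transformers {version}: too old for qwen3_5_moe (per this issue)")
```

This ignores pre-release ordering (an rc of 5.3.0 is treated as 5.3.0), which seems acceptable for a quick diagnostic but not for real dependency resolution.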

Describe the solution you'd like

  1. Add qwen3_5_moe / Qwen3_5MoeForCausalLM to QEfficient's supported model registry (QEfficient/utils/model_registery.py)
  2. Add the necessary AIC100-specific PyTorch transforms for the Qwen3.5 MoE attention and MoE routing layers in QEfficient/transformers/models/
  3. Bump the transformers dependency pin to >=5.3.0 (while fixing the internal import breakages caused by symbols removed in transformers>=5.4.0, such as AwqBackendPackingMethod, AWQLinearVersion, HybridCache, and Qwen2RMSNorm from qwen2_5_vl)
  4. Add an ONNX export config for the text-generation-with-past task for this architecture
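For item 3, one possible way to tolerate removed symbols during the transition is a guarded lookup instead of a hard import. The sketch below is only an illustration of that pattern: `optional_import` is a hypothetical helper, not an existing QEfficient or transformers function, and the demo uses standard-library modules as stand-ins so it runs without transformers installed.

```python
import importlib


def optional_import(module_path, name, default=None):
    """Return `name` from `module_path`, or `default` if either is missing.

    Hypothetical helper: lets calling code keep working across library
    versions where a symbol has been removed or relocated.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError:
        # The module itself does not exist in this environment.
        return default
    # The module exists; fall back to `default` if the symbol is gone.
    return getattr(module, name, default)


if __name__ == "__main__":
    # Stand-in demo with the standard library.
    sqrt = optional_import("math", "sqrt")
    missing = optional_import("math", "not_a_real_symbol", default="missing")
    print(sqrt is not None, missing)
```

In QEfficient the same call could presumably wrap lookups for the symbols listed in item 3 (for example `HybridCache`), with the code path that needs them skipped when the result is `None`. Where exactly each symbol lived in older transformers releases would need to be confirmed against the pinned version.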

Describe alternatives you've considered

  • Manually patching the installed QEfficient venv files with sed to stub out missing imports: this is unsustainable, as each fix exposes another broken import further down the chain
  • Using optimum-cli export onnx directly: same blocker, as optimum is also incompatible with transformers>=5.x
  • Downgrading to a supported Qwen variant (e.g., Qwen2-57B-A14B): undesirable, as Qwen3.5/3.6 MoE offers significantly better performance per parameter

Additional context

