feat: add Qwen3.6-35B-A3B VLM finetune recipe #1882

Merged
snowmanwwg merged 1 commit into main from huiyingl/add_qwen3_6_35b_config
Apr 16, 2026

@HuiyingLi
Contributor

Summary

  • Adds examples/vlm_finetune/qwen3_5_moe/qwen3_6_35b.yaml — a MedPix-VQA fine-tuning recipe for Qwen/Qwen3.6-35B-A3B (next-gen Qwen3 MoE, same qwen3_5_moe arch).
  • Adds news bullet in README.md, a row in docs/model-coverage/latest-models.md, and entries in docs/model-coverage/vlm/qwen/qwen3-5-vl.md (available models + example recipes).
  • Follows the docs pattern established by feat: minimax m27 #1785.

Test plan

Note: requires the collate_fn fix in #1799 for the default MedPix recipe (local_batch_size: 1, max_length: 2048) to avoid occasional batches where the only sample exceeds max_length.
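The failure mode described in the note can be illustrated with a minimal sketch. This is not the actual change in #1799; the helper names `drop_overlong` and `truncate_overlong` and the `input_ids` sample layout are hypothetical and only show why `local_batch_size: 1` makes over-length samples a problem: if a collate step filters them out, a single-sample batch becomes empty.

```python
from typing import Dict, List

MAX_LENGTH = 2048  # mirrors max_length: 2048 from the recipe


def drop_overlong(samples: List[Dict], max_length: int = MAX_LENGTH) -> List[Dict]:
    """Hypothetical filtering step: keep only samples that fit in max_length."""
    return [s for s in samples if len(s["input_ids"]) <= max_length]


def truncate_overlong(samples: List[Dict], max_length: int = MAX_LENGTH) -> List[Dict]:
    """Hypothetical alternative: truncate instead of dropping, so no batch is empty."""
    return [{**s, "input_ids": s["input_ids"][:max_length]} for s in samples]


# With local_batch_size: 1, every batch holds exactly one sample.
batch = [{"input_ids": list(range(3000))}]  # a single over-length sample

dropped = drop_overlong(batch)
truncated = truncate_overlong(batch)

print(len(dropped))                    # 0 -> nothing left to train on
print(len(truncated[0]["input_ids"]))  # 2048
```

Truncation is only one possible remedy. For a VLM it can cut through image placeholder tokens, so a real fix may need to handle those separately.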

🤖 Generated with Claude Code

@copy-pr-bot

copy-pr-bot Bot commented Apr 16, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Adds a ready-to-run MedPix-VQA fine-tuning recipe for `Qwen/Qwen3.6-35B-A3B`
under the existing `qwen3_5_moe` architecture (same custom model impl).
Verified on 8×H100: 100 steps complete, loss 1.86 → ~1.5, peak mem 64 GiB/GPU.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
@HuiyingLi HuiyingLi force-pushed the huiyingl/add_qwen3_6_35b_config branch from 7a16517 to f6a4f99 Compare April 16, 2026 18:54
@HuiyingLi
Contributor Author

/ok to test f6a4f99

@HuiyingLi HuiyingLi added the docs-only label Apr 16, 2026
@snowmanwwg snowmanwwg merged commit 906ecae into main Apr 16, 2026
32 checks passed
@snowmanwwg snowmanwwg deleted the huiyingl/add_qwen3_6_35b_config branch April 16, 2026 19:03
linnanwang pushed a commit that referenced this pull request Apr 24, 2026
Adds a ready-to-run MedPix-VQA fine-tuning recipe for `Qwen/Qwen3.6-35B-A3B`
under the existing `qwen3_5_moe` architecture (same custom model impl).
Verified on 8×H100: 100 steps complete, loss 1.86 → ~1.5, peak mem 64 GiB/GPU.

Signed-off-by: HuiyingLi <willwin.lee@gmail.com>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>