Add Qwen3-VL support to Minitron pruning#919

Draft
eagle705 wants to merge 2 commits into NVIDIA:main from eagle705:add-vlm-pruning
Conversation

@eagle705

What does this PR do?

Type of change: New feature

Overview: This PR adds VLM support to the Minitron pruning flow (including Qwen3-VL paths), and fixes pipeline-parallel runtime issues specific to mRoPE models.

Key updates:

  • Added VLM wrapper-aware pruning by resolving the prunable language backbone (language_model) while preserving non-language components.
  • Improved HF export compatibility for VLM checkpoints by using robust dummy-model creation and architecture suffix normalization for AutoBridge.
  • Fixed PP + mRoPE runtime failures (position_ids=None on non-first pipeline stages) by ensuring position_ids are synthesized from decoder input shape (with safe kwargs fallback).
  • Updated generation utilities to:
    • provide explicit position_ids for mRoPE models,
    • send vision tensors only during prefill (step 0), not decode steps.
  • Extended dynamic conversion coverage for VLM/VLM-MoE paths:
    • TE linear compatibility for Megatron NAS modules,
    • grouped MoE expert handling,
    • IdentityOp-safe conversion/export paths,
    • auto-registration of forward-overriding subclasses used by VLM modules,
    • preserved original runtime class behavior via dynamic MRO conversion for QKV/proj wrappers.

Usage

torchrun --nproc_per_node 2 prune_minitron.py \
    --pp_size 2 \
    --hf_model_name_or_path /work/checkpoints/hf/Qwen3-VL-8B-Instruct \
    --hparams_to_skip num_attention_heads \
    --prune_target_params 6e9 \
    --output_hf_path /work/checkpoints/compressor/Qwen3-VL-8B-Instruct-Pruned-6B

torchrun --nproc_per_node 2 prune_minitron.py \
    --pp_size 2 \
    --hf_model_name_or_path /work/checkpoints/hf/Qwen3-VL-30B-A3B-Instruct \
    --prune_target_params 26e9 \
    --hparams_to_skip num_attention_heads \
    --output_hf_path /work/checkpoints/compressor/Qwen3-VL-30B-A3B-Instruct-Pruned-6B

Testing

Manual multi-GPU validation on Megatron-Bridge pruning flows:

  • Qwen3-VL-8B (pp_size=2) now runs calibration/evaluation through full iterations (the previous NoneType.ndim mRoPE crash is resolved).
  • Qwen3-VL-30B-A3B (pp_size=2) proceeds through NAS search and candidate evaluation without the PP + mRoPE runtime crash.
  • Verified that very low target-params constraints may still be infeasible ("No subnets found fitting the constraints!"); this is a search-space/constraint outcome, not a PP runtime failure.

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes/No
  • Did you write any new necessary tests?: Yes/No
  • Did you add or update any necessary documentation?: Yes/No
  • Did you update Changelog?: Yes/No

Additional Information

This PR focuses on enabling VLM/VLM-MoE pruning paths in Megatron-Bridge + ModelOpt, with mRoPE pipeline-parallel runtime stability improvements.

Signed-off-by: joosungy <joosungy@nvidia.com>
@copy-pr-bot

copy-pr-bot bot commented Feb 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.
