
feat: Add LoRA support for Gemma4ForConditionalGeneration #39291

Merged
vllm-bot merged 15 commits into vllm-project:main from allgather:2 on Apr 17, 2026

Conversation

@allgather (Contributor) commented Apr 8, 2026

Fixes #39246 (the multimodal part only).

Quoting from the issue:

Enable LoRA for Gemma4ForConditionalGeneration

Implement (rough sketch below):

  • get_num_mm_connector_tokens
  • get_num_mm_encoder_tokens

Support LoRA on:

  • language backbone (first)
  • connector / tower modules (optional)
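
A hypothetical shape for the two hooks named in the issue; the signatures and pooling logic below are guesses for illustration, not the actual vLLM processing-info API:

```python
# Hypothetical signatures for the two hooks the issue asks for; the
# real vLLM interface may differ.
def get_num_mm_encoder_tokens(num_patches: int) -> int:
    """Tokens emitted by the vision encoder for one image."""
    return num_patches


def get_num_mm_connector_tokens(num_encoder_tokens: int,
                                pool_ratio: int = 4) -> int:
    """Tokens remaining after the connector pools encoder outputs."""
    return num_encoder_tokens // pool_ratio
```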

Ran on 1x A100:

pytest tests/lora/test_qwenvl.py::test_qwen25vl_vision_lora -vv -s
PASSED (1 passed, 18 warnings in 107.23s (0:01:47))

pytest tests/models/multimodal/processing/test_gemma4.py::test_limit_mm_per_prompt -vv -s
PASSED

pytest tests/lora/test_qwenvl.py::test_qwen2vl_multiple_lora_types -vv -s
PASSED (1 passed, 19 warnings in 133.33s (0:02:13))

pytest tests/lora/test_qwenvl.py::test_qwen3vl_vision_lora -vv -s
PASSED

cc @jeejeelee

Signed-off-by: allgather <all2allops@gmail.com>

@gemini-code-assist (Bot) left a comment

Code Review

This pull request implements LoRA support for the Gemma4 multi-modal model by adding the SupportsLoRA interface and configuring MoE layer remapping. Key changes include updating the maximum token calculations for images and videos, where the image token count is now hardcoded to 1120 to accommodate LoRA tower budgeting. Feedback indicates that this hardcoded value should be defined as a named constant to improve code maintainability.
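
That suggestion amounts to something like the following (the constant name and method are illustrative, not the merged code):

```python
# Illustrative only: the reviewer's suggestion of a named constant in
# place of the magic number. The name here is an assumption.
GEMMA4_MAX_IMAGE_TOKENS = 1120  # image-token budget sized to also cover the LoRA tower


def get_max_image_tokens() -> int:
    return GEMMA4_MAX_IMAGE_TOKENS
```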

Comment thread vllm/model_executor/models/gemma4_mm.py Outdated
Signed-off-by: allgather <all2allops@gmail.com>
@jeejeelee self-assigned this Apr 8, 2026
Signed-off-by: allgather <all2allops@gmail.com>
@mergify (Bot) commented Apr 10, 2026

This pull request has merge conflicts that must be resolved before it can be merged. Please rebase the PR, @allgather.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify (Bot) added the needs-rebase label Apr 10, 2026
Signed-off-by: allgather <all2allops@gmail.com>
@jeejeelee (Collaborator) commented

Have you tested this with a real LoRA adapter?

@allgather (Author) commented

@jeejeelee I haven't.

I can't get access to compute right now; cloud instances are still unavailable.

Are there dedicated tests for LoRA that you would suggest I run?

A side note on the Gemma tests I wanted to run:

These tests and their variants are hitting an issue because of the transformers upgrade; once you force an upgrade to transformers v5, they fail with a type error.
tests/models/multimodal/generation/test_common.py::test_single_image_models[gemma4-test_casex]
tests/models/multimodal/generation/test_common.py::test_multi_image_models[gemma4-test_casex]

Comment thread vllm/model_executor/models/gemma4_mm.py Outdated
Signed-off-by: allgather <all2allops@gmail.com>
wow i made syntax errors lol

Signed-off-by: allgather <all2allops@gmail.com>
@jeejeelee (Collaborator) commented

I still think we should validate this with an actual LoRA adapter.

@allgather (Author) commented

@jeejeelee Sounds good. Let me know if one already exists or if it needs to be created.

@jeejeelee (Collaborator) commented

I will test it locally

@jeejeelee (Collaborator) commented

@allgather For Gemma4, we currently should only support adding LoRA to the language model (not the tower modules), so functions like get_num_mm_encoder_tokens need to be removed from this PR. Then we need to implement the tower modules in vLLM format, just like Qwen3. Does that make sense?
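
For orientation, a rough illustration of what "vLLM format" means for a tower module: blocks built from vLLM's parallel linear layers (so LoRA and tensor parallelism can hook in) rather than a wrapped HF AutoModel. All names below are assumptions, not the PR's code:

```python
# Toy MLP block in the vLLM layer style; a sketch, not the actual tower.
import torch
import torch.nn as nn
from vllm.model_executor.layers.linear import (ColumnParallelLinear,
                                               RowParallelLinear)


class VisionTowerMLPSketch(nn.Module):
    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        self.fc1 = ColumnParallelLinear(hidden_size, intermediate_size)
        self.fc2 = RowParallelLinear(intermediate_size, hidden_size)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x, _ = self.fc1(x)  # vLLM parallel linears return (output, bias)
        x = torch.nn.functional.gelu(x)
        x, _ = self.fc2(x)
        return x
```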

@allgather (Author) commented

@jeejeelee I'm a bit confused:

we currently should only support adding LoRA to the language model

I believe this part is already done and merged, meaning vision + audio are what's missing LoRA support.

@allgather For Gemma4, we currently should only support adding LoRA to the language model (not the tower modules), so functions like get_num_mm_encoder_tokens need to be removed from this PR. Then we need to implement the tower modules in vLLM format, just like Qwen3. Does that make sense?

Vision and audio currently rely on the HF AutoModel path. Were you saying we should switch the whole thing to a vLLM-only implementation?

I just committed changes modeled after Qwen3 as much as possible. PTAL.

@jeejeelee (Collaborator) left a comment

I've added two comments. Please remove the unrelated code changes; you can complete them in a follow-up PR.

Comment thread vllm/model_executor/models/gemma4_mm.py
Comment thread vllm/model_executor/models/gemma4_mm.py
Signed-off-by: allgather <all2allops@gmail.com>
@EvanWeiner commented
I will test it locally

Thank you for the hard work; I'm the original submitter of #39246. Just wanted to ask: have you tested with Gemma 4 31B IT?

@jeejeelee (Collaborator) left a comment

LGTM, please fix the pre-commit failure.

@Nik-Reddy left a comment

Good start on wiring up LoRA for Gemma4; the get_mm_mapping() change to conditionally include audio modules is a nice defensive touch.

A few concerns:

1. No Gemma4-specific LoRA test
The PR description shows passing tests for Qwen2.5VL and Qwen3VL but no Gemma4 LoRA test. Even a minimal test that loads a small Gemma4 model with a dummy LoRA adapter and checks that forward passes work would add confidence here (a rough sketch follows this list).

2. Missing get_num_mm_connector_tokens / get_num_mm_encoder_tokens
The linked issue (#39246) calls for implementing these methods, but the maintainer's comment suggests get_num_mm_encoder_tokens should actually be removed for now, since only language-model LoRA should be supported at this stage (tower modules need a vLLM-native reimplementation first). Could you clarify the current scope: is this PR language-model-only LoRA? If so, the get_mm_mapping with connector/tower entries might mislead the LoRA weight loading into treating those modules as LoRA-eligible (see the mapping sketch after this list).

3. Merge conflicts
Mergify flagged conflicts. You'll need a rebase before this can move forward.

4. The diff is very small (10 additions) for the feature scope
Other multimodal models with LoRA support (like Qwen2VL) also define supported_lora_modules, embedding_modules, and embedding_padding_modules class attributes. Are those inherited from somewhere, or are they missing?
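
On point 1, a minimal smoke test could look something like this. The model id, adapter path, and assertion are placeholders, not a real checkpoint:

```python
# Minimal sketch of a Gemma4 LoRA smoke test; model id and adapter
# path below are placeholders, not real artifacts.
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest


def test_gemma4_language_lora():
    llm = LLM(
        model="google/gemma-4-xb-it",  # placeholder model id
        enable_lora=True,
        max_lora_rank=16,
    )
    out = llm.generate(
        ["Describe this image:"],
        SamplingParams(max_tokens=8),
        lora_request=LoRARequest("dummy", 1, "/path/to/adapter"),
    )
    assert out and out[0].outputs[0].text  # forward pass completes
```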
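
On points 2 and 4, a rough sketch of the pattern in question. The module-path strings and class attributes are assumptions modeled on how other vLLM multimodal models (e.g. LLaVA) declare them, not the actual Gemma4 code:

```python
# Illustrative sketch only: module-path strings and class attributes
# are assumptions, not the actual Gemma4 code.
from vllm.model_executor.models.module_mapping import MultiModelKeys


class Gemma4MappingSketch:
    # Some vLLM versions expect explicit class attributes like these;
    # others derive the LoRA-eligible modules automatically.
    embedding_modules: dict[str, str] = {}
    embedding_padding_modules: list[str] = []

    def get_mm_mapping(self) -> MultiModelKeys:
        # Listing only language_model scopes LoRA to the text backbone.
        # Adding connector/tower entries advertises those modules as
        # LoRA-eligible too, which is the concern in point 2.
        return MultiModelKeys.from_string_field(
            language_model="language_model",
            connector="multi_modal_projector",
            tower_model="vision_tower",
        )
```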

@DarkLight1337 added the ready label (ONLY add when PR is ready to merge/full CI is needed) Apr 16, 2026
@jeejeelee (Collaborator) commented

Don't sync with main anymore. We can merge it if the failures are unrelated.

@allgather (Author) commented Apr 17, 2026

@jeejeelee Understood. I was doing that to re-run CI, my bad. The docs build failure is unrelated and all tests are passing.

@vllm-bot merged commit 640cc9d into vllm-project:main Apr 17, 2026
56 of 57 checks passed
bnellnm pushed a commit to neuralmagic/vllm that referenced this pull request Apr 20, 2026
baonudesifeizhai pushed a commit to baonudesifeizhai/vllm that referenced this pull request Apr 23, 2026
whk-lab pushed a commit to whk-lab/vllm that referenced this pull request Apr 23, 2026
avinashsingh77 pushed a commit to avinashsingh77/vllm that referenced this pull request Apr 27, 2026
@allgather deleted the 2 branch April 30, 2026 12:44
marcospaulo pushed a commit to torad-labs/vllm that referenced this pull request May 3, 2026
mystous pushed a commit to mystous/vllm_hybrid that referenced this pull request May 10, 2026
my-other-github-account pushed a commit to my-other-github-account/vllm that referenced this pull request May 15, 2026
Labels: ready (ONLY add when PR is ready to merge/full CI is needed)

Closes: [Feature]: Add LoRA support for Gemma4ForConditionalGeneration / Gemma 4 models