feat: Add LoRA support for Gemma4ForConditionalGeneration #39291
Conversation
Code Review
This pull request implements LoRA support for the Gemma4 multi-modal model by adding the SupportsLoRA interface and configuring MoE layer remapping. Key changes include updating the maximum token calculations for images and videos, where the image token count is now hardcoded to 1120 to accommodate LoRA tower budgeting. Feedback indicates that this hardcoded value should be defined as a named constant to improve code maintainability.
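A minimal sketch of the named-constant suggestion above, assuming the hardcoded value lives in the Gemma4 multimodal processing code; the constant name and surrounding class are hypothetical, not the actual PR diff:

```python
# Hypothetical constant and class names, for illustration only; the actual
# location of the hardcoded 1120 in the Gemma4 processing code may differ.
GEMMA4_MAX_IMAGE_TOKENS = 1120  # upper bound used for LoRA tower budgeting


class Gemma4ProcessingInfo:
    def get_max_image_tokens(self) -> int:
        # A named constant makes the LoRA tower-budgeting assumption
        # explicit and searchable instead of a bare magic number.
        return GEMMA4_MAX_IMAGE_TOKENS
```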
This pull request has merge conflicts that must be resolved before it can be merged.
Have you tested this with a real LoRA adapter?
@jeejeelee I haven't. I can't get access to compute right now; cloud instances are still unavailable. Are there dedicated LoRA tests you would suggest I run? Side note on the Gemma tests I wanted to run: these tests and their variants are currently hitting an issue caused by the transformers upgrade, and once you force-upgrade to transformers v5, they fail with a type error.
I still think we should validate this with an actual LoRA adapter.
@jeejeelee Sounds good. Let me know if there's already one or if it needs to be made.
I will test it locally
@allgather For Gemma4, we currently should only support adding LoRA to the language model (see: tower module), so functions like …
@jeejeelee I'm a bit confused:
I believe this is done and was merged, meaning that vision + audio are missing LoRA support.
Vision and audio currently rely on the HF AutoModel path. Were you trying to say that we want to switch this whole thing to a vLLM-only implementation? I just committed changes modeled after Qwen3 as much as possible. PTAL
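For reference, a minimal sketch of what a Qwen-style get_mm_mapping() for Gemma4 might look like; the module names (language_model, multi_modal_projector, vision_tower, audio_tower) are assumptions based on the layout discussed above, not confirmed Gemma4 attribute names:

```python
from vllm.model_executor.models.module_mapping import MultiModelKeys


# Sketch only: attribute names are assumed, not taken from the actual
# Gemma4 implementation in this PR.
def get_mm_mapping(self) -> MultiModelKeys:
    tower_model = ["vision_tower"]
    # Include the audio tower only when the loaded checkpoint has one,
    # mirroring the conditional-audio handling noted in the review below.
    if getattr(self, "audio_tower", None) is not None:
        tower_model.append("audio_tower")
    return MultiModelKeys.from_string_field(
        language_model="language_model",
        connector="multi_modal_projector",
        tower_model=tower_model,
    )
```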
jeejeelee left a comment
I've added two comments. Please remove the unrelated code changes; you can complete them in a follow-up PR.
Thank you for the hard work. I am the original submitter of #39246. Just wanted to ask: have you tested with Gemma 4 31B IT?
jeejeelee left a comment
LGTM, please fix pre-commit failure
Nik-Reddy left a comment
Good start on wiring up LoRA for Gemma4: the get_mm_mapping() change to conditionally include audio modules is a nice defensive touch.
A few concerns:
1. No Gemma4-specific LoRA test
The PR description shows passing tests for Qwen2.5VL and Qwen3VL but no Gemma4 LoRA test. Even a minimal test that loads a small Gemma4 model with a dummy LoRA adapter and checks that a forward pass completes would add confidence here (see the sketch after this list).
2. Missing get_num_mm_connector_tokens / get_num_mm_encoder_tokens
The linked issue (#39246) calls out implementing these methods. The maintainer's comment suggests get_num_mm_encoder_tokens should actually be removed for now since only language-model LoRA should be supported at this stage (tower modules need vLLM-native reimplementation first). Could you clarify the current scope: is this PR language-model-only LoRA? If so, the get_mm_mapping with connector/tower entries might confuse the LoRA weight loading into thinking those modules are LoRA-eligible.
3. Merge conflicts
Mergify flagged conflicts. You'll need a rebase before this can move forward.
4. The diff is very small (10 additions) for the feature scope
Other multimodal models with LoRA support (like Qwen2VL) also define supported_lora_modules, embedding_modules, and embedding_padding_modules class attributes. Are those inherited from somewhere, or are they missing?
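A minimal sketch of the smoke test suggested in point 1, assuming vLLM's offline LLM API with LoRA enabled; the model ID and adapter path are placeholders, not real artifacts:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholders: swap in a real small Gemma4 checkpoint and a matching
# language-model-only LoRA adapter before running.
MODEL_ID = "google/gemma-4-tiny-it"       # hypothetical model repo
LORA_PATH = "/path/to/gemma4-dummy-lora"  # hypothetical adapter path


def test_gemma4_lora_smoke():
    llm = LLM(model=MODEL_ID, enable_lora=True, max_lora_rank=16)
    outputs = llm.generate(
        ["Describe the picture in one sentence."],
        SamplingParams(max_tokens=16),
        lora_request=LoRARequest("gemma4-lora", 1, LORA_PATH),
    )
    # Only asserts that the forward pass with the adapter applied completes
    # and produces text; numerical correctness needs a real adapter.
    assert outputs and outputs[0].outputs[0].text
```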
Don't sync with main anymore. We can merge it if the failures are unrelated.
@jeejeelee Understood. I was doing that to re-run CI, my bad. The docs build failure is unrelated and all other tests are passing.
fix #39246 (multimodal part only)
Quoting from the issue:
Tests ran on 1x A100:
- tests/lora/test_qwenvl.py::test_qwen25vl_vision_lora
- tests/models/multimodal/processing/test_gemma4.py::test_limit_mm_per_prompt
- tests/lora/test_qwenvl.py::test_qwen2vl_multiple_lora_types
- tests/lora/test_qwenvl.py::test_qwen3vl_vision_lora

cc @jeejeelee