feat: Add LoRA support for Gemma4ForConditionalGeneration #39291
Conversation
Code Review
This pull request implements LoRA support for the Gemma4 multi-modal model by adding the SupportsLoRA interface and configuring MoE layer remapping. Key changes include updating the maximum token calculations for images and videos, where the image token count is now hardcoded to 1120 to accommodate LoRA tower budgeting. Feedback indicates that this hardcoded value should be defined as a named constant to improve code maintainability.
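A minimal sketch of the named-constant suggestion above, assuming the hardcoded value lives in the Gemma4 multimodal processing code; the constant name and surrounding class are hypothetical, not the actual PR diff:

```python
# Hypothetical constant and class names, for illustration only; the actual
# location of the hardcoded 1120 in the Gemma4 processing code may differ.
GEMMA4_MAX_IMAGE_TOKENS = 1120  # upper bound used for LoRA tower budgeting


class Gemma4ProcessingInfo:
    def get_max_image_tokens(self) -> int:
        # A named constant makes the LoRA tower-budgeting assumption
        # explicit and searchable instead of a bare magic number.
        return GEMMA4_MAX_IMAGE_TOKENS
```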
This pull request has merge conflicts that must be resolved before it can be merged.
Have you tested this with a real LoRA adapter?
@jeejeelee I haven't. I can't get access to compute right now; cloud instances are still unavailable. Are there dedicated LoRA tests you would suggest I run? Side note on the Gemma tests I wanted to run: these tests and their variants are currently hitting an issue caused by the transformers upgrade, and once you force-upgrade to transformers v5, they fail with a type error.
I still think we should validate this with an actual LoRA adapter.
@jeejeelee Sounds good. Let me know if there's already one or if it needs to be made.
I will test it locally
@allgather For Gemma4, we currently should only support adding LoRA to the language model (see: tower module), so functions like …
@jeejeelee I'm a bit confused:
I believe this is done and was merged, meaning that vision + audio are missing LoRA support.
Vision and audio currently rely on the HF AutoModel path. Were you trying to say that we want to switch this whole thing to a vLLM-only implementation? I just committed changes modeled after Qwen3 as much as possible. PTAL
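For reference, a minimal sketch of what a Qwen-style get_mm_mapping() for Gemma4 might look like; the module names (language_model, multi_modal_projector, vision_tower, audio_tower) are assumptions based on the layout discussed above, not confirmed Gemma4 attribute names:

```python
from vllm.model_executor.models.module_mapping import MultiModelKeys


# Sketch only: attribute names are assumed, not taken from the actual
# Gemma4 implementation in this PR.
def get_mm_mapping(self) -> MultiModelKeys:
    tower_model = ["vision_tower"]
    # Include the audio tower only when the loaded checkpoint has one,
    # mirroring the conditional-audio handling noted in the review below.
    if getattr(self, "audio_tower", None) is not None:
        tower_model.append("audio_tower")
    return MultiModelKeys.from_string_field(
        language_model="language_model",
        connector="multi_modal_projector",
        tower_model=tower_model,
    )
```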
jeejeelee left a comment
I've added two comments. Please remove the unrelated code changes; you can complete them in a follow-up PR.
Thank you for the hard work. I am the original submitter of #39246. Just wanted to ask: have you tested with Gemma 4 31B IT?
jeejeelee left a comment
LGTM, please fix pre-commit failure
Nik-Reddy left a comment
Good start on wiring up LoRA for Gemma4: the get_mm_mapping() change to conditionally include audio modules is a nice defensive touch.
A few concerns:
1. No Gemma4-specific LoRA test
The PR description shows passing tests for Qwen2.5VL and Qwen3VL but no Gemma4 LoRA test. Even a minimal test that loads a small Gemma4 model with a dummy LoRA adapter and checks that a forward pass completes would add confidence here (see the sketch after this list).
2. Missing get_num_mm_connector_tokens / get_num_mm_encoder_tokens
The linked issue (#39246) calls out implementing these methods. The maintainer's comment suggests get_num_mm_encoder_tokens should actually be removed for now since only language-model LoRA should be supported at this stage (tower modules need vLLM-native reimplementation first). Could you clarify the current scope: is this PR language-model-only LoRA? If so, the get_mm_mapping with connector/tower entries might confuse the LoRA weight loading into thinking those modules are LoRA-eligible.
3. Merge conflicts
Mergify flagged conflicts. You'll need a rebase before this can move forward.
4. The diff is very small (10 additions) for the feature scope
Other multimodal models with LoRA support (like Qwen2VL) also define supported_lora_modules, embedding_modules, and embedding_padding_modules class attributes. Are those inherited from somewhere, or are they missing?
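A minimal sketch of the smoke test suggested in point 1, assuming vLLM's offline LLM API with LoRA enabled; the model ID and adapter path are placeholders, not real artifacts:

```python
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest

# Placeholders: swap in a real small Gemma4 checkpoint and a matching
# language-model-only LoRA adapter before running.
MODEL_ID = "google/gemma-4-tiny-it"       # hypothetical model repo
LORA_PATH = "/path/to/gemma4-dummy-lora"  # hypothetical adapter path


def test_gemma4_lora_smoke():
    llm = LLM(model=MODEL_ID, enable_lora=True, max_lora_rank=16)
    outputs = llm.generate(
        ["Describe the picture in one sentence."],
        SamplingParams(max_tokens=16),
        lora_request=LoRARequest("gemma4-lora", 1, LORA_PATH),
    )
    # Only asserts that the forward pass with the adapter applied completes
    # and produces text; numerical correctness needs a real adapter.
    assert outputs and outputs[0].outputs[0].text
```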
Don't sync with main anymore. We can merge it if the failures are unrelated.
@jeejeelee Understood. I was doing that to re-run CI, my bad. The docs build failure is unrelated and all other tests are passing.
fix #39246 (multimodal part only)
Quoting from the issue:
Tests ran on 1x A100:
- tests/lora/test_qwenvl.py::test_qwen25vl_vision_lora
- tests/models/multimodal/processing/test_gemma4.py::test_limit_mm_per_prompt
- tests/lora/test_qwenvl.py::test_qwen2vl_multiple_lora_types
- tests/lora/test_qwenvl.py::test_qwen3vl_vision_lora

cc @jeejeelee