add gemma4 by Qubitium · Pull Request #2663 · ModelCloud/GPTQModel

Qubitium · 2026-04-02T22:40:46Z

No description provided.

Qubitium · 2026-04-02T23:47:17Z

+    # The local env does not ship Marlin runtime kernels, so validation reloads must stay on Torch.
+    LOAD_BACKEND = BACKEND.TORCH
+    # Gemma 4 full-attention layers expand to 512-dim heads, which FlashAttention cannot execute.
+    USE_FLASH_ATTN = False


Gemma 4 does not work with FlashAttention 2 (FA2). Is this an HF model upload issue specific to the Gemma 4 variants, or did Transformers just ship bad unit tests?

huggingface/transformers#45202
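For context on the `USE_FLASH_ATTN = False` override above: FlashAttention 2 kernels support head dimensions only up to 256, so a layer with 512-dim heads cannot run on them. A minimal sketch of that guard follows. The helper name `pick_attn_implementation` and the example head-dim list are hypothetical and not part of this PR; the returned strings are the values Transformers accepts for `attn_implementation`.

```python
# FlashAttention 2 supports head dimensions up to 256.
FA2_MAX_HEAD_DIM = 256


def pick_attn_implementation(head_dims, fa2_available=True):
    """Hypothetical helper: choose FA2 only if every layer's head dim fits.

    head_dims: iterable of per-layer attention head dimensions.
    Returns a string usable as `attn_implementation` in Transformers.
    """
    if fa2_available and all(d <= FA2_MAX_HEAD_DIM for d in head_dims):
        return "flash_attention_2"
    # Fall back to PyTorch scaled-dot-product attention.
    return "sdpa"


if __name__ == "__main__":
    # Assumed layout for illustration only: some layers at 256, one at 512.
    print(pick_attn_implementation([256, 256, 512]))
```

A single oversized layer is enough to force the fallback, which matches the all-or-nothing flag used in the test config.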

Qubitium added 4 commits April 2, 2026 22:39

add gemma4

7fab0df

Update README.md

e0c62a1

Bump version from 6.0.0 to 6.0.2

9db3fa9

Merge branch 'main' into gemma4

0c31231

Qubitium merged commit 25dd8ea into main Apr 2, 2026
5 checks passed

Qubitium deleted the gemma4 branch April 2, 2026 22:53

Qubitium merged 4 commits into main from gemma4

1 participant

Conversation

Qubitium commented Apr 2, 2026

Uh oh!

Uh oh!

Qubitium Apr 2, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant