Fixing SW issue in Gemma3 #740

Merged
quic-hemagnih merged 13 commits into quic:main from qcdipankar:gemma_fix
Jan 28, 2026

@qcdipankar qcdipankar commented Jan 19, 2026

The sliding-window (SW) issue occurred when prompt length + generation length exceeded the SW size.

Fix

  1. Cache updated with HybridSlidingWindowCache in cache utils
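
For readers unfamiliar with this failure mode, here is a minimal sketch of how a fixed-size sliding-window buffer can handle positions past the window using wrap-around (modulo) indexing. This is an illustration under assumptions, not the HybridSlidingWindowCache code from this PR; the helper names `sliding_window_slot` and `write_tokens` and the window size of 4 are hypothetical.

```python
# Hypothetical illustration only; NOT the QEfficient HybridSlidingWindowCache code.

def sliding_window_slot(position: int, window: int) -> int:
    """Map an absolute token position to a slot in a fixed-size window buffer."""
    return position % window


def write_tokens(cache: list, positions: list, values: list, window: int) -> None:
    """Write each value into the slot for its position, overwriting the oldest entry."""
    for pos, val in zip(positions, values):
        cache[sliding_window_slot(pos, window)] = val


window = 4                 # stand-in for the model's sliding-window size (SW)
cache = [None] * window

# Prefill a 3-token prompt, then generate 3 more tokens (3 + 3 > SW).
write_tokens(cache, [0, 1, 2], ["p0", "p1", "p2"], window)
for step, pos in enumerate(range(3, 6)):
    write_tokens(cache, [pos], [f"g{step}"], window)

print(cache)  # ['g1', 'g2', 'p2', 'g0']
```

Once prompt + generation length exceeds the window, new entries overwrite the oldest slots instead of indexing past the end of the buffer, which is the regime the description above refers to.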

Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
@qcdipankar qcdipankar marked this pull request as draft January 19, 2026 14:56
@qcdipankar qcdipankar self-assigned this Jan 19, 2026
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
@qcdipankar qcdipankar marked this pull request as ready for review January 21, 2026 06:13
Comment thread examples/image_text_to_text/models/gemma_vision/gemma3_example.py Outdated
Comment thread examples/image_text_to_text/models/gemma_vision/gemma3_example.py Outdated
residual = hidden_states
hidden_states = self.input_layernorm(hidden_states)
past_seen_tokens = past_key_value.get_seq_length() if past_key_value is not None else 0
# past_seen_tokens = past_key_value.get_seq_length() if past_key_value is not None else 0

remove this line

Apologies, I need to remove the comments.

remove this comment

Comment thread QEfficient/transformers/cache_utils.py
Comment thread QEfficient/transformers/cache_utils.py
Comment thread QEfficient/transformers/cache_utils.py
Comment thread QEfficient/transformers/cache_utils.py
Comment thread QEfficient/transformers/cache_utils.py Outdated
Comment thread QEfficient/transformers/cache_utils.py

@ochougul ochougul left a comment

Please rerun using the update method instead of the update_old method from PR https://github.com/quic/efficient-transformers/pull/719/changes

That is the code we want. Let me know if it produces inaccurate results.

Comment thread QEfficient/transformers/cache_utils.py Outdated
Comment thread QEfficient/transformers/cache_utils.py
Comment thread QEfficient/transformers/cache_utils.py Outdated
Comment on lines +21 to +23
# For Testing Purpose Only atleast 6 layers are required
# config.text_config.num_hidden_layers = 6
# config.vision_config.num_hidden_layers = 6

Remove these lines; we should run the full model in the example file.

@quic-hemagnih quic-hemagnih merged commit 75bf976 into quic:main Jan 28, 2026
4 checks passed
qcdipankar added a commit to qcdipankar/efficient-transformers that referenced this pull request Jan 30, 2026
The SW issue came with prompt + generation length > SW.

Fix
1. Cache updated with HybridSlidingWindowCache in cache utils

---------

Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
quic-rishinr pushed a commit to qcdipankar/efficient-transformers that referenced this pull request Feb 3, 2026
quic-hemagnih added a commit that referenced this pull request Feb 4, 2026
Signed-off-by: Dipankar Sarkar <dipankar@qti.qualcomm.com>
Co-authored-by: Hem Agnihotri <hemagnih@qti.qualcomm.com>
tchawada pushed a commit to tchawada/QEff_tanisha that referenced this pull request Feb 4, 2026
tchawada pushed a commit to tchawada/QEff_tanisha that referenced this pull request Feb 4, 2026
tchawada pushed a commit to tchawada/QEff_tanisha that referenced this pull request Feb 4, 2026
tchawada pushed a commit to tchawada/QEff_tanisha that referenced this pull request Feb 5, 2026
qcdipankar added a commit to qcdipankar/efficient-transformers that referenced this pull request Feb 8, 2026
tchawada pushed a commit to tchawada/QEff_tanisha that referenced this pull request Feb 16, 2026
smedhe pushed a commit to smedhe/QEff_Sharvari that referenced this pull request Mar 8, 2026
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>
smedhe pushed a commit to smedhe/QEff_Sharvari that referenced this pull request Mar 8, 2026
Signed-off-by: Sharvari Medhe <smedhe@qti.qualcomm.com>