Fixed Quantization bug in TransformerLens 3.0 #1276
Merged
Description
While reviewing #684, I noticed that for BnB-4bit Mistral:
`q_proj.weight`, `k_proj.weight`, `v_proj.weight`, and `o_proj.weight` are all `Params4bit` with dtype `torch.uint8` (packed). The bridge picks the first param's dtype as the "compute dtype" and casts `hidden_states` from fp32 to uint8, which clamps all values to [0, 255] and truncates decimals. HF's `MistralAttention.forward` then runs on this destroyed input and produces gibberish.
By ensuring this type cast on `hidden_states` is never done with integer dtypes, we get bit-precise results when compared against HuggingFace. Regression tests were also added that will fail loudly if this cast is ever unintentionally restored for quantized models.
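A sketch of the kind of regression check the new tests perform, reusing the hypothetical `maybe_cast_hidden_states` helper from the sketch above; the actual tests in the PR may instead exercise the full quantized model against HuggingFace:

```python
import torch


def test_hidden_states_not_cast_to_integer_dtype():
    # Simulate a packed 4-bit weight: its storage dtype is torch.uint8.
    packed_weight = torch.nn.Parameter(
        torch.randint(0, 255, (16, 16), dtype=torch.uint8), requires_grad=False
    )
    hidden_states = torch.randn(2, 4, 16, dtype=torch.float32)

    # Assumes maybe_cast_hidden_states from the sketch above is importable.
    out = maybe_cast_hidden_states(hidden_states, packed_weight)

    # Activations must keep a floating-point dtype; an integer cast would
    # clamp values to [0, 255] and drop decimals, destroying the input.
    assert out.dtype.is_floating_point
    assert torch.equal(out, hidden_states)
```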
Type of change

Checklist: