
Fix quantized model dispatch with device_map='auto' #39468

Open
Krish0909 wants to merge 2 commits into huggingface:main from Krish0909:fix-quantized-model-dispatch

Conversation


@Krish0909 Krish0909 commented Jul 17, 2025

What does this PR do?

Fixes issue #39461 where quantized models fail to load with device_map="auto" on Linux but work on Windows.

Root cause: dispatch_model calls model.to(device) on bitsandbytes quantized models, which is not supported for 4-bit/8-bit models. Platform differences in device detection/handling cause this code path to be hit on Linux but not on Windows.

Fix: Skip dispatch_model for bitsandbytes quantized models, since their weights are already placed on the correct devices during quantization.
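A minimal sketch of the guard described above (the function name should_dispatch and the "bitsandbytes" string are illustrative, not the exact identifiers in the transformers diff):

```python
def should_dispatch(device_map, quantization_method):
    """Decide whether dispatch_model should run after loading.

    bitsandbytes 4-bit/8-bit models raise a ValueError when moved with
    .to(device), and their weights are already on the right devices
    after quantization, so dispatch is skipped for them.
    """
    if quantization_method == "bitsandbytes":
        # Already dispatched during quantization; calling .to() would fail.
        return False
    # Otherwise, dispatch only when an explicit device_map was requested.
    return device_map is not None
```

With this guard, a bitsandbytes model loaded with device_map="auto" is left where the quantizer placed it instead of being re-dispatched.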

Fixes #39461

Quantization experts (bitsandbytes, autogpt), please verify: @SunMarc @MekkCyber

Krishnan Vignesh and others added 2 commits July 17, 2025 14:32
- Skip dispatch_model for bitsandbytes quantized models
- Quantized models are already placed on correct devices
- Fixes ValueError: .to is not supported for 4-bit/8-bit models


Successfully merging this pull request may close these issues.

I can't make sense of this works on Windows but not on Linux AutoModelForCausalLM.from_pretrained
