
Skipping initialize_weights when model is quantized#39464

Open
DWarez wants to merge 1 commit into huggingface:main from DWarez:skip-init-quantized
Conversation

DWarez (Contributor) commented Jul 17, 2025

What does this PR do?

Avoids performing weight initialization when loading a quantized model. The previous implementation checked the is_quantized condition only when initializing weights with DeepSpeed.

Fixes #39366
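A minimal sketch of the guard this PR describes (the class and function names below are illustrative assumptions, not the real transformers internals): initialization is skipped for any quantized model, rather than only inside the DeepSpeed branch, which avoids calling float-only initializers on packed int8 weights.

```python
class TinyModel:
    """Stand-in for a loaded model; a real quantized checkpoint carries
    packed int8 weights that float initializers cannot touch."""

    def __init__(self, dtype="int8", quantized=True):
        self.dtype = dtype
        self.is_quantized = quantized
        self.initialized = False

    def initialize_weights(self):
        if self.dtype == "int8":
            # Mirrors the reported failure: normal_() on an int8 tensor raises
            raise RuntimeError("normal_ not implemented for int8 weights")
        self.initialized = True


def maybe_initialize_weights(model):
    """Hypothetical helper showing the fixed control flow: check
    is_quantized up front, not only under the DeepSpeed code path."""
    if model.is_quantized:
        return model  # quantized weights are already materialized; skip init
    model.initialize_weights()
    return model
```

With this ordering, loading a quantized model never reaches `initialize_weights`, while a non-quantized model is initialized as before.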

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

@SunMarc @zach-huggingface

Rocketknight1 (Member) commented:

cc @MekkCyber as well!

DWarez (Contributor, Author) commented Jul 26, 2025

@SunMarc any feedback on this?



Development

Successfully merging this pull request may close these issues.

RuntimeError when loading llmcompressor W8A8 quantized model: int8 dtype in weight initialization
