
Rename matmul_4bits_quantizer.py to matmul_nbits_quantizer.py #24472

Merged
tianleiwu merged 3 commits into main from tlwu/rename_nbits_quantizier on Apr 19, 2025

Conversation

@tianleiwu (Contributor) commented Apr 19, 2025

Description

  • Rename the file and class, since the quantizer now supports both 4-bit and 8-bit weights.
  • Update HQQWeightOnlyQuantizer to support 8-bit quantization.
  • Update related comments.

Motivation and Context

#24384 added 8-bit support for the default weight-only quantizer.
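For context, MatMulNBits-style weight-only quantization maps each block of float weights to signed integers at a chosen bit width, with one scale per block. The sketch below is purely illustrative (plain Python, not the onnxruntime implementation) and shows why the same code path covers both 4 and 8 bits:

```python
def quantize_block_symmetric(weights, bits):
    """Symmetric per-block quantization to signed `bits`-bit integers.

    Illustrative only: maps floats into [-(2**(bits-1) - 1), 2**(bits-1) - 1]
    using a single scale per block, as weight-only MatMul quantizers do.
    """
    qmax = 2 ** (bits - 1) - 1
    amax = max(abs(w) for w in weights) or 1.0  # avoid divide-by-zero on all-zero blocks
    scale = amax / qmax
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize_block(q, scale):
    """Recover approximate float weights from quantized values and the block scale."""
    return [v * scale for v in q]
```

At 8 bits the integer grid is sixteen times finer than at 4 bits, so dequantized weights land closer to the originals; only the `bits` argument changes, which is the motivation for an n-bits name.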

@tianleiwu tianleiwu requested a review from fajin-corp April 19, 2025 00:04

@github-actions bot left a comment

You can commit the suggested changes from lintrunner.

@tianleiwu tianleiwu force-pushed the tlwu/rename_nbits_quantizier branch from 459307e to 538309d on April 19, 2025 00:11
@jiafatom (Contributor) left a comment

Can we rename matmul_4bits_quantizer.py to matmul_quantizer.py? Previously we had 4 bits only, which is why "4bits" was put in the name. It seems like a general matmul quantizer to me.

@tianleiwu (Contributor, Author) commented Apr 19, 2025

> Can we rename matmul_4bits_quantizer.py to matmul_quantizer.py? Previously we had 4 bits only, which is why "4bits" was put in the name. It seems like a general matmul quantizer to me.

The nbits naming follows the MatMulNBits op.
matmul_quantizer is too general: weights could also be quantized to fp8 or fp4, which would require a differently named quantizer.
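For downstream users, one way to stay compatible across the rename is a small import shim. Per this PR (and the Olive commit referencing it), onnxruntime exposes MatMulNBitsQuantizer in matmul_nbits_quantizer from 1.22.0, while older releases expose MatMul4BitsQuantizer in matmul_4bits_quantizer. The helper below is an illustrative sketch, not part of onnxruntime:

```python
def load_weight_only_quantizer():
    """Return the weight-only MatMul quantizer class across onnxruntime versions.

    Tries the post-rename module (onnxruntime >= 1.22.0) first, then falls
    back to the older 4-bit-only module name. Raises ImportError if neither
    module is available (e.g. onnxruntime is not installed).
    """
    try:
        # New name introduced by this PR: supports 4- and 8-bit weights.
        from onnxruntime.quantization.matmul_nbits_quantizer import MatMulNBitsQuantizer
        return MatMulNBitsQuantizer
    except ImportError:
        # Pre-1.22.0 name: 4-bit only.
        from onnxruntime.quantization.matmul_4bits_quantizer import MatMul4BitsQuantizer
        return MatMul4BitsQuantizer
```

Callers can then construct whichever class is available without pinning a specific onnxruntime version.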

@tianleiwu tianleiwu merged commit 0d26928 into main Apr 19, 2025
86 of 89 checks passed
@tianleiwu tianleiwu deleted the tlwu/rename_nbits_quantizier branch April 19, 2025 06:09
xiaoyu-work pushed a commit to microsoft/Olive that referenced this pull request Apr 23, 2025
## Describe your changes

onnxruntime uses MatMulNBitsQuantizer since 1.22.0, due to
microsoft/onnxruntime#24472

ashrit-ms pushed a commit that referenced this pull request Apr 24, 2025
### Description

* Rename the file and class, since the quantizer now supports both 4-bit and 8-bit weights.
* Update HQQWeightOnlyQuantizer to support 8-bit quantization.
* Update related comments.

### Motivation and Context
#24384 added 8-bit support for the default weight-only quantizer.
intbf pushed a commit to intbf/onnxruntime that referenced this pull request Apr 25, 2025
…oft#24472)

### Description

* Rename the file and class, since the quantizer now supports both 4-bit and 8-bit weights.
* Update HQQWeightOnlyQuantizer to support 8-bit quantization.
* Update related comments.

### Motivation and Context
microsoft#24384 added 8-bit support for the default weight-only quantizer.

Signed-off-by: bfilipek <bartlomiej.filipek@intel.com>

Labels: none yet. Projects: none yet.

4 participants