Remove CompressedLinear support for compressed-tensors > 0.13 #43747
Open
colldata79 wants to merge 2 commits into huggingface:main
Conversation
- Stop passing run_compressed to apply_quantization_config
- Always decompress models after loading for CT > 0.13
- Add _dequantize method to CompressedTensorsHfQuantizer
- Remove tests that reference the deleted CompressedLinear class

Related to vllm-project/llm-compressor#2279
Author
The CI failure in test_processing_layoutlmv2.py is unrelated to this PR; no tokenizer/processor code was modified. Can a maintainer please verify?
Member
cc @dsikka
kylesayrs
reviewed
Feb 5, 2026
Contributor
kylesayrs
left a comment
A couple changes are needed, but overall this is the correct approach, thanks for the changes!
@colldata79 Please let me know when all the changes are made, and I'll test them e2e, then we can ask for approval to merge this in.
Contributor
[For maintainers] Suggested jobs to run (before merge): run-slow: compressed_tensors_integration
…store and add tests Signed-off-by: Your Name <your.email@example.com>
Force-pushed from c561daf to 5d76c36 (Compare)
Author
@kylesayrs, the fix is in place. Please go ahead with the end-to-end test.
This was referenced Apr 29, 2026
Title: Remove CompressedLinear support for compressed-tensors > 0.13
Body:
What does this PR do?
Prepares transformers for the removal of
CompressedLinear from compressed-tensors (v0.14+). Users should now call model.dequantize() after loading to decompress quantized models.

Changes:
- Stop passing run_compressed to apply_quantization_config
- Add _dequantize method to CompressedTensorsHfQuantizer
- Remove tests that reference the deleted CompressedLinear class (test_default_run_compressed__True, test_default_run_compressed__False)

Backwards compatibility is maintained for compressed-tensors ≤ 0.13.
Related to vllm-project/llm-compressor#2279
Companion PR: vllm-project/compressed-tensors#564
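The _dequantize hook described above can be pictured as a thin delegation from the quantizer to a compressed-tensors model compressor. The classes and the scale-based "compression" below are toy stand-ins for illustration only, not the real CompressedTensorsHfQuantizer or compressed-tensors APIs.

```python
class ToyCompressor:
    """Toy stand-in for a compressed-tensors model compressor."""

    def decompress(self, state: dict) -> dict:
        # Undo a fake compression that stored weights divided by a scale.
        scale = state["__scale__"]
        return {k: v * scale for k, v in state.items() if k != "__scale__"}


class ToyQuantizer:
    """Toy stand-in for an HfQuantizer exposing a _dequantize hook."""

    def __init__(self, compressor):
        self.compressor = compressor

    def _dequantize(self, state: dict) -> dict:
        # model.dequantize() would route through a hook like this to expand
        # compressed weights back into dense parameters.
        return self.compressor.decompress(state)


quantizer = ToyQuantizer(ToyCompressor())
dense = quantizer._dequantize({"layer.weight": 2.0, "__scale__": 4.0})
print(dense)  # {'layer.weight': 8.0}
```

The design point is that the quantizer owns no decompression logic of its own; it only forwards to the compressor, which is why removing CompressedLinear can be handled entirely on the compressed-tensors side.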
Before submitting
Who can review?
@SunMarc @MekkCyber (quantization)