
fix compile #1774

Open

wenhuach21 wants to merge 2 commits into main from fix_compile

Conversation

@wenhuach21
Contributor

Description

Please briefly describe your main changes and the motivation behind them.

Type of Change

Bug fix

Related Issues

Fixes or relates to #

Checklist Before Submitting

  • My code has been tested locally.
  • Documentation has been updated as needed.
  • New or updated tests are included where applicable.
  • The CUDA CI has passed. You can trigger it by commenting /azp run Unit-Test-CUDA-AutoRound.

Signed-off-by: Wenhua Cheng <wenhua.cheng@intel.com>
Copilot AI review requested due to automatic review settings on April 30, 2026 at 14:34
Contributor

Copilot AI left a comment


Pull request overview

This PR appears intended to address a compilation-related issue by disabling torch.compile usage for block_forward in both the legacy and new compressor paths, as well as in the quantization algorithm base.
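
For context, compile_func in the snippets below is assumed to be a thin helper that routes a callable through torch.compile for a target device; a minimal sketch under that assumption (not the project's verified implementation):

import torch

def compile_func(fn, device):
    # Assumed behavior: wrap the callable with torch.compile. The device
    # argument is accepted here only to match the call sites quoted below;
    # the real helper in auto_round may pick device-specific backends or guards.
    return torch.compile(fn)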

Changes:

  • Commented out block_forward compile/selection logic in the legacy compressor base.
  • Commented out block_forward compile/selection logic in the new compressor base hardware setup.
  • Removed the enable_torch_compile branch that compiled/cached block_forward in quantization base resolution.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

File Description

  • auto_round/compressors_new/base.py: Disables (comments out) block-forward compile/selection during hardware setup.
  • auto_round/compressors/base.py: Disables (comments out) the legacy block_forward assignment/compile logic in __init__.
  • auto_round/algorithms/quantization/base.py: Removes the compiled/cached block_forward resolution branch, making resolution always return plain block_forward.

Comment on lines +536 to +548

# if (
#     (self.act_bits < 16 and (not self.act_dynamic or self.data_type == "nvfp"))  # have hooks
#     or self.enable_alg_ext  # Use imatrix
#     or not self.disable_opt_rtn  # Use imatrix
# ):
#     self.block_forward = block_forward
# else:
#     # TODO FIXME
#     # This function could not be compiled, causing a large accuracy drop when `enable_alg_ext` is used.
#     # To avoid issues, remove it in all scenarios except WOQ.
#     self.block_forward = (
#         compile_func(block_forward, self.device) if self.enable_torch_compile else block_forward
#     )

Comment on lines +974 to +977

# if self.enable_torch_compile and not _needs_plain_forward and self.need_calib:
#     self.block_forward = compile_func(block_forward, self.compress_context.device)
# else:
#     self.block_forward = block_forward
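
Taken together, the two commented-out hunks above mean that neither compressor path routes block_forward through compile_func any more; per the PR overview, the net effect is equivalent to the following one-liner (a sketch, assuming the plain assignment is what remains in each setup path):

# Net effect of both disabled branches (sketch, not the literal diff):
self.block_forward = block_forward  # always the plain, uncompiled function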

Comment on lines 377 to 381

    self.config.is_act_quantize and (not self.config.act_dynamic or self.config.is_act_nv_fp)
) or self.enable_alg_ext:
    self._resolved_block_forward = block_forward
elif self.compress_context.enable_torch_compile:
    compiled = self.__dict__.get("_compiled_block_forward")
    if compiled is None:
        compiled = compile_func(block_forward, self.compress_context.device)
        self._compiled_block_forward = compiled
    self._resolved_block_forward = compiled
else:
    self._resolved_block_forward = block_forward
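
The removed branch above used a compile-once-and-cache pattern: compile block_forward the first time it is resolved, stash the result in the instance dict, and reuse it on later calls. A self-contained sketch of that pattern, substituting torch.compile for the project's compile_func (the Resolver class and its names are hypothetical, for illustration only):

import torch

class Resolver:
    """Hypothetical sketch of the compile-once-and-cache pattern."""

    def __init__(self, fn, enable_compile):
        self._fn = fn
        self._enable_compile = enable_compile

    def resolve(self):
        # Plain function when compilation is disabled.
        if not self._enable_compile:
            return self._fn
        # Cache the compiled callable on the instance so the compile
        # warm-up cost is paid only once, mirroring _compiled_block_forward.
        compiled = self.__dict__.get("_compiled_fn")
        if compiled is None:
            compiled = torch.compile(self._fn)
            self.__dict__["_compiled_fn"] = compiled
        return compiled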