feat(qualcomm): PTQPass add constant ptq impl. #593

Merged
chenghuaWang merged 2 commits into UbiquitousLearning:main from chenghuaWang:wch-main
Jan 9, 2026

Conversation

@chenghuaWang (Collaborator) commented Jan 8, 2026

Summary by CodeRabbit

Release Notes

  • Bug Fixes
    • Improved quantization handling for constant tensors across asymmetric and symmetric quantization modes, ensuring proper value computation and bounds checking during the quantization process.


@coderabbitai Bot (Contributor) commented Jan 8, 2026

📝 Walkthrough

Runtime PTQ quantization handling is added for constant tensors in the PTQPass. The enhancement processes constants through the AsymPerTensor and SymPerTensor quantization paths, computing PTQ-quantized values from the scale and zero_point parameters, clamping them to the quant_min/quant_max bounds, and persisting results to both attribute data and tensor storage.

Changes

Cohort: PTQ Constant Tensor Quantization
File(s): mllm/backends/qnn/aot/passes/PTQPass.cpp
Summary: Adds runtime PTQ handling for constant tensors in _recursiveSolveNormalImpl. Implements separate branches for the AsymPerTensor (using scale and zero_point) and SymPerTensor (using scale only) quantization paths. Computes PTQ-quantized values, clamps them to the valid range, and writes results to both the constant attribute data (VectorFP32Attr/VectorInt16Attr) and the corresponding TV tensor storage. (+84 lines)

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Whiskers twitching with delight,
Constants quantized, nice and tight,
Scales and zeros, clamped just right,
PTQ magic shines so bright,
Tensors stored both day and night!

🚥 Pre-merge checks | ✅ 1 | ❌ 2
❌ Failed checks (2 warnings)
  • Description check ⚠️ Warning — No pull request description was provided by the author, leaving required context and explanation of the changes missing. Resolution: add a comprehensive pull request description explaining the changes, their purpose, and impact; reference the repository's contribution guidelines for the expected format.
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 0.00%, which is below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (1 passed)
  • Title check ✅ Passed — The title clearly summarizes the main change: adding PTQ (Post-Training Quantization) constant implementation to the PTQPass in the Qualcomm backend.



@coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @mllm/backends/qnn/aot/passes/PTQPass.cpp:
- Around line 155-168: The local variable constant_dtype in PTQPass.cpp is never
used after being set; remove the declaration and all assignments to
constant_dtype inside the block that handles tv->getAttr("constant") (the
branches where constant_ir is VectorFP32Attr and VectorInt16Attr) to eliminate
dead code, and apply the same removal for the SymPerTensor branch (the analogous
constant_dtype declaration/assignments in the SymPerTensor handling section) so
no unused constant_dtype remains.
🧹 Nitpick comments (1)
mllm/backends/qnn/aot/passes/PTQPass.cpp (1)

205-241: Extract duplicated constant quantization logic into a helper function.

Lines 205-241 duplicate nearly all logic from lines 154-192 (the AsymPerTensor branch), with the only meaningful difference being the absence of zero_point in the quantization formula. This violates the DRY principle and increases maintenance burden.

Consider extracting a helper function like:

void quantizeConstantTensor(
    const ir::tensor::TensorValue::ptr_t& tv,
    const Tensor& scale,
    const Tensor* zero_point,  // nullptr for symmetric
    int32_t quant_min,
    int32_t quant_max) {
  // Unified logic for reading, quantizing, and writing constant values
}

Then call it from both branches:

case kAsymPerTensor:
  quantizeConstantTensor(tv, scale, &zero_point, min_v, max_v);
  break;
case kSymPerTensor:
  quantizeConstantTensor(tv, scale, nullptr, min_v, max_v);
  break;
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR, between dc11090 and e26e61b.

📒 Files selected for processing (1)
  • mllm/backends/qnn/aot/passes/PTQPass.cpp
🧰 Additional context used
📓 Path-based instructions (4)
{mllm,mllm-cli,pymllm}/**/*

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*: Files must not contain C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, or DEL 0x7F. Horizontal tab (0x09) and line feed (0x0A) are explicitly allowed.
All files must be encoded in UTF-8 without BOM.
Any violation of character set (Rule 1) or encoding (Rule 2) requirements must cause the review to fail.
No line may end with trailing whitespace.
Use Unix line endings (LF).
File and directory names must consist only of printable Unicode characters, excluding C0 control codes 0x00–0x08, 0x0B–0x0C, 0x0E–0x1F, C1 control codes 0x7F–0x9F, and DEL 0x7F.
Only use acceptable file extensions: .c, .cc, .cpp, .cxx, .h, .hh, .hpp, .py, .pyi, .sh, .txt, .md, .yml, .yaml, .json, .toml.
Optional license headers, if present, must comply with character set rules (no C0/C1 control codes except tab and line feed).

Files:

  • mllm/backends/qnn/aot/passes/PTQPass.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi,sh}: TODO and FIXME comments must be written as 'TODO:' or 'FIXME:' followed by UTF-8 text that adheres to character set rules.
Encourage consistent coding style and patterns with the existing codebase.
Ensure code is portable across supported platforms (e.g., Linux, Windows) unless explicitly platform-specific.

Files:

  • mllm/backends/qnn/aot/passes/PTQPass.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,py,pyi}: Prioritize production-ready code quality by evaluating time and space complexity of algorithms and data structures, and suggest more efficient alternatives for operations with high complexity (e.g., O(n^2) or worse) when feasible.
Avoid unnecessary object creation in loops or hot paths.
Check for proper error handling and resource cleanup (e.g., using try-finally, context managers, or RAII).
Ensure functions that can fail return appropriate error codes or raise exceptions.
Validate inputs for public APIs and critical internal functions.
Add comments for complex algorithms or non-obvious logic.
Identify potential security issues (e.g., buffer overflows, injection risks, insecure temporary files) and recommend using secure alternatives (e.g., parameterized queries, secure random generators).
Suggest adding unit tests for untested complex logic or edge cases.
Ensure code is testable by avoiding global state and using dependency injection.
Flag overly complex functions (e.g., high cyclomatic complexity) and suggest breaking them down.
Use named constants instead of magic numbers.
Add appropriate logging (e.g., debug, info, warning, error) for significant events and errors, avoiding sensitive data exposure.

Files:

  • mllm/backends/qnn/aot/passes/PTQPass.cpp
{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}

📄 CodeRabbit inference engine (.github/copilot-instructions.md)

{mllm,mllm-cli,pymllm}/**/*.{c,cc,cpp,cxx,h,hh,hpp,py,pyi}: Ensure public APIs, classes, and functions have clear docstrings or comments explaining purpose, parameters, returns, and errors.
Adhere to language-specific best practices and idioms (e.g., PEP 8 for Python, Google C++ Style Guide for C++).

Files:

  • mllm/backends/qnn/aot/passes/PTQPass.cpp
🪛 Clang (14.0.6)
mllm/backends/qnn/aot/passes/PTQPass.cpp

[error] 156-156: variable 'constant_dtype' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 160-160: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 165-165: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 184-184: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 188-188: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 207-207: variable 'constant_dtype' is not initialized

(cppcoreguidelines-init-variables,-warnings-as-errors)


[error] 211-211: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 216-216: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 233-233: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)


[error] 237-237: variable name 'ci' is too short, expected at least 3 characters

(readability-identifier-length,-warnings-as-errors)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: build-macos
  • GitHub Check: build-x86
  • GitHub Check: build-android
🔇 Additional comments (2)
mllm/backends/qnn/aot/passes/PTQPass.cpp (2)

183-186: Verify type safety when storing quantized values in float32 containers.

The code writes the quantized integer value (ptq_constant_v, type mllm_int32_t) into a VectorFP32Attr and accesses the tensor as mllm_fp32_t. While implicit conversion from int32 to float32 will occur, this may not preserve the intended semantics of storing a quantized integer. After quantization, values are typically stored in integer-typed containers.

Please confirm this is the intended design and that downstream code correctly interprets these float-typed containers as holding quantized integer values.


188-190: Ensure quant_min and quant_max are within int16 range.

When writing ptq_constant_v (int32) to VectorInt16Attr and tensor_.at<mllm_int16_t>(), narrowing conversion occurs. While clamping to [quant_min, quant_max] should prevent truncation, verify that these bounds are correctly configured to fit within int16 range [-32768, 32767] for 16-bit quantization specs.

Comment on lines +155 to +168
double constant_v = 0;
DataTypes constant_dtype = kFloat32;
if (tv->getAttr("constant")) {
  auto constant_ir = tv->getAttr("constant");
  if (constant_ir->isa_<ir::VectorFP32Attr>()) {
    auto ci = constant_ir->cast_<ir::VectorFP32Attr>();
    MLLM_RT_ASSERT_EQ(ci->data().size(), 1);
    constant_v = ci->data()[0];
    constant_dtype = kFloat32;
  } else if (constant_ir->isa_<ir::VectorInt16Attr>()) {
    auto ci = constant_ir->cast_<ir::VectorInt16Attr>();
    MLLM_RT_ASSERT_EQ(ci->data().size(), 1);
    constant_v = ci->data()[0];
    constant_dtype = kInt16;

⚠️ Potential issue | 🟡 Minor

Remove unused variable constant_dtype.

The variable constant_dtype is assigned on lines 163 and 168 but never used afterward. This is dead code that should be removed unless there's a planned future use.

🧹 Proposed cleanup
 // Check if this tensor is constant tensor. Then we need to quantize it.
 double constant_v = 0;
-DataTypes constant_dtype = kFloat32;
 if (tv->getAttr("constant")) {
   auto constant_ir = tv->getAttr("constant");
   if (constant_ir->isa_<ir::VectorFP32Attr>()) {
     auto ci = constant_ir->cast_<ir::VectorFP32Attr>();
     MLLM_RT_ASSERT_EQ(ci->data().size(), 1);
     constant_v = ci->data()[0];
-    constant_dtype = kFloat32;
   } else if (constant_ir->isa_<ir::VectorInt16Attr>()) {
     auto ci = constant_ir->cast_<ir::VectorInt16Attr>();
     MLLM_RT_ASSERT_EQ(ci->data().size(), 1);
     constant_v = ci->data()[0];
-    constant_dtype = kInt16;
   } else {

Apply the same cleanup to lines 206-219 in the SymPerTensor branch.



@UbiquitousLearning (Owner) left a comment

LGTM
