Integrate Automated QDQ placement tool - part 2.1 by willg-nv · Pull Request #844 · NVIDIA/Model-Optimizer

willg-nv · 2026-02-03T03:08:36Z

What does this PR do?

This PR implements the RegionPattern class. A RegionPattern describes the local topology of a Region. Regions with the same pattern can be autotuned together. The best insertion points for a given pattern can also be saved to accelerate the next QDQ autotuning run.
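To illustrate the idea of tuning regions with the same pattern together, here is a minimal sketch. It does not use the RegionPattern class from this PR: a region is reduced to a list of op-type names, and `toy_signature`, `group_by_signature`, and `best_scheme_cache` are hypothetical names chosen for illustration only.

```python
from collections import defaultdict

# Hypothetical stand-in: a region is just a list of op-type names.
# The real Region and RegionPattern classes carry far more structure.
def toy_signature(op_types: list[str]) -> str:
    """Build a simple order-preserving signature from op-type names."""
    return "->".join(op_types)


def group_by_signature(regions: dict[int, list[str]]) -> dict[str, list[int]]:
    """Map each signature to the ids of the regions that share it."""
    groups: dict[str, list[int]] = defaultdict(list)
    for region_id, ops in regions.items():
        groups[toy_signature(ops)].append(region_id)
    return dict(groups)


regions = {
    0: ["Conv", "Relu"],
    1: ["MatMul", "Add"],
    2: ["Conv", "Relu"],
}
groups = group_by_signature(regions)

# Tune once per signature and reuse the result for every matching region.
# The cached value is a placeholder string, not a real insertion scheme.
best_scheme_cache: dict[str, str] = {}
for signature, region_ids in groups.items():
    if signature not in best_scheme_cache:
        best_scheme_cache[signature] = f"scheme tuned on region {region_ids[0]}"

print(groups)
```

Regions 0 and 2 end up in one group, so in this sketch only two tuning runs would be needed for three regions.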

Usage

python -m modelopt.onnx.quantization.autotune.region_search --model model.onnx --verbose

    ├─ Region 212 (Level 0, Type: COMPOSITE)
    │  ├─ Direct nodes: 0
    │  ├─ Total nodes (recursive): 9
    │  ├─ Children: 1
    │  ├─ Inputs: 3 tensors
    │  │    - xxx
    │  │    - xxx
    │  │    - xxx
    │  └─ Outputs: 1 tensors
    │       - xxx
    │
    │  Child regions:
    │
      ├─ Region 209 (Level 2, Type: LEAF) 
      │  ├─ Direct nodes: 9
      │  ├─ Total nodes (recursive): 9
      │  ├─ Children: 0
      │  ├─ Inputs: 11 tensors
      │  │    - xxx

Testing

Before your PR is "Ready for review"

Make sure you read and follow Contributor guidelines and your commits are signed.
Is this change backward compatible?: Yes
Did you write any new necessary tests?: Yes
Did you add or update any necessary documentation?: No; the documentation update is in Part 4.
Did you update Changelog?: The CHANGELOG can be updated after all changes are ready.

Additional Information

Summary by CodeRabbit

Release Notes

New Features
- Enhanced ONNX quantization analysis with improved region pattern matching and comparison capabilities.
- Added utility to identify quantized tensors in models for better analysis.
Tests
- Comprehensive test coverage for region pattern functionality and quantization utilities.

copy-pr-bot · 2026-02-03T03:08:40Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

coderabbitai · 2026-02-03T03:08:54Z

Important

Review skipped

Auto incremental reviews are disabled on this repository.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

This pull request adds utilities for ONNX operation categorization, introduces a new RegionPattern class for structural pattern analysis of ONNX regions, provides a helper function to identify quantized tensors, and includes comprehensive unit tests for the new pattern matching functionality.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Operation Categorization Helpers `modelopt/onnx/op_types.py` | Introduces eight new utility functions (get_bool_ops, get_bitwise_ops, get_value_check_ops, get_comparison_ops, get_conditional_ops, get_aggregation_ops, get_set_ops, get_symmetric_ops) that return sets of ONNX operation names grouped by category. |
| Region Pattern Matching `modelopt/onnx/quantization/autotune/region_pattern.py` | Adds RegionPattern class for structural pattern representation and matching of ONNX region graphs. Includes recursive signature computation, pattern comparison, insertion point resolution, and tree formatting utilities. |
| Quantization Utilities `modelopt/onnx/quantization/qdq_utils.py` | Implements get_quantized_tensors() function to identify tensor inputs to DequantizeLinear nodes in ONNX models. |
| Region Pattern Tests `tests/unit/onnx/quantization/autotune/test_region_pattern.py` | Adds comprehensive unit test coverage for RegionPattern class, including pattern creation, equality, hashing, tree formatting, symmetric operation handling, and pattern-to-region matching semantics. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title 'Integrate Automated QDQ placement tool - part 2.1' accurately reflects the PR objectives of implementing RegionPattern for QDQ autotuning. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 94.12% which is sufficient. The required threshold is 80.00%. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@modelopt/onnx/quantization/autotune/region_pattern.py`:
- Around line 204-232: The function _compute_signature_recursive currently
ignores out-of-range indices from Region.get_nodes which can produce incorrect
signatures; update this method to validate each index returned by
region.get_nodes against the length of graph.nodes and raise a clear exception
(e.g., ValueError) when any index is >= len(nodes_list) or negative, instead of
silently skipping; perform this check before building node_ops (reference
Region.get_nodes, nodes_list, _make_node_with_params_signature, and
_compute_signature_recursive) so invalid region definitions fail fast and
surface the error to callers.
- Around line 166-179: Replace the assert in get_full_insertion_scheme with an
explicit exception so validation can't be skipped under -O flags: after
computing region_pattern = RegionPattern.from_region(region, graph) check
equality with self and raise a clear exception (e.g., ValueError or
RuntimeError) if they differ, including identifying context (self and
region_pattern or region) to aid debugging; keep the rest of the function
(construction of InsertionScheme and collecting insertion points via
NodeInputInsertionPoint.collect_from_region,
ChildRegionInputInsertionPoint.collect_from_region, and
ChildRegionOutputInsertionPoint.collect_from_region) unchanged.
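
The two suggestions above follow the same fail-fast idea and can be sketched as follows. This is not the code in region_pattern.py: `collect_op_types` and `require_same_pattern` are hypothetical helpers, and patterns are reduced to plain strings to keep the example self-contained.

```python
def collect_op_types(node_indices: list[int], op_types: list[str]) -> list[str]:
    """Return the op types at the given indices, rejecting invalid indices.

    Hypothetical helper: out-of-range or negative indices raise ValueError
    instead of being silently skipped.
    """
    count = len(op_types)
    bad = [i for i in node_indices if i < 0 or i >= count]
    if bad:
        raise ValueError(
            f"Region references node indices {bad} outside the range [0, {count})"
        )
    return [op_types[i] for i in node_indices]


def require_same_pattern(expected: str, actual: str) -> None:
    """Raise if two pattern signatures differ.

    Unlike an assert statement, this check still runs under `python -O`,
    where assert statements are stripped.
    """
    if expected != actual:
        raise ValueError(
            f"Region pattern mismatch: expected {expected!r}, got {actual!r}"
        )


graph_ops = ["Conv", "Relu", "Add"]
print(collect_op_types([0, 2], graph_ops))

try:
    collect_op_types([0, 3], graph_ops)
except ValueError as err:
    print(f"rejected: {err}")
```

Raising ValueError (rather than asserting) keeps the validation active in optimized runs and gives callers a message that names the offending indices or patterns.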

🧹 Nitpick comments (1)

modelopt/onnx/quantization/qdq_utils.py (1)
1040-1064: Add basic ModelProto validation for safer usage.

get_quantized_tensors currently assumes a valid ONNX model with a populated graph; aligning with other helpers in this module avoids accidental AttributeError and provides clearer errors.
🛠️ Suggested change
 def get_quantized_tensors(onnx_model: onnx.ModelProto) -> set[str]:
     """Get the names of all quantized tensors from an ONNX model.
@@
     Returns:
         Set of tensor names that are inputs to DequantizeLinear nodes
         (i.e., the tensors being dequantized)
     """
+    if not isinstance(onnx_model, onnx.ModelProto):
+        raise ValueError("Input must be an ONNX model protobuf")
+    if not onnx_model.graph or not onnx_model.graph.node:
+        return set()
+
     quantized_tensors = set()
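
For readers unfamiliar with the helper under discussion, the lookup it describes can be sketched without the onnx package. The function below is a hypothetical stand-in, not the implementation in qdq_utils.py: nodes are simple objects with `op_type` and `input` attributes, and the sketch assumes that "the tensor being dequantized" is the first input of each DequantizeLinear node.

```python
from types import SimpleNamespace


def quantized_tensor_names(nodes) -> set[str]:
    """Collect the first input name of every DequantizeLinear node.

    Hypothetical stand-in: `nodes` is any iterable of objects exposing
    `op_type` and `input`, mimicking the fields of an ONNX NodeProto.
    An empty or missing node list yields an empty set.
    """
    names: set[str] = set()
    for node in nodes or []:
        if node.op_type == "DequantizeLinear" and node.input:
            names.add(node.input[0])
    return names


# Toy graph: activation and weight each pass through a DequantizeLinear node.
nodes = [
    SimpleNamespace(op_type="QuantizeLinear", input=["x", "x_scale", "x_zp"]),
    SimpleNamespace(op_type="DequantizeLinear", input=["x_q", "x_scale", "x_zp"]),
    SimpleNamespace(op_type="DequantizeLinear", input=["w_q", "w_scale"]),
    SimpleNamespace(op_type="Conv", input=["x_dq", "w_dq"]),
]
print(sorted(quantized_tensor_names(nodes)))
```

Returning an empty set for an empty graph mirrors the early return proposed in the suggested change.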

ajrasane

LGTM

codecov · 2026-02-03T07:56:30Z

Codecov Report

❌ Patch coverage is 73.21429% with 45 lines in your changes missing coverage. Please review.
✅ Project coverage is 73.69%. Comparing base (e247f5d) to head (e40667d).
⚠️ Report is 7 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...elopt/onnx/quantization/autotune/region_pattern.py | 75.94% | 38 Missing ⚠️ |
| modelopt/onnx/quantization/qdq_utils.py | 12.50% | 7 Missing ⚠️ |

Additional details and impacted files

@@           Coverage Diff            @@
##             main     #844    +/-   ##
========================================
  Coverage   73.69%   73.69%            
========================================
  Files         196      197     +1     
  Lines       20432    20600   +168     
========================================
+ Hits        15057    15181   +124     
- Misses       5375     5419    +44
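
The percentages in this report can be reproduced from the hit and line counts shown above. The sketch below is only arithmetic on those figures; the patch size of 168 lines is an assumption taken from the "+168" in the Lines row, which is consistent with the reported 45 missing lines and 73.21429% patch coverage.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage of tracked lines."""
    return 100.0 * hits / lines


# Project coverage before and after, from the Hits and Lines rows.
base = coverage_percent(15057, 20432)
head = coverage_percent(15181, 20600)

# Patch coverage, assuming the patch contains 168 tracked lines, 45 of them missed.
patch = coverage_percent(168 - 45, 168)

print(f"base={base:.2f}% head={head:.2f}% patch={patch:.5f}%")
```

Because the patch coverage is close to the existing project coverage, the project figure stays at 73.69% after rounding.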

willg-nv · 2026-02-03T08:13:34Z

/ok to test 3a6b6d4

gcunhase

LGTM as well

gcunhase · 2026-02-03T18:26:46Z

/ok to test 3a6b6d4

gcunhase · 2026-02-03T19:13:57Z

@willg-nv before merging, can you please recover important information from docstrings here? Thanks!

willg-nv · 2026-02-04T01:38:17Z

/ok to test 68dabc3

Signed-off-by: Will Guo <willg@nvidia.com>

ajrasane · 2026-02-04T21:32:06Z

/ok to test e40667d

willg-nv · 2026-02-06T01:44:59Z

@gcunhase could you help me merge this PR? thanks!

willg-nv requested a review from a team as a code owner February 3, 2026 03:08

willg-nv requested a review from cjluo-nv February 3, 2026 03:08

coderabbitai bot reviewed Feb 3, 2026

willg-nv force-pushed the dev-willg-integrate-auto-qdq-placement-part2.1 branch from e166860 to 21afd97 Compare February 3, 2026 03:31