Integrate Automated QDQ placement tool - part 2.1 #844
Walkthrough
This pull request adds utilities for ONNX operation categorization, introduces a new RegionPattern class for structural pattern analysis of ONNX regions, provides a helper function to identify quantized tensors, and includes comprehensive unit tests for the new pattern matching functionality.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@modelopt/onnx/quantization/autotune/region_pattern.py`:
- Around line 204-232: The function _compute_signature_recursive currently
ignores out-of-range indices from Region.get_nodes which can produce incorrect
signatures; update this method to validate each index returned by
region.get_nodes against the length of graph.nodes and raise a clear exception
(e.g., ValueError) when any index is >= len(nodes_list) or negative, instead of
silently skipping; perform this check before building node_ops (reference
Region.get_nodes, nodes_list, _make_node_with_params_signature, and
_compute_signature_recursive) so invalid region definitions fail fast and
surface the error to callers.
- Around line 166-179: Replace the assert in get_full_insertion_scheme with an
explicit exception so validation can't be skipped under -O flags: after
computing region_pattern = RegionPattern.from_region(region, graph) check
equality with self and raise a clear exception (e.g., ValueError or
RuntimeError) if they differ, including identifying context (self and
region_pattern or region) to aid debugging; keep the rest of the function
(construction of InsertionScheme and collecting insertion points via
NodeInputInsertionPoint.collect_from_region,
ChildRegionInputInsertionPoint.collect_from_region, and
ChildRegionOutputInsertionPoint.collect_from_region) unchanged.
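Both suggestions above follow the same fail-fast idea. The following is a minimal sketch of that idea, not the code from this PR: `collect_node_ops` and `check_pattern_matches` are hypothetical helpers, `node_indices` stands in for the result of `Region.get_nodes`, and `op_types` stands in for the op types of `graph.nodes`.

```python
from collections.abc import Iterable, Sequence


def collect_node_ops(node_indices: Iterable[int], op_types: Sequence[str]) -> list[str]:
    """Return op types for the given node indices, rejecting invalid indices.

    Both parameters are illustrative stand-ins (assumptions), not the real
    Region or graph objects.
    """
    indices = sorted(node_indices)
    num_nodes = len(op_types)
    # Validate before building the op list, so a bad region fails loudly
    # instead of silently producing a shorter (and wrong) signature.
    bad = [i for i in indices if i < 0 or i >= num_nodes]
    if bad:
        raise ValueError(
            f"Region references node indices {bad} outside the valid range [0, {num_nodes})"
        )
    return [op_types[i] for i in indices]


def check_pattern_matches(expected_signature: str, actual_signature: str) -> None:
    """Raise explicitly instead of using assert, which is stripped under -O."""
    if expected_signature != actual_signature:
        raise ValueError(
            f"Region pattern mismatch: expected {expected_signature!r}, "
            f"got {actual_signature!r}"
        )
```

With these helpers, an invalid region definition surfaces as a `ValueError` at the call site whether or not Python runs with optimizations enabled.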
🧹 Nitpick comments (1)
modelopt/onnx/quantization/qdq_utils.py (1)
1040-1064: Add basic ModelProto validation for safer usage.
`get_quantized_tensors` currently assumes a valid ONNX model with a populated graph; aligning with other helpers in this module avoids an accidental `AttributeError` and provides clearer errors.

Suggested change:

     def get_quantized_tensors(onnx_model: onnx.ModelProto) -> set[str]:
         """Get the names of all quantized tensors from an ONNX model.
    @@
         Returns:
             Set of tensor names that are inputs to DequantizeLinear nodes
             (i.e., the tensors being dequantized)
         """
    +    if not isinstance(onnx_model, onnx.ModelProto):
    +        raise ValueError("Input must be an ONNX model protobuf")
    +    if not onnx_model.graph or not onnx_model.graph.node:
    +        return set()
    +
         quantized_tensors = set()
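To illustrate the behavior the docstring describes, here is a self-contained sketch that does not depend on the `onnx` package. `FakeNode` is a hypothetical stand-in for `onnx.NodeProto`, `get_quantized_tensor_names` is an illustrative name rather than the function in `qdq_utils.py`, and taking only the first input of each `DequantizeLinear` node is an assumption about how the real helper selects tensors.

```python
from dataclasses import dataclass, field


@dataclass
class FakeNode:
    """Hypothetical stand-in for onnx.NodeProto (op type and input names only)."""

    op_type: str
    input: list[str] = field(default_factory=list)


def get_quantized_tensor_names(nodes: list[FakeNode]) -> set[str]:
    """Collect the data input of every DequantizeLinear node.

    Returns an empty set for an empty node list, mirroring the guard in the
    suggested change.
    """
    if not nodes:
        return set()
    quantized: set[str] = set()
    for node in nodes:
        # In ONNX, the first input of DequantizeLinear is the quantized tensor;
        # the remaining inputs are the scale and optional zero point.
        if node.op_type == "DequantizeLinear" and node.input:
            quantized.add(node.input[0])
    return quantized
```

For a `QuantizeLinear` followed by a `DequantizeLinear` that consumes `x_q`, the sketch returns only `x_q`; scales and zero points are never reported as quantized tensors.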
e166860 to 21afd97
5af753b to 3a6b6d4
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@           Coverage Diff            @@
##             main     #844    +/-   ##
========================================
  Coverage   73.69%   73.69%
========================================
  Files         196      197     +1
  Lines       20432    20600   +168
========================================
+ Hits        15057    15181   +124
- Misses       5375     5419    +44
/ok to test 3a6b6d4
/ok to test 3a6b6d4
/ok to test 68dabc3
68dabc3 to 8400521
Signed-off-by: Will Guo <willg@nvidia.com>
Signed-off-by: Will Guo <willg@nvidia.com>
Signed-off-by: Will Guo <willg@nvidia.com>
8400521 to e40667d
/ok to test e40667d
@gcunhase could you help me merge this PR? Thanks!
## What does this PR do?
This PR implements the `RegionPattern` class. A `RegionPattern` describes
the local topology of a `Region`. Regions with the same pattern can be
autotuned together, and the best insertion points found for a given
pattern can be saved to speed up subsequent QDQ autotuning runs.
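As a rough illustration of why a shared pattern lets regions be tuned together, the sketch below groups toy regions by a structural signature. Everything here is an assumption made for illustration: `ToyRegion`, `signature`, and `group_by_pattern` are hypothetical names, and the signature format is not the one `RegionPattern` actually computes.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Hypothetical region: direct op types plus nested child regions."""

    region_id: int
    ops: list[str] = field(default_factory=list)
    children: list["ToyRegion"] = field(default_factory=list)


def signature(region: ToyRegion) -> str:
    """Build a structural signature that ignores region ids and tensor names."""
    ops_sig = "->".join(region.ops)
    child_sigs = ",".join(signature(child) for child in region.children)
    return f"[{ops_sig}|{child_sigs}]"


def group_by_pattern(regions: list[ToyRegion]) -> dict[str, list[int]]:
    """Map each signature to the ids of the regions that share it."""
    groups: dict[str, list[int]] = defaultdict(list)
    for region in regions:
        groups[signature(region)].append(region.region_id)
    return dict(groups)
```

Two `Conv -> Relu` regions with different ids land in the same group, so one tuning result could in principle be stored under the shared signature and reused for both.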
**Overview:** ?
## Usage
```bash
python -m modelopt.onnx.quantization.autotune.region_search --model model.onnx --verbose
```
```
├─ Region 212 (Level 0, Type: COMPOSITE)
│ ├─ Direct nodes: 0
│ ├─ Total nodes (recursive): 9
│ ├─ Children: 1
│ ├─ Inputs: 3 tensors
│ │ - xxx
│ │ - xxx
│ │ - xxx
│ └─ Outputs: 1 tensors
│ - xxx
│
│ Child regions:
│
├─ Region 209 (Level 2, Type: LEAF)
│ ├─ Direct nodes: 9
│ ├─ Total nodes (recursive): 9
│ ├─ Children: 0
│ ├─ Inputs: 11 tensors
│ │ - xxx
```
## Testing
<!-- Mention how have you tested your change if applicable. -->
## Before your PR is "*Ready for review*"
<!-- If you haven't finished some of the above items you can still open
`Draft` PR. -->
- **Make sure you read and follow [Contributor
guidelines](https://github.com/NVIDIA/Model-Optimizer/blob/main/CONTRIBUTING.md)**
and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: Yes
- **Did you add or update any necessary documentation?**: No, the
documentation update is in Part 4
- **Did you update
[Changelog](https://github.com/NVIDIA/Model-Optimizer/blob/main/CHANGELOG.rst)?**:
The CHANGELOG update can be done after all changes are ready.
## Additional Information
<!-- E.g. related issue. -->
<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit
## Release Notes
* **New Features**
* Enhanced ONNX quantization analysis with improved region pattern
matching and comparison capabilities.
* Added utility to identify quantized tensors in models for better
analysis.
* **Tests**
* Comprehensive test coverage for region pattern functionality and
quantization utilities.
<!-- end of auto-generated comment: release notes by coderabbit.ai -->
---------
Signed-off-by: Will Guo <willg@nvidia.com>