Add pre-commit hook to validate modelopt recipes#1215
Add pre-commit hook to validate modelopt recipes#1215shengliangxu wants to merge 2 commits intomainfrom
Conversation
Reject modelopt recipes that are in some way invalid. Right now check: 1. PTQ YAML quant_cfg need to be in list format instead of the deprecated dict-format. 2. Check general recipe validity. It checks correct top-level keys (e.g. ptq_cfg instead of quantize). And when modelopt is installed, each recipe is also loaded via load_recipe() to catch structural and validation errors. The hook only works on changed or newly added recipes. Signed-off-by: Shengliang Xu <shengliangx@nvidia.com>
📝 WalkthroughWalkthroughThe PR introduces a new pre-commit hook for validating ModelOpt recipe files, consisting of a configuration entry and a new validation script that checks recipe structure, metadata, and quantizer configurations. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
|
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/precommit/check_modelopt_recipes.py`:
- Around line 79-82: The current logic returns an empty list when metadata is
missing or not a dict (metadata variable) which causes malformed recipes to be
treated as non-recipes; change those early returns to append descriptive error
messages to the errors list (e.g., "missing or invalid metadata" / "missing
metadata.recipe_type") and continue validation instead of returning [];
similarly replace the other return at the similar check with an errors.append so
all malformed single-file recipes produce errors; finally, when importing
modelopt (the modelopt import block), ensure import failures do not
short-circuit validation — either fall back to a minimal in-file schema check
and append a clear import-failure error to errors or catch ImportError and
continue with basic validations so invalid recipes are still reported rather
than silently passing.
- Around line 50-53: The loop over quant_cfg currently skips non-dict entries
which allows malformed configs; change the behavior in the for i, entry in
enumerate(quant_cfg) loop to treat any non-dict entry as a failure: when not
isinstance(entry, dict) raise or return a clear error/exit (include the index i
and the offending value) instead of continue, and keep the existing
"quantizer_name" presence check for dict entries so the hook strictly enforces a
list-of-dicts for quant_cfg.
- Around line 106-116: The loop over ("quantize.yml","quantize.yaml") currently
always executes break unconditionally which prevents reporting a missing
quantize file; fix by moving the break so it only runs when a
quantize_file.is_file() branch is hit (i.e., break inside that if) and after the
loop check whether a quantize file was found—if not, append an error to errors
(use the same errors list) indicating the quantize.yml|yaml is missing; keep
using _load_yaml, quant_cfg, _check_quant_cfg and the quantize_file/dir_path
variables to locate and validate the file.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 46086238-1377-4e9d-8d5e-b2ee95b831b8
📒 Files selected for processing (2)
.pre-commit-config.yamltools/precommit/check_modelopt_recipes.py
| for i, entry in enumerate(quant_cfg): | ||
| if not isinstance(entry, dict): | ||
| continue | ||
| if "quantizer_name" not in entry: |
There was a problem hiding this comment.
Enforce list-of-dicts strictly in quant_cfg.
At Line 52, non-dict entries are silently ignored, so malformed quant_cfg can pass the hook.
Proposed fix
elif isinstance(quant_cfg, list):
for i, entry in enumerate(quant_cfg):
if not isinstance(entry, dict):
- continue
+ errors.append(
+ f"{label}: quant_cfg[{i}] must be a mapping with an explicit 'quantizer_name' key."
+ )
+ continue
if "quantizer_name" not in entry:
errors.append(
f"{label}: quant_cfg[{i}] is missing 'quantizer_name'. "
"Each entry must have an explicit 'quantizer_name' key. "
"See https://nvidia.github.io/Model-Optimizer/guides/_quant_cfg.html"
)📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for i, entry in enumerate(quant_cfg): | |
| if not isinstance(entry, dict): | |
| continue | |
| if "quantizer_name" not in entry: | |
| for i, entry in enumerate(quant_cfg): | |
| if not isinstance(entry, dict): | |
| errors.append( | |
| f"{label}: quant_cfg[{i}] must be a mapping with an explicit 'quantizer_name' key." | |
| ) | |
| continue | |
| if "quantizer_name" not in entry: |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/precommit/check_modelopt_recipes.py` around lines 50 - 53, The loop
over quant_cfg currently skips non-dict entries which allows malformed configs;
change the behavior in the for i, entry in enumerate(quant_cfg) loop to treat
any non-dict entry as a failure: when not isinstance(entry, dict) raise or
return a clear error/exit (include the index i and the offending value) instead
of continue, and keep the existing "quantizer_name" presence check for dict
entries so the hook strictly enforces a list-of-dicts for quant_cfg.
| metadata = data.get("metadata") | ||
| if not isinstance(metadata, dict) or "recipe_type" not in metadata: | ||
| return [] # not a recipe file | ||
|
|
There was a problem hiding this comment.
Don’t treat missing recipe fields as “not a recipe.”
At Line 81 and Line 92, malformed single-file recipes can return no errors. If modelopt isn’t importable (Line 123), these invalid recipes pass pre-commit.
Proposed fix
metadata = data.get("metadata")
-if not isinstance(metadata, dict) or "recipe_type" not in metadata:
- return [] # not a recipe file
+if not isinstance(metadata, dict):
+ return [f"{label}: missing required top-level 'metadata' mapping."]
+if "recipe_type" not in metadata:
+ return [f"{label}: missing required 'metadata.recipe_type' field."]
if "ptq_cfg" in data and "quantize" not in data:
errors.append(
f"{label}: uses 'ptq_cfg' as the top-level key. "
"PTQ recipes must use 'quantize' instead."
)
quant_section = data["ptq_cfg"]
elif "quantize" in data:
quant_section = data["quantize"]
else:
- return errors
+ return [f"{label}: PTQ recipe must contain top-level 'quantize'."]Also applies to: 91-92
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/precommit/check_modelopt_recipes.py` around lines 79 - 82, The current
logic returns an empty list when metadata is missing or not a dict (metadata
variable) which causes malformed recipes to be treated as non-recipes; change
those early returns to append descriptive error messages to the errors list
(e.g., "missing or invalid metadata" / "missing metadata.recipe_type") and
continue validation instead of returning []; similarly replace the other return
at the similar check with an errors.append so all malformed single-file recipes
produce errors; finally, when importing modelopt (the modelopt import block),
ensure import failures do not short-circuit validation — either fall back to a
minimal in-file schema check and append a clear import-failure error to errors
or catch ImportError and continue with basic validations so invalid recipes are
still reported rather than silently passing.
| for name in ("quantize.yml", "quantize.yaml"): | ||
| quantize_file = dir_path / name | ||
| if quantize_file.is_file(): | ||
| data = _load_yaml(quantize_file) | ||
| if data is not None: | ||
| quant_cfg = data.get("quant_cfg") | ||
| if quant_cfg is not None: | ||
| errors.extend(_check_quant_cfg(quant_cfg, str(quantize_file))) | ||
| break | ||
|
|
||
| return errors |
There was a problem hiding this comment.
Directory recipes should fail when quantize.yml|yaml is missing.
Currently this function returns no error if neither quantize file exists, which creates another false-negative path when loader validation is skipped.
Proposed fix
def _check_dir_recipe(dir_path: Path) -> list[str]:
"""Check a directory-format recipe (recipe.yml + quantize.yml)."""
errors: list[str] = []
+ found_quantize = False
for name in ("quantize.yml", "quantize.yaml"):
quantize_file = dir_path / name
if quantize_file.is_file():
+ found_quantize = True
data = _load_yaml(quantize_file)
if data is not None:
quant_cfg = data.get("quant_cfg")
if quant_cfg is not None:
errors.extend(_check_quant_cfg(quant_cfg, str(quantize_file)))
break
+ if not found_quantize:
+ errors.append(
+ f"{dir_path}: missing quantize file. Looked for quantize.yml or quantize.yaml."
+ )
return errors🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/precommit/check_modelopt_recipes.py` around lines 106 - 116, The loop
over ("quantize.yml","quantize.yaml") currently always executes break
unconditionally which prevents reporting a missing quantize file; fix by moving
the break so it only runs when a quantize_file.is_file() branch is hit (i.e.,
break inside that if) and after the loop check whether a quantize file was
found—if not, append an error to errors (use the same errors list) indicating
the quantize.yml|yaml is missing; keep using _load_yaml, quant_cfg,
_check_quant_cfg and the quantize_file/dir_path variables to locate and validate
the file.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1215 +/- ##
==========================================
- Coverage 75.68% 71.73% -3.95%
==========================================
Files 353 353
Lines 40491 40491
==========================================
- Hits 30644 29048 -1596
- Misses 9847 11443 +1596
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
What does this PR do?
Reject modelopt recipes that are in someway invalid.
Right now check:
PTQ YAML quant_cfg need to be in list format instead of the deprecated dict-format.
Check general recipe validity. It checks correct top-level keys (e.g. ptq_cfg instead of quantize). And when modelopt is installed, each recipe is also loaded via load_recipe() to catch structural and validation errors.
The hook only works on changed or newly added recipes.
Testing
local test using invalid recipes
Summary by CodeRabbit