feat(pt_expt): add frozen model #5318
wanghan-iapcm wants to merge 5 commits into deepmodeling:master from
Conversation
Load a pre-frozen model file (.pte or any format) via convert_backend serialization, reconstruct with BaseModel.deserialize, and delegate all model API methods to the inner model. Cannot be trained.
…module
- Create dpmodel FrozenModel (NativeOP + BaseModel) with all delegation methods, so pt_expt can inherit instead of duplicating
- Rewrite pt_expt FrozenModel to use @torch_module wrapping the dpmodel class
- Override __init__ to handle .pte files natively via serialize_from_file, falling back to generic backend detection for other formats
- Override serialize() to delegate directly to the inner model (unlike pt, which must reconstruct from model_def_script due to the opaque ScriptModule)
- Add pt_expt support to the frozen consistency test using BaseModel as pt_expt_class (same pattern as pt)
- Guard setUpModule model generation with backend availability checks
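The delegation design described above can be sketched in a few lines. This is an illustrative stand-in, not the actual deepmd API: the real dpmodel FrozenModel spells out each delegation method explicitly, while the sketch below conveys the same idea with `__getattr__` forwarding, and the class and method names are hypothetical.

```python
class InnerModel:
    """Stand-in for a deserialized dpmodel model."""

    def get_type_map(self):
        return ["O", "H"]

    def get_rcut(self):
        return 6.0


class FrozenWrapper:
    """Forwards every model-API call to the wrapped inner model."""

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, name):
        # Only invoked when normal attribute lookup fails, so the
        # wrapper's own attributes (e.g. _inner) are not delegated.
        return getattr(self._inner, name)


wrapped = FrozenWrapper(InnerModel())
print(wrapped.get_type_map())  # → ['O', 'H']
print(wrapped.get_rcut())      # → 6.0
```

A wrapper like this keeps the frozen model's public surface identical to the inner model's without duplicating every method body.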
📝 Walkthrough
Adds a FrozenModel wrapper class in dpmodel and a pt_expt variant that loads/deserializes frozen model files and delegates model behavior to an inner model; tests are updated to support the pt_expt backend.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Client
participant FrozenModel
participant Backend
participant Serializer
participant InnerModel
Client->>FrozenModel: __init__(model_file)
alt model_file ends with .pte
FrozenModel->>Serializer: serialize_from_file(model_file)
Serializer-->>FrozenModel: serialized_dict
else other formats
FrozenModel->>Backend: detect_backend_by_model(model_file)
Backend-->>FrozenModel: detected_backend
FrozenModel->>Serializer: backend.serialize(model_file)
Serializer-->>FrozenModel: serialized_dict
end
FrozenModel->>InnerModel: BaseModel.deserialize(serialized_dict)
InnerModel-->>FrozenModel: inner_model_instance
FrozenModel->>InnerModel: inner_model.eval()
FrozenModel-->>Client: ready (self.model set)
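The branch in the diagram above (native `.pte` loading vs. generic backend detection) can be sketched as follows. The helper functions here are stubs standing in for the real serialization utilities, so the return values and names are assumptions for illustration only.

```python
from pathlib import Path


# Stubs standing in for the real serialization helpers.
def serialize_from_file(path):
    return {"loader": "pte-native", "path": path}


def detect_backend_and_serialize(path):
    return {"loader": "backend-detected", "path": path}


def load_serialized(model_file):
    if Path(model_file).suffix == ".pte":
        # dedicated .pte branch
        return serialize_from_file(model_file)
    # generic fallback for other formats (.pth, .pb, .dp, ...)
    return detect_backend_and_serialize(model_file)


print(load_serialized("frozen.pte")["loader"])  # → pte-native
print(load_serialized("frozen.pb")["loader"])   # → backend-detected
```

Either branch ends in the same place: a serialized dict handed to BaseModel.deserialize to reconstruct the inner model.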
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📝 Coding Plan
Actionable comments posted: 1
🧹 Nitpick comments (2)
deepmd/dpmodel/model/frozen.py (1)
141-153: Make unsupported paths fail loudly.
FrozenModel is documented as non-trainable, but update_sel() currently returns success and deserialize() raises a non-actionable error. Turning both into explicit failures would make accidental training/deserialization misuse much easier to diagnose.
🔧 Suggested change
 `@classmethod`
 def deserialize(cls, data: dict) -> NoReturn:
-    raise RuntimeError("Should not touch here.")
+    raise RuntimeError(
+        "FrozenModel cannot be deserialized directly; deserialize the inner model data instead."
+    )

 `@classmethod`
 def update_sel(
     cls,
     train_data: DeepmdDataSystem,
     type_map: list[str] | None,
     local_jdata: dict,
 ) -> tuple[dict, float | None]:
     """Update the selection and perform neighbor statistics."""
-    return local_jdata, None
+    raise RuntimeError(
+        "FrozenModel cannot be used for training or neighbor-statistics updates."
+    )

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/dpmodel/model/frozen.py` around lines 141 - 153, The FrozenModel class currently masks misuse by returning from update_sel and raising a vague error in deserialize; change both to fail loudly: in FrozenModel.deserialize(cls, data) raise a clear RuntimeError (or NotImplementedError) with a message like "FrozenModel is non-deserializable/non-trainable; operation not supported", and in FrozenModel.update_sel(cls, train_data, type_map, local_jdata) raise the same explicit exception instead of returning (remove the tuple return), referencing the methods deserialize and update_sel on class FrozenModel so accidental training or deserialization attempts fail fast and provide actionable messages.

source/tests/consistent/model/test_frozen.py (1)
54-60: Add coverage for the new .pte loader path.
deepmd/pt_expt/model/frozen.py, Lines 22-29, now has a dedicated .pte branch, but this test module still only provisions and parameterizes .pth, .pb, and .dp fixtures. That leaves the main new path in this PR unexercised. If .pte is pt_expt-only, a small targeted test is enough.

Also applies to: 75-76
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@source/tests/consistent/model/test_frozen.py` around lines 54 - 60, The test setup and parameterization omit the new .pte loader path; update setUpModule to provision the .pte model by calling case.get_model(".pte", pte_model) (guard with INSTALLED_PT_EXPT if .pte is only available when pt_expt is installed) and update the test parameterization near the parameter lines (around where .pth/.pb/.dp are listed) to include the ".pte" case so the new deepmd/pt_expt/model/frozen.py branch is exercised.
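The fixture-parameterization fix the review asks for amounts to appending the new suffix behind an availability guard. A minimal sketch, assuming a flag like INSTALLED_PT_EXPT exists in the test module (the flag name and fixture details are stand-ins, not the actual test code):

```python
# Stand-in for the backend-availability flag the real suite would import.
INSTALLED_PT_EXPT = True

# Existing fixture suffixes, extended with .pte only when pt_expt is available.
suffixes = [".pth", ".pb", ".dp"]
if INSTALLED_PT_EXPT:
    suffixes.append(".pte")

for suffix in suffixes:
    # In the real suite this would call something like case.get_model(suffix, ...)
    print(f"provisioning frozen-model fixture for {suffix}")

assert ".pte" in suffixes
```

Guarding the append (rather than the whole list) keeps the existing .pth/.pb/.dp coverage intact when pt_expt is not installed.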
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@deepmd/pt_expt/model/frozen.py`:
- Around line 19-37: The FrozenModelDP constructor currently calls
self.model.eval() but does not disable gradients, so parameters remain
trainable; after the existing self.model.eval() call in __init__ of
FrozenModelDP, call requires_grad_(False) on the wrapped model (i.e.,
self.model.requires_grad_(False)) to freeze all parameters and prevent optimizer
updates.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 9db1b5b6-7d97-40d6-afcf-800b9f6869d1
📒 Files selected for processing (4)
deepmd/dpmodel/model/frozen.py
deepmd/pt_expt/model/__init__.py
deepmd/pt_expt/model/frozen.py
source/tests/consistent/model/test_frozen.py
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## master #5318 +/- ##
==========================================
- Coverage 82.29% 82.29% -0.01%
==========================================
Files 775 777 +2
Lines 77627 77696 +69
Branches 3676 3675 -1
==========================================
+ Hits 63887 63938 +51
- Misses 12566 12585 +19
+ Partials 1174 1173 -1

☔ View full report in Codecov by Sentry.
…rialize
Use explicit NativeOP.__init__(self) instead of super(FrozenModelDP, self) to fix the CodeQL "first argument to super() is not enclosing class" error. Remove the serialize() override that duplicates the parent class method.
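The super() issue in the commit message above arises when a class decorator rebinds the class name, so the name passed to super(FrozenModelDP, self) no longer refers to the class whose body contains the call. A toy reproduction (the decorator and class names mimic the PR but are simplified stand-ins, not the real deepmd code):

```python
class NativeOP:
    def __init__(self):
        self.initialized = True


def torch_module(cls):
    # Toy decorator that returns a *new* class wrapping the original one.
    # After decoration, the module-level name FrozenModelDP refers to
    # Wrapped, not to the class whose body defines __init__ -- which is
    # what trips CodeQL's "first argument to super() is not enclosing
    # class" check on super(FrozenModelDP, self).
    class Wrapped(cls):
        pass

    return Wrapped


@torch_module
class FrozenModelDP(NativeOP):
    def __init__(self):
        # Explicit base-class call avoids referring to the rebound name.
        NativeOP.__init__(self)


m = FrozenModelDP()
print(m.initialized)  # → True
```

Calling the base initializer explicitly sidesteps the rebinding entirely, which is why the commit switches to NativeOP.__init__(self).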
…t-expt-frozen
# Conflicts:
#   deepmd/pt_expt/model/frozen.py
♻️ Duplicate comments (1)
deepmd/pt_expt/model/frozen.py (1)
23-23: ⚠️ Potential issue | 🟠 Major
eval() alone does not fully freeze parameters.
At Line 23, self.model.eval() only changes eval-mode behavior (e.g., dropout/batchnorm). Parameters can still accumulate gradients and be optimizer-updated. Add requires_grad_(False) after eval().
Minimal fix
  self.model = BaseModel.deserialize(self.model.serialize())
  self.model.eval()
+ self.model.requires_grad_(False)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@deepmd/pt_expt/model/frozen.py` at line 23, The call self.model.eval() only sets evaluation behavior but doesn't stop gradient accumulation or optimizer updates; after the existing self.model.eval() call in frozen.py, call self.model.requires_grad_(False) to disable gradients for all parameters (or set requires_grad = False on parameters via self.model.parameters()) so the model is fully frozen during inference/training-freeze.
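The distinction the review draws between eval() and requires_grad_(False) can be demonstrated without torch. The stub classes below are stand-ins mimicking only the relevant parts of torch.nn.Module; they are not the real API:

```python
class Param:
    def __init__(self):
        self.requires_grad = True


class StubModule:
    """Tiny stand-in mimicking the relevant bits of torch.nn.Module."""

    def __init__(self):
        self.training = True
        self._params = [Param(), Param()]

    def eval(self):
        self.training = False  # mode flag only; gradients are untouched
        return self

    def requires_grad_(self, flag=True):
        for p in self._params:
            p.requires_grad = flag
        return self

    def parameters(self):
        return list(self._params)


model = StubModule()
model.eval()
# eval() left the parameters trainable:
print(all(p.requires_grad for p in model.parameters()))  # → True
model.requires_grad_(False)
print(any(p.requires_grad for p in model.parameters()))  # → False
```

This mirrors why the review asks for both calls: eval() flips the training-mode flag, while requires_grad_(False) is what actually stops parameter updates.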
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository UI
Review profile: CHILL
Plan: Pro
Run ID: 5d854797-3056-4251-a433-a76e18464a78
📒 Files selected for processing (1)
deepmd/pt_expt/model/frozen.py
Summary
- Add FrozenModel to the pt_expt backend for loading pre-frozen model files (.pte, .pth, .dp)
- Add a dpmodel FrozenModel (NativeOP + BaseModel) with all delegation methods, so pt_expt wraps it via @torch_module instead of duplicating code
- pt_expt FrozenModel handles .pte natively via serialize_from_file, and falls back to generic backend detection for other formats

Test plan
- Frozen consistency test (source/tests/consistent/model/test_frozen.py): pt_expt consistent_with_ref and self_consistent pass

Summary by CodeRabbit
New Features
Tests