
feat(pt_expt): add frozen model#5318

Open
wanghan-iapcm wants to merge 5 commits into deepmodeling:master from wanghan-iapcm:feat-pt-expt-frozen

Conversation

@wanghan-iapcm
Collaborator

@wanghan-iapcm wanghan-iapcm commented Mar 16, 2026

Summary

  • Add FrozenModel to pt_expt backend for loading pre-frozen model files (.pte, .pth, .dp)
  • Create dpmodel-level FrozenModel (NativeOP + BaseModel) with all delegation methods, so pt_expt wraps it via @torch_module instead of duplicating code
  • pt_expt FrozenModel handles .pte natively via serialize_from_file, falls back to generic backend detection for other formats
  • Add pt_expt support to frozen model consistency test
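
The delegation pattern described above — a dpmodel-level wrapper that deserializes an inner model once and forwards the public model API to it — can be sketched as follows. This is a minimal illustration with hypothetical names (`load_serialized`, `InnerModel`, `FrozenModelSketch`), not the actual deepmd code:

```python
# Minimal sketch of the delegation pattern: a wrapper that deserializes an
# inner model once and forwards the public model API to it. All names here
# are illustrative stand-ins, not the real deepmd classes or functions.

class InnerModel:
    def get_rcut(self):
        return 6.0

    def call(self, coords):
        return sum(coords)


def load_serialized(path):
    # stand-in for backend detection + serialization of a frozen file
    return InnerModel()


class FrozenModelSketch:
    """Wraps a pre-frozen model and delegates all queries to it."""

    def __init__(self, model_file):
        self.model = load_serialized(model_file)

    # explicit forwarding, mirroring the delegation methods added in this PR
    def get_rcut(self):
        return self.model.get_rcut()

    def call(self, coords):
        return self.model.call(coords)


frozen = FrozenModelSketch("model.dp")
print(frozen.get_rcut())  # delegates to the inner model → 6.0
```

Because the delegation lives at the dpmodel level, a backend wrapper (here, pt_expt via `@torch_module`) only needs to override construction, not re-implement every forwarding method.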

Test plan

  • Cross-backend consistency test (source/tests/consistent/model/test_frozen.py) — pt_expt consistent_with_ref and self_consistent pass
  • Existing pt/tf frozen model tests unaffected

Summary by CodeRabbit

  • New Features

    • Added support for loading and using frozen model files across workflows and exposed a FrozenModel in the Python API.
    • Broadened backend compatibility to include an additional experiment backend for frozen models.
  • Tests

    • Added/updated tests to validate frozen-model loading and evaluation across supported backends, including the new experiment backend.

Han Wang added 2 commits March 16, 2026 00:15
Load a pre-frozen model file (.pte or any format) via convert_backend
serialization, reconstruct with BaseModel.deserialize, and delegate
all model API methods to the inner model. Cannot be trained.
…module

- Create dpmodel FrozenModel (NativeOP + BaseModel) with all delegation
  methods, so pt_expt can inherit instead of duplicating
- Rewrite pt_expt FrozenModel to use @torch_module wrapping dpmodel class
- Override __init__ to handle .pte files natively via serialize_from_file,
  fall back to generic backend detection for other formats
- Override serialize() to delegate directly to inner model (unlike pt
  which must reconstruct from model_def_script due to opaque ScriptModule)
- Add pt_expt support to frozen consistency test using BaseModel as
  pt_expt_class (same pattern as pt)
- Guard setUpModule model generation with backend availability checks
@wanghan-iapcm wanghan-iapcm requested a review from njzjz March 16, 2026 06:15
@dosubot dosubot bot added the new feature label Mar 16, 2026
@coderabbitai
Contributor

coderabbitai bot commented Mar 16, 2026

📝 Walkthrough

Adds a FrozenModel wrapper class in dpmodel and a pt_expt variant that load and deserialize frozen model files and delegate model behavior to an inner model; tests are updated to support the PT-EXPT backend.

Changes

  • FrozenModel Core (dpmodel) — deepmd/dpmodel/model/frozen.py
    New FrozenModel class (inherits NativeOP, BaseModel) that detects the backend, deserializes an inner model from a frozen file, stores it as self.model, and forwards the public model APIs (call/forward, rcut, type map, sel counts, dim fparam/aparam, mixed types, message-passing flags, neighbor-list needs, model definition, minimal neighbor distance, nnei/nsel, output type, observed types, serialize). deserialize raises; update_sel is a no-op.
  • pt_expt Variant & Export — deepmd/pt_expt/model/frozen.py, deepmd/pt_expt/model/__init__.py
    Adds a pt_expt FrozenModel wrapper (registered as "frozen" and wrapped for torch) that re-deserializes/converts the inner model to pt_expt form and sets it to eval mode; exports FrozenModel in the module __all__.
  • Tests — source/tests/consistent/model/test_frozen.py
    Extends tests to conditionally handle the PT-EXPT backend (conditional imports, a new eval_pt_expt test path, pt_expt_class exposure) and treat PT_EXPT like PT when extracting energy/atom_energy from results.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant FrozenModel
    participant Backend
    participant Serializer
    participant InnerModel

    Client->>FrozenModel: __init__(model_file)
    alt model_file ends with .pte
        FrozenModel->>Serializer: serialize_from_file(model_file)
        Serializer-->>FrozenModel: serialized_dict
    else other formats
        FrozenModel->>Backend: detect_backend_by_model(model_file)
        Backend-->>FrozenModel: detected_backend
        FrozenModel->>Serializer: backend.serialize(model_file)
        Serializer-->>FrozenModel: serialized_dict
    end
    FrozenModel->>InnerModel: BaseModel.deserialize(serialized_dict)
    InnerModel-->>FrozenModel: inner_model_instance
    FrozenModel->>InnerModel: inner_model.eval()
    FrozenModel-->>Client: ready (self.model set)
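
The constructor flow in the diagram can be sketched in plain Python as a file-suffix dispatch. The helper names below (`serialize_from_file`, `detect_backend_by_model`, `backend_serialize`, `load_frozen`) are stand-ins for illustration, not the real deepmd functions:

```python
# Sketch of the constructor flow above: a .pte file is read directly, while
# any other format goes through backend detection first. All helper names
# are illustrative stand-ins, not the real deepmd API.

def serialize_from_file(path):          # stand-in for the .pte reader
    return {"format": "pte", "file": path}

def detect_backend_by_model(path):      # stand-in for backend detection
    suffix = path.rsplit(".", 1)[-1]
    return {"pth": "pytorch", "pb": "tensorflow", "dp": "dpmodel"}[suffix]

def backend_serialize(backend, path):   # stand-in for backend.serialize
    return {"format": backend, "file": path}

def load_frozen(model_file):
    """Return a serialized dict for the frozen file, by suffix dispatch."""
    if model_file.endswith(".pte"):
        return serialize_from_file(model_file)
    backend = detect_backend_by_model(model_file)
    return backend_serialize(backend, model_file)

print(load_frozen("model.pte")["format"])   # → pte
print(load_frozen("model.pb")["format"])    # → tensorflow
```

In the actual class, the resulting serialized dict is then passed to BaseModel.deserialize and the inner model is put into eval mode, as the diagram shows.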

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • iProzd
  • njzjz
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning — Docstring coverage is 68.97%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed — Check skipped: CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed — The title clearly and concisely summarizes the main change: adding a frozen model implementation to the pt_expt backend.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
deepmd/dpmodel/model/frozen.py (1)

141-153: Make unsupported paths fail loudly.

FrozenModel is documented as non-trainable, but update_sel() currently returns success and deserialize() raises a non-actionable error. Turning both into explicit failures would make accidental training/deserialization misuse much easier to diagnose.

🔧 Suggested change
     @classmethod
     def deserialize(cls, data: dict) -> NoReturn:
-        raise RuntimeError("Should not touch here.")
+        raise RuntimeError(
+            "FrozenModel cannot be deserialized directly; deserialize the inner model data instead."
+        )

     @classmethod
     def update_sel(
         cls,
         train_data: DeepmdDataSystem,
         type_map: list[str] | None,
         local_jdata: dict,
     ) -> tuple[dict, float | None]:
         """Update the selection and perform neighbor statistics."""
-        return local_jdata, None
+        raise RuntimeError(
+            "FrozenModel cannot be used for training or neighbor-statistics updates."
+        )
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/dpmodel/model/frozen.py` around lines 141 - 153, The FrozenModel class
currently masks misuse by returning from update_sel and raising a vague error in
deserialize; change both to fail loudly: in FrozenModel.deserialize(cls, data)
raise a clear RuntimeError (or NotImplementedError) with a message like
"FrozenModel is non-deserializable/non-trainable; operation not supported", and
in FrozenModel.update_sel(cls, train_data, type_map, local_jdata) raise the same
explicit exception instead of returning (remove the tuple return), referencing
the methods deserialize and update_sel on class FrozenModel so accidental
training or deserialization attempts fail fast and provide actionable messages.
source/tests/consistent/model/test_frozen.py (1)

54-60: Add coverage for the new .pte loader path.

deepmd/pt_expt/model/frozen.py, Lines 22-29, now has a dedicated .pte branch, but this test module still only provisions and parameterizes .pth, .pb, and .dp fixtures. That leaves the main new path in this PR unexercised. If .pte is pt_expt-only, a small targeted test is enough.

Also applies to: 75-76

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@source/tests/consistent/model/test_frozen.py` around lines 54 - 60, The test
setup and parameterization omit the new .pte loader path; update setUpModule to
provision the .pte model by calling case.get_model(".pte", pte_model) (guard
with INSTALLED_PT_EXPT if .pte is only available when pt_expt is installed) and
update the test parameterization near the parameter lines (around where
.pth/.pb/.dp are listed) to include the ".pte" case so the new
deepmd/pt_expt/model/frozen.py branch is exercised.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@deepmd/pt_expt/model/frozen.py`:
- Around line 19-37: The FrozenModelDP constructor currently calls
self.model.eval() but does not disable gradients, so parameters remain
trainable; after the existing self.model.eval() call in __init__ of
FrozenModelDP, call requires_grad_(False) on the wrapped model (i.e.,
self.model.requires_grad_(False)) to freeze all parameters and prevent optimizer
updates.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 9db1b5b6-7d97-40d6-afcf-800b9f6869d1

📥 Commits

Reviewing files that changed from the base of the PR and between 09345bf and b7306ca.

📒 Files selected for processing (4)
  • deepmd/dpmodel/model/frozen.py
  • deepmd/pt_expt/model/__init__.py
  • deepmd/pt_expt/model/frozen.py
  • source/tests/consistent/model/test_frozen.py

@codecov

codecov bot commented Mar 16, 2026

Codecov Report

❌ Patch coverage is 72.46377% with 19 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.29%. Comparing base (09345bf) to head (b1daa79).

Files with missing lines | Patch % | Lines
deepmd/dpmodel/model/frozen.py | 66.66% | 19 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5318      +/-   ##
==========================================
- Coverage   82.29%   82.29%   -0.01%     
==========================================
  Files         775      777       +2     
  Lines       77627    77696      +69     
  Branches     3676     3675       -1     
==========================================
+ Hits        63887    63938      +51     
- Misses      12566    12585      +19     
+ Partials     1174     1173       -1     

☔ View full report in Codecov by Sentry.

Han Wang added 2 commits March 16, 2026 22:05
…rialize

Use explicit NativeOP.__init__(self) instead of super(FrozenModelDP, self)
to fix CodeQL "first argument to super() is not enclosing class" error.
Remove serialize() override that duplicates the parent class method.
…t-expt-frozen

# Conflicts:
#	deepmd/pt_expt/model/frozen.py
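
The super() fix described in the commit messages above — replacing a `super()` call whose first argument is not the enclosing class with an explicit base-class initializer call — can be illustrated as follows. Class names here are illustrative, not the deepmd classes:

```python
# Illustration of the CodeQL finding fixed in the commit above: calling
# super(SomeOtherClass, self) from a class that is not SomeOtherClass is
# flagged as "first argument to super() is not enclosing class". Calling
# the intended base initializer explicitly avoids the ambiguity.
# These class names are stand-ins, not the real deepmd classes.

class NativeOPSketch:
    def __init__(self):
        self.initialized = True


class FrozenModelSketch(NativeOPSketch):
    def __init__(self, model_file):
        # explicit base-class call, as in the commit message
        NativeOPSketch.__init__(self)
        self.model_file = model_file


m = FrozenModelSketch("model.pte")
print(m.initialized)  # → True
```

The explicit call pins the initializer to one specific base class, which is unambiguous even when the enclosing class is later renamed or wrapped by a decorator.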
Contributor

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
deepmd/pt_expt/model/frozen.py (1)

23-23: ⚠️ Potential issue | 🟠 Major

eval() alone does not fully freeze parameters.

At Line 23, self.model.eval() only changes eval-mode behavior (e.g., dropout/batchnorm). Parameters can still accumulate gradients and be optimizer-updated. Add requires_grad_(False) after eval().

Minimal fix
         self.model = BaseModel.deserialize(self.model.serialize())
         self.model.eval()
+        self.model.requires_grad_(False)
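
The distinction the comment draws — `eval()` versus `requires_grad_(False)` — can be checked directly with a few lines of PyTorch (assuming torch is installed):

```python
# Demonstrates the point made above: eval() only switches module behavior
# (e.g., dropout/batchnorm), while requires_grad_(False) actually stops
# gradient tracking on the parameters.
import torch

model = torch.nn.Linear(4, 2)

model.eval()
print(all(p.requires_grad for p in model.parameters()))   # still True

model.requires_grad_(False)
print(any(p.requires_grad for p in model.parameters()))   # now False
```

This is why a frozen wrapper that only calls `eval()` can still have its parameters updated if the wrapped model is ever handed to an optimizer.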
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/pt_expt/model/frozen.py` at line 23, The call self.model.eval() only
sets evaluation behavior but doesn't stop gradient accumulation or optimizer
updates; after the existing self.model.eval() call in frozen.py, call
self.model.requires_grad_(False) to disable gradients for all parameters (or set
requires_grad = False on parameters via self.model.parameters()) so the model is
fully frozen during inference/training-freeze.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 5d854797-3056-4251-a433-a76e18464a78

📥 Commits

Reviewing files that changed from the base of the PR and between 4815fbd and b1daa79.

📒 Files selected for processing (1)
  • deepmd/pt_expt/model/frozen.py

@wanghan-iapcm wanghan-iapcm requested a review from njzjz March 16, 2026 14:59