
feat(pt_expt): atomic model #5220

Merged
wanghan-iapcm merged 55 commits into deepmodeling:master from wanghan-iapcm:feat-atomic-model
Feb 14, 2026

Conversation

Collaborator

@wanghan-iapcm wanghan-iapcm commented Feb 13, 2026

Summary by CodeRabbit

  • New Features

    • Added end-to-end bias/statistics workflow for atomic models (compute, load, apply, and update output biases).
    • Introduced PyTorch-experimental atomic model wrappers with serialization/export compatibility.
    • Added comprehensive statistics utilities for global and per-atom outputs.
  • Bug Fixes

    • Improved tensor→array conversion to handle gradient-enabled tensors robustly.
  • Tests

    • Added extensive tests covering stats, bias workflows, serialization, export, and consistency.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a significant and well-structured feature: a new experimental PyTorch backend (pt_expt) designed to be more idiomatic with torch.nn.Module and exportable via torch.export. The changes also include a substantial refactoring of core dpmodel components to be backend-agnostic using array_api_compat, which is a great step towards better code structure and maintainability. The addition of comprehensive consistency tests for the new backend is commendable. My review focuses on a couple of areas in the new backend-agnostic logic where code duplication can be reduced to improve clarity and maintainability.


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: fb08ffca5b


@coderabbitai
Contributor

coderabbitai bot commented Feb 13, 2026

📝 Walkthrough

Walkthrough

Adds a bias/statistics workflow to the DP atomic model: new BaseAtomicModel methods for computing/loading/applying output biases, a new statistics utility module with I/O and computation routines, expanded tensor-to-numpy fallback, PyTorch-exportable atomic model wrappers, and comprehensive unit tests for these flows.

Changes

  • Bias/Statistics Core (deepmd/dpmodel/atomic_model/base_atomic_model.py, deepmd/dpmodel/utils/stat.py): adds public bias/stat APIs (get_intensive, get_compute_stats_distinguish_types, compute_or_load_out_stat, change_out_bias) and helpers (_store_out_stat, _get_forward_wrapper_func) to BaseAtomicModel; adds stat.py with file I/O, global/atomic stat computation, model-prediction integration, merging, filling, and validation logic.
  • Array Conversion (deepmd/dpmodel/common.py): expands to_numpy_array to catch RuntimeError and use the existing DLPack/namespace fallback for tensor conversion.
  • PyTorch-export Implementations (deepmd/pt_expt/atomic_model/__init__.py, deepmd/pt_expt/atomic_model/dp_atomic_model.py, deepmd/pt_expt/atomic_model/energy_atomic_model.py): adds pt_expt-compatible DPAtomicModel and DPEnergyAtomicModel wrappers (torch.nn.Module integration, descriptor/fitting conversion, forward mapping) and registers the DP↔pt_expt serialization mapping.
  • Tests, Atomic/Global Stats & DPAtomicModel (source/tests/pt_expt/atomic_model/test_atomic_model_atomic_stat.py, test_atomic_model_global_stat.py, test_dp_atomic_model.py): adds extensive unit tests covering stat computation/load/save, bias application and change-by-statistic, mixed atomic/global labels, DP↔pt_expt consistency, exportability, exclusion masks, and virtual-mapping behaviors.
  • Test Package Setup (source/tests/pt_expt/atomic_model/__init__.py): adds an SPDX license header to the test package initializer.
  • Type Embedding (deepmd/dpmodel/utils/type_embed.py): removes the explicit dtype argument when creating RNG random arrays and relies on casting to the final dtype (minor initialization change).
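The expanded to_numpy_array fallback can be sketched roughly as follows. This is an illustrative stand-in, not the library's actual code: the real helper also goes through DLPack and the array-namespace protocol, and the duck-typed detach/cpu path below is an assumption.

```python
import numpy as np


def to_numpy_array(x):
    # Illustrative sketch only: the real deepmd helper also tries DLPack
    # and the array-namespace protocol before falling back.
    if x is None:
        return None
    try:
        return np.asarray(x)
    except RuntimeError:
        # Gradient-enabled torch tensors raise RuntimeError here; detach,
        # move to CPU, then convert (duck-typed so the sketch is torch-free).
        return x.detach().cpu().numpy()
```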

Sequence Diagram(s)

sequenceDiagram
    participant Base as BaseAtomicModel
    participant Stat as stat.compute_output_stats
    participant Wrapper as ForwardWrapper
    participant Model as ModelForward
    participant FS as StatFile (DPPath)

    Base->>Base: change_out_bias(sample_merged, stat_file_path, mode)
    Base->>Stat: compute_output_stats(merged, ntypes, keys, stat_file_path, model_forward=wrapper)
    Stat->>FS: _restore_from_file(stat_file_path) (if provided)
    Stat->>Wrapper: request predictions for samples
    Wrapper->>Model: forward(converted inputs / built nlist)
    Model->>Wrapper: predictions (numpy arrays)
    Wrapper->>Stat: return numpy predictions
    Stat->>Stat: compute bias/std (global & atomic), merge/fill, optional save to file
    Stat->>Base: return out_bias, out_std
    Base->>Base: _store_out_stat(out_bias, out_std, add or set)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes


Suggested reviewers

  • iProzd
  • njzjz
🚥 Pre-merge checks: 2 passed, 2 failed
❌ Failed checks (1 warning, 1 inconclusive)
  • Docstring Coverage (⚠️ Warning): docstring coverage is 25.77%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
  • Title check (❓ Inconclusive): the title "feat(pt_expt): atomic model" is vague and overly broad, describing a general feature area rather than the specific main changes. A title such as "feat(pt_expt): add DPAtomicModel with bias/statistics workflow" would better capture the primary functionality introduced across the atomic model classes and statistics utilities.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Merge Conflict Detection (✅ Passed): no merge conflicts detected when merging into master.


No actionable comments were generated in the recent review. 🎉



Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 7

🤖 Fix all issues with AI agents
In `@deepmd/dpmodel/atomic_model/base_atomic_model.py`:
- Around line 259-261: The docstring for get_compute_stats_distinguish_types
contradicts its return value: update the docstring in
BaseAtomicModel.get_compute_stats_distinguish_types to accurately describe that
it returns True when the fitting net computes statistics that are distinguished
between different atom types (remove the word "not" and rephrase to say it
indicates whether stats are distinguished by atom type), so the text matches the
method name and the True return value.
- Around line 347-367: The code in _store_out_stat uses np.copy on self.out_bias
and self.out_std which breaks for torch.Tensor buffers in the pt_expt backend;
replace those np.copy calls with safe conversions using the existing
to_numpy_array helper (e.g., out_bias_data =
to_numpy_array(self.out_bias).copy() and out_std_data =
to_numpy_array(self.out_std).copy()) so both numpy arrays and torch tensors are
handled correctly before mutating; keep the rest of _store_out_stat logic intact
and continue to assign the final numpy arrays back to self.out_bias/self.out_std
as before.
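A minimal sketch of the suggested change. Both names below are hypothetical stand-ins for illustration: to_numpy_array mimics deepmd.dpmodel.common.to_numpy_array, and store_out_stat condenses the relevant part of _store_out_stat.

```python
import numpy as np


def to_numpy_array(x):
    # Stand-in for deepmd.dpmodel.common.to_numpy_array: plain numpy arrays
    # pass through; torch-like tensors go via detach().cpu().numpy().
    if isinstance(x, np.ndarray):
        return x
    return x.detach().cpu().numpy()


def store_out_stat(out_bias, delta_bias, add=True):
    # Convert the buffer to numpy before mutating, instead of np.copy,
    # so torch.Tensor buffers in the pt_expt backend are handled too.
    out_bias_data = to_numpy_array(out_bias).copy()
    if add:
        out_bias_data = out_bias_data + np.asarray(delta_bias)
    else:
        out_bias_data = np.asarray(delta_bias).copy()
    return out_bias_data
```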

In `@deepmd/dpmodel/utils/stat.py`:
- Around line 254-263: The comprehension building model_pred_g (and similarly
model_pred_a) can raise KeyError because it indexes global_sampled_idx[kk]
directly; change it to use global_sampled_idx.get(kk, []) so missing keys yield
an empty list and the inner listcomp becomes empty instead of crashing; update
the comprehension that iterates over model_pred (and the analogous one for
model_pred_a) to call global_sampled_idx.get(kk, []) and keep the existing
np.sum(vv[idx], axis=1) logic for each idx.
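The suggested .get-based comprehension might look like this; gather_global_preds is a hypothetical wrapper name introduced only to make the snippet self-contained.

```python
import numpy as np


def gather_global_preds(model_pred, global_sampled_idx):
    # Use .get(kk, []) so keys absent from global_sampled_idx produce an
    # empty list instead of raising KeyError.
    return {
        kk: [np.sum(vv[idx], axis=1) for idx in global_sampled_idx.get(kk, [])]
        for kk, vv in model_pred.items()
    }
```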

In `@deepmd/dpmodel/utils/type_embed.py`:
- Around line 210-222: The call to np.random.default_rng().random(...) passes
PRECISION_DICT[self.precision] directly to dtype which fails for unsupported
types like np.float16; instead generate the random array without dtype (or with
a supported float like np.float32) and then cast to the target dtype before
converting to the array backend. Update the block that creates
extend_type_params (the np.random.default_rng().random call and the subsequent
xp.asarray) so you generate with a supported numpy float type, then use
.astype(first_layer_matrix.dtype) or let xp.asarray handle the dtype conversion
(matching first_layer_matrix.dtype and device via xp.asarray(...,
dtype=first_layer_matrix.dtype,
device=array_api_compat.device(first_layer_matrix))) to ensure compatibility for
precisions such as "float16"/"half".
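A sketch of the suggested generate-then-cast pattern; random_with_precision is a hypothetical helper, not the actual type_embed code.

```python
import numpy as np


def random_with_precision(shape, dtype):
    # Generator.random only accepts float32/float64 for its dtype argument,
    # so generate in the default float64 and cast afterwards; this covers
    # precisions such as float16/"half".
    rng = np.random.default_rng(seed=0)
    return rng.random(shape).astype(dtype)
```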

In `@source/tests/consistent/fitting/test_ener.py`:
- Around line 38-44: The INSTALLED_PT_EXPT branch may run without INSTALLED_PT,
so import torch inside that block to avoid NameError in the eval_pt_expt methods
which call torch.from_numpy; update the block that defines EnerFittingPTExpt and
PT_EXPT_DEVICE to also "import torch" so eval_pt_expt (both implementations that
use torch.from_numpy) and any PT_EXPT_DEVICE-dependent code have torch in scope.

In `@source/tests/pt_expt/atomic_model/test_dp_atomic_model.py`:
- Around line 115-159: Add an inline explanatory comment above the
torch.export.export call clarifying why strict=False is used (to handle dynamic
shapes and dict-returning models) consistent with the pattern in
test_fitting_invar_fitting.py; locate the export call (torch.export.export(...,
strict=False)) in test_exportable and add a brief comment referencing md0
returning a dict and dynamic output shapes so future readers understand the
non-strict export choice.

In `@source/tests/pt_expt/fitting/test_fitting_invar_fitting.py`:
- Around line 168-201: The three tests wrap calls to ifn0(...) in with
self.assertRaises(ValueError) as context but place self.assertIn(...) inside the
with block, so those checks never run; move each self.assertIn(...) to
immediately after its corresponding with block and reference context.exception
(e.g., use str(context.exception)) to assert the error message for the ifn0 call
in each case (the blocks around the first ifn0 call, the second ifn0 call when
nfp > 0, and the third ifn0 call when nap > 0).
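The corrected test pattern, as a self-contained sketch; the raise statement stands in for the failing ifn0(...) call.

```python
import unittest


class ExampleTest(unittest.TestCase):
    def test_message_checked_after_block(self):
        with self.assertRaises(ValueError) as context:
            # Stand-in for the ifn0(...) call expected to raise.
            raise ValueError("input dimension mismatch")
        # This assertion runs only after the with block has caught the
        # exception; placed inside the block it would never execute.
        self.assertIn("mismatch", str(context.exception))
```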
🧹 Nitpick comments (13)
deepmd/pt_expt/atomic_model/energy_atomic_model.py (1)

14-21: Docstring claims validation that isn't implemented.

The docstring says this class "validates the fitting is an EnergyFittingNet or InvarFitting," but the body is pass. If this validation is intentionally deferred, consider updating the docstring to reflect that (e.g., "placeholder for future validation" or "specialization for energy models"). Otherwise, add the fitting-type check in __init__.

deepmd/pt_expt/utils/type_embed.py (1)

15-15: Remove unused noqa directive.

Ruff reports F401 isn't enabled, so # noqa: F401 is unnecessary. The side-effect import comment is sufficient to explain the intent.

Proposed fix
-from deepmd.pt_expt.utils import network  # noqa: F401
+from deepmd.pt_expt.utils import network  # ensure EmbeddingNet is registered
deepmd/pt_expt/fitting/invar_fitting.py (1)

23-27: Potentially redundant nets conversion on line 27.

Since __setattr__ routes through dpmodel_setattr, which auto-converts NativeOP instances via the registry, self.nets should already be a pt_expt NetworkCollection after InvarFittingDP.__init__ completes. The explicit NetworkCollection.deserialize(self.nets.serialize()) on line 27 then performs a redundant serialize→deserialize round-trip.

This is harmless (acts as a safety net) and the pattern may be intentional, but worth noting for awareness if you want to trim the overhead.

deepmd/dpmodel/atomic_model/base_atomic_model.py (1)

369-443: _get_forward_wrapper_func — device/no-device branches have significant duplication.

The two branches (lines 391–404 vs 405–417) differ only by the device=device kwarg. A small helper could reduce this, though the current code is correct and readable.

Example consolidation
-            device = getattr(ref_array, "device", None)
-            if device is not None:
-                # For torch tensors
-                coord = xp.asarray(coord, device=device)
-                atype = xp.asarray(atype, device=device)
-                if box is not None:
-                    # Check if box is all zeros before converting
-                    if np.allclose(box, 0.0):
-                        box = None
-                    else:
-                        box = xp.asarray(box, device=device)
-                if fparam is not None:
-                    fparam = xp.asarray(fparam, device=device)
-                if aparam is not None:
-                    aparam = xp.asarray(aparam, device=device)
-            else:
-                # For numpy arrays
-                coord = xp.asarray(coord)
-                atype = xp.asarray(atype)
-                if box is not None:
-                    if np.allclose(box, 0.0):
-                        box = None
-                    else:
-                        box = xp.asarray(box)
-                if fparam is not None:
-                    fparam = xp.asarray(fparam)
-                if aparam is not None:
-                    aparam = xp.asarray(aparam)
+            device = getattr(ref_array, "device", None)
+            dev_kw = {"device": device} if device is not None else {}
+
+            def _to_xp(arr):
+                return xp.asarray(arr, **dev_kw)
+
+            coord = _to_xp(coord)
+            atype = _to_xp(atype)
+            if box is not None:
+                if np.allclose(box, 0.0):
+                    box = None
+                else:
+                    box = _to_xp(box)
+            if fparam is not None:
+                fparam = _to_xp(fparam)
+            if aparam is not None:
+                aparam = _to_xp(aparam)
source/tests/pt_expt/fitting/test_fitting_stat.py (1)

47-72: _brute_fparam_pt and _brute_aparam_pt are identical except for the dict key.

These two functions could be a single helper parameterized by key name, reducing duplication. Minor nit for test utility code.

♻️ Optional: consolidate into a single helper
-def _brute_fparam_pt(data, ndim):
-    adata = [ii["fparam"] for ii in data]
-    all_data = []
-    for ii in adata:
-        tmp = np.reshape(ii, [-1, ndim])
-        if len(all_data) == 0:
-            all_data = np.array(tmp)
-        else:
-            all_data = np.concatenate((all_data, tmp), axis=0)
-    avg = np.average(all_data, axis=0)
-    std = np.std(all_data, axis=0)
-    return avg, std
-
-
-def _brute_aparam_pt(data, ndim):
-    adata = [ii["aparam"] for ii in data]
+def _brute_param_pt(data, ndim, key):
+    adata = [ii[key] for ii in data]
     all_data = []
     for ii in adata:
         tmp = np.reshape(ii, [-1, ndim])
deepmd/dpmodel/utils/stat.py (2)

131-140: np.nan_to_num silently replaces residual NaN with 0.

After the np.where, any positions where both atomic_stat and global_stat are NaN remain NaN; np.nan_to_num then maps them to 0. If this is intentional (e.g., treating unobserved types as zero bias), it's worth a brief inline comment to make the intent clear. If not, it could mask a data-quality problem.
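The fill behavior described above, in a small self-contained numpy sketch (not the actual stat.py code):

```python
import numpy as np

atomic_stat = np.array([1.0, np.nan, np.nan])   # per-type atomic estimate
global_stat = np.array([np.nan, 2.0, np.nan])   # least-squares global estimate

# Prefer the atomic value; fall back to the global one where it is NaN.
merged = np.where(np.isnan(atomic_stat), global_stat, atomic_stat)
# Positions where both sources are NaN are still NaN after the np.where ...
filled = np.nan_to_num(merged)  # ... and get silently mapped to 0.0 here
```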


528-540: missing_types only accounts for types beyond max(atype), not gaps.

If atom types 0 and 2 are present but type 1 is missing, compute_stats_from_atomic returns rows for types 0–2 and this padding only appends types beyond 2. Types in the gap would get NaN from compute_stats_from_atomic and are later filled by _fill_stat_with_global, so the overall pipeline is correct. Just flagging for clarity — a comment here would help future readers.

Also, the dtype comparison on line 531 uses is, which is fragile; prefer ==.

Minor: use `==` for dtype comparison
-                assert bias_atom_e[kk].dtype is std_atom_e[kk].dtype, (
+                assert bias_atom_e[kk].dtype == std_atom_e[kk].dtype, (
                     "bias and std should be of the same dtypes"
                 )
source/tests/pt_expt/atomic_model/test_atomic_model_global_stat.py (2)

47-151: FooFitting is duplicated across test files.

This class is nearly identical to FooFitting in test_atomic_model_atomic_stat.py (the only difference is the addition of pix output). Consider extracting a shared base or parameterized test fixture to reduce copy-paste across the two test modules.


196-199: Unused f in h5py.File context manager.

Ruff flags this (F841). A simple _ would suppress it without changing behavior:

-        with h5py.File(h5file, "w") as f:
+        with h5py.File(h5file, "w") as _:

This pattern repeats at lines 587 and 697 as well.

source/tests/pt_expt/atomic_model/test_atomic_model_atomic_stat.py (1)

40-127: FooFitting duplicates the one in test_atomic_model_global_stat.py (minus pix).

As noted in the other file — extracting common test fixtures would reduce maintenance burden. Not blocking.

source/tests/pt_expt/atomic_model/test_dp_atomic_model.py (1)

161-231: test_excl_consistency: the "hacking!" comment (line 189) deserves a brief explanation.

The test calls reinit_atom_exclude/reinit_pair_exclude on md0 but uses different method names on md1. A one-line comment explaining why would help future maintainers.

source/tests/pt_expt/descriptor/test_se_t.py (1)

59-69: Prefix unused unpacked variables with _.

gr1 (line 65) and gr2 (line 85) are unpacked but never used. Ruff flagged these (RUF059). Since se_t returns None for gr, you can prefix them.

Suggested fix
-            rd1, gr1, _, _, sw1 = dd1(
+            rd1, _gr1, _, _, sw1 = dd1(
-            rd2, gr2, _, _, sw2 = dd2.call(
+            rd2, _gr2, _, _, sw2 = dd2.call(
source/tests/pt_expt/descriptor/test_se_t_tebd.py (1)

92-93: TODO: gr is None — worth tracking.

The comment notes that gr is None and warrants investigation. Consider opening an issue to track this so it doesn't get lost.

Would you like me to open an issue to track the gr being None investigation?

Code scanning / CodeQL: Unused local variable (Note, test)

CodeQL posted repeated notices on the line nf, nloc, nnei = self.nlist.shape in test_serialize, test_change_by_statistic, test_dp_consistency, and test_exportable: in each case one or more of the unpacked variables (nf, nloc, nnei) is never used in the test body.
@codecov

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 95.83333% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.12%. Comparing base (4f182bc) to head (a9a924d).
⚠️ Report is 1 commit behind head on master.

Files with missing lines:
  • deepmd/dpmodel/utils/stat.py: 95.97% patch coverage, 7 lines missing ⚠️
  • deepmd/dpmodel/atomic_model/base_atomic_model.py: 92.30% patch coverage, 4 lines missing ⚠️
Additional details and impacted files
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #5220      +/-   ##
==========================================
+ Coverage   82.07%   82.12%   +0.05%     
==========================================
  Files         732      736       +4     
  Lines       73974    74237     +263     
  Branches     3615     3616       +1     
==========================================
+ Hits        60711    60967     +256     
- Misses      12100    12107       +7     
  Partials     1163     1163              

☔ View full report in Codecov by Sentry.

@wanghan-iapcm wanghan-iapcm requested a review from njzjz February 14, 2026 03:47
@wanghan-iapcm wanghan-iapcm requested a review from njzjz February 14, 2026 04:51
@wanghan-iapcm wanghan-iapcm added this pull request to the merge queue Feb 14, 2026
Merged via the queue into deepmodeling:master with commit a0bd530 Feb 14, 2026
70 checks passed
@wanghan-iapcm wanghan-iapcm deleted the feat-atomic-model branch February 14, 2026 09:51
