Skip to content

fix: nan error when all fparam/aparam have equal values#5321

Merged
njzjz merged 2 commits intodeepmodeling:masterfrom
Chengqian-Zhang:0317_fparam_nan
Mar 18, 2026
Merged

fix: nan error when all fparam/aparam have equal values#5321
njzjz merged 2 commits intodeepmodeling:masterfrom
Chengqian-Zhang:0317_fparam_nan

Conversation

@Chengqian-Zhang
Copy link
Collaborator

@Chengqian-Zhang Chengqian-Zhang commented Mar 17, 2026

For standard deviation of fparam/aparam, $\sigma = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2}=\sqrt{\frac{\sum x_i^2}{N} - \left( \frac{\sum x_i}{N} \right)^2}$.
When all fparam/aparam have equal values in one dimension, $\frac{\sum x_i^2}{N} - \left( \frac{\sum x_i}{N} \right)^2$ equals zero.

However, it sometimes becomes a very small negative number(for example, 1e-18) due to numerical instability, so $\sqrt{\frac{\sum x_i^2}{N} - \left( \frac{\sum x_i}{N} \right)^2}$ becomes nan.

Summary by CodeRabbit

  • Bug Fixes
    • Improved numerical stability in variance/std calculations by ensuring intermediate variance values are non-negative before taking the square root. This prevents occasional floating-point underflow from producing invalid results and yields more reliable statistical outputs across edge-case inputs.

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Mar 17, 2026

📝 Walkthrough

Walkthrough

compute_std now clips the computed variance with np.clip(..., a_min=0) before taking the square root to prevent negative values from floating-point underflow.

Changes

Cohort / File(s) Summary
Variance clamping
deepmd/utils/env_mat_stat.py
Added np.clip to enforce non-negative variance before the np.sqrt in compute_std, preventing negative values from floating-point precision loss. Lines changed: +6/-2.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Suggested reviewers

  • wanghan-iapcm
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main fix: addressing NaN errors when all fparam/aparam values are equal, which directly matches the code change that clips variance before square root.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
📝 Coding Plan
  • Generate coding plan for human review comments

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link

codecov bot commented Mar 17, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 82.30%. Comparing base (4ce5c73) to head (3e70f35).
⚠️ Report is 4 commits behind head on master.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #5321   +/-   ##
=======================================
  Coverage   82.30%   82.30%           
=======================================
  Files         775      775           
  Lines       77628    77626    -2     
  Branches     3675     3678    +3     
=======================================
+ Hits        63888    63889    +1     
+ Misses      12568    12565    -3     
  Partials     1172     1172           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@njzjz
Copy link
Member

njzjz commented Mar 18, 2026

pre-commit.ci autofix

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
deepmd/utils/env_mat_stat.py (1)

92-97: Align TensorFlow _compute_std implementations with the guard applied in deepmd/utils/env_mat_stat.py.

Both deepmd/tf/fit/ener.py (line 380) and deepmd/tf/descriptor/se_a.py (line 918) take sqrt on unclamped variance without protecting against negative values. Their post-check thresholds do not catch NaN results that can arise from numerical instability (e.g., when floating-point errors produce negative variance estimates). Apply pre-sqrt variance clamping (a_min=0) to both _compute_std methods to match the fix already in place.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@deepmd/utils/env_mat_stat.py` around lines 92 - 97, The two TensorFlow
implementations of _compute_std (in deepmd/tf/fit/ener.py and
deepmd/tf/descriptor/se_a.py) take tf.sqrt on a variance expression without
guarding against small negative values; update each _compute_std to clamp the
variance before sqrt (e.g., use tf.clip_by_value or tf.maximum to enforce
a_min=0) so you compute tf.sqrt(clamped_variance) instead of
tf.sqrt(raw_variance), matching the np.clip fix in deepmd/utils/env_mat_stat.py;
ensure you replace the direct variance -> sqrt usage in both _compute_std
functions with the clamping step while keeping downstream logic unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@deepmd/utils/env_mat_stat.py`:
- Around line 92-97: The two TensorFlow implementations of _compute_std (in
deepmd/tf/fit/ener.py and deepmd/tf/descriptor/se_a.py) take tf.sqrt on a
variance expression without guarding against small negative values; update each
_compute_std to clamp the variance before sqrt (e.g., use tf.clip_by_value or
tf.maximum to enforce a_min=0) so you compute tf.sqrt(clamped_variance) instead
of tf.sqrt(raw_variance), matching the np.clip fix in
deepmd/utils/env_mat_stat.py; ensure you replace the direct variance -> sqrt
usage in both _compute_std functions with the clamping step while keeping
downstream logic unchanged.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 90ca07ae-2c9b-4464-8b60-6062f9448bdc

📥 Commits

Reviewing files that changed from the base of the PR and between 4ab1427 and 3e70f35.

📒 Files selected for processing (1)
  • deepmd/utils/env_mat_stat.py

@njzjz njzjz enabled auto-merge March 18, 2026 02:28
@njzjz njzjz added this pull request to the merge queue Mar 18, 2026
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 18, 2026
@njzjz njzjz added this pull request to the merge queue Mar 18, 2026
Merged via the queue into deepmodeling:master with commit a9bbf74 Mar 18, 2026
70 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants