Skip to content

NaN Handling#52

Draft
jharwood0 wants to merge 7 commits intodevelopfrom
hotfix/ODB-569
Draft

NaN Handling#52
jharwood0 wants to merge 7 commits intodevelopfrom
hotfix/ODB-569

Conversation

@jharwood0
Copy link
Copy Markdown

@jharwood0 jharwood0 commented Apr 22, 2026

Description

This pr identifies use cases where NaNs result in unexpected behavior:

  • Comparator treats NaN != NaN (IEEE) so two files with identical bits compare as different
  • NaNs round-trip fine through Double/Real columns, but Int/Bit cols reinterpret the NaN's 8 bytes through the integer codec (with narrowing or offset arithmetic depending on which one), and decode back as a plausible-looking integer — ultimately casting {1.0, NaN, 3.0} to an INT column reads back {1, 1, 3}
  • min/max column metadata also gets corrupted:
    • For Double/Real types — when a NaN arrives before any real values: NaN poisons the min/max slots, subsequent real values can't update them (NaN comparisons are always false)
    • For Integer/Bitfield — NaN bits get reinterpreted as a very large int, always overwriting max regardless of position; knock-on effect for the optimiser which sees a range of ~4.6e18

to fix the above:

  • Comparator::same() now checks for NaN first — two NaNs are treated equal,
    one NaN still compares false. dropped the $odc_NAN_IS_OK env var that was
    guarding the flag-controlled version

  • gatherStats early-returns on NaN, so min/max only track real values —
    fixes the Double/Real poisoning and stops Int/Bit from reinterpret-casting
    the NaN bits as a huge int

  • NaN writes to Int/Bit columns are coerced to the column's existing
    missing-value sentinel with a log warning

Contributor Declaration

By opening this pull request, I affirm the following:

  • All authors agree to the Contributor License Agreement.
  • The code follows the project's coding standards.
  • I have performed self-review and added comments where needed.
  • I have added or updated tests to verify that my changes are effective and functional.
  • I have run all existing tests and confirmed they pass.

📖 Documentation 📖
https://sites.ecmwf.int/docs/dev-section/odc/pull-requests/PR-52

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Apr 23, 2026

Codecov Report

❌ Patch coverage is 99.31507% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 87.02%. Comparing base (424ce3d) to head (6015bc5).

Files with missing lines Patch % Lines
src/odc/Comparator.h 75.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop      #52      +/-   ##
===========================================
+ Coverage    86.91%   87.02%   +0.11%     
===========================================
  Files          188      189       +1     
  Lines        15022    15157     +135     
  Branches      1363     1373      +10     
===========================================
+ Hits         13056    13191     +135     
  Misses        1966     1966              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants