Skip to content

Phase 9 retroactive review-fix: signed-zero, MXCSR doc, parity coverage#58

Merged
gvonness-apolitical merged 1 commit intomainfrom
fix/phase9-review
Apr 18, 2026
Merged

Phase 9 retroactive review-fix: signed-zero, MXCSR doc, parity coverage#58
gvonness-apolitical merged 1 commit intomainfrom
fix/phase9-review

Conversation

@gvonness-apolitical
Copy link
Copy Markdown
Contributor

Summary

Retroactive multi-agent review of the two Phase 9 commits (b4bc927 + 9b17193) surfaced three concrete issues that weren't caught before merge:

  • atan2_partials signed-zero regression: rewritten from -a/h/h to T::zero() - a/h/h during the opcode→kernels extraction, which flattens -(+0.0) = -0.0 to +0.0 under IEEE round-to-nearest. Observable via is_sign_negative / copysign on gradients. Restored unary negation.
  • FTZ doc comment foot-gun: the original recipe (_mm_setcsr(0x9FC0)) was a full MXCSR overwrite that clobbers any caller-set rounding mode and also enables DAZ unannounced. Replaced with a read-modify-write idiom and clarified DAZ separation.
  • Parity test coverage gap: the 26-case table omitted acosh (ironically, one of the helpers the Phase 9 refactor introduced), plus atanh/asin/acos/exp2/log2/log10/rem/powi/powf. Added 11 cases, all three runners pass on the expanded table.

Also surfaces the silent-skip behaviour when no GPU is available via eprintln!.

Test plan

  • wgpu Metal (M4 Max) — all 37 cases pass
  • CUDA A100 via vast.ai — f32 and f64 runners pass
  • Full echidna test suite green with all relevant features
  • cargo fmt --check and cargo clippy -- -D warnings clean

Retroactive multi-agent review of the Phase 9 commits flagged:

- `atan2_partials` regressed signed-zero behaviour. The refactor from
  inline `(b/h/h, -a/h/h)` to `(b/h/h, T::zero() - a/h/h)` flattens
  `∂atan2/∂a` at `a = +0.0` to `+0.0` under round-to-nearest, where
  unary negation correctly yields `-0.0`. Observable downstream via
  `is_sign_negative` / `copysign` on gradients. Restored unary `-a/h/h`
  and documented the IEEE signed-zero invariant.
- `Tape::reverse` FTZ doc recommended `_mm_setcsr(0x9FC0)` — a full
  MXCSR overwrite that clobbers any pre-existing rounding mode (e.g.
  interval-arithmetic crates running with `FE_DOWNWARD`) and also
  enables DAZ (input denormals flushed) unannounced. Replaced with a
  read-modify-write idiom that only sets bit 15 (FTZ) and shows the
  restore. Clarified DAZ is separate.
- `tests/gpu_cpu_parity.rs` omitted 10+ opcodes that the bytecode ISA
  carries, including `acosh` — a helper introduced by the same Phase 9
  commit as the parity harness. Added 11 new cases: acosh, atanh,
  asin, acos, exp2, log2, log10, rem, powi, powf, plus filling gaps
  reviewed by the coverage audit. All runners (wgpu + CUDA f32 + CUDA
  f64) green on the expanded table.
- Silent-skip when `WgpuContext::new()` or `CudaContext::new()`
  returns `None` now prints an explicit `eprintln!` so `cargo test
  -- --nocapture` surfaces the skip instead of reporting a green
  result that ran zero assertions.

Verified on M4 Max (wgpu) and A100 via vast.ai (CUDA f32 + f64).
@gvonness-apolitical gvonness-apolitical merged commit 8eb04c5 into main Apr 18, 2026
6 checks passed
@gvonness-apolitical gvonness-apolitical deleted the fix/phase9-review branch April 18, 2026 20:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant