Phase 9 retroactive review-fix: signed-zero, MXCSR doc, parity coverage #58
Merged
gvonness-apolitical merged 1 commit into main on Apr 18, 2026
Retroactive multi-agent review of the Phase 9 commits flagged:

- `atan2_partials` regressed signed-zero behaviour. The refactor from inline `(b/h/h, -a/h/h)` to `(b/h/h, T::zero() - a/h/h)` flattens the `-a/h/h` partial at `a = +0.0` to `+0.0` under round-to-nearest, where unary negation correctly yields `-0.0`. Observable downstream via `is_sign_negative` / `copysign` on gradients. Restored unary `-a/h/h` and documented the IEEE signed-zero invariant.
- `Tape::reverse` FTZ doc recommended `_mm_setcsr(0x9FC0)`, a full MXCSR overwrite that clobbers any pre-existing rounding mode (e.g. interval-arithmetic crates running with `FE_DOWNWARD`) and also enables DAZ (input denormals flushed) unannounced. Replaced with a read-modify-write idiom that only sets bit 15 (FTZ) and shows the restore. Clarified that DAZ is a separate bit.
- `tests/gpu_cpu_parity.rs` omitted 10+ opcodes that the bytecode ISA carries, including `acosh`, a helper introduced by the same Phase 9 commit as the parity harness. Added 11 new cases: acosh, atanh, asin, acos, exp2, log2, log10, rem, powi, powf, plus cases filling gaps reviewed by the coverage audit. All runners (wgpu + CUDA f32 + CUDA f64) green on the expanded table.
- The silent skip when `WgpuContext::new()` or `CudaContext::new()` returns `None` now prints an explicit `eprintln!` message, so `cargo test -- --nocapture` surfaces the skip instead of reporting a green result that ran zero assertions.

Verified on M4 Max (wgpu) and A100 via vast.ai (CUDA f32 + f64).
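The signed-zero difference described above can be reproduced in isolation. The following is a minimal sketch, not the crate's actual `atan2_partials`: it uses plain `f64` instead of the generic `T`, and the helper names `neg_term` and `sub_term` are hypothetical stand-ins for the two spellings of the `-a/h/h` term.

```rust
// Sketch only: `neg_term` and `sub_term` are hypothetical helpers that
// isolate the two spellings of the `-a/h/h` term, specialised to f64.

/// Unary negation flips the sign bit, so -(+0.0) is -0.0.
pub fn neg_term(a: f64, h: f64) -> f64 {
    -a / h / h
}

/// Subtraction from zero: (+0.0) - (+0.0) is +0.0 under round-to-nearest,
/// so the sign of a zero result is lost.
pub fn sub_term(a: f64, h: f64) -> f64 {
    0.0 - a / h / h
}

/// Returns the sign-bit state of both spellings at a = +0.0, h = 1.0.
pub fn sign_bits_at_positive_zero() -> (bool, bool) {
    (
        neg_term(0.0, 1.0).is_sign_negative(),
        sub_term(0.0, 1.0).is_sign_negative(),
    )
}
```

The two results compare equal under `==`, which is why an ordinary value-equality test does not catch the regression; only sign-sensitive operations such as `is_sign_negative` or `copysign` distinguish them.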
Summary
Retroactive multi-agent review of the two Phase 9 commits (b4bc927 + 9b17193) surfaced three concrete issues that weren't caught before merge:

- `atan2_partials` signed-zero regression: rewritten from `-a/h/h` to `T::zero() - a/h/h` during the opcode→kernels extraction, which flattens `-(+0.0) = -0.0` to `+0.0` under IEEE round-to-nearest. Observable via `is_sign_negative` / `copysign` on gradients. Restored unary negation.
- `Tape::reverse` FTZ doc: the recommended call (`_mm_setcsr(0x9FC0)`) was a full MXCSR overwrite that clobbers any caller-set rounding mode and also enables DAZ unannounced. Replaced with a read-modify-write idiom and clarified DAZ separation.
- `tests/gpu_cpu_parity.rs` coverage gap: the table omitted `acosh` (ironically, one of the helpers the Phase 9 refactor introduced), plus `atanh` / `asin` / `acos` / `exp2` / `log2` / `log10` / `rem` / `powi` / `powf`. Added 11 cases; all three runners pass on the expanded table.

Also surfaces the silent-skip behaviour when no GPU is available via `eprintln!`.

Test plan
- `cargo fmt --check` and `cargo clippy -- -D warnings` clean
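For reference, the read-modify-write idea from the MXCSR item can be sketched as a small RAII guard. This is an illustration under stated assumptions, not the text that landed in the `Tape::reverse` doc: `FtzGuard`, `with_ftz`, `ftz_enabled`, and the `mxcsr` module are hypothetical names, and `_mm_getcsr` / `_mm_setcsr` are marked deprecated in recent Rust toolchains (hence the `allow`). On non-x86_64 targets the sketch degrades to a no-op.

```rust
// Sketch only: all names below are hypothetical, not the crate's API.

/// MXCSR bit 15: flush-to-zero (denormal results become zero).
/// DAZ is a separate bit (bit 6) and is left untouched here.
pub const FTZ_BIT: u32 = 1 << 15;

#[cfg(target_arch = "x86_64")]
#[allow(deprecated)]
mod mxcsr {
    use std::arch::x86_64::{_mm_getcsr, _mm_setcsr};
    pub fn read() -> u32 {
        unsafe { _mm_getcsr() }
    }
    pub fn write(v: u32) {
        unsafe { _mm_setcsr(v) }
    }
}

#[cfg(not(target_arch = "x86_64"))]
mod mxcsr {
    // No MXCSR on this target: model it as an always-zero register.
    pub fn read() -> u32 {
        0
    }
    pub fn write(_v: u32) {}
}

/// Sets FTZ on creation and puts the FTZ bit back to its prior state on drop.
pub struct FtzGuard {
    had_ftz: bool,
}

impl FtzGuard {
    pub fn enable() -> Self {
        let csr = mxcsr::read();
        // Read-modify-write: only bit 15 changes, so rounding mode,
        // DAZ, and exception masks keep whatever the caller set.
        mxcsr::write(csr | FTZ_BIT);
        FtzGuard { had_ftz: csr & FTZ_BIT != 0 }
    }
}

impl Drop for FtzGuard {
    fn drop(&mut self) {
        if !self.had_ftz {
            // Clear only the FTZ bit; leave every other field as it is now.
            mxcsr::write(mxcsr::read() & !FTZ_BIT);
        }
    }
}

/// Reports whether FTZ is currently set on this thread.
pub fn ftz_enabled() -> bool {
    mxcsr::read() & FTZ_BIT != 0
}

/// Runs `f` with FTZ set, restoring the previous FTZ state afterwards
/// (including when `f` panics, since the guard is dropped on unwind).
pub fn with_ftz<R>(f: impl FnOnce() -> R) -> R {
    let _guard = FtzGuard::enable();
    f()
}
```

A caller would wrap the hot loop, for example `with_ftz(|| run_reverse_sweep())`, where `run_reverse_sweep` is a placeholder for whatever work benefits from FTZ. Compared with writing a fixed constant, this leaves a caller-chosen rounding mode intact.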