Skip to content

COMP: Use pzero() for SelfadjointMatrixVector accumulators (silences GCC -O3 IPA-CP false positives)#6166

Merged
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:mainfrom
hjmjohnson:eigen3-pzero-init-selfadjoint
Apr 29, 2026
Merged

COMP: Use pzero() for SelfadjointMatrixVector accumulators (silences GCC -O3 IPA-CP false positives)#6166
hjmjohnson merged 1 commit intoInsightSoftwareConsortium:mainfrom
hjmjohnson:eigen3-pzero-init-selfadjoint

Conversation

@hjmjohnson
Copy link
Copy Markdown
Member

Switch the two ptmp2 / ptmp3 packet-accumulator initializations in Eigen/src/Core/products/SelfadjointMatrixVector.h from pset1<Packet>(t) to pzero(Packet{}), matching upstream Eigen master. This silences four false-positive -Wmaybe-uninitialized warnings GCC emits at -O3 after IPA-CP clones the function as run.constprop.isra.

Must merge before #6164, which switches ITK.Linux.Python CI from MinSizeRel to Release and is currently red because of these exact warnings.

Why the existing in-header diagnostic-pop doesn't help

Modules/ThirdParty/Eigen3/src/itkeigen/Eigen/src/Core/products/SelfadjointMatrixVector.h already wraps its body in:

#if defined(__GNUC__) && !defined(__clang__)
#  pragma GCC diagnostic push
#  pragma GCC diagnostic ignored "-Wmaybe-uninitialized"
#endif

That suppression is stateful at parse time — it applies while GCC parses this header. The actual warnings are emitted from selfadjoint_matrix_vector_product::run.constprop.isra, a cloned function GCC creates during interprocedural constant propagation at -O2/-O3. The diagnostic state at the point GCC analyses the clone is whatever the call-site translation unit had active, not the suppression baked into the header. This is a long-standing GCC interaction with -fipa-cp-clone.

The proper fix is therefore at the source: give GCC IPA-CP a clearly-zero starting packet so it doesn't ever build a "maybe uninit" hypothesis in the first place.

What changed (2-line diff)
     Scalar t2(0);
-    Packet ptmp2 = pset1<Packet>(t2);
+    Packet ptmp2 = pzero(Packet{});
     Scalar t3(0);
-    Packet ptmp3 = pset1<Packet>(t3);
+    Packet ptmp3 = pzero(Packet{});

Algorithmic behaviour is unchanged: pset1<Packet>(0) and pzero(Packet{}) both produce a zero packet. Upstream Eigen master uses the latter form because:

  • Packet{} is value-initialization (explicit zero, visible to the optimizer as a constant).
  • pzero is a small inline that returns a zero packet of the deduced type.
  • The combination collapses cleanly under IPA-CP, leaving no "maybe uninit" hypothesis.

pzero already exists in our vendored Eigen at Modules/ThirdParty/Eigen3/src/itkeigen/Eigen/src/Core/GenericPacketMath.h:425, so no new symbols are needed.

Upstream Eigen master also did a separate, larger refactor of this same kernel (2-column → 4-column processing, four accumulators ptmp4ptmp7). That refactor is intentionally not included here — it's a bigger change with its own performance trade-offs and reviewer surface, orthogonal to fixing the warning.

Why this affects ITK.Linux.Python specifically

ITK.Linux.Python builds the SuperBuild with ITK_WRAP_PYTHON=ON, which generates many SWIG wrapper translation units that include Eigen and instantiate selfadjoint_matrix_vector_product. The combination of (a) -O3 (Release), (b) GCC's -fipa-cp-clone enabled at that level, and (c) the high template instantiation count is what makes the warnings reproducible there. MinSizeRel (-Os) typically disables IPA-CP cloning, which is why the warnings only appeared after #6164's MinSizeRel→Release switch.

Empirical CDash evidence from #6164 build #14600: 0 errors, 4 warnings, 175 tests passed. The warning text was:

SelfadjointMatrixVector.h:231: warning: '_28' may be used uninitialized [-Wmaybe-uninitialized]
SelfadjointMatrixVector.h:231: warning: 'result_90' may be used uninitialized [-Wmaybe-uninitialized]

_28 and result_90 are GCC SSA temporaries inside the cloned function; they correspond to ptmp2/ptmp3 after IPA-CP rewriting.

Test plan

@github-actions github-actions Bot added type:Compiler Compiler support or related warnings area:ThirdParty Issues affecting the ThirdParty module labels Apr 29, 2026
@hjmjohnson hjmjohnson marked this pull request as ready for review April 29, 2026 10:46
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This is a minimal, targeted two-line fix in the vendored Eigen SelfadjointMatrixVector kernel. It replaces pset1<Packet>(t2/t3) (where t2/t3 are zero-initialized scalars) with pzero(Packet{}), producing semantically identical zero-valued packet accumulators while giving GCC's IPA-CP pass a clearly-constant zero that eliminates false-positive -Wmaybe-uninitialized warnings in -O3 builds. pzero is already present in the vendored GenericPacketMath.h (line 425), and the scalar variables t2/t3 continue to serve their own role as scalar accumulators in the loop body.

Confidence Score: 5/5

Safe to merge — the two-line change is semantically equivalent and directly matches the upstream Eigen fix.

pzero(Packet{}) and pset1<Packet>(Scalar(0)) produce identical zero-valued packets; pzero is already present in the vendored Eigen headers; scalar accumulators t2/t3 are unaffected; no algorithmic behaviour changes. The fix is well-motivated, minimal, and aligned with upstream Eigen master.

No files require special attention.

Important Files Changed

Filename Overview
Modules/ThirdParty/Eigen3/src/itkeigen/Eigen/src/Core/products/SelfadjointMatrixVector.h Replaces pset1<Packet>(t2/t3) with pzero(Packet{}) for zero-initialized packet accumulators; semantically identical but eliminates GCC IPA-CP false-positive -Wmaybe-uninitialized warnings at -O3.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["selfadjoint_matrix_vector_product::run()"] --> B["Initialize ptmp0 = pset1(t0)\nInitialize ptmp1 = pset1(t1)"]
    B --> C["Initialize ptmp2 = pzero(Packet{})\n(was: pset1(t2) where t2=0)\nInitialize ptmp3 = pzero(Packet{})\n(was: pset1(t3) where t3=0)"]
    C --> D{"FirstTriangular?"}
    D -- Yes --> E["Accumulate t3 scalar\nAccumulate ptmp2/ptmp3 packets\nvia aligned vectorized loop"]
    D -- No --> F["Accumulate t2 scalar\nAccumulate ptmp2/ptmp3 packets\nvia aligned vectorized loop"]
    E --> G["Reduce packets → scalars\nt2 += predux(ptmp2)\nt3 += predux(ptmp3)"]
    F --> G
    G --> H["Write results back to res[j], res[j+1]"]
Loading

Reviews (1): Last reviewed commit: "COMP: Use pzero() for SelfadjointMatrixV..." | Re-trigger Greptile

GCC -O3 IPA-CP emits false-positive -Wmaybe-uninitialized on the
cloned `selfadjoint_matrix_vector_product::run.constprop.isra`:
`pset1<Packet>(Scalar(0))` is opaque to interprocedural constant
propagation, while `pzero(Packet{})` (used by upstream Eigen master)
is visible as a constant. Backport just the ptmp2/ptmp3 init lines.
@hjmjohnson hjmjohnson force-pushed the eigen3-pzero-init-selfadjoint branch from e4799ec to 91f6232 Compare April 29, 2026 10:50
@hjmjohnson hjmjohnson requested a review from dzenanz April 29, 2026 11:12
@hjmjohnson
Copy link
Copy Markdown
Member Author

@dzenanz Changing from RelMinSize -> Release in #6164 invoked the IPO toolchain that exposed an eigen warning. This PR backports a more robust eigen variable initialization that propagates through the IPO tool chain to eliminate the warnings.

Copy link
Copy Markdown
Member

@dzenanz dzenanz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM on a glance.

@hjmjohnson hjmjohnson merged commit 527b6c7 into InsightSoftwareConsortium:main Apr 29, 2026
13 of 16 checks passed
hjmjohnson added a commit that referenced this pull request May 1, 2026
Follow-up to #6166. On ubuntu-24.04 + GCC 13/14 + C++20 (the toolchain
that #6164 introduces), GCC -O3 IPA-CP emits additional false-positive
-Wmaybe-uninitialized at the outer selfadjoint_product_impl::run()
call site, attributing the read to "argument 3 of type const float *"
of the cloned selfadjoint_matrix_vector_product::run.constprop.isra.

Add pzero(Packet{}) declare-then-assign init for the four Packet
locals (A0i, A1i, Bi, Xi) loaded from the lhs/rhs/res pointers inside
the inner loop. Mirrors the #6166 ptmp2/ptmp3 pattern: pzero() is
visible to GCC's IPA-CP as a definite zero starting state, while a
bare declaration plus subsequent ploadu/pload is opaque and lets the
optimizer trace an uninit path through cloned variants.
@hjmjohnson hjmjohnson added this to the ITK 6.0 Release Candidate 1 milestone May 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:ThirdParty Issues affecting the ThirdParty module type:Compiler Compiler support or related warnings

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants