[CI][Diag] Dump numpy/scipy/openblas state pre-pytest #5626

Closed
hujc7 wants to merge 1 commit into isaac-sim:develop from hujc7:jichuanh/ci-scipy-pin-diag

hujc7 commented May 15, 2026

Description

Pure-diagnostic companion to the CI instability investigation in #C06HLQ6CB41 and OMPE-92261.

Why

The hypothesis being explored is that the Isaac Sim base image, which rotated on 2026-05-12, picked up a NumPy/SciPy wheel whose bundled OpenBLAS triggers the well-documented atfork SIGSEGV (see numpy#30092, scipy#23686). SciPy 1.16.1 is reported clean; 1.16.2 and later are reported to have the regression.

But we don't actually know what versions are inside the currently pinned Isaac Sim image (latest-develop@sha256:0dd49a11… from PR #5600). Without that, we can't:

  • confirm the version cliff,
  • decide whether to pin scipy<1.16.2,
  • or correlate future SIGSEGVs with specific image rotations.

What

Adds a 4-line diagnostic print right before pytest runs, emitting:

  • pip show numpy scipy (Name / Version / Location)
  • numpy.__version__ and scipy.__version__
  • the actual on-disk filename of the bundled libscipy_openblas*.so (the same name that shows up in the crash backtrace)

Output goes to the GitHub Actions log; no artifact upload, no test impact. ~1 second per job.
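
For readers who want to reproduce the dump locally, here is a minimal Python sketch of the same kind of report. It is not the code added by this PR. The helper name `describe` is hypothetical, and the sketch assumes Linux wheels that keep their bundled shared libraries in a sibling `<package>.libs` directory. It reads versions from installed metadata instead of importing the packages, so running it does not itself load OpenBLAS.

```python
import glob
import importlib.metadata as md
import importlib.util
import os


def describe(name):
    """Collect version, install location and bundled OpenBLAS filenames.

    Hypothetical helper: nothing is imported, only metadata and the
    filesystem are inspected.
    """
    info = {"name": name, "version": None, "location": None, "openblas": []}
    try:
        info["version"] = md.version(name)
    except md.PackageNotFoundError:
        return info  # package is not installed in this environment

    spec = importlib.util.find_spec(name)
    if spec is None or not spec.submodule_search_locations:
        return info

    pkg_dir = list(spec.submodule_search_locations)[0]
    info["location"] = os.path.dirname(pkg_dir)

    # Assumption: bundled libraries live in "<site-packages>/<name>.libs".
    libs_dir = os.path.join(info["location"], name + ".libs")
    pattern = os.path.join(libs_dir, "libscipy_openblas*.so*")
    info["openblas"] = sorted(os.path.basename(p) for p in glob.glob(pattern))
    return info


if __name__ == "__main__":
    for pkg in ("numpy", "scipy"):
        report = describe(pkg)
        print(f"{report['name']}: version={report['version']} "
              f"location={report['location']} openblas={report['openblas']}")
```

If the `openblas` list comes back empty for an installed package, the wheel layout probably differs from the assumed one, and the glob pattern needs adjusting before drawing any conclusion.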

Relationship to companion PR

This PR is paired with the env-var fix-attempt PR, which tests the OpenBLAS thread-pool suppression hypothesis. Decision tree:

| Env-var PR result | This PR's manifest | Conclusion |
| --- | --- | --- |
| failure rate drops | (any) | Env vars sufficient; land that PR |
| failure rate stays | scipy ≥ 1.16.2 | Pin scipy<1.16.2 (follow-up) |
| failure rate stays | scipy < 1.16.2 | OpenBLAS hypothesis wrong; pivot |
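
The decision tree above can also be written as a small function, which makes the 1.16.2 cutoff explicit. This is an illustrative sketch only. The names `parse_release` and `conclude` are hypothetical, and the version parsing is deliberately simple: it compares the first three numeric components and ignores pre-release suffixes.

```python
import re


def parse_release(version):
    """Turn a version string such as '1.16.2' into a 3-tuple of ints.

    Simplification: local tags ('+...') are dropped and any pre-release
    suffix is ignored, so '1.16.2rc1' compares equal to '1.16.2'.
    """
    numbers = [int(n) for n in re.findall(r"\d+", version.split("+")[0])[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def conclude(failure_rate_dropped, scipy_version):
    """Map the two observations onto the conclusions in the decision tree."""
    if failure_rate_dropped:
        return "Env vars sufficient; land the env-var PR"
    if parse_release(scipy_version) >= (1, 16, 2):
        return "Pin scipy<1.16.2 (follow-up)"
    return "OpenBLAS hypothesis wrong; pivot"


if __name__ == "__main__":
    print(conclude(False, "1.16.2"))
```

The first branch ignores the SciPy version on purpose, matching the "(any)" row of the table.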

Type of change

  • Diagnostic / non-functional

Checklist

CI is hitting non-deterministic SIGSEGV crashes that look like the
OpenBLAS atfork regression tracked at numpy#30092 and scipy#23686.
The hypothesis is that the Isaac Sim base image rotated on
2026-05-12 picked up a scipy>=1.16.2 wheel whose bundled OpenBLAS
registers a buggy atfork handler.

Without knowing the actual numpy/scipy versions inside the pinned
test image, we can't confirm the hypothesis or pick a remediation
version.  This adds a small diagnostic print right before pytest
that emits:

  - pip show numpy scipy
  - numpy.__version__ / scipy.__version__
  - the on-disk filename of the bundled libscipy_openblas*.so

so the GitHub Actions log captures the actual deployed versions
for every test job.  The output is a few lines of plain text, no
artifacts, no impact on test execution.