Fix: Add task_status check to prevent duplicate task execution #15

Merged
ChaoWao merged 1 commit into hw-native-sys:main from ChaoZheng109:runtime-aicpu1
Jan 29, 2026

@ChaoZheng109
Collaborator

Add task_status verification before executing tasks in aicore_executor. This check was lost during a previous rebase and could cause tasks to be executed multiple times.
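The guard described above can be illustrated with a minimal sketch. The PR does not show the executor code, and the real aicore_executor is presumably not written in Python, so every name here (`TaskStatus`, `Task`, `execute_once`) is hypothetical. The sketch only shows the general idea: check a task's status before running it, so a task that is dispatched twice runs once.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class TaskStatus(Enum):
    # Hypothetical states; the real task_status values are not shown in the PR.
    PENDING = 0
    RUNNING = 1
    DONE = 2


@dataclass
class Task:
    task_id: int
    status: TaskStatus = TaskStatus.PENDING


def execute_once(task: Task, kernel: Callable[[Task], None]) -> bool:
    """Run `kernel` only if the task has not been started yet.

    Returns True if the task was executed, False if it was skipped.
    """
    # The status check is the guard: without it, a task that is seen
    # twice by the executor would be run twice.
    if task.status is not TaskStatus.PENDING:
        return False
    task.status = TaskStatus.RUNNING
    kernel(task)
    task.status = TaskStatus.DONE
    return True
```

This single-threaded sketch ignores concurrency; an executor shared between cores would additionally need the check and the status update to be atomic.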

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@ChaoWao ChaoWao merged commit 102e18c into hw-native-sys:main Jan 29, 2026
PKUZHOU pushed a commit to PKUZHOU/simpler that referenced this pull request Mar 31, 2026
…tive-sys#15)

Add task_status verification before executing tasks in aicore_executor.
This check was lost during a previous rebase and could cause tasks to
be executed multiple times.

Co-authored-by: ChaoZheng109 <zhengchao47@huawei.com>
ChaoWao added a commit to ChaoWao/simpler-fork that referenced this pull request Apr 11, 2026
On macOS, `python ci.py -p a2a3sim` (or a5sim) aborts every task with
"OMP: Error #15: Initializing libomp.dylib, but found libomp.dylib
already initialized" (SIGABRT) before any DeviceRunner code runs.

Two distinct libomp.dylib copies get mapped into the single CI process:
homebrew's /opt/homebrew/opt/libomp/lib/libomp.dylib (via numpy ->
openblas) and pip torch's .venv/.../torch/lib/libomp.dylib. They have
different install names, so dyld loads them both and Intel's libomp
aborts on the second init. Surfaced after hw-native-sys#493 collapsed sim CI into
one long-lived Python process; each golden's `import numpy`/`import
torch` now accumulates conflicting libomps in the same address space.

- Set KMP_DUPLICATE_LIB_OK=TRUE at the top of ci.py on darwin, before
  any import that can transitively pull in numpy or torch. This is
  Intel's documented escape hatch; safe for our workload where numpy
  and torch are only used for golden reference math, not parallel
  OMP regions.
- Document the full root cause, debugging steps, and explicit
  "what not to do" list in docs/macos-libomp-collision.md so future
  contributors don't re-investigate. Link it from docs/ci.md.
- Rewrite the two remaining numpy-based goldens
  (a2a3/{aicpu,host}_build_graph/bgemm) in torch for style consistency
  with the rest of examples/. Note this does not avoid the libomp
  collision on its own -- `import torch` transitively imports numpy.
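
The first bullet can be sketched roughly as follows. This is an illustration, not the actual contents of ci.py: the helper name `allow_duplicate_libomp` and its parameters are assumptions, and using `setdefault` (so an explicit user setting is respected) is a choice made for this sketch only.

```python
import os
import sys


def allow_duplicate_libomp(platform: str = sys.platform, env=os.environ) -> bool:
    """Set KMP_DUPLICATE_LIB_OK=TRUE on macOS; return True if it applies.

    Hypothetical helper: the commit only says the variable is set at the
    top of ci.py on darwin, before numpy or torch can be imported.
    """
    if platform != "darwin":
        return False
    # setdefault keeps any value the user already exported.
    env.setdefault("KMP_DUPLICATE_LIB_OK", "TRUE")
    return True


# Must run before any import that can transitively load libomp
# (numpy, torch), otherwise the runtime has already been initialized.
allow_duplicate_libomp()
```

The ordering is the point: the variable is read when the OpenMP runtime initializes, so setting it after `import numpy` or `import torch` would be too late.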

Verified: `python ci.py` passes 32/32 sim tests (20 a2a3sim +
12 a5sim) on macOS without KMP_DUPLICATE_LIB_OK needing to be set
manually.