hw-native-sys · uv-xiao · Apr 1, 2026 · Apr 2, 2026 · Apr 3, 2026 · Apr 3, 2026
diff --git a/.agents b/.agents
@@ -0,0 +1 @@
+.claude
diff --git a/.claude/commands/perf-example-device.md b/.claude/commands/perf-example-device.md
@@ -3,8 +3,9 @@ Benchmark the hardware performance of a single example at $ARGUMENTS.
 Reference `tools/benchmark_rounds.sh` for the full implementation pattern (device log resolution, timing parsing, reporting format). This skill runs the same logic but for a single example only.
 
 1. Verify `$ARGUMENTS` exists and contains `kernels/kernel_config.py` and `golden.py`
-2. Check `command -v npu-smi` — if not found, tell the user this requires hardware and stop
-3. **Detect platform**: Run `npu-smi info` and parse the chip name. Map `910B`/`910C` → `a2a3`, `950` → `a5`. If unrecognized, warn and default to `a2a3`
-4. Find the lowest-ID idle device (HBM-Usage = 0) from the `npu-smi info` output. If none, stop
-5. Run the example following the same pattern as `run_bench()` in `tools/benchmark_rounds.sh`:
+2. Require the example path to live under `examples/a2a3/` or `examples/a5/`. If it does not, stop and report that root-level `examples/{runtime}/...` paths are invalid.
+3. Check `command -v npu-smi` — if not found, tell the user this requires hardware and stop
+4. **Detect platform**: Infer the architecture from the example path (`examples/a2a3/...` → `a2a3`, `examples/a5/...` → `a5`). Use `npu-smi info` only as a sanity check; if the detected chip family conflicts with the path, report the mismatch and stop instead of silently switching platforms.
+5. Find the lowest-ID idle device (HBM-Usage = 0) from the `npu-smi info` output. If none, stop
+6. Run the example following the same pattern as `run_bench()` in `tools/benchmark_rounds.sh`:
    - Snapshot logs, run `run_example.py` with `-n 10`, find new log, parse timing, report results
diff --git a/.claude/commands/profile.md b/.claude/commands/profile.md
@@ -1,6 +1,8 @@
 Run the example at $ARGUMENTS with profiling enabled on hardware.
 
 1. Verify the directory exists and contains `kernels/kernel_config.py` and `golden.py`
-2. Run: `python examples/scripts/run_example.py -k $ARGUMENTS/kernels -g $ARGUMENTS/golden.py -p a2a3 --enable-profiling`
-3. If the test passes, report the swimlane output file location in `outputs/`
-4. Summarize the task statistics from the console output (per-function timing breakdown)
+2. Require the example path to live under `examples/a2a3/` or `examples/a5/`. If it does not, stop and report that root-level `examples/{runtime}/...` paths are invalid.
+3. Infer the platform from the example path (`examples/a2a3/...` → `a2a3`, `examples/a5/...` → `a5`).
+4. Run: `python examples/scripts/run_example.py -k $ARGUMENTS/kernels -g $ARGUMENTS/golden.py -p <platform> --enable-profiling`
+5. If the test passes, report the swimlane output file location in `outputs/`
+6. Summarize the task statistics from the console output (per-function timing breakdown)
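Review note: the path rule and platform inference added to the command files above can be sketched in Python. This is a minimal illustration, not code from the repository. The helper names `infer_platform` and `build_profile_command` are hypothetical, and the `sim` flag simply appends the `sim` suffix used by the simulation command.

```python
from pathlib import PurePosixPath

# Valid example roots, as stated in the updated command docs.
VALID_ARCHES = ("a2a3", "a5")


def infer_platform(example_path: str, sim: bool = False) -> str:
    """Hypothetical helper: derive the platform from examples/{arch}/..."""
    parts = PurePosixPath(example_path).parts
    if "examples" not in parts:
        raise ValueError(f"not an example path: {example_path}")
    idx = parts.index("examples")
    arch = parts[idx + 1] if idx + 1 < len(parts) else ""
    if arch not in VALID_ARCHES:
        # Covers root-level examples/{runtime}/... paths, which are invalid.
        raise ValueError(
            f"{example_path} must live under examples/a2a3/ or examples/a5/"
        )
    return arch + "sim" if sim else arch


def build_profile_command(example_path: str) -> list[str]:
    """Assemble the profiling command from step 4 as an argument list."""
    platform = infer_platform(example_path)
    return [
        "python", "examples/scripts/run_example.py",
        "-k", f"{example_path}/kernels",
        "-g", f"{example_path}/golden.py",
        "-p", platform,
        "--enable-profiling",
    ]
```

Raising on an invalid path, rather than falling back to a default, mirrors the diff's removal of the old "default to `a2a3`" behavior.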
diff --git a/.claude/commands/test-example-device.md b/.claude/commands/test-example-device.md
@@ -1,8 +1,9 @@
 Run the hardware device test for the example at $ARGUMENTS.
 
 1. Verify the directory exists and contains `kernels/kernel_config.py` and `golden.py`
-2. Check `command -v npu-smi` — if not found, tell the user to use `/test-example-sim` instead and stop
-3. **Detect platform**: Run `npu-smi info` and parse the chip name. Map `910B`/`910C` → `a2a3`, `950` → `a5`. If unrecognized, warn and default to `a2a3`
-4. Read `.github/workflows/ci.yml` to extract the current `-c` (pto-isa commit) flag from the `st-onboard-<platform>` job's `./ci.sh` invocation
-5. Run: `python examples/scripts/run_example.py -k $ARGUMENTS/kernels -g $ARGUMENTS/golden.py -p <platform> -c <commit>`
-6. Report pass/fail status with any error output
+2. Require the example path to live under `examples/a2a3/` or `examples/a5/`. If it does not, stop and report that root-level `examples/{runtime}/...` paths are invalid.
+3. Check `command -v npu-smi` — if not found, tell the user to use `/test-example-sim` instead and stop
+4. **Detect platform**: Infer the architecture from the example path (`examples/a2a3/...` → `a2a3`, `examples/a5/...` → `a5`). Use `npu-smi info` only as a sanity check; if the detected chip family conflicts with the path, report the mismatch and stop instead of silently switching platforms.
+5. Read `.github/workflows/ci.yml` to extract the current `-c` (pto-isa commit) flag from the `st-onboard-<platform>` job's `./ci.sh` invocation
+6. Run: `python examples/scripts/run_example.py -k $ARGUMENTS/kernels -g $ARGUMENTS/golden.py -p <platform> -c <commit>`
+7. Report pass/fail status with any error output
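Review note: the sanity check in step 4 (stop on a chip/path mismatch instead of switching platforms) could look like the sketch below. The chip-to-architecture mapping comes from the removed lines of this hunk. Everything else is an assumption: the function names are hypothetical, the chip name is assumed to be already extracted from `npu-smi info` output, substring matching is a guess at how the name is compared, and an unrecognized chip is treated here as "nothing to compare", which the diff does not specify.

```python
from typing import Optional

# Mapping taken from the previous wording of the command file.
CHIP_TO_ARCH = {"910B": "a2a3", "910C": "a2a3", "950": "a5"}


def arch_from_chip(chip_name: str) -> Optional[str]:
    """Return the architecture for a chip name, or None if unrecognized."""
    for marker, arch in CHIP_TO_ARCH.items():
        if marker in chip_name:  # assumption: substring match is sufficient
            return arch
    return None


def check_chip_matches_path(path_arch: str, chip_name: str) -> None:
    """Raise if the detected chip family conflicts with the path's arch."""
    detected = arch_from_chip(chip_name)
    if detected is None:
        # Assumption: an unrecognized chip gives nothing to cross-check.
        return
    if detected != path_arch:
        raise RuntimeError(
            f"path implies {path_arch}, but chip {chip_name!r} "
            f"maps to {detected}; refusing to switch platforms"
        )
```

The path stays the single source of truth for the platform, and the hardware query can only veto a run, never redirect it.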
diff --git a/.claude/commands/test-example-sim.md b/.claude/commands/test-example-sim.md
@@ -1,7 +1,8 @@
 Run the simulation test for the example at $ARGUMENTS.
 
 1. Verify the directory exists and contains `kernels/kernel_config.py` and `golden.py`
-2. Read `.github/workflows/ci.yml` to extract the current `-c` (pto-isa commit) flag from the `st-sim-*` jobs' `./ci.sh` invocations
-3. **Detect platform**: Infer the architecture from the example path (e.g., `examples/a2a3/...` → `a2a3sim`, `examples/a5/...` → `a5sim`). If the path doesn't contain an arch prefix, default to `a2a3sim`
-4. Run: `python examples/scripts/run_example.py -k $ARGUMENTS/kernels -g $ARGUMENTS/golden.py -p <platform> -c <commit>`
-5. Report pass/fail status with any error output
+2. Require the example path to live under `examples/a2a3/` or `examples/a5/`. If it does not, stop and report that root-level `examples/{runtime}/...` paths are invalid.
+3. Read `.github/workflows/ci.yml` to extract the current `-c` (pto-isa commit) flag from the `st-sim-*` jobs' `./ci.sh` invocations
+4. **Detect platform**: Infer the architecture from the example path (`examples/a2a3/...` → `a2a3sim`, `examples/a5/...` → `a5sim`).
+5. Run: `python examples/scripts/run_example.py -k $ARGUMENTS/kernels -g $ARGUMENTS/golden.py -p <platform> -c <commit>`
+6. Report pass/fail status with any error output
diff --git a/.claude/rules/architecture.md b/.claude/rules/architecture.md
@@ -24,6 +24,11 @@ See [docs/architecture.md](../../docs/architecture.md) for the full diagram, API
 
 ## Example / Test Layout
 
+Examples must live under `examples/{arch}/{runtime}/{name}/`. Valid example roots are
+`examples/a2a3/` and `examples/a5/`. Paths such as
+`examples/host_build_graph/<name>/` or `examples/tensormap_and_ringbuffer/<name>/`
+directly under `examples/` are invalid.
+
 ```text
 my_example/
   golden.py              # generate_inputs() + compute_golden()

diff --git a/.gitignore b/.gitignore
@@ -21,6 +21,7 @@ venv/
 .claude/settings.local.json
 .claude/worktrees
 .claude/plans
+.worktrees/
 
 # Git cloned dependencies (not tracked in repo)
 examples/scripts/_deps/

diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1 @@
+CLAUDE.md
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -8,7 +8,7 @@ See [docs/developer-guide.md](docs/developer-guide.md) for full directory struct
 | ---- | ----------------- |
 | Platform Developer | `src/{arch}/platform/` |
 | Runtime Developer | `src/{arch}/runtime/` |
-| Codegen Developer | `examples/` |
+| Codegen Developer | `examples/{arch}/` |
 
 ## Common Commands
 
@@ -32,8 +32,9 @@ clang-format -i <file>
 
 ## Important Rules
 
-1. **Consult `.claude/rules/` for coding conventions** (architecture, codestyle, terminology) — these are always-loaded guidelines. **Consult `.claude/skills/` for task-specific workflows** (e.g., `git-commit/` when committing, `testing/` when running tests)
+1. **Consult `.agents/rules/` for coding conventions** (architecture, codestyle, terminology) — these are always-loaded guidelines. **Consult `.agents/skills/` for task-specific workflows** (e.g., `git-commit/` when committing, `testing/` when running tests)
 2. **Do not modify directories outside your assigned area** unless the user explicitly requests it
 3. Create new subdirectories under your assigned directory as needed
 4. When in doubt, ask the user before making changes to other areas
 5. **Avoid including private information in documentation or code** such as usernames, absolute paths with usernames, or other personally identifiable information. Use relative paths or generic placeholders instead
+6. **Place examples under `examples/{arch}/{runtime}/{name}/`**. Do not create `examples/{runtime}/...` directly under `examples/`.
diff --git a/docs/developer-guide.md b/docs/developer-guide.md
@@ -106,7 +106,9 @@ When preprocessor guards are used to isolate platform code paths, the `__aarch64
 
 ## Example / Test Layout
 
-Every example and device test follows this structure:
+Examples must live under `examples/{arch}/{runtime}/{name}/`, and device scenes must
+live under `tests/st/{arch}/{runtime}/{name}/`. Every example and device test follows
+this structure:
 
 ```text
 my_example/