@@ -9,9 +9,10 @@ description: >-
  or "regenerate the eval matrix". This skill orchestrates parallel cold
  sub-agent spawns via the harness's task tool, scores deterministically,
  and converges P>=0.8 / N>=0.8 / R==1.0 within max 3 iteration loops.
- This skill is contributor-only -- it lives under .apm/skills/ and is
- NOT shipped inside the user-facing skills/genesis/ bundle (BUNDLE
- LEAKAGE discipline).
+ This skill is contributor-only -- it lives under dev/skills/ (OUTSIDE
+ .apm/) and is NOT shipped inside the user-facing skills/genesis/
+ bundle (BUNDLE LEAKAGE discipline). See "Why this lives outside
+ .apm/" below.
  ---

  # genesis-evals: maintainer-side eval runner
@@ -20,14 +21,22 @@ Run the genesis self-eval suite. Steers the parent LLM session to
  orchestrate cold sub-agent spawns, capture responses, score
  deterministically, and report convergence.

- ## Why this lives under .apm/
+ ## Why this lives outside `.apm/`

  Genesis ships to USERS via npx / `apm install`. Eval scenarios LOOK
  LIKE real user requests (that is the point). Colocating them under
  `skills/genesis/evals/` would risk DISPATCH CONTAMINATION (an
  over-eager harness loader pulling scenario prompts into the active
  context) and PAYLOAD BLOAT for users who never run evals.

+ We also keep this OUTSIDE `.apm/` because APM treats `.apm/` as the
+ publishable source root: its local-content scanner picks up anything
+ under `.apm/skills/` regardless of dev-marker, so `apm pack
+ --format plugin` would leak this maintainer-only skill into the
+ shipped artifact. Living under `dev/skills/` keeps it scanner-invisible
+ while still letting `apm install --dev` deploy it via the local-path
+ devDependency in the root `apm.yml`.
+
  This is the inverse of PHANTOM DEPENDENCY (referenced-but-not-bundled):
  BUNDLE LEAKAGE (bundled-but-not-consumed-at-runtime). See
  `skills/genesis/assets/composition-substrate.md` "Anti-patterns
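
The BUNDLE LEAKAGE rule described in this hunk could be enforced with a small pre-pack guard. The following is a minimal sketch, not part of this change: `find_leaks`, `MAINTAINER_ONLY`, and the idea of running it before `apm pack` are all hypothetical, and it assumes only what the hunk states, namely that anything under `.apm/skills/` is picked up for publishing.

```python
from pathlib import Path

# Hypothetical names for illustration; neither APM nor this repository
# is known to define MAINTAINER_ONLY or find_leaks.
MAINTAINER_ONLY = frozenset({"genesis-evals"})


def find_leaks(repo_root, publishable_root=".apm/skills", maintainer_only=MAINTAINER_ONLY):
    """Return names of maintainer-only skill directories under the publishable root."""
    skills_dir = Path(repo_root) / publishable_root
    if not skills_dir.is_dir():
        # Nothing publishable exists, so nothing can leak.
        return []
    return sorted(
        entry.name
        for entry in skills_dir.iterdir()
        if entry.is_dir() and entry.name in maintainer_only
    )
```

A CI step could call `find_leaks(".")` and fail the build when the returned list is non-empty, which would catch the skill being moved back under `.apm/skills/` by mistake.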
@@ -88,9 +97,9 @@ flagged at this step".
  ## Step 1 -- validate

  ```
- python .apm/skills/genesis-evals/scripts/validate_scenarios.py \
-   --scenarios-dir .apm/skills/genesis-evals/scenarios \
-   --schema .apm/skills/genesis-evals/schema/scenario.schema.json
+ python dev/skills/genesis-evals/scripts/validate_scenarios.py \
+   --scenarios-dir dev/skills/genesis-evals/scenarios \
+   --schema dev/skills/genesis-evals/schema/scenario.schema.json
  ```

  If exit non-zero, STOP. Fix the schema/yaml violations before any
@@ -100,7 +109,7 @@ spawn. No partial runs.

  ```
  RUN_ID=$(date -u +%Y%m%dT%H%M%SZ)
- RUNS_DIR=.apm/skills/genesis-evals/runs
+ RUNS_DIR=dev/skills/genesis-evals/runs
  mkdir -p "$RUNS_DIR/$RUN_ID"
  ```

@@ -118,8 +127,8 @@ For EACH scenario YAML in `scenarios/`:

  1. **Record the spawn intent** (deterministic, pre-spawn):
     ```
-    python .apm/skills/genesis-evals/scripts/spawn_record.py \
-      --scenario .apm/skills/genesis-evals/scenarios/<id>.yml \
+    python dev/skills/genesis-evals/scripts/spawn_record.py \
+      --scenario dev/skills/genesis-evals/scenarios/<id>.yml \
       --half <with|without|single> \
       --run-id "$RUN_ID" \
       --runs-dir "$RUNS_DIR"
@@ -156,10 +165,10 @@ parallel as the harness permits. Do NOT serialize unless forced.
  ## Step 4 -- score

  ```
- python .apm/skills/genesis-evals/scripts/score_run.py \
+ python dev/skills/genesis-evals/scripts/score_run.py \
    --run-id "$RUN_ID" \
    --runs-dir "$RUNS_DIR" \
-   --scenarios-dir .apm/skills/genesis-evals/scenarios
+   --scenarios-dir dev/skills/genesis-evals/scenarios
  ```

  Writes `<runs-dir>/<run-id>/summary.md`. Exit code: 0 if converged,
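
The convergence rule named in the skill description (P>=0.8 / N>=0.8 / R==1.0) can be sketched as a small predicate. This is an illustration only and not the contents of `score_run.py`: the function names are hypothetical, the excerpt does not define what P, N, and R measure, and the exit status used for the non-converged case is an assumption because the diff context cuts off before stating it.

```python
def converged(p, n, r, p_min=0.8, n_min=0.8, r_required=1.0):
    """Return True when all three thresholds from the skill description hold."""
    return p >= p_min and n >= n_min and r == r_required


def exit_code(p, n, r):
    """Map a scored run to a process exit status."""
    # 0 on convergence matches the text above; 1 otherwise is an assumption.
    return 0 if converged(p, n, r) else 1
```

Keeping R as a strict equality rather than a threshold mirrors the `R==1.0` wording, so a run with `r = 0.99` would not count as converged in this sketch.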
@@ -242,7 +251,7 @@ the file (provenance / audit trail).
  ## Setup (one-time)

  ```
- pip install -r .apm/skills/genesis-evals/requirements.txt
+ pip install -r dev/skills/genesis-evals/requirements.txt
  ```

  (deps: pyyaml, jsonschema. Maintainer-side only. Users never see