Skill Optimizer Run Summary
Evidence
The optimizer failed in run.log with:
Unhandled error: Error: target.repoPath: target repo has uncommitted changes
(optimize.requireCleanGit is enabled) — Run: git stash or commit your changes
init.log shows the optimizer initialized successfully with:
- Surface: prompt
- Models: claude-sonnet-4.6
- Tasks: up to 10 per run
- Iterations: up to 1
- Target: 80% pass rate
Because the run exited with run_status: 1, no benchmark or optimization results were produced. The three improvements below address the immediate failure and propose preventative quality enhancements.
Improvements
1. Add a pre-flight git stash step to the skill-optimizer workflow
Problem: The daily skill-optimizer job fails whenever the workspace contains uncommitted changes (e.g., from a previous make recompile or make init that modifies .skill-optimizer/skill-optimizer.json). The optimize.requireCleanGit guard blocks every run that doesn't start from a perfectly clean tree.
Action: In the skill-optimizer workflow under .github/workflows, add a step before invoking the optimizer that runs git stash --include-untracked, plus a corresponding git stash pop in a cleanup step that always runs (for example, a step guarded by if: always(), since workflow YAML has no finally block). Alternatively, set optimize.requireCleanGit: false in .skill-optimizer/skill-optimizer.json if stashing is too disruptive.
Expected impact: Eliminates the recurring run_status: 1 failure so the optimizer can actually execute dry-runs and benchmarks, enabling continuous skill quality improvements.
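The stash-then-restore pattern described in the Action can be sketched in Python as follows. This is a minimal illustration, not the workflow's actual implementation: the function names (git, run_with_clean_tree) and the injectable run_git parameter are hypothetical, and the sketch assumes git is on PATH and the current directory is the target repository. It stashes only when the tree is dirty, so the later pop cannot pick up an unrelated, older stash entry.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical runner type: takes git arguments and returns the command's stdout.
Runner = Callable[[Sequence[str]], str]


def git(args: Sequence[str]) -> str:
    """Run a git command in the current directory and return its stdout."""
    result = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def run_with_clean_tree(optimizer: Callable[[], None], run_git: Runner = git) -> bool:
    """Stash local changes, run the optimizer, then restore the changes.

    Returns True if a stash was created (and popped), False if the tree
    was already clean.
    """
    # `git status --porcelain` prints nothing when the tree is clean.
    dirty = bool(run_git(["status", "--porcelain"]).strip())
    if dirty:
        run_git(["stash", "--include-untracked"])
    try:
        optimizer()
    finally:
        # Pop only if this function created the stash; otherwise an
        # unrelated, older stash entry could be applied by mistake.
        if dirty:
            run_git(["stash", "pop"])
    return dirty
```

Passing a fake run_git makes the control flow testable without touching a real repository. Note that the pop can conflict if the optimizer itself modified stashed files, which this sketch does not handle.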
2. Increase maxTasks and maxIterations in .skill-optimizer/skill-optimizer.json
Problem: answers.json and init.log show the optimizer is configured for at most 10 tasks and 1 iteration. With a single iteration and an 80% pass-rate target, the optimizer has very little room to explore improvements to SKILL.md before giving up. A single iteration that fails to reach 80% produces no committed improvement.
Action: Update .skill-optimizer/skill-optimizer.json to increase maxTasks to at least 20 and maxIterations to 3. This gives the optimizer multiple attempts to refine skill guidance before concluding a run, which is especially valuable given that SKILL.md covers many complex topics (MCP configuration, YAML gotchas, Go patterns).
Expected impact: Higher probability of reaching the 80% pass-rate target in each run, producing meaningful SKILL.md improvements rather than always terminating after a single failed iteration.
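As a rough sketch of the config change, the snippet below raises the two limits without lowering values that are already higher. The placement of maxTasks and maxIterations under an "optimize" object is an assumption (inferred only from the optimize.requireCleanGit name in the error message); the real schema of skill-optimizer.json may differ, and the helper names raise_limits and update_config_file are hypothetical.

```python
import json
from pathlib import Path


def raise_limits(config: dict, min_tasks: int = 20, min_iterations: int = 3) -> dict:
    """Return a copy of config with maxTasks/maxIterations raised to the minimums.

    ASSUMPTION: both keys live under an "optimize" object; adjust the path
    if the real schema differs.
    """
    updated = json.loads(json.dumps(config))  # deep copy via JSON round trip
    optimize = updated.setdefault("optimize", {})
    optimize["maxTasks"] = max(optimize.get("maxTasks", 0), min_tasks)
    optimize["maxIterations"] = max(optimize.get("maxIterations", 0), min_iterations)
    return updated


def update_config_file(path: Path) -> None:
    """Rewrite the config file in place with the raised limits."""
    config = json.loads(path.read_text(encoding="utf-8"))
    path.write_text(json.dumps(raise_limits(config), indent=2) + "\n", encoding="utf-8")
```

Calling update_config_file(Path(".skill-optimizer/skill-optimizer.json")) would apply the change. Since the write dirties the working tree, it would need to be committed before the next run while requireCleanGit is enabled.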
3. Add task coverage for critical SKILL.md topics that are under-represented
Problem: The current optimizer configuration does not explicitly define task scenarios for several high-risk areas documented in AGENTS.md and SKILL.md: (a) ANSI escape code prevention in YAML files, (b) the needs.activation.outputs.* deprecation and correct steps.sanitized.outputs.* usage, and (c) requireCleanGit-aware CI setup. Without targeted tasks, the optimizer cannot measure whether SKILL.md actually teaches these topics well.
Action: Add at least 3 scenario-based evaluation tasks to the skill-optimizer task suite (in .skill-optimizer/) covering: (1) a prompt asking how to avoid ANSI codes in compiled workflow YAML, (2) a prompt about migrating needs.activation.outputs.text to the steps.sanitized.outputs.* form, and (3) a prompt about configuring the optimizer to run without requiring a clean git state.
Expected impact: Measurable coverage of the most error-prone topics in the codebase, allowing the optimizer to detect and fix gaps in SKILL.md before those gaps cause CI failures or incorrect agent behavior.
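One way to keep this coverage from regressing is a small check that flags required topics no task prompt mentions. The sketch below is illustrative only: the Task record, the topic names, and the keyword matching are all hypothetical, since the actual task-suite format in .skill-optimizer/ is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """Hypothetical task record; the real task-suite schema may differ."""
    name: str
    prompt: str


# Topic -> keyword that a covering prompt is expected to mention (illustrative).
REQUIRED_TOPICS = {
    "ansi-in-yaml": "ANSI",
    "activation-outputs-migration": "needs.activation.outputs",
    "clean-git-ci": "requireCleanGit",
}


def missing_topics(tasks: list[Task]) -> list[str]:
    """Return the required topics that no task prompt mentions, in sorted order."""
    return sorted(
        topic
        for topic, keyword in REQUIRED_TOPICS.items()
        if not any(keyword.lower() in task.prompt.lower() for task in tasks)
    )
```

Keyword matching is a crude proxy for coverage. It only confirms that a prompt names the topic, not that the task actually exercises the guidance in SKILL.md.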
Generated by Daily Skill Optimizer Improvements