CodexOpt: Optimize your Agents.MD and Skills for Codex with GEPA
CodexOpt is a lightweight Python CLI to improve Codex instruction assets with a repeatable loop:
- Scan instruction files.
- Benchmark quality.
- Generate optimized candidates.
- Apply only improvements.
- Produce a report.
It targets:
- AGENTS.md
- .codex/skills/**/SKILL.md
Most teams edit AGENTS.md and SKILL.md manually, but struggle to answer:
- Did quality actually improve?
- Did we increase prompt bloat?
- Did we break skill frontmatter conventions?
CodexOpt turns these edits into measurable runs with artifacts you can inspect and version.
- Project scan with issue detection for agents and skills.
- Heuristic benchmark scoring.
- Optimization engine heuristic (default, local and deterministic).
- Optional optimization engine gepa (via gepa.optimize_anything).
- Safe apply flow with automatic backups.
- Markdown reporting from latest runs.
- Minimal OSS CI (lint, test, build).
- Python >=3.10
- uv (recommended) or pip
With uv:
uv sync --extra dev
Run commands through the managed environment:
uv run codexopt --help
uv.lock is committed to keep dependency resolution reproducible across machines and CI.
With pip:
pip install -e ".[dev]"
# 1) Create config
uv run codexopt init
# 2) Inspect what will be evaluated
uv run codexopt scan
# 3) Get baseline scores
uv run codexopt benchmark
# 4) Optimize AGENTS.md
uv run codexopt optimize agents --file AGENTS.md
# 5) Optimize skills
uv run codexopt optimize skills --glob ".codex/skills/**/SKILL.md"
# 6) Review apply impact without writing
uv run codexopt apply --kind agents --dry-run
# 7) Apply selected improvements
uv run codexopt apply --kind agents
# 8) Generate markdown summary
uv run codexopt report --output codexopt-report.md
Use codexopt.example.yaml as a starting point for committed team config.
codexopt --config <path-to-codexopt.yaml> <command>
Create a default config file.
codexopt init [--path PATH] [--force]
Discover AGENTS/SKILL targets and validate shape.
codexopt scan
Score current files using built-in heuristics.
codexopt benchmark
Optimize AGENTS files.
codexopt optimize agents \
[--file PATTERN] \
[--engine heuristic|gepa] \
[--reflection-model MODEL] \
[--max-metric-calls N]
Optimize SKILL files.
codexopt optimize skills \
[--glob PATTERN] \
[--engine heuristic|gepa] \
[--reflection-model MODEL] \
[--max-metric-calls N]
Apply best candidates from the latest optimization run (or a provided run id).
codexopt apply [--kind agents|skills] [--run-id RUN_ID] [--dry-run]
Generate a markdown report from the latest runs in state.
codexopt report [--output FILE.md]
Default codexopt.yaml:
version: 1
targets:
  agents_files:
    - AGENTS.md
    - "**/AGENTS.md"
    - "**/AGENTS.override.md"
  skills_globs:
    - ".codex/skills/**/SKILL.md"
    - "**/.codex/skills/**/SKILL.md"
  exclude_globs:
    - ".git/**"
    - ".codexopt/**"
    - ".venv/**"
    - "node_modules/**"
    - "reference/**"
output:
  root_dir: ".codexopt"
optimization:
  engine: "heuristic"
  min_apply_delta: 0.01
  max_metric_calls: 60
  reflection_model: null
Config notes:
- targets.agents_files: glob patterns for AGENTS targets.
- targets.skills_globs: glob patterns for SKILL.md targets.
- targets.exclude_globs: paths ignored during scan.
- output.root_dir: location for run artifacts and backups.
- optimization.engine: default optimization engine.
- optimization.min_apply_delta: minimum score gain required to apply.
- optimization.max_metric_calls: GEPA metric budget.
- optimization.reflection_model: required when using the GEPA engine.
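As a rough illustration of how the optimization keys relate, the sketch below mirrors the documented defaults in a plain Python dataclass and checks the one cross-field rule stated in the notes (the GEPA engine needs a reflection model). The names OptimizationConfig and config_problems are hypothetical and are not CodexOpt's actual internals.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class OptimizationConfig:
    # Defaults copied from the default codexopt.yaml; the class itself is hypothetical.
    engine: str = "heuristic"
    min_apply_delta: float = 0.01
    max_metric_calls: int = 60
    reflection_model: str | None = None


def config_problems(opt: OptimizationConfig) -> list[str]:
    """Return readable problems; an empty list means the settings look consistent."""
    problems = []
    if opt.engine not in ("heuristic", "gepa"):
        problems.append(f"unknown engine: {opt.engine}")
    if opt.engine == "gepa" and not opt.reflection_model:
        problems.append("optimization.reflection_model is required when engine is gepa")
    return problems
```

With the defaults, config_problems(OptimizationConfig()) returns an empty list; switching engine to "gepa" without a model reports one problem.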
CodexOpt computes a 0.0 to 1.0 score per file.
AGENTS scoring factors include:
- Too short or too long content penalties.
- Token-heaviness estimate penalty.
- Empty file penalty.
SKILL scoring factors include:
- Missing frontmatter penalties.
- Missing name/description penalties.
- Overly long frontmatter field penalties.
- Too short or too long content penalties.
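To make the SKILL frontmatter factors concrete, here is a minimal sketch of a checker that reports the kinds of problems listed above. It is not CodexOpt's scoring code: the function name skill_issues, the simple "key: value" parsing, and the max_field_chars limit of 200 are all assumptions made for illustration.

```python
def skill_issues(text: str, max_field_chars: int = 200) -> list[str]:
    """List frontmatter problems similar to the ones the SKILL scoring penalizes."""
    lines = [line.strip() for line in text.splitlines()]
    if not lines or lines[0] != "---":
        return ["missing frontmatter"]
    try:
        end = lines.index("---", 1)  # closing delimiter of the frontmatter block
    except ValueError:
        return ["unterminated frontmatter"]

    # Naive "key: value" parsing; real frontmatter is YAML and may be richer.
    fields = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip()] = value.strip()

    issues = []
    for key in ("name", "description"):
        value = fields.get(key, "")
        if not value:
            issues.append(f"missing {key}")
        elif len(value) > max_field_chars:  # hypothetical length limit
            issues.append(f"{key} too long")
    return issues
```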
Candidate transforms include:
- Whitespace normalization.
- Blank-line compaction.
- Duplicate adjacent line removal.
- Skill-specific frontmatter synthesis/trimming.
The best candidate is selected by score delta. If delta is below min_apply_delta, original content is kept.
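The selection rule can be sketched in a few lines. This assumes candidates have already been scored on the same 0.0 to 1.0 scale; pick_candidate and its arguments are hypothetical names, and treating a delta exactly equal to the threshold as "apply" is an assumption.

```python
from __future__ import annotations


def pick_candidate(
    original_score: float,
    candidate_scores: dict[str, float],
    min_apply_delta: float = 0.01,
) -> str | None:
    """Return the name of the candidate to apply, or None to keep the original."""
    if not candidate_scores:
        return None
    # Highest-scoring candidate wins.
    best_name = max(candidate_scores, key=candidate_scores.get)
    delta = candidate_scores[best_name] - original_score
    # Below the threshold, the original content is kept.
    return best_name if delta >= min_apply_delta else None
```

For example, with an original score of 0.80, a candidate at 0.805 is rejected under the default threshold, while one at 0.90 is selected.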
CodexOpt can call gepa.optimize_anything when --engine gepa is selected.
Requirements:
- gepa installed in the environment.
- A valid reflection model via --reflection-model or config.
Fallback behavior:
- If GEPA is unavailable or errors, CodexOpt falls back to heuristic optimization.
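The fallback can be pictured as a try/except around the GEPA path. In this sketch the two optimizer callables are hypothetical stand-ins supplied by the caller; it does not call the real gepa package, and catching every Exception is an assumption about how broadly errors are handled.

```python
from typing import Callable, Tuple


def optimize_with_fallback(
    text: str,
    engine: str,
    gepa_optimize: Callable[[str], str],       # hypothetical GEPA-backed optimizer
    heuristic_optimize: Callable[[str], str],  # hypothetical local optimizer
) -> Tuple[str, str]:
    """Return (optimized_text, engine_actually_used)."""
    if engine == "gepa":
        try:
            return gepa_optimize(text), "gepa"
        except Exception:
            # Covers a missing package (ImportError) as well as runtime errors.
            pass
    return heuristic_optimize(text), "heuristic"
```

Returning the engine that was actually used lets a caller record in the run artifacts whether a fallback happened.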
By default, everything is written under .codexopt/:
- runs/<run_id>/scan.json
- runs/<run_id>/benchmark.json
- runs/<run_id>/optimize.json
- runs/<run_id>/apply.json
- backups/<timestamp>/... (created on non-dry-run apply)
- state.json (tracks latest run ids per command type)
Run ids are timestamped and namespaced by command kind, for example:
- 20260308T184800123456Z-benchmark
- 20260308T184812654321Z-optimize-skills
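An id in this shape can be produced with the standard library alone. The sketch below is one way to do it, inferred from the examples (UTC timestamp with microseconds, then the command kind); make_run_id is a hypothetical name, not a documented CodexOpt function.

```python
from __future__ import annotations

from datetime import datetime, timezone


def make_run_id(kind: str, now: datetime | None = None) -> str:
    """Build an id like 20260308T184800123456Z-benchmark (format inferred from the examples)."""
    now = now or datetime.now(timezone.utc)
    # %f is zero-padded microseconds, which gives the six digits before the Z.
    stamp = now.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S%fZ")
    return f"{stamp}-{kind}"
```

Because the timestamp comes first and is fixed-width, sorting ids as plain strings also sorts them chronologically.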
- Commit current AGENTS.md and skills.
- Run scan and benchmark to establish a baseline.
- Run optimize agents and/or optimize skills.
- Review optimize.json and diffs.
- Run apply --dry-run first, then apply.
- Run report and attach the report to the PR.
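The dry-run-then-apply step, together with the backups described earlier, can be sketched as follows. This is an illustration under assumptions, not the tool's implementation: apply_candidate is a hypothetical helper, and the backup layout here (a flat copy by file name) is simplified.

```python
import shutil
from pathlib import Path


def apply_candidate(target: Path, new_text: str, backup_dir: Path, dry_run: bool = True) -> bool:
    """Return True if the file would change (dry run) or was changed (real apply)."""
    old_text = target.read_text(encoding="utf-8")
    if old_text == new_text:
        return False  # nothing to apply
    if dry_run:
        return True  # report only; write nothing and create no backup
    # Copy the original aside before overwriting it.
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(target, backup_dir / target.name)
    target.write_text(new_text, encoding="utf-8")
    return True
```

Writing the backup before the target means an interrupted apply still leaves a recoverable copy of the original.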
Before (AGENTS.md):
## Coding Rules
Always run tests before commit.
Always run tests before commit.
Keep changes minimal.
After optimization (heuristic):
## Coding Rules
Always run tests before commit.
Keep changes minimal.
What changed:
- Removed duplicate adjacent line.
- Compacted extra blank lines.
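The three generic transforms behind this example can be approximated in plain Python. The functions below are a sketch of the techniques named in this README (whitespace normalization, blank-line compaction, duplicate adjacent line removal); their names and exact rules are assumptions rather than CodexOpt's code.

```python
import re


def normalize_whitespace(text: str) -> str:
    # Strip trailing spaces on each line and end the file with one newline.
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).strip("\n") + "\n"


def drop_adjacent_duplicates(text: str) -> str:
    # Remove a non-blank line that exactly repeats the line before it.
    out = []
    for line in text.splitlines():
        if line.strip() and out and out[-1] == line:
            continue
        out.append(line)
    return "\n".join(out) + "\n"


def compact_blank_lines(text: str) -> str:
    # Collapse runs of blank lines into a single blank line.
    return re.sub(r"\n{3,}", "\n\n", text)


def heuristic_cleanup(text: str) -> str:
    return compact_blank_lines(drop_adjacent_duplicates(normalize_whitespace(text)))
```

Each transform is idempotent, so running the cleanup on already-clean content returns it unchanged.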
Before (.codex/skills/my_skill/SKILL.md):
Use this skill for repository release checks.
Run lint, tests, and changelog validation.
After optimization (heuristic):
---
name: my-skill
description: Repository-specific workflow skill.
---
Use this skill for repository release checks.
Run lint, tests, and changelog validation.
What changed:
- Added required frontmatter block.
- Generated normalized name from folder name.
- Added default description.
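Frontmatter synthesis of this kind might look like the sketch below. The only normalization the example confirms is my_skill becoming my-skill; lowercasing and replacing every other non-alphanumeric run with a hyphen is an assumption, and ensure_frontmatter is a hypothetical name.

```python
import re
from pathlib import Path

# Default text taken from the example; the constant name is hypothetical.
DEFAULT_DESCRIPTION = "Repository-specific workflow skill."


def ensure_frontmatter(skill_path: str, text: str) -> str:
    """Prepend a frontmatter block when the SKILL.md content has none."""
    if text.lstrip().startswith("---"):
        return text  # frontmatter already present; leave it alone
    folder = Path(skill_path).parent.name
    # Assumed normalization: lowercase, non-alphanumeric runs become hyphens.
    name = re.sub(r"[^a-z0-9]+", "-", folder.lower()).strip("-")
    header = f"---\nname: {name}\ndescription: {DEFAULT_DESCRIPTION}\n---\n"
    return header + text
```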
uv run codexopt init
uv run codexopt scan
uv run codexopt benchmark
uv run codexopt optimize agents --file AGENTS.md
uv run codexopt optimize skills --glob ".codex/skills/**/SKILL.md"
uv run codexopt apply --kind skills --dry-run
uv run codexopt apply --kind skills
uv run codexopt report --output codexopt-report.md
Files to inspect after running:
- .codexopt/runs/*/scan.json
- .codexopt/runs/*/benchmark.json
- .codexopt/runs/*/optimize.json
- .codexopt/runs/*/apply.json
- .codexopt/backups/*
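To find the most recent artifact of a given type without opening each run directory, a small helper can rely on the timestamp prefix of run ids. This is a convenience sketch: latest_artifact is a hypothetical function, and it deliberately avoids reading state.json because that file's schema is not described here.

```python
from __future__ import annotations

from pathlib import Path


def latest_artifact(root: Path, filename: str) -> Path | None:
    """Return the newest runs/<run_id>/<filename> under root, or None if there is none."""
    runs = root / "runs"
    if not runs.is_dir():
        return None
    # Run ids start with a fixed-width timestamp, so path order is chronological.
    matches = sorted(runs.glob(f"*/{filename}"))
    return matches[-1] if matches else None
```

For example, latest_artifact(Path(".codexopt"), "optimize.json") points at the file to review before running apply.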
GitHub Actions workflow is included at .github/workflows/ci.yml and runs:
- uv lock --check for lockfile consistency.
- uv sync --extra dev for environment setup.
- Ruff lint checks.
- Pytest tests.
- Package build (uv build).
It does not publish packages.
uv lock
uv sync --extra dev
uv run --no-sync ruff check src tests
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --no-sync pytest -q
uv build
Cause:
- No prior optimization run for the selected kind.
- state.json does not contain the expected latest run pointer.
Fix:
uv run codexopt optimize agents
uv run codexopt apply --kind agents
Or pass an explicit run:
uv run codexopt apply --kind agents --run-id <run_id>
Cause:
- gepa is not installed, or reflection_model is missing.
Behavior:
- CodexOpt falls back to heuristic optimization when GEPA errors.
Fix:
uv run codexopt optimize agents --engine gepa --reflection-model <model_name>
Expected behavior:
- --dry-run reports candidate applications without writing files.
To write changes, run again without --dry-run:
uv run codexopt apply --kind agents
If your environment blocks dependency resolution in isolated builds, use:
uv build
Some environments auto-load global pytest plugins that can break local tests. Run with plugin autoload disabled:
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --no-sync pytest -q
Cause:
- Best candidate delta is below optimization.min_apply_delta, or
- File content is already equivalent.
Fix:
- Lower optimization.min_apply_delta in codexopt.yaml, then re-run optimize/apply.
MIT. See LICENSE.
- Shashi (shashi@super-agentic.ai)