CodexOpt

CodexOpt: Optimize your AGENTS.md and Skills for Codex with GEPA

CodexOpt is a lightweight Python CLI to improve Codex instruction assets with a repeatable loop:

Scan instruction files.
Benchmark quality.
Generate optimized candidates.
Apply only improvements.
Produce a report.

It targets:

AGENTS.md
.codex/skills/**/SKILL.md

Why CodexOpt

Most teams edit AGENTS.md and SKILL.md manually, but struggle to answer:

Did quality actually improve?
Did we increase prompt bloat?
Did we break skill frontmatter conventions?

CodexOpt turns these edits into measurable runs with artifacts you can inspect and version.

Features

Project scan with issue detection for agents and skills.
Heuristic benchmark scoring.
Heuristic optimization engine (default; local and deterministic).
Optional GEPA optimization engine (via gepa.optimize_anything).
Safe apply flow with automatic backups.
Markdown reporting from latest runs.
Minimal OSS CI (lint, test, build).

Installation

Requirements

Python >=3.10
uv (recommended) or pip

Recommended: uv (full workflow)

uv sync --extra dev

Run commands through the managed environment:

uv run codexopt --help

uv.lock is committed to keep dependency resolution reproducible across machines and CI.

Alternative: pip

pip install -e ".[dev]"

Quick Start (uv)

# 1) Create config
uv run codexopt init

# 2) Inspect what will be evaluated
uv run codexopt scan

# 3) Get baseline scores
uv run codexopt benchmark

# 4) Optimize AGENTS.md
uv run codexopt optimize agents --file AGENTS.md

# 5) Optimize skills
uv run codexopt optimize skills --glob ".codex/skills/**/SKILL.md"

# 6) Review apply impact without writing
uv run codexopt apply --kind agents --dry-run

# 7) Apply selected improvements
uv run codexopt apply --kind agents

# 8) Generate markdown summary
uv run codexopt report --output codexopt-report.md

Use codexopt.example.yaml as a starting point for committed team config.

Command Reference

Global options

codexopt --config <path-to-codexopt.yaml> <command>

`init`

Create a default config file.

codexopt init [--path PATH] [--force]

`scan`

Discover AGENTS/SKILL targets and validate shape.

codexopt scan

`benchmark`

Score current files using built-in heuristics.

codexopt benchmark

`optimize agents`

Optimize AGENTS files.

codexopt optimize agents \
  [--file PATTERN] \
  [--engine heuristic|gepa] \
  [--reflection-model MODEL] \
  [--max-metric-calls N]

`optimize skills`

Optimize SKILL files.

codexopt optimize skills \
  [--glob PATTERN] \
  [--engine heuristic|gepa] \
  [--reflection-model MODEL] \
  [--max-metric-calls N]

`apply`

Apply best candidates from the latest optimization run (or a provided run id).

codexopt apply [--kind agents|skills] [--run-id RUN_ID] [--dry-run]

`report`

Generate a Markdown report from the latest runs recorded in state.json.

codexopt report [--output FILE.md]

Configuration

Default codexopt.yaml:

version: 1
targets:
  agents_files:
    - AGENTS.md
    - "**/AGENTS.md"
    - "**/AGENTS.override.md"
  skills_globs:
    - ".codex/skills/**/SKILL.md"
    - "**/.codex/skills/**/SKILL.md"
  exclude_globs:
    - ".git/**"
    - ".codexopt/**"
    - ".venv/**"
    - "node_modules/**"
    - "reference/**"
output:
  root_dir: ".codexopt"
optimization:
  engine: "heuristic"
  min_apply_delta: 0.01
  max_metric_calls: 60
  reflection_model: null

Config notes:

targets.agents_files: glob patterns for AGENTS targets.
targets.skills_globs: glob patterns for SKILL.md targets.
targets.exclude_globs: paths ignored during scan.
output.root_dir: run artifacts and backups location.
optimization.engine: default optimization engine.
optimization.min_apply_delta: minimum score gain required to apply.
optimization.max_metric_calls: GEPA metric budget.
optimization.reflection_model: required when using GEPA engine.
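
To illustrate how targets.exclude_globs might filter scan results, here is a minimal sketch. It is not CodexOpt's actual matcher: the helper name is_excluded is hypothetical, and it assumes fnmatch-style matching, where * also matches path separators, which may differ from the glob semantics CodexOpt really uses.

```python
from fnmatch import fnmatchcase

# Default exclude patterns copied from the config above.
EXCLUDE_GLOBS = [".git/**", ".codexopt/**", ".venv/**", "node_modules/**", "reference/**"]


def is_excluded(relative_path: str, patterns: list[str] = EXCLUDE_GLOBS) -> bool:
    """Hypothetical helper: True if a repo-relative POSIX path matches any exclude pattern."""
    return any(fnmatchcase(relative_path, pattern) for pattern in patterns)


candidates = ["AGENTS.md", "docs/AGENTS.md", "node_modules/pkg/AGENTS.md", ".venv/lib/AGENTS.md"]
kept = [path for path in candidates if not is_excluded(path)]
print(kept)
```

With the default patterns, only paths under the listed top-level directories are dropped; nested AGENTS.md files elsewhere in the repository remain scan targets.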

How Scoring Works

CodexOpt computes a 0.0 to 1.0 score per file.

AGENTS scoring factors include:

Too short or too long content penalties.
Token-heaviness estimate penalty.
Empty file penalty.

SKILL scoring factors include:

Missing frontmatter penalties.
Missing name / description penalties.
Overly long frontmatter fields penalties.
Too short or too long content penalties.
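
One way to picture the AGENTS factors above is as penalties subtracted from a perfect score. The following is a minimal sketch, not CodexOpt's actual scorer: the function name score_agents, the word and token thresholds, and the penalty weights are all hypothetical, chosen only to show how a 0.0 to 1.0 score could be assembled.

```python
def score_agents(text: str) -> float:
    """Hypothetical AGENTS scorer: start at 1.0 and subtract penalties."""
    if not text.strip():
        return 0.0  # empty file penalty: nothing useful to score

    score = 1.0
    words = text.split()

    # Length penalties (thresholds are illustrative assumptions).
    if len(words) < 20:
        score -= 0.3  # too short to carry useful instructions
    elif len(words) > 2000:
        score -= 0.3  # too long, likely prompt bloat

    # Token-heaviness estimate: roughly 4 characters per token (rule of thumb).
    estimated_tokens = len(text) / 4
    if estimated_tokens > 4000:
        score -= 0.2

    # Clamp to the documented 0.0 to 1.0 range.
    return max(0.0, min(1.0, score))
```

A SKILL scorer could follow the same shape, with additional penalties for missing frontmatter or missing name and description fields.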

Optimization Behavior

Heuristic engine

Candidate transforms include:

Whitespace normalization.
Blank-line compaction.
Duplicate adjacent line removal.
Skill-specific frontmatter synthesis/trimming.

The candidate with the largest score delta over the original is selected. If that delta is below optimization.min_apply_delta, the original content is kept.
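
The transforms and the selection rule can be sketched as follows. This is an illustration under assumptions, not the engine's real code: the function names, the candidate set, and the pluggable score callable are hypothetical, and skill-specific frontmatter handling is left out.

```python
import re
from typing import Callable


def normalize_whitespace(text: str) -> str:
    """Strip trailing whitespace on each line and end with a single newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    return "\n".join(lines).strip("\n") + "\n"


def compact_blank_lines(text: str) -> str:
    """Collapse runs of two or more blank lines into one blank line."""
    return re.sub(r"\n{3,}", "\n\n", text)


def drop_adjacent_duplicates(text: str) -> str:
    """Remove a non-blank line that exactly repeats the line before it."""
    kept: list[str] = []
    for line in text.splitlines():
        if kept and line.strip() and line == kept[-1]:
            continue
        kept.append(line)
    return "\n".join(kept) + "\n"


def pick_best(original: str, score: Callable[[str], float], min_apply_delta: float = 0.01) -> str:
    """Return the best-scoring candidate, or the original if the gain is too small."""
    candidates = [
        normalize_whitespace(original),
        compact_blank_lines(original),
        drop_adjacent_duplicates(original),
        # All three transforms chained together.
        compact_blank_lines(drop_adjacent_duplicates(normalize_whitespace(original))),
    ]
    best = max(candidates, key=score)
    if score(best) - score(original) < min_apply_delta:
        return original  # gain below threshold: keep the original content
    return best
```

Passing the scorer in as a callable keeps the sketch independent of any particular scoring rules; any function that maps text to a float works.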

GEPA engine (optional)

CodexOpt can call gepa.optimize_anything when --engine gepa is selected.

Requirements:

gepa installed in the environment.
A valid reflection model via --reflection-model or config.

Fallback behavior:

If GEPA is unavailable or errors, CodexOpt falls back to heuristic optimization.
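
The fallback can be pictured as a guarded call. In this sketch, optimize_with_fallback and the gepa_runner callable are hypothetical stand-ins; the real arguments to gepa.optimize_anything are not shown because this README does not document them.

```python
import importlib.util
from typing import Callable, Optional


def gepa_available() -> bool:
    """Return True if the gepa package can be found in this environment."""
    return importlib.util.find_spec("gepa") is not None


def optimize_with_fallback(
    text: str,
    heuristic: Callable[[str], str],
    gepa_runner: Optional[Callable[[str, str], str]] = None,
    reflection_model: Optional[str] = None,
) -> tuple[str, str]:
    """Return (optimized_text, engine_used), preferring GEPA when it is usable."""
    if gepa_runner is not None and reflection_model and gepa_available():
        try:
            return gepa_runner(text, reflection_model), "gepa"
        except Exception:
            pass  # GEPA errored: fall through to the heuristic engine
    return heuristic(text), "heuristic"
```

Returning the engine name alongside the text makes it easy to record in run artifacts which engine actually produced a candidate.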

Artifacts and State

By default, everything is written under .codexopt/:

runs/<run_id>/scan.json
runs/<run_id>/benchmark.json
runs/<run_id>/optimize.json
runs/<run_id>/apply.json
backups/<timestamp>/... (created on non-dry-run apply)
state.json (tracks latest run ids per command type)

Run ids are timestamped and namespaced by command kind, for example:

20260308T184800123456Z-benchmark
20260308T184812654321Z-optimize-skills
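
Ids of this shape can be produced with a UTC timestamp that includes microseconds. The sketch below infers the format from the two sample ids above; make_run_id is a hypothetical name, not necessarily what CodexOpt calls it.

```python
from datetime import datetime, timezone
from typing import Optional


def make_run_id(kind: str, now: Optional[datetime] = None) -> str:
    """Build an id like 20260308T184800123456Z-benchmark (format inferred from samples)."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # %f is zero-padded microseconds, so ids have a fixed width.
    return f"{moment.strftime('%Y%m%dT%H%M%S%f')}Z-{kind}"


print(make_run_id("optimize-skills"))
```

Because the timestamp is fixed-width and leads the id, sorting run directories by name also sorts them chronologically.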

Typical Team Workflow

Commit the current AGENTS.md and skills.
Run scan and benchmark to establish a baseline.
Run optimize agents and/or optimize skills.
Review optimize.json and diffs.
Run apply --dry-run first, then apply.
Run report and attach the report to the PR.

Examples

Example A: `AGENTS.md` cleanup

Before (AGENTS.md):

## Coding Rules
Always run tests before commit.
Always run tests before commit.


Keep changes minimal.

After optimization (heuristic):

## Coding Rules
Always run tests before commit.

Keep changes minimal.

What changed:

Removed duplicate adjacent line.
Compacted extra blank lines.

Example B: `SKILL.md` missing frontmatter

Before (.codex/skills/my_skill/SKILL.md):

Use this skill for repository release checks.
Run lint, tests, and changelog validation.

After optimization (heuristic):

---
name: my-skill
description: Repository-specific workflow skill.
---

Use this skill for repository release checks.
Run lint, tests, and changelog validation.

What changed:

Added required frontmatter block.
Generated normalized name from folder name.
Added default description.
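
The frontmatter synthesis in Example B can be approximated as below. This is a sketch under assumptions: normalize_name and ensure_frontmatter are hypothetical names, and the normalization rule (lowercase, non-alphanumeric runs become a hyphen) is inferred from the single my_skill to my-skill example.

```python
import re
from pathlib import PurePosixPath

# Default description text taken from Example B.
DEFAULT_DESCRIPTION = "Repository-specific workflow skill."


def normalize_name(folder: str) -> str:
    """Lowercase the folder name and replace non-alphanumeric runs with a hyphen."""
    return re.sub(r"[^a-z0-9]+", "-", folder.lower()).strip("-")


def ensure_frontmatter(skill_path: str, body: str) -> str:
    """Prepend a frontmatter block when the file does not already start with one."""
    if body.lstrip().startswith("---"):
        return body  # existing frontmatter is left untouched in this sketch
    name = normalize_name(PurePosixPath(skill_path).parent.name)
    header = f"---\nname: {name}\ndescription: {DEFAULT_DESCRIPTION}\n---\n\n"
    return header + body
```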

Example C: Reproduce end-to-end on a repo

uv run codexopt init
uv run codexopt scan
uv run codexopt benchmark
uv run codexopt optimize agents --file AGENTS.md
uv run codexopt optimize skills --glob ".codex/skills/**/SKILL.md"
uv run codexopt apply --kind skills --dry-run
uv run codexopt apply --kind skills
uv run codexopt report --output codexopt-report.md

Files to inspect after running:

.codexopt/runs/*/scan.json
.codexopt/runs/*/benchmark.json
.codexopt/runs/*/optimize.json
.codexopt/runs/*/apply.json
.codexopt/backups/*

CI

A GitHub Actions workflow is included at .github/workflows/ci.yml and runs:

uv lock --check for lockfile consistency.
uv sync --extra dev for environment setup.
Ruff lint checks.
Pytest tests.
Package build (uv build).

It does not publish packages.

Development

uv lock
uv sync --extra dev
uv run --no-sync ruff check src tests
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --no-sync pytest -q
uv build

FAQ / Troubleshooting

`codexopt apply` says "no optimization run found"

Cause:

No prior optimization run for the selected kind.
state.json does not contain the expected latest run pointer.

Fix:

uv run codexopt optimize agents
uv run codexopt apply --kind agents

Or pass an explicit run:

uv run codexopt apply --kind agents --run-id <run_id>

`--engine gepa` did not use GEPA

Cause:

gepa is not installed, or
reflection_model is missing.

Behavior:

CodexOpt falls back to heuristic optimization when GEPA is unavailable or errors.

Fix:

uv run codexopt optimize agents --engine gepa --reflection-model <model_name>

`apply --dry-run` says files would be applied, but nothing changed

Expected behavior:

--dry-run reports candidate applications without writing files.

To write changes, run again without --dry-run:

uv run codexopt apply --kind agents
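
The difference between the two modes can be sketched as a single function with a dry_run switch. This is illustrative only: apply_candidate and its backup layout are hypothetical and simpler than the backups/<timestamp>/... structure CodexOpt writes.

```python
import shutil
from pathlib import Path


def apply_candidate(target: Path, new_text: str, backup_dir: Path, dry_run: bool = False) -> bool:
    """Return True if the file would change (dry run) or was changed (real run)."""
    current = target.read_text(encoding="utf-8")
    if current == new_text:
        return False  # nothing to apply
    if dry_run:
        return True  # report the change, but write nothing
    # Real run: back up the original before overwriting it.
    backup_dir.mkdir(parents=True, exist_ok=True)
    shutil.copy2(target, backup_dir / target.name)
    target.write_text(new_text, encoding="utf-8")
    return True
```

In this shape, a dry run and a real run share the same change detection, so the dry-run report matches what a real apply would do.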

Build fails with network/isolation issues

If your environment blocks dependency resolution in isolated builds, use:

uv build

Pytest fails due to unrelated external plugins

Some environments auto-load global pytest plugins that can break local tests. Run with plugin autoload disabled:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --no-sync pytest -q

Optimization produced no applied changes

Cause:

Best candidate delta is below optimization.min_apply_delta, or
File content is already equivalent.

Fix:

Lower optimization.min_apply_delta in codexopt.yaml, then re-run optimize/apply.

License

MIT. See LICENSE.

Author

Shashi (shashi@super-agentic.ai)
