SuperagenticAI/CodexOpt

CodexOpt

CodexOpt: Optimize your AGENTS.md and Skills for Codex with GEPA

CodexOpt is a lightweight Python CLI that improves Codex instruction assets through a repeatable loop:

  1. Scan instruction files.
  2. Benchmark quality.
  3. Generate optimized candidates.
  4. Apply only improvements.
  5. Produce a report.

It targets:

  • AGENTS.md
  • .codex/skills/**/SKILL.md

Why CodexOpt

Most teams edit AGENTS.md and SKILL.md manually, but struggle to answer:

  • Did quality actually improve?
  • Did we increase prompt bloat?
  • Did we break skill frontmatter conventions?

CodexOpt turns these edits into measurable runs with artifacts you can inspect and version.

Features

  • Project scan with issue detection for agents and skills.
  • Heuristic benchmark scoring.
  • Default heuristic optimization engine (local and deterministic).
  • Optional gepa optimization engine (via gepa.optimize_anything).
  • Safe apply flow with automatic backups.
  • Markdown reporting from latest runs.
  • Minimal OSS CI (lint, test, build).

Installation

Requirements

  • Python >=3.10
  • uv (recommended) or pip

Recommended: uv (full workflow)

uv sync --extra dev

Run commands through the managed environment:

uv run codexopt --help

uv.lock is committed to keep dependency resolution reproducible across machines and CI.

Alternative: pip

pip install -e ".[dev]"

Quick Start (uv)

# 1) Create config
uv run codexopt init

# 2) Inspect what will be evaluated
uv run codexopt scan

# 3) Get baseline scores
uv run codexopt benchmark

# 4) Optimize AGENTS.md
uv run codexopt optimize agents --file AGENTS.md

# 5) Optimize skills
uv run codexopt optimize skills --glob ".codex/skills/**/SKILL.md"

# 6) Review apply impact without writing
uv run codexopt apply --kind agents --dry-run

# 7) Apply selected improvements
uv run codexopt apply --kind agents

# 8) Generate markdown summary
uv run codexopt report --output codexopt-report.md

Use codexopt.example.yaml as a starting point for committed team config.

Command Reference

Global options

codexopt --config <path-to-codexopt.yaml> <command>

init

Create a default config file.

codexopt init [--path PATH] [--force]

scan

Discover AGENTS.md and SKILL.md targets and validate their structure.

codexopt scan

benchmark

Score current files using built-in heuristics.

codexopt benchmark

optimize agents

Optimize AGENTS files.

codexopt optimize agents \
  [--file PATTERN] \
  [--engine heuristic|gepa] \
  [--reflection-model MODEL] \
  [--max-metric-calls N]

optimize skills

Optimize SKILL files.

codexopt optimize skills \
  [--glob PATTERN] \
  [--engine heuristic|gepa] \
  [--reflection-model MODEL] \
  [--max-metric-calls N]

apply

Apply best candidates from the latest optimization run (or a provided run id).

codexopt apply [--kind agents|skills] [--run-id RUN_ID] [--dry-run]

report

Generate a Markdown report from the latest runs recorded in state.json.

codexopt report [--output FILE.md]

Configuration

Default codexopt.yaml:

version: 1
targets:
  agents_files:
    - AGENTS.md
    - "**/AGENTS.md"
    - "**/AGENTS.override.md"
  skills_globs:
    - ".codex/skills/**/SKILL.md"
    - "**/.codex/skills/**/SKILL.md"
  exclude_globs:
    - ".git/**"
    - ".codexopt/**"
    - ".venv/**"
    - "node_modules/**"
    - "reference/**"
output:
  root_dir: ".codexopt"
optimization:
  engine: "heuristic"
  min_apply_delta: 0.01
  max_metric_calls: 60
  reflection_model: null

Config notes:

  • targets.agents_files: glob patterns for AGENTS targets.
  • targets.skills_globs: glob patterns for SKILL.md targets.
  • targets.exclude_globs: paths ignored during scan.
  • output.root_dir: run artifacts and backups location.
  • optimization.engine: default optimization engine.
  • optimization.min_apply_delta: minimum score gain required to apply.
  • optimization.max_metric_calls: GEPA metric budget.
  • optimization.reflection_model: required when using GEPA engine.
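
The target and exclusion patterns above can be read as a two-step filter: collect every file matching an include glob, then drop anything matching an exclude glob. The following is a minimal Python sketch of that idea, assuming pathlib globbing and fnmatch-style exclusion. The helper name discover_targets is hypothetical, and CodexOpt's actual matching rules may differ.

```python
from fnmatch import fnmatch
from pathlib import Path


def discover_targets(root, include_globs, exclude_globs):
    """Return sorted relative POSIX paths that match an include glob and no exclude glob."""
    root = Path(root)
    found = set()
    for pattern in include_globs:
        for path in root.glob(pattern):
            if path.is_file():
                found.add(path.relative_to(root).as_posix())
    # fnmatch's "*" also matches "/", so "node_modules/**" covers nested paths.
    return sorted(
        rel for rel in found
        if not any(fnmatch(rel, pattern) for pattern in exclude_globs)
    )
```

Using a set means a file matched by both AGENTS.md and **/AGENTS.md is only reported once.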

How Scoring Works

CodexOpt computes a 0.0 to 1.0 score per file.

AGENTS scoring factors include:

  • Penalties for content that is too short or too long.
  • A penalty based on an estimate of token heaviness.
  • A penalty for empty files.

SKILL scoring factors include:

  • Penalties for missing frontmatter.
  • Penalties for a missing name or description.
  • Penalties for overly long frontmatter fields.
  • Penalties for content that is too short or too long.
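
To make the penalty idea concrete, here is a toy Python sketch of a penalty-based AGENTS score. Every threshold and weight in it (min_chars, max_chars, the 0.3 and 0.2 deductions, the four-characters-per-token estimate) is an assumption chosen for illustration, not a value taken from CodexOpt.

```python
def score_agents(text, min_chars=200, max_chars=8000, max_tokens=1500):
    """Toy 0.0 to 1.0 score: start at 1.0 and subtract penalties (all weights assumed)."""
    if not text.strip():
        return 0.0  # empty file penalty
    score = 1.0
    if len(text) < min_chars:
        score -= 0.3  # too short
    elif len(text) > max_chars:
        score -= 0.3  # too long
    approx_tokens = len(text) / 4  # rough chars-per-token estimate (assumption)
    if approx_tokens > max_tokens:
        score -= 0.2  # token-heavy
    # Clamp so the result always stays inside the documented range.
    return max(0.0, min(1.0, score))
```

A SKILL score could follow the same shape, with additional deductions for missing or oversized frontmatter fields.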

Optimization Behavior

Heuristic engine

Candidate transforms include:

  • Whitespace normalization.
  • Blank-line compaction.
  • Duplicate adjacent line removal.
  • Skill-specific frontmatter synthesis/trimming.

The best candidate is selected by score delta. If the delta is below min_apply_delta, the original content is kept.
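
The generic transforms and the min_apply_delta gate can be sketched in a few lines of Python. This is an illustrative approximation, not CodexOpt's implementation: the function names heuristic_candidate and select are hypothetical, and the score argument stands in for whatever scoring function is in use.

```python
import re


def heuristic_candidate(text):
    """Apply whitespace normalization, adjacent-duplicate removal, and blank-line compaction."""
    # Whitespace normalization: strip trailing spaces and tabs on every line.
    lines = [line.rstrip() for line in text.splitlines()]
    # Duplicate adjacent line removal (blank lines are left to the compaction step).
    deduped = []
    for line in lines:
        if line and deduped and deduped[-1] == line:
            continue
        deduped.append(line)
    # Blank-line compaction: collapse runs of blank lines into a single blank line.
    compacted = re.sub(r"\n{3,}", "\n\n", "\n".join(deduped))
    return compacted.strip("\n") + "\n"


def select(original, candidate, score, min_apply_delta=0.01):
    """Keep the original unless the candidate gains at least min_apply_delta."""
    delta = score(candidate) - score(original)
    return candidate if delta >= min_apply_delta else original
```

Because the gate compares score deltas rather than raw text, a candidate that changes the file without improving its score is discarded.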

GEPA engine (optional)

CodexOpt can call gepa.optimize_anything when --engine gepa is selected.

Requirements:

  • gepa installed in the environment.
  • A valid reflection model via --reflection-model or config.

Fallback behavior:

  • If GEPA is unavailable or errors, CodexOpt falls back to heuristic optimization.
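
The fallback rule amounts to a try/except around the GEPA path. The Python sketch below shows the pattern with two plain callables; it deliberately does not call gepa.optimize_anything, and the names optimize_with_fallback, gepa_optimizer, and heuristic_optimizer are hypothetical.

```python
def optimize_with_fallback(text, gepa_optimizer, heuristic_optimizer):
    """Try the GEPA-backed optimizer; on any failure, use the heuristic one.

    Returns a (result, engine_used) tuple so callers can tell which path ran.
    """
    try:
        return gepa_optimizer(text), "gepa"
    except Exception:
        # Covers both a missing package (ImportError) and runtime errors.
        return heuristic_optimizer(text), "heuristic"
```

Returning the engine name alongside the result makes a silent fallback visible in run artifacts or logs.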

Artifacts and State

By default, everything is written under .codexopt/:

  • runs/<run_id>/scan.json
  • runs/<run_id>/benchmark.json
  • runs/<run_id>/optimize.json
  • runs/<run_id>/apply.json
  • backups/<timestamp>/... (created on non-dry-run apply)
  • state.json (tracks latest run ids per command type)

Run ids are timestamped and namespaced by command kind, for example:

  • 20260308T184800123456Z-benchmark
  • 20260308T184812654321Z-optimize-skills
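
The example ids above are consistent with a UTC timestamp at microsecond precision followed by the command kind. A Python sketch that reproduces that shape is shown below; make_run_id is a hypothetical helper, and the format string is inferred from the examples rather than taken from the CodexOpt source.

```python
from datetime import datetime, timezone


def make_run_id(kind, now=None):
    """Build an id such as 20260308T184800123456Z-benchmark (format inferred)."""
    # The caller is expected to pass a UTC datetime; default to the current UTC time.
    now = now or datetime.now(timezone.utc)
    # %f renders zero-padded microseconds, which keeps ids fixed-width and sortable.
    return now.strftime("%Y%m%dT%H%M%S%f") + "Z-" + kind
```

Because the timestamp is fixed-width, sorting ids as strings also sorts them chronologically.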

Typical Team Workflow

  1. Commit current AGENTS.md and skills.
  2. Run scan and benchmark to establish baseline.
  3. Run optimize agents and/or optimize skills.
  4. Review optimize.json and diffs.
  5. Run apply --dry-run first, then apply.
  6. Run report and attach the generated report to the PR.

Examples

Example A: AGENTS.md cleanup

Before (AGENTS.md):

## Coding Rules
Always run tests before commit.
Always run tests before commit.


Keep changes minimal.

After optimization (heuristic):

## Coding Rules
Always run tests before commit.

Keep changes minimal.

What changed:

  • Removed duplicate adjacent line.
  • Compacted extra blank lines.

Example B: SKILL.md missing frontmatter

Before (.codex/skills/my_skill/SKILL.md):

Use this skill for repository release checks.
Run lint, tests, and changelog validation.

After optimization (heuristic):

---
name: my-skill
description: Repository-specific workflow skill.
---

Use this skill for repository release checks.
Run lint, tests, and changelog validation.

What changed:

  • Added required frontmatter block.
  • Generated normalized name from folder name.
  • Added default description.
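
The frontmatter synthesis in Example B can be approximated in Python as follows. This is a sketch under assumptions: add_frontmatter is a hypothetical helper, the name normalization rule (lowercase, non-alphanumeric runs replaced by a hyphen) is inferred from the my_skill to my-skill example, and the default description is copied from the output above.

```python
import re
from pathlib import PurePosixPath


def add_frontmatter(skill_path, body, description="Repository-specific workflow skill."):
    """Prepend a name/description frontmatter block when the file has none."""
    if body.lstrip().startswith("---"):
        return body  # frontmatter already present, leave untouched
    # Derive the skill name from the folder that contains SKILL.md.
    folder = PurePosixPath(skill_path).parent.name
    name = re.sub(r"[^a-z0-9]+", "-", folder.lower()).strip("-")
    return f"---\nname: {name}\ndescription: {description}\n---\n\n{body}"
```

Files that already start with a frontmatter block are returned unchanged, so running the helper twice does not stack blocks.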

Example C: Reproduce end-to-end on a repo

uv run codexopt init
uv run codexopt scan
uv run codexopt benchmark
uv run codexopt optimize agents --file AGENTS.md
uv run codexopt optimize skills --glob ".codex/skills/**/SKILL.md"
uv run codexopt apply --kind skills --dry-run
uv run codexopt apply --kind skills
uv run codexopt report --output codexopt-report.md

Files to inspect after running:

  • .codexopt/runs/*/scan.json
  • .codexopt/runs/*/benchmark.json
  • .codexopt/runs/*/optimize.json
  • .codexopt/runs/*/apply.json
  • .codexopt/backups/*
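
To find the newest run of a given kind without opening state.json, one option is to sort the run directories by name, since the ids begin with a fixed-width timestamp. The Python sketch below assumes the default .codexopt layout described earlier; latest_run_dir is a hypothetical helper and is not part of the CodexOpt CLI.

```python
from pathlib import Path


def latest_run_dir(root, kind):
    """Return the newest run directory whose id ends with -<kind>, or None."""
    runs = Path(root) / "runs"
    if not runs.is_dir():
        return None
    # Fixed-width timestamps mean lexicographic order is chronological order.
    matches = sorted(
        p for p in runs.iterdir()
        if p.is_dir() and p.name.endswith("-" + kind)
    )
    return matches[-1] if matches else None
```

For example, latest_run_dir(".codexopt", "optimize-skills") would point at the directory holding the most recent optimize.json for skills.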

CI

A GitHub Actions workflow is included at .github/workflows/ci.yml and runs:

  • uv lock --check for lockfile consistency.
  • uv sync --extra dev for environment setup.
  • Ruff lint checks.
  • Pytest tests.
  • Package build (uv build).

It does not publish packages.

Development

uv lock
uv sync --extra dev
uv run --no-sync ruff check src tests
PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --no-sync pytest -q
uv build

FAQ / Troubleshooting

codexopt apply says "no optimization run found"

Cause:

  • No prior optimization run for the selected kind.
  • state.json does not contain the expected latest run pointer.

Fix:

uv run codexopt optimize agents
uv run codexopt apply --kind agents

Or pass an explicit run:

uv run codexopt apply --kind agents --run-id <run_id>

--engine gepa did not use GEPA

Cause:

  • gepa is not installed, or
  • reflection_model is missing.

Behavior:

  • CodexOpt falls back to heuristic optimization when GEPA is unavailable or errors.

Fix (make sure gepa is installed in the environment, then pass a reflection model):

uv run codexopt optimize agents --engine gepa --reflection-model <model_name>

apply --dry-run says files would be applied, but nothing changed

Expected behavior:

  • --dry-run reports candidate applications without writing files.

To write changes, run again without --dry-run:

uv run codexopt apply --kind agents

Build fails with network/isolation issues

If your environment blocks dependency resolution in isolated builds, use:

uv build

Pytest fails due to unrelated external plugins

Some environments auto-load global pytest plugins that can break local tests. Run with plugin autoload disabled:

PYTEST_DISABLE_PLUGIN_AUTOLOAD=1 uv run --no-sync pytest -q

Optimization produced no applied changes

Cause:

  • Best candidate delta is below optimization.min_apply_delta, or
  • File content is already equivalent.

Fix:

  • Lower optimization.min_apply_delta in codexopt.yaml, then re-run optimize/apply.

License

MIT. See LICENSE.

Author

  • Shashi (shashi@super-agentic.ai)
