Merged
35 changes: 35 additions & 0 deletions .github/copilot-instructions.md
Original file line number Diff line number Diff line change
@@ -1 +1,36 @@



## vexp context tools <!-- vexp v1.2.30 -->

**MANDATORY: use `run_pipeline` — do NOT grep, glob, or read files manually.**
vexp returns pre-indexed, graph-ranked context in a single call.

### Workflow
1. `run_pipeline` with your task description — ALWAYS FIRST (replaces all other tools)
2. Make targeted changes based on the context returned
3. `run_pipeline` again only if you need more context

### Available MCP tools
- `run_pipeline` — **PRIMARY TOOL**. Runs capsule + impact + memory in 1 call.
Auto-detects intent. Includes file content. Example: `run_pipeline({ "task": "fix auth bug" })`
- `get_context_capsule` — lightweight, for simple questions only
- `get_impact_graph` — impact analysis of a specific symbol
- `search_logic_flow` — execution paths between functions
- `get_skeleton` — compact file structure
- `index_status` — indexing status
- `get_session_context` — recall observations from sessions
- `search_memory` — cross-session search
- `save_observation` — persist insights (prefer run_pipeline's observation param)

### Agentic search
- Do NOT use built-in file search, grep, or codebase indexing — always call `run_pipeline` first
- If you spawn sub-agents or background tasks, pass them the context from `run_pipeline`
rather than letting them search the codebase independently

### Smart Features
Intent auto-detection, hybrid ranking, session memory, auto-expanding budget.

### Multi-Repo
`run_pipeline` auto-queries all indexed repos. Use `repos: ["alias"]` to scope. Run `index_status` to see aliases.
<!-- /vexp -->
6 changes: 4 additions & 2 deletions AGENTS.md
@@ -19,11 +19,13 @@ the task accurately.
<!-- AGENTS-INDEX-START -->

| Doc | When to load | Last validated | Status | Paths |
-| ---------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | -------------- | ------- | ------------------------------------------------------------------------- |
+|---|---|---|---|---|
| [AGENTS.md structure](docs/agents-md-structure.md) | Load when editing AGENTS.md preamble, modifying the index table format, or updating build-index.py. | 2026-03-12 | current | `AGENTS.md`<br>`scripts/agents/build-index.py` |
| [Automation workflow](docs/automation-workflow.md) | Load when modifying GitHub Actions workflows, debugging CI runs, or changing staleness detection logic. | 2026-03-12 | current | `.github/workflows/**`<br>`scripts/agents/**` |
| [Frontmatter schema](docs/frontmatter-schema.md) | Load when authoring new docs, reviewing frontmatter validation, or modifying the build-index script. | 2026-03-12 | current | `docs/**`<br>`scripts/agents/build-index.py`<br>`.agentsrc.yaml` |
-| [LLM prompt design (provider-agnostic)](docs/llm-prompt-design-agnostic.md) | Load when implementing a provider-agnostic LLM layer or porting frontmatter generation to non-Claude providers. | 2026-03-12 | current | |
+| [Get started](docs/get-started.md) | Load when setting up code-docs in a new repo, migrating existing docs to use frontmatter, or troubleshooting initial configuration. | 2026-03-28 | current | |
+| [How it works](docs/how-it-works.md) | Load when evaluating code-docs for adoption, onboarding to the system, or seeking an end-to-end understanding of the documentation lifecycle. | 2026-03-28 | current | |
+| [LLM prompt design (provider-agnostic)](docs/llm-prompt-design-agnostic.md) | Load when implementing a provider-agnostic LLM layer or porting frontmatter generation to non-Claude providers. | 2026-03-12 | current | |
| [LLM prompt design for Claude Code](docs/llm-prompt-design-claude.md) | Load when modifying the Claude Code task prompt, adjusting CI frontmatter generation, or debugging LLM output. | 2026-03-12 | current | `.github/agents/**`<br>`.github/workflows/docs-sync.yml` |

<!-- AGENTS-INDEX-END -->
9 changes: 8 additions & 1 deletion README.md
@@ -96,7 +96,14 @@ No additional configuration required.

**Scale.** At a few dozen docs, scanning the index is fast and cheap. At several hundred, tag-based or semantic filtering would help. That's out of scope here.

-## Detailed specs
+## Documentation
+
+### Guides
+
+- [How it works](docs/how-it-works.md) — end-to-end walkthrough of the system lifecycle
+- [Get started](docs/get-started.md) — add code-docs to an existing repo
+
+### Detailed specs

- [Frontmatter schema](docs/frontmatter-schema.md)
- [AGENTS.md structure](docs/agents-md-structure.md)
26 changes: 7 additions & 19 deletions docs/automation-workflow.md
@@ -17,8 +17,6 @@ title: "Automation workflow"

This system uses two discrete GitHub Actions workflows. The first keeps frontmatter and the AGENTS.md index in sync whenever a doc changes. The second detects stale docs on a schedule and when relevant code paths change. Both workflows write their changes back to the repo via pull requests rather than committing directly to any branch.

---

## Overview

| Workflow | File | Trigger | Responsibility |
@@ -30,8 +28,6 @@ The two workflows are intentionally decoupled. `docs-sync.yml` owns content accu

Both workflows delegate their logic to scripts in a `scripts/agents/` directory. Keeping logic out of YAML makes it testable locally and reusable across repos.

---

## Workflow 1: `docs-sync.yml`

This workflow runs whenever a file in `docs/` is added or modified on any branch. It calls an LLM to generate or refresh frontmatter, then regenerates the AGENTS.md index table, and opens a PR with both sets of changes.
@@ -70,8 +66,6 @@ The workflow opens a PR using the `peter-evans/create-pull-request` action. The

If a PR with this label already exists for the branch, the action updates it rather than opening a duplicate.

---

## Workflow 2: `docs-staleness.yml`

This workflow detects docs that are stale by two independent signals: time elapsed since `lastValidated`, and code changes that touched paths listed in a doc's frontmatter. It opens a PR that updates the `status` field in AGENTS.md for any flagged docs.
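
The two signals can be sketched roughly as follows. This is an illustration only: the helper name `is_stale`, the use of `fnmatch` for glob matching, and the 90-day fallback are assumptions, not the actual `check-staleness.py` logic.

```python
from datetime import date, timedelta
from fnmatch import fnmatch


def is_stale(frontmatter, changed_files, today, default_max_age_days=90):
    """Return the reasons (possibly none) that a doc looks stale.

    Hypothetical helper; the real check-staleness.py may differ.
    """
    reasons = []

    # Signal 1: time elapsed since lastValidated exceeds maxAgeDays.
    last_validated = date.fromisoformat(frontmatter["lastValidated"])
    max_age = frontmatter.get("maxAgeDays", default_max_age_days)
    if today - last_validated > timedelta(days=max_age):
        reasons.append("age")

    # Signal 2: a changed file matches one of the doc's path globs.
    # fnmatch's "*" also matches "/", so "src/auth/**" covers nested files.
    patterns = frontmatter.get("paths") or []
    if any(fnmatch(f, p) for f in changed_files for p in patterns):
        reasons.append("paths")

    return reasons
```

Keeping the two checks independent means a doc with `paths: []` is still covered by the time-based signal.
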
@@ -117,8 +111,6 @@ The staleness workflow opens a PR targeting `main` (not the triggering branch).

The PR is a notification mechanism. A human reviews it, validates the flagged docs, updates `lastValidated` in the relevant frontmatter, and merges. The `docs-sync.yml` workflow then picks up the frontmatter change and regenerates the index.

---

## Provider-agnostic LLM layer

The scripts call a thin wrapper (`scripts/agents/llm.py`) that abstracts the LLM provider. The wrapper reads two environment variables:
@@ -144,8 +136,6 @@ defaults:

Secrets are stored as GitHub Actions secrets (`AGENTS_LLM_API_KEY`) and never committed to the repo.

---

## Shared tooling

Both workflows call scripts in `scripts/agents/`. The scripts and their responsibilities are:
@@ -159,8 +149,6 @@ Both workflows call scripts in `scripts/agents/`. The scripts and their responsi

All scripts accept a `--dry-run` flag that prints intended changes without writing them. This makes local testing straightforward without needing to stub the LLM.
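
One way such a flag could be wired up is sketched below. The function names `parse_args` and `apply_change` and the message format are hypothetical; only the `--dry-run` flag itself comes from the description above.

```python
import argparse
import sys


def parse_args(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("--dry-run", action="store_true",
                        help="print intended changes without writing them")
    return parser.parse_args(argv)


def apply_change(path, new_text, dry_run, out=sys.stdout):
    """Write new_text to path, or only describe the write when dry_run is set."""
    if dry_run:
        print(f"[dry-run] would write {len(new_text)} characters to {path}", file=out)
        return False
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(new_text)
    return True
```

Routing every write through a single function like this keeps the dry-run behavior in one place.
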

---

## Failure modes and guards

**LLM call fails:** The frontmatter script catches API errors and writes a `status: llm-error` field to the affected doc's frontmatter. The index regeneration step still runs and reflects the error status in AGENTS.md. The PR is still opened so a human can see which file failed.
@@ -173,17 +161,17 @@ All scripts accept a `--dry-run` flag that prints intended changes without writi

**Rate limiting:** The frontmatter script processes changed files sequentially with a configurable delay between calls (`AGENTS_LLM_DELAY_MS`, default 500ms). For repos with many simultaneous doc changes, this prevents bursting the API.
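
A minimal sketch of sequential processing with a delay, assuming the delay is read from the environment at call time. `AGENTS_LLM_DELAY_MS` and its 500 ms default come from the text above; `process_sequentially` and the injectable `sleep` parameter are hypothetical.

```python
import os
import time


def process_sequentially(paths, handle, sleep=time.sleep):
    """Call handle(path) for each path in order, pausing between calls."""
    delay_s = int(os.environ.get("AGENTS_LLM_DELAY_MS", "500")) / 1000
    results = []
    for i, path in enumerate(paths):
        if i > 0:
            sleep(delay_s)  # no pause is needed before the first call
        results.append(handle(path))
    return results
```

Passing `sleep` as a parameter lets a test substitute a recorder for `time.sleep`, so the delay logic can be checked without waiting.
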

---

## Reusable template strategy

To stamp this system onto a new repo:

-1. Copy `.github/workflows/docs-sync.yml` and `docs-staleness.yml`.
-2. Copy `scripts/agents/` in its entirety.
-3. Add `.agentsrc.yaml` to the repo root and set `defaults.maxAgeDays` and `llm` config.
-4. Add `AGENTS_LLM_API_KEY` to the repo's GitHub Actions secrets.
-5. Create the `docs/` directory and add an initial `AGENTS.md`.
+1. Copy `.github/workflows/docs-sync.yml` and `.github/workflows/docs-staleness.yml`.
+2. Copy `.github/agents/frontmatter-prompt.md` (the task prompt used by the docs-sync workflow).
+3. Copy `scripts/agents/` in its entirety.
+4. Copy `requirements.txt`.
+5. Add `.agentsrc.yaml` to the repo root and set `defaults.maxAgeDays`. The `llm:` block is optional and only needed if using the provider-agnostic LLM layer described above.
+6. Add `ANTHROPIC_API_KEY` to the repo's GitHub Actions secrets. This is the secret name used by `docs-sync.yml` for the Claude Code implementation. If using the provider-agnostic LLM layer instead, the secret name is `AGENTS_LLM_API_KEY`.
+7. Create the `docs/` directory (if it doesn't already exist) and add an initial `AGENTS.md` with boundary markers (`<!-- AGENTS-INDEX-START -->` and `<!-- AGENTS-INDEX-END -->`).

No other configuration is required. The push trigger paths in `docs-staleness.yml` should be updated to reflect the repo's actual code paths, but the workflow runs safely without them (it will only perform time-based checks until paths are configured).

173 changes: 173 additions & 0 deletions docs/get-started.md
@@ -0,0 +1,173 @@
---
description: "Load when setting up code-docs in a new repo, migrating existing docs to use frontmatter, or troubleshooting initial configuration."
lastValidated: "2026-03-28"
maxAgeDays: 90
tags:
- migration
- onboarding
- setup
title: "Get started"
---

# Get started

This guide walks through adding code-docs to an existing repo that already has a `docs/` folder. By the end, your repo will have automated frontmatter generation, an AGENTS.md index, and staleness detection.

## What to copy

The [reusable template strategy](automation-workflow.md#reusable-template-strategy) section in the automation workflow doc has the authoritative file checklist. In summary, you need:

- **Workflows**: `.github/workflows/docs-sync.yml` and `.github/workflows/docs-staleness.yml`
- **Task prompt**: `.github/agents/frontmatter-prompt.md`
- **Scripts**: `scripts/agents/` (all three files: `build-index.py`, `check-staleness.py`, `frontmatter.py`)
- **Dependencies**: `requirements.txt` (`PyYAML >= 6.0`)
- **Config**: `.agentsrc.yaml`
- **Index**: `AGENTS.md` with boundary markers

Copy these from the [code-docs repo](https://github.com/armstrongl/code-docs) into your repo, preserving the directory structure.

After copying, two things need attention:

**GitHub Actions secret.** Add `ANTHROPIC_API_KEY` to your repo's GitHub Actions secrets (Settings > Secrets and variables > Actions). The `docs-sync.yml` workflow uses this to call Claude Code for frontmatter generation.

**Staleness trigger paths.** Open `.github/workflows/docs-staleness.yml` and update the `push.paths` section to include your repo's actual code paths (e.g., `src/**`, `lib/**`). Without this, the workflow only performs time-based staleness checks — it won't detect when code changes make a doc stale.

## Configure

### `.agentsrc.yaml`

The default config sets a single value:

```yaml
defaults:
  maxAgeDays: 90
```

This is the fallback staleness threshold for any doc that doesn't set its own `maxAgeDays` in frontmatter. Adjust it to match your team's review cadence. 90 days is a reasonable default for most repos.

### `AGENTS.md`

Open `AGENTS.md` and customize the preamble to describe your repo. The preamble is the first thing agents read on every session. Keep it to three to five sentences that explain what the repo contains and what kind of work agents are likely to do.

Everything above the `<!-- AGENTS-INDEX-START -->` marker is human-authored and never touched by automation. Everything between the markers is regenerated by `build-index.py` on every run.
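
A minimal sketch of marker-based regeneration, assuming the file is handled as a single string. `replace_index` is a hypothetical name, and the real `build-index.py` may work differently.

```python
START = "<!-- AGENTS-INDEX-START -->"
END = "<!-- AGENTS-INDEX-END -->"


def replace_index(agents_md, new_table):
    """Swap the text between the markers and leave everything else untouched.

    Raises ValueError (from str.index) if either marker is missing.
    """
    start = agents_md.index(START) + len(START)
    end = agents_md.index(END, start)
    return agents_md[:start] + "\n\n" + new_table.strip() + "\n\n" + agents_md[end:]
```

Because only the span between the markers is rewritten, a hand-written preamble survives every regeneration.
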

## Migrate existing docs

If your repo already has Markdown files in `docs/`, they need YAML frontmatter before they can appear in the AGENTS.md index. There are two paths: manual and automated.

### Manual: add frontmatter yourself

Add a YAML frontmatter block to the top of each doc. The fields appear in alphabetical order:

```yaml
---
description: "Load when [trigger conditions for when an agent should read this doc]."
lastValidated: "2026-03-28"
maxAgeDays: 90
paths:
- "src/auth/**"
- "src/middleware/session.ts"
tags:
- auth
- security
title: "Authentication flow"
---
```

Two fields are yours to set:

- `lastValidated` — set this to today's date. You are confirming the doc is accurate as of now.
- `maxAgeDays` — set this to your preferred review interval, or omit it to inherit the default from `.agentsrc.yaml`.

The other four fields (`title`, `description`, `paths`, `tags`) are LLM-owned. You can write them yourself, or leave them empty and let the automation fill them in (see below). If you write them manually, the automation will never overwrite them.

The `description` field is the most important. Write it as a trigger condition starting with "Load when", not a topic summary. Keep it under 160 characters.
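
These two rules are easy to check mechanically. The sketch below is a hypothetical helper, not part of the code-docs scripts; it reads "under 160 characters" as a length of at most 159.

```python
def check_description(description, max_len=160):
    """Return a list of problems with a frontmatter description (empty if fine)."""
    problems = []
    if not description.startswith("Load when"):
        problems.append('does not start with "Load when"')
    if len(description) >= max_len:
        problems.append(f"is {len(description)} characters; keep it under {max_len}")
    return problems
```

For example, `check_description("Authentication overview")` reports the missing "Load when" prefix, while a trigger-style description of ordinary length returns an empty list.
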

For docs that don't map to specific code paths (architectural overviews, onboarding guides, process docs), set `paths: []` to indicate there are no code paths. Staleness detection will use time-based checks only. If you omit `paths`, automation may add `paths: []` for you.

### Automated: let docs-sync generate frontmatter

If you prefer, add minimal frontmatter to each doc (just `lastValidated` and `maxAgeDays`) and push:

```yaml
---
lastValidated: "2026-03-28"
maxAgeDays: 90
---
```

When the push lands, `docs-sync.yml` detects the changed files and calls Claude Code to generate the missing `title`, `description`, `paths`, and `tags`. It opens a PR with the generated values so you can review them before merging.

This is the fastest path for migrating many docs at once. Push them all in one commit, review the generated descriptions in the PR, and correct any that don't read as clear trigger conditions.

## Verify

After adding frontmatter to your docs (manually or via automation), run the index builder to confirm everything is wired correctly:

```bash
python scripts/agents/build-index.py
```

Check the output in `AGENTS.md`:

- Each doc should have a row in the index table with correct title, description, date, and status.
- No rows should show `missing fields` warnings. If they do, the doc is missing one of the four required frontmatter fields (`title`, `description`, `lastValidated`, `maxAgeDays`).
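
The required-field check can be sketched as below, assuming the frontmatter has already been parsed into a dict (for example with PyYAML). The name `missing_fields` is hypothetical, and treating an empty string as missing is an assumption rather than documented `build-index.py` behavior.

```python
REQUIRED_FIELDS = ("title", "description", "lastValidated", "maxAgeDays")


def missing_fields(frontmatter):
    """List required fields that are absent or empty in a parsed frontmatter dict."""
    return [
        name
        for name in REQUIRED_FIELDS
        if frontmatter.get(name) in (None, "")
    ]
```

A doc carrying only the minimal `lastValidated` and `maxAgeDays` block would report `title` and `description` as missing until they are filled in.
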

If you're using a virtual environment for Python dependencies, set it up first:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

## Nested directory structures

The default `build-index.py` scans `docs/*.md` — a flat glob that does not recurse into subdirectories. If your repo organizes docs in a nested structure like `docs/guide/`, you need to update the `--docs-dir` argument and fix the path resolution in the script.

The main issue is the repo-root derivation. The script uses `os.path.dirname(docs_dir)` to find the repo root, which assumes a single directory level. For `docs/guide/`, you need an extra `os.path.dirname()` call:

```python
# Default: one level up from docs/ -> repo root
repo_root = os.path.dirname(docs_dir)

# Nested: two levels up from docs/guide/ -> repo root
repo_root = os.path.dirname(os.path.dirname(docs_dir))
```

Without this fix, AGENTS.md links will have wrong relative paths (e.g., `guide/getting-started.md` instead of `docs/guide/getting-started.md`).
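
The effect on links can be reproduced with plain path arithmetic. This sketch assumes links are computed relative to the derived repo root; `index_link` and the `repo/` prefix are illustrative names, not taken from `build-index.py`.

```python
import os


def index_link(doc_path, repo_root):
    """Return the link AGENTS.md would use for doc_path, relative to repo_root."""
    return os.path.relpath(doc_path, repo_root).replace(os.sep, "/")


docs_dir = os.path.join("repo", "docs", "guide")
doc = os.path.join(docs_dir, "getting-started.md")

one_up = os.path.dirname(docs_dir)                   # repo/docs (not the repo root)
two_up = os.path.dirname(os.path.dirname(docs_dir))  # repo (the actual root)

wrong_link = index_link(doc, one_up)  # "guide/getting-started.md"
right_link = index_link(doc, two_up)  # "docs/guide/getting-started.md"
```

One related pitfall: `os.path.dirname("docs/guide/")` returns `"docs/guide"`, because a trailing slash counts as an empty final component. If `--docs-dir` may be passed with a trailing slash, normalizing it first with `os.path.normpath` avoids an off-by-one in the number of `dirname()` calls.
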

## Troubleshooting

**Docs not appearing in AGENTS.md.** The doc is missing a valid frontmatter block (delimited by `---` at the top of the file), or one of the four required fields (`title`, `description`, `lastValidated`, `maxAgeDays`) is absent. `build-index.py` skips files with no frontmatter and emits a warning row for files with incomplete frontmatter. Run `build-index.py` locally and check for warnings.

**docs-sync PR not appearing after push.** The `ANTHROPIC_API_KEY` secret is not set, or the Claude Code step failed. The workflow uses `continue-on-error: true` on the Claude Code step, so failures are silent. Check the workflow run logs in GitHub Actions for details. The index regeneration step (`build-index.py`) still runs even if Claude Code fails, so the AGENTS.md table will update — but LLM-owned fields that were missing will stay missing.

**Poor auto-generated descriptions.** Descriptions are never auto-regenerated after initial creation. If Claude Code produced a vague or inaccurate description, you must manually edit the `description` field in the doc's frontmatter. Review generated descriptions carefully in the first PR, because automation will not surface them for review again.

**Markdownlint MD025 errors.** If your repo uses markdownlint, the YAML frontmatter `title:` field is treated as an H1 heading by default. Combined with a `# Heading` in the doc body, this triggers MD025 (multiple top-level headings). Add this to your `.markdownlint-cli2.yaml`:

```yaml
# Rules sit under the top-level `config:` key in markdownlint-cli2 config files
config:
  MD025:
    front_matter_title: ""
```

**AGENTS.md invisible to git.** Some tools (such as OpenAI Codex CLI) add generated files to `.git/info/exclude`, which makes them invisible to `git status`. Run `git check-ignore -v AGENTS.md` to check. If it is excluded, remove the relevant line from `.git/info/exclude`.

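**`pip install PyYAML` fails with an `externally-managed-environment` error (PEP 668).** On systems whose Python installation is marked as externally managed (recent Debian, Ubuntu, and Homebrew Pythons, for example), `pip install` outside a virtual environment is blocked. Use a virtual environment instead: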

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

**Multi-commit pushes miss changed files.** The default `docs-sync.yml` uses `HEAD~1..HEAD` to detect changed files, which only sees the last commit. For pushes with multiple commits, use the full push range in your workflow:

```yaml
env:
  BEFORE_SHA: ${{ github.event.before }}
  AFTER_SHA: ${{ github.sha }}
run: |
  FILES=$(git diff --name-only --diff-filter=AM "$BEFORE_SHA" "$AFTER_SHA" -- 'docs/*.md')
```
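
If the changed-file detection is moved into a script, the same range logic might look like the sketch below. `diff_command` is a hypothetical helper. It also covers one edge case: on the first push of a new branch GitHub reports `before` as forty zeros, and the fallback to the last commit shown here is an assumption (it would still fail on a repository's very first commit).

```python
ZERO_SHA = "0" * 40  # value of `before` when a branch is pushed for the first time


def diff_command(before_sha, after_sha, pathspec="docs/*.md"):
    """Build the git argv that lists docs added or modified in a push."""
    if before_sha == ZERO_SHA:
        # No previous tip to diff against; fall back to the last commit only.
        revs = [f"{after_sha}~1", after_sha]
    else:
        revs = [before_sha, after_sha]
    return ["git", "diff", "--name-only", "--diff-filter=AM", *revs, "--", pathspec]
```

The returned list can be passed to `subprocess.run(..., capture_output=True, text=True)` inside a checkout with enough history fetched to contain both commits.
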