docs: update agentic maintenance docs with latest features #25919
Agent-Logs-Url: https://github.com/github/gh-aw/sessions/dda184a0-414f-4c6e-b9d9-92b4d0bc2e5e
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

- Add cache-memory cleanup section to Ephemerals guide
- Add missing operations: update, upgrade, clean_cache_memories, validate
- Fix safe_outputs operation description (replay, not auto-close)
- Add aw.json configuration section (custom runner, disable maintenance)
- Add automatic cleanup section to Cache Memory reference
- Docs build passes with all links valid

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/dda184a0-414f-4c6e-b9d9-92b4d0bc2e5e
Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull request overview
Updates the documentation for agentic maintenance/ephemerals to reflect newly supported maintenance operations (including cache-memory cleanup) and corrects prior inaccuracies, with a small regeneration/adjustment to the maintenance workflow YAML.
Changes:
- Documented cache-memory automatic + manual cleanup and added missing maintenance operations (`update`, `upgrade`, `clean_cache_memories`, `validate`).
- Corrected `safe_outputs` operation description and expanded maintenance operation/configuration details (incl. `aw.json`).
- Updated the Agent Factory status list and adjusted the generated `agentics-maintenance.yml` validate job runner/steps.
| File | Description |
|---|---|
| `docs/src/content/docs/reference/cache-memory.md` | Adds an “Automatic Cleanup” section and links to maintenance docs. |
| `docs/src/content/docs/guides/ephemerals.md` | Adds cache-memory cleanup docs, expands manual maintenance operations, and documents `aw.json` maintenance config. |
| `docs/src/content/docs/agent-factory-status.mdx` | Updates the published workflow/status table (new entries + engine updates). |
| `.github/workflows/agentics-maintenance.yml` | Updates `validate_workflows` runner and removes the Docker daemon startup step. |
Copilot's findings
Comments suppressed due to low confidence (1)
docs/src/content/docs/guides/ephemerals.md:120
- The `validate` operation output is not always “full”: `run_validate_workflows.cjs` truncates captured output to 50,000 characters before writing to the issue/comment. Update the doc to mention that the output may be truncated (and point readers to the workflow logs for the complete output).
- **`validate`**: Runs `gh aw compile --validate --no-emit --zizmor --actionlint --poutine --verbose`. If errors or warnings are found, creates or updates a GitHub issue titled `[aw] workflow validation findings` with the full output.
- Files reviewed: 4/4 changed files
- Comments generated: 3
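
The truncation behavior raised in the suppressed comment above can be illustrated with a short sketch. This is Python for illustration only, not the code in `run_validate_workflows.cjs`; the function names and the wording of the truncation notice are hypothetical, and only the 50,000-character limit comes from the review comment.

```python
# Illustrative sketch only: the real logic lives in run_validate_workflows.cjs.
# Function names and the notice text are hypothetical.

MAX_OUTPUT_CHARS = 50_000  # limit quoted in the review comment


def truncate_for_issue(output, limit=MAX_OUTPUT_CHARS):
    """Return (text, was_truncated), keeping at most `limit` characters."""
    if len(output) <= limit:
        return output, False
    return output[:limit], True


def render_issue_body(output, limit=MAX_OUTPUT_CHARS):
    """Build an issue body that says so when the output was cut short."""
    text, was_truncated = truncate_for_issue(output, limit)
    if was_truncated:
        # Point readers at the workflow logs, as the review comment suggests.
        text += "\n\n[Output truncated; see the workflow logs for the complete output.]"
    return text
```

A doc sentence along the lines of "with the output, which may be truncated" would match this behavior more closely than "with the full output".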
    @@ -315,11 +315,6 @@ jobs:
          - name: Build gh-aw
            run: make build

`validate_workflows` runs `run_validate_workflows.cjs`, which executes `gh aw compile --validate ... --poutine`. The poutine scanner is implemented via `docker run` (see `pkg/cli/poutine.go`), so this job needs a running Docker daemon. After removing the Docker start step (and switching to `ubuntu-slim`), validation may fail on runners where Docker isn’t already running. Consider re-adding an explicit Docker availability/daemon start check (or conditionally skipping `--poutine` when Docker is unavailable).
Suggested change:

          - name: Ensure Docker is available
            shell: bash
            run: |
              set -euo pipefail
              if ! command -v docker >/dev/null 2>&1; then
                echo "Docker CLI is required for workflow validation with --poutine, but it is not installed."
                exit 1
              fi
              if docker info >/dev/null 2>&1; then
                echo "Docker daemon is already available."
                exit 0
              fi
              echo "Docker daemon is not running; attempting to start it."
              if command -v sudo >/dev/null 2>&1; then
                sudo service docker start >/dev/null 2>&1 || sudo systemctl start docker >/dev/null 2>&1 || true
              else
                service docker start >/dev/null 2>&1 || systemctl start docker >/dev/null 2>&1 || true
              fi
              sleep 5
              if ! docker info >/dev/null 2>&1; then
                echo "Docker is required for workflow validation with --poutine, but the daemon could not be started."
                exit 1
              fi
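
The alternative mentioned in the comment above, conditionally skipping `--poutine` when Docker is unavailable, might look roughly like the following. This is a Python sketch under stated assumptions, not the project's implementation: `docker_available` and `build_validate_command` are hypothetical helpers, and the flag list is copied from the `validate` description in the Ephemerals guide.

```python
# Illustrative sketch only: helper names are hypothetical; the flags are the
# ones quoted in the Ephemerals guide for the `validate` operation.
import shutil
import subprocess

BASE_COMMAND = [
    "gh", "aw", "compile", "--validate", "--no-emit",
    "--zizmor", "--actionlint", "--poutine", "--verbose",
]


def docker_available(timeout_seconds=10):
    """Return True only if the docker CLI exists and `docker info` succeeds."""
    if shutil.which("docker") is None:
        return False
    try:
        result = subprocess.run(
            ["docker", "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def build_validate_command(has_docker):
    """Drop --poutine when Docker is unavailable, since poutine runs via docker."""
    if has_docker:
        return list(BASE_COMMAND)
    return [arg for arg in BASE_COMMAND if arg != "--poutine"]
```

Skipping the scanner silently would hide a coverage gap, so a caller would probably want to log that `--poutine` was omitted; failing fast, as in the suggested step, is the stricter option.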

### Cache-Memory Cleanup

The maintenance workflow automatically cleans up outdated [cache-memory](/gh-aw/reference/cache-memory/) entries on every scheduled run. Cache keys follow the pattern `memory-{workflow}-{run-id}`, and the cleanup job groups caches by workflow prefix, keeps the latest run ID per group, and deletes older entries. This prevents cache storage from growing unboundedly as workflows run repeatedly.
The cache key format here is oversimplified. Current cache-memory keys can include additional prefix segments (e.g., integrity level + policy hash) and always end with the numeric run ID (see the key format section of `docs/src/content/docs/reference/safe-outputs-specification.md`). Suggest describing the pattern as `memory-{...}-{runID}` (or the full format) and describing grouping as “everything before the final numeric run ID” rather than “workflow prefix”.
This issue also appears on line 120 of the same file.
Suggested change:

    - The maintenance workflow automatically cleans up outdated [cache-memory](/gh-aw/reference/cache-memory/) entries on every scheduled run. Cache keys follow the pattern `memory-{workflow}-{run-id}`, and the cleanup job groups caches by workflow prefix, keeps the latest run ID per group, and deletes older entries. This prevents cache storage from growing unboundedly as workflows run repeatedly.
    + The maintenance workflow automatically cleans up outdated [cache-memory](/gh-aw/reference/cache-memory/) entries on every scheduled run. Cache keys follow the pattern `memory-{...}-{runID}`, ending with the numeric run ID and potentially including additional prefix segments before it. The cleanup job groups caches by everything before the final numeric run ID, keeps the latest run ID per group, and deletes older entries. This prevents cache storage from growing unboundedly as workflows run repeatedly.
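
The grouping rule described above (group by everything before the final numeric run ID, keep the latest run ID per group) can be sketched as follows. This is an illustrative Python sketch, not the cleanup job's actual code; the function name `plan_cache_cleanup`, the regular expression, and the sample keys in the usage note are hypothetical.

```python
# Illustrative sketch only: not the cleanup job's implementation.
# Assumes keys look like memory-{...}-{runID} with a numeric run ID at the end.
import re
from collections import defaultdict

_KEY_RE = re.compile(r"^(?P<prefix>memory-.+)-(?P<run_id>\d+)$")


def plan_cache_cleanup(keys):
    """Return the cache keys to delete, keeping the latest run ID per prefix."""
    groups = defaultdict(list)
    for key in keys:
        match = _KEY_RE.match(key)
        if match is None:
            continue  # not a cache-memory key; leave it alone
        run_id = int(match.group("run_id"))  # compare numerically, not as text
        groups[match.group("prefix")].append((run_id, key))

    to_delete = []
    for entries in groups.values():
        entries.sort()  # ascending by run ID
        to_delete.extend(key for _, key in entries[:-1])  # all but the latest
    return sorted(to_delete)
```

For example, given the hypothetical keys `memory-ci-doctor-90`, `memory-ci-doctor-100`, and `memory-ci-doctor-250`, the sketch keeps `memory-ci-doctor-250` and marks the other two for deletion. Comparing run IDs as integers matters here, because a plain string sort would rank `90` above `100`.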

The [agentic maintenance](/gh-aw/guides/ephemerals/#cache-memory-cleanup) workflow automatically cleans up outdated cache-memory entries on a schedule. Caches are grouped by key prefix (everything before the run ID), and only the latest entry per group is kept. Older entries are deleted to prevent unbounded storage growth.

You can also trigger cleanup manually from the GitHub Actions UI by running the `Agentic Maintenance` workflow with the `clean_cache_memories` operation.
This section implies cache cleanup is always automatic, but the cleanup job only runs when the `Agentic Maintenance` workflow exists and is enabled (the maintenance workflow is generated only when workflows use `expires`, and can be disabled via `.github/workflows/aw.json`). Consider clarifying this prerequisite to avoid confusing users who use cache-memory without maintenance enabled.
Suggested change:

    - The [agentic maintenance](/gh-aw/guides/ephemerals/#cache-memory-cleanup) workflow automatically cleans up outdated cache-memory entries on a schedule. Caches are grouped by key prefix (everything before the run ID), and only the latest entry per group is kept. Older entries are deleted to prevent unbounded storage growth.
    - You can also trigger cleanup manually from the GitHub Actions UI by running the `Agentic Maintenance` workflow with the `clean_cache_memories` operation.
    + If the [agentic maintenance](/gh-aw/guides/ephemerals/#cache-memory-cleanup) workflow exists and is enabled, it automatically cleans up outdated cache-memory entries on a schedule. This maintenance workflow is only generated when workflows use `expires`, and it can be disabled via `.github/workflows/aw.json`. Caches are grouped by key prefix (everything before the run ID), and only the latest entry per group is kept. Older entries are deleted to prevent unbounded storage growth.
    + If `Agentic Maintenance` is present and enabled, you can also trigger cleanup manually from the GitHub Actions UI by running that workflow with the `clean_cache_memories` operation.

The Ephemerals guide was missing several maintenance operations and the new cache-memory cleanup feature added in #25908. The `safe_outputs` operation was also misdescribed.

Ephemerals guide (`guides/ephemerals.md`)
- Added the missing operations: `update`, `upgrade`, `clean_cache_memories`, `validate`
- Fixed the `safe_outputs` description: it was "Auto-close expired issues" (wrong) and now correctly describes replay from a previous run
- Documented `aw.json` for custom runners and disabling maintenance: `{ "maintenance": { "runs_on": "ubuntu-latest" } }` and `{ "maintenance": false }`

Cache Memory reference (`reference/cache-memory.md`)
- Added an automatic cleanup section, including the `clean_cache_memories` manual trigger
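
The two `aw.json` shapes listed above can be read with a small helper like the one below. This is a hedged Python sketch, not gh-aw's actual parser: the function name `maintenance_settings` and the returned dictionary are hypothetical, and treating a missing `maintenance` key as "not disabled, no runner override" is an assumption.

```python
# Illustrative sketch only: not gh-aw's parser. The function name and return
# shape are hypothetical; only the two JSON forms come from the PR description.
import json


def maintenance_settings(aw_json_text):
    """Interpret the `maintenance` field of an aw.json document."""
    config = json.loads(aw_json_text)
    # Assumption: an absent key means maintenance is not disabled.
    maintenance = config.get("maintenance", {})
    if maintenance is False:
        return {"disabled": True, "runs_on": None}
    if not isinstance(maintenance, dict):
        raise ValueError("maintenance must be false or an object")
    return {"disabled": False, "runs_on": maintenance.get("runs_on")}
```

With this sketch, `{ "maintenance": false }` yields a disabled result, while `{ "maintenance": { "runs_on": "ubuntu-latest" } }` yields a runner override of `ubuntu-latest`.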