Add worldforge runs prune retention policy#297
Conversation
omarespejel
left a comment
There was a problem hiding this comment.
I mirrored this into my fork for extra CodeRabbit/Qodo review. I think this is close, but I would fix a few retention-path issues before merging because this command deletes local artifacts.
- Retention profile precedence misses
--flag=valueforms.
The CLI checks exact strings in sys.argv, so valid argparse forms like --max-age-days=7 or --keep-latest=2 are treated as absent and can be overwritten by profile values. Since the docs say CLI flags override the profile, this can change deletion behavior.
I would use argparse defaults like None or SUPPRESS, or a helper that detects both --flag value and --flag=value, plus regression tests for both forms.
- The prune path safety check is too loose.
_ensure_inside_runs() only checks whether "runs" appears somewhere in the resolved path. A symlink or crafted report path under another runs/ directory could pass even if it is outside the intended workspace. Since apply_prune() eventually calls shutil.rmtree(), I think this should verify the resolved deletion target is under the exact resolved <workspace>/runs root before deleting.
Something like target.resolve().is_relative_to((workspace / "runs").resolve()) is the important property to enforce, with a symlink/crafted-report regression test.
keep_latest+--familysemantics are ambiguous and may surprise users.
Right now keep_latest is computed before applying families. If --family benchmark is used, a newer non-benchmark run can consume the keep slot and the newest benchmark run can become deletable. I think keep_latest should be computed inside the filtered family set, or the docs should explicitly say it is global across all run kinds.
Smaller follow-up: invalid retention profile JSON shapes should fail as WorldForgeError, not leak AttributeError/ValueError, and shutil.rmtree() failures should be wrapped so CLI users get a stable error instead of a traceback.
Address review feedback on AbdelStark#297: - Replace the lenient substring check in `_ensure_inside_runs` with a resolved-path `is_relative_to(runs_root)` comparison so crafted report paths and symlinks pointing outside `<workspace>/runs/` are rejected before `shutil.rmtree`. - Wrap `shutil.rmtree` failures in `WorldForgeError` so CLI callers see the stable error envelope rather than a traceback. - Scope `keep_latest` to the family filter: only entries that match the current `--family` selection (or all entries when no filter is set) populate the keep window, so a newer non-matching run can no longer consume the slot reserved for the filtered family. - Switch the `--max-age-days` and `--keep-latest` argparse defaults to `None` and decide profile precedence from the parsed value rather than from a substring search on `sys.argv`. Profile values now correctly yield to either the `--flag value` or `--flag=value` form. - Validate retention profile shapes in `parse_runs_retention`: non-int `max_age_days` / `keep_latest` (including `bool`) and non-string family entries raise `WorldForgeError` instead of bubbling up `AttributeError` / `ValueError`. - Document the new keep_latest-within-family rule, the `--flag=value` precedence guarantee, and the resolved-path delete check in `docs/src/run-index.md` and the changelog.
|
Thanks for the thorough review @omarespejel. All three blocking concerns plus the two smaller follow-ups landed in 826707f:
Follow-ups also addressed:
|
The new worldforge.runs_prune module ships RunsRetentionPolicy, PruneCandidate, PruneReport, plan_prune, apply_prune, and parse_runs_retention. `worldforge runs prune` walks `<workspace>/runs/` read-only by default and reports which run workspaces fall outside the requested retention; --apply actually removes them. Policy fields: --max-age-days (default 30), --keep-latest (default 10), and repeatable --family <kind> (filter to manifest kinds like eval, benchmark, flow). A 24-hour safety window blocks deletion of fresh runs unless --max-age-days=0 is passed explicitly. The command refuses to operate on paths outside <workspace>/runs/. A non-secret config profile can carry a runs_retention block (read via --retention-profile <path>); explicit CLI flags still override profile values. Run-index docs gained a retention/cleanup section comparing `runs cleanup` (blanket keep-N trim) against `runs prune` (typed policy with dry-run default). Artifact schemas table now lists RUNS_PRUNE_SCHEMA_VERSION. Closes AbdelStark#282 (WF-FEAT3-003).
Address review feedback on AbdelStark#297: - Replace the lenient substring check in `_ensure_inside_runs` with a resolved-path `is_relative_to(runs_root)` comparison so crafted report paths and symlinks pointing outside `<workspace>/runs/` are rejected before `shutil.rmtree`. - Wrap `shutil.rmtree` failures in `WorldForgeError` so CLI callers see the stable error envelope rather than a traceback. - Scope `keep_latest` to the family filter: only entries that match the current `--family` selection (or all entries when no filter is set) populate the keep window, so a newer non-matching run can no longer consume the slot reserved for the filtered family. - Switch the `--max-age-days` and `--keep-latest` argparse defaults to `None` and decide profile precedence from the parsed value rather than from a substring search on `sys.argv`. Profile values now correctly yield to either the `--flag value` or `--flag=value` form. - Validate retention profile shapes in `parse_runs_retention`: non-int `max_age_days` / `keep_latest` (including `bool`) and non-string family entries raise `WorldForgeError` instead of bubbling up `AttributeError` / `ValueError`. - Document the new keep_latest-within-family rule, the `--flag=value` precedence guarantee, and the resolved-path delete check in `docs/src/run-index.md` and the changelog.
826707f to
6d6d8d6
Compare
Summary
worldforge.runs_pruneandworldforge runs prune. Implements WF-FEAT3-003 (WF-FEAT3-003: Add run artifact retention policy with worldforge runs prune #282).RunsRetentionPolicy,PruneCandidate,PruneReport,plan_prune,apply_prune,parse_runs_retention,RUNS_PRUNE_SCHEMA_VERSION(all re-exported on the top-level package).--applyactually deletes.--max-age-days(default 30),--keep-latest(default 10), repeatable--family <kind>filter,--retention-profile <path>readsruns_retentionfrom a non-secret config profile (CLI flags override).--max-age-days=0is passed explicitly; refuses to operate on paths outside<workspace>/runs/; corrupted directories (missing manifest) only become delete candidates when also outside the--keep-latestwindow.Acceptance criteria (from #282)
runs_retentionparsed viaparse_runs_retention--applyis required to deletedocs/src/run-index.mdretention section now contrastsruns cleanupvsruns prune)Validation
uv run pytest tests/test_runs_prune.py tests/test_run_index.py tests/test_provider_config.py— 53 passeduv run pytest— 1262 passed, 85 skippeduv lock --check,uv run ruff check,uv run ruff format --check,uv run mkdocs build --strict— cleanbash scripts/test_package.sh— 1261 passedworldforge runs prune --workspace-dir .worldforge --format markdown→ dry-run report, no deletion--applyand--max-age-days=0 --keep-latest=0on a tmp workspace, the older run directory is removed and the newer kept-latest survivesOut of scope (per #282)
Companion to
🤖 Generated with Claude Code