V2 reliability and performance hardening by z23cc · Pull Request #3 · z23cc/flow-code

z23cc · 2026-04-03T06:40:48Z

Summary

Second optimization round based on deep audit of v0.1.18. Focuses on business logic reliability, performance hot paths, and review system robustness.

P0 — Critical reliability

cmd_done write ordering: Runtime state now written FIRST (authoritative), then spec, then receipt — prevents phantom incomplete tasks on crash
Workflow command tests: 43 new pytest tests for cmd_start, cmd_done, cmd_ready, cmd_next, cmd_restart, cmd_block
get_repo_root() cached: Module-level memoization eliminates 10+ subprocess calls per command
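
The cmd_done write ordering above can be sketched roughly as follows. This is a minimal illustration, not the flowctl code: the directory layout (`runtime/`, `specs/`, `receipts/`) and the helpers `atomic_write_json` and `mark_done` are hypothetical names chosen for the example.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path, data):
    """Write JSON to a temp file beside the target, then rename it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    os.replace(tmp, path)


def mark_done(state_dir, task_id, summary):
    """Persist completion in the order: runtime state, spec, receipt."""
    state_dir = Path(state_dir)
    written = []

    # 1. Runtime state first: it is the authoritative record of task status,
    #    so a crash after this point still leaves the task marked done.
    atomic_write_json(state_dir / "runtime" / f"{task_id}.json", {"status": "done"})
    written.append("runtime")

    # 2. Spec second (hypothetical layout: one markdown file per task).
    spec = state_dir / "specs" / f"{task_id}.md"
    spec.parent.mkdir(parents=True, exist_ok=True)
    with spec.open("a", encoding="utf-8") as fh:
        fh.write(f"\n## Done summary\n{summary}\n")
    written.append("spec")

    # 3. Receipt last: losing it is recoverable, losing the status is not.
    atomic_write_json(
        state_dir / "receipts" / f"{task_id}.json",
        {"task": task_id, "summary": summary},
    )
    written.append("receipt")
    return written
```

With this ordering, a crash between steps can leave a done task without a receipt, but never a receipt for a task that still looks incomplete.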

P1 — High priority fixes

ralph-guard TOCTOU: lock-check (read-only) replaced with lock --task (acquire-or-fail) — eliminates check-then-act race
Backward-compat writes removed: cmd_start/done/reset/restart no longer write status to git-tracked definition JSON
Codex review fixes: Timeout configurable via FLOW_CODEX_TIMEOUT, resume fallback logs warning, files_embedded respects sandbox mode
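
To illustrate why acquire-or-fail removes the check-then-act race in the ralph-guard item above, here is a small sketch. It is not how `lock --task` is implemented in flowctl; the lock-file layout and the `try_acquire` helper are assumptions made for the example.

```python
import os
from pathlib import Path


def try_acquire(lock_dir, task_id, owner):
    """Atomically create the lock file; return False if someone else holds it."""
    lock_dir = Path(lock_dir)
    lock_dir.mkdir(parents=True, exist_ok=True)
    lock_path = lock_dir / f"{task_id}.lock"
    try:
        # O_CREAT | O_EXCL fuses "does it exist?" and "create it" into one
        # atomic step, so no other process can slip in between the two.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.write(owner)
    return True
```

A read-only check followed by a separate write would let two workers both see "unlocked" and both proceed; here the second caller simply gets `False`.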

P2 — Performance + DX

Batch git grep: gather_context_hints uses an alternation pattern (500 subprocess calls → ~10)
find_dependents O(N²) → O(N): Loads tasks once, traverses in-memory
cmd_next batch loading: Uses load_all_tasks_with_state() instead of per-file reads
rp.py --json consistency: All 15 error paths now respect args.json
Guard hardening: Command regex uses normalized input, state stored in state-dir, debug log rotates at 1MB
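
The batched git grep idea above can be sketched as grouping symbols into a few alternation patterns instead of issuing one search per symbol. The helper name `batch_patterns` and the batch size of 50 are assumptions for illustration; the actual grouping in gather_context_hints may differ.

```python
import re


def batch_patterns(symbols, batch_size=50):
    """Group symbols into regex alternations, one pattern per batch."""
    unique = sorted(set(symbols))  # de-duplicate and keep output deterministic
    patterns = []
    for i in range(0, len(unique), batch_size):
        chunk = unique[i:i + batch_size]
        # Escape each symbol so regex metacharacters are matched literally.
        patterns.append("|".join(re.escape(s) for s in chunk))
    return patterns
```

Each resulting pattern could then be passed to a single `git grep -E -e <pattern>` call, so 500 symbols at this batch size would need 10 subprocess calls.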

P3 — Cross-platform + prompts

Windows flock: Uses msvcrt.locking() instead of no-op
Adversarial prompt validation: Warns on unconsumed {{placeholders}}
phases.md fix: codex exec → flowctl codex exec (matches guard allowlist)
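
The placeholder validation above might look something like the sketch below. The `render` helper and the exact warning text are hypothetical; only the idea of warning on leftover `{{...}}` markers after substitution comes from the PR description.

```python
import re
import sys

# Matches {{name}} style markers that survived substitution.
PLACEHOLDER_RE = re.compile(r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}")


def render(template, values):
    """Substitute known placeholders and warn about any that remain."""
    out = template
    for key, value in values.items():
        out = out.replace("{{" + key + "}}", str(value))
    leftover = sorted(set(PLACEHOLDER_RE.findall(out)))
    for name in leftover:
        print("WARNING: unconsumed placeholder {{" + name + "}}", file=sys.stderr)
    return out, leftover
```

Returning the leftover names alongside the rendered text lets a caller decide whether a warning is enough or the prompt should be rejected.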

Test plan

python3 -m pytest scripts/flowctl/tests/ -v — 255 passed (43 new)
bash scripts/smoke_test.sh — 90/90 passed
All Python files compile cleanly
Teams mode used for all parallel waves

🤖 Generated with Claude Code

…e files_embedded

- Replace hardcoded timeout=600 with FLOW_CODEX_TIMEOUT env var (default 600s)
- Print WARNING to stderr when codex resume falls back to new session
- Force files_embedded=True in plan-review when sandbox prevents disk reads

Task: fn-6-v2-reliability-and-performance-hardening.5
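
Reading the timeout from FLOW_CODEX_TIMEOUT with a 600s default could be sketched as below. The `codex_timeout` helper and its fallback on invalid or non-positive values are assumptions; the commit only states the variable name and the default.

```python
import os

DEFAULT_TIMEOUT = 600  # seconds, the default named in the commit message


def codex_timeout(env=None):
    """Return the timeout in seconds, falling back to the default."""
    env = os.environ if env is None else env
    raw = env.get("FLOW_CODEX_TIMEOUT", "")
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unset or unparsable values silently use the default.
        return DEFAULT_TIMEOUT
    return value if value > 0 else DEFAULT_TIMEOUT
```

The result would typically be passed as the `timeout=` argument of `subprocess.run`.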

- compat.py: use msvcrt.locking() on Windows instead of no-op flock, with warning fallback for unknown platforms
- adversarial.py: warn on unconsumed {{...}} placeholders after template substitution
- phases.md: replace raw `codex exec` with `flowctl codex exec` to match guard allowlist

Write runtime state before spec/receipt in cmd_done so a mid-write crash still marks the task as done (runtime is authoritative). Cache get_repo_root() per cwd to avoid repeated subprocess calls, following the same pattern as get_state_dir().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Status is now managed exclusively by runtime state files. Definition JSON files no longer have status written to them in cmd_task_reset, cmd_restart, cmd_skip, and cmd_split.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- cmd_start: 11 tests (claim, deps, blocked, force, resume, invalid ID)
- cmd_done: 10 tests (evidence, spec update, duration, cross-actor, force)
- cmd_ready: 7 tests (unblocked, dep done, skipped-as-done, in_progress, blocked)
- cmd_next: 6 tests (ready task, resume, all-done, completion/plan review gates)
- cmd_restart: 6 tests (cascade, dry-run, force, skip-todo, invalid/missing)
- cmd_block: 3 tests (status, done-fails, spec update)

Task: fn-6-v2-reliability-and-performance-hardening.2

- Thread use_json parameter through all rp.py helpers (require_rp_cli, run_rp_cli, parse_windows, parse_builder_tab) and cmd_* functions
- All ~15 error_exit calls now use args.json instead of hardcoded False
- Add 1MB log rotation in ralph-guard _debug_log_path() (single-file .1)
- Bump RALPH_GUARD_VERSION to 0.15.0

Task: fn-6-v2-reliability-and-performance-hardening.7
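
The single-file log rotation mentioned above could work along these lines. This is a sketch under assumptions: `rotate_if_needed` is a hypothetical helper, and the real check lives inside `_debug_log_path()` in ralph-guard.

```python
import os
from pathlib import Path

MAX_LOG_BYTES = 1024 * 1024  # 1MB threshold from the commit message


def rotate_if_needed(log_path, max_bytes=MAX_LOG_BYTES):
    """Move an oversized log to '<name>.1'; return True if rotation happened."""
    log_path = Path(log_path)
    try:
        size = log_path.stat().st_size
    except FileNotFoundError:
        return False  # nothing to rotate yet
    if size < max_bytes:
        return False
    backup = log_path.with_name(log_path.name + ".1")
    # os.replace overwrites any older backup, so at most one '.1' file exists.
    os.replace(log_path, backup)
    return True
```

Keeping a single backup bounds disk usage at roughly twice the threshold while preserving the most recent history.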

- gather_context_hints: collect all symbols, batch into ≤5 git grep calls instead of N per-symbol subprocess calls
- find_dependents: load all task files once upfront, then BFS over in-memory dict instead of re-globbing per BFS level
- cmd_next: use load_all_tasks_with_state(epic_id) single-scan batch load instead of per-file load_task_with_state loop

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
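
The find_dependents change (load once, then BFS in memory) can be sketched as follows. The input shape, a mapping from task ID to its list of dependency IDs, is an assumption for the example; the real task records carry more fields.

```python
from collections import defaultdict, deque


def find_dependents(tasks, root_id):
    """Return all tasks that transitively depend on root_id, in BFS order."""
    # Build the reverse edges once: dependency -> tasks that depend on it.
    reverse = defaultdict(list)
    for task_id, deps in tasks.items():
        for dep in deps:
            reverse[dep].append(task_id)

    seen = set()
    order = []
    queue = deque([root_id])
    while queue:
        current = queue.popleft()
        for child in reverse.get(current, []):
            if child not in seen and child != root_id:
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order
```

Because the reverse index is built in one pass and each task is enqueued at most once, the traversal no longer rescans the task files at every BFS level.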

P0 fixes (state loss, root cause of 5 issues):
- get_flow_dir() now walks up directory tree (FLOW_STATE_DIR env → walk-up → CWD). Same pattern as git finding .git. Fixes: #1 state loss, #3 state not persistent, #5 worker parallel fail, #9 .flow symlink issues.
- flowctl recover --epic <id> [--dry-run]: rebuilds task completion status from git log. Fixes #11 no recovery after state loss.

P1 fixes (guard + review):
- Guard graceful fallback: missing tools → "skipped" (not "failed"). Only actual failures block pipeline. Fixes #8.
- Review-backend availability check: if rp-cli/codex not in PATH, auto-fallback to "none" with warning. Fixes #7.

P2 fixes (UX):
- Slug max length 40→20 chars. "Django+React platform with account management" → "fn-3-django-react-plat" not 40-char monster. Fixes #2 #12.
- Brainstorm auto-skip: trivial tasks (≤10 words, contains "fix"/"typo"/etc) skip brainstorm entirely. Fixes #6.
- --interactive flag: pause at key decisions. Fixes #10.

370 tests pass. Zero new dependencies.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
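
The get_flow_dir() lookup order described above (FLOW_STATE_DIR env → walk-up → CWD) might be sketched like this. The `start` and `env` parameters are hypothetical test hooks, and treating FLOW_STATE_DIR as the directory to return directly is an assumption; the commit only names the precedence.

```python
import os
from pathlib import Path


def get_flow_dir(start=None, env=None):
    """Resolve the .flow directory: env override, then walk up, then cwd."""
    env = os.environ if env is None else env
    override = env.get("FLOW_STATE_DIR")
    if override:
        return Path(override)

    start = (Path(start) if start is not None else Path.cwd()).resolve()
    # Check the starting directory, then each ancestor up to the root,
    # much like git searching upward for .git.
    for candidate in (start, *start.parents):
        flow = candidate / ".flow"
        if flow.is_dir():
            return flow

    # Nothing found: fall back to a .flow directory under the start directory.
    return start / ".flow"
```

Walking up means a command run from a nested subdirectory resolves the same state directory as one run from the project root.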

z23cc and others added 7 commits April 3, 2026 14:22

z23cc merged commit 4c67897 into main Apr 3, 2026

z23cc merged 7 commits into main from fn-6-v2-reliability-and-performance-hardening
