fix: sort cycle logs by mtime to prevent false health check alarms #93

Open
warlockee wants to merge 12 commits into anadim:main from warlockee:fix/fsm-health-check-log-sort

Conversation

@warlockee

Summary

  • Bug: FSM health check guards (role_unhealthy, research_stalled) sorted cycle_*.log files alphabetically. Once cycle numbers pass 999 and gain a fourth digit, cycle_999.log sorts after cycle_2053.log lexicographically ("9" > "2"), so a stale log is picked as the latest and bug_fixer is falsely triggered with "no activity for 45min" alarms.
  • Evidence: The alphabetical sort picks cycle_999.log (53min old) while the actual latest log is cycle_2057.log (0.1min old). The FSM bug_fixer history shows 10+ false positive triggers from this bug.
  • Fix: Created a project-level override of orze_guards.py in fsm/plugins/ that sorts log files by modification time (st_mtime) instead of filename. The FSM runner plugin loading already supports project-level overrides by filename.
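
The difference between the two orderings can be reproduced in isolation. The following is a minimal sketch, not the code in orze_guards.py: the helper names latest_by_name and latest_by_mtime are hypothetical, and the timestamps are set by hand so the result does not depend on filesystem timing.

```python
import os
import tempfile
from pathlib import Path


def latest_by_name(log_dir: Path) -> Path:
    """Old behaviour (hypothetical helper): last file after a plain string sort."""
    return sorted(log_dir.glob("cycle_*.log"), key=lambda p: p.name)[-1]


def latest_by_mtime(log_dir: Path) -> Path:
    """New behaviour (hypothetical helper): most recently modified file."""
    return max(log_dir.glob("cycle_*.log"), key=lambda p: p.stat().st_mtime)


with tempfile.TemporaryDirectory() as tmp:
    log_dir = Path(tmp)
    stale = log_dir / "cycle_999.log"
    fresh = log_dir / "cycle_2057.log"
    stale.write_text("old cycle\n")
    fresh.write_text("new cycle\n")

    # Assign explicit modification times: the 999 log is older than the 2057 log.
    os.utime(stale, (1_000_000, 1_000_000))
    os.utime(fresh, (2_000_000, 2_000_000))

    by_name = latest_by_name(log_dir).name    # string sort: "9" > "2"
    by_mtime = latest_by_mtime(log_dir).name  # newest st_mtime wins

print(by_name, by_mtime)
```

Sorting by st_mtime also sidesteps any assumption about how the cycle number is encoded in the filename, at the cost of depending on file timestamps being trustworthy.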

Test plan

  • Verified alphabetical sort picks wrong file (cycle_999.log, 53min stale)
  • Verified mtime sort picks correct file (cycle_2057.log, 0.1min old)
  • Ran python3 fsm/runner.py --results-dir results - all 7 FSMs step clean, bug_fixer stays in healthy state
  • Plugin syntax verified with ast.parse()
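
For reference, a syntax-only check with ast.parse() can look like the sketch below. The function name syntax_ok and the sample sources are illustrative assumptions; the actual command used for this PR is not shown above. Note that ast.parse() only confirms the file parses; it does not import the module or exercise the guards.

```python
import ast


def syntax_ok(source: str, filename: str = "<plugin>") -> bool:
    """Return True if the source parses as valid Python, False otherwise."""
    try:
        ast.parse(source, filename=filename)
    except SyntaxError:
        return False
    return True


# Illustrative inputs only; these are not the real guard implementations.
good_source = "def role_unhealthy(ctx):\n    return False\n"
bad_source = "def role_unhealthy(ctx)\n    return False\n"  # missing colon

good_result = syntax_ok(good_source)
bad_result = syntax_ok(bad_source)
print(good_result, bad_result)
```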

Generated with Claude Code

erik and others added 12 commits April 2, 2026 04:56
Core scripts for training tiny transformers on 10-digit addition:
train.py (orze-compatible), multiphase.py (5-phase pipeline),
autopilot.py (automated param reduction), compile_adder.py (ES/GD),
circular_adder.py (standalone 3-param adder). Includes 49p/58p/62p
submissions, orze submodules, and research agent configuration.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix train.py config loading: SQLite path (results/idea_lake.db),
  hard-fail on missing config instead of silent fallback
- Add code evolution features to train.py: cosine warm restarts (SGDR),
  stochastic weight averaging, gradient noise injection, label smoothing,
  per-digit loss weighting, EMA of model weights
- Add generic FSM engine (fsm/) with YAML-defined procedures, pluggable
  guards/actions, pause registry for multi-FSM coordination, and JSONL
  activity logging for full traceability
- Add 7 FSM procedures: evolution loop, quality gate, meta-research,
  bug fixer escalation, research stall detection, idea verifier, and
  activity logger
- Add The Professor role (Opus): LLM agent that reviews queued ideas
  with full research context before they consume GPU time
- Add code evolution rules (CODE_EVOLUTION_RULES.md) for Claude to
  modify train.py and generate ideas exercising new code paths
- Configure orze.yaml: FSM as sole authority for pause/trigger dispatch,
  orze retrospection detection-only, GPU scheduling tuned for tiny models
- Update orze-pro submodule to latest

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split FSM components by tier:
- orze (basic): fsm/engine.py, fsm/runner.py — generic state machine
  infrastructure with tiered plugin/procedure discovery
- orze-pro: procedures/*.yaml, fsm/plugins/*.py, prompts/*.md — the
  orchestration intelligence, guards, actions, and agent prompts

The runner discovers pro-tier files from the orze-pro package
(installed or submodule) and allows project-level overrides.
Basic tier gets an empty state machine; pro tier adds the procedures
that make it useful.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
orze 3.1.0 (basic):
- New: FSM engine (fsm/engine.py) — YAML-defined state machines
- New: Tiered runner (fsm/runner.py) — discovers pro plugins/procedures
- Generic infrastructure, no intelligence built in

orze-pro 0.2.0 (pro):
- New: 7 FSM procedures (evolution, quality_gate, meta_research,
  bug_fixer, research_stall, idea_verifier, activity_log)
- New: 14 guards + 10 actions + pause registry
- New: The Professor — Opus-powered idea reviewer agent
- New: Code Evolution — Claude modifies train.py on plateau
- New: Idea verifier — two-tier filtering (rules + LLM)
- New: JSONL activity log for full traceability

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug fixer agent (Claude Opus) auto-detects broken roles, diagnoses
root cause, applies fix, verifies, and creates a PR — all autonomously.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added Steps 9a-9e for pro users:
- Detect orze-pro, activate license
- Set up API keys (auto-detect key type from prefix)
- Configure FSM engine
- Add all pro roles to orze.yaml (professor, code_evolution,
  meta_research, bug_fixer, fsm) with paths resolved from
  installed orze-pro package
- Set retrospection to detection-only (FSM owns control)
- Verify with orze --check

A pro user running ORZE-AGENT.md now gets the full stack:
research agent + professor + code evolution + bug fixer + 7 FSMs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
train.py: code evolution added focal loss, carry bias schedule,
and majority-vote inference. All backward-compatible (disabled by default).

Submodules synced to latest main (orze 3.1.1, orze-pro 0.2.4).
Activation server deployed and verified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 9c: use thin wrapper that imports from installed orze package
Step 9d: emphasize resolving absolute paths from installed orze-pro,
not relative submodule paths. Prevents broken rules_file references
for pip-install users who don't have git submodules.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The FSM health check (role_unhealthy, research_stalled guards) sorted
cycle_*.log files alphabetically. Once cycle numbers pass 999 and gain a
fourth digit, cycle_999.log sorts after cycle_2053.log lexicographically
("9" > "2"), causing the health check to pick a stale log as the "latest"
and falsely report roles as inactive for 45+ minutes.

This project-level override of orze_guards.py sorts log files by
modification time instead, correctly identifying the most recent log
regardless of how many digits the cycle number has.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>