fix: sort cycle logs by mtime to prevent false health check alarms #93
Open
warlockee wants to merge 12 commits into anadim:main
Core scripts for training tiny transformers on 10-digit addition: train.py (orze-compatible), multiphase.py (5-phase pipeline), autopilot.py (automated param reduction), compile_adder.py (ES/GD), circular_adder.py (standalone 3-param adder). Includes 49p/58p/62p submissions, orze submodules, and research agent configuration.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix train.py config loading: SQLite path (results/idea_lake.db), hard-fail on missing config instead of silent fallback
- Add code evolution features to train.py: cosine warm restarts (SGDR), stochastic weight averaging, gradient noise injection, label smoothing, per-digit loss weighting, EMA of model weights
- Add generic FSM engine (fsm/) with YAML-defined procedures, pluggable guards/actions, pause registry for multi-FSM coordination, and JSONL activity logging for full traceability
- Add 7 FSM procedures: evolution loop, quality gate, meta-research, bug fixer escalation, research stall detection, idea verifier, and activity logger
- Add The Professor role (Opus): LLM agent that reviews queued ideas with full research context before they consume GPU time
- Add code evolution rules (CODE_EVOLUTION_RULES.md) for Claude to modify train.py and generate ideas exercising new code paths
- Configure orze.yaml: FSM as sole authority for pause/trigger dispatch, orze retrospection detection-only, GPU scheduling tuned for tiny models
- Update orze-pro submodule to latest
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Split FSM components by tier:
- orze (basic): fsm/engine.py, fsm/runner.py (generic state machine infrastructure with tiered plugin/procedure discovery)
- orze-pro: procedures/*.yaml, fsm/plugins/*.py, prompts/*.md (the orchestration intelligence, guards, actions, and agent prompts)
The runner discovers pro-tier files from the orze-pro package (installed or submodule) and allows project-level overrides. Basic tier gets an empty state machine; pro tier adds the procedures that make it useful.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
orze 3.1.0 (basic):
- New: FSM engine (fsm/engine.py), YAML-defined state machines
- New: Tiered runner (fsm/runner.py), discovers pro plugins/procedures
- Generic infrastructure, no intelligence built in
orze-pro 0.2.0 (pro):
- New: 7 FSM procedures (evolution, quality_gate, meta_research, bug_fixer, research_stall, idea_verifier, activity_log)
- New: 14 guards + 10 actions + pause registry
- New: The Professor, an Opus-powered idea reviewer agent
- New: Code Evolution, in which Claude modifies train.py on plateau
- New: Idea verifier with two-tier filtering (rules + LLM)
- New: JSONL activity log for full traceability
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Bug fixer agent (Claude Opus) auto-detects broken roles, diagnoses root cause, applies fix, verifies, and creates a PR, all autonomously.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Added Steps 9a-9e for pro users:
- Detect orze-pro, activate license
- Set up API keys (auto-detect key type from prefix)
- Configure FSM engine
- Add all pro roles to orze.yaml (professor, code_evolution, meta_research, bug_fixer, fsm) with paths resolved from installed orze-pro package
- Set retrospection to detection-only (FSM owns control)
- Verify with orze --check
A pro user running ORZE-AGENT.md now gets the full stack: research agent + professor + code evolution + bug fixer + 7 FSMs.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
train.py: code evolution added focal loss, carry bias schedule, and majority-vote inference. All backward-compatible (disabled by default).
Submodules synced to latest main (orze 3.1.1, orze-pro 0.2.4). Activation server deployed and verified.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Step 9c: use thin wrapper that imports from installed orze package.
Step 9d: emphasize resolving absolute paths from installed orze-pro, not relative submodule paths. Prevents broken rules_file references for pip-install users who don't have git submodules.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The FSM health check (the role_unhealthy and research_stalled guards)
sorted cycle_*.log files alphabetically. Once cycle numbers pass 1000,
cycle_999.log sorts after cycle_2053.log lexicographically ("9" > "2"),
so the health check picked a stale log as the "latest" and falsely
reported roles as inactive for 45+ minutes.
This project-level override of orze_guards.py sorts log files by
modification time instead, so the most recent log is identified
correctly regardless of how many digits the cycle number has.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
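The difference between the two orderings can be reproduced with a minimal sketch. It is not the code in orze_guards.py: the function names `latest_log_by_name` and `latest_log_by_mtime` and the `demo` helper are hypothetical, and the mtimes are set artificially with `os.utime` so the result does not depend on file creation order.

```python
import os
import tempfile
from pathlib import Path


def latest_log_by_name(log_dir: Path) -> Path:
    """Old behaviour (illustrative): pick the last file in lexicographic order."""
    return sorted(log_dir.glob("cycle_*.log"))[-1]


def latest_log_by_mtime(log_dir: Path) -> Path:
    """Fixed behaviour (illustrative): pick the most recently modified file."""
    return max(log_dir.glob("cycle_*.log"), key=lambda p: p.stat().st_mtime)


def demo() -> tuple:
    """Create a stale cycle_999.log and a fresh cycle_2053.log, then compare."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp)
        stale = log_dir / "cycle_999.log"
        fresh = log_dir / "cycle_2053.log"
        stale.write_text("stale\n")
        fresh.write_text("fresh\n")
        # Force the modification times: cycle_999.log is the older file.
        os.utime(stale, (1_000_000, 1_000_000))
        os.utime(fresh, (2_000_000, 2_000_000))
        return (
            latest_log_by_name(log_dir).name,
            latest_log_by_mtime(log_dir).name,
        )


if __name__ == "__main__":
    by_name, by_mtime = demo()
    print("lexicographic pick:", by_name)
    print("mtime pick:", by_mtime)
```

Sorting by name picks cycle_999.log because "9" compares greater than "2", while sorting by mtime picks cycle_2053.log. Parsing the cycle number as an integer would be another option, but mtime is the approach this change describes.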