fix: persist pipeline state to disk and restore on server restart #1585

Merged
OneStepAt4time merged 1 commit into develop from fix/1424-pipeline-persist on Apr 10, 2026

@OneStepAt4time
Owner

Summary

  • Fixes [enhancement] E4-3: Pipeline state not persisted — server restart discards orchestrations #1424 — Pipeline state was persisted to pipelines.json but the implementation had three bugs that made it ineffective on server restart
  • Stale file leak: persistPipelines() returned early when no running pipelines remained, leaving a stale file that hydrate() would restore as completed/failed entries on next startup
  • Missing persist after pipeline completion: advancePipeline() could transition a pipeline to completed/failed without calling persistPipelines(), so the file still showed running status
  • Fragile constructor: stateDir was set via hydrate() instead of the constructor, creating an ordering dependency with startup
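
The first two fixes can be illustrated with a minimal sketch. This is not the code from src/pipeline.ts: `PipelineStore`, `start`, `finish`, and `demo` are hypothetical names, and the assumption that only running pipelines are written to pipelines.json is made for illustration. It shows the state file being deleted once no running pipelines remain, and a persist call on every terminal transition.

```typescript
import { existsSync } from "node:fs";
import { mkdir, mkdtemp, unlink, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

type PipelineStatus = "running" | "completed" | "failed";

interface PipelineRecord {
  id: string;
  status: PipelineStatus;
}

// Hypothetical stand-in for PipelineManager; only the persistence path is modeled.
class PipelineStore {
  private readonly stateDir: string | null;
  private readonly pipelines = new Map<string, PipelineRecord>();

  constructor(stateDir: string | null) {
    this.stateDir = stateDir;
  }

  async start(id: string): Promise<void> {
    this.pipelines.set(id, { id, status: "running" });
    await this.persist();
  }

  // Terminal transition: persist right away so the file never keeps a stale "running" entry.
  async finish(id: string, status: "completed" | "failed"): Promise<void> {
    const record = this.pipelines.get(id);
    if (record === undefined) return;
    record.status = status;
    await this.persist();
  }

  async persist(): Promise<void> {
    const dir = this.stateDir;
    if (dir === null) return; // persistence disabled
    const file = join(dir, "pipelines.json");

    // Assumption: only running pipelines are worth restoring after a restart.
    const running = Array.from(this.pipelines.values()).filter(
      (p) => p.status === "running",
    );

    if (running.length === 0) {
      // Delete the file instead of returning early and leaving stale state behind.
      try {
        await unlink(file);
      } catch (err) {
        // A file that is already gone is fine; anything else is a real error.
        if ((err as NodeJS.ErrnoException).code !== "ENOENT") throw err;
      }
      return;
    }

    await mkdir(dir, { recursive: true });
    await writeFile(file, JSON.stringify(running, null, 2), "utf8");
  }
}

// Usage: the file exists while a pipeline runs and is removed once it finishes.
async function demo(): Promise<{ whileRunning: boolean; afterFinish: boolean }> {
  const dir = await mkdtemp(join(tmpdir(), "pipeline-sketch-"));
  const file = join(dir, "pipelines.json");
  const store = new PipelineStore(dir);

  await store.start("p1");
  const whileRunning = existsSync(file);

  await store.finish("p1", "completed");
  const afterFinish = existsSync(file);

  return { whileRunning, afterFinish };
}
```

Treating ENOENT as success keeps the delete idempotent, so calling persist repeatedly with no running pipelines is harmless.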

Changes

| File | Change |
| --- | --- |
| src/pipeline.ts | Delete state file when no running pipelines remain; persist after advancePipeline transitions; use static unlink import |
| src/server.ts | Pass config.stateDir to PipelineManager constructor |
| src/__tests__/pipeline.test.ts | Add 10 persistence tests (persist, hydrate, orphan detection, corrupt/missing file, null dir, non-existent dir) |
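
As a rough illustration of the constructor change and the edge cases the tests list, the sketch below fixes stateDir at construction time and makes hydration tolerant of a missing file, a corrupt file, and a null directory. `PipelineManagerSketch`, `isPipelineRecord`, `demoHydrate`, and the array-of-records file layout are assumptions for this example, not the project's actual API. Orphan detection is not modeled.

```typescript
import { mkdtemp, readFile, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

interface PipelineRecord {
  id: string;
  status: "running" | "completed" | "failed";
}

// Runtime check so a malformed entry in the file is skipped rather than trusted.
function isPipelineRecord(value: unknown): value is PipelineRecord {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    (v.status === "running" || v.status === "completed" || v.status === "failed")
  );
}

// Hypothetical manager: stateDir comes from the constructor, not from hydrate().
class PipelineManagerSketch {
  private readonly stateDir: string | null;
  private readonly pipelines = new Map<string, PipelineRecord>();

  constructor(stateDir: string | null) {
    this.stateDir = stateDir;
  }

  // Returns how many pipelines were restored; a missing or corrupt file yields 0.
  async hydrate(): Promise<number> {
    if (this.stateDir === null) return 0; // persistence disabled

    let raw: string;
    try {
      raw = await readFile(join(this.stateDir, "pipelines.json"), "utf8");
    } catch {
      // Simplification: any read failure (missing file, non-existent directory)
      // is treated as "nothing to restore".
      return 0;
    }

    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      return 0; // corrupt file
    }
    if (!Array.isArray(parsed)) return 0;

    for (const entry of parsed) {
      if (isPipelineRecord(entry)) this.pipelines.set(entry.id, entry);
    }
    return this.pipelines.size;
  }
}

// Usage: each case uses a fresh manager, which stands in for a server restart.
async function demoHydrate(): Promise<number[]> {
  const dir = await mkdtemp(join(tmpdir(), "pipeline-hydrate-"));
  const file = join(dir, "pipelines.json");

  const missing = await new PipelineManagerSketch(dir).hydrate();

  await writeFile(file, "{not json", "utf8");
  const corrupt = await new PipelineManagerSketch(dir).hydrate();

  await writeFile(file, JSON.stringify([{ id: "p1", status: "running" }]), "utf8");
  const valid = await new PipelineManagerSketch(dir).hydrate();

  const disabled = await new PipelineManagerSketch(null).hydrate();

  return [missing, corrupt, valid, disabled];
}
```

Because the directory is known from construction, hydrate() can be called at any point during startup without first having to set stateDir.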

Test plan

  • npx tsc --noEmit — passes
  • npm run build — passes
  • npm test — 2635 passed, 0 failed
  • Manual: create pipeline → verify pipelines.json written → simulate restart → verify hydration restores state

Aegis version

Developed with: v0.3.2-alpha

Generated by Hephaestus (Aegis dev agent)

- Delete pipelines.json when no running pipelines remain (was leaking
  stale file, causing hydrate() to restore completed/failed entries)
- Persist after advancePipeline transitions pipeline to completed/failed
  (was only persisting at stage-level transitions)
- Pass stateDir to PipelineManager constructor instead of relying on
  hydrate() to set it (removes fragile ordering dependency)
- Replace dynamic import('node:fs/promises').unlink with static import
- Add 10 tests covering persist, hydrate, orphan detection, and edge cases

Closes #1424

Generated by Hephaestus (Aegis dev agent)
@OneStepAt4time
Owner Author

🔧 PR #1585 ready for review: fix(pipeline): persist pipeline state across restarts (#1424). CI CLEAN. Please review.

Contributor

@aegis-gh-agent aegis-gh-agent bot left a comment

LGTM. CI green. Approved by Argus.

@OneStepAt4time OneStepAt4time merged commit e6277d8 into develop Apr 10, 2026
9 checks passed
@OneStepAt4time OneStepAt4time deleted the fix/1424-pipeline-persist branch April 10, 2026 07:06
OneStepAt4time pushed a commit that referenced this pull request Apr 10, 2026
…1583)

Remove all documentation references to deleted features (PR #1583).

REMOVED FEATURES:
- Consensus Review endpoints (/v1/sessions/:id/consensus, /v1/consensus/:id)
- Model Router config and tiered routing

FILES CHANGED:
- README.md: Remove consensus/model router from features list
- docs/advanced.md: Remove Consensus Review section
- docs/api-reference.md: Remove consensus endpoints; remove consensus.completed SSE event
- docs/architecture.md: Remove consensus.ts and model-router.ts from module overview
- docs/getting-started.md: Update advanced features reference
- docs/enterprise.md: Remove modelRouter from config example
- docs/enterprise/01-architecture.md: Remove consensus/model-router; mark findings RESOLVED
- docs/enterprise/03-testing-observability.md: Remove model-router.ts from thin-tests list
- docs/enterprise/05-enterprise-roadmap.md: Remove E4-1 (consensus); update M-E4 scope
- docs/enterprise/index.md: Remove consensus reliability finding

RESOLVED IN v0.3.3:
- P-3 (PipelineManager persistence): PR #1585
- P-1 (Pipeline stage timeout): PR #1606
- CON-1 (Consensus): PR #1583 (feature removed)

NOTE: openapi.yaml still contains /consensus endpoints — separate update needed.
OneStepAt4time pushed a commit that referenced this pull request Apr 12, 2026
…1583)

OneStepAt4time added a commit that referenced this pull request Apr 12, 2026
…1583) (#1709)

Co-authored-by: Argus <argus@openclaw.ai>