feat(harness): FileSnapshotStore takes top-level dir, auto-gitignore#123
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. To trigger a review, include ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request refactors the FileSnapshotStore to manage its own internal directory layout, moving session snapshots into a sessions subdirectory under a specified root. It introduces a new gitignore-sync utility that automatically and atomically appends the storage directory to the nearest .gitignore file when a git worktree is detected. Additionally, environment variables for minimal-agent and tgbot have been migrated to support this new structure. Feedback was provided regarding the sleepSync implementation in the gitignore utility, which currently uses a busy-wait loop that blocks the Node.js event loop and consumes excessive CPU cycles.
| function sleepSync(ms: number): void { | ||
| const end = Date.now() + ms; | ||
| while (Date.now() < end) { | ||
| // Busy-wait is acceptable for sub-100ms lock contention. The lock window | ||
| // is bounded by a single small file rename, so spin time is negligible. | ||
| } | ||
| } |
There was a problem hiding this comment.
The sleepSync function uses a busy-wait loop, which consumes 100% CPU on the thread while waiting. In Node.js, this blocks the event loop and is highly inefficient. Since this utility is used during session initialization (including in long-running processes like tgbot), this can lead to significant performance degradation and resource exhaustion under lock contention. A better approach for a synchronous sleep in Node.js is to use Atomics.wait with a SharedArrayBuffer, which suspends the thread without burning CPU cycles.
function sleepSync(ms: number): void {
Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}There was a problem hiding this comment.
1 issue found across 16 files
Prompt for AI agents (unresolved issues)
Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.
<file name="packages/harness/src/gitignore-sync.ts">
<violation number="1" location="packages/harness/src/gitignore-sync.ts:89">
P2: `sleepSync` uses a busy-wait loop that burns 100% CPU for the sleep duration. Under lock contention the acquire loop can spin for up to 5 seconds (`LOCK_ACQUIRE_TIMEOUT_MS`). Use `Atomics.wait` instead, which suspends the thread without consuming CPU cycles:
```ts
function sleepSync(ms: number): void {
Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}
```</violation>
</file>
Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.
| function sleepSync(ms: number): void { | ||
| const end = Date.now() + ms; | ||
| while (Date.now() < end) { | ||
| // Busy-wait is acceptable for sub-100ms lock contention. The lock window | ||
| // is bounded by a single small file rename, so spin time is negligible. | ||
| } |
There was a problem hiding this comment.
P2: sleepSync uses a busy-wait loop that burns 100% CPU for the sleep duration. Under lock contention the acquire loop can spin for up to 5 seconds (LOCK_ACQUIRE_TIMEOUT_MS). Use Atomics.wait instead, which suspends the thread without consuming CPU cycles:
function sleepSync(ms: number): void {
Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At packages/harness/src/gitignore-sync.ts, line 89:
<comment>`sleepSync` uses a busy-wait loop that burns 100% CPU for the sleep duration. Under lock contention the acquire loop can spin for up to 5 seconds (`LOCK_ACQUIRE_TIMEOUT_MS`). Use `Atomics.wait` instead, which suspends the thread without consuming CPU cycles:
```ts
function sleepSync(ms: number): void {
Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}
```</comment>
<file context>
@@ -0,0 +1,268 @@
+ return normalizeEntry(trimmed) === normalizedEntry;
+}
+
+function sleepSync(ms: number): void {
+ const end = Date.now() + ms;
+ while (Date.now() < end) {
</file context>
| function sleepSync(ms: number): void { | |
| const end = Date.now() + ms; | |
| while (Date.now() < end) { | |
| // Busy-wait is acceptable for sub-100ms lock contention. The lock window | |
| // is bounded by a single small file rename, so spin time is negligible. | |
| } | |
| function sleepSync(ms: number): void { | |
| Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms); | |
| } |
FileSnapshotStore now accepts a top-level state directory, resolves it at construction time, owns the <root>/sessions layout, and best-effort appends that root to the containing worktree's .gitignore. Consumers no longer precompute sessions directories or maintain ignore entries independently. This keeps session storage stable even if process.cwd() changes after store construction, and it aligns the shipped release notes with the public rootDir/sessionsDir getter contract. Constraint: Preserve #124's edge-safe runtime import changes while rebasing onto current main Rejected: Leave relative rootDir values as-is | contradicted the resolved-path contract and made later cwd changes affect save/load paths Confidence: high Scope-risk: moderate Directive: Do not reintroduce caller-managed /sessions paths without updating downstream env names and migration notes Tested: pnpm run typecheck; pnpm run test; pnpm run build; pnpm run check Not-tested: Live model-backed minimal-agent session
b81ba25 to
b1dd640
Compare
Summary
FileSnapshotStoreso it owns its on-disk layout: pass a top-level directory (.plugsuits,.minimal-agent, ...) instead of a pre-resolved/sessionspath. The store manages<root>/sessions/*.jsonlitself and exposesrootDir/sessionsDirgetters for consumers that want to co-locate related files (e.g. session memory)..gitignoreif not already listed. Opt out with{ autoGitignore: false }.cea,minimal-agent,tgbot) to the new contract. No backward compatibility is kept for the legacy session-file path or the old env var names.Motivation
Every consumer was reimplementing the same few lines: resolve a root dir,
mkdirSync(.../sessions), pass.../sessionsinto the store, and then separately remember to add the state dir to.gitignore. The last step was usually forgotten, and agent state directories leaked into commits.This PR pushes both responsibilities into the store:
/sessionssubpath. They hand over a top-level dir; the store lays out its files..plugsuits/,.minimal-agent/,<tmpdir>/tgbot/can't be committed by accident.Changes
FileSnapshotStore(core)new FileSnapshotStore(rootDir, options?).rootDiris the top-level dir; sessions live at<rootDir>/sessions/*.jsonl.rootDir,sessionsDir(resolved absolute paths).FileSnapshotStoreOptions { autoGitignore?: boolean }(defaults totrue).getFilePathfallback for unencoded session filenames. Files are always encoded viaencodeSessionId.gitignore-sync(new module)Covered by
packages/harness/src/gitignore-sync.ts+ tests. Exported from both the package root and@ai-sdk-tool/harness/sessions:ensureDirIgnoredByGit,ensureGitignoreEntry,findNearestGitignore,gitignoreEntryForDir.Design constraints:
.gitignore.gitignore.lockviaopenSync(path, \"wx\")serializes writers.gitignoretruncatedrenameswap (atomic on same fs).gitignore.gitignorethat is not at a verified worktree root (sibling.gitmarker)Consumer migration (no backward compat)
cea.plugsuits/sessions+ explicitmkdirSync+ manual session-memory pathnew FileSnapshotStore(\".plugsuits\"); session-memory path derived fromstore.sessionsDirminimal-agentSESSION_DIR(default.minimal-agent/sessions)MINIMAL_AGENT_DIR(default.minimal-agent)tgbotSESSION_DIR(default<tmpdir>/tgbot-sessions)TGBOT_DIR(default<tmpdir>/tgbot)Tests
FileSnapshotStoresuite passes{ autoGitignore: false }to stay hermetic.rootDir/sessionsDirgetter exposure,<root>/sessions/layout, auto-gitignore inside a fake worktree, and the skip path when the root is outside any worktree.gitignore-sync.test.tscovers concurrency (via a worker mjs), stale-lock reclaim, LF/CRLF preservation, and the worktree-root guard.Docs
AGENTS.md— updatedFileSnapshotStoreexample and auto-gitignore semantics.packages/harness/README.md— sample path updated.packages/minimal-agent/README.md— documents the newMINIMAL_AGENT_DIR.Changeset
patchbump for@ai-sdk-tool/harness,@plugsuits/minimal-agent,@plugsuits/tgbot,plugsuits.Verification
pnpm run typecheck— all 6 packages green (full turbo cache).pnpm --filter @ai-sdk-tool/harness test— 727/727 passing (47 files).Migration notes for downstream consumers
If you were constructing
FileSnapshotStoredirectly:If you relied on
SESSION_DIR:minimal-agent: setMINIMAL_AGENT_DIRinstead.tgbot: setTGBOT_DIRinstead.To disable the auto-gitignore behavior (e.g. in tests or non-git environments):
Summary by cubic
FileSnapshotStorenow owns its on-disk layout: pass a top-level state dir and it writes snapshots to<root>/sessions/*.jsonl. It resolves the root dir at construction (stable ifprocess.cwd()changes) and best-effort appends the dir to the worktree.gitignore(safe and atomic), which you can disable.New Features
@ai-sdk-tool/harness:new FileSnapshotStore(rootDir, { autoGitignore = true }); resolvesrootDiron construct, exposesrootDirandsessionsDir, always usesencodeSessionId; exportsSESSIONS_SUBDIR.cwd); concurrency-safe lock + atomic writes, preserves LF/CRLF, verifies worktree root, supports.gitfile/dir, never blocks initialization (best-effort).ensureDirIgnoredByGit,ensureGitignoreEntry,findNearestGitignore,gitignoreEntryForDir.Migration
new FileSnapshotStore("<dir>/sessions")withnew FileSnapshotStore("<dir>"); usestore.sessionsDirfor co-located files.@plugsuits/minimal-agent:SESSION_DIR→MINIMAL_AGENT_DIR(default.minimal-agent).@plugsuits/tgbot:SESSION_DIR→TGBOT_DIR(default<tmpdir>/tgbot).new FileSnapshotStore(dir, { autoGitignore: false }).Written for commit b1dd640. Summary will update on new commits. Review in cubic