feat(harness): FileSnapshotStore takes top-level dir, auto-gitignore#123

Merged: minpeter merged 1 commit into main from feat/file-snapshot-auto-gitignore on Apr 27, 2026

@minpeter (Owner) commented Apr 24, 2026

Summary

  • Reshape FileSnapshotStore so it owns its on-disk layout: pass a top-level directory (.plugsuits, .minimal-agent, ...) instead of a pre-resolved /sessions path. The store manages <root>/sessions/*.jsonl itself and exposes rootDir / sessionsDir getters for consumers that want to co-locate related files (e.g. session memory).
  • Auto-gitignore: when the root dir lives inside a git worktree, the store appends the top-level dir to that worktree's .gitignore if not already listed. Opt out with { autoGitignore: false }.
  • Migrate all in-repo consumers (cea, minimal-agent, tgbot) to the new contract. No backward compatibility is kept for the legacy session-file path or the old env var names.
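The "append if not already listed" behavior in the second bullet can be sketched as a pure string function. This is a minimal sketch, not the harness code: `appendIgnoreEntry` and its normalization rule (treating `dir`, `/dir`, and `dir/` as equivalent) are assumptions made for illustration.

```typescript
// Hypothetical helper, not the harness implementation: returns the new
// .gitignore content with `entry` appended, or the input unchanged when an
// equivalent entry is already listed.
function appendIgnoreEntry(content: string, entry: string): string {
  // Assumption: leading and trailing slashes do not change which dir is meant.
  const normalize = (line: string): string =>
    line.trim().replace(/^\/+/, "").replace(/\/+$/, "");

  const target = normalize(entry);
  const alreadyListed = content
    .split(/\r?\n/)
    .some((line) => !line.trim().startsWith("#") && normalize(line) === target);
  if (alreadyListed) return content;

  // Reuse the file's existing line-ending convention; default to LF.
  const eol = content.includes("\r\n") ? "\r\n" : "\n";
  const needsSeparator = content.length > 0 && !content.endsWith("\n");
  return content + (needsSeparator ? eol : "") + entry + eol;
}
```

Keeping the function pure makes the idempotency and line-ending rules easy to test without touching the filesystem.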

Motivation

Every consumer was reimplementing the same few lines: resolve a root dir, mkdirSync(.../sessions), pass .../sessions into the store, and then separately remember to add the state dir to .gitignore. The last step was usually forgotten, and agent state directories leaked into commits.

This PR pushes both responsibilities into the store:

  1. Convention over configuration — consumers stop thinking about the /sessions subpath. They hand over a top-level dir; the store lays out its files.
  2. Safe-by-default persistence — state dirs are auto-ignored on the way in, so .plugsuits/, .minimal-agent/, <tmpdir>/tgbot/ can't be committed by accident.

Changes

FileSnapshotStore (core)

  • Constructor signature: new FileSnapshotStore(rootDir, options?). rootDir is the top-level dir; sessions live at <rootDir>/sessions/*.jsonl.
  • Public getters: rootDir, sessionsDir (resolved absolute paths).
  • New options type: FileSnapshotStoreOptions { autoGitignore?: boolean } (defaults to true).
  • Removed the undocumented getFilePath fallback for unencoded session filenames. Files are always encoded via encodeSessionId.
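To make the new contract concrete, here is a rough sketch of the path layout described above. It is not the actual `FileSnapshotStore`: the class name `LayoutSketch`, the `filePathFor` method, and the `encodeSessionIdSketch` stand-in (the real `encodeSessionId` encoding is not shown in this PR description) are all hypothetical.

```typescript
import * as path from "node:path";

// Hypothetical stand-in for the harness's encodeSessionId.
const encodeSessionIdSketch = (id: string): string => encodeURIComponent(id);

interface LayoutSketchOptions {
  autoGitignore?: boolean; // defaults to true, mirroring the described option
}

class LayoutSketch {
  readonly rootDir: string;
  readonly sessionsDir: string;
  readonly autoGitignore: boolean;

  constructor(rootDir: string, options: LayoutSketchOptions = {}) {
    // Resolve once so a later process.chdir() cannot move the store.
    this.rootDir = path.resolve(rootDir);
    this.sessionsDir = path.join(this.rootDir, "sessions");
    this.autoGitignore = options.autoGitignore ?? true;
  }

  // Session files always use the encoded id, with no unencoded fallback.
  filePathFor(sessionId: string): string {
    return path.join(this.sessionsDir, `${encodeSessionIdSketch(sessionId)}.jsonl`);
  }
}
```

A consumer that wants to co-locate files would read `sessionsDir` from the instance rather than rebuilding the path itself.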

gitignore-sync (new module)

Covered by packages/harness/src/gitignore-sync.ts + tests. Exported from both the package root and @ai-sdk-tool/harness/sessions:

  • ensureDirIgnoredByGit, ensureGitignoreEntry, findNearestGitignore, gitignoreEntryForDir.

Design constraints:

| Risk | Mitigation |
| --- | --- |
| Concurrent writers corrupt .gitignore | .gitignore.lock via openSync(path, "wx") serializes writers |
| Crashed writer wedges the next caller | Stale locks older than 30s are reclaimed |
| Partial write leaves .gitignore truncated | Temp-file + rename swap (atomic on same fs) |
| Mixes LF/CRLF with existing file | Detects and preserves the file's line-ending convention |
| Accidentally modifies parent repo's or home-level .gitignore | Refuses to touch any ancestor .gitignore that is not at a verified worktree root (sibling .git marker) |
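The first three mitigations can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in gitignore-sync.ts: `tryAcquireLock`, `writeAtomically`, and the use of the lock file's mtime to judge staleness are hypothetical choices.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

const STALE_LOCK_MS = 30_000; // the 30s reclaim window from the table

// Try to take an exclusive lock; returns false if a live writer holds it.
function tryAcquireLock(lockPath: string, staleMs = STALE_LOCK_MS): boolean {
  for (let attempt = 0; attempt < 2; attempt++) {
    try {
      // "wx" fails with EEXIST when the file exists, so creation is exclusive.
      fs.closeSync(fs.openSync(lockPath, "wx"));
      return true;
    } catch (error) {
      if ((error as { code?: string }).code !== "EEXIST") throw error;
    }
    try {
      // Assumption: staleness is judged by the lock file's mtime.
      const ageMs = Date.now() - fs.statSync(lockPath).mtimeMs;
      if (ageMs <= staleMs) return false; // a live writer holds the lock
      fs.unlinkSync(lockPath); // stale: reclaim, then retry the create
    } catch {
      // The lock disappeared between calls; retry the exclusive create.
    }
  }
  return false;
}

// Write to a temp file in the same directory, then rename over the target.
function writeAtomically(targetPath: string, content: string): void {
  const tempPath = `${targetPath}.${process.pid}.tmp`;
  fs.writeFileSync(tempPath, content);
  fs.renameSync(tempPath, targetPath);
}

// Usage: exercise both helpers in a scratch directory.
const lockScratchDir = fs.mkdtempSync(path.join(os.tmpdir(), "gitignore-sketch-"));
const lockScratchPath = path.join(lockScratchDir, ".gitignore.lock");
if (tryAcquireLock(lockScratchPath)) {
  try {
    writeAtomically(path.join(lockScratchDir, ".gitignore"), ".plugsuits/\n");
  } finally {
    fs.unlinkSync(lockScratchPath); // always release the lock
  }
}
```

A real implementation would also loop with a short sleep until an acquire timeout, and would need to consider two processes reclaiming the same stale lock at once; both are left out here for brevity.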

Consumer migration (no backward compat)

| Package | Before | After |
| --- | --- | --- |
| cea | hardcoded .plugsuits/sessions + explicit mkdirSync + manual session-memory path | new FileSnapshotStore(".plugsuits"); session-memory path derived from store.sessionsDir |
| minimal-agent | SESSION_DIR (default .minimal-agent/sessions) | MINIMAL_AGENT_DIR (default .minimal-agent) |
| tgbot | SESSION_DIR (default <tmpdir>/tgbot-sessions) | TGBOT_DIR (default <tmpdir>/tgbot) |
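For the two env-driven consumers, the resolution could look something like the sketch below. The helper names are hypothetical, and reading `<tmpdir>` as `os.tmpdir()` is an assumption; the PR description does not say how the placeholder is resolved.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type EnvSketch = Record<string, string | undefined>;

// Hypothetical helper: the legacy SESSION_DIR is deliberately not consulted,
// matching the "no backward compat" stance of this PR.
function resolveMinimalAgentDir(env: EnvSketch): string {
  return env["MINIMAL_AGENT_DIR"] || ".minimal-agent";
}

// Hypothetical helper; assumes "<tmpdir>" means os.tmpdir().
function resolveTgbotDir(env: EnvSketch): string {
  return env["TGBOT_DIR"] || path.join(os.tmpdir(), "tgbot");
}
```

Either result would then be handed to the store as its top-level directory, with the sessions subpath left to the store.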

Tests

  • Existing FileSnapshotStore suite passes { autoGitignore: false } to stay hermetic.
  • New cases: rootDir / sessionsDir getter exposure, <root>/sessions/ layout, auto-gitignore inside a fake worktree, and the skip path when the root is outside any worktree.
  • Standalone gitignore-sync.test.ts covers concurrency (via a worker mjs), stale-lock reclaim, LF/CRLF preservation, and the worktree-root guard.
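The worktree-root guard and the fake-worktree setup mentioned in these tests might look roughly like this. It is a simplified sketch: `findWorktreeRoot` and `gitignoreToEdit` are hypothetical names, and the sketch always targets the `.gitignore` at the worktree root instead of reproducing the exported `findNearestGitignore` logic.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Walk up from startDir and return the first directory that has a .git entry.
// The marker is a directory in a normal clone and a file in a linked worktree
// or submodule, so existence is all that is checked here.
function findWorktreeRoot(startDir: string): string | null {
  let current = path.resolve(startDir);
  while (true) {
    if (fs.existsSync(path.join(current, ".git"))) return current;
    const parent = path.dirname(current);
    if (parent === current) return null; // reached the filesystem root
    current = parent;
  }
}

// Only a .gitignore that sits next to a .git marker is considered editable.
function gitignoreToEdit(startDir: string): string | null {
  const root = findWorktreeRoot(startDir);
  return root === null ? null : path.join(root, ".gitignore");
}

// Usage: a fake worktree in a scratch directory, as a hermetic test might build.
const fakeRepoDir = fs.mkdtempSync(path.join(os.tmpdir(), "worktree-sketch-"));
fs.mkdirSync(path.join(fakeRepoDir, ".git"));
const fakeStateDir = path.join(fakeRepoDir, "pkg", ".plugsuits");
fs.mkdirSync(fakeStateDir, { recursive: true });
const fakeTarget = gitignoreToEdit(fakeStateDir);
```

When no ancestor carries a `.git` marker the helper returns `null`, which corresponds to the skip path for roots outside any worktree.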

Docs

  • AGENTS.md — updated FileSnapshotStore example and auto-gitignore semantics.
  • packages/harness/README.md — sample path updated.
  • packages/minimal-agent/README.md — documents the new MINIMAL_AGENT_DIR.

Changeset

patch bump for @ai-sdk-tool/harness, @plugsuits/minimal-agent, @plugsuits/tgbot, plugsuits.

Verification

  • pnpm run typecheck — all 6 packages green (full turbo cache).
  • pnpm --filter @ai-sdk-tool/harness test — 727/727 passing (47 files).

Migration notes for downstream consumers

If you were constructing FileSnapshotStore directly:

- new FileSnapshotStore(".plugsuits/sessions")
+ new FileSnapshotStore(".plugsuits")

If you relied on SESSION_DIR:

  • minimal-agent: set MINIMAL_AGENT_DIR instead.
  • tgbot: set TGBOT_DIR instead.

To disable the auto-gitignore behavior (e.g. in tests or non-git environments):

new FileSnapshotStore(dir, { autoGitignore: false })

Summary by cubic

FileSnapshotStore now owns its on-disk layout: pass a top-level state dir and it writes snapshots to <root>/sessions/*.jsonl. It resolves the root dir at construction (stable if process.cwd() changes) and best-effort appends the dir to the worktree .gitignore (safe and atomic), which you can disable.

  • New Features

    • @ai-sdk-tool/harness: new FileSnapshotStore(rootDir, { autoGitignore = true }); resolves rootDir on construct, exposes rootDir and sessionsDir, always uses encodeSessionId; exports SESSIONS_SUBDIR.
    • Auto-gitignore anchored to the target dir’s worktree (not cwd); concurrency-safe lock + atomic writes, preserves LF/CRLF, verifies worktree root, supports .git file/dir, never blocks initialization (best-effort).
    • Utilities exported: ensureDirIgnoredByGit, ensureGitignoreEntry, findNearestGitignore, gitignoreEntryForDir.
  • Migration

    • Replace new FileSnapshotStore("<dir>/sessions") with new FileSnapshotStore("<dir>"); use store.sessionsDir for co-located files.
    • @plugsuits/minimal-agent: SESSION_DIR → MINIMAL_AGENT_DIR (default .minimal-agent).
    • @plugsuits/tgbot: SESSION_DIR → TGBOT_DIR (default <tmpdir>/tgbot).
    • To disable auto-gitignore: new FileSnapshotStore(dir, { autoGitignore: false }).

Written for commit b1dd640.

coderabbitai Bot commented Apr 24, 2026

Important

Review skipped

Auto reviews are disabled on this repository. To trigger a review, include @crb review in the PR description. Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

@gemini-code-assist gemini-code-assist Bot (Contributor) left a comment

Code Review

This pull request refactors the FileSnapshotStore to manage its own internal directory layout, moving session snapshots into a sessions subdirectory under a specified root. It introduces a new gitignore-sync utility that automatically and atomically appends the storage directory to the nearest .gitignore file when a git worktree is detected. Additionally, environment variables for minimal-agent and tgbot have been migrated to support this new structure. Feedback was provided regarding the sleepSync implementation in the gitignore utility, which currently uses a busy-wait loop that blocks the Node.js event loop and consumes excessive CPU cycles.

Comment on lines +89 to +95
function sleepSync(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Busy-wait is acceptable for sub-100ms lock contention. The lock window
    // is bounded by a single small file rename, so spin time is negligible.
  }
}

high

The sleepSync function uses a busy-wait loop, which consumes 100% CPU on the thread while waiting. In Node.js, this blocks the event loop and is highly inefficient. Since this utility is used during session initialization (including in long-running processes like tgbot), this can lead to significant performance degradation and resource exhaustion under lock contention. A better approach for a synchronous sleep in Node.js is to use Atomics.wait with a SharedArrayBuffer, which suspends the thread without burning CPU cycles.

function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}
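As a quick sanity check of this suggestion, the sketch below wraps the same call and returns its result. The name `parkFor` is hypothetical. Note that `Atomics.wait` still blocks the calling thread, so the event loop is paused for the duration; the difference from the busy-wait is only that the thread sleeps instead of spinning. It works on the Node.js main thread but throws on a browser main thread.

```typescript
// Hypothetical variant of the suggested sleepSync that exposes the wait result.
function parkFor(ms: number): string {
  const cell = new Int32Array(new SharedArrayBuffer(4));
  // cell[0] is 0 and nothing ever notifies it, so the call parks the thread
  // until the timeout elapses.
  return Atomics.wait(cell, 0, 0, ms);
}

const parkStartedAt = Date.now();
const parkResult = parkFor(20);
const parkElapsedMs = Date.now() - parkStartedAt;
```

Because no other thread holds a reference to the buffer, the wait can only end by timing out, which is what makes it usable as a plain synchronous sleep.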

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 16 files


Comment on lines +89 to +94
function sleepSync(ms: number): void {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // Busy-wait is acceptable for sub-100ms lock contention. The lock window
    // is bounded by a single small file rename, so spin time is negligible.
  }

@cubic-dev-ai cubic-dev-ai Bot Apr 24, 2026

P2: sleepSync uses a busy-wait loop that burns 100% CPU for the sleep duration. Under lock contention the acquire loop can spin for up to 5 seconds (LOCK_ACQUIRE_TIMEOUT_MS). Use Atomics.wait instead, which suspends the thread without consuming CPU cycles:

function sleepSync(ms: number): void {
  Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, ms);
}

FileSnapshotStore now accepts a top-level state directory, resolves it at construction time, owns the <root>/sessions layout, and best-effort appends that root to the containing worktree's .gitignore. Consumers no longer precompute sessions directories or maintain ignore entries independently.

This keeps session storage stable even if process.cwd() changes after store construction, and it aligns the shipped release notes with the public rootDir/sessionsDir getter contract.

Constraint: Preserve #124's edge-safe runtime import changes while rebasing onto current main

Rejected: Leave relative rootDir values as-is | contradicted the resolved-path contract and made later cwd changes affect save/load paths

Confidence: high

Scope-risk: moderate

Directive: Do not reintroduce caller-managed /sessions paths without updating downstream env names and migration notes

Tested: pnpm run typecheck; pnpm run test; pnpm run build; pnpm run check

Not-tested: Live model-backed minimal-agent session
@minpeter minpeter force-pushed the feat/file-snapshot-auto-gitignore branch from b81ba25 to b1dd640 Compare April 27, 2026 16:36
@minpeter minpeter merged commit a6b8a5f into main Apr 27, 2026
6 checks passed
@minpeter minpeter deleted the feat/file-snapshot-auto-gitignore branch April 27, 2026 16:48