
fix: reduce memory overflow from checkpoint reads and writes (#344)#571

Open
svarlamov wants to merge 6 commits into main from devin/1771643297-memory-overflow-fixes

Conversation


@svarlamov svarlamov commented Feb 21, 2026

fix: reduce memory overflow from checkpoint reads and writes (#344)

Summary

Addresses the runaway memory usage (30-60GB) reported in #344 by fixing the highest-impact patterns in checkpoint I/O:

1. Streaming prune-and-rewrite on append (repo_storage.rs): append_checkpoint previously read ALL checkpoints into memory, appended one, pruned, then wrote ALL back, costing O(N) memory per append. Now it streams existing checkpoints one at a time through a BufReader, prunes char-level attributions for files superseded by the new checkpoint, writes to a temp file via BufWriter, appends the new checkpoint, then atomically renames. Peak memory is one checkpoint plus the new checkpoint, rather than all checkpoints. The file stays small throughout long agent loops because pruning happens on every append.

2. Eliminate redundant full reads (checkpoint.rs): A single checkpoint::run() call previously triggered 4+ independent read_all_checkpoints() deserializations of the entire JSONL file. Now checkpoints are read once at the top of run() and passed through to get_all_tracked_files via a new preloaded_checkpoints parameter.

3. Streaming reads (repo_storage.rs): read_all_checkpoints now uses BufReader line-by-line instead of fs::read_to_string, avoiding holding the full file string and parsed structs in memory simultaneously.

4. BufWriter for writes (repo_storage.rs): write_all_checkpoints now streams serialization through BufWriter instead of building a full string in memory. An explicit flush() call ensures write errors are propagated rather than silently dropped on BufWriter::drop.
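
The streaming prune-and-rewrite pattern from item 1 can be sketched as follows. This is a minimal stdlib-only model, assuming one checkpoint per JSONL line; `prune` is a hypothetical stand-in for "clear char-level attributions for files superseded by the new checkpoint", and `append_checkpoint_streaming` is an illustrative name, not the real function:

```rust
use std::fs::{self, File};
use std::io::{self, BufRead, BufReader, BufWriter, Write};
use std::path::Path;

/// Stream existing checkpoints through a prune transform into a temp file,
/// append the new checkpoint, then atomically rename over the original.
/// Peak memory is one line plus the new checkpoint.
fn append_checkpoint_streaming(
    path: &Path,
    new_checkpoint: &str,
    prune: impl Fn(String) -> String,
) -> io::Result<()> {
    let tmp = path.with_extension("jsonl.tmp");
    let mut writer = BufWriter::new(File::create(&tmp)?);

    // Read existing checkpoints one line at a time; only the current
    // line is ever held in memory.
    if path.exists() {
        for line in BufReader::new(File::open(path)?).lines() {
            writeln!(writer, "{}", prune(line?))?;
        }
    }

    // Append the new checkpoint, flush so I/O errors surface (BufWriter's
    // Drop ignores them), then atomically replace the original file.
    writeln!(writer, "{}", new_checkpoint)?;
    writer.flush()?;
    fs::rename(&tmp, path)?;
    Ok(())
}
```

The rename at the end means readers see either the old file or the fully written new one, never a partial write.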
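
The preloaded-checkpoints threading from item 2 can be sketched like this, under a simplified model; the `Checkpoint` shape, `read_all_checkpoints_stub`, and the function body are assumptions for illustration, not the real types:

```rust
struct Checkpoint {
    files: Vec<String>,
}

// Stands in for the real streaming JSONL read from disk.
fn read_all_checkpoints_stub() -> Vec<Checkpoint> {
    vec![Checkpoint { files: vec!["src/lib.rs".into()] }]
}

/// Use the caller-supplied checkpoints when available; fall back to a
/// disk read only when none are passed, so existing callers still work.
fn get_all_tracked_files(preloaded: Option<&[Checkpoint]>) -> Vec<String> {
    let owned;
    let checkpoints: &[Checkpoint] = match preloaded {
        Some(cps) => cps,
        None => {
            owned = read_all_checkpoints_stub();
            &owned
        }
    };
    let mut files: Vec<String> = checkpoints
        .iter()
        .flat_map(|cp| cp.files.iter().cloned())
        .collect();
    files.sort();
    files.dedup();
    files
}
```

With this shape, checkpoint::run() deserializes the JSONL once and every downstream helper reuses the same slice.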

All 38 checkpoint-related unit tests pass (31 checkpoint tests + 7 repo_storage tests). No new dependencies added.

Updates since last revision

Major rework: The initial approach deferred char-level attribution pruning to write_all_checkpoints, but this left un-pruned data in the file during the intra-commit loop where memory issues are worst. The new approach prunes on every append_checkpoint call using a streaming read-modify-write pattern that keeps only one checkpoint in memory at a time. The prune_old_char_attributions method was removed entirely (logic inlined into append_checkpoint).

Reset ordering fix (ef4a4d7): Fixed a data loss bug where reset=true + is_pre_commit=true would destroy the working log before the early-exit check, causing the check to always pass (empty checkpoints → no AI edits) and skip checkpoint creation. Now checkpoints are always read first for the early-exit check, then reset happens afterward if needed.
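
A minimal sketch of the corrected ordering, with all I/O stubbed (the `WorkingLog` type, `run_checkpoint` name, and return values are illustrative only):

```rust
struct WorkingLog {
    checkpoints: Vec<&'static str>,
}

/// Read and inspect existing checkpoints *before* any reset destroys them,
/// so the early-exit check sees the pre-reset data.
fn run_checkpoint(log: &mut WorkingLog, reset: bool, is_pre_commit: bool) -> Option<&'static str> {
    // 1. Always evaluate the early-exit check against the old data.
    let has_no_ai_edits = log.checkpoints.is_empty();
    if is_pre_commit && has_no_ai_edits {
        return None; // genuinely nothing to checkpoint
    }
    // 2. Only now is it safe to clear state.
    if reset {
        log.checkpoints.clear();
    }
    Some("checkpoint created")
}
```

In the buggy ordering, step 2 ran first, so the check in step 1 always saw an empty log and skipped checkpoint creation.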

Other changes:

  • write_all_checkpoints signature reverted to &[Checkpoint] (no longer needs &mut since pruning moved to append)
  • Added explicit writer.flush()?; in both append_checkpoint and write_all_checkpoints to ensure I/O errors are propagated

Review & Testing Checklist for Human

  • Reset + pre_commit interaction: When reset=true and is_pre_commit=true, verify that the early-exit check correctly inspects the old data before reset, and that the subsequent reset properly clears both disk state and in-memory checkpoints. Test with a repo that has AI checkpoints, run checkpoint --reset during pre-commit, and verify a fresh checkpoint is created (not skipped).
  • Streaming prune correctness: The new append_checkpoint assumes the checkpoint being appended is the newest for its files (clears attributions from older entries with matching files). This should always be true since we append chronologically, but verify no code path appends out-of-order or writes un-pruned checkpoints via write_all_checkpoints directly.
  • Real-world validation: Test with a repo that has a large checkpoint file (>100MB) and multiple agent sessions. Verify memory usage stays reasonable during git commit and that attributions are correctly preserved end-to-end. Unit tests validate correctness but not the memory improvement.
  • Temp file cleanup: append_checkpoint writes to checkpoints.jsonl.tmp then renames. If the process crashes mid-write, the temp file is left behind (harmless but clutters .git/ai/). Consider if cleanup is needed.

Notes

  • get_all_tracked_files gained an optional preloaded_checkpoints parameter. Existing callers that don't pass it will still work (reads from disk as before).
  • No changes to checkpoint format or serialization — purely I/O optimization.
  • The streaming prune approach means write_all_checkpoints no longer prunes. If any code path writes un-pruned data via write_all_checkpoints, it won't be pruned until the next append_checkpoint.
  • The has_no_ai_edits logic was rewritten from all_ai_touched_files().is_empty() to checkpoints.iter().all(|cp| cp.entries.is_empty() || cp.kind != AiAgent/AiTab). These are logically equivalent but worth a careful trace through both code paths.
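
One way to trace the claimed equivalence is to compare both formulations on a toy model; the enum, struct, and function names below are assumptions for illustration, not the real types:

```rust
#[derive(Clone, Copy)]
enum CheckpointKind { AiAgent, AiTab, Human }

struct Ckpt {
    kind: CheckpointKind,
    entries: Vec<&'static str>,
}

fn is_ai(kind: CheckpointKind) -> bool {
    matches!(kind, CheckpointKind::AiAgent | CheckpointKind::AiTab)
}

// Old form: gather the files touched by AI checkpoints, test emptiness.
fn has_no_ai_edits_old(cps: &[Ckpt]) -> bool {
    cps.iter()
        .filter(|cp| is_ai(cp.kind))
        .flat_map(|cp| cp.entries.iter())
        .next()
        .is_none()
}

// New form: every checkpoint is either empty or not AI-authored.
fn has_no_ai_edits_new(cps: &[Ckpt]) -> bool {
    cps.iter().all(|cp| cp.entries.is_empty() || !is_ai(cp.kind))
}
```

Both return true exactly when no AI checkpoint has a nonempty entry list, which is the equivalence the note asserts.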

Link to Devin run: https://app.devin.ai/sessions/2a46b6eaa71f4f46913488bef2ff52a1
Requested by: @svarlamov



- Make append_checkpoint truly append-only (O(1) instead of O(N) read-write-all)
- Use BufReader for streaming JSONL reads instead of read_to_string
- Eliminate 3 redundant read_all_checkpoints() calls in checkpoint::run()
- Pass pre-loaded checkpoints to get_all_tracked_files
- Defer char-level attribution pruning to write_all_checkpoints
- Use BufWriter for efficient checkpoint serialization

Addresses #344

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
@devin-ai-integration
Contributor

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


@git-ai-cloud-dev

No AI authorship found for these commits. Please install git-ai to start tracking AI generated code in your commits.

@CLAassistant

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

devin-ai-integration[bot]

This comment was marked as resolved.

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
@svarlamov
Member Author

Devin, review Devin Review's feedback

devin-ai-integration bot and others added 3 commits February 21, 2026 05:10
…ent data loss

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…during agent loops

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
…_checkpoint)

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
devin-ai-integration[bot]

This comment was marked as resolved.

@svarlamov
Member Author

Devin address the feedback

…t=true and is_pre_commit=true

Co-Authored-By: Sasha Varlamov <sasha@sashavarlamov.com>
@git-ai-cloud

git-ai-cloud bot commented Feb 22, 2026

No AI authorship found for these commits. Please install git-ai to start tracking AI generated code in your commits.

@devin-ai-integration
Contributor

Addressed the Devin Review feedback in ef4a4d7:

Reset reordering fix: Moved reset_working_log() to after the early-exit check so existing checkpoint data is inspected first. When reset=true, we still read all checkpoints upfront (needed for has_no_ai_edits), then clear them after the early exit has been evaluated. This prevents the data loss scenario where reset=true + is_pre_commit=true would destroy the working log before the check could see existing AI edits.
