Implement checkpoint diff blob cleanup on trim #22

@rororowyourboat

Context

The system design audit found that when checkpoint records are trimmed (500-per-thread cap), the associated diff blobs in checkpoint_diff_blobs are orphaned and never cleaned up.

Current State

  • Checkpoints capped at 500 per thread in projections
  • checkpoint_diff_blobs table stores cached unified diffs
  • When the oldest checkpoints are trimmed from the projection, their diff blobs remain in the database
  • Git refs (commit SHAs) are small and should be retained for forensic value

Design Decision (from spec)

Q-DL-2 resolved: Delete cached diff blobs on trim; retain git refs. Diffs can be recomputed on-demand from git refs if ever needed.

Proposed Changes

  1. When the projector trims a checkpoint to enforce the 500-per-thread cap, delete the corresponding row(s) from checkpoint_diff_blobs
  2. Retain the git ref (checkpointRef) in the event store (events are never deleted per INV-1)
  3. Add a fallback to the diff query: if the blob is missing, recompute the diff from git refs on demand
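
A minimal sketch of change 1, assuming an in-memory stand-in for checkpoint_diff_blobs and a checkpoint list ordered oldest-first. Apart from checkpointRef and the 500 cap, every name here (`InMemoryDiffBlobStore`, `trimCheckpoints`, the `threadId:checkpointRef` key) is hypothetical and not taken from the codebase; the real table may key blobs differently.

```typescript
// Sketch only: names below are hypothetical, not the project's actual API.
const MAX_CHECKPOINTS_PER_THREAD = 500;

interface Checkpoint {
  checkpointRef: string; // git ref; stays in the event store regardless of trimming
}

// In-memory stand-in for the checkpoint_diff_blobs table.
class InMemoryDiffBlobStore {
  private readonly blobs = new Map<string, string>();

  // Assumed key shape; the real table may be keyed differently.
  private key(threadId: string, checkpointRef: string): string {
    return `${threadId}:${checkpointRef}`;
  }

  put(threadId: string, checkpointRef: string, diff: string): void {
    this.blobs.set(this.key(threadId, checkpointRef), diff);
  }

  has(threadId: string, checkpointRef: string): boolean {
    return this.blobs.has(this.key(threadId, checkpointRef));
  }

  // Deletes blobs for the given checkpoints; returns how many rows existed.
  deleteForCheckpoints(threadId: string, checkpointRefs: string[]): number {
    let deleted = 0;
    for (const ref of checkpointRefs) {
      if (this.blobs.delete(this.key(threadId, ref))) deleted += 1;
    }
    return deleted;
  }

  get size(): number {
    return this.blobs.size;
  }
}

// Drops the oldest checkpoints beyond the cap and deletes their diff blobs.
// Assumes `checkpoints` belongs to one thread and is ordered oldest-first.
function trimCheckpoints(
  threadId: string,
  checkpoints: Checkpoint[],
  blobStore: InMemoryDiffBlobStore,
  cap: number = MAX_CHECKPOINTS_PER_THREAD,
): { kept: Checkpoint[]; deletedBlobs: number } {
  const excess = Math.max(0, checkpoints.length - cap);
  if (excess === 0) {
    return { kept: checkpoints, deletedBlobs: 0 };
  }
  const trimmed = checkpoints.slice(0, excess);
  const deletedBlobs = blobStore.deleteForCheckpoints(
    threadId,
    trimmed.map((c) => c.checkpointRef),
  );
  return { kept: checkpoints.slice(excess), deletedBlobs };
}
```

In a real database the projection trim and the blob delete would likely belong in the same transaction, so a crash between the two cannot leave orphaned rows behind; that is a design suggestion, not something the spec states.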

Acceptance Criteria

  • Diff blobs are deleted when their parent checkpoint is trimmed
  • Git refs remain available in the event stream
  • Diff viewer gracefully handles missing blobs (recomputes or shows "diff unavailable")
  • Storage does not grow unboundedly for long-lived threads
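
The "gracefully handles missing blobs" criterion could be met with a lookup along these lines. This is a sketch under assumptions: `loadDiff`, `DiffResult`, the `from..to` cache key, and the injected `recompute` callback are all hypothetical, and the callback stands in for whatever git invocation the project actually uses.

```typescript
// Sketch only: all names are hypothetical, not the project's actual API.
type DiffResult =
  | { kind: "cached"; diff: string }
  | { kind: "recomputed"; diff: string }
  | { kind: "unavailable"; reason: string };

// Stand-in for a git-backed diff function; expected to throw if a ref
// can no longer be resolved.
type RecomputeDiff = (fromRef: string, toRef: string) => string;

function loadDiff(
  fromRef: string,
  toRef: string,
  blobCache: Map<string, string>, // stand-in for checkpoint_diff_blobs
  recompute: RecomputeDiff,
): DiffResult {
  const cached = blobCache.get(`${fromRef}..${toRef}`);
  if (cached !== undefined) {
    return { kind: "cached", diff: cached };
  }
  try {
    // Blob was trimmed (or never cached): fall back to the git refs.
    return { kind: "recomputed", diff: recompute(fromRef, toRef) };
  } catch (err) {
    // Recompute failed: surface "diff unavailable" instead of an error.
    const reason = err instanceof Error ? err.message : String(err);
    return { kind: "unavailable", reason };
  }
}
```

The sketch deliberately does not write the recomputed diff back into the cache, since re-caching diffs for trimmed checkpoints would reintroduce the storage growth this issue is meant to stop.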

References

  • System Design Spec: .plans/21-system-design-spec.md Section 11 (Data Lifecycle)
  • Audit finding: Q-DL-2
