Make incremental embedding generation content-hash based and project-safe#56

Merged
cosmocoder merged 2 commits into main from fix-incremental-embeddings
Mar 8, 2026
@cosmocoder
Owner

Summary

  • make incremental embedding generation rely on content_hash instead of mtime, so unchanged files are skipped correctly even in CI environments
  • distinguish full scans from partial --files runs, and prune stale file/document embeddings only when a full scan makes that safe
  • harden project isolation and retrieval by scoping project-structure embeddings to project_path, validating file existence before using stored context, and refreshing project summaries when embedding inventory changes
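
As a rough illustration of the first point, here is a minimal Python sketch of content-hash based freshness checking. It is not the code from this PR: the function names, the `stored_hashes` mapping, and the choice of SHA-256 are all assumptions made for the example. The idea is only that a file is re-embedded when its bytes change, regardless of what its mtime says.

```python
import hashlib
from pathlib import Path


def content_hash(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes (algorithm is an assumption)."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def files_needing_embedding(paths, stored_hashes):
    """Return the paths whose current hash differs from the stored one.

    `stored_hashes` is a hypothetical mapping of path string -> content_hash
    recorded when the embedding was last generated. Files with no stored
    hash are treated as stale.
    """
    stale = []
    for path in paths:
        if stored_hashes.get(str(path)) != content_hash(path):
            stale.append(path)
    return stale
```

A fresh CI checkout typically resets every file's mtime, so an mtime comparison would flag the whole repository as changed. Comparing hashes avoids that, at the cost of reading each file once per run.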

…roject-safe

- use content hashes as the source of truth for incremental freshness instead of relying on mtimes
- distinguish full scans from partial file updates so stale embeddings are only pruned when safe
- prune deleted file and document embeddings during full scans to keep review context current
- scope project structure embeddings and retrieval by project path to avoid cross-project collisions
- harden retrieval and summary invalidation so code review uses up-to-date project context
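
The pruning rule described in the commit message can be sketched roughly as follows. This is a hypothetical Python illustration, not the PR's implementation: `EmbeddingRecord`, `prune_stale`, and the `full_scan` flag are names invented for the example. It assumes each stored embedding records the project it belongs to and the file it was generated from.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingRecord:
    """Hypothetical stored embedding row, keyed by project and file."""
    project_path: str
    file_path: str


def prune_stale(records, project_path, scanned_files, full_scan):
    """Return the records to keep after an embedding run.

    On a partial run (for example, one limited by --files) the scanned set
    says nothing about files outside it, so nothing is pruned. On a full
    scan, records for this project whose file was not seen are dropped.
    Records belonging to other projects are never touched.
    """
    if not full_scan:
        return list(records)
    scanned = set(scanned_files)
    return [
        r for r in records
        if r.project_path != project_path or r.file_path in scanned
    ]
```

Filtering on `project_path` before comparing file paths is what keeps two projects with identically named files from pruning each other's embeddings.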
@cosmocoder cosmocoder merged commit a883bf4 into main Mar 8, 2026
14 checks passed
@cosmocoder cosmocoder deleted the fix-incremental-embeddings branch March 8, 2026 20:20
@github-actions
github-actions bot commented Mar 8, 2026

🎉 This PR is included in version 1.2.4 🎉

The release is available on:

Your semantic-release bot 📦🚀
