fix: call vector cleanup on all observation delete paths#32
Conversation
Root cause: DeleteObservation and DeleteObservations did not call cleanupFunc, leaving orphaned vectors in pgvector. Only processCleanup (project-based background cleanup) called cleanupFunc correctly. Result: 227K vectors for 186 observations — orphaned embeddings from dedup and MCP bulk_delete that never synced to vector store. Fix: Both delete methods now call s.cleanupFunc(ctx, ids) after successful deletion, matching the pattern in processCleanup.
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses a critical data consistency problem by ensuring that when observations are deleted, their associated vector embeddings are also properly removed from the vector store. Previously, manual deletion methods failed to trigger this cleanup, leading to a buildup of orphaned vector data. The fix integrates the necessary cleanup function calls directly into the deletion logic, preventing future data discrepancies and maintaining a clean vector store. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
ОбзорОбновлены: удаление наблюдений теперь вызывает функцию очистки после успешных GORM-удалений; загрузка конфигурации теперь предпочитает Изменения
Диаграмма последовательностиsequenceDiagram
participant Client as Клиент
participant Handler as handleVaultStatus
participant Service as Worker.s (s)
participant Vault as Vault (s.vault)
Клиент->>Handler: GET /vault/status
Handler->>Service: check keyConfigured
alt keyConfigured == true
alt s.vault != nil
Service->>Vault: Fingerprint(), KeySource()
Vault-->>Service: fingerprint, keySource
else s.vault == nil
Service-->>Handler: нет данных (не вызываем getVault)
end
else keyConfigured == false
Handler->>Service: s.getVault()
Service->>Vault: (getVault) инициализация/чтение
Vault-->>Service: vault metadata
end
Service-->>Handler: status payload
Handler-->>Клиент: 200 {fingerprint, keySource}
sequenceDiagram
participant Caller as Код, вызывающий удаление
participant Store as ObservationStore
participant DB as GORM/DB
participant Cleanup as cleanupFunc
Caller->>Store: DeleteObservations(ctx, ids)
Store->>DB: DELETE ... (GORM)
DB-->>Store: result (Error, RowsAffected)
alt result.Error == nil and RowsAffected > 0 and cleanupFunc != nil
Store->>Cleanup: cleanupFunc(ctx, ids)
Cleanup-->>Store: done
end
Store-->>Caller: возвращает nil (при успехе) или ошибку
Оценка объёма проверки кода🎯 3 (Moderate) | ⏱️ ~20 minutes Стихотворение
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
|
@coderabbitai review |
|
@gemini-code-assist review |
✅ Actions performedReview triggered.
|
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request addresses an issue where deleting observations did not trigger the corresponding vector cleanup, leading to orphaned embeddings. The fix correctly adds calls to cleanupFunc in DeleteObservations and DeleteObservation. The change in DeleteObservation is solid. For DeleteObservations, I've suggested an improvement to make the cleanup more precise and efficient by only passing the IDs of actually deleted observations to the cleanup function, which aligns better with existing patterns in the codebase.
There was a problem hiding this comment.
Code Review
This pull request correctly addresses an issue where deleting observations could leave orphaned vectors in the vector store. The changes in DeleteObservation and DeleteObservations now trigger the necessary cleanup. The implementation for DeleteObservation is sound. For DeleteObservations, I've suggested an improvement to enhance correctness and atomicity by using a transaction, ensuring the cleanup function is called only for observations that were verifiably deleted. This also prevents potential race conditions.
1. ENGRAM_VAULT_KEY is now the primary env var (ENGRAM_ENCRYPTION_KEY as alias) Docker template and UI hint both use ENGRAM_VAULT_KEY — code now matches. 2. handleVaultStatus no longer calls getVault(), preventing auto-generation of a new key on read-only status checks. Previously, Refresh on the Vault page would trigger NewVault() → auto-generate vault.key → switch from Disabled to Enabled as a side effect.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
internal/worker/handlers_vault.go (2)
319-319: Рассмотрите обновление backup_reminder для соответствия новому имени переменной.В
backup_reminderуказанENGRAM_ENCRYPTION_KEY, но вconfig.goтеперьENGRAM_VAULT_KEYявляется основным именем. Для согласованности документации:📝 Предлагаемое изменение
- "backup_reminder": "Back up vault.key (or set ENGRAM_ENCRYPTION_KEY) — losing this key makes stored credentials unrecoverable", + "backup_reminder": "Back up vault.key (or set ENGRAM_VAULT_KEY) — losing this key makes stored credentials unrecoverable",🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/worker/handlers_vault.go` at line 319, Update the backup_reminder message to reference the new environment variable name: replace the mention of ENGRAM_ENCRYPTION_KEY with ENGRAM_VAULT_KEY in the map entry keyed by "backup_reminder" so the displayed guidance matches the new canonical variable defined in config (ensure the string in the "backup_reminder" value now reads something like ENGRAM_VAULT_KEY).
299-305: Логика корректна, но возможно неинформативное состояние при ленивой инициализации.Изменение правильно избегает побочных эффектов в read-only эндпоинте. Однако при
keyConfigured == trueиs.vault == nil(до первого использования vault) ответ будет содержать:{ "key_configured": true, "fingerprint": "", "key_source": "" }Это может сбить с толку пользователей — ключ сконфигурирован, но fingerprint пустой. Рассмотрите возможность добавления поля
vault_initializedили возвратаnullвместо пустых строк для более явного разделения состояний.💡 Опциональное улучшение
writeJSON(w, map[string]any{ "key_configured": keyConfigured, + "vault_initialized": s.vault != nil, "key_source": keySource, "fingerprint": fingerprint, "credential_count": count, "backup_reminder": "Back up vault.key (or set ENGRAM_ENCRYPTION_KEY) — losing this key makes stored credentials unrecoverable", })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@internal/worker/handlers_vault.go` around lines 299 - 305, When keyConfigured is true but s.vault is nil the response currently shows empty fingerprint/key_source which is ambiguous; update the vault metadata response produced in the handler that uses keyConfigured, s.vault, fingerprint and keySource to include an explicit vault_initialized boolean (true when s.vault != nil) or return null for fingerprint/key_source when s.vault == nil instead of empty strings. Locate the block that currently sets fingerprint = s.vault.Fingerprint() / keySource = s.vault.KeySource() and change the response construction to set vault_initialized = (s.vault != nil) and set fingerprint/key_source to null (or omit them) when vault_initialized is false, ensuring JSON marshaling produces nulls (use pointer/string nil or proper struct tags) and preserve existing behavior when s.vault is present.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@internal/worker/handlers_vault.go`:
- Line 319: Update the backup_reminder message to reference the new environment
variable name: replace the mention of ENGRAM_ENCRYPTION_KEY with
ENGRAM_VAULT_KEY in the map entry keyed by "backup_reminder" so the displayed
guidance matches the new canonical variable defined in config (ensure the string
in the "backup_reminder" value now reads something like ENGRAM_VAULT_KEY).
- Around line 299-305: When keyConfigured is true but s.vault is nil the
response currently shows empty fingerprint/key_source which is ambiguous; update
the vault metadata response produced in the handler that uses keyConfigured,
s.vault, fingerprint and keySource to include an explicit vault_initialized
boolean (true when s.vault != nil) or return null for fingerprint/key_source
when s.vault == nil instead of empty strings. Locate the block that currently
sets fingerprint = s.vault.Fingerprint() / keySource = s.vault.KeySource() and
change the response construction to set vault_initialized = (s.vault != nil) and
set fingerprint/key_source to null (or omit them) when vault_initialized is
false, ensuring JSON marshaling produces nulls (use pointer/string nil or proper
struct tags) and preserve existing behavior when s.vault is present.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e2ce4e7c-5411-4262-a7ef-e1245948146c
📒 Files selected for processing (2)
internal/config/config.gointernal/worker/handlers_vault.go
🤖 PR Review MCP State (auto-managed, do not edit){
"version": 2,
"parentChildren": {},
"resolvedNitpicks": {
"coderabbit-nitpick-6d29c539-319": {
"resolvedAt": "2026-03-20T22:39:22.406Z",
"resolvedBy": "agent"
},
"coderabbit-nitpick-d043dd39-299": {
"resolvedAt": "2026-03-20T22:39:25.064Z",
"resolvedBy": "agent"
}
},
"updatedAt": "2026-03-20T22:39:25.530Z"
} |
Agents couldn't rate observations because the injected <relevant-memory> block showed "## 1. [DECISION] Title" with no ID. The MCP feedback tool requires feedback(action="rate", id=N) but N was never visible. Now renders as "## 1. [DECISION] Title (id:12345)" in both: - session-start.js (initial context injection) - user-prompt.js (per-prompt context injection) Closes #32. Plugin version bumped to 3.7.5.
Agents couldn't rate observations because the injected <relevant-memory> block showed "## 1. [DECISION] Title" with no ID. The MCP feedback tool requires feedback(action="rate", id=N) but N was never visible. Now renders as "## 1. [DECISION] Title (id:12345)" in both: - session-start.js (initial context injection) - user-prompt.js (per-prompt context injection) Closes #32. Plugin version bumped to 3.7.5.
…oop (#139) Agents couldn't rate observations because the injected <relevant-memory> block showed "## 1. [DECISION] Title" with no ID. The MCP feedback tool requires feedback(action="rate", id=N) but N was never visible. Now renders as "## 1. [DECISION] Title (id:12345)" in both: - session-start.js (initial context injection) - user-prompt.js (per-prompt context injection) Closes #32. Plugin version bumped to 3.7.5. Co-authored-by: Kirill Turanskiy <thebtf@users.noreply.github.com>
Root Cause
DeleteObservationandDeleteObservationsdid not callcleanupFunc, leaving orphaned vectors in pgvector. OnlyprocessCleanup(project-based background cleanup) properly synced deletions to the vector store.Result: 227,377 vectors for 186 observations — ~227K orphaned embeddings from dedup and MCP bulk_delete operations.
Fix
Both delete methods now call
s.cleanupFunc(ctx, ids)after successful deletion, matching the existing pattern inprocessCleanup.Test plan
Summary by CodeRabbit
Исправления ошибок
Chores