feat: Phase 2 — intelligent filtering + injection tracking#41
Conversation
- LLM behavioral relevance filter (internal/search/llm_filter.go) with configurable timeout fallback to composite scoring - injection_log table (migration 046) for tracking where observations get injected across projects - Injection diversity penalty in composite scoring — penalizes observations injected across many unrelated projects - Config flags: ENGRAM_LLM_FILTER_ENABLED, _MODEL, _TIMEOUT_MS, _CANDIDATES - Async injection logging in handleSearchByPrompt - 90-day injection_log cleanup in maintenance cycle
Skill for periodically reviewing observation usefulness across two dimensions (global usefulness, project relevance). Maps verdicts to rate_memory/suppress_memory MCP tool calls.
|
Caution Review failedThe pull request is closed. ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (2)
WalkthroughДобавлены функциональность LLM‑фильтрации релевантности кандидатов и консолидации (near‑dedup), миграция и хранилище для логирования инъекций с подсчётом diversity, применение штрафа разнообразия в ранжировании, асинхронное логирование/очистка инъекций, конфигурационные флаги и пороги, и пост‑записная supersession при сохранении решений. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Handler as handleSearchByPrompt
participant SearchMgr as Search Manager
participant LLMFilter
participant LLMClient as LLM Service
participant ObsStore as Observation Store
participant DB as Database
Client->>Handler: Search request
Handler->>SearchMgr: Cluster, score, sort
SearchMgr->>DB: Query observations
DB-->>SearchMgr: Results
SearchMgr->>SearchMgr: ApplyDiversityPenalty
Handler->>Handler: LLM filter enabled?
alt LLM filter enabled
Handler->>LLMFilter: FilterByRelevance(top N)
activate LLMFilter
LLMFilter->>LLMFilter: Build deterministic prompt
LLMFilter->>LLMClient: Complete(prompt) within timeout
LLMClient-->>LLMFilter: JSON array of IDs
LLMFilter->>Handler: Filtered IDs (or fallback)
deactivate LLMFilter
Handler->>ObsStore: LogInjections (async)
activate ObsStore
ObsStore->>DB: Insert injection_log rows
DB-->>ObsStore: OK
deactivate ObsStore
end
Handler-->>Client: Final filtered & ranked results
Estimated code review effort🎯 4 (Complex) | ⏱️ ~65 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.11.3)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the memory management system by introducing intelligent filtering and injection tracking capabilities. It integrates an LLM-driven filter to refine search results based on behavioral relevance, ensuring that AI agents receive more focused and pertinent context. Additionally, a new system tracks observation injections to penalize overly generic memories, and a retrospective evaluation skill empowers agents to actively improve memory quality over time, contributing to a more efficient and accurate knowledge base. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
|
@coderabbitai review |
|
@gemini-code-assist review |
✅ Actions performedReview triggered.
|
|
@codex review |
There was a problem hiding this comment.
Code Review
This pull request introduces an LLM-based filter for search results and tracks observation injections to calculate a diversity score, penalizing overly generic memories. The implementation is solid, with good use of asynchronous operations for logging and maintenance. However, I've identified a few areas for improvement, including a missing call to the new diversity penalty function, which is a key part of the feature. I've also suggested some minor refactorings to improve code clarity and maintainability, such as using constants instead of magic numbers and optimizing a database index.
There was a problem hiding this comment.
Code Review
This pull request introduces an LLM behavioral relevance filter, injection log tracking, diversity penalty, and a 90-day cleanup process. It also includes a retrospective evaluation skill. The changes span multiple files, including new files for the LLM filter and injection log store, and modifications to configuration, search management, worker handlers, and migrations.
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/config/config.go`:
- Around line 532-547: В Load() блоке для LLM-флагов сейчас учитывается только
окружение — нужно добавить симметричное чтение из файловой конфигурации
(settings.json / viper/Settings map) с теми же ключами, но сохранить приоритет
env-переменных: если в файле есть значения для ENGRAM_LLM_FILTER_ENABLED,
ENGRAM_LLM_FILTER_MODEL, ENGRAM_LLM_FILTER_TIMEOUT_MS и
ENGRAM_LLM_FILTER_CANDIDATES, установить соответствующие поля
cfg.LLMFilterEnabled, cfg.LLMFilterModel, cfg.LLMFilterTimeoutMS и
cfg.LLMFilterCandidates, а затем переопределять их значениями из os.Getenv если
они заданы; обновите логику парсинга (strconv.Atoi и проверку >0) для таймаута и
candidates так, чтобы она работала при чтении из файла так же, как сейчас для
env.
In `@internal/search/llm_filter.go`:
- Around line 96-107: The current loop building the prompt with sb.WriteString
and inserting obs.Title/obs.Narrative (using llmFilterNarrativeTruncate) writes
unescaped, line-oriented records which can break record boundaries; change the
code that iterates over candidates to produce a data-only payload (e.g., build a
slice/array of structs with fields ID, Type, Title, Narrative and marshal to
JSON) and append that JSON to sb instead of the free-form formatted lines; if
you must keep text, at minimum normalize/strip newlines and escape/control any
marker tokens before writing, and update the prompt text to explicitly tell the
model the payload is JSON and to treat fields as data only.
- Around line 70-72: The current warn log in llm_filter.go logs the raw LLM
"response" (log.Warn().Err(err).Str("response", strutil.Truncate(response,
200))...), which can leak PII; change it to stop logging the raw response and
instead log the error, the response length and a hash (e.g., sha256 hex of
response), plus the request/correlation id available in the context/request
metadata; remove or replace the Str("response", ...) field and add fields like
"response_len", "response_sha256" and "correlation_id" when emitting the warn in
the LLM filter error path.
- Around line 76-81: The code currently treats an empty relevantIDs slice as an
error and returns allIDs; change this so an empty [] is considered a valid
(successful) filter result and only fall back to allIDs on explicit
timeout/error conditions. Locate the branch that checks len(relevantIDs) and
remove/replace the return-allIDs behavior so that when relevantIDs is empty you
simply return relevantIDs (and still log the empty result via the existing
logger), and ensure the fallback path remains tied to actual error/timeout
signals instead of len(relevantIDs) (referencing relevantIDs and allIDs in
internal/search/llm_filter.go).
In `@internal/search/manager.go`:
- Around line 213-235: The penalty is currently applied to all non-global
scopes, wrongly affecting agent-scoped observations; update
ApplyDiversityPenalty so the penalty is applied only when obs.Scope ==
models.ScopeProject (i.e., skip unless scope equals models.ScopeProject),
leaving models.ScopeGlobal and models.ScopeAgent untouched; adjust the scope
check in ApplyDiversityPenalty to only proceed for project-scoped observations
and leave the rest unchanged, referencing the function name
ApplyDiversityPenalty and the scope constants models.ScopeProject and
models.ScopeAgent.
In `@internal/worker/handlers_context.go`:
- Around line 355-368: The current goroutine calls
s.observationStore.LogInjections with the raw query, causing full user queries
to be stored; change the call to avoid saving sensitive raw query data by
passing a safe alternative (e.g., an empty string or a non-reversible hash)
instead of the query variable when invoking LogInjections; update the closure
that builds resultIDs from clusteredObservations and the LogInjections call
(referencing clusteredObservations, resultIDs, project, query, and
s.observationStore.LogInjections) so only observation IDs and project are stored
while the raw query is omitted or replaced with a hashed/sanitized value.
- Around line 325-353: The new diversity signal isn't applied because after
ApplyCompositeScoring the code proceeds to sorting/limit/LLM filter without
calling GetDiversityScores or search.ApplyDiversityPenalty; fix by invoking
s.GetDiversityScores(...) to compute diversity scores for clusteredObservations
and then call search.ApplyDiversityPenalty(clusteredObservations,
diversityScores, s.config.DiversityPenalty...) (or the actual params used in
your code) to adjust scores before the sort/limit/LLM filter block (i.e., update
the flow around ApplyCompositeScoring and before the sorting/limit/LLMFilter
section so the injection_log entries affect ranking). Ensure you reference and
use the returned diversity score map/struct from GetDiversityScores when calling
ApplyDiversityPenalty and preserve types expected by both functions.
In `@plugin/engram/skills/retrospective-eval/SKILL.md`:
- Around line 48-72: В таблице действия keep(global) и keep(project) оба сведены
к единому вызову rate_memory(..., rating="useful"), поэтому scope не меняется;
нужно явно менять scope перед или вместе с рейтингом: для keep(global) вызвать
инструмент MCP, который устанавливает scope в "global" (например
set_memory_scope(id=<observation_id>, scope="global") или эквивалентный
update_memory/id метод) и затем rate_memory(id=<observation_id>,
rating="useful"), а для keep(project) установить scope="project" аналогичным
вызовом и потом rate_memory(id=<observation_id>, rating="useful"); оставьте
suppress как suppress_memory(id=<observation_id>) и demote как rate_memory(...,
rating="not_useful").
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 59961ce0-4b83-40b2-95b6-025080945541
📒 Files selected for processing (9)
internal/config/config.gointernal/db/gorm/injection_log_store.gointernal/db/gorm/migrations.gointernal/search/llm_filter.gointernal/search/manager.gointernal/worker/handlers_context.gointernal/worker/handlers_maintenance.gointernal/worker/service.goplugin/engram/skills/retrospective-eval/SKILL.md
- Near-duplicate consolidation (internal/maintenance/near_dedup.go): finds observations >0.95 similarity with same project+type, marks lower-importance one as superseded. Opt-in via ENGRAM_CONSOLIDATION_ENABLED (default: false) - Write-time supersession for decisions: when storing a new decision, marks similar existing decisions (>0.9 similarity) as superseded. Controlled by ENGRAM_SUPERSESSION_ENABLED (default: true) - Config flags: ENGRAM_CONSOLIDATION_ENABLED, _THRESHOLD, ENGRAM_SUPERSESSION_ENABLED, _THRESHOLD
- Remove raw LLM response from logs (PII risk) — log response_len instead - Empty LLM filter result falls back to top-5 per FR-7 spec - Wire ApplyDiversityPenalty into search pipeline after composite scoring - Don't store raw query in injection_log (privacy) - Remove unused model field from LLMFilter
Summary
internal/search/llm_filter.go): evaluates top-N candidates via LLM with configurable 3s timeout fallback to composite scoring. Controlled byENGRAM_LLM_FILTER_ENABLED(default: false)rate_memory/suppress_memoryMCP toolsFiles changed
internal/search/llm_filter.gointernal/db/gorm/injection_log_store.gointernal/db/gorm/migrations.gointernal/config/config.gointernal/search/manager.gointernal/worker/handlers_context.gointernal/worker/handlers_maintenance.gointernal/worker/service.goplugin/engram/skills/retrospective-eval/SKILL.mdTest plan
go build ./...ENGRAM_LLM_FILTER_ENABLED=false(default) — verify no behavior changeENGRAM_LLM_FILTER_ENABLED=true+ENGRAM_LLM_URL— verify filter logs appear/retrospective-evalskill manually to verify it worksSummary by CodeRabbit
Примечания к релизу
Новые функции
Обслуживание
Документация