
feat: Phase 2 — intelligent filtering + injection tracking#41

Merged: thebtf merged 4 commits into main from worktree-phase2-intelligent-filtering on Mar 22, 2026

@thebtf (Owner) commented Mar 22, 2026

Summary

  • LLM behavioral relevance filter (internal/search/llm_filter.go): evaluates top-N candidates via LLM with configurable 3s timeout fallback to composite scoring. Controlled by ENGRAM_LLM_FILTER_ENABLED (default: false)
  • Injection log (migration 046): tracks where observations get injected per project/session for diversity scoring
  • Injection diversity penalty: penalizes scope=project observations injected across many unrelated projects (high diversity = generic noise)
  • 90-day cleanup: injection_log entries cleaned up in maintenance cycle
  • Retrospective evaluation skill: periodic observation quality review via rate_memory/suppress_memory MCP tools

Files changed

| File | Change |
|---|---|
| internal/search/llm_filter.go | NEW: LLM filter with prompt builder + JSON parser |
| internal/db/gorm/injection_log_store.go | NEW: LogInjection, GetDiversityScores, CleanupInjectionLog |
| internal/db/gorm/migrations.go | Migration 046: injection_log table |
| internal/config/config.go | 4 new LLM filter config fields |
| internal/search/manager.go | ApplyDiversityPenalty function |
| internal/worker/handlers_context.go | LLM filter wiring + async injection logging |
| internal/worker/handlers_maintenance.go | injection_log 90-day cleanup |
| internal/worker/service.go | LLM filter initialization |
| plugin/engram/skills/retrospective-eval/SKILL.md | NEW: retrospective eval skill |

Test plan

  • Build passes: go build ./...
  • Deploy with ENGRAM_LLM_FILTER_ENABLED=false (default) — verify no behavior change
  • Deploy with ENGRAM_LLM_FILTER_ENABLED=true + ENGRAM_LLM_URL — verify filter logs appear
  • Verify injection_log table is created by migration 046
  • Verify 90-day cleanup runs in maintenance cycle
  • Run /retrospective-eval skill manually to verify it works

Summary by CodeRabbit

Release Notes

  • New Features

    • LLM-based filtering of search results with a configurable model, timeout, and candidate limit
    • Automatic consolidation of near-duplicates (near-duplicate merging) and a mechanism for marking outdated observations (supersession)
    • Diversity scoring of results, with a penalty applied to the ranking score
  • Maintenance

    • Asynchronous injection logging, a new injection log table, and periodic cleanup that reports the number of deleted records
  • Documentation

    • Guide to retrospective evaluation of observations (retrospective-eval)

thebtf added 2 commits March 23, 2026 01:15
- LLM behavioral relevance filter (internal/search/llm_filter.go)
  with configurable timeout fallback to composite scoring
- injection_log table (migration 046) for tracking where observations
  get injected across projects
- Injection diversity penalty in composite scoring — penalizes
  observations injected across many unrelated projects
- Config flags: ENGRAM_LLM_FILTER_ENABLED, _MODEL, _TIMEOUT_MS, _CANDIDATES
- Async injection logging in handleSearchByPrompt
- 90-day injection_log cleanup in maintenance cycle
Skill for periodically reviewing observation usefulness across
two dimensions (global usefulness, project relevance). Maps
verdicts to rate_memory/suppress_memory MCP tool calls.
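The four config flags named in the first commit above could be loaded along the following lines. This is a sketch only, not the contents of internal/config/config.go: the struct, the `loadLLMFilterConfig` and `positiveInt` helpers, and the default of 10 candidates are assumptions, while the 3000 ms default mirrors the 3s timeout mentioned in the PR summary.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// LLMFilterConfig is a hypothetical holder for the four filter settings.
type LLMFilterConfig struct {
	Enabled    bool
	Model      string
	TimeoutMS  int
	Candidates int
}

// positiveInt parses raw and returns fallback unless the value is a
// strictly positive integer.
func positiveInt(raw string, fallback int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n <= 0 {
		return fallback
	}
	return n
}

// loadLLMFilterConfig reads the flags through getenv so that tests can
// supply their own lookup function instead of the process environment.
func loadLLMFilterConfig(getenv func(string) string) LLMFilterConfig {
	// Defaults: disabled, 3s timeout; the candidate count is an assumed value.
	cfg := LLMFilterConfig{Enabled: false, TimeoutMS: 3000, Candidates: 10}

	if v := getenv("ENGRAM_LLM_FILTER_ENABLED"); v != "" {
		if b, err := strconv.ParseBool(v); err == nil {
			cfg.Enabled = b
		}
	}
	if v := getenv("ENGRAM_LLM_FILTER_MODEL"); v != "" {
		cfg.Model = v
	}
	cfg.TimeoutMS = positiveInt(getenv("ENGRAM_LLM_FILTER_TIMEOUT_MS"), cfg.TimeoutMS)
	cfg.Candidates = positiveInt(getenv("ENGRAM_LLM_FILTER_CANDIDATES"), cfg.Candidates)
	return cfg
}

func main() {
	fmt.Printf("%+v\n", loadLLMFilterConfig(os.Getenv))
}
```

Invalid or non-positive numeric values fall back to the defaults rather than failing startup, which keeps the filter opt-in and safe to leave unconfigured.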
coderabbitai Bot commented Mar 22, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 03046b93-2435-4d48-957a-d7f33b8f1c31

📥 Commits

Reviewing files that changed from the base of the PR and between 0cf0c54 and cf98399.

📒 Files selected for processing (2)
  • internal/search/llm_filter.go
  • internal/worker/handlers_context.go

Walkthrough

Adds LLM-based relevance filtering of candidates and consolidation (near-dedup), a migration and store for logging injections with diversity counting, a diversity penalty applied in ranking, asynchronous injection logging and cleanup, configuration flags and thresholds, and post-write supersession when saving decisions.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Configuration**<br>internal/config/config.go | Added configuration fields LLMFilterEnabled, LLMFilterModel, LLMFilterTimeoutMS, LLMFilterCandidates, ConsolidationEnabled, SupersessionEnabled, SupersessionThreshold, and ConsolidationThreshold, with defaults and loading from the environment. |
| **DB migration & store**<br>internal/db/gorm/migrations.go, internal/db/gorm/injection_log_store.go | New migration 046_injection_log and a GORM implementation: injection logging (single/batch), diversity calculation (COUNT DISTINCT / COUNT), and cleanup of stale records. |
| **LLM filter**<br>internal/search/llm_filter.go | New LLMFilter builds a deterministic prompt, calls the LLM with a timeout, and parses a JSON array of IDs; on errors it uses a fallback (all or top-5) and logs. |
| **Scoring helpers**<br>internal/search/manager.go | Added ApplyDiversityPenalty(...), which lowers scores in place based on diversity scores, using thresholds and a minimum multiplier. |
| **Worker integration**<br>internal/worker/service.go, internal/worker/handlers_context.go, internal/worker/handlers_maintenance.go | Initializes s.llmFilter when the flag is enabled; applies the LLM filter to the top-N candidates in handleSearchByPrompt with asynchronous LogInjections; periodically cleans up injection_log during maintenance. |
| **Maintenance: near-dedup**<br>internal/maintenance/near_dedup.go, internal/maintenance/service.go | Added NearDuplicateFinder and integrated it into Maintenance.Service (enabled via config): finds similar vectors and marks lower-priority observations as superseded; tracks a merge counter and statistics. |
| **MCP post-write supersession**<br>internal/mcp/tools_memory.go | After an observation of type decision is written, searches for similar ones (top-3) and marks older ones as superseded when the threshold is exceeded; errors are logged. |
| **Docs / Skill**<br>plugin/engram/skills/retrospective-eval/SKILL.md | New document describing the retrospective evaluation process for injections, the rating scales, and the actions (keep/demote/suppress), with example tool calls. |
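To make the diversity signal in the table above concrete, here is a rough sketch of the ratio (distinct projects divided by total injections) and an in-place penalty. It is not the code in internal/search/manager.go: the type, the function names, the 0.5 threshold, the 0.5 minimum multiplier, and the linear ramp between them are all assumptions chosen for illustration. Restricting the penalty to project-scoped observations follows the PR summary.

```go
package main

import "fmt"

// observation is a hypothetical, trimmed-down record for this sketch.
type observation struct {
	ID    int64
	Scope string // "project", "global", or "agent"
	Score float64
}

const (
	diversityThreshold = 0.5 // assumed: below this, no penalty
	minMultiplier      = 0.5 // assumed: floor for the score multiplier
)

// diversityScore is COUNT(DISTINCT project) / COUNT(*) for one observation.
func diversityScore(distinctProjects, totalInjections int) float64 {
	if totalInjections == 0 {
		return 0
	}
	return float64(distinctProjects) / float64(totalInjections)
}

// applyDiversityPenaltySketch scales scores in place. Only project-scoped
// observations are penalized; the multiplier falls linearly from 1 at the
// threshold to minMultiplier at a diversity of 1.
func applyDiversityPenaltySketch(obs []observation, diversity map[int64]float64) {
	for i := range obs {
		if obs[i].Scope != "project" {
			continue
		}
		d, ok := diversity[obs[i].ID]
		if !ok || d <= diversityThreshold {
			continue
		}
		excess := (d - diversityThreshold) / (1 - diversityThreshold)
		if excess > 1 {
			excess = 1
		}
		obs[i].Score *= 1 - excess*(1-minMultiplier)
	}
}

func main() {
	obs := []observation{
		{ID: 1, Scope: "project", Score: 2.0},
		{ID: 2, Scope: "global", Score: 2.0},
	}
	diversity := map[int64]float64{
		1: diversityScore(3, 4), // injected 4 times across 3 projects
		2: diversityScore(4, 4),
	}
	applyDiversityPenaltySketch(obs, diversity)
	fmt.Println(obs)
}
```

A project-scoped observation that keeps appearing in unrelated projects drifts down the ranking, while global observations, which are expected to be widely applicable, are left alone.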

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant Handler as handleSearchByPrompt
    participant SearchMgr as Search Manager
    participant LLMFilter
    participant LLMClient as LLM Service
    participant ObsStore as Observation Store
    participant DB as Database

    Client->>Handler: Search request
    Handler->>SearchMgr: Cluster, score, sort
    SearchMgr->>DB: Query observations
    DB-->>SearchMgr: Results
    SearchMgr->>SearchMgr: ApplyDiversityPenalty
    Handler->>Handler: LLM filter enabled?
    alt LLM filter enabled
        Handler->>LLMFilter: FilterByRelevance(top N)
        activate LLMFilter
        LLMFilter->>LLMFilter: Build deterministic prompt
        LLMFilter->>LLMClient: Complete(prompt) within timeout
        LLMClient-->>LLMFilter: JSON array of IDs
        LLMFilter->>Handler: Filtered IDs (or fallback)
        deactivate LLMFilter
        Handler->>ObsStore: LogInjections (async)
        activate ObsStore
        ObsStore->>DB: Insert injection_log rows
        DB-->>ObsStore: OK
        deactivate ObsStore
    end
    Handler-->>Client: Final filtered & ranked results

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~65 minutes

Possibly related PRs

Poem

🐰 I'm a rabbit in the code, quietly tidying the trail,
The LLM filter weighs who matters and who doesn't.
I'll tuck the duplicate away and account for diversity,
Logs go out asynchronously, and the garden is in order.
I hop along, keeping relevance safe.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The PR title clearly reflects the main changes: adding LLM-based intelligent filtering and tracking injections in a log. This matches the main goal of Phase 2 described in the PR objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions



@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the memory management system by introducing intelligent filtering and injection tracking capabilities. It integrates an LLM-driven filter to refine search results based on behavioral relevance, ensuring that AI agents receive more focused and pertinent context. Additionally, a new system tracks observation injections to penalize overly generic memories, and a retrospective evaluation skill empowers agents to actively improve memory quality over time, contributing to a more efficient and accurate knowledge base.

Highlights

  • LLM Behavioral Relevance Filter: Introduced an LLM-based filter to evaluate the behavioral relevance of top-N observation candidates, with a configurable timeout and fallback to composite scoring.
  • Injection Tracking Log: Implemented a new injection_log table and associated store functions to track where observations are injected per project and session.
  • Injection Diversity Penalty: Added a mechanism to penalize observations that are injected across many unrelated projects, reducing the score of generic observations.
  • 90-Day Log Cleanup: Established a maintenance task to automatically clean up injection_log entries older than 90 days.
  • Retrospective Evaluation Skill: Documented a new skill for periodic review of observation quality, allowing agents to rate or suppress memories.

@thebtf
Owner Author

thebtf commented Mar 22, 2026

@coderabbitai review

@thebtf
Owner Author

thebtf commented Mar 22, 2026

@gemini-code-assist review

@coderabbitai

coderabbitai Bot commented Mar 22, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@thebtf
Owner Author

thebtf commented Mar 22, 2026

@codex review


@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an LLM-based filter for search results and tracks observation injections to calculate a diversity score, penalizing overly generic memories. The implementation is solid, with good use of asynchronous operations for logging and maintenance. However, I've identified a few areas for improvement, including a missing call to the new diversity penalty function, which is a key part of the feature. I've also suggested some minor refactorings to improve code clarity and maintainability, such as using constants instead of magic numbers and optimizing a database index.

Comment thread internal/search/manager.go
Comment thread internal/config/config.go
Comment thread internal/db/gorm/migrations.go
Comment thread internal/search/llm_filter.go Outdated
Comment thread internal/search/manager.go
Comment thread internal/worker/handlers_maintenance.go

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request introduces an LLM behavioral relevance filter, injection log tracking, diversity penalty, and a 90-day cleanup process. It also includes a retrospective evaluation skill. The changes span multiple files, including new files for the LLM filter and injection log store, and modifications to configuration, search management, worker handlers, and migrations.

Comment thread internal/worker/handlers_maintenance.go
Comment thread internal/search/llm_filter.go
Comment thread internal/config/config.go
Comment thread internal/db/gorm/injection_log_store.go

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/config/config.go`:
- Around line 532-547: The Load() block for the LLM flags currently reads only
the environment. Add symmetric reading from the file-based configuration
(settings.json / viper/Settings map) with the same keys, while keeping env
variables at higher priority: if the file has values for
ENGRAM_LLM_FILTER_ENABLED, ENGRAM_LLM_FILTER_MODEL, ENGRAM_LLM_FILTER_TIMEOUT_MS
and ENGRAM_LLM_FILTER_CANDIDATES, set the corresponding fields
cfg.LLMFilterEnabled, cfg.LLMFilterModel, cfg.LLMFilterTimeoutMS and
cfg.LLMFilterCandidates, then override them with values from os.Getenv when
those are set; update the parsing logic (strconv.Atoi and the >0 check) for the
timeout and candidates so that it works the same for file values as it does now
for env.

In `@internal/search/llm_filter.go`:
- Around line 96-107: The current loop building the prompt with sb.WriteString
and inserting obs.Title/obs.Narrative (using llmFilterNarrativeTruncate) writes
unescaped, line-oriented records which can break record boundaries; change the
code that iterates over candidates to produce a data-only payload (e.g., build a
slice/array of structs with fields ID, Type, Title, Narrative and marshal to
JSON) and append that JSON to sb instead of the free-form formatted lines; if
you must keep text, at minimum normalize/strip newlines and escape/control any
marker tokens before writing, and update the prompt text to explicitly tell the
model the payload is JSON and to treat fields as data only.
- Around line 70-72: The current warn log in llm_filter.go logs the raw LLM
"response" (log.Warn().Err(err).Str("response", strutil.Truncate(response,
200))...), which can leak PII; change it to stop logging the raw response and
instead log the error, the response length and a hash (e.g., sha256 hex of
response), plus the request/correlation id available in the context/request
metadata; remove or replace the Str("response", ...) field and add fields like
"response_len", "response_sha256" and "correlation_id" when emitting the warn in
the LLM filter error path.
- Around line 76-81: The code currently treats an empty relevantIDs slice as an
error and returns allIDs; change this so an empty [] is considered a valid
(successful) filter result and only fall back to allIDs on explicit
timeout/error conditions. Locate the branch that checks len(relevantIDs) and
remove/replace the return-allIDs behavior so that when relevantIDs is empty you
simply return relevantIDs (and still log the empty result via the existing
logger), and ensure the fallback path remains tied to actual error/timeout
signals instead of len(relevantIDs) (referencing relevantIDs and allIDs in
internal/search/llm_filter.go).

In `@internal/search/manager.go`:
- Around line 213-235: The penalty is currently applied to all non-global
scopes, wrongly affecting agent-scoped observations; update
ApplyDiversityPenalty so the penalty is applied only when obs.Scope ==
models.ScopeProject (i.e., skip unless scope equals models.ScopeProject),
leaving models.ScopeGlobal and models.ScopeAgent untouched; adjust the scope
check in ApplyDiversityPenalty to only proceed for project-scoped observations
and leave the rest unchanged, referencing the function name
ApplyDiversityPenalty and the scope constants models.ScopeProject and
models.ScopeAgent.

In `@internal/worker/handlers_context.go`:
- Around line 355-368: The current goroutine calls
s.observationStore.LogInjections with the raw query, causing full user queries
to be stored; change the call to avoid saving sensitive raw query data by
passing a safe alternative (e.g., an empty string or a non-reversible hash)
instead of the query variable when invoking LogInjections; update the closure
that builds resultIDs from clusteredObservations and the LogInjections call
(referencing clusteredObservations, resultIDs, project, query, and
s.observationStore.LogInjections) so only observation IDs and project are stored
while the raw query is omitted or replaced with a hashed/sanitized value.
- Around line 325-353: The new diversity signal isn't applied because after
ApplyCompositeScoring the code proceeds to sorting/limit/LLM filter without
calling GetDiversityScores or search.ApplyDiversityPenalty; fix by invoking
s.GetDiversityScores(...) to compute diversity scores for clusteredObservations
and then call search.ApplyDiversityPenalty(clusteredObservations,
diversityScores, s.config.DiversityPenalty...) (or the actual params used in
your code) to adjust scores before the sort/limit/LLM filter block (i.e., update
the flow around ApplyCompositeScoring and before the sorting/limit/LLMFilter
section so the injection_log entries affect ranking). Ensure you reference and
use the returned diversity score map/struct from GetDiversityScores when calling
ApplyDiversityPenalty and preserve types expected by both functions.

In `@plugin/engram/skills/retrospective-eval/SKILL.md`:
- Around line 48-72: In the table, the keep(global) and keep(project) actions
both collapse to the same call rate_memory(..., rating="useful"), so the scope
does not change; the scope must be changed explicitly before or together with
the rating: for keep(global), call an MCP tool that sets the scope to "global"
(for example set_memory_scope(id=<observation_id>, scope="global") or an
equivalent update_memory/id method) and then rate_memory(id=<observation_id>,
rating="useful"); for keep(project), set scope="project" with an analogous call
and then rate_memory(id=<observation_id>, rating="useful"); leave suppress as
suppress_memory(id=<observation_id>) and demote as rate_memory(...,
rating="not_useful").

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 59961ce0-4b83-40b2-95b6-025080945541

📥 Commits

Reviewing files that changed from the base of the PR and between e7bfd03 and a589bb8.

📒 Files selected for processing (9)
  • internal/config/config.go
  • internal/db/gorm/injection_log_store.go
  • internal/db/gorm/migrations.go
  • internal/search/llm_filter.go
  • internal/search/manager.go
  • internal/worker/handlers_context.go
  • internal/worker/handlers_maintenance.go
  • internal/worker/service.go
  • plugin/engram/skills/retrospective-eval/SKILL.md

Comment thread internal/config/config.go
Comment thread internal/search/llm_filter.go
Comment thread internal/search/llm_filter.go Outdated
Comment thread internal/search/llm_filter.go
Comment thread internal/search/manager.go
Comment thread internal/worker/handlers_context.go
Comment thread internal/worker/handlers_context.go
Comment thread plugin/engram/skills/retrospective-eval/SKILL.md
thebtf added 2 commits March 23, 2026 01:26
- Near-duplicate consolidation (internal/maintenance/near_dedup.go):
  finds observations >0.95 similarity with same project+type,
  marks lower-importance one as superseded. Opt-in via
  ENGRAM_CONSOLIDATION_ENABLED (default: false)
- Write-time supersession for decisions: when storing a new decision,
  marks similar existing decisions (>0.9 similarity) as superseded.
  Controlled by ENGRAM_SUPERSESSION_ENABLED (default: true)
- Config flags: ENGRAM_CONSOLIDATION_ENABLED, _THRESHOLD,
  ENGRAM_SUPERSESSION_ENABLED, _THRESHOLD
- Remove raw LLM response from logs (PII risk) — log response_len instead
- Empty LLM filter result falls back to top-5 per FR-7 spec
- Wire ApplyDiversityPenalty into search pipeline after composite scoring
- Don't store raw query in injection_log (privacy)
- Remove unused model field from LLMFilter
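The near-duplicate consolidation rule from the commit messages above (similarity above 0.95, same project and type, lower-importance observation marked superseded) can be sketched as follows. This is a brute-force illustration under stated assumptions, not internal/maintenance/near_dedup.go: the `obs` type, `cosine`, and `markNearDuplicates` are hypothetical, cosine similarity is assumed as the metric, and a real implementation would presumably query a vector index rather than compare every pair.

```go
package main

import (
	"fmt"
	"math"
)

// obs is a hypothetical observation with an embedding vector.
type obs struct {
	ID         int64
	Project    string
	Type       string
	Importance float64
	Vec        []float64
	Superseded bool
}

// cosine returns the cosine similarity of a and b, or 0 when the vectors
// are empty, of different length, or have zero norm.
func cosine(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// markNearDuplicates compares every pair within the same project and type
// and marks the lower-importance member of each near-duplicate pair as
// superseded. It returns the number of observations marked.
func markNearDuplicates(items []obs, threshold float64) int {
	merged := 0
	for i := range items {
		if items[i].Superseded {
			continue
		}
		for j := i + 1; j < len(items); j++ {
			a, b := &items[i], &items[j]
			if b.Superseded || a.Project != b.Project || a.Type != b.Type {
				continue
			}
			if cosine(a.Vec, b.Vec) <= threshold {
				continue
			}
			merged++
			if a.Importance < b.Importance {
				a.Superseded = true
				break // a is gone; stop comparing it against others
			}
			b.Superseded = true
		}
	}
	return merged
}

func main() {
	items := []obs{
		{ID: 1, Project: "engram", Type: "decision", Importance: 0.2, Vec: []float64{1, 0}},
		{ID: 2, Project: "engram", Type: "decision", Importance: 0.9, Vec: []float64{1, 0}},
		{ID: 3, Project: "other", Type: "decision", Importance: 0.1, Vec: []float64{1, 0}},
	}
	n := markNearDuplicates(items, 0.95)
	fmt.Println(n, items[0].Superseded, items[1].Superseded, items[2].Superseded)
}
```

Marking rather than deleting keeps the superseded observation available for audit, which matches the "superseded" wording used throughout this PR.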
@thebtf thebtf merged commit b97a034 into main Mar 22, 2026
1 check passed