Skip to content

refactor: remove local-server assumptions, JS-only hooks#6

Merged
thebtf merged 7 commits into
mainfrom
refactor/client-server-cleanup
Mar 14, 2026
Merged

refactor: remove local-server assumptions, JS-only hooks#6
thebtf merged 7 commits into
mainfrom
refactor/client-server-cleanup

Conversation

@thebtf
Copy link
Copy Markdown
Owner

@thebtf thebtf commented Mar 14, 2026

Summary

  • Remove Go hook binaries (cmd/hooks/, pkg/hooks/) — JS hooks in plugin/engram/hooks/ are the only runtime hooks
  • Remove dead dbWatcher field from worker service (SQLite-era leftover)
  • Redesign import_instincts MCP tool: accept files: [{name, content}] instead of filesystem paths (works in Docker)
  • Gate SDK local file verification behind LOCAL_VERIFICATION_ENABLED config flag (default: off for Docker mode)
  • Update all architecture docs to reflect JS-only hooks

Motivation

Engram runs ONLY in Docker on Unraid/NAS. Several components assumed local filesystem access (Go hooks compiled into binaries, instinct import reading local paths, SDK verifying local files). This PR removes or redesigns all remaining local-server assumptions.

Test plan

  • go build ./cmd/worker/ compiles cleanly
  • go vet ./internal/... passes (excluding pre-existing go-tree-sitter issues)
  • Docker image builds successfully
  • goreleaser archive includes JS hooks instead of Go hook binaries
  • import_instincts works with files parameter (content over wire)
  • import_instincts with path parameter logs deprecation warning
  • SDK processor logs 'local verification disabled' when LOCAL_VERIFICATION_ENABLED is unset

Summary by CodeRabbit

Обновление релиза

  • Новые функции

    • Переход на JavaScript-хуки (выполняются через Node) и утилита синхронизации сессий
    • REST API для индексации и проверки наличия сессий
    • Импорт контента "инстинктов" по содержимому с дедупликацией
  • Улучшения

    • Упрощена сборка — хуки больше не компилируются в бинарники
    • Статуслайн и плагины перенесены в JS-плагины; обновлён плагинный поток
  • Изменения конфигурации

    • Добавлена опция локальной проверки и поля для ключей шифрования
    • Убрана зависимость от ранее использовавшегося каталога сессий

thebtf added 6 commits March 13, 2026 23:01
Clients push session transcripts to the server instead of server
reading local files. Adds POST /api/sessions/index (accepts raw JSONL,
optional gzip) and POST /api/sessions/check (batch existence check).

Stop hook uploads transcript after session end. New sync-sessions.js
script handles catch-up for historical sessions.

Removes dead local indexer (internal/sessions/indexer.go) and
SESSIONS_DIR config — server never had access to client files.
Remove cmd/hooks/ (6 Go binaries) and pkg/hooks/ (framework) — all hooks
already run as JS via node in hooks.json. Go hooks were compiled but never
invoked by Claude Code plugin.

Changes:
- Delete cmd/hooks/ and pkg/hooks/ directories
- Makefile: remove hooks build target and cross-compile sections
- Dockerfile: remove 6 hook build lines, copy JS hooks instead
- .goreleaser.yaml: remove 7 hook build sections, fix stale archive path
  (plugin/hooks/ → plugin/engram/hooks/), add JS files to release archive
- Install scripts: statusline command now uses node instead of binary
- register-plugin.sh: statusline references node statusline.js
The dbWatcher field monitored SQLite database files for deletion/recreation.
Since migrating to PostgreSQL, this watcher serves no purpose — the database
is managed by the PostgreSQL server, not the filesystem.

configWatcher is retained (watches server's own settings.json).
The import_instincts MCP tool now accepts a 'files' parameter with an array
of {name, content} objects, allowing clients to send file content directly
instead of relying on the server having filesystem access to client paths.

This fixes the broken remote import scenario where the Docker server cannot
read files from the client's ~/.claude/homunculus/instincts/ directory.

The legacy 'path' parameter is preserved with a deprecation warning for
backward compatibility in local development setups.
Add LOCAL_VERIFICATION_ENABLED config flag (default: false) to explicitly
control whether the server attempts to read client files from disk or invoke
the Claude CLI locally.

In Docker/remote mode (default), these operations are skipped entirely:
- GetFileMtimes returns nil (no filesystem stat calls)
- GetFileContent returns empty (no file reads)
- Claude CLI discovery is skipped (no exec.LookPath)

Set LOCAL_VERIFICATION_ENABLED=true for local development where the server
runs on the same machine as the client.
Remove all references to Go hook binaries (cmd/hooks/, pkg/hooks/,
bin/hooks/) from architecture documentation, README, and quickstart
guide. Update to reflect JS-only hooks in plugin/engram/hooks/.
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 14, 2026

Walkthrough

Миграция Go-базированных hook-ов в JavaScript-плагины под plugin/engram/hooks/, удаление локальной Go hook‑инфраструктуры, переход от фоновой индексации сессий к REST API (/api/sessions/index, /api/sessions/check), и добавление импорта instincts по содержимому и новых скриптов синхронизации.

Changes

Cohort / File(s) Summary
Build & Release
\.goreleaser.yaml, Dockerfile, Makefile
Убрана сборка и встраивание Go hook-бинарников; артефакты и копирование переориентированы на plugin/engram/hooks/*.js и hooks.json.
Удаление Go hook фреймворка
pkg/hooks/... (pkg/hooks/response.go, pkg/hooks/worker.go, pkg/hooks/worker_test.go), cmd/hooks/... (cmd/hooks/session-start/..., post-tool-use/..., user-prompt/..., stop/..., subagent-stop/..., statusline/...)
Полное удаление пакета hook-утилит и всех шести Go hook-реализаций и тестов; удалены связанные публичные типы и функции.
JS hooks & helpers
plugin/engram/hooks/lib.js, plugin/engram/hooks/stop.js, plugin/engram/hooks/sync-sessions.js
Добавлен WorkstationID и requestUpload в lib.js; stop.js теперь индексирует транскрипты; добавлен скрипт sync-sessions.js для пакетной загрузки сессий.
Session indexing (worker)
internal/worker/service.go, internal/worker/handlers_sessions.go
Удалён фоновой indexer, добавлены HTTP endpoints: POST /api/sessions/index и POST /api/sessions/check для клиентской индексации и проверки существования сессий.
Sessions store
internal/sessions/store.go
Добавлен метод CheckSessionsExist(ctx, ids []string) ([]string, error) для проверки уже индексированных сессий.
Удаление локального indexer
internal/sessions/indexer.go, cmd/mcp/main.go
Удалён локальный индексер файлов сессий; убрана автозагрузка/индексация при старте MCP.
Config & local verification
internal/config/config.go, internal/worker/sdk/processor.go
Удалено SessionsDir; добавлены EncryptionKeyFile, EncryptionKey, LocalVerificationEnabled; локальная проверка файлов/Claude‑CLI теперь зависит от LocalVerificationEnabled.
Instincts import
internal/instincts/content.go, internal/mcp/server.go, internal/mcp/tools_instincts.go
Добавлены ParseContent/ImportFromContent для импорта instincts из содержания файлов; MCP tool расширена для приёма files (name+content) и сохранения совместимости с устаревшим path.
Install / registration scripts
scripts/install.sh, scripts/install.ps1, scripts/register-plugin.sh
Скрипты установки и регистрации теперь копируют только .js hook-файлы и hooks.json; статуслайн запускается через node statusline.js вместо бинарника.
Документация
README.md, docs/arch/...
Документация и диаграммы обновлены: hooks описаны как JS-плагины в plugin/engram/hooks/, контракт STDIO -> JSON, обновлены архитектурные документы и Quickstart.
Make/Build help text
Makefile, Dockerfile
Удалены цели и шаги, связанные со сборкой hook-бинарников; образы и шаги копирования упрощены для JS hooks.

Sequence Diagram(s)

sequenceDiagram
    participant CC as Claude Code (Client)
    participant JS as Hook (plugin/engram/hooks/*.js)
    participant Lib as lib.js
    participant Worker as Worker HTTP API
    participant Store as Sessions DB
    CC->>JS: Запуск hook (stdin JSON)
    activate JS
    JS->>Lib: WorkstationID / requestUpload / getProject
    Lib->>Worker: HTTP GET/POST (например /api/observations)
    Worker-->>Lib: JSON ответ (observations / status)
    Lib-->>JS: Возвращает данные
    JS->>Worker: При необходимости POST /api/sessions/index (transcript, workstation_id)
    Worker->>Store: Upsert IndexedSession
    Worker-->>JS: Ответ {status, session_id}
    JS-->>CC: Печать JSON { "continue": true, ... } на stdout
    deactivate JS
Loading
sequenceDiagram
    participant Client as sync-sessions.js
    participant Worker as Worker HTTP API
    participant Store as Sessions DB
    Client->>Worker: POST /api/sessions/check { session_ids }
    activate Worker
    Worker->>Store: Query indexed_sessions (CheckSessionsExist)
    Store-->>Worker: existing IDs
    Worker-->>Client: { missing: [...] }
    deactivate Worker
    Client->>Worker: POST /api/sessions/index?workstation_id=... (multipart/ndjson)
    activate Worker
    Worker->>Worker: Parse JSONL, extract content/tool counts
    Worker->>Store: Upsert IndexedSession
    Worker-->>Client: { status: "indexed", session_id, exchange_count }
    deactivate Worker
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Пояснение

Основное изменение касается миграции с Go-based hooks на JavaScript-based hooks, размещённые в plugin/engram/hooks/, удаления инфраструктуры pkg/hooks, переноса indexing с фонового процесса на REST API endpoints и обновления конфигурации для поддержки локальной верификации.

Изменения

Когорта / Файлы Описание
Build и Release Configuration
.goreleaser.yaml, Dockerfile, Makefile
Удалены сборка и встраивание Go-based hook binaries; переконфигурированы артефакты для копирования JS hooks из plugin/engram/hooks/ вместо компилированных бинарников.
Hook Framework Removal
pkg/hooks/response.go, pkg/hooks/worker.go, pkg/hooks/worker_test.go, cmd/hooks/post-tool-use/main.go, cmd/hooks/session-start/main.go, cmd/hooks/statusline/main.go, cmd/hooks/stop/main.go, cmd/hooks/subagent-stop/main.go, cmd/hooks/user-prompt/main.go
Полное удаление Go-based hook framework, включая типы HookResponse, BaseInput, HookContext, worker управление и все 6 hook реализаций (session-start, user-prompt, post-tool-use, subagent-stop, stop, statusline).
Session Indexing Refactor
internal/sessions/indexer.go, internal/sessions/store.go, internal/worker/service.go, internal/worker/handlers_sessions.go
Замена фонового indexer на REST API endpoints; удалён background indexing, добавлены POST /api/sessions/index и POST /api/sessions/check с поддержкой client-driven indexing.
Configuration Updates
internal/config/config.go, internal/worker/sdk/processor.go
Удалено SessionsDir, добавлены EncryptionKeyFile, EncryptionKey и LocalVerificationEnabled; локальная верификация (Claude CLI) теперь гейтируется флагом.
Instincts Import Pipeline
internal/instincts/content.go, internal/mcp/server.go, internal/mcp/tools_instincts.go
Добавлена ParseContent и ImportFromContent для импорта instincts из содержимого; MCP tool обновлена для поддержки файлов с содержимым вместо только пути к директории.
JavaScript Hooks и Sync
plugin/engram/hooks/lib.js, plugin/engram/hooks/stop.js, plugin/engram/hooks/sync-sessions.js
Добавлены WorkstationID и requestUpload функции в lib.js; добавлен stop.js функционал для indexing транскриптов; новый sync-sessions.js скрипт для массовой загрузки сессий.
Installation Scripts
scripts/install.sh, scripts/install.ps1, scripts/register-plugin.sh
Обновлены для копирования только .js hooks и hooks.json; statusline теперь запускается через node вместо прямого бинарника.
Documentation
README.md, docs/arch/API_CONTRACTS.md, docs/arch/COMPONENTS.md, docs/arch/OVERVIEW.md, docs/arch/QUICKSTART.md, docs/arch/architecture.md
Обновлены документация и архитектурные диаграммы для отражения JS-based hooks, новых endpoints API, и REST-based session indexing.
Worker MCP Changes
cmd/mcp/main.go
Удалён automatic session indexing bootstrap на startup; retains sessionIdxStore initialization только.

Стихотворение

🐰 Я — кролик, радуюсь переменам в коде,
JS‑скрипты в плагинах скачут по ходу.
Ушёл indexer в фон — теперь клиент шлёт,
Сессии летят в API, всё вокруг поёт.
Пусть сборка компактна — прыгай, проект, в свободу!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 63.16% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main objective: removing Go hook binaries and shifting to JavaScript-only hooks while eliminating local-server assumptions for a Docker-first architecture.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch refactor/client-server-cleanup
📝 Coding Plan
  • Generate coding plan for human review comments

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.11.3)

Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions
The command is terminated due to an error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions


Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request implements a significant refactoring to eliminate local server assumptions and enhance compatibility with Docker and remote environments. The core of these changes involves migrating all lifecycle hooks from Go binaries to JavaScript, which are now executed directly by the Claude Code plugin system. Additionally, the instinct import mechanism has been updated to accept file content directly, and a new configuration flag controls local file verification, making the system more robust and portable for its intended deployment scenarios.

Highlights

  • Go Hooks Removed, JavaScript Hooks Introduced: All Go-based lifecycle hooks (cmd/hooks/, pkg/hooks/) have been removed and replaced with JavaScript-only hooks (plugin/engram/hooks/). This eliminates the need for Go hook binaries and simplifies the build process.
  • Redesigned Instinct Import Mechanism: The import_instincts MCP tool has been redesigned to accept instinct file content directly via an API, deprecating the previous local filesystem path-based import method. This improves compatibility with Docker and remote deployments.
  • Local Verification Configuration: A new LOCAL_VERIFICATION_ENABLED configuration flag has been introduced. This flag gates local file verification and CLI calls, defaulting to off for Docker environments to prevent local filesystem assumptions.
  • Session Indexing via API: The system now supports session transcript indexing and checking via new API endpoints (/api/sessions/index, /api/sessions/check), replacing the previous local filesystem-based session indexer (internal/sessions/indexer.go was removed).
  • Build and Documentation Updates: Build configurations (.goreleaser.yaml, Dockerfile, Makefile) and architecture documentation (docs/arch/*, README.md) have been updated to reflect the JS-only hook architecture and the Docker-centric deployment model.
  • Removed Dead Code: The dbWatcher field, an SQLite-era leftover, has been removed from the worker service, cleaning up obsolete components.
Changelog
  • .goreleaser.yaml
    • Removed Go hook binaries from the build process.
    • Updated archive configuration to include JavaScript hook files.
  • Dockerfile
    • Removed build steps for Go hook binaries.
    • Adjusted copy commands to include JavaScript hook files instead of Go binaries.
  • Makefile
    • Removed hooks build target and related commands for Go hook binaries.
    • Updated LDFLAGS to remove references to pkg/hooks.
    • Modified build and install targets to reflect the removal of Go hooks and inclusion of JS hooks.
  • README.md
    • Updated references to hooks, indicating they are now JS-based.
    • Removed pkg/hooks from the package list.
  • cmd/hooks/post-tool-use/main.go
    • Removed file.
  • cmd/hooks/session-start/main.go
    • Removed file.
  • cmd/hooks/statusline/main.go
    • Removed file.
  • cmd/hooks/stop/main.go
    • Removed file.
  • cmd/hooks/subagent-stop/main.go
    • Removed file.
  • cmd/hooks/user-prompt/main.go
    • Removed file.
  • cmd/mcp/main.go
    • Removed the initialization and background indexing of sessions from local filesystem.
  • docs/arch/API_CONTRACTS.md
    • Updated hook interface descriptions to reflect JS-only execution and HTTP communication.
    • Changed Go input struct examples to JSON.
  • docs/arch/COMPONENTS.md
    • Updated hook descriptions to specify JavaScript files executed via node.
    • Removed pkg/hooks section.
  • docs/arch/OVERVIEW.md
    • Updated diagrams and descriptions to reflect JS hooks and their execution.
  • docs/arch/QUICKSTART.md
    • Updated build instructions to remove references to Go hooks.
    • Clarified that hooks are JavaScript files and do not require compilation.
    • Updated install steps to copy JS hooks.
  • internal/config/config.go
    • Removed SessionsDir configuration field.
    • Added LocalVerificationEnabled boolean configuration flag.
  • internal/instincts/content.go
    • Added new file to handle instinct imports from content (over the wire).
  • internal/mcp/server.go
    • Updated the import_instincts tool definition to accept a files array (name, content) and marked the path parameter as deprecated.
  • internal/mcp/tools_instincts.go
    • Modified handleImportInstincts to prioritize importing from the files array and log a deprecation warning for path.
  • internal/sessions/indexer.go
    • Removed file.
  • internal/sessions/store.go
    • Added CheckSessionsExist method to query for existing session IDs in the database.
  • internal/worker/handlers_sessions.go
    • Added handleIndexSession endpoint to receive and index session transcripts via API.
    • Added handleCheckSessions endpoint to check for already indexed session IDs.
  • internal/worker/sdk/processor.go
    • Modified SDK processor initialization to conditionally find Claude Code CLI and enable local file verification based on LocalVerificationEnabled flag.
    • Updated GetFileMtimes and GetFileContent to return empty/nil if local verification is disabled.
  • internal/worker/service.go
    • Removed dbWatcher field.
    • Removed the background initialization and indexing of sessions from local filesystem.
    • Added new routes for session indexing and checking.
  • pkg/hooks/response.go
    • Removed file.
  • pkg/hooks/worker.go
    • Removed file.
  • pkg/hooks/worker_test.go
    • Removed file.
  • plugin/engram/hooks/lib.js
    • Added WorkstationID function to generate a deterministic ID.
    • Added requestUpload function to send raw content (potentially gzipped) to the server.
  • plugin/engram/hooks/stop.js
    • Added logic to index session transcripts by uploading them to the new /api/sessions/index endpoint.
  • plugin/engram/hooks/sync-sessions.js
    • Added new JavaScript file for a catch-up sync script for session transcripts.
  • scripts/install.ps1
    • Updated PowerShell installation script to copy JS hook files and configure statusline to use node.
  • scripts/install.sh
    • Updated shell installation script to copy JS hook files and remove chmod +x for Go hooks.
    • Configured statusline to use node.
  • scripts/register-plugin.sh
    • Updated STATUSLINE_CMD to use node for the statusline.js hook.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@thebtf
Copy link
Copy Markdown
Owner Author

thebtf commented Mar 14, 2026

@coderabbitai review

@thebtf
Copy link
Copy Markdown
Owner Author

thebtf commented Mar 14, 2026

@gemini-code-assist review

@thebtf
Copy link
Copy Markdown
Owner Author

thebtf commented Mar 14, 2026

@codex review

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 14, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This is a comprehensive refactoring that successfully removes local-server assumptions and transitions to JS-only hooks, significantly improving the application's portability for Docker-based deployments. The changes are consistently applied across the codebase, including build configurations, application logic, and documentation. The new API-driven approach for session transcript indexing is well-implemented, complete with a thoughtful backfill script. My feedback is minor, focusing on improving the robustness of error handling in the new session synchronization script.

Comment thread plugin/engram/hooks/sync-sessions.js
Comment thread plugin/engram/hooks/sync-sessions.js
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request is a significant and well-executed refactoring to remove local filesystem dependencies, primarily by replacing Go-based hooks with JavaScript hooks and introducing a client-server model for session transcript indexing. The changes are consistent across the codebase, including build configurations, Dockerfile, application logic, and documentation. The new REST API for session indexing is a solid design.

My review includes a few suggestions for improvement:

  • Making a deprecation warning more explicit.
  • Using asynchronous file I/O in the new Node.js scripts for better performance and adherence to best practices.

Comment thread internal/mcp/tools_instincts.go Outdated
Comment thread plugin/engram/hooks/stop.js
Comment thread plugin/engram/hooks/sync-sessions.js
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 14

🧹 Nitpick comments (1)
internal/mcp/server.go (1)

1029-1045: Стоит явно зафиксировать контракт “files или path”.

Сейчас схема не требует ни одно поле, поэтому клиенты могут формировать заведомо невалидные вызовы. Лучше добавить anyOf с обязательностью одного из путей ввода.

Предлагаемое усиление схемы
 			InputSchema: map[string]any{
 				"type": "object",
+				"anyOf": []map[string]any{
+					{"required": []string{"files"}},
+					{"required": []string{"path"}},
+				},
 				"properties": map[string]any{
 					"files": map[string]any{
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/mcp/server.go` around lines 1029 - 1045, The JSON schema for the
InputSchema in internal/mcp/server.go currently doesn't require either "files"
or "path", so add an anyOf constraint to require at least one input: update the
InputSchema object (where InputSchema is defined) to include an "anyOf" array
with two entries, one requiring ["files"] and the other requiring ["path"],
keeping the existing "files" and deprecated "path" property definitions
unchanged so clients must provide either files or path.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/arch/API_CONTRACTS.md`:
- Around line 184-187: Замените устаревшее описание вычисления Project в
docs/arch/API_CONTRACTS.md: вместо «Project identifier from CWD path hash»
документируйте текущее поведение, которое реализовано в
plugin/engram/hooks/lib.js в функции ProjectIDWithName — сначала формируется
идентификатор по git remote + относительному пути (чтобы обеспечить
cross-workstation семантику), а только при отсутствии git-remote падает назад на
вычисление по path-hash; укажите порядок приоритетов и кратко причину fallback.
- Around line 171-173: The docs incorrectly state that hooks write plain text to
stdout; update API_CONTRACTS.md to reflect the actual stdout contract used by
plugin/engram/hooks/lib.js: hooks must output a JSON object (e.g. { continue:
true, ... }) for normal responses and only the statusline uses plain text;
revise the session-start example and any text around lines 171-173 and 190-198
to show the JSON shape and required fields (including continue) and reference
lib.js behavior so readers implement compatible hooks.

In `@docs/arch/architecture.md`:
- Line 302: Нужно устранить несогласованность в документе: заменить все старые
упоминания «Hooks (Go binaries)», а также устаревшие блоки про SQLite/FTS5 и `DB
Watcher` в предыдущих разделах на новую реализацию `plugin/engram/hooks/*.js`
(JS lifecycle hooks), чтобы карта компонентов и описания выше по тексту
соответствовали строке с `plugin/engram/hooks/*.js`; найдите и обновите
заголовки/пояснения, которые ссылаются на Go-бинарники и старую интеграцию БД,
переопределив их на JS hooks, указав, где теперь обрабатываются
session-start/stop и убрав/перенастроив описания SQLite/FTS5 и DB Watcher так,
чтобы архитектурная схема была единообразна.

In `@docs/arch/QUICKSTART.md`:
- Line 122: В документе QUICKSTART заменить неверный путь
`~/.claude/installed_plugins.json` и `~/.claude/known_marketplaces.json` на
фактически используемые пути с поддиректорией plugins — например
`~/.claude/plugins/installed_plugins.json` и
`~/.claude/plugins/known_marketplaces.json`; обновите все вхождения этих строк в
docs/arch/QUICKSTART.md так, чтобы они совпадали с путями, которые создают
скрипты.

In `@internal/instincts/content.go`:
- Around line 26-37: The current split by plain "---" is brittle; update the
frontmatter parsing to search for the closing delimiter "\n---\n" (handling CRLF
if needed) after the initial opening marker instead of
strings.SplitN(content[3:], "---", 2). In the function that builds parts (use
the existing variables content and name and the Inst struct/inst), find the
index of the closing delimiter in content[3:], verify it exists (return the same
malformed YAML error if not), then slice the YAML frontmatter and body using
that index, call yaml.Unmarshal on the exact frontmatter slice, and set
inst.Body and inst.FilePath as before.

In `@internal/worker/handlers_sessions.go`:
- Around line 513-525: Текущий код использует http.MaxBytesReader(body,
maxSessionIndexBody) только на сжатом потоке, поэтому распаковка через
gzip.NewReader(body) может создать неограниченный по размеру reader; исправьте
это, оборачивая распакованный gz в ограничитель по maxSessionIndexBody (например
reader = io.LimitReader(gz, maxSessionIndexBody)) вместо прямого присваивания,
сохранив текущие defer закрытия; используйте те же символы в коде
(http.MaxBytesReader, maxSessionIndexBody, gzip.NewReader, reader) чтобы
локализовать и заменить поведение для проверки предела уже на распакованном
потоке.

In `@internal/worker/sdk/processor.go`:
- Around line 1006-1010: The function GetFileMtimes currently returns nil when
LocalVerificationEnabled is false, which contradicts its comment and can cause
nil-handling issues for callers; change the early return in GetFileMtimes so it
returns an empty map[string]int64 (e.g. make(map[string]int64)) instead of nil
so callers always receive a non-nil map; update the return statement inside
GetFileMtimes accordingly.

In `@plugin/engram/hooks/lib.js`:
- Around line 242-253: WorkstationID() currently hashes hostname+username but
must match server-side parser.go which hashes hostname+machine_id; update
WorkstationID() to read the same machine_id source used by
internal/sessions/parser.go (e.g., read /etc/machine-id or call the shared
helper that returns machine_id), then compute hash over hostname+machine_id and
return the first 8 hex chars; ensure sync-sessions.js and stop.js continue to
use WorkstationID() so all clients produce the identical ID.

In `@plugin/engram/hooks/sync-sessions.js`:
- Around line 27-33: The findSessionFiles function currently hardcodes the
sessions directory to path.join(os.homedir(), '.claude', 'projects'); change it
to first read process.env.ENGRAM_SESSIONS_DIR (if set and non-empty) and use
that as the projectsDir, otherwise fall back to the existing default; ensure you
still call fs.existsSync on the selected projectsDir and return [] if it does
not exist so behavior is unchanged for missing dirs. Use the existing function
name findSessionFiles and the variable projectsDir to locate where to apply the
change.

In `@scripts/install.ps1`:
- Line 173: The $StatuslineCmd assignment builds a node invocation with
$InstallDir\hooks\statusline.js but doesn't quote the path, so it breaks on
directories with spaces; update the $StatuslineCmd string to wrap the script
path (the $InstallDir\hooks\statusline.js portion) in double quotes (taking care
to escape or use an alternate quoting style appropriate for PowerShell) so Node
receives a single quoted pathname when $InstallDir contains spaces.
- Around line 81-83: Копирование обязательных хуков сейчас выполняется с
-ErrorAction SilentlyContinue, поэтому ошибки игнорируются; замените эти вызовы
Copy-Item (относящиеся к "$TempDir\hooks\*.js", "hooks.json" и
"$InstallDir\hooks\") на версии, которые не подавляют ошибки (например
использовать -ErrorAction Stop) и/или оберните вызовы в try/catch: при ошибке
логируйте понятное сообщение с указанием "$TempDir" и "$InstallDir" и прерывайте
установку (exit 1); альтернативно, после копирования проверьте наличие целевых
файлов в "$InstallDir\hooks\" и в случае отсутствия тоже выводите ошибку и
завершайте процесс.

In `@scripts/install.sh`:
- Line 264: The statusline command currently builds path without quoting so
paths with spaces break; update the assignment of the statusline_cmd variable to
quote or escape the expanded INSTALL_DIR path (the statusline_cmd variable that
references $INSTALL_DIR/hooks/statusline.js) so the node invocation receives the
full path as a single argument (e.g., wrap the expanded path in quotes or use
proper shell escaping).
- Around line 173-174: Копирование хуков в install.sh не должно молча
проваливаться; правьте вызовы cp (тот, который копирует "$tmp_dir/hooks/"*.js и
тот, который копирует "$tmp_dir/hooks/hooks.json") чтобы убрать "|| true" и
добавить явную проверку наличия исходных файлов/директории (например test -e
"$tmp_dir/hooks/hooks.json" или glob non-empty) и в случае отсутствия вывести
понятную ошибку и завершить скрипт с ненулевым кодом (exit 1); используйте
переменные tmp_dir и INSTALL_DIR для поиска и сообщения об ошибке, чтобы
установка не продолжалась без обязательных JS-хуков.

In `@scripts/register-plugin.sh`:
- Line 75: STATUSLINE_CMD currently builds a command without quoting
$MARKETPLACE_PATH, so spaces in paths (e.g., $HOME) break execution; update the
assignment of STATUSLINE_CMD to quote the path to statusline.js (referencing
STATUSLINE_CMD and the statusline.js invocation that uses $MARKETPLACE_PATH) so
the node command is invoked with the full quoted path (e.g., wrap
"$MARKETPLACE_PATH/hooks/statusline.js" in quotes) ensuring proper handling of
spaces and special characters.

---

Nitpick comments:
In `@internal/mcp/server.go`:
- Around line 1029-1045: The JSON schema for the InputSchema in
internal/mcp/server.go currently doesn't require either "files" or "path", so
add an anyOf constraint to require at least one input: update the InputSchema
object (where InputSchema is defined) to include an "anyOf" array with two
entries, one requiring ["files"] and the other requiring ["path"], keeping the
existing "files" and deprecated "path" property definitions unchanged so clients
must provide either files or path.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: dfd7ebd4-0184-4073-a623-dd578b62aa15

📥 Commits

Reviewing files that changed from the base of the PR and between a06a9cf and a7d1099.

📒 Files selected for processing (34)
  • .goreleaser.yaml
  • Dockerfile
  • Makefile
  • README.md
  • cmd/hooks/post-tool-use/main.go
  • cmd/hooks/session-start/main.go
  • cmd/hooks/statusline/main.go
  • cmd/hooks/stop/main.go
  • cmd/hooks/subagent-stop/main.go
  • cmd/hooks/user-prompt/main.go
  • cmd/mcp/main.go
  • docs/arch/API_CONTRACTS.md
  • docs/arch/COMPONENTS.md
  • docs/arch/OVERVIEW.md
  • docs/arch/QUICKSTART.md
  • docs/arch/architecture.md
  • internal/config/config.go
  • internal/instincts/content.go
  • internal/mcp/server.go
  • internal/mcp/tools_instincts.go
  • internal/sessions/indexer.go
  • internal/sessions/store.go
  • internal/worker/handlers_sessions.go
  • internal/worker/sdk/processor.go
  • internal/worker/service.go
  • pkg/hooks/response.go
  • pkg/hooks/worker.go
  • pkg/hooks/worker_test.go
  • plugin/engram/hooks/lib.js
  • plugin/engram/hooks/stop.js
  • plugin/engram/hooks/sync-sessions.js
  • scripts/install.ps1
  • scripts/install.sh
  • scripts/register-plugin.sh
💤 Files with no reviewable changes (10)
  • cmd/hooks/stop/main.go
  • pkg/hooks/worker_test.go
  • cmd/hooks/subagent-stop/main.go
  • cmd/hooks/post-tool-use/main.go
  • cmd/hooks/session-start/main.go
  • pkg/hooks/response.go
  • pkg/hooks/worker.go
  • cmd/hooks/statusline/main.go
  • internal/sessions/indexer.go
  • cmd/hooks/user-prompt/main.go

Comment thread docs/arch/API_CONTRACTS.md
Comment thread docs/arch/API_CONTRACTS.md
Comment thread docs/arch/architecture.md
Comment thread docs/arch/QUICKSTART.md Outdated
Comment thread internal/instincts/content.go Outdated
Comment thread scripts/install.ps1 Outdated
Comment thread scripts/install.ps1 Outdated
Comment thread scripts/install.sh Outdated
Comment thread scripts/install.sh Outdated
Comment thread scripts/register-plugin.sh Outdated
@thebtf
Copy link
Copy Markdown
Owner Author

thebtf commented Mar 14, 2026

🤖 PR Review MCP State (auto-managed, do not edit)
{
  "version": 2,
  "parentChildren": {},
  "resolvedNitpicks": {
    "coderabbit-nitpick-cd9e0c0f-1029": {
      "resolvedAt": "2026-03-14T00:52:51.312Z",
      "resolvedBy": "agent"
    }
  },
  "updatedAt": "2026-03-14T00:52:51.840Z"
}

- docs/arch/API_CONTRACTS.md: correct hook stdout contract (JSON object,
  not plain text), update Project ID derivation docs (git remote primary,
  path hash fallback), fix hook return value descriptions throughout
- docs/arch/QUICKSTART.md: fix plugin JSON file paths to use
  ~/.claude/plugins/ subdirectory (was incorrectly showing ~/.claude/)
- docs/arch/architecture.md: rename 'Hooks (Go binaries)' to
  'Hooks (JS — plugin/engram/hooks)', remove stale 'DB Watcher' node
- internal/instincts/content.go: fix fragile YAML frontmatter parser to
  use '\n---\n' closing delimiter instead of plain '---' split
- internal/mcp/server.go: add anyOf constraint to import_instincts schema
  requiring either 'files' or 'path' parameter
- internal/mcp/tools_instincts.go: always emit deprecation warning when
  legacy path-based import is used, not only when path is non-empty
- internal/worker/handlers_sessions.go: wrap decompressed gzip stream in
  io.LimitReader to prevent gzip bomb attacks
- internal/worker/sdk/processor.go: return empty map instead of nil from
  GetFileMtimes when local verification is disabled
- plugin/engram/hooks/lib.js: fix WorkstationID() to use hostname+machine_id
  (matching server-side parser.go), not hostname+username; add ENGRAM_SESSIONS_DIR
  support to sync-sessions.js findSessionFiles()
- plugin/engram/hooks/sync-sessions.js: log errors in catch blocks instead
  of silently swallowing them; support ENGRAM_SESSIONS_DIR env override
- scripts/install.ps1: use -ErrorAction Stop for required hook file copy,
  quote statusline.js path to handle spaces in install dir
- scripts/install.sh: fail loudly if hook file copy fails, quote
  statusline.js path to handle spaces in install dir
- scripts/register-plugin.sh: quote statusline.js path to handle spaces
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
docs/arch/architecture.md (1)

234-234: ⚠️ Potential issue | 🟡 Minor

Несоответствие в разделе Data Flow.

Строка 234 всё ещё упоминает "post-tool-use hook (Go binary)", хотя диаграмма и Component Map уже обновлены на JS-хуки.

Предлагаемое исправление
-  → post-tool-use hook (Go binary)
+  → post-tool-use hook (JS — plugin/engram/hooks/post-tool-use.js)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/arch/architecture.md` at line 234, В разделе Data Flow в файле
architecture.md найдена устаревшая фраза "post-tool-use hook (Go binary)";
замените её на актуальную формулировку, согласованную с диаграммой и Component
Map — например "post-tool-use hook (JavaScript/JS hook)" — чтобы текст отражал
переход с Go на JS; проверьте в том же разделе упоминания функции/компонента
"post-tool-use hook" и обновите все вхождения описаний и примечаний, чтобы не
оставалось ссылок на Go-бинарь.
docs/arch/API_CONTRACTS.md (2)

193-243: ⚠️ Potential issue | 🟠 Major

Уберите термин BaseInput из описания JS-хуков.

Секция Hook Input уже задаёт JSON-вход через stdin, но ниже снова используется Go-термин BaseInput (Line 230, Line 236, Line 242). Это смешивает контракты и может привести к неверной реализации хуков.

💡 Предлагаемая правка
-**Input:** BaseInput + prompt text fields
+**Input:** stdin JSON (`session_id`, `cwd`) + поля текста промпта

-**Input:** BaseInput + tool name + tool input/output fields
+**Input:** stdin JSON (`session_id`, `cwd`) + `tool_name` и поля `tool_input`/`tool_output`

-**Input:** BaseInput + subagent fields
+**Input:** stdin JSON (`session_id`, `cwd`) + поля subagent-события
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/arch/API_CONTRACTS.md` around lines 193 - 243, The docs incorrectly use
the Go-specific term "BaseInput" inside the JS hook sections; replace
"BaseInput" with the previously defined "Hook Input" (or explicitly state "stdin
JSON" and list the shared fields session_id and cwd) in the user-prompt Hook,
post-tool-use Hook, and subagent-stop Hook descriptions so the contract is
consistent with the "Hook Input" section and does not mix Go terminology into
the JS hooks.

248-255: ⚠️ Potential issue | 🟠 Major

Блок stop Hook всё ещё документирован как Go-структура.

Для JS-only архитектуры здесь нужен JSON-контракт stdin, а не type Input struct { hooks.BaseInput ... }. Сейчас это противоречит остальной части документа и PR-целям.

💡 Предлагаемая правка
-```go
-// Input
-type Input struct {
-    hooks.BaseInput
-    TranscriptPath string `json:"transcript_path"` // path to session JSONL transcript
-    StopHookActive bool   `json:"stop_hook_active"`
-}
-// Return value (stdout JSON): { "continue": true }
-```
+```json
+// Input (stdin JSON)
+{
+  "session_id": "string",
+  "cwd": "string",
+  "transcript_path": "string",
+  "stop_hook_active": true
+}
+// Return value (stdout JSON): { "continue": true }
+```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/arch/API_CONTRACTS.md` around lines 248 - 255, The docs still show the
stop Hook input as a Go struct (type Input with hooks.BaseInput, TranscriptPath,
StopHookActive) which conflicts with the JS-only design; replace that Go struct
with a JSON stdin contract for the stop hook, documenting fields like
session_id, cwd, transcript_path, and stop_hook_active and keep the stdout
return example ({ "continue": true }) so readers see the JSON input/output
contract instead of Go symbols.
🧹 Nitpick comments (1)
internal/instincts/content.go (1)

22-24: Рассмотрите поддержку CRLF для Windows-файлов.

Проверка strings.HasPrefix(content, "---\n") и поиск \n---\n не обработают файлы с Windows-окончаниями строк (\r\n). Если файлы могут приходить с разных платформ, стоит нормализовать окончания строк.

Возможное исправление
 func ParseContent(name, content string) (*Instinct, error) {
+	// Normalize line endings for cross-platform compatibility
+	content = strings.ReplaceAll(content, "\r\n", "\n")
 	if !strings.HasPrefix(content, "---\n") {
 		return nil, fmt.Errorf("missing YAML frontmatter in %s", name)
 	}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/instincts/content.go` around lines 22 - 24, The prefix/search for
YAML frontmatter uses strings.HasPrefix(content, "---\n") (and later looks for
"\n---\n") which fails on Windows CRLF endings; before performing these checks
(i.e., before the strings.HasPrefix call and any subsequent search on content)
normalize line endings by replacing CRLF ("\r\n") with LF ("\n") on the content
variable so both checks correctly detect frontmatter across platforms.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@internal/worker/handlers_sessions.go`:
- Around line 628-633: handleCheckSessions reads the request body into a
CheckSessionsRequest via json.NewDecoder(r.Body).Decode without any size limit;
wrap the request body with http.MaxBytesReader (e.g. http.MaxBytesReader(w,
r.Body, maxRequestBodySize)) before decoding to prevent oversized JSON from
exhausting memory/CPU, use a named constant like maxRequestBodySize (set to an
appropriate value) and decode from the MaxBytesReader; also return a clear HTTP
413/400 error when the body exceeds the limit or decoding fails.
- Around line 524-527: Detect and handle the decompressed gzip size limit: when
wrapping the gzip reader with io.LimitReader (the local variable reader used
before calling ParseSessionReader), check after ParseSessionReader whether EOF
truncation occurred (e.g., detect that the underlying LimitReader hit its limit
or that ParseSessionReader returned a partial result due to hitting the limit)
and if so return http.StatusRequestEntityTooLarge instead of proceeding to index
partial data; update the warning/logging path that currently continues to abort
and ensure it returns 413. Also protect handleCheckSessions by wrapping r.Body
with http.MaxBytesReader (similar to handleIndexSession) before calling
json.NewDecoder(r.Body).Decode(&req) so excessively large JSON bodies are
rejected early.

---

Outside diff comments:
In `@docs/arch/API_CONTRACTS.md`:
- Around line 193-243: The docs incorrectly use the Go-specific term "BaseInput"
inside the JS hook sections; replace "BaseInput" with the previously defined
"Hook Input" (or explicitly state "stdin JSON" and list the shared fields
session_id and cwd) in the user-prompt Hook, post-tool-use Hook, and
subagent-stop Hook descriptions so the contract is consistent with the "Hook
Input" section and does not mix Go terminology into the JS hooks.
- Around line 248-255: The docs still show the stop Hook input as a Go struct
(type Input with hooks.BaseInput, TranscriptPath, StopHookActive) which
conflicts with the JS-only design; replace that Go struct with a JSON stdin
contract for the stop hook, documenting fields like session_id, cwd,
transcript_path, and stop_hook_active and keep the stdout return example ({
"continue": true }) so readers see the JSON input/output contract instead of Go
symbols.

In `@docs/arch/architecture.md`:
- Line 234: В разделе Data Flow в файле architecture.md найдена устаревшая фраза
"post-tool-use hook (Go binary)"; замените её на актуальную формулировку,
согласованную с диаграммой и Component Map — например "post-tool-use hook
(JavaScript/JS hook)" — чтобы текст отражал переход с Go на JS; проверьте в том
же разделе упоминания функции/компонента "post-tool-use hook" и обновите все
вхождения описаний и примечаний, чтобы не оставалось ссылок на Go-бинарь.

---

Nitpick comments:
In `@internal/instincts/content.go`:
- Around line 22-24: The prefix/search for YAML frontmatter uses
strings.HasPrefix(content, "---\n") (and later looks for "\n---\n") which fails
on Windows CRLF endings; before performing these checks (i.e., before the
strings.HasPrefix call and any subsequent search on content) normalize line
endings by replacing CRLF ("\r\n") with LF ("\n") on the content variable so
both checks correctly detect frontmatter across platforms.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 547da83f-0be9-49cf-b22e-f7ab3d9d900b

📥 Commits

Reviewing files that changed from the base of the PR and between a7d1099 and 55ea9b1.

📒 Files selected for processing (13)
  • docs/arch/API_CONTRACTS.md
  • docs/arch/QUICKSTART.md
  • docs/arch/architecture.md
  • internal/instincts/content.go
  • internal/mcp/server.go
  • internal/mcp/tools_instincts.go
  • internal/worker/handlers_sessions.go
  • internal/worker/sdk/processor.go
  • plugin/engram/hooks/lib.js
  • plugin/engram/hooks/sync-sessions.js
  • scripts/install.ps1
  • scripts/install.sh
  • scripts/register-plugin.sh
🚧 Files skipped from review as they are similar to previous changes (3)
  • plugin/engram/hooks/sync-sessions.js
  • internal/worker/sdk/processor.go
  • internal/mcp/server.go

Comment on lines +524 to +527
// Limit the decompressed stream to prevent gzip bomb attacks;
// the compressed-stream limit above only applies to the wire bytes.
reader = io.LimitReader(gz, maxSessionIndexBody)
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Проверка, что в gzip-пути есть явная детекция overflow (а не только LimitReader).
rg -n -C3 'io\.LimitReader\(gz,\s*maxSessionIndexBody\)|io\.LimitedReader|maxSessionIndexBody \+ 1|StatusRequestEntityTooLarge' internal/worker/handlers_sessions.go

Repository: thebtf/engram

Length of output: 361


🏁 Script executed:

#!/bin/bash
# Получить полный контекст функций handleIndexSession и handleCheckSessions
sed -n '492,610p' internal/worker/handlers_sessions.go | cat -n

Repository: thebtf/engram

Length of output: 4740


🏁 Script executed:

#!/bin/bash
# Проверить наличие и использование ParseSessionReader
rg -n 'ParseSessionReader|sessions\.Parse' internal/worker/handlers_sessions.go

Repository: thebtf/engram

Length of output: 186


🏁 Script executed:

#!/bin/bash
# Проверить, используется ли MaxBytesReader или другие лимиты в файле
rg -n 'MaxBytesReader|LimitReader|StatusRequestEntityTooLarge|413' internal/worker/handlers_sessions.go

Repository: thebtf/engram

Length of output: 176


🏁 Script executed:

#!/bin/bash
# Найти реализацию ParseSessionReader
rg -n 'func.*ParseSessionReader' internal/

Repository: thebtf/engram

Length of output: 313


🏁 Script executed:

#!/bin/bash
# Получить функцию handleCheckSessions
sed -n '618,660p' internal/worker/handlers_sessions.go | cat -n

Repository: thebtf/engram

Length of output: 1436


🏁 Script executed:

#!/bin/bash
# Проверить, используется ли MaxBytesReader в handleCheckSessions
rg -n 'MaxBytesReader|r\.Body' internal/worker/handlers_sessions.go | grep -A2 -B2 'handleCheckSessions'

Repository: thebtf/engram

Length of output: 39


🏁 Script executed:

#!/bin/bash
# Получить реализацию ParseSessionReader для анализа поведения при truncated input
sed -n '74,120p' internal/sessions/parser.go | cat -n

Repository: thebtf/engram

Length of output: 1615


🏁 Script executed:

#!/bin/bash
# Проверить, возвращает ли ParseSessionReader partial result или полностью nil при ошибке
rg -n 'meta.*error|return.*meta|struct.*SessionMeta' internal/sessions/parser.go | head -20

Repository: thebtf/engram

Length of output: 168


Нужно явно детектировать превышение лимита распакованного gzip-потока; иначе происходит молчаливая неполная индексация.

На line 526 используется io.LimitReader(gz, maxSessionIndexBody), но этот reader не сигнализирует об ошибке при достижении лимита. Когда распакованные данные превышают лимит:

  1. io.LimitReader молча обрезает поток на EOF
  2. ParseSessionReader возвращает partial результат с error
  3. Код логирует warning на line 541 и продолжает индексировать неполные данные
  4. Клиент получает статус 200 вместо 413

Требуется явно проверять, был ли достигнут лимит распакованного потока, и возвращать http.StatusRequestEntityTooLarge вместо индексации partial данных.


Добавить ограничение размера входного тела в handleCheckSessions.

На line 629 json.NewDecoder(r.Body).Decode(&req) используется без http.MaxBytesReader, что создаёт DoS-риск. Клиент может отправить huge JSON, потребляя неограниченно память/CPU. Требуется применить http.MaxBytesReader перед декодированием, аналогично handleIndexSession.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/worker/handlers_sessions.go` around lines 524 - 527, Detect and
handle the decompressed gzip size limit: when wrapping the gzip reader with
io.LimitReader (the local variable reader used before calling
ParseSessionReader), check after ParseSessionReader whether EOF truncation
occurred (e.g., detect that the underlying LimitReader hit its limit or that
ParseSessionReader returned a partial result due to hitting the limit) and if so
return http.StatusRequestEntityTooLarge instead of proceeding to index partial
data; update the warning/logging path that currently continues to abort and
ensure it returns 413. Also protect handleCheckSessions by wrapping r.Body with
http.MaxBytesReader (similar to handleIndexSession) before calling
json.NewDecoder(r.Body).Decode(&req) so excessively large JSON bodies are
rejected early.

Comment on lines +628 to +633
var req CheckSessionsRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "invalid request body", http.StatusBadRequest)
return
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Добавьте лимит размера тела запроса в handleCheckSessions.

На Line 629 тело читается без http.MaxBytesReader. Большой JSON с session_ids может привести к избыточному потреблению памяти/CPU.

💡 Предложение по исправлению
 func (s *Service) handleCheckSessions(w http.ResponseWriter, r *http.Request) {
@@
-	var req CheckSessionsRequest
-	if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+	body := http.MaxBytesReader(w, r.Body, maxSessionIndexBody)
+	defer func() { _ = body.Close() }()
+
+	var req CheckSessionsRequest
+	if err := json.NewDecoder(body).Decode(&req); err != nil {
 		http.Error(w, "invalid request body", http.StatusBadRequest)
 		return
 	}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
var req CheckSessionsRequest
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, "invalid request body", http.StatusBadRequest)
return
}
body := http.MaxBytesReader(w, r.Body, maxSessionIndexBody)
defer func() { _ = body.Close() }()
var req CheckSessionsRequest
if err := json.NewDecoder(body).Decode(&req); err != nil {
http.Error(w, "invalid request body", http.StatusBadRequest)
return
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@internal/worker/handlers_sessions.go` around lines 628 - 633,
handleCheckSessions reads the request body into a CheckSessionsRequest via
json.NewDecoder(r.Body).Decode without any size limit; wrap the request body
with http.MaxBytesReader (e.g. http.MaxBytesReader(w, r.Body,
maxRequestBodySize)) before decoding to prevent oversized JSON from exhausting
memory/CPU, use a named constant like maxRequestBodySize (set to an appropriate
value) and decode from the MaxBytesReader; also return a clear HTTP 413/400
error when the body exceeds the limit or decoding fails.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant