fix: post-merge cleanup — vault refactor + secret detection#4
Conversation
Replace global vaultOnce/sharedVault/vaultInitErr with Server struct fields. Vault is still lazily initialized on first credential tool use, but now scoped to the server instance for better test isolation.
Wire existing privacy.RedactSecrets() into store_memory handler and hook event ingestion (handlers_ingest.go). Warn-and-redact pattern: log detection at WARN level, redact secrets before storage. Resolves TECHNICAL_DEBT.md P1: privacy/secrets.go dead code activation.
|
Caution Review failedPull request was closed or merged during review WalkthroughДобавлена поддержка хранилища (vault) к серверу MCP путем переноса состояния из глобальных переменных в поля структуры Server. Реализована обнаружение и удаление секретных данных в нескольких точках обработки: при сохранении памяти и при приеме событий. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 golangci-lint (2.5.0)Error: can't load config: unsupported version of the configuration: "" See https://golangci-lint.run/docs/product/migration-guide for migration instructions Comment |
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the system's robustness and security posture. It improves the testability of the vault component by isolating its state within the server instance and strengthens data privacy by implementing proactive secret redaction at critical data entry points. These changes ensure that sensitive information is handled more securely and that the system's architecture is more maintainable. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces two significant improvements. First, it refactors the vault initialization by moving it from a global singleton to a field within the Server struct. This is an excellent change that improves encapsulation and test isolation. Second, it activates secret detection and redaction at key data ingestion points (store_memory and ingestEvent), changing the behavior from rejection to a more robust warn-and-redact pattern. This enhances data privacy without causing data loss. The changes are well-implemented. I have one suggestion to reduce code duplication in the secret redaction logic.
| // Redact secrets from tool input/result before any pipeline processing. | ||
| if privacy.ContainsSecrets(toolInputStr) { | ||
| log.Warn().Str("tool", req.ToolName).Msg("ingest: tool_input contains secrets — redacting before pipeline processing") | ||
| toolInputStr = privacy.RedactSecrets(toolInputStr) | ||
| } | ||
| if privacy.ContainsSecrets(toolResultStr) { | ||
| log.Warn().Str("tool", req.ToolName).Msg("ingest: tool_result contains secrets — redacting before pipeline processing") | ||
| toolResultStr = privacy.RedactSecrets(toolResultStr) | ||
| } |
There was a problem hiding this comment.
The logic for redacting secrets in toolInputStr and toolResultStr is duplicated. You can refactor this into a local helper function to improve readability and maintainability. This would also make it easier to add redaction for more fields in the future.
| // Redact secrets from tool input/result before any pipeline processing. | |
| if privacy.ContainsSecrets(toolInputStr) { | |
| log.Warn().Str("tool", req.ToolName).Msg("ingest: tool_input contains secrets — redacting before pipeline processing") | |
| toolInputStr = privacy.RedactSecrets(toolInputStr) | |
| } | |
| if privacy.ContainsSecrets(toolResultStr) { | |
| log.Warn().Str("tool", req.ToolName).Msg("ingest: tool_result contains secrets — redacting before pipeline processing") | |
| toolResultStr = privacy.RedactSecrets(toolResultStr) | |
| } | |
| // Redact secrets from tool input/result before any pipeline processing. | |
| redactAndLog := func(s, fieldName string) string { | |
| if privacy.ContainsSecrets(s) { | |
| log.Warn().Str("tool", req.ToolName).Msgf("ingest: %s contains secrets — redacting before pipeline processing", fieldName) | |
| return privacy.RedactSecrets(s) | |
| } | |
| return s | |
| } | |
| toolInputStr = redactAndLog(toolInputStr, "tool_input") | |
| toolResultStr = redactAndLog(toolResultStr, "tool_result") |
…ncation Extract splitIntoChunks, inferType, loadMarker, saveMarker, discoverMemoryFiles, safeReadFile, fileExists into utils/memory-files.ts for reuse by both memory-migrate tool and file-watcher service. Remove artificial 900-char content truncation on import — engram server stores full TEXT without limits. Fix BUG #4: dry run output now shows summary + first 5 chunks instead of flooding with all chunks.
…ncation Extract splitIntoChunks, inferType, loadMarker, saveMarker, discoverMemoryFiles, safeReadFile, fileExists into utils/memory-files.ts for reuse by both memory-migrate tool and file-watcher service. Remove artificial 900-char content truncation on import — engram server stores full TEXT without limits. Fix BUG #4: dry run output now shows summary + first 5 chunks instead of flooding with all chunks.
* feat: accept agent_id param in context endpoints
Allow OpenClaw agents (which lack filesystem context) to use agent_id
as project scope in handleSearchByPrompt and handleContextInject.
Backward-compatible: existing Claude Code hooks using project param
are unaffected.
* feat: OpenClaw engram plugin — persistent memory for AI gateway agents
TypeScript plugin connecting OpenClaw's AI gateway to engram via REST API.
Core features:
- Dual-tier context injection (session-level static + per-turn dynamic)
- Automatic self-learning via tool event ingestion and transcript backfill
- 3-strike passive availability detection with 60s cooldown
- agentId-first identity with git remote fallback
- Zod-validated config with JSON Schema export
Hooks: session_start, before_prompt_build, after_tool_call,
before_compaction, session_end
Tools: engram_search, engram_remember, engram_decisions
Commands: /memory (status), /remember (quick store)
Context formatter ported from plugin/engram/hooks/user-prompt.js:
credential filter, Jaccard title dedup, type grouping, token budget,
XML-tag rendering.
* refactor: align OpenClaw engram plugin with real SDK types
- Rewrite types/openclaw.ts with verified SDK shapes: register(api),
AnyAgentTool, ToolFactory, PluginLogger, 24 hook names, CliProgram
- Rewrite index.ts: id-based plugin, api.on() hooks, tool factory
pattern, registerCli for 'memory' CLI subcommands
- Refactor 3 tool files to factory pattern with TypeBox schemas
- Add memory_search (alias), memory_store (text/content compat),
memory_forget (bulk delete), memory_get (dual-mode: fs + engram)
- Update 5 hooks: optional event fields, PluginLogger injection
- Update 2 commands: OpenClawPluginCommandDefinition, name without slash
- Add @sinclair/typebox dep, move zod to devDependencies
- Add bulkDelete method to EngramRestClient
- Version bump to 0.2.0, manifest updated with 7 tools + CLI section
* feat: add memory_migrate tool for importing local memory files
Adds memory_migrate tool, /migrate slash command, and `openclaw memory migrate`
CLI subcommand. Discovers MEMORY.md + memory/**/*.md, splits by ## headers,
bulk-imports into engram with SHA256-based idempotency marker.
* fix: address code review findings in memory_migrate
- Path traversal: validate resolved path stays within workspaceDir
- Marker idempotency: only save marker on full success, not partial failure
- Symlink safety: skip symlinks + max depth guard in findMdFiles
- Code fence awareness: track fenced blocks to avoid false ## splits
- Truncation visibility: warn about content truncation in dry-run and report
* fix: address remaining low-priority review findings
- Schema drift guard: compile-time check that Zod and TypeBox field names match
- Marker race condition: atomic write via temp file + rename
- CLI tool caching: reuse migrate tool instance instead of recreating per call
* fix: correct plugin discovery — rename to "engram" and add openclaw extensions entry
OpenClaw discovery ignores package.json "main" field. It uses:
1. package.json "openclaw.extensions" array for entry point resolution
2. Fallback to index.ts/index.js in root
Added "openclaw": { "extensions": ["dist/index.js"] } and renamed
package name from "openclaw-engram-plugin" to "engram" so idHint
matches manifest id and config key.
* fix: make register() synchronous and use SDK command handler signature
Bug 1: register() was async — OpenClaw ignores Promise returns from
register(), logging "async registration is ignored". Made synchronous.
Bug 2: Commands used { execute(args, ctx) } shape but SDK expects
{ handler(ctx) } where handler is typeof "function". SDK checks
`typeof command.handler !== "function"` and rejects with
"Command handler must be a function".
Also updated PluginCommandContext and PluginCommandResult types
to match real SDK (ctx.args is raw string, result uses { text }
not { output }).
* fix: resolve workspaceDir for migrate command via api.resolvePath
* fix: use config.workspaceDir with ~/.openclaw/workspace fallback for migrate command
* refactor: extract shared memory file utilities and remove content truncation
Extract splitIntoChunks, inferType, loadMarker, saveMarker, discoverMemoryFiles,
safeReadFile, fileExists into utils/memory-files.ts for reuse by both
memory-migrate tool and file-watcher service.
Remove artificial 900-char content truncation on import — engram server
stores full TEXT without limits. Fix BUG #4: dry run output now shows
summary + first 5 chunks instead of flooding with all chunks.
* feat: add file watcher service for live memory sync
Add chokidar-based FileWatcherService that watches MEMORY.md and
memory/**/*.md in the agent workspace. On file changes, debounces
1500ms, computes SHA256 hash, and bulk-imports new/changed chunks
into engram. Reuses shared chunking and marker logic from
utils/memory-files.ts.
Replaces memory-core's built-in chokidar watcher, enabling engram
to fully occupy the exclusive memory plugin slot.
Review fixes applied:
- stopped flag prevents callbacks after stop()
- inFlight collision reschedules instead of dropping events
- projectId cached in constructor (avoids git execSync per sync)
* fix: use service.id instead of service.name to match SDK expectation
* fix: wrap bulkImport body in {project, observations} to match server API
* fix: replace invalid observation type 'context' with 'change'
Engram API only accepts: bugfix, feature, refactor, discovery, decision, change.
The plugin was using 'context' as default type which caused bulk-import to reject
observations with "invalid type 'context'".
* fix: address code review findings from CodeRabbit and Gemini
- extractOrigin preserves URL subpath (client.ts)
- Replace hardcoded 'cli' project fallback with workspace-derived ID
- force flag no longer wipes migration marker state (memory-migrate.ts)
- Path traversal protection via relative() instead of startsWith
- Use event.sessionId instead of agentId for session tracking
- Enforce search limit parameter (engram-search.ts)
- Restrict memory_get to .md files only (memory-get.ts)
- Safe serialization + .catch() for fire-and-forget calls
- Fix tool name references in formatter reminder text
- Validate URL scheme is http/https (config.ts)
- Log warning on marker save failure (memory-files.ts)
* fix: address code review findings from CodeRabbit and Gemini (round 2)
- package.json: align @types/node to ^18.19.0 to match engines.node >=18
- client.ts: return null instead of {} for empty response body; fix extractOrigin fallback to preserve path segments
- identity.ts: add module-level memoization cache for getGitRemoteID to avoid blocking execSync on hot paths; derive projectId from remote URL (not local dir name) for stable cross-checkout IDs
- formatter.ts: fix token budget bypass — oversized observations now skipped with continue instead of always inserting the first entry
- before-prompt-build.ts: wrap searchContext in try/catch to prevent hook lifecycle disruption on network errors
- session-end.ts: use event.sessionId only (no agentId fallback) to avoid conflating backfill across sessions
- availability.ts: fix recordSuccess wasUnavailable detection (use unavailableSince directly, not isAvailable); fix recordFailure to re-trip circuit in half-open state
- openclaw.plugin.json: add uiHints sensitive:true to token field
---------
Co-authored-by: Kirill Turanskiy <thebtf@users.noreply.github.com>
Summary
sync.OncetoServerstruct field for test isolationprivacy.RedactSecrets()atstore_memoryand hook event ingestion boundaries (warn-and-redact pattern)Changes
internal/mcp/server.go: Addvault,vaultInitErr,vaultOncefields to Serverinternal/mcp/tools_credential.go: ConvertgetVault()from free function to Server method, remove globalsinternal/mcp/tools_memory.go: Add secret redaction before storeinternal/worker/handlers_ingest.go: Add secret redaction for tool input/result stringsTECHNICAL_DEBT.md: UpdatedTest plan
go build ./internal/mcp/ ./internal/worker/ ./cmd/worker/passesgo vetclean on modified packagesSummary by CodeRabbit
Примечания к выпуску
Улучшения безопасности
Управление учетными данными