fix: filter binary asset paths and numeric segments from concept extraction #310

Merged
CalebisGross merged 1 commit into main from fix/theme-path-leakage on Mar 21, 2026

@CalebisGross
Collaborator

Summary

  • Expand the skip list in FromPath to filter asset directories (images, icons, fonts), documentation directories (docs), and generic noise segments (bytes, data, cache)
  • Add isNumeric filter to drop dimension-like segments (96x96, 512x512)
  • Simplify FromEventType using strings.CutPrefix (lint fix)
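
For reviewers unfamiliar with strings.CutPrefix (Go 1.20+), the lint fix in the last bullet follows roughly this shape. This is a minimal sketch, not the actual FromEventType: the function name fromEventType and the "file_" prefix are assumptions made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// fromEventType is a hypothetical stand-in for FromEventType.
// The "file_" prefix is an assumption, not taken from the repository.
//
// Before the lint fix, code of this kind typically read:
//
//	if strings.HasPrefix(eventType, "file_") {
//		return strings.TrimPrefix(eventType, "file_")
//	}
//
// strings.CutPrefix does the check and the trim in one call.
func fromEventType(eventType string) string {
	if rest, ok := strings.CutPrefix(eventType, "file_"); ok {
		return rest
	}
	return eventType
}

func main() {
	fmt.Println(fromEventType("file_modified")) // modified
	fmt.Println(fromEventType("commit"))        // commit
}
```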

Before: themes included docs/images/mnemonic.png, bytes, 96x96
After: themes include only mnemonic, favicon, etc.

Test plan

  • 5 new test cases for binary asset paths, noise segments, and dimensions
  • All existing tests pass (no regressions on code paths)
  • make build && make check clean

Fixes #305

🤖 Generated with Claude Code

…action

Expand the skip list in FromPath to filter out asset directories (images,
icons, fonts), documentation directories (docs), and generic noise
segments (bytes, data, cache). Add isNumeric filter to drop dimension-like
segments (96x96, 512x512) from themes.

Fixes #305

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CalebisGross merged commit 63fa4ef into main on Mar 21, 2026
@CalebisGross deleted the fix/theme-path-leakage branch on March 21, 2026 05:37
@CalebisGross
Collaborator Author

Not a bug — the watcher works correctly. Investigation revealed:

  1. touch produces CHMOD events, which sendEvent() intentionally drops (only Create/Write/Remove/Rename are meaningful)
  2. Real file edits (echo, sed, editor saves) produce Write events that flow through the full pipeline
  3. The heuristic filter correctly rejects temp files (empty content) while passing real edits
  4. context_boost works end-to-end: daemon watcher → activity tracker → /api/v1/activity → MCP sync → retrieval scoring

Verified context_boost: 0.199 in MCP recall after editing internal/agent/retrieval/agent.go.

The initial test used only touch commands, which masked the fact that the pipeline was working. No code change needed.
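
The event filtering described in point 1 can be illustrated with a small sketch. It is not the daemon's sendEvent: the Op type below is a local bitmask loosely modeled on the operation flags that file-watching libraries such as fsnotify expose, defined here so the example runs without dependencies, and the function name meaningful is hypothetical. How the real code treats combined flags is also an assumption.

```go
package main

import "fmt"

// Op is a local stand-in for a file watcher's operation bitmask.
type Op uint32

const (
	Create Op = 1 << iota
	Write
	Remove
	Rename
	Chmod
)

// meaningful reports whether op carries at least one of Create, Write,
// Remove, or Rename. A pure Chmod event, such as the one touch produces
// on an existing file, is dropped.
func meaningful(op Op) bool {
	return op&(Create|Write|Remove|Rename) != 0
}

func main() {
	fmt.Println(meaningful(Chmod)) // false: touch on an existing file
	fmt.Println(meaningful(Write)) // true: echo, sed, editor save
}
```

This is why a test that only runs touch sees no activity, while a real edit reaches the activity tracker.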

Development

Successfully merging this pull request may close these issues.

fix: theme path leakage for binary assets in get_context
