feat: refine MCP tool descriptions per R79 (<1024 chars, Use when/Don't use when) #223
Conversation
📝 Walkthrough

The PR updates the MCP server's top-level instructions and revises tool descriptions for nine tools to clarify usage boundaries and parameter expectations. It also introduces a new test module that validates tool descriptions against quality standards (character limits and required text patterns). No logic or control flow changes were made.
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ 2 passed | ❌ 1 failed (1 warning)
✏️ Tip: You can configure your own custom pre-merge checks in the settings.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_tool_description_quality.py`:
- Around line 15-17: The helper _tool_descriptions repeatedly calls
asyncio.run(list_tools()); replace this with a cached result to avoid multiple
event loop runs—either compute a module-level constant (e.g., TOOL_DESCRIPTIONS)
by running asyncio.run(list_tools()) once and building the {tool.name:
tool.description} dict at import time, or make _tool_descriptions a cached/lazy
function (e.g., functools.lru_cache or a module-level memo) that calls
list_tools only on first invocation; update references to use the cached dict
and keep the function name _tool_descriptions (or return the cached dict) so
tests continue to work.
- Around line 5-12: The ACTIVE_TOOL_NAMES set in
tests/test_tool_description_quality.py is missing recently added active tools;
update the set literal ACTIVE_TOOL_NAMES to include "brain_supersede",
"brain_archive", and "brain_enrich" so the test matches the non-deprecated tools
defined in src/brainlayer/mcp/__init__.py (verify names exactly match the
function/class identifiers used there).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 1fd08e05-31c2-43c3-a0ab-8d5524bb5bd7
📒 Files selected for processing (2)
- src/brainlayer/mcp/__init__.py
- tests/test_tool_description_quality.py
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: test (3.11)
- GitHub Check: test (3.12)
- GitHub Check: test (3.13)
🧰 Additional context used
📓 Path-based instructions (2)
**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
**/*.py: Flag risky DB or concurrency changes explicitly and do not hand-wave lock behavior
Enforce one-write-at-a-time concurrency constraint; reads are safe but brain_digest is write-heavy and must not run in parallel with other MCP work
Run pytest before claiming behavior changed safely; current test suite has 929 tests
**/*.py: Use `paths.py:get_db_path()` for all database path resolution; all scripts and CLI must use this function rather than hardcoding paths
When performing bulk database operations: stop enrichment workers first, checkpoint WAL before and after, drop FTS triggers before bulk deletes, batch deletes in 5-10K chunks, and checkpoint every 3 batches
Files:
tests/test_tool_description_quality.py
src/brainlayer/mcp/__init__.py
src/brainlayer/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
src/brainlayer/**/*.py: Use retry logic on `SQLITE_BUSY` errors; each worker must use its own database connection to handle concurrency safely
Classification must preserve `ai_code`, `stack_trace`, and `user_message` verbatim; skip `noise` entries entirely and summarize `build_log` and `dir_listing` entries (structure only)
Use AST-aware chunking via tree-sitter; never split stack traces; mask large tool output
For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via `enrichment_controller.py`, and Ollama as offline last-resort; allow override via `BRAINLAYER_ENRICH_BACKEND` env var
Configure enrichment rate via `BRAINLAYER_ENRICH_RATE` environment variable (default 0.2 = 12 RPM)
Implement chunk lifecycle columns: `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search; allow `include_archived=True` to show history
Implement `brain_supersede` with safety gate for personal data (journals, notes, health/finance); use soft-delete for `brain_archive` with timestamp
Add `supersedes` parameter to `brain_store` for atomic store-and-replace operations
Run linting and formatting with: `ruff check src/ && ruff format src/`
Run tests with `pytest`
Use `PRAGMA wal_checkpoint(FULL)` before and after bulk database operations to prevent WAL bloat
Files:
src/brainlayer/mcp/__init__.py
🧠 Learnings (13)
📓 Common learnings
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-03-17T01:04:22.497Z
Learning: Applies to src/brainlayer/mcp/**/*.py and brain-bar/Sources/BrainBar/MCPRouter.swift: The 8 required MCP tools are `brain_search`, `brain_store`, `brain_recall`, `brain_entity`, `brain_expand`, `brain_update`, `brain_digest`, `brain_tags`. `brain_tags` is the 8th tool, replacing `brain_get_person`, as defined in the Phase B spec merged in PR `#72`. The Python MCP server already implements `brain_tags`. Legacy `brainlayer_*` aliases must be maintained for backward compatibility.
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/mcp/*.py : MCP tools include: brain_search, brain_store, brain_recall, brain_entity, brain_expand, brain_update, brain_digest, brain_get_person, brain_tags, brain_supersede, brain_archive (legacy brainlayer_* aliases still supported)
📚 Learning: 2026-03-17T01:04:22.497Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 0
File: :0-0
Timestamp: 2026-03-17T01:04:22.497Z
Learning: Applies to src/brainlayer/mcp/**/*.py and brain-bar/Sources/BrainBar/MCPRouter.swift: The 8 required MCP tools are `brain_search`, `brain_store`, `brain_recall`, `brain_entity`, `brain_expand`, `brain_update`, `brain_digest`, `brain_tags`. `brain_tags` is the 8th tool, replacing `brain_get_person`, as defined in the Phase B spec merged in PR `#72`. The Python MCP server already implements `brain_tags`. Legacy `brainlayer_*` aliases must be maintained for backward compatibility.
Applied to files:
tests/test_tool_description_quality.py
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T01:24:44.281Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/mcp/*.py : MCP tools include: brain_search, brain_store, brain_recall, brain_entity, brain_expand, brain_update, brain_digest, brain_get_person, brain_tags, brain_supersede, brain_archive (legacy brainlayer_* aliases still supported)
Applied to files:
tests/test_tool_description_quality.py
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-02T10:56:45.869Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-02T10:56:45.869Z
Learning: BrainLayer serves as a local, private knowledge base for Claude Code sessions with fast search and context retrieval
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Memory tools: always use `brain_search` before answering questions about project history, architecture, or past decisions; use `brain_store` after making decisions, hitting bugs, or receiving corrections; don't rely solely on hook-injected context
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-04T15:21:39.570Z
Learnt from: EtanHey
Repo: EtanHey/brainlayer PR: 198
File: hooks/brainlayer-prompt-search.py:169-169
Timestamp: 2026-04-04T15:21:39.570Z
Learning: In EtanHey/brainlayer, `hooks/brainlayer-prompt-search.py` reads `entity_type` directly from existing rows in `kg_entities` (read-only). `contracts/entity-types.yaml` defines the write-side schema only and is not authoritative for what `entity_type` values exist in the DB. The DB already stores `technology` (72 entities), `project` (24), and `tool` (1) as valid `entity_type` values, so `INJECT_TYPES` in the hook should match these DB values, not the contract file.
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Add `supersedes` parameter to `brain_store` for atomic store-and-replace operations
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-04T23:24:03.159Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-04T23:24:03.159Z
Learning: Applies to src/brainlayer/{vector_store,search}*.py : Chunk lifecycle: implement columns `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Implement `brain_supersede` with safety gate for personal data (journals, notes, health/finance); use soft-delete for `brain_archive` with timestamp
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : Implement chunk lifecycle columns: `superseded_by`, `aggregated_into`, `archived_at` on chunks table; exclude lifecycle-managed chunks from default search; allow `include_archived=True` to show history
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T22:38:15.693Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T22:38:15.693Z
Learning: Applies to src/brainlayer/mcp/brain_supersede.py : `brain_supersede` tool must include safety gate for personal data (journals, notes, health/finance)
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-01T01:24:44.281Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-01T01:24:44.281Z
Learning: Applies to src/brainlayer/*enrichment*.py : Enrichment backend priority: Groq (primary/cloud) → Gemini (fallback) → Ollama (offline last-resort), configurable via `BRAINLAYER_ENRICH_BACKEND` environment variable
Applied to files:
src/brainlayer/mcp/__init__.py
📚 Learning: 2026-04-06T08:40:13.531Z
Learnt from: CR
Repo: EtanHey/brainlayer PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-04-06T08:40:13.531Z
Learning: Applies to src/brainlayer/**/*.py : For enrichment backend selection: use Groq as primary backend (cloud, configured in launchd plist), Gemini as fallback via `enrichment_controller.py`, and Ollama as offline last-resort; allow override via `BRAINLAYER_ENRICH_BACKEND` env var
Applied to files:
src/brainlayer/mcp/__init__.py
🔇 Additional comments (13)
src/brainlayer/mcp/__init__.py (10)
88-101: Server instructions are well-structured with clear routing guidance. The updated `instructions` string provides good primary routing guidance with "Use when:" and "Don't use when:" patterns for the top three tools. The cross-references between sibling tools are helpful for LLM routing decisions.
294-294: brain_search description follows R79 pattern correctly. The description includes "Use when:", "Don't use when:", parameter hints, and sibling tool cross-references. The guidance is clear and actionable.
434-434: brain_store description is comprehensive with proper lifecycle guidance. The description correctly mentions the `supersedes` parameter for atomic store-and-replace operations, which aligns with the coding guidelines. Cross-references to `brain_archive` and `brain_supersede` are accurate.
536-536: brain_get_person description is well-structured. Clear guidance for person-specific context retrieval with appropriate cross-references to sibling tools.
563-563: brain_recall description effectively explains the unified routing entry point. The description clearly explains mode detection and provides appropriate guidance for when to use direct tools (brain_search) vs. the routing entry point.
699-699: brain_digest description clarifies mode differences well. The distinction between `mode='digest'`, `mode='connect'`, and `mode='enrich'` is clear, with appropriate guidance on when to use `brain_store` for quick notes instead.
740-740: brain_entity description provides clear entity lookup guidance. Good distinction between fuzzy topical search (brain_search) and structured entity graph lookup, with appropriate cross-reference to brain_get_person for person-specific needs.
902-902: brain_supersede description correctly documents the safety gate. The description appropriately mentions `safety_check='confirm'` for personal data, aligning with the coding guideline requirement for safety gates on personal data (journals, notes, health/finance).
933-933: brain_archive description correctly explains soft-delete semantics. The description aligns with the coding guideline to use soft-delete with timestamp for `brain_archive`.
953-953: brain_enrich description is comprehensive with mode and scope guidance. Clear explanation of enrichment modes (realtime, batch, local) and scope-narrowing parameters (limit, since_hours, chunk_ids). The cross-references to sibling tools are accurate.
tests/test_tool_description_quality.py (3)
20-24: Character limit test covers all tools correctly. Good approach to check all tool descriptions (including deprecated ones) against the 1024 character limit for Azure OpenAI compatibility.
27-32: Test structure is good with clear assertions. The test correctly verifies both tool existence and the "Use when:" pattern presence.
35-41: Flexible pattern matching for negative guidance is appropriate. The dual check for "Don't use when:" or "Does NOT" allows reasonable flexibility in how tools document their non-applicable scenarios.
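The three checks described in these review comments can be sketched as a single validation helper. This is an illustrative sketch, not the actual test module: names like `check_descriptions` and the exact pattern strings are assumptions based on the review comments above.

```python
# Sketch of the description-quality checks discussed in the review.
# MAX_DESCRIPTION_CHARS reflects the 1024-char Azure OpenAI limit;
# the function name and return shape are hypothetical.
MAX_DESCRIPTION_CHARS = 1024


def check_descriptions(descriptions: dict[str, str], active: set[str]) -> list[str]:
    """Return a list of quality violations for the given tool descriptions."""
    problems: list[str] = []
    # All tools (including deprecated ones) must fit the character limit.
    for name, desc in descriptions.items():
        if len(desc) > MAX_DESCRIPTION_CHARS:
            problems.append(f"{name}: description exceeds {MAX_DESCRIPTION_CHARS} chars")
    # Active tools must carry positive and negative usage guidance.
    for name in sorted(active):
        desc = descriptions.get(name)
        if desc is None:
            problems.append(f"{name}: tool not found")
        elif "Use when:" not in desc:
            problems.append(f"{name}: missing 'Use when:' guidance")
        elif "Don't use when:" not in desc and "Does NOT" not in desc:
            problems.append(f"{name}: missing negative guidance")
    return problems
```

A real pytest module would assert that this list is empty for the server's actual descriptions.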
```python
ACTIVE_TOOL_NAMES = {
    "brain_search",
    "brain_store",
    "brain_recall",
    "brain_entity",
    "brain_digest",
    "brain_get_person",
}
```
ACTIVE_TOOL_NAMES is missing recently added active tools.
Based on the tool descriptions in src/brainlayer/mcp/__init__.py, the following tools also have "Use when:" and "Don't use when:" guidance and are not deprecated: brain_supersede, brain_archive, and brain_enrich. These should be included in ACTIVE_TOOL_NAMES for comprehensive test coverage.
🔧 Proposed fix to include all active tools

```diff
 ACTIVE_TOOL_NAMES = {
     "brain_search",
     "brain_store",
     "brain_recall",
     "brain_entity",
     "brain_digest",
     "brain_get_person",
+    "brain_supersede",
+    "brain_archive",
+    "brain_enrich",
 }
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```python
ACTIVE_TOOL_NAMES = {
    "brain_search",
    "brain_store",
    "brain_recall",
    "brain_entity",
    "brain_digest",
    "brain_get_person",
    "brain_supersede",
    "brain_archive",
    "brain_enrich",
}
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_tool_description_quality.py` around lines 5 - 12, The
ACTIVE_TOOL_NAMES set in tests/test_tool_description_quality.py is missing
recently added active tools; update the set literal ACTIVE_TOOL_NAMES to include
"brain_supersede", "brain_archive", and "brain_enrich" so the test matches the
non-deprecated tools defined in src/brainlayer/mcp/__init__.py (verify names
exactly match the function/class identifiers used there).
```python
def _tool_descriptions() -> dict[str, str]:
    tools = asyncio.run(list_tools())
    return {tool.name: tool.description for tool in tools}
```
🧹 Nitpick | 🔵 Trivial
Consider caching tool descriptions to avoid repeated asyncio.run() calls. The _tool_descriptions() helper is called once per test, resulting in three asyncio.run() calls. While this works, caching the result at module level would be more efficient.
♻️ Optional: Cache tool descriptions

```diff
+import functools
 import asyncio

 from brainlayer.mcp import list_tools

 ACTIVE_TOOL_NAMES = {
     "brain_search",
     "brain_store",
     "brain_recall",
     "brain_entity",
     "brain_digest",
     "brain_get_person",
 }

+@functools.lru_cache(maxsize=1)
 def _tool_descriptions() -> dict[str, str]:
     tools = asyncio.run(list_tools())
     return {tool.name: tool.description for tool in tools}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/test_tool_description_quality.py` around lines 15 - 17, The helper
_tool_descriptions repeatedly calls asyncio.run(list_tools()); replace this with
a cached result to avoid multiple event loop runs—either compute a module-level
constant (e.g., TOOL_DESCRIPTIONS) by running asyncio.run(list_tools()) once and
building the {tool.name: tool.description} dict at import time, or make
_tool_descriptions a cached/lazy function (e.g., functools.lru_cache or a
module-level memo) that calls list_tools only on first invocation; update
references to use the cached dict and keep the function name _tool_descriptions
(or return the cached dict) so tests continue to work.
Summary
Context
R79 research found: "Tool descriptions are the highest-leverage fix. Anthropic's team states: description refinements alone achieved SWE-bench SOTA." PR #212 did a first pass; this refines per R79's specific guidance.
Test plan
- `pytest tests/test_tool_description_quality.py -v` — 3 tests
- `ruff check src/ tests/` — clean

🤖 Generated with Claude Code
Note
Refine MCP tool descriptions to include 'Use when'/'Don't use when' guidance under 1024 chars
Updates tool descriptions (`brain_search`, `brain_store`, `brain_recall`, `brain_digest`, `brain_entity`, etc.) in `__init__.py` to follow a structured format with explicit "Use when:" and "Don't use when:" sections.
Macroscope summarized a89e3e0.
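For illustration, a description following the structured format this PR adopts might look like the following. This is a hypothetical example, not the actual `brain_search` description shipped in the PR:

```python
# Hypothetical tool description following the "Use when / Don't use when"
# pattern; the wording and parameter list are illustrative only.
BRAIN_SEARCH_DESCRIPTION = (
    "Search the local knowledge base with hybrid semantic and keyword search. "
    "Use when: answering questions about project history, architecture, or "
    "past decisions. "
    "Don't use when: storing new knowledge (use brain_store) or looking up a "
    "specific entity (use brain_entity). "
    "Params: query (required), limit (default 10)."
)
```

Such a description stays well under the 1024-character limit while giving the model explicit routing guidance.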
Summary by CodeRabbit
- Documentation
- Tests