docs(codex): add MCP transport troubleshooting and restart policy #166
Xangel0s wants to merge 1 commit into Gentleman-Programming:main
Pull request overview
This PR adds documentation to help users recover from Codex MCP stdio failures (notably Transport closed) when using Engram memory tools, and proposes a restart policy after binary/config changes.
Changes:
- Added a Codex-specific troubleshooting section for `Transport closed` in `docs/AGENT-SETUP.md`.
- Added MCP transport troubleshooting guidance to `README_CONTEXT.md` (alongside a broader “Context Bus Manual” document).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| README_CONTEXT.md | Introduces a new operator/agent context manual and includes an MCP transport troubleshooting section. |
| docs/AGENT-SETUP.md | Adds Codex troubleshooting steps and a restart policy note for stale MCP stdio sessions. |
> ## API Endpoints
>
> ### POST /remember
>
> Stores one context pill in `context_logs`.
>
> Required fields:
> - `agent_source`: must be one of `codex`, `gemini`, `system`.
> - `content`: the context pill text.
The HTTP API described here (POST /remember with agent_source/content + context_logs table) does not match Engram’s implemented HTTP server. The server routes are /sessions, /observations, /search, /timeline, /context, etc. (see internal/server/server.go routes), and there is no /remember endpoint or context_logs table in the current schema. Please update this section to reflect the actual API, or explicitly label it as a separate/legacy service so readers aren’t misled.
> ### GET /recall
>
> Returns enriched recall payload with:
> - `focused_observation`: single focus context log.
> - `timeline`: chronological surrounding entries (up to 5 before + 5 after).
>
> Query params:
> - `q` optional: partial match search over `content`.
> - `agent_source` optional: filter to one source (`codex`, `gemini`, `system`).
> - `mode` optional: `compact` (default) or `full`.
> - `recall_profile` optional: `lean`, `balanced`, `deep` (native presets).
> - `timeline_limit` optional: number of items before/after focus (default 5, max 20).
> - `max_chars` optional: compact preview char budget (default 180, min 60, max 500).
GET /recall (and its query params like recall_profile, timeline_limit, max_chars) is documented as a core endpoint, but there is no /recall handler in the current HTTP server (internal/server/server.go). If the intended endpoint is /context, /search, or /timeline, the docs should be adjusted accordingly.
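For reference, the default/min/max bounds quoted in the hunk above amount to a parse-then-clamp step. The following Go sketch is purely illustrative: `clampParam` is a hypothetical helper, not a function in Engram, and the lower bound of 0 for `timeline_limit` is an assumption, because the quoted docs give only a default and a maximum.

```go
package main

import (
	"fmt"
	"strconv"
)

// clampParam parses a raw query-string value as an integer. A missing or
// non-numeric value falls back to def; the result is then clamped to [lo, hi].
// Hypothetical helper for illustration only.
func clampParam(raw string, def, lo, hi int) int {
	v, err := strconv.Atoi(raw)
	if err != nil {
		v = def
	}
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

func main() {
	// Bounds copied from the quoted docs; the lower bound of 0 for
	// timeline_limit is an assumption.
	fmt.Println(clampParam("50", 5, 0, 20))     // above the documented max
	fmt.Println(clampParam("", 5, 0, 20))       // missing, so the default applies
	fmt.Println(clampParam("10", 180, 60, 500)) // below the documented min
}
```

Whichever endpoint ends up carrying these parameters, clamping rather than rejecting out-of-range values is only one possible design; returning a 400 for invalid input would be an equally reasonable contract.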
> Example request body:
> {
>   "agent_source": "codex",
>   "content": "Refactored recall query to return chronological timeline around focus.",
>   "metadata": {
>     "file_path": "internal/store/store.go",
>     "status": "in_progress",
>     "priority": "high"
>   }
> }
The JSON example request body is not inside a fenced code block, so it won’t render cleanly in Markdown and is easy to miscopy. Consider wrapping the example in a ```json fenced block (and likewise for other raw payload examples).
| | Content Type | Compression Rule | | ||
| |---|---| | ||
| | Go/Python-like code | Keep declaration line (`func ...`, `class ...`, or first structural line). | | ||
| | System logs | Keep last non-empty line; if HTTP/server error code exists, emit `Error <code>`. | | ||
| | Markdown | Keep only heading lines (`#`, `##`) and join compactly. | | ||
| | Plain text / notes | Normalize repeated newlines and truncate to 180 chars with ellipsis. | | ||
This Markdown table uses leading double pipes (e.g., "|| Content Type | Compression Rule |"), which doesn’t render as a table in standard Markdown. Please convert it to a normal pipe table format so it renders correctly in GitHub.
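To make two of the quoted compression rules concrete, here is a minimal Go sketch of the "Plain text / notes" and "Markdown" rows. It is not taken from the repository: the function names, the `" | "` separator used to join headings, and the choice to append the ellipsis after the 180-character cut (rather than counting it inside the budget) are all assumptions, since the table does not specify them.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var repeatedNewlines = regexp.MustCompile(`\n{2,}`)

// compactPlainText collapses runs of newlines into a single newline and
// truncates the result to at most limit runes, appending "..." when cut.
// Hypothetical helper; the ellipsis is not counted against limit.
func compactPlainText(s string, limit int) string {
	s = repeatedNewlines.ReplaceAllString(strings.TrimSpace(s), "\n")
	r := []rune(s)
	if len(r) <= limit {
		return s
	}
	return string(r[:limit]) + "..."
}

// compactMarkdown keeps only lines that start with '#' (any heading level)
// and joins them with " | ". The separator is an assumption.
func compactMarkdown(s string) string {
	var heads []string
	for _, line := range strings.Split(s, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "#") {
			heads = append(heads, line)
		}
	}
	return strings.Join(heads, " | ")
}

func main() {
	fmt.Printf("%q\n", compactPlainText("note one\n\n\nnote two", 180))
	fmt.Printf("%q\n", compactMarkdown("# Title\nbody text\n## Section"))
}
```

Truncating on runes rather than bytes avoids splitting a multi-byte UTF-8 character at the cut point.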
> ### GET /
>
> Returns an embedded dark-mode monitoring dashboard (HTML) rendered directly by the Go server.
>
> Behavior:
> - Shows the latest 20 rows from `context_logs`.
> - Colors `agent_source` badges (`codex` in blue, `gemini` in purple, `system` in green).
> - Auto-refreshes every 5 seconds using a lightweight fetch call.
>
> ### GET /dashboard/logs
>
> Returns JSON rows for dashboard live-refresh.
>
> Query params:
> - `limit` optional: defaults to 20, capped at 200.
>
> Response behavior:
> - Ordered newest-to-oldest.
> - Includes `timestamp_human` for display formatting.
The root/dashboard endpoints documented here (GET / returning an HTML dashboard and GET /dashboard/logs) don’t appear to exist in the current HTTP server routes (internal/server/server.go). As written, readers will expect these endpoints to work but they won’t. Please either document the actual endpoints (e.g., /health, /stats, etc.) or remove/relocate this section.
> ## SSD Sidecar (.engram-ssd.md)
>
> The server keeps a small sidecar file in the project root: `.engram-ssd.md`.
>
> Behavior:
> - On `engram serve` startup: if the file does not exist, bootstrap it from the latest 15 context logs.
> - On every successful `POST /remember`: trigger async SSD refresh (non-blocking HTTP response).
> - File write is atomic (temp file + rename) to avoid partial reads by agents.
The SSD sidecar behavior (.engram-ssd.md creation/refresh, bootstrap from latest 15 “context logs”, refresh on POST /remember) is documented as implemented behavior, but there is no reference to .engram-ssd.md in the repo and no /remember endpoint. Please avoid stating this as a runtime guarantee unless the feature exists, or clearly mark it as a proposed/experimental contract with pointers to the implementation.
> # Context Bus Manual for AI Agents
>
> This document defines how AI agents should write and recall shared context in Engram.
>
> ## Purpose
>
> Engram Context Bus enables multiple agents to share short, high-signal context logs through a local REST API backed by SQLite.
>
> Goals:
> - Preserve work progress across sessions.
> - Keep memory entries compact and searchable.
> - Allow agent-aware handoff between Codex, Gemini, and system processes.
PR title/description indicate this is primarily MCP transport troubleshooting guidance, but this file introduces a full “Context Bus Manual” (API contract, compression rules, SSD sidecar, etc.) well beyond the described scope. Either trim this file to the intended troubleshooting section or expand the PR description to explain why this larger, new API/manual belongs in this change.
Closes #167
Summary
Adds official troubleshooting guidance for Codex MCP Transport closed failures when using Engram memory tools.
Problem
In real workflows, users can hit:
This appears most often after:
Root Cause
Codex may keep a stale MCP stdio channel after session-boundary operational changes. The HTTP server can be healthy while MCP transport in the active chat is stale.
Changes
Recovery Sequence (documented)
Why This Helps
Scope and Risk
Validation