Add GitHub Copilot CLI support #1
- plugin/.plugin/plugin.json: Copilot manifest with name/version/skills/mcpServers/hooks refs
- plugin/.mcp.copilot.json: MCP server config with type:local, npx, env passthrough, tools:[*]
- plugin/hooks/hooks.copilot.json: Copilot hooks (version:1) with 11 supported events and PreToolUse matcher
- test/copilot-plugin.test.ts: 11 tests covering manifest, MCP config, and hooks validation

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Adds Copilot CLI support through a root plugin manifest, Copilot-specific MCP and hook configuration, and a connect adapter for MCP-only setup. Includes Windows-safe Copilot MCP command generation, COPILOT_HOME handling, Copilot hook payload normalization, generated hook scripts, and targeted tests for plugin shape, hook execution, and connect behavior. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reopening to trigger CI after workflows were registered on the fork.
CI probe complete; closing fork-local PR.
Pull request overview
Adds GitHub Copilot CLI as a supported agent. Introduces a new plugin/plugin.json manifest plus Copilot-specific MCP and hooks config, a new copilot-cli connect adapter (with COPILOT_HOME support and Windows-safe command handling), and normalizes hook payload handling to also accept Copilot's camelCase field names (e.g. sessionId, toolName, toolArgs, userPrompt, errorMessage, notificationType, tool_result/toolResult). Adds focused tests for the plugin manifest, hooks config, hook scripts, and the new adapter.
Changes:
- New Copilot plugin manifest + `.mcp.copilot.json` + `hooks/hooks.copilot.json` and a new `copilot-cli` connect adapter (allowed on Windows).
- All Claude/Codex hook scripts and their TS sources now read both `snake_case` and `camelCase` payload fields; `pre-tool-use` switches to lowercase tool matching with `create`/`view` added; `post-tool-use` falls back to `tool_result`/`toolResult.text_result_for_llm`.
- README/AGENTS/CLI help updated; new `test/copilot-plugin.test.ts` and Copilot-adapter tests in `test/cli-connect.test.ts`.
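The dual-casing lookup described in the overview could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `pickString` and `normalizeHookPayload` are hypothetical names, and the snake_case keys `session_id` and `tool_name` are assumed counterparts of the camelCase `sessionId` and `toolName` fields named above.

```typescript
type HookPayload = Record<string, unknown>;

// Return the first listed key that holds a non-empty string, else undefined.
function pickString(payload: HookPayload, ...keys: string[]): string | undefined {
  for (const key of keys) {
    const value = payload[key];
    if (typeof value === "string" && value.length > 0) return value;
  }
  return undefined;
}

// Hypothetical normalizer: accepts either casing and lowercases the tool name
// so that matching can be done case-insensitively.
function normalizeHookPayload(payload: HookPayload): {
  sessionId: string | undefined;
  toolName: string | undefined;
} {
  return {
    sessionId: pickString(payload, "session_id", "sessionId"),
    toolName: pickString(payload, "tool_name", "toolName")?.toLowerCase(),
  };
}
```

Narrowing with `typeof` before use means a malformed payload (for example a numeric `sessionId`) yields `undefined` rather than a runtime error further down the hook.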
Reviewed changes
Copilot reviewed 33 out of 33 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| plugin/plugin.json | New top-level Copilot plugin manifest. |
| plugin/.mcp.copilot.json | Copilot MCP server block (npx -y @agentmemory/mcp). |
| plugin/hooks/hooks.copilot.json | Copilot hook event registrations. |
| plugin/scripts/*.mjs | Built hook scripts updated to accept Copilot camelCase fields. |
| src/hooks/*.ts | Source hooks updated in parallel for camelCase compatibility. |
| src/cli/connect/copilot-cli.ts | New connect adapter for ~/.copilot/mcp-config.json. |
| src/cli/connect/util.ts | Adds AGENTMEMORY_COPILOT_MCP_BLOCK with Windows-safe command. |
| src/cli/connect/index.ts | Registers adapter; allows copilot-cli on Windows. |
| src/cli.ts | Updates help text to list copilot-cli. |
| README.md / AGENTS.md | Document new agent and update maintenance checklists. |
| test/cli-connect.test.ts | Tests for new adapter and adapter count. |
| test/copilot-plugin.test.ts | New tests for plugin manifest, MCP config, hooks, and hook scripts. |
{
  "name": "agentmemory",
  "version": "0.9.20",
  "description": "Persistent memory for AI coding agents -- captures tool usage, compresses via LLM, injects context into future sessions. 12 hooks, 53 MCP tools, 4 skills, real-time viewer.",
Addresses upstream AI review suggestions by aligning the Copilot preToolUse matcher with the hook allowlist, narrowing hook payload fields at runtime, normalizing subagent fallbacks, and tightening hook config validation. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Includes GitHub Copilot CLI in the first-run agent picker and adds a regression test so the Copilot setup path remains discoverable. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Detect Copilot CLI environment markers during first-run setup so pressing Enter wires the current agent instead of the historical Claude Code default. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Accept Content-Length framed JSON-RPC messages in addition to the existing newline-delimited transport so Copilot CLI can initialize the standalone MCP server. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
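A reader that accepts both transports might look like the sketch below. It is an assumption-laden illustration rather than the server's real parser: `readMessages` and `indexOfBytes` are hypothetical names, and the only behavior taken from the commit message is that Content-Length framed and newline-delimited JSON-RPC messages are both accepted.

```typescript
const encoder = new TextEncoder();
const decoder = new TextDecoder();
const HEADER_END = encoder.encode("\r\n\r\n");

// Find the first occurrence of `needle` in `haystack` at or after `from`.
function indexOfBytes(haystack: Uint8Array, needle: Uint8Array, from = 0): number {
  outer: for (let i = from; i <= haystack.length - needle.length; i++) {
    for (let j = 0; j < needle.length; j++) {
      if (haystack[i + j] !== needle[j]) continue outer;
    }
    return i;
  }
  return -1;
}

// Extract every complete message from `buffer`; unconsumed bytes come back as `rest`.
function readMessages(buffer: Uint8Array): { messages: unknown[]; rest: Uint8Array } {
  const messages: unknown[] = [];
  let offset = 0;
  while (offset < buffer.length) {
    // Skip blank separators between messages.
    if (buffer[offset] === 0x0a || buffer[offset] === 0x0d) {
      offset++;
      continue;
    }
    const head = decoder.decode(buffer.subarray(offset, offset + 15)).toLowerCase();
    if (head === "content-length:") {
      const headerEnd = indexOfBytes(buffer, HEADER_END, offset);
      if (headerEnd === -1) break; // headers not complete yet
      const headers = decoder.decode(buffer.subarray(offset, headerEnd));
      const match = /content-length:\s*(\d+)/i.exec(headers);
      if (!match) throw new Error("missing Content-Length value");
      const bodyStart = headerEnd + HEADER_END.length;
      const bodyEnd = bodyStart + Number(match[1]); // length counts bytes, not characters
      if (bodyEnd > buffer.length) break; // body not complete yet
      messages.push(JSON.parse(decoder.decode(buffer.subarray(bodyStart, bodyEnd))));
      offset = bodyEnd;
    } else {
      const newline = buffer.indexOf(0x0a, offset);
      if (newline === -1) break; // line not complete yet
      const line = decoder.decode(buffer.subarray(offset, newline)).trim();
      if (line) messages.push(JSON.parse(line));
      offset = newline + 1;
    }
  }
  return { messages, rest: buffer.subarray(offset) };
}
```

Working on bytes rather than strings matters because `Content-Length` is a byte count; a caller would append each incoming chunk to `rest` and call `readMessages` again.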
…ohitg00#517)

The viewer's five search inputs (graph, memories, lessons, actions, crystals) destroy and recreate their input DOM via innerHTML on every keystroke, which interrupts active IME composition sessions and makes non-Latin input (Chinese, Japanese, Korean) unusable.

Additionally, the viewer's CSP includes script-src-attr 'none', which silently blocks the inline oninput=/onchange= handlers on the lessons, actions, and crystals panels. Those three search/filter controls have been non-functional under the strict CSP.

This patch:

1. Adds a bindImeSafeSearch helper that guards on both an explicit compositionstart/compositionend flag and event.isComposing. compositionend triggers an immediate commit and sets a justCommitted one-shot flag to suppress the redundant trailing input event that browsers dispatch after compositionend.
2. Adds captureSearchFocus/restoreSearchFocus helpers to preserve focus and cursor position across innerHTML rebuilds, so multi-word IME input doesn't require clicking back into the search box after each commit.
3. Migrates all five search inputs to addEventListener via the new helpers, removing the CSP-blocked inline handlers on lessons, actions, and crystals. The actions panel's status filter <select> is also migrated for the same reason.
4. Unifies debounce delay to 200ms across all five panels.

Verified via:

- 8/8 jsdom + synthetic CompositionEvent regression cases (ASCII debounce, IME composing suppresses input, compositionend immediate commit + justCommitted suppression, post-IME ASCII resumes, fast typing coalesces, composition cancel returns to idle).
- 8/8 manual browser cases on Windows + Chrome (Chinese pinyin commit, multi-word IME without re-focusing, English regression, all five panels, zero CSP violations in DevTools, cursor position retained).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
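The composition guard described in item 1 can be illustrated without a DOM. The sketch below is hypothetical (`createImeGuard` is not the patch's `bindImeSafeSearch`, whose signature is not shown here) and leaves out the 200ms debounce; it only models the two guards and the one-shot `justCommitted` flag.

```typescript
type Commit = (value: string) => void;

// DOM-free model of the guard: callers forward compositionstart,
// compositionend, and input events to these three methods.
function createImeGuard(commit: Commit) {
  let composing = false;
  let justCommitted = false;
  return {
    compositionStart(): void {
      composing = true;
    },
    compositionEnd(value: string): void {
      composing = false;
      justCommitted = true; // swallow the trailing input event once
      commit(value);
    },
    input(value: string, isComposing: boolean): void {
      if (composing || isComposing) return; // mid-composition: do nothing
      if (justCommitted) {
        justCommitted = false;
        return;
      }
      commit(value);
    },
  };
}
```

In a browser, the three methods would be wired up with `addEventListener` (which also sidesteps the `script-src-attr 'none'` restriction on inline handlers), passing `event.isComposing` through to `input`.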
Co-authored-by: 이민재 <19909783+honor2030@users.noreply.github.com>
Co-authored-by: honor2030 <19909783+honor2030@users.noreply.github.com>
…oken_budget (rohitg00#507) (rohitg00#516)

* fix(mcp): route memory_recall to /agentmemory/search and forward format/token_budget

  memory_recall and memory_smart_search were sharing the smart-search endpoint, which always returns compact mode and silently drops the format and token_budget parameters that the tool schema advertises. Split the cases so memory_recall hits /agentmemory/search (which honors format) while memory_smart_search keeps its own endpoint. Default format to "full" for memory_recall so the documented behavior matches the wire call.

  Signed-off-by: serhiizghama <zmrser@gmail.com>

* test(mcp): cover memory_recall endpoint, format forwarding, and defaults

  Two new proxy tests for issue rohitg00#507: one asserts memory_recall calls POST /agentmemory/search with the format and token_budget fields, and never falls through to smart-search; the other pins the default format to "full" when the caller omits it.

  Signed-off-by: serhiizghama <zmrser@gmail.com>

---------

Signed-off-by: serhiizghama <zmrser@gmail.com>
…660 shipped) (rohitg00#546) v0.9.19 (rohitg00#460 / commit bb259ac) routed the first-run iii-console install through `bash -s -- --next` to dodge the upstream tag-prefix bug at iii-hq/iii#1652. Upstream PR iii-hq/iii#1660 fixed the bug on 2026-05-19 — installer's jq filter now accepts both `iii/v...` and bare `v...` tags, and `-v X.Y.Z` falls back gracefully. `install.iii.dev/console/main/install.sh` is a thin proxy serving `raw.githubusercontent.com/iii-hq/iii/main/console/install.sh` with a 5-minute CDN cache — verified byte-for-byte that the live URL already serves the post-#1660 fix. No iii release tag needed. Switch agentmemory back to the canonical bare invocation: curl -fsSL https://install.iii.dev/console/main/install.sh | sh Drops the workaround comment block (10 lines) explaining the prior detour. v0.9.19/v0.9.20 users on the `--next` path will still resolve a valid release (next-release lookup also handles `iii/v...-next.*` correctly post-#1660), so this isn't a forced upgrade. 1038/1038 tests pass.
…ce (rohitg00#545) * feat(repo): add Sponsor button + GH Packages mirror for sidebar surface Three additions that make the repo page surface clearer + give users a single place to fund the project: 1. `.github/FUNDING.yml` — `github: [rohitg00]` renders the "Sponsor" button at the top of the repo + the Sponsor widget in the right sidebar. Requires GitHub Sponsors to be enabled at github.com/sponsors/accounts on the rohitg00 profile before the link resolves (currently 404s — enable before merging this PR). 2. `.github/workflows/publish.yml` — new `publish-github-packages` job runs after the existing public-npm publish completes. Republishes the main package as `@rohitg00/agentmemory` to `npm.pkg.github.com`. The repo's right-sidebar "Packages" widget only surfaces packages on GitHub Packages, not packages on the public npm registry, so this is what makes the sidebar widget non-empty. Public npm remains the canonical install source; GH Packages is purely a discovery surface. - Uses built-in GITHUB_TOKEN, no new secrets needed. - Rewrites package.json `name` + `publishConfig` in-runner via a small node one-liner, publishes, then restores the original so main isn't permanently scope-changed. - Skip-on-already-published guard mirrors the existing public publish steps. - Marked `|| echo "non-fatal"` so a GH Packages hiccup never blocks the canonical npm release. - `permissions: packages: write` added at workflow level. 3. README badge row — added `npm downloads`, `GitHub Packages mirror`, and `Sponsor rohitg00 on GitHub Sponsors` badges alongside the existing `npm version` / `CI` / `License` / `Stars` row. The sponsor badge is the same link the FUNDING.yml sidebar widget uses; surfacing it in-README means readers who don't notice the sidebar still see it. Out of scope (asked, declined): - Docker Hub / ghcr.io publish workflow. Not in this PR. 
* ci(publish): scope write perms per-job + persist-credentials false Inline review on rohitg00#545 flagged that the workflow-level permissions block granted `id-token: write` + `packages: write` to every job, including ones that don't need them. Tightened to least-privilege: - Workflow-level: only `contents: read`. - `publish` job: adds `id-token: write` (required for `npm publish --provenance` to mint a Sigstore OIDC token). The GH Packages job doesn't inherit this. - `publish-github-packages` job: adds `packages: write` (required to push to npm.pkg.github.com). The public-npm publish job doesn't inherit this. Both `actions/checkout@v6` calls also pick up `persist-credentials: false`. The publish steps never push back to the repo, so the GITHUB_TOKEN doesn't need to land in `.git/config` after checkout. Same posture both jobs. Skipped from the same review pass: - **Pin actions to commit SHAs.** Industry rule but introduces real maintenance friction — Renovate/Dependabot don't auto-bump SHA-pinned actions to new minors, so SHA pinning trades easy semver tracking for stale-action drift. We stay on `@v6` major-tag pins (GitHub publishes those via verified moving refs). - **Disable setup-node cache.** `actions/setup-node@v6` defaults to cache-off (the `cache:` input is opt-in). `package-manager-cache` only auto-enables when `package.json` has a `packageManager` field — agentmemory's doesn't (verified via `grep`). The fix is a no-op on this workflow.
…tg00#547) Sponsor button still missing from the repo page despite rohitg00#545 merging. The committed FUNDING.yml started with 4 lines of `#` comments before the canonical `github: [rohitg00]` directive. GitHub's FUNDING parser documents only the canonical key-value form; leading comments shouldn't break it but some users have reported indexer lag when the file starts with non-data lines. Strip to the bare single-line form to match the documented schema and remove any ambiguity. Sponsor profile is enabled (github.com/sponsors/rohitg00 returns 200 + 'Sponsor @rohitg00' button), so the only remaining gap is GitHub's side-bar indexing. Tightening the file forces a re-parse.
…ohitg00#548) Reverting the GH Packages publish from rohitg00#545. GH Packages is a separate registry from npmjs.com — anyone installing `@rohitg00/agentmemory` from `npm.pkg.github.com` needs to point their registry there and authenticate, which is friction users don't have on the canonical `@agentmemory/agentmemory` install from public npm. The right-sidebar Packages widget on the repo page was the only motivation for the mirror. Acceptable to leave it empty — the single canonical install path is the better DX. - Drop `publish-github-packages` job from `.github/workflows/publish.yml` - Drop `packages: write` perm wording from the workflow comment block - Remove "GitHub Packages mirror" badge from README Manual follow-up (post-merge): delete the already-published `@rohitg00/agentmemory@0.9.20` from GH Packages registry via github.com/users/rohitg00/packages/npm/agentmemory/settings → Delete.
…#549) GitHub auto-renders the "Sponsor this project" widget in the right sidebar from .github/FUNDING.yml (Sponsor button + heart icon + "Learn more about GitHub Sponsors" link). The README badge was redundant noise on the top badge row. Sidebar widget is the canonical surface — one path, one click.
Co-authored-by: honor2030 <19909783+honor2030@users.noreply.github.com>
* fix(hermes): declare all plugin hooks * test(hermes): compare manifest hooks to provider --------- Co-authored-by: honor2030 <19909783+honor2030@users.noreply.github.com>
…s run (rohitg00#500)

mem::observe's boot flow had this sequence in main():

1. registerSearchFunction / registerContextFunction / ... (sync, completes immediately)
2. restore persisted vector index from disk
3. await rebuildIndex(kv) ← blocks here
4. bootLog "Ready" / "REST API" / "MCP surface"
5. startViewerServer(...)
6. setInterval auto-forget / lesson decay / consolidation

rebuildIndex iterates every observation across every session and AWAITS an embedding-provider call per record. On a large corpus + a rate-limited embedding endpoint (e.g. 100 RPM), step 3 takes hours to days. Everything that runs AFTER it, including startViewerServer, is silently delayed for the same duration.

Symptoms in the wild:

- http://localhost:3113/ unreachable (no listening socket on the viewer port) even on a freshly-started server
- `agentmemory doctor` reports "viewer-unreachable"
- log floods with `vector-index add: embed failed — skipping {429: ...}` from the still-running rebuild burning rate-limit budget
- no error message: the worker stays alive serving HTTP because sdk.registerFunction had already completed synchronously in step 1

Fix: detach rebuildIndex with `void` + .then/.catch instead of awaiting. The index lazily fills in over time, search degrades gracefully (BM25 keeps working immediately, vector results fill in as the embed queue drains), and the viewer comes up in seconds.

Repro on the operator side:

1. import a sizeable jsonl corpus (`mem::replay::import-jsonl`)
2. clear the persisted vector index so rebuildIndex runs on next boot
3. restart agentmemory with EMBEDDING_PROVIDER pointed at a rate-limited endpoint (any OpenAI-compat with low RPM)
4. observe: REST API responds on :3111, but :3113 is never bound, and the doctor's "viewer-unreachable" check fires until the rebuild finishes (hours-to-days for a 300+ session corpus)

The 5-second non-fix workaround was a hard kill + restart; that just re-entered the same hang.

No tests added: main() isn't unit-tested today and wiring up a fake slow rebuildIndex + asserting the post-rebuild boot lines run early would need the full worker mock harness. The change is one line and the failure mode is dramatic; visual review + integration smoke covers the regression risk.
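The detach pattern named in the fix can be sketched in isolation. The `boot` function and its injected parameters below are hypothetical stand-ins for main(), the real rebuildIndex, and startViewerServer; only the `void` + `.then/.catch` shape comes from the commit message.

```typescript
// Hypothetical boot routine: the rebuild is started but never awaited,
// so the steps after it run immediately.
function boot(
  rebuildIndex: () => Promise<number>,
  startViewer: () => void,
  log: (line: string) => void,
): void {
  // `void` marks the promise as intentionally not awaited; the handlers
  // keep a late failure from becoming an unhandled rejection.
  void rebuildIndex()
    .then((count) => log(`index rebuilt: ${count} items`))
    .catch((err) => log(`index rebuild failed: ${String(err)}`));

  log("Ready");
  startViewer();
}
```

Even with a rebuild that never settles, "Ready" is logged and the viewer starts, which is the property the original `await` broke.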
…rpora) (rohitg00#504)

* fix(rebuild): batch embed calls in rebuildIndex (25h → 3h on large corpora)

  rebuildIndex called `await vectorIndexAddGuarded(...)` per memory and per observation. Each call is one HTTP round-trip to the embedding provider for a single input. On a 500k-observation imported corpus against an embedding endpoint with even modest latency, that's serial 100-200ms per call = 14-28 hours of wallclock. The new non-blocking rebuild path (rohitg00#500) made this no longer block boot, but the rebuild itself still takes the same wallclock.

  Add `vectorIndexAddBatchGuarded()` next to the existing per-item helper, accepting an array of items and calling `provider.embedBatch()` once. For batchable endpoints (vLLM, Triton, OpenAI's `/v1/embeddings` all accept an `input` array), latency for N items is roughly the latency of a single embed because network + GPU setup amortize.

  Refactor `rebuildIndex` to accumulate items into a buffer and flush every REBUILD_EMBED_BATCH_SIZE (default 32). BM25 add stays per-item-synchronous; only the vector path is batched.

  Validated against a vLLM Qwen3-Embedding-8B endpoint:
  - single embed: 175ms
  - batch-of-32: 737ms (= 23ms/item amortized, ~7.6× speedup)
  - projected backfill time for 500k obs: 25h → 3h

  Per-item failure shape is preserved:
  - whole-batch network/provider error → all skipped, single warn line (vs N warns previously when the same error hit every item)
  - per-item dimension mismatch → that item skipped, others continue
  - rebuildIndex return value unchanged (count of attempted items)

  Override knob:
  - REBUILD_EMBED_BATCH_SIZE (default 32): set lower for endpoints with small per-request input limits, higher for endpoints that prefer larger batches. Set to 1 to fall back to the per-item path.

  39/39 existing tests in search-index/vector-index/remember-bm25-index pass unchanged.

  Related: rohitg00#500 (non-blocking rebuildIndex), rohitg00#503 (separate embedding base URL).

* fix(rebuild): per-item vi.add try/catch to preserve soft-fail

  Restores the pre-batch soft-fail behavior: a single failing vi.add() no longer aborts the entire rebuild batch. Failures are logged and counted toward fail, just like dimension mismatches above.
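The buffer-and-flush shape with soft failure at both levels could look roughly like this. It is a sketch under assumptions: `embedInBatches`, `Item`, and the `addVector` callback are illustrative names, and only `embedBatch` and the default batch size of 32 are taken from the commit message.

```typescript
interface Item { id: string; text: string }
interface EmbeddingProvider {
  embedBatch(texts: string[]): Promise<number[][]>;
}

// Embed items in slices of `batchSize`: one provider call per slice.
async function embedInBatches(
  items: Item[],
  provider: EmbeddingProvider,
  addVector: (id: string, vector: number[]) => void,
  batchSize = 32,
): Promise<{ added: number; failed: number }> {
  let added = 0;
  let failed = 0;
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    let vectors: number[][];
    try {
      vectors = await provider.embedBatch(batch.map((item) => item.text));
    } catch (err) {
      // Whole-batch failure: skip every item, emit a single warning.
      failed += batch.length;
      console.warn(`embed batch failed, skipping ${batch.length} items: ${String(err)}`);
      continue;
    }
    batch.forEach((item, i) => {
      const vector = vectors[i];
      if (!vector) { failed++; return; } // provider returned too few rows
      try {
        addVector(item.id, vector); // may throw, e.g. on a dimension mismatch
        added++;
      } catch {
        failed++; // per-item soft-fail: the rest of the batch continues
      }
    });
  }
  return { added, failed };
}
```

Passing `batchSize = 1` degenerates to one provider call per item, which mirrors the documented fallback for endpoints that cannot batch.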
…g00#472) * fix(summarize): chunk large sessions to fit LLM context window JSONL-imported sessions can have far more observations than the 500-cap MAX_OBS_PER_SESSION that constrains native sessions. mem::summarize previously built one prompt containing every observation and shipped it as a single LLM call, which exceeded the provider's context window for sessions >~7,000 observations and returned an unhelpful 400 from upstream — silently leaving large bulk-imported sessions out of the semantic tier. Approach: map-reduce inside mem::summarize. - Sessions ≤ SUMMARIZE_CHUNK_SIZE (default 400) take the legacy single-call path with no overhead - Larger sessions are split into chunks, each summarized with the existing per-session prompt in parallel batches of SUMMARIZE_CHUNK_CONCURRENCY (default 6), and partial summaries merged via a new REDUCE_SYSTEM prompt - Per-chunk retry-once on transient parse / provider errors - Persistently-failing chunks are skipped (not propagated) so a flaky chunk doesn't waste 30+ already-completed LLM calls on the same session - Bail with too_many_chunks_skipped only if >50% of chunks fail Companion operator tool: scripts/backfill-imported-sessions.sh walks jsonl-imported sessions and POSTs mem::summarize per session, with project / agent / obs-count filters, cost estimation, and per-failure payload dumping for debugging provider rejections. Validated locally against a real corpus: - 5,392-obs session (14 chunks, c=6): 39s - 10,704-obs session (27 chunks, c=6): 34s - 105,966-obs session (265 chunks, c=50): handler completes server-side and persists - 52-session bulk backfill → 25 new semantic facts + 6 new reflect insights produced by consolidate-pipeline Known limit: iii-engine has a hardcoded 180s function-invocation timeout. Sessions large enough that chunked summarize wallclock exceeds that will return a timeout/500 to the HTTP client even though the handler completes and persists server-side. 
High-RPM providers (Novita / DeepInfra / DeepSeek typically allow 100+ concurrent) can raise SUMMARIZE_CHUNK_CONCURRENCY to push the cliff well past any realistic session size. True fix is an async-job pattern; left as follow-up. - src/prompts/summary.ts: add REDUCE_SYSTEM + buildReducePrompt - src/functions/summarize.ts: chunking, retry, skip, parallelism - test/summarize.test.ts: 9 cases covering single-call path, chunking, env-override, retry-then-success, persistent skip, too-many-skipped bail, provider error after retry, concurrency - .env.example: document SUMMARIZE_CHUNK_SIZE / _CONCURRENCY - .gitignore: agentmemory-debug/, data-*/ (operator artefacts) - scripts/backfill-imported-sessions.sh: bulk-import backfill tool 9/9 new tests pass; existing tests untouched. * fix(summarize): address CodeRabbit review on rohitg00#472 Four nits flagged by the automated reviewer, all worth fixing: - scripts/backfill: add curl --connect-timeout + --max-time profiles (META_CURL_OPTS vs WORK_CURL_OPTS). Metadata reads fail fast and retry on transient blips; LLM-backed work calls get a wide 30-min cap and no retry (retrying a half-finished LLM job double-spends). - scripts/backfill: sanitize sessionId before joining with DEBUG_DIR in dump_failure() (otherwise a session id containing `/` or `..` could escape the debug dir). UUIDs in practice, but the server doesn't enforce that. - scripts/backfill: switch the observations query to `--get --data-urlencode "sessionId=$id"` so special characters can't corrupt the query string. - scripts/backfill: guard `jq` on summarize + consolidate responses with `jq -e . </dev/null 2>&1` first. iii's HTTP layer occasionally returns non-JSON (HTML 5xx, empty body on timeout). Without the guard, `set -e` aborts the whole backfill loop on a single bad response — now it logs `invalid_json_response` and moves on. - test/summarize.test.ts: fix `vi.mock("./audit.js", ...)` path to `"../src/functions/audit.js"`. 
The old path resolved to `test/audit.js` (nonexistent), so the mock was a silent no-op. Tests passed anyway because `safeAudit` writes to a mocked KV. 9/9 tests still pass; backfill dry-run still resolves the corpus cleanly.
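The map-reduce flow in rohitg00#472 (chunk, summarize in bounded-concurrency waves, retry once, skip persistent failures, bail above 50% skipped, then reduce) might be sketched as below. `summarizeChunked` and its callback parameters are hypothetical; the defaults of 400 and 6 and the `too_many_chunks_skipped` error name follow the commit message.

```typescript
async function summarizeChunked(
  observations: string[],
  summarizeChunk: (chunk: string[]) => Promise<string>,
  reduce: (partials: string[]) => Promise<string>,
  chunkSize = 400,
  concurrency = 6,
): Promise<string> {
  // Small sessions keep the single-call path.
  if (observations.length <= chunkSize) return summarizeChunk(observations);

  const chunks: string[][] = [];
  for (let i = 0; i < observations.length; i += chunkSize) {
    chunks.push(observations.slice(i, i + chunkSize));
  }

  const partials: string[] = [];
  let skipped = 0;
  // Process chunks in waves of at most `concurrency` parallel calls.
  for (let i = 0; i < chunks.length; i += concurrency) {
    const wave = chunks.slice(i, i + concurrency);
    const results = await Promise.all(
      wave.map(async (chunk) => {
        for (let attempt = 0; attempt < 2; attempt++) {
          try {
            return await summarizeChunk(chunk);
          } catch {
            // first failure: fall through and retry once
          }
        }
        return null; // failed twice: skip this chunk
      }),
    );
    for (const result of results) {
      if (result === null) skipped++;
      else partials.push(result);
    }
  }

  if (skipped > chunks.length / 2) throw new Error("too_many_chunks_skipped");
  return reduce(partials);
}
```

Skipping rather than propagating a failed chunk keeps the already-completed summaries usable, at the cost of a summary that silently omits part of the session.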
… diagnose (rohitg00#473) * fix(visibility): surface lessons in smart-search + tally per-store in diagnose Two related UX gaps in the memory layer's reflection surfaces. A consumer that calls `memory_lesson_save` and gets `success:true` reasonably expects to find the lesson via `memory_smart_search` ("did my save land?") and to see it counted in `memory_diagnose` ("what's in the store?"). Neither was true: lessons live in their own KV store (`KV.lessons`), and both diagnostic surfaces only looked at `KV.observations` / `KV.memories`. A 4,350-lesson store could read as "memories: 0" on diagnose and return zero hits on smart_search — the trust-shock that prompted this fix. A) mem::smart-search: also return lessons in the compact response. - New optional `project` and `includeLessons` (default true) params. - Delegates lesson scoring to the existing mem::lesson-recall via sdk.trigger, so confidence + recency weighting stays consistent with mem::lesson-recall (no duplicate scoring logic). - Lessons come back in a separate `lessons` field on the response, not merged into `results`. Existing consumers reading `results` are unaffected; new consumers can read `result.lessons` too. - Content truncated to 240 chars in compact mode (full content remains available via mem::lesson-recall directly). - Lesson-recall failures are soft: log + return empty lessons, observation results still flow through. B) mem::diagnose: add per-store tally categories for lessons, summaries, semantic, procedural, crystals, insights. Mirrors the existing `memories` pattern: count + light consistency check (confidence range for scored memories; non-empty title/narrative/ steps for the rest). Each new category is in ALL_CATEGORIES so `--categories lessons` filtering works as expected. The empty-system pass count goes from 8 to 14 (8 original + 6 new stores). Test updated accordingly. 
- src/types.ts: add CompactLessonResult - src/functions/smart-search.ts: lesson recall + merge (single-call path unchanged, expand mode unchanged) - src/functions/diagnostics.ts: six new category blocks before mesh - test/smart-search.test.ts: 6 new cases (lesson inclusion, content preview truncation, includeLessons=false opt-out, project filter passthrough, soft-fail on recall error / non-success response) - test/diagnostics.test.ts: 7 new pass/warn cases for each new category + filter check; empty-system pass count bumped 8→14 43/43 tests pass. * fix(diagnostics): defensive guards on new validators (CodeRabbit rohitg00#473 review) CodeRabbit flagged two patterns in the per-store validators added in the parent commit: 1. .trim() on .title / .narrative was unconditional — a corrupted row with title=null or title=42 would throw, abort the whole diagnose run, and silently skip every later category. Add typeof guards. 2. confidence range checks were `< 0 || > 1` which silently passes NaN and Infinity (NaN < 0 is false, NaN > 1 is false → "healthy"). Add Number.isFinite(...) prefix so corrupted scored rows surface as warnings instead. Applied across all 6 new validators: lesson confidence, summary title, semantic confidence, crystal narrative, insight confidence. Tests added in test/diagnostics.test.ts under "defensive row-shape handling": NaN confidence on a lesson, null summary title (verifies diagnose still completes and later categories still execute), undefined crystal narrative, Infinity / NaN on insight + semantic. 34/34 tests pass.
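The two defensive guards from the CodeRabbit review can be shown in a few lines. The helper names here are hypothetical; the point, taken from the commit message, is that a bare `< 0 || > 1` range check treats NaN as healthy, and that `.trim()` must not be called on a non-string.

```typescript
// Range check as originally written: NaN slips through because both
// comparisons are false for NaN.
function naiveConfidenceOk(confidence: number): boolean {
  return !(confidence < 0 || confidence > 1);
}

// Guarded version: reject non-numbers and non-finite values first.
function isValidConfidence(value: unknown): boolean {
  return typeof value === "number" && Number.isFinite(value) && value >= 0 && value <= 1;
}

// Guarded text check: only strings are trimmed, so null or 42 cannot throw.
function hasText(value: unknown): boolean {
  return typeof value === "string" && value.trim().length > 0;
}
```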
Quality + integration wave. Bundles 11 PRs since v0.9.20:

Contributor feature:
- rohitg00#237 OpenCode plugin with 22 auto-capture hooks (@cl0ckt0wer)

Bug fixes (9):
- rohitg00#516 memory_recall endpoint + format/token_budget (@serhiizghama, closes rohitg00#507/rohitg00#440)
- rohitg00#461 env-file AGENTMEMORY_DROP_STALE_INDEX flag honored (@honor2030, closes rohitg00#456)
- rohitg00#487 Windows hook path quoting (@honor2030, closes rohitg00#477)
- rohitg00#517 viewer IME composition guard (@jonathanzhan1975)
- rohitg00#472 chunk large sessions for LLM context window (@efenex)
- rohitg00#473 surface lessons in smart-search + diagnose tally (@efenex)
- rohitg00#486 declare all Hermes plugin hooks (@honor2030)
- rohitg00#500 rebuildIndex non-blocking on boot (@efenex)
- rohitg00#504 batched embed in rebuildIndex (25h -> 3h) (@efenex)
- rohitg00#491 cli skip onboarding without tty (@honor2030)

Upstream-installer revert:
- rohitg00#546 drop --next workaround now that iii-hq/iii#1660 shipped

1067/1067 tests pass across 95 files.
* ci: cross-platform matrix + paths-ignore + concurrency

  1. **OS matrix**: Linux + Windows + macOS, both Node 20 + 22. 6 cells, ~3min each, ~18min wall time. Direct test against the class of bug rohitg00#487 caught: hooks crashing on Windows usernames with spaces. Pre-merge Linux-only CI meant that bug landed in main + a release. fail-fast: false so a flake on one cell doesn't mask whether the same failure reproduces elsewhere.
  2. **paths-ignore**: skip CI runs on README / CHANGELOG / docs / website / assets / .md / .mdx pushes. ~half the runner minutes back on doc-only churn. Source / config / workflow changes always run.
  3. **concurrency + cancel-in-progress**: PR force-pushes cancel in-flight runs instead of piling them up. Push to main protected (concurrency group still scoped to ref, no cancel for main pushes).

  Plus minor hardening: persist-credentials: false on the checkout step so the GITHUB_TOKEN doesn't land in .git/config.

  What was NOT lifted (rationale per plan):
  - Per-package reusable workflows (Rust/Python/Homebrew, non-TS).
  - License-header check (no per-file Apache banners in agentmemory).
  - CLA bot (defer until external PR volume justifies friction).
  - tsc --noEmit lint job (codebase has ~10 pre-existing type errors tsdown skips; gating CI on those would block every PR until fixed; tracked as separate cleanup).
  - Smoke test (`agentmemory demo + livez`): defer to its own PR with its own validation cycle.
  - Codecov badge: defer until baseline is set.

* ci(windows): force bash shell so build script's POSIX idioms work

  Windows runners default to cmd.exe for npm run scripts; the build script uses POSIX idioms (`cp ... 2>/dev/null || true`, `mkdir -p`) that cmd doesn't parse. ubuntu + macos already use bash by default, so this is a Windows-only behaviour change.

  Alternative: rewrite the build script in Node. Bigger lift, not minimal.

* ci(windows): point npm script-shell at git-bash before build

  `shell: bash` on the step only sets the shell for the step's own runner; `npm run` still spawns its inner script via npm's `script-shell` config, which defaults to cmd.exe on Windows. Configure npm to use Git-Bash (preinstalled on GitHub-hosted Windows runners) so `npm run build` and `npm run test` execute the build script the same way ubuntu + macos do. Step is gated on `runner.os == 'Windows'` so it's a no-op on the other matrix cells.

* ci: drop windows-latest from matrix (obsidian-export hardcoded POSIX paths)

  Windows runners fail on test/obsidian-export.test.ts because the test + src hardcode `/tmp/...` POSIX paths that don't resolve on the D:\ drive Windows uses. Fixing it cleanly requires reworking src/functions/obsidian-export.ts to use os.tmpdir() + path.join, which is a separate scope.

  Drop windows from the matrix for now. Ship ubuntu + macos coverage (real darwin/linux divergence catch) and file a follow-up to make obsidian-export cross-platform so Windows can be added back.

* test(fs-watcher): bump waits to 1500ms + describe retry for macos fsevents flake
…rpus (rohitg00#562)

* feat(eval): pluggable benchmark harness with in-house coding-agent corpus

  Adds eval/ tree (outside files field so npm tarball stays thin) with Adapter interface, three reference adapters (grep / vector / agentmemory-hybrid), two benchmarks (LongMemEval _s public, coding-agent-life-v1 in-house 15 sessions), scoring (P@K, R@K, hit, top-gold-rank), NDJSON output, sandbox script.

  coding-agent-life-v1 published scorecard at docs/benchmarks/2026-05-20-coding-agent-life-v1.md: agentmemory-hybrid R@5=0.967 P@5=0.578 (100% hit) vs grep R@5=0.967 P@5=0.267. 2.2x better precision on identical input, sandbox-reproducible.

  Adapter contract: init(sessions, config) -> State; query(q, state, k) -> RankedDoc[]

  npm scripts:
  - npm run eval:coding-life (no download, no API key for grep)
  - npm run eval:longmemeval (needs OPENAI key + 278MB download)

  eval/scripts/sandbox.sh boots clean agentmemory + iii-engine on ports 3411/3412 with isolated data dir; tears down on exit.

  README headline updated. 1072/1072 tests pass + 5 new eval tests.

* fix(eval): address review findings on benchmark harness

  - agentmemory adapter: prefer row.sessionId before observationToSession lookup
  - vector adapter: validate embedBatch response (length, indexes, non-empty rows)
  - coding-life: positive-int guard on --k; wrap query loop in try/finally so teardown runs
  - longmemeval: positive-int guards on --k/--limit/--stratify; per-question try/finally
  - load: throw on haystack_session_ids vs haystack_sessions length mismatch
  - score: P@K denominator is k (requested cutoff) not topK.length
  - sandbox.sh: guard rm -rf with non-empty + /tmp/ prefix check
  - README: drop unsafe rm "$(which iii)"; instruct ~/.local/bin + PATH instead; add language tag to repo-layout fenced block
  - sessions.json: fix "two-phase" -> "three-phase" wording mismatch
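The scoring fix (P@K divides by the requested k, not by the number of documents returned) is easiest to see in code. `scoreAtK` below is a hypothetical helper, not the harness's scorer, and it assumes the ranked list contains no duplicate ids.

```typescript
// Precision, recall, and hit at cutoff k for one query.
function scoreAtK(
  ranked: string[],
  gold: Set<string>,
  k: number,
): { precision: number; recall: number; hit: boolean } {
  if (!Number.isInteger(k) || k <= 0) throw new Error("k must be a positive integer");
  const topK = ranked.slice(0, k);
  const hits = topK.filter((id) => gold.has(id)).length;
  return {
    // Dividing by k (not topK.length) penalizes adapters that return fewer than k docs.
    precision: hits / k,
    recall: gold.size === 0 ? 0 : hits / gold.size,
    hit: hits > 0,
  };
}
```

With three returned documents, two of them gold, and k = 5, precision is 2/5 rather than the 2/3 that dividing by `topK.length` would report.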
…hitg00#509) (rohitg00#564)

Codex Desktop currently does not dispatch plugin-local hooks.json even though both CodexHooks and PluginHooks feature flags are stable + default-enabled in codex-rs/features/src/lib.rs (openai/codex#16430). MCP tools still work; lifecycle observations are silently missing.

Adds `agentmemory connect codex --with-hooks` which mirrors the bundled hooks.codex.json into the user-scope ~/.codex/hooks.json:

- Resolves ${CLAUDE_PLUGIN_ROOT} to the absolute bundled plugin/ path (user-scope hooks don't get plugin-root injection)
- Idempotent merge: previous agentmemory entries are stripped on reinstall via the resolved scripts/ path prefix; unrelated user hooks are preserved untouched
- Preserves matcher fields from the bundled manifest so PreToolUse routing still works
- findPluginRoot walks up from import.meta.url to locate the plugin/ dir; works for both dist/cli.mjs (bundled) and src/ (dev) layouts
- Dry-run path previews both TOML and hooks.json changes

Closes rohitg00#509.
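The idempotent merge described above can be sketched as follows. This assumes a deliberately simplified hooks shape (event name mapped to a list of `{ matcher?, command }` entries); the real hooks.json schema, the `mergeHooks` name, the script file names, and the POSIX-style path joining are all assumptions made for illustration.

```typescript
// Hypothetical, simplified hooks file: event name -> list of hook entries.
interface HookEntry {
  matcher?: string;
  command: string;
}
type HooksFile = Record<string, HookEntry[]>;

function mergeHooks(existing: HooksFile, bundled: HooksFile, pluginRoot: string): HooksFile {
  const scriptsPrefix = `${pluginRoot}/scripts/`;
  const merged: HooksFile = {};

  // Strip entries from a previous install (they point into our scripts/ dir);
  // keep every unrelated user hook untouched.
  for (const [event, entries] of Object.entries(existing)) {
    const kept = entries.filter((e) => !e.command.includes(scriptsPrefix));
    if (kept.length > 0) merged[event] = kept;
  }

  // Append fresh entries with the placeholder resolved to an absolute path.
  // Spreading the entry preserves its matcher field.
  for (const [event, entries] of Object.entries(bundled)) {
    const resolved = entries.map((e) => ({
      ...e,
      command: e.command.split("${CLAUDE_PLUGIN_ROOT}").join(pluginRoot),
    }));
    merged[event] = [...(merged[event] ?? []), ...resolved];
  }
  return merged;
}

const userHooks: HooksFile = { PreToolUse: [{ command: "node /home/me/lint.js" }] };
const bundledHooks: HooksFile = {
  PreToolUse: [{ matcher: "Bash", command: "node ${CLAUDE_PLUGIN_ROOT}/scripts/pre-tool-use.mjs" }],
  SessionStart: [{ command: "node ${CLAUDE_PLUGIN_ROOT}/scripts/session-start.mjs" }],
};

const once = mergeHooks(userHooks, bundledHooks, "/opt/agentmemory/plugin");
const twice = mergeHooks(once, bundledHooks, "/opt/agentmemory/plugin");
console.log(JSON.stringify(once) === JSON.stringify(twice));
```

Running the merge a second time against its own output yields the same file, which is the property a reinstall relies on.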
* fix(deps): pin iii-sdk to 0.11.2 to avoid routing regression in 0.11.6

  iii-sdk@0.11.6 changes nested behavior so that all /agentmemory/* routes return 404 against the iii-engine, even though both packages still satisfy the previous "^0.11.2" semver range. npm picked up the new version on `npm install -g @agentmemory/agentmemory` after 0.9.21 shipped, silently breaking installs.

  Two pin sites:

  1. package.json: caret -> exact "0.11.2" so npm cannot drift forward on minor releases until the upstream regression is sorted.
  2. src/cli.ts: `agentmemory setup` previously ran `pnpm up iii-sdk@latest` / `npm install iii-sdk@latest`, which would re-pull 0.11.6+ even after a freshly-pinned install. Both call sites now pin to 0.11.2 with a label referencing this issue.

  Tests (1081) + build pass against iii-sdk@0.11.2.

  Closes rohitg00#555.

* fix(cli): drop issue ref from iii-sdk pin label

  Source labels should describe what the code does, not point at issues that rot as the codebase evolves. Issue context lives in the PR body.
…ohitg00#561)

* fix: read tool_response instead of tool_output in PostToolUse hook

  Claude Code's PostToolUse payload sends the field as `tool_response`, not `tool_output`. The hook was reading `data.tool_output` which is always undefined, so `cleanOutput` was undefined, the observe request contained no `tool_output` value, and mem::compress consistently failed its XML schema validation (requires narrative >= 10 chars + facts >= 1).

  Fix: read `data.tool_response` with `data.tool_output` as a fallback so older integrations that emit the legacy field name keep working.

  Fixes rohitg00#539

* style: remove explanatory comment per repo guidelines
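The fallback read described in the fix can be sketched as below. The payload interface is trimmed to the two fields discussed, and `readToolOutput` is a hypothetical name; the actual hook may combine the fields differently (for example with `||` rather than `??`).

```typescript
// Hypothetical, trimmed-down PostToolUse payload: only the two fields at issue.
interface PostToolUsePayload {
  tool_response?: unknown; // field name Claude Code sends
  tool_output?: unknown; // legacy field name kept for older integrations
}

function readToolOutput(data: PostToolUsePayload): unknown {
  // Prefer the current field; fall back to the legacy one when it is absent.
  return data.tool_response ?? data.tool_output;
}

console.log(readToolOutput({ tool_response: "ok" }));
console.log(readToolOutput({ tool_output: "legacy" }));
```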
…body (rohitg00#526)

OpenAI API spec defines `stream` as defaulting to false when absent, so the current code (which omits it) should yield JSON. Some OpenAI-compatible proxies disagree and default to text/event-stream, which crashes the `response.json()` parser below with:

    Unexpected token 'd', "data: {"id"... is not valid JSON

After a few of these in a row, the resilient wrapper's circuit breaker trips and all subsequent compression calls fail with `circuit_breaker_open`, silently disabling LLM-backed compression / summarisation / reflection.

Reproduced upstream in decolua/9router#1260: 9Router's `handleChatCore` returns SSE unless `stream: false` is explicit. PR decolua/9router#1272 fixes the proxy side, but sending the field explicitly here is defensive: other OpenAI-compatible endpoints (older self-hosted proxies, vLLM compat shims, …) hit the same spec gap.

No behavior change for spec-compliant endpoints (openai.com, Azure OpenAI, well-behaved proxies): they already default to non-streaming when `stream` is absent, so making it explicit is a no-op there.

Co-authored-by: Ptah-CT <221234802+Ptah-CT@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
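A minimal sketch of the defensive change: the chat-completions request body always carries `stream: false`, so a proxy that defaults to SSE still answers with plain JSON. `buildChatBody` and the model name are hypothetical; only the `model`, `messages`, and `stream` fields of the OpenAI-style request are assumed.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Hypothetical builder for an OpenAI-compatible chat completion request body.
function buildChatBody(model: string, messages: ChatMessage[]) {
  return {
    model,
    messages,
    // Explicit even though the spec default is false: some proxies stream
    // unless told otherwise, and response.json() cannot parse SSE frames.
    stream: false as const,
  };
}

const body = JSON.stringify(
  buildChatBody("example-model", [{ role: "user", content: "Summarise this session." }]),
);
console.log(body);
```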
…tg00#560)

* fix(cli): accurately display bound viewer port on splash screen

  - Expose viewerPort and viewerSkipped state in /agentmemory/livez endpoint.
  - Update CLI readiness check to poll until the viewer port is bound or explicitly skipped.
  - Prevents misleading default port (3113) display on splash screen when the viewer falls back to another port.

* fix(viewer): address CodeRabbitAI review
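The readiness check described above could be sketched like this. Only the `viewerPort` and `viewerSkipped` field names come from the commit message; their types, the `viewerSettled` and `waitForViewer` names, and the retry parameters are assumptions.

```typescript
// Assumed shape of the relevant part of the /agentmemory/livez response.
interface LivezPayload {
  viewerPort?: number | null;
  viewerSkipped?: boolean;
}

// The viewer state is "settled" once a port is bound or the viewer was skipped.
function viewerSettled(livez: LivezPayload): boolean {
  return livez.viewerSkipped === true || typeof livez.viewerPort === "number";
}

// Polls an injected fetcher until the viewer state settles or attempts run out.
async function waitForViewer(
  fetchLivez: () => Promise<LivezPayload>,
  attempts = 20,
  delayMs = 250,
): Promise<LivezPayload | null> {
  for (let i = 0; i < attempts; i++) {
    const livez = await fetchLivez();
    if (viewerSettled(livez)) return livez;
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return null; // caller decides what to print when the state never settles
}

console.log(viewerSettled({ viewerPort: 3114 }));
console.log(viewerSettled({}));
```

Printing the port only from a settled payload is what keeps the splash screen from showing the default 3113 when the viewer has fallen back to another port.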
Signed-off-by: aqilaziz <gonzes7@gmail.com>
Ensures pre-tool-use only forwards string session IDs and falls back to unknown for invalid Copilot payload values, with regression coverage for the generated plugin script. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
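A minimal sketch of that guard, assuming the hook accepts both the `session_id` and `sessionId` spellings mentioned in the pull request overview. `resolveSessionId` is a hypothetical name, not the function in the generated plugin script.

```typescript
// Returns the first session ID that is actually a string; anything else
// (numbers, objects, null, missing fields) falls back to "unknown".
function resolveSessionId(payload: Record<string, unknown>): string {
  for (const key of ["session_id", "sessionId"]) {
    const value = payload[key];
    if (typeof value === "string") return value;
  }
  return "unknown";
}

console.log(resolveSessionId({ sessionId: "abc-123" }));
console.log(resolveSessionId({ session_id: 42 }));
```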
Resolves conflicts with current main, keeps Copilot CLI support intact, and preserves the Codex hook idempotency fix from main. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
* Add sudo for global installation command

* Update README.md

  Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>

* Update README with EACCES retry instructions

  Added installation instructions for macOS/Linux users.

* docs: drop backticks around package name inside bash code fence

  Backticks inside a ```bash fenced block are still copy-pasted literally by users, and bash interprets them as command substitution. The package name in the install line had decorative backticks that turn into a shell syntax error when pasted as-is.

---------

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: Rohit Ghumare <ghumare64@gmail.com>
Keeps upstream install guidance and retains Copilot CLI in the supported agent list. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
`agentmemory connect copilot-cli` for MCP-only setup, including `COPILOT_HOME` and Windows-safe command handling
Validation
`npx tsdown` passed locally

This PR is against the fork only for review before deciding whether to open the upstream PR.