
feat(context): add limit & compact query params to GET /context (byte-compatible)#162

Open
todie wants to merge 3 commits into Gentleman-Programming:main from todie:feat/context-limit-compact

@todie commented Apr 7, 2026

Closes #163 (together with #161).

Stacking note

This PR is stacked on top of #161 (perf: shrink SessionStart hook injection). Review #161 first. Because GitHub doesn't support cross-repo stacked PRs with a fork base, this PR's diff against main currently includes #161's changes. Once #161 lands, I'll rebase and force-push-with-lease, and the diff will shrink to just this PR's additions (5 files, +192/−17).

If you'd prefer, I can open this as a single combined PR instead — just let me know.

Motivation

#161 solved the SessionStart token cost client-side, in the hook, using awk. That's the right shape for users who can't easily rebuild the engram binary, but it has drawbacks as a long-term solution:

  1. The knob lives far from the data. The decision about how much context to render should be made in the Go code that knows the data model, not in a shell script that parses the rendered output. Every client (MCP, the CLI, the TUI, external agents) currently has to re-implement its own compaction if it wants short output.
  2. Compact truncation is fragile in shell. The awk pass works but has edge cases — byte vs rune truncation, multi-line observation bodies with embedded ### headers that could confuse section tracking, etc. Doing it in Go is about 20 lines and covers all of them correctly.
  3. Shell compaction can't undo server-side work. The server still serializes the full 300-char content preview for every observation on every request, even if the client immediately throws it away. It's wasted CPU and memory on the engram daemon.

This PR moves the knobs into the server via two new query parameters on GET /context, while preserving byte-for-byte backward compatibility for existing callers.

Design: byte-compatible interface via zero-value options struct

The core requirement: don't break any existing caller. FormatContext is called from:

  • internal/server/server.go (HTTP handler) — 1 site
  • internal/mcp/mcp.go (MCP context tool) — 1 site
  • cmd/engram/main.go (CLI context command) + test shim storeFormatContext — 2 sites
  • internal/store/store_test.go — 8 direct test assertions
  • cmd/engram/main_extra_test.go — 2 shim overrides

Changing the signature of FormatContext(project, scope string) would cascade to ~15 call sites, many of them in tests that pin exact output strings. That's too much churn for a feature addition.

The pattern I chose — used elsewhere in this codebase (SearchOptions, AddObservationParams) — is the options struct + wrapper idiom:

type ContextOptions struct {
    Limit   int  // 0 → use MaxContextResults default
    Compact bool // false → legacy behaviour (inline 300-char preview)
}

// FormatContext is a thin wrapper that forwards a zero-value ContextOptions
// to FormatContextWithOptions. This keeps every existing caller unchanged.
func (s *Store) FormatContext(project, scope string) (string, error) {
    return s.FormatContextWithOptions(project, scope, ContextOptions{})
}

func (s *Store) FormatContextWithOptions(project, scope string, opts ContextOptions) (string, error) {
    // ... new implementation honoring opts ...
}

Byte-compatibility proof: a zero-value ContextOptions{} hits the same code paths as the old FormatContext body. The Limit <= 0 branch falls back to s.cfg.MaxContextResults (exactly what the old code used). The !opts.Compact branch renders - [%s] **%s**: %s\n with truncate(obs.Content, 300) (exactly the old format string). No conditionals flip, no new allocations in the default path.

Empirical verification: I built both the old binary (main) and this branch, pointed them at the same 200+ observation database, and compared the responses:

old binary, /context                 → 7949 B, ~1.0 ms
new binary, /context (no params)     → 7949 B, ~0.9 ms   ← byte-for-byte identical
new binary, /context?limit=8&compact=1 → 728 B, ~0.8 ms   ← −91%

The default-params path is a byte-identical drop-in replacement for the old behavior.

Query parameters

| param | values | effect |
| --- | --- | --- |
| limit | positive integer | Cap the observations section at N bullets. 0, negative, missing, or unparseable → falls back to MaxContextResults (default 20). Sessions (5) and prompts (10) are never affected. |
| compact | strconv.ParseBool: 1, 0, t, f, T, F, true, false, True, False, TRUE, FALSE | Render observation bullets as - [type] **title** only, dropping the : <300 chars of body> preview. Unknown values silently fall back to false (legacy behavior). |

Caveat on compact=: I originally documented "yes" as accepted because I assumed ParseBool followed common-sense rules. It doesn't — only the 12 values above. The second commit in this PR fixes the misleading inline comment. I considered accepting yes/no via a custom parser but decided consistency with the Go stdlib was more valuable than convenience, and silent fallback to false is safer than erroring out on unknown values (bad clients get the old behavior, not a 400).

Hook integration (matched-version fast path + mixed-version fallback)

plugin/claude-code/scripts/session-start.sh is updated to pass ?limit=${ENGRAM_CONTEXT_LIMIT}&compact=1. On a matched-version deployment (new plugin + new binary), the server does the compaction and the awk pass from #161 becomes a near no-op. On a mixed-version deployment (new plugin + old binary), the old binary silently ignores the unknown query params, returns the full response, and the awk fallback from #161 compacts it client-side. Both cases produce a correctly compacted output.

This is the main reason I kept the awk pass in the hook rather than removing it in this PR.

Measurement methodology

  • Database: my real working engram.db with 200+ observations across several months (ctodie project), all in personal scope. Mix of session_summary (multi-line markdown), pattern, discovery, decision, manual, bug types.
  • Setup: built both binaries from the same git state (old = main, new = this branch) and ran each in turn on port 7438 to avoid stepping on the live daemon's port. Hit /context 3 times per variant with curl -w "time_total=%{time_total}s size=%{size_download}\n" to smooth out cold-start jitter.
  • Latencies are loopback. Don't read too much into the micro-differences; the real signal is the payload size delta. The "fast" path is ~10% faster mainly because it serializes less data, not because of any algorithmic win.

| binary | request | payload | median latency |
| --- | --- | --- | --- |
| old (main) | /context?project=ctodie | 7949 B | ~1.0 ms |
| new (this PR) | /context?project=ctodie (default params) | 7949 B | ~0.9 ms |
| new (this PR) | /context?project=ctodie&limit=8&compact=1 | 728 B (−91%) | ~0.8 ms |

End-to-end SessionStart hook injection, real project:

| | bytes |
| --- | --- |
| original (before #161) | ~9800 |
| after #161 alone (shell awk) | ~1900 (−81%) |
| after #161 + this PR (server-side) | ~1151 (−88%) |

Risk analysis

What could break?

  1. Silent semantic drift in FormatContext. Mitigated by TestFormatContextWithOptions asserting defaultCtx == legacyCtx as exact string equality. Any future refactor that touches FormatContextWithOptions without updating this test will flag a drift immediately.
  2. Query-param injection. Both params are parsed with strconv.Atoi / strconv.ParseBool before any store interaction. Unparseable values silently fall back to defaults, never reach the store, and never affect the SQL query (the store-level limit passed to RecentObservations is either a validated positive int or the config default).
  3. Existing test fixtures breaking. None did. go test ./... went from 747 passed to 748 passed (one new test), and to 749 with the extra assertion in the e2e suite. No existing assertions were changed.
  4. MCP clients expecting the old format. MCP clients go through internal/mcp/mcp.go which still calls FormatContext(project, scope) — i.e. the backward-compat wrapper. Their output is unchanged byte-for-byte.

What's the rollback?

  • git revert <merge-commit>. No database migration, no config file format change, no on-disk state change. The new ContextOptions struct and FormatContextWithOptions method disappear with the revert; their only caller outside the store package, the HTTP handler in internal/server/server.go, is reverted in the same commit.

Tests

  • New: internal/store/store_test.go::TestFormatContextWithOptions. Covers:
    • ContextOptions{} → byte-equal to FormatContext (backward compat)
    • Compact: true → drops content preview, keeps titles
    • Limit: 2 → exactly 2 observation bullets (using strings.Count)
    • Limit: 3, Compact: true → both knobs compose correctly
    • Limit: 0 → falls back to default (byte-equal to legacy)
  • Extended: internal/server/server_e2e_test.go's /context test now also hits ?limit=1&compact=1 and asserts the compact payload is strictly smaller than the default and contains no : <body> segments.
  • Full suite: go test ./... → 748 passed in 10 packages (was 747 before the new test; a second e2e assertion brings it to 749 on this branch).

Files changed (this PR's additions only, excluding #161)

internal/server/server.go                   | 25 ++++++--
internal/server/server_e2e_test.go          | 20 ++++++
internal/store/store.go                     | 43 ++++++++++++-
internal/store/store_test.go                | 96 +++++++++++++++++++++++++++++
plugin/claude-code/scripts/session-start.sh | 25 +++++---
5 files changed, 192 insertions(+), 17 deletions(-)

Commits

  1. feat(context): add limit + compact query params to GET /context — the main change
  2. docs(context): fix compact= param comment — ParseBool rejects 'yes' — drive-by doc fix after I benchmarked the live binary and caught the inaccuracy

Happy to squash if you prefer a single commit.

🤖 Generated with Claude Code

todie and others added 3 commits April 7, 2026 01:52

The SessionStart hook was injecting ~10 KB of additionalContext into every
Claude Code session:

  - ~1.8 KB hardcoded "ACTIVE PROTOCOL" heredoc that duplicates the rules
    already shipped in skills/memory/SKILL.md (which loads on demand)
  - ~8 KB of /context payload, because the server inlines up to 300 chars
    of raw (often multi-line markdown) content per observation bullet and
    returns up to MaxContextResults (default 20)

On a busy project this meant every new session burned ~2.5k tokens on
redundant protocol reminders and verbose observation previews before the
user had typed a single word.

Changes to plugin/claude-code/scripts/session-start.sh:

1. Replace the 35-line PROTOCOL heredoc with a short pointer that lists the
   available tools and directs the agent to the engram:memory skill for the
   full protocol. The skill is already part of this plugin, so the rules
   are one ToolSearch away when they are actually needed.

2. Post-process the /context response with awk to (a) concatenate each
   observation's multi-line content onto a single line, (b) collapse
   whitespace, (c) cap per-bullet length at ENGRAM_CONTEXT_MAXLEN chars
   (default 140), and (d) keep at most ENGRAM_CONTEXT_LIMIT bullets
   (default 8). Both tunables are env-overridable so users can dial the
   verbosity back up if they want.

No server or Go changes — fully backward compatible. The raw /context
endpoint behaviour is unchanged; only the hook's rendering of it is
trimmed.

Measured on a ctodie project with 200+ observations:
  before: 7961 B context + ~1800 B protocol ≈ 9.8 KB per session start
  after:  1487 B context +  ~450 B protocol ≈  1.9 KB per session start
  savings: ~8 KB (~80%) of additionalContext per session

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Stacked on top of #1 (perf: shrink SessionStart hook injection).

The /context endpoint always rendered every recent observation with up to
300 chars of raw content inlined per bullet, returning ~8 KB on any busy
project. PR #1 worked around this client-side with awk; this commit moves
the knobs into the server so the rendering choice lives next to the data.

New query parameters
--------------------
  limit=N    Cap the observations section at N bullets. Zero or absent
             falls back to the store-wide MaxContextResults default.
             Sessions and prompts are not affected (still 5 and 10).

  compact=1  Render observation bullets as `- [type] **title**` only,
             dropping the `: <300 chars of body>` preview. Parsed with
             strconv.ParseBool, so `1`, `t`, `true`, etc. work; `yes` does not.

Backward compatibility
----------------------
FormatContext(project, scope) stays as a thin wrapper over the new
FormatContextWithOptions(project, scope, opts) method with a zero-value
ContextOptions, so all ~15 existing callers and test fixtures are
untouched. A GET /context request with no new params produces byte-for-
byte identical output to the pre-change binary (verified empirically,
see benchmark below).

Hook update
-----------
plugin/claude-code/scripts/session-start.sh now passes
`?limit=${ENGRAM_CONTEXT_LIMIT}&compact=1` to the server. The awk
post-processor from #1 stays in place as a belt-and-suspenders fallback
for users whose binary is older than their plugin — on a new server it's
a near no-op; on an old server it still compacts things client-side.

Tests
-----
- internal/store: new TestFormatContextWithOptions covers default
  equivalence with FormatContext, Compact dropping previews, Limit
  capping observations, Limit+Compact composition, and Limit<=0 falling
  back to the default.
- internal/server: e2e test now hits /context with and without
  &limit=1&compact=1 and asserts the compact payload is strictly smaller
  and contains no `: <body>` segments.

Full suite: 748 passed in 10 packages.

Benchmark (ctodie project, 200+ observations, loopback)
-------------------------------------------------------
  old binary,  /context                        7949 B   ~1.0 ms
  new binary,  /context (default params)       7949 B   ~0.9 ms  ← identical bytes
  new binary,  /context?limit=8&compact=1       728 B   ~0.8 ms  ← −91%

End-to-end SessionStart hook injection
--------------------------------------
  original (before #1):                       ~9800 B
  after #1 (shell awk only):                  ~1900 B   −81%
  after #1 + #2 (server-side):                ~1151 B   −88%

PR #1 alone already solved the token cost for users who can't upgrade
their binary; this PR makes the fast path the default and keeps the
shell fallback for mixed-version deployments.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Go's strconv.ParseBool only accepts 1/0/t/f/T/F/true/false/True/False/
TRUE/FALSE. The original comment claimed 'yes' was also supported; it
isn't — compact=yes silently falls back to Compact=false. Verified
against the live binary: /context?limit=8&compact=yes returned 8 full
(non-compact) observations.

Drive-by doc fix; no behaviour change.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Collaborator

@Alan-TheGentleman Alan-TheGentleman left a comment

Review — APPROVED

Excellent PR. Backward compatibility is fully verified, it is well tested, and the implementation is idiomatic.

Backward Compatibility ✓

  • FormatContext(project, scope) still exists as a wrapper around FormatContextWithOptions with zero-value options
  • MCP clients still call FormatContext() directly, so their output is byte-for-byte identical
  • GET /context without query params → same flow as the previous handler
  • The defaultCtx == legacyCtx string-equality test is an excellent guardrail

Compact mode ✓

  • Removes only the content preview; keeps type + title + sessions + prompts
  • Enough for an agent to know WHAT exists and call mem_get_observation on demand

Edge cases ✓

  • limit=0, limit=-1, limit=abc → all fall back to the default correctly
  • Double safety net in RecentObservations: redundant but safe
  • limit + compact combined → tested

Minor observations (non-blocking)

  1. compact e2e test: the assertion looks for **: as a global substring, which could match an observation title. Better to check line by line.
  2. compact=yes silently falls back to false: strconv.ParseBool does not accept "yes"/"on". The fallback is safe (it returns the full context) but may be confusing.
  3. Squash the commits before merging; commit 1 belongs to PR #161.

Fantastic work. The options struct + wrapper pattern is consistent with SearchOptions in the same codebase. This can be merged with confidence.

Development

Successfully merging this pull request may close these issues.

perf: SessionStart hook injects ~10 KB (~2.5k tokens) into every Claude Code session
