cllama: session history post-processing layer (retention, memory, embeddings, archival) #164

@mostlydev

Problem

Session history (history.jsonl) is append-only with no post-processing hooks. ADR-018 acknowledges retention as future work, and Phase 2/3 plans call for derived memory, summaries, and embeddings — but these are separate concerns in the doc. In practice they're all the same shape: something consumes session history entries and produces a derived artifact or mutates the history.

Rather than building each as a bespoke feature, this issue proposes a general session post-processing layer — a pipeline of pluggable processors that run over session history, with retention/truncation as one of several built-in processors.

Motivating example

On Tiverton, sentinel runs a heartbeat poll every ~10 minutes (144 calls/day). Each entry is ~154 KB (full request + system prompt + response). After ~3 weeks the file is 380 MB, growing ~22 MB/day, and it consists almost entirely of HEARTBEAT_OK responses with no long-term value. A simple age/size-based retention processor would solve this. But the same plumbing would let us plug in summarization for traders, embeddings for research agents, and cold archival for everyone.
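As a quick sanity check on those numbers, the growth rate can be recomputed from the call count and entry size. This is a minimal sketch: the constants come from the paragraph above, decimal units (1 MB = 10^6 bytes) are an assumption, and the 3-day window is only an illustrative retention setting.

```go
package main

import "fmt"

const (
	callsPerDay   = 144     // heartbeat polls per day, from the issue text
	bytesPerEntry = 154_000 // ~154 KB per entry, assuming decimal units
)

// dailyGrowthMB returns history growth per day in decimal megabytes.
func dailyGrowthMB(calls, entryBytes int) float64 {
	return float64(calls*entryBytes) / 1e6
}

func main() {
	perDay := dailyGrowthMB(callsPerDay, bytesPerEntry)
	fmt.Printf("growth: %.1f MB/day\n", perDay)     // growth: 22.2 MB/day
	fmt.Printf("3-day window: %.1f MB\n", 3*perDay) // 3-day window: 66.5 MB
}
```

Under these assumptions, an age-based retention window of a few days would hold the file at tens of megabytes instead of letting it grow without bound.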

Processor types

Different processors, same interface — consume entries, optionally produce derived output, optionally mutate history:

| Processor | Purpose | Output |
| --- | --- | --- |
| retain | Drop entries older than N days or when file exceeds N MB | Mutates history.jsonl |
| summarize | LLM-driven condensation of stale entries into rolling memory | Writes to .claw-memory/<agent>/ |
| embed | Generate embeddings for semantic recall | Writes to embedding store |
| extract | Pull structured facts/decisions/tasks into memory | Writes to memory |
| archive | Move old entries to cold storage (e.g. xz tarball) | External blob |
| redact | Strip secrets/PII before retention or archival | Mutates entries |
| forward | Push to external analytics (PostHog LLM analytics, etc.) | External sink |

Phase 2/3 of ADR-018 (derived memory, summaries, embeddings) fall out naturally as implementations of this interface instead of separate subsystems.

Config surface

session_history:
  processors:
    - type: retain
      max_age_days: 30
      max_size_mb: 100
    - type: archive
      older_than_days: 30
      destination: ./cold-storage
    - type: summarize
      trigger: on_size_threshold
      threshold_mb: 50
      model: claude-haiku-4-5
      output: .claw-memory/{agent}/summary.md

  agents:
    sentinel:
      processors:
        - type: retain
          max_age_days: 3
    trader-dundas:
      processors:
        - type: summarize
          trigger: nightly
        - type: embed
          trigger: on_write
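
One open question in the config above is how a per-agent list combines with the global one. The sketch below assumes the simplest rule, that an agent's own list replaces the global list wholesale. That rule, the struct and field names, and the `research` agent are all hypothetical; merging or appending would be equally valid designs.

```go
package main

import "fmt"

// ProcessorConfig mirrors one item under `processors:`. Field names are
// hypothetical; only the keys used by `retain` are modelled here.
type ProcessorConfig struct {
	Type       string
	MaxAgeDays int
	MaxSizeMB  int
}

// SessionHistoryConfig mirrors the `session_history:` block.
type SessionHistoryConfig struct {
	Processors []ProcessorConfig            // global default pipeline
	Agents     map[string][]ProcessorConfig // per-agent pipelines
}

// ProcessorsFor returns the agent's own list when one is configured and the
// global list otherwise. Whole-list replacement is an assumption, not
// something the proposal specifies.
func (c SessionHistoryConfig) ProcessorsFor(agent string) []ProcessorConfig {
	if own, ok := c.Agents[agent]; ok && len(own) > 0 {
		return own
	}
	return c.Processors
}

func main() {
	cfg := SessionHistoryConfig{
		Processors: []ProcessorConfig{{Type: "retain", MaxAgeDays: 30, MaxSizeMB: 100}},
		Agents: map[string][]ProcessorConfig{
			"sentinel": {{Type: "retain", MaxAgeDays: 3}},
		},
	}
	fmt.Println(cfg.ProcessorsFor("sentinel")[0].MaxAgeDays) // 3
	fmt.Println(cfg.ProcessorsFor("research")[0].MaxAgeDays) // 30
}
```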

Processor interface (sketch)

type Processor interface {
    Name() string
    Triggers() []Trigger  // on_write, on_schedule, on_startup, on_size_threshold
    Process(ctx context.Context, agent string, entries []Entry) (ProcessResult, error)
}

type ProcessResult struct {
    Drop    []EntryID           // entries to remove from history
    Replace map[EntryID]Entry   // entries to rewrite (e.g. redaction)
    Derived []Artifact          // memory, embeddings, summaries, archive blobs
}

Processors run out-of-band from the hot path: the recorder keeps writing synchronously, while processors run on a schedule or against a backlog, so they never block LLM turns.
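
To make the interface sketch concrete, here is a rough age-based `retain` processor together with a helper that applies a `ProcessResult` to an in-memory slice of entries. The `Entry` fields, `Artifact`, the `Trigger` string type, the injectable clock, and `Apply` are hypothetical stand-ins for things the sketch leaves undefined. A real implementation would rewrite history.jsonl rather than a slice.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Hypothetical stand-ins for types the interface sketch leaves undefined.
type (
	EntryID string
	Trigger string
)

type Entry struct {
	ID        EntryID
	Timestamp time.Time
}

type Artifact struct{ Kind, Path string }

type ProcessResult struct {
	Drop    []EntryID
	Replace map[EntryID]Entry
	Derived []Artifact
}

type Processor interface {
	Name() string
	Triggers() []Trigger
	Process(ctx context.Context, agent string, entries []Entry) (ProcessResult, error)
}

// RetainByAge marks entries older than MaxAge for removal.
type RetainByAge struct {
	MaxAge time.Duration
	Now    func() time.Time // injectable clock so the cutoff is testable
}

var _ Processor = RetainByAge{} // compile-time interface check

func (r RetainByAge) Name() string        { return "retain" }
func (r RetainByAge) Triggers() []Trigger { return []Trigger{"on_schedule"} }

func (r RetainByAge) Process(ctx context.Context, agent string, entries []Entry) (ProcessResult, error) {
	if err := ctx.Err(); err != nil {
		return ProcessResult{}, err
	}
	cutoff := r.Now().Add(-r.MaxAge)
	var res ProcessResult
	for _, e := range entries {
		if e.Timestamp.Before(cutoff) {
			res.Drop = append(res.Drop, e.ID)
		}
	}
	return res, nil
}

// Apply returns the entries that remain after dropping res.Drop and
// substituting res.Replace. The input slice is left unmodified.
func Apply(entries []Entry, res ProcessResult) []Entry {
	drop := make(map[EntryID]bool, len(res.Drop))
	for _, id := range res.Drop {
		drop[id] = true
	}
	kept := make([]Entry, 0, len(entries))
	for _, e := range entries {
		if drop[e.ID] {
			continue
		}
		if repl, ok := res.Replace[e.ID]; ok {
			e = repl
		}
		kept = append(kept, e)
	}
	return kept
}

func main() {
	now := time.Date(2026, 4, 1, 0, 0, 0, 0, time.UTC)
	p := RetainByAge{MaxAge: 3 * 24 * time.Hour, Now: func() time.Time { return now }}
	entries := []Entry{
		{ID: "old", Timestamp: now.Add(-5 * 24 * time.Hour)},
		{ID: "new", Timestamp: now.Add(-time.Hour)},
	}
	res, err := p.Process(context.Background(), "sentinel", entries)
	if err != nil {
		panic(err)
	}
	fmt.Println(res.Drop)                // [old]
	fmt.Println(len(Apply(entries, res))) // 1
}
```

Keeping `Process` free of side effects and applying the result in a separate step means the runner, not each processor, owns the rewrite of the history file.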

Implementation notes

  • Recorder lives in cllama/internal/sessionhistory/recorder.go
  • The existing history.index.json (checkpoints every 128 entries) makes time-range seeks cheap — retention and archival can operate incrementally
  • Per-agent config plumbs through CllamaProxyConfig in internal/pod/compose_emit.go
  • Running processors in the cllama process keeps them close to the data, but a sidecar model is also viable (read via the existing /history/{agentID} HTTP endpoint)
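
If processors do run inside the cllama process, one way to keep them off the write path is a bounded backlog with a non-blocking notify. This is a minimal sketch and not the recorder's actual design: `Backlog`, `Notify`, and `Run` are hypothetical names.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// Backlog is a hypothetical bounded queue of agent IDs whose history has
// unprocessed entries.
type Backlog struct {
	ch chan string
}

func NewBacklog(capacity int) *Backlog {
	return &Backlog{ch: make(chan string, capacity)}
}

// Notify would be called from the recorder's write path. It never blocks:
// when the queue is full the signal is dropped and a later scheduled pass is
// expected to catch up.
func (b *Backlog) Notify(agent string) bool {
	select {
	case b.ch <- agent:
		return true
	default:
		return false
	}
}

// Run drains the backlog until ctx is cancelled, calling process per agent.
func (b *Backlog) Run(ctx context.Context, process func(agent string)) {
	for {
		select {
		case <-ctx.Done():
			return
		case agent := <-b.ch:
			process(agent)
		}
	}
}

func main() {
	b := NewBacklog(1)
	fmt.Println(b.Notify("sentinel"))      // true: queued
	fmt.Println(b.Notify("trader-dundas")) // false: queue full, writer not blocked

	ctx, cancel := context.WithCancel(context.Background())
	var wg sync.WaitGroup
	wg.Add(1)
	go b.Run(ctx, func(agent string) {
		fmt.Println("processing", agent)
		wg.Done()
	})
	wg.Wait() // wait for the single queued agent to be processed
	cancel()
}
```

Dropping a signal when the queue is full trades promptness for a guarantee that the recorder never waits on a slow processor.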

References

  • ADR-018: docs/decisions/018-session-history-and-memory-retention.md (Phase 1 complete; Phase 2/3 subsumed by this)
  • Implementation plan: docs/plans/2026-03-26-cllama-session-history.md
