Problem
Session history (history.jsonl) is append-only with no post-processing hooks. ADR-018 acknowledges retention as future work, and Phase 2/3 plans call for derived memory, summaries, and embeddings — but these are separate concerns in the doc. In practice they're all the same shape: something consumes session history entries and produces a derived artifact or mutates the history.
Rather than building each as a bespoke feature, this issue proposes a general session post-processing layer — a pipeline of pluggable processors that run over session history, with retention/truncation as one of several built-in processors.
Motivating example
On Tiverton, `sentinel` runs a heartbeat poll every ~9 minutes (144 calls/day). Each entry is ~154 KB (full request + system prompt + response). After ~3 weeks the file is 380 MB, growing ~22 MB/day, and it's almost entirely `HEARTBEAT_OK` responses with no long-term value. A simple age/size-based retention processor would solve this. But the same plumbing would let us plug in summarization for traders, embeddings for research agents, and cold archival for everyone.
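The growth rate is just poll cadence times entry size; a quick check of the numbers above (the helper function is illustrative, not cllama code):

```go
package main

import "fmt"

// dailyGrowthMB estimates history growth from poll cadence and entry size.
func dailyGrowthMB(callsPerDay int, entryKB float64) float64 {
	return float64(callsPerDay) * entryKB / 1024
}

func main() {
	// Heartbeat numbers from above: 144 calls/day at ~154 KB each.
	perDay := dailyGrowthMB(144, 154)
	fmt.Printf("~%.0f MB/day, ~%.0f MB over 3 weeks\n", perDay, perDay*21)
}
```

That upper bound (~455 MB at a full three weeks) is in line with the observed 380 MB file.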
Processor types
Different processors, same interface — consume entries, optionally produce derived output, optionally mutate history:
| Processor | Purpose | Output |
| --- | --- | --- |
| `retain` | Drop entries older than N days or when file exceeds N MB | Mutates `history.jsonl` |
| `summarize` | LLM-driven condensation of stale entries into rolling memory | Writes to `.claw-memory/<agent>/` |
| `embed` | Generate embeddings for semantic recall | Writes to embedding store |
| `extract` | Pull structured facts/decisions/tasks into memory | Writes to memory |
| `archive` | Move old entries to cold storage (e.g. xz tarball) | External blob |
| `redact` | Strip secrets/PII before retention or archival | Mutates entries |
| `forward` | Push to external analytics (PostHog LLM analytics, etc.) | External sink |
Phases 2 and 3 of ADR-018 (derived memory, summaries, embeddings) then fall out naturally as implementations of this interface instead of separate subsystems.
Config surface
```yaml
session_history:
  processors:
    - type: retain
      max_age_days: 30
      max_size_mb: 100
    - type: archive
      older_than_days: 30
      destination: ./cold-storage
    - type: summarize
      trigger: on_size_threshold
      threshold_mb: 50
      model: claude-haiku-4-5
      output: .claw-memory/{agent}/summary.md
  agents:
    sentinel:
      processors:
        - type: retain
          max_age_days: 3
    trader-dundas:
      processors:
        - type: summarize
          trigger: nightly
        - type: embed
          trigger: on_write
```
Processor interface (sketch)
```go
type Processor interface {
	Name() string
	Triggers() []Trigger // on_write, on_schedule, on_startup, on_size_threshold
	Process(ctx context.Context, agent string, entries []Entry) (ProcessResult, error)
}

type ProcessResult struct {
	Drop    []EntryID         // entries to remove from history
	Replace map[EntryID]Entry // entries to rewrite (e.g. redaction)
	Derived []Artifact        // memory, embeddings, summaries, archive blobs
}
```
Runs out-of-band from the hot path — the recorder keeps writing synchronously, processors run on a schedule or against a backlog so they never block LLM turns.
Implementation notes
- Recorder lives in `cllama/internal/sessionhistory/recorder.go`
- The existing `history.index.json` (checkpoints every 128 entries) makes time-range seeks cheap — retention and archival can operate incrementally
- Per-agent config plumbs through `CllamaProxyConfig` in `internal/pod/compose_emit.go`
- Running processors in the cllama process keeps them close to the data, but a sidecar model is also viable (read via the existing `/history/{agentID}` HTTP endpoint)
References
- ADR-018: `docs/decisions/018-session-history-and-memory-retention.md` (Phase 1 complete; Phase 2/3 subsumed by this)
- Implementation plan: `docs/plans/2026-03-26-cllama-session-history.md`