Compaction: align summary framing across cache-aligned and fallback paths + add cache-hit telemetry

## Context

Follow-ups from an external code review of the cache-aligned compaction work that landed in v0.8.10 (#580 / `754e8bd4` / `crates/tui/src/compaction.rs`).

The review's verdict on the change itself was positive: the `should_use_cache_aligned_summary` gate is conservative (V4-scale window, ≤ 85% of budget, model identifiable), the 512-token instruction buffer is reasonable, and the test `cache_aligned_summary_request_preserves_message_prefix` validates the structural invariant the cache needs.

The reviewer flagged one nuance worth tracking and one piece of telemetry that would let us see the win in production.

## The behavioral fork

> The two paths give the model slightly different context for the summary. The cache-aligned path has the model "reading" the conversation as its own history and then being asked to summarize it. The fallback path feeds a reformatted `User:/Assistant:` transcript with a "You are a helpful assistant that creates concise conversation summaries" system prompt. In practice this shouldn't matter — the V4 models are strong enough that either framing produces equivalent summaries — but it's a behavioral fork worth being aware of.

Concretely:

- **Cache-aligned path** (`build_cache_aligned_summary_request`): replays the original message prefix, sends `system: None`, appends the instruction as a final `user` turn.
- **Fallback path**: reformats the conversation into a flat `User:/Assistant:` transcript and sends it under a "helpful assistant that creates concise conversation summaries" system prompt.
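
To make the fork concrete, here is a minimal sketch of the two request shapes. The types `Role`, `Message`, and `SummaryRequest`, and both function names, are hypothetical stand-ins rather than the real definitions in `crates/tui/src/compaction.rs`; where the fallback path places the instruction relative to the transcript is also an assumption.

```rust
// Hypothetical stand-in types; the real ones in compaction.rs may differ.
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone, PartialEq)]
struct Message {
    role: Role,
    content: String,
}

#[derive(Debug, Clone, PartialEq)]
struct SummaryRequest {
    system: Option<String>,
    messages: Vec<Message>,
}

// System prompt text as quoted in the review.
const FALLBACK_SYSTEM_PROMPT: &str =
    "You are a helpful assistant that creates concise conversation summaries";

/// Cache-aligned shape: replay the original messages unchanged, send no
/// system prompt, and append the instruction as a final user turn.
fn cache_aligned_request(history: &[Message], instruction: &str) -> SummaryRequest {
    let mut messages = history.to_vec();
    messages.push(Message {
        role: Role::User,
        content: instruction.to_string(),
    });
    SummaryRequest { system: None, messages }
}

/// Fallback shape: flatten the history into a `User:/Assistant:` transcript
/// and send it under the summarizer system prompt.
/// Assumption: the instruction is prepended to the transcript in one user turn.
fn fallback_request(history: &[Message], instruction: &str) -> SummaryRequest {
    let transcript = history
        .iter()
        .map(|m| {
            let label = match m.role {
                Role::User => "User",
                Role::Assistant => "Assistant",
            };
            format!("{label}: {}", m.content)
        })
        .collect::<Vec<_>>()
        .join("\n");
    SummaryRequest {
        system: Some(FALLBACK_SYSTEM_PROMPT.to_string()),
        messages: vec![Message {
            role: Role::User,
            content: format!("{instruction}\n\n{transcript}"),
        }],
    }
}
```

In the first shape the request shares its leading messages with the preceding turns, which is the property `cache_aligned_summary_request_preserves_message_prefix` checks; in the second, the model sees the conversation as quoted text rather than as its own history.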

Either of two outcomes would be healthy:

1. **Document the divergence.** Add a doc-comment on `should_use_cache_aligned_summary` explaining *why* the framing diverges and that the empirical bar is "summaries are equivalent in practice on V4."
2. **Decide whether to unify.** Either drop the system prompt from the fallback path so both paths look like "summarize your own conversation," or keep both and codify the rationale.

## The missing telemetry

Today the footer chip shows cache-hit % for normal turns, but the **compaction summary call is the request we most expect to benefit** from the cache-aligned change, and we have no per-call number for it. Without one, the win is unobservable after deploy.

Suggested:
- Record the cache-hit rate of the summary call itself and surface it once (transcript line, log, or footer chip) when a compaction completes.
- Optional: emit a one-line debug log under `RUST_LOG=deepseek_cli::core::compaction=debug` with `path={cache_aligned|fallback}, prompt_tokens=…, cache_hit_pct=…` so we can grep production logs.
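
A minimal sketch of the per-call number and the log line, using only the standard library. `SummaryPath`, `cache_hit_pct`, and `compaction_log_line` are hypothetical names, and the sketch assumes the provider's usage data exposes a count of prompt tokens served from cache; in the real code the formatted line would presumably go through whatever logging macro the crate already uses.

```rust
// Hypothetical names throughout; not the existing telemetry API.
#[derive(Debug, Clone, Copy, PartialEq)]
enum SummaryPath {
    CacheAligned,
    Fallback,
}

impl SummaryPath {
    fn as_str(self) -> &'static str {
        match self {
            SummaryPath::CacheAligned => "cache_aligned",
            SummaryPath::Fallback => "fallback",
        }
    }
}

/// Percentage of prompt tokens served from cache for a single call.
/// Returns 0.0 for an empty prompt, and clamps hits to the prompt size
/// so a malformed usage report cannot yield more than 100%.
fn cache_hit_pct(prompt_tokens: u64, cache_hit_tokens: u64) -> f64 {
    if prompt_tokens == 0 {
        return 0.0;
    }
    let hits = cache_hit_tokens.min(prompt_tokens);
    hits as f64 / prompt_tokens as f64 * 100.0
}

/// One greppable line per compaction, in the field order suggested above.
fn compaction_log_line(path: SummaryPath, prompt_tokens: u64, cache_hit_tokens: u64) -> String {
    format!(
        "path={}, prompt_tokens={}, cache_hit_pct={:.1}",
        path.as_str(),
        prompt_tokens,
        cache_hit_pct(prompt_tokens, cache_hit_tokens)
    )
}
```

Recording `path` alongside the percentage is what makes the comparison possible: the expectation is a high hit rate on `cache_aligned` calls and a low one on `fallback` calls.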

## Suggested follow-ups for v0.8.11

- [ ] Add a test (or runtime sanity check in debug builds) asserting that summaries from both paths are roughly equivalent — e.g. length within ±25% and both contain the user's original goal phrase.
- [ ] Decide whether to unify the framing or document the divergence; land whichever you choose with a doc-comment on `should_use_cache_aligned_summary`.
- [ ] Surface the compaction-summary call's cache-hit rate on completion (transcript, footer chip, or log line) so the V4 cache win is observable.
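
The first checkbox could be backed by a helper along these lines. This is a sketch under stated assumptions: `summaries_roughly_equivalent` is a hypothetical name, length is measured in characters, "within ±25%" is interpreted symmetrically (the shorter summary must be at least 75% of the longer one), and the goal phrase is matched case-insensitively.

```rust
/// Rough equivalence check for summaries produced by the two paths.
/// Assumptions: character-count length, symmetric 25% tolerance,
/// case-insensitive substring match on the goal phrase.
fn summaries_roughly_equivalent(a: &str, b: &str, goal_phrase: &str) -> bool {
    let len_a = a.chars().count();
    let len_b = b.chars().count();
    if len_a == 0 || len_b == 0 {
        return false;
    }

    // The shorter summary must be at least 75% as long as the longer one.
    let shorter = len_a.min(len_b) as f64;
    let longer = len_a.max(len_b) as f64;
    let length_ok = shorter / longer >= 0.75;

    // Both summaries must still mention the user's original goal.
    let goal = goal_phrase.to_lowercase();
    let goal_ok = a.to_lowercase().contains(&goal) && b.to_lowercase().contains(&goal);

    length_ok && goal_ok
}
```

A substring match is a blunt instrument, since a summary can preserve the goal while paraphrasing it, so a check like this is better suited to a debug-build warning or a fixture-based test than to a hard runtime failure.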

## References

- v0.8.10 change: commit `754e8bd4`, PR #572, motivated by #575 / #580.
- Implementation: `crates/tui/src/compaction.rs` — `should_use_cache_aligned_summary`, `build_cache_aligned_summary_request`.
- Companion change: prompt-cache awareness section added in `crates/tui/src/prompts.rs` (`874e8b4b`).

Compaction: align summary framing across cache-aligned and fallback paths + add cache-hit telemetry #584
