feat(memory_tree): add helper to clear reembed-skipped tombstones #75

@ElioNeto

Original issue tinyhumansai#2358 by @sanil-23 on 2026-05-20T15:16:12Z


Summary

Add an admin / RPC path to clear rows from mem_tree_chunk_reembed_skipped and mem_tree_summary_reembed_skipped. Today the only way to un-skip a chunk/summary after a terminal failure is direct DELETE FROM … SQL against the workspace database.

Why

PR tinyhumansai#2349 introduced the two *_reembed_skipped tombstone tables so the re-embed backfill terminates instead of looping forever on rows that fail their body read / embed (tinyhumansai#1574 §6 runaway-loop fix). The tombstones are durably keyed by (chunk_id | summary_id, model_signature).
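To illustrate why a tombstone keyed by (chunk_id, model_signature) lets the backfill terminate, here is a minimal Rust sketch. It is not the code from the PR: the tombstone table is modelled as an in-memory HashSet rather than a SQLite table, and `Tombstones`, `worklist`, `run_backfill`, and the `embed` callback are hypothetical names chosen for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for `mem_tree_chunk_reembed_skipped`:
/// a set of (chunk_id, model_signature) keys.
type Tombstones = HashSet<(String, String)>;

/// Chunks that still need an embedding for `sig` and are not tombstoned.
/// This plays the role of a `NOT EXISTS … reembed_skipped` filter.
fn worklist(pending: &[String], skipped: &Tombstones, sig: &str) -> Vec<String> {
    pending
        .iter()
        .filter(|id| !skipped.contains(&(id.to_string(), sig.to_string())))
        .cloned()
        .collect()
}

/// Runs backfill passes until the worklist is empty and returns the number
/// of passes. `embed` returns false on a terminal failure (assumed shape).
fn run_backfill(
    pending: &mut Vec<String>,
    skipped: &mut Tombstones,
    sig: &str,
    embed: impl Fn(&str) -> bool,
) -> usize {
    let mut passes = 0;
    loop {
        let work = worklist(pending.as_slice(), &*skipped, sig);
        if work.is_empty() {
            return passes;
        }
        passes += 1;
        for id in work {
            if embed(&id) {
                // Embedded successfully: the chunk no longer needs work.
                pending.retain(|p| p != &id);
            } else {
                // Terminal failure: tombstone it so the next pass skips it.
                skipped.insert((id, sig.to_string()));
            }
        }
    }
}
```

Without the `skipped.insert` branch, a chunk that always fails would reappear in every pass and the loop would never exit. Because the key includes the signature, a chunk skipped under one model signature is still picked up under another. Nothing in this loop ever removes a tombstone, which is the gap described next.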

That's the right durability for the common case, but it creates an irreversible state when the failure was environmental rather than data-permanent:

  • A user moves their workspace directory, which breaks relative paths, and later restores it → the orphaned rows stay tombstoned forever, even though the body files are readable again
  • A partial backup-restore leaves tombstones from a state that no longer matches reality
  • An operator wants to retry a row after fixing a misconfigured embedder

Right now the only remediation is opening the SQLite file by hand and running DELETE FROM mem_tree_chunk_reembed_skipped WHERE …, which is not contributor-safe and is easy to get wrong.

Proposed scope

  • A core helper: pub fn clear_chunk_reembed_skipped(config, chunk_id, model_signature) -> Result<()> (and the summary counterpart). Idempotent.
  • A bulk variant: pub fn clear_reembed_skipped_for_signature(config, model_signature) -> Result<usize> for clearing the entire signature when the operator knows the underlying cause has been resolved.
  • (Optional) An admin JSON-RPC method like openhuman.memory_clear_reembed_skipped exposing both single-row and bulk variants — feature-gated behind the existing admin/debug flag if one exists.
  • A test that, after clearing, the worklist queries (NOT EXISTS … reembed_skipped) re-include the row and ensure_reembed_backfill's has_uncovered probe flips back to true.
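The intended semantics of the helpers above can be sketched as follows. This is a sketch under stated assumptions, not the proposed implementation: the two tombstone tables are modelled as in-memory sets in a hypothetical `SkippedStore` instead of being reached through `config`, the error type is a plain `String` to keep the example std-only, the bulk variant is assumed to cover both the chunk and summary tables, and `has_uncovered` here is a simplified analogue of the probe in ensure_reembed_backfill.

```rust
use std::collections::HashSet;

/// Hypothetical in-memory stand-in for the two tombstone tables. The real
/// helpers would take `config` and issue DELETE statements against SQLite.
#[derive(Default)]
struct SkippedStore {
    chunks: HashSet<(String, String)>,    // (chunk_id, model_signature)
    summaries: HashSet<(String, String)>, // (summary_id, model_signature)
}

impl SkippedStore {
    /// Single-row clear. Idempotent: clearing a missing row is still Ok.
    fn clear_chunk_reembed_skipped(&mut self, chunk_id: &str, sig: &str) -> Result<(), String> {
        self.chunks.remove(&(chunk_id.to_string(), sig.to_string()));
        Ok(())
    }

    /// Bulk clear for one signature across both tables (an assumption);
    /// returns how many tombstones were removed.
    fn clear_reembed_skipped_for_signature(&mut self, sig: &str) -> Result<usize, String> {
        let before = self.chunks.len() + self.summaries.len();
        self.chunks.retain(|(_, s)| s.as_str() != sig);
        self.summaries.retain(|(_, s)| s.as_str() != sig);
        Ok(before - (self.chunks.len() + self.summaries.len()))
    }

    /// Simplified analogue of the `has_uncovered` probe: true if any
    /// un-embedded chunk for `sig` is not tombstoned.
    fn has_uncovered(&self, unembedded: &[&str], sig: &str) -> bool {
        unembedded
            .iter()
            .any(|id| !self.chunks.contains(&(id.to_string(), sig.to_string())))
    }
}
```

In this model, a tombstoned chunk makes `has_uncovered` return false. After `clear_chunk_reembed_skipped` removes its row, the probe returns true again, so the next backfill pass would retry the chunk. Calling either helper a second time is harmless: the single-row clear still returns Ok and the bulk clear reports zero rows removed.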

Out of scope

  • Auto-clearing tombstones based on heuristics. The point of the tombstone is to be sticky, so a manual operator action is the right interface for un-sticking it.
  • UI surface. Backend helper is enough; if users hit this often, an Intelligence / Settings affordance is a follow-up.

Background

PR tinyhumansai#2349 reviewer (M3gA-Mind) flagged the missing un-tombstone path as a follow-up:

Once a (chunk_id, model_signature) is tombstoned, the only way to clear it is direct DELETE FROM mem_tree_chunk_reembed_skipped SQL — there's no admin RPC or helper. […] Worth filing the issue now so it doesn't get lost; if a contributor moves their workspace back the orphan chunks will silently stay skipped until tombstones are cleared manually.

Filing here so it has a tracking number to reference from the PR description and future user reports.
