Skip to content

refactor: continuation of #835/#842 — extract more sibling tests + split agent/prompts#863

Closed
jwalin-shah wants to merge 38 commits into
tinyhumansai:mainfrom
jwalin-shah:refactor/tame-god-files
Closed

refactor: continuation of #835/#842 — extract more sibling tests + split agent/prompts#863
jwalin-shah wants to merge 38 commits into
tinyhumansai:mainfrom
jwalin-shah:refactor/tame-god-files

Conversation

@jwalin-shah
Copy link
Copy Markdown
Contributor

Stacks on #835 + #842. Do NOT merge until both land.

This is the continuation of the sibling-test extraction refactor. 38 commits ahead of #835, covering additional god-files not included in that PR's initial scope.

What's here beyond #835

  • refactor(agent/prompts) — split mod.rs into sections/ and helper modules
  • Extracted tests to siblings for 4+5+9 more god-files (3 batches)
  • rustfmt pass over extracted siblings

Net: +22,622 / −15,433 lines, 112 files.

Recommendation

Merge in order: #835#842 → this PR. Each lands atop the previous. Will need rebase cycles as #835/#842 evolve.

Close and re-open as needed once the base PRs settle.

Drafts the wedge for OpenHuman: continuous ingestion of Gmail/Calendar/Slack/
iMessage into a local personal_index.db, with a Today brief and retrieval-
tuned chat as the two surfaces. Three composable privacy modes (Convenience /
Hybrid / Fully local) sharing one code path. Track 1 unblocks the ship
pipeline; Track 2 builds the spine over 4 weeks. Reuses ingestion logic from
~/projects/inbox.
Two implementation plans drafted from the life-capture spec:

- Track 1 (ship pipeline): fix Ubuntu installer smoke, land in-flight PRs
  tinyhumansai#806/tinyhumansai#786/tinyhumansai#788/tinyhumansai#797, wire Tauri auto-updater + signed Mac/Windows builds.
- Life-Capture #1 (foundation): SQLite + sqlite-vec personal_index.db,
  Embedder trait + HostedEmbedder (OpenAI), PII redaction, quoted-thread
  stripping, hybrid retrieval (vector + keyword + recency), controller
  schema + RPC. End-to-end test with synthetic items. No ingestion or UI
  yet — those are subsequent milestone plans.
Refactors scripts/install.sh to expose resolve_asset_url and
verify_asset_reachable. Adds scripts/test_install.sh that exercises
the resolver against a committed fixture latest.json. Failures now
report the resolved URL and the parse error instead of dying silently.
Adopts the Hermes pattern (NousResearch/hermes-agent, MIT) of an
agent-writable, char-bounded curated memory file pair. F15 builds the
char-bounded store with atomic writes; F16 wires a session-start
snapshot into the system prompt (preserves prefix cache); F17 exposes
memory.{add,replace,remove,read} through the controller dispatch so
both the agent loop and skills can use it.

Sits between TinyHumans synthesis (volatile) and personal_index.db
(raw events) — fills the "deliberately curated facts" gap that
neither covered.
Adds src/openhuman/life_capture/ module tree (stubs for embedder,
index, migrations, quote_strip, redact, rpc, schemas, types, plus
tests/) per Foundation Plan F1. Wires the module in src/openhuman/mod.rs.

Adds sqlite-vec to deps and httpmock as a dev-dep. Reuses existing
rusqlite/regex/once_cell/async-trait/tempfile already in tree
(Option C — avoids sqlx/rusqlite libsqlite3-sys conflict).
…egex

Previously the country-code group required at least one digit (\+?\d{1,3}),
so '(415) 555-0123' (no leading country code) wouldn't match. Wrapped the
whole prefix in a non-capturing optional group so parenthesized area codes
match without a leading country code.
… table

Loads sqlite_vec via sqlite3_auto_extension exactly once per process so every
new connection picks up the vec0 module. PersonalIndex wraps a single rusqlite
Connection in Arc<Mutex<_>> for async sharing; open() runs migrations and sets
WAL+foreign_keys, open_in_memory() is for tests.

vec_version() round-trip and a 1536-dim INSERT + MATCH query both verified.
…_id) dedupe

Adds IndexWriter with upsert (ON CONFLICT(source, external_id) DO UPDATE — FTS
stays in sync via triggers) and upsert_vector (DELETE + INSERT since vec0 has
no ON CONFLICT for virtual table primary keys). Both wrap rusqlite work in
spawn_blocking and share the PersonalIndex's Arc<Mutex<Connection>>.

Exposes ensure_vec_extension_registered() crate-wide so the migrations test
can register vec0 before opening its in-memory connection.
- keyword_search: FTS5 ranked by negated bm25 with snippet markers (« » …),
  plus the user:local ACL EXISTS clause so the same query shape works for the
  multi-token team v2 ACL without rewrites.
- vector_search: sqlite-vec MATCH with k = ? clause (vec0 KNN requirement)
  and ORDER BY distance; score = 1/(1 + distance) so callers can blend it
  with keyword bm25 on the same monotonic scale.

Shared ItemRow + into_hit() helper lets both queries reuse the same row
shape; rusqlite query_map closures hand-build it because rusqlite has no
sqlx FromRow equivalent.
Pulls oversampled candidates (k*3 min 20) from both keyword and vector legs,
normalises each independently to [0,1], then re-ranks with
0.55 * vec_norm + 0.35 * kw_norm + 0.10 * exp(-age_days/30).

Documents present in only one leg get 0 for the missing signal but still
compete on the others; only items with neither signal are dropped. Same-vector
twin items now break their tie by recency.
…rch)

Adds two controllers exposed via the registry:
- life_capture.get_stats — total items, per-source counts, last-ingest ts.
- life_capture.search    — hybrid (vector + keyword + recency) search with
  embed-then-rank, optional k (default 10, capped at 100).

Runtime state (PersonalIndex + dyn Embedder) lives in life_capture::runtime
behind a tokio OnceCell because handlers are stateless fn(Map<String,Value>)
and have no per-call context object. F14 will call runtime::init() at app
startup; until then handlers return a structured 'not initialised' error so
the failure mode is loud, not silent.

Schemas registered in core/all.rs alongside cron.
… → search)

FakeEmbedder hashes input bytes into a deterministic sparse 1536-dim vector
so the same text round-trips identically through vec0 — keeps the e2e test
hermetic, no network call. Verifies that:
- quote-strip drops the 'On … wrote:' block before indexing,
- redact masks the email address before indexing,
- the upserted item resurfaces via hybrid_search with both transformations
  preserved on the returned text.
After Config::load_or_init succeeds, open the personal_index.db at
config.workspace_dir and register the life_capture runtime. Embedder is
env-gated:
  OPENHUMAN_EMBEDDINGS_KEY > OPENAI_API_KEY
  OPENHUMAN_EMBEDDINGS_URL (default: https://api.openai.com/v1)
  OPENHUMAN_EMBEDDINGS_MODEL (default: text-embedding-3-small)

If no key is set we still open the index file (so Plan #2's ingestion
worker can write to it) but skip runtime registration — controllers then
return the 'not initialised' error from runtime::get() instead of silently
calling a misconfigured embedder.

Config-schema integration deferred to a follow-up; env-driven keeps F14
non-invasive while we land the rest of the foundation.
…ection

snapshot_pair(memory, user) returns a MemorySnapshot containing plain
strings — no reference back to the stores — so taking a snapshot at session
start and reusing it across turns gives a stable system-prompt prefix and
lets the LLM prefix-cache hit on every subsequent turn.

Plan #1 also calls for wiring this into chat.rs' OpenClaw context loader,
but openhuman assembles agent prompts per-agent under
src/openhuman/agent/agents/*/prompt.rs (not the plan's stale
src-tauri/src/commands/chat.rs path); the prompt-builder integration is
deferred to Plan #2 alongside the agent-context refactor.
… controllers

Adds four controllers under the memory_curated namespace (the 'memory'
namespace is already owned by the long-term memory subsystem):
- memory_curated.read    — read MEMORY.md or USER.md
- memory_curated.add     — append entry, char-bounded
- memory_curated.replace — substring replace, char-bounded
- memory_curated.remove  — drop entries containing a needle

'file' input is a typed enum (memory|user) so adapters reject anything
else at validation time. Runtime state lives in curated_memory::runtime
behind a tokio OnceCell, mirroring life_capture's pattern: startup in
jsonrpc.rs constructs both stores at <workspace>/memories/ with Hermes'
char limits (memory: 2200, user: 1375) and registers the runtime.
Handlers return 'not initialised' until init runs, so failure is loud.
…ings

Findings from Codex + Gemini second-opinion review:

1. IndexWriter::upsert orphaned vectors on re-ingest (Codex). On
   (source, external_id) conflict the row's id was kept but the caller's
   fresh UUID was used for upsert_vector — the vector wrote under an id no
   row joined to. Fix: explicit SELECT-then-UPDATE-or-INSERT in the same
   transaction; mutates caller's Item.id to the canonical id (so the next
   upsert_vector lands on the right row) and orphan-deletes any vector
   already written under the wrong id. Signature change: upsert(&[Item])
   -> upsert(&mut [Item]).

2. upsert_vector DELETE+INSERT was outside a transaction (Gemini) — a
   failed INSERT permanently lost the item's vector. Now wrapped in tx.

3. Runtime over-gated get_stats on the embedder (Codex). Split runtime
   into separate INDEX + EMBEDDER OnceCells: get_stats only requires
   index, search requires both. Index initialises whenever the workspace
   dir is reachable; embedder is env-gated as before.

4. Startup race (both reviewers): runtime init lived inside the post-
   serve tokio::spawn block, so axum::serve was already accepting before
   the OnceCells were set. Hoisted both bootstraps (curated_memory and
   life_capture) into helpers called inline before axum::serve.

5. runtime::get error message lied (Codex) — said 'set embeddings.api_key
   in config' but startup actually reads env vars. Fixed text.

Bonus: rename controller namespace memory_curated -> curated_memory
(Codex preference; nothing depends on it yet so renamed before clients
do).

Adds regression test for #1 (reingest_with_fresh_uuid_keeps_vector_findable).
4696 lib tests pass.
…emory namespace rename

Pre-push hook auto-applied cargo fmt across the foundation files; also fixed the schemas.rs docstring still referencing the old memory_curated namespace.
Critical:
- curated_memory/store.rs: propagate read errors instead of `unwrap_or_default`,
  which could rewrite MEMORY.md / USER.md from an empty baseline on transient
  I/O or permission failures. Reject empty needle in replace/remove.
- curated_memory/rpc.rs: belt-and-suspenders empty-needle guard at the RPC
  boundary (remove with "" deletes every entry).
- life_capture/index.rs: upsert_vector rejects orphan ids (items row missing),
  which would have inserted vectors that never join in vector_search.
- life_capture/embedder.rs: validate response length matches input, indices
  are contiguous 0..n, and every vector matches dim() — prevents silent
  misalignment or wrong-dimensional vectors flowing into the 1536-wide
  sqlite-vec table. Added 30s request + 10s connect timeouts.

Retrieval hardening:
- life_capture/index.rs: keyword_search now escapes FTS5 operators by
  tokenizing on whitespace and wrapping each token as a quoted literal.
  Prevents errors or unintended matches from stray quotes, AND/OR/NEAR,
  column specifiers in user input.
- life_capture/index.rs: hybrid_search applies q.sources / q.since / q.until
  via post-filter (single consistent pass across keyword+vector fusion).
- life_capture/rpc.rs: handle_search validates embedder dim and response
  length against the fixed 1536-wide index column — clear RPC error instead
  of a cryptic sqlite-vec failure when the embeddings model is swapped.

Docs:
- docs/event-bus.md: remove erroneous `.await` on register_native_global
  (it is sync; the handler closure is async).
- life-capture design spec: soften PII redaction claim — only regex is
  implemented today; light NER is flagged as a future enhancement.
Rendered prompts now surface runtime curated-memory writes:

- `PromptContext` gains `curated_snapshot: Option<&MemorySnapshot>`.
- `UserFilesSection` prefers the snapshot over the workspace-file
  loader when one is attached, and injects `USER.md` alongside
  `MEMORY.md` using a byte-compatible `inject_snapshot_content` helper.
- `Session` carries `Option<Arc<MemorySnapshot>>`, populated by
  `ensure_curated_snapshot` on the first turn from
  `curated_memory::runtime::get()`. Reused across turns so prompt
  bytes stay frozen (KV-cache prefix contract) while mid-session
  `curated_memory.add/replace/remove` writes land on the NEXT session.
- `ParentExecutionContext` inherits the snapshot so sub-agents render
  identical `MEMORY.md`/`USER.md` blocks as the parent.
- Legacy workspace-file fallback preserved for embeds that don't
  initialise the curated-memory runtime (pure unit tests).
Loads a 32-item corpus across gmail/calendar/imessage/slack into an
in-memory PersonalIndex and runs 12 queries through keyword / vector /
hybrid paths, asserting must_contain / must_not_contain within per-query
top-K prefixes.

The vector leg uses a deterministic FNV-1a feature-hashing embedder
(1536 dims, L2-normalized) so the test is offline and reproducible; real
embedder swaps behind one call. Fixture reserves `relevant` and
`pending` fields so recall@k / MRR / nDCG bolt onto the same JSON later.

q-src-01 is marked pending: hybrid_search currently ignores
Query.sources/since/until. Flip to false once filtering lands.
Previously every read grabbed the same `Arc<Mutex<Connection>>` as the
writer, so `IndexReader` calls serialised behind in-flight ingests even
though WAL would allow parallel readers. Split `PersonalIndex` into:

- `writer: Arc<Mutex<Connection>>` — unchanged single-writer model.
- `reader_pool: Option<Arc<r2d2::Pool<SqliteConnectionManager>>>` —
  four-connection read pool on file-backed indexes, built alongside the
  writer after migrations run. Each pooled connection gets
  `query_only=ON` as a belt-and-suspenders guard; sqlite-vec is loaded
  automatically via the process-wide auto-extension.

`IndexReader` routes `keyword_search` / `vector_search` through a new
`with_read_conn` helper that picks the pool when present and falls
back to the writer lock otherwise. In-memory handles keep the
single-connection layout — shared-cache URIs buy nothing at
test-fixture scale and would force every test to grow a unique name.
Moves the inline #[cfg(test)] mod tests block out of prompts/mod.rs into
prompts/tests.rs to bring the production file under the project's
<=500 line guideline and reduce mod.rs weight per the "light mod.rs" rule.
No behavior change.
Moves the inline #[cfg(test)] mod tests block out of channel.rs into
channel_tests.rs to bring the production file under the project's
<=500 line guideline. No behavior change.
Tidies whitespace left over from the parallel extraction commits in
b946aff and 82b739c — rustfmt re-flowed a handful of long string
literals in channel_tests.rs and compatible_tests.rs and dropped one
trailing blank line in channel.rs. No behavior change.
Splits the 1059-line prompts/mod.rs into:
- builder.rs (SystemPromptBuilder, ArchetypePromptSection, DynamicPromptSection)
- sections/{identity,tools,safety,workspace,runtime,datetime,user_memory,user_files}.rs
- subagent.rs (sub-agent prompt rendering)
- workspace_files.rs (workspace-file injection helpers)

mod.rs becomes a slim re-export hub. Closes the project's "light mod.rs"
rule violation and brings every file under the ~500 line guideline.
No behavior change; all 23 prompt tests still pass.
Moves inline #[cfg(test)] mod tests blocks into sibling _tests.rs
files to bring production files under the 500-line guideline.
No behavior change.

  skills/ops.rs          2678 -> 1711 (-967)
  security/policy.rs     1761 ->  817 (-944)
  providers/reliable.rs  1679 ->  861 (-818)
  composio/ops.rs        1583 ->  901 (-682)

Wired via `#[cfg(test)] #[path = "{name}_tests.rs"] mod tests;`
Same pattern as prior ef411ce (channel.rs, compatible.rs).
Mechanical test-sibling extraction. No behavior change.

  memory/ingestion.rs        1891 -> 1677 (-214)
  memory/store/unified/query 1653 -> 1270 (-383)
  memory/ops.rs              1481 -> 1158 (-323)
  config/ops.rs              1362 ->  803 (-559)
  agent/harness/tool_loop.rs 1383 ->  710 (-673)

Same pattern as 92da9cc / ef411ce.
4863 lib tests pass.
Mechanical test-sibling extraction. No behavior change.

  tools/impl/browser/browser.rs           2938 -> 2054 (-884)
  tokenjuice/reduce.rs                    2376 ->  907 (-1469)
  agent/harness/session/turn.rs           2003 -> 1435 (-568)
  agent/harness/subagent_runner/ops.rs    1725 -> 1146 (-579)
  channels/providers/lark.rs              1513 ->  900 (-613)
  core/jsonrpc.rs                         1488 -> 1110 (-378)
  channels/controllers/ops.rs             1413 -> 1051 (-362)
  voice/server.rs                         1398 -> 1069 (-329)
  channels/providers/whatsapp_storage.rs  1345 -> 1264 (-81)

Also rolls in rustfmt fixes for agent/prompts/{mod,sections/*}.rs
from an earlier commit that missed them.

Same pattern as 825d734 / 92da9cc / ef411ce.
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 23, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 98d07e46-7f60-4ff7-94ec-8f35d6ea7c3b

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jwalin-shah
Copy link
Copy Markdown
Contributor Author

Closing — continuation of #842 which was closed for the same reason: refactor design assumed pre-#835 state and is no longer reconcilable with current main without rewriting.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants