feat(search-engine): add Elasticsearch backend and canonical import/sync by Intrinsical-AI · Pull Request #44 · Intrinsical-AI/rag-prototype

Intrinsical-AI · 2026-03-14T21:17:28Z

Summary

Elasticsearch persistence backend: full document storage, history, system state, diagnostics, and index mapping management; wired through composition and runtime alongside the existing SQL backend.
scope / snapshot_id fields: propagated across the domain model, mutation contracts, both persistence adapters (SQL + ES), HTTP schemas, and CLI, so documents can be tagged with an origin scope and version.
Canonical import endpoint (POST /api/docs/import-canonical) and CLI command (rag-import-canonical): declarative snapshot-based sync that upserts a scoped document collection and optionally hard-deletes stale docs not present in the current snapshot (replace_scope mode).

Test plan

Unit tests pass for SQL and Elasticsearch document storage (scope persistence, list_external_ids_by_scope)
Unit tests pass for canonical import endpoint (upsert-only and replace_scope paths, duplicate external_id rejection)
Unit tests pass for rag-import-canonical CLI (sync + cross-snapshot deletion)
CLI registry test confirms import-canonical command and entry point are wired
Elasticsearch backend runtime tests pass (client, mappings, diagnostics)
Spin up stack with docker-compose up, index docs via rag-import-canonical, verify scope-aware deletion

Add nullable scope and snapshot_id fields to the SQL Document table and Elasticsearch index mappings, enabling external producers to tag documents with an origin scope and version snapshot.
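A minimal sketch of what nullable scope and snapshot_id fields might look like, using the standard-library sqlite3 module rather than the project's actual SQL layer. The table name, the other columns, and the Elasticsearch mapping fragment are assumptions for illustration; only the two field names and their nullability come from this PR.

```python
import sqlite3

# Hypothetical table layout; only `scope` and `snapshot_id` are named in the PR.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY,
        external_id TEXT NOT NULL UNIQUE,
        content TEXT NOT NULL,
        scope TEXT NULL,
        snapshot_id TEXT NULL
    )
    """
)

# A document from an existing producer: both new fields stay NULL.
conn.execute(
    "INSERT INTO documents (external_id, content) VALUES (?, ?)",
    ("legacy-1", "old doc"),
)
# A document tagged by an external producer with scope and snapshot.
conn.execute(
    "INSERT INTO documents (external_id, content, scope, snapshot_id) VALUES (?, ?, ?, ?)",
    ("kb-1", "new doc", "kb", "2026-03-14"),
)

rows = conn.execute(
    "SELECT external_id, scope, snapshot_id FROM documents ORDER BY id"
).fetchall()

# Assumed Elasticsearch mapping fragment: `keyword` is a natural choice for
# exact-match filtering, but the PR's real mapping may differ.
ES_MAPPING_FRAGMENT = {
    "properties": {
        "scope": {"type": "keyword"},
        "snapshot_id": {"type": "keyword"},
    }
}
```

Keeping both columns nullable means rows written before this change remain valid without a backfill.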

…ields Extend UpsertDocBuilderPort, MutationUpsertInput, and normalization/serialization to carry scope and snapshot_id. Add CanonicalImportSummary result type for reporting insert/update/delete statistics after a canonical import.
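The shape of these contracts could be sketched with plain dataclasses. The class names below are deliberately suffixed with Sketch because they are stand-ins, not the PR's MutationUpsertInput or CanonicalImportSummary; every field other than scope and snapshot_id, and the rule that blank strings normalize to None, is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UpsertInputSketch:
    """Stand-in for an upsert input; only scope/snapshot_id come from the PR."""
    external_id: str
    content: str
    scope: Optional[str] = None
    snapshot_id: Optional[str] = None


@dataclass
class ImportSummarySketch:
    """Stand-in for an import summary; counter names are assumed."""
    inserted: int = 0
    updated: int = 0
    deleted: int = 0

    @property
    def total_changes(self) -> int:
        return self.inserted + self.updated + self.deleted


def _clean(value: Optional[str]) -> Optional[str]:
    # Assumed normalization rule: strip whitespace, map empty strings to None.
    if value is None:
        return None
    value = value.strip()
    return value or None


def normalize(item: UpsertInputSketch) -> dict:
    """Serialize an input to a plain dict with normalized optional fields."""
    return {
        "external_id": item.external_id,
        "content": item.content,
        "scope": _clean(item.scope),
        "snapshot_id": _clean(item.snapshot_id),
    }
```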

…xecutors Wire the new fields from MutationUpsertInput into the upsert builder calls in both the atomic and saga mutation execution paths.

…nal_ids_by_scope() Update SQL and Elasticsearch repositories to read/write scope and snapshot_id in upsert, change detection, and domain mapping. Add list_external_ids_by_scope() to both stores to support stale-doc deletion in canonical imports.
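To illustrate the repository behavior described here, the following toy in-memory store treats scope and snapshot_id as part of change detection and exposes a list_external_ids_by_scope() lookup. It is a sketch under assumptions: the real stores are SQL- and Elasticsearch-backed, and the method signatures and return types shown are not taken from the PR.

```python
from typing import Dict, Optional, Set, Tuple


class InMemoryDocStore:
    """Toy store used only to illustrate the semantics."""

    def __init__(self) -> None:
        # external_id -> (content, scope, snapshot_id)
        self._docs: Dict[str, Tuple[str, Optional[str], Optional[str]]] = {}

    def upsert(
        self,
        external_id: str,
        content: str,
        scope: Optional[str] = None,
        snapshot_id: Optional[str] = None,
    ) -> str:
        """Return 'inserted', 'updated', or 'unchanged'.

        A change in scope or snapshot_id alone counts as an update.
        """
        new = (content, scope, snapshot_id)
        old = self._docs.get(external_id)
        if old == new:
            return "unchanged"
        self._docs[external_id] = new
        return "inserted" if old is None else "updated"

    def list_external_ids_by_scope(self, scope: str) -> Set[str]:
        """Return the external ids of every document tagged with `scope`."""
        return {
            external_id
            for external_id, (_content, doc_scope, _snap) in self._docs.items()
            if doc_scope == scope
        }
```

A caller can subtract the ids of the incoming snapshot from this set to find the stale documents in a scope.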

Introduce CanonicalImportRequest/Response schemas with scope, snapshot_id, replace_scope flag, and duplicate external_id validation. Add the endpoint behind the multi-store write lock. Extend existing mutate endpoint schemas to accept scope/snapshot_id on individual upsert items.
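The duplicate external_id check could look roughly like the sketch below, written with a standard-library dataclass instead of whatever schema library the project uses. The docs field name, the replace_scope default of False, and the error message are assumptions.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List


@dataclass
class CanonicalImportRequestSketch:
    """Stand-in for the request schema; field layout is assumed."""
    scope: str
    snapshot_id: str
    docs: List[dict] = field(default_factory=list)  # assumed field name
    replace_scope: bool = False  # assumed default

    def __post_init__(self) -> None:
        # Reject the whole request if any external_id appears more than once.
        counts = Counter(doc["external_id"] for doc in self.docs)
        duplicates = sorted(eid for eid, n in counts.items() if n > 1)
        if duplicates:
            raise ValueError(f"duplicate external_id values: {duplicates}")
```

Rejecting duplicates up front keeps the snapshot unambiguous: each external_id maps to exactly one document state.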

Implement execute_import_canonical_sync() use case: batched upsert (256 docs) with optional replace_scope hard-deletion of stale documents not present in the current snapshot. Wire as rag-import-canonical CLI entry point and register in the CLI group.
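The sync semantics described above can be sketched as follows. This is not the PR's execute_import_canonical_sync(); the function name, the dict-based store, and the summary keys are hypothetical. Only the batch size of 256 and the replace_scope behavior are taken from the description.

```python
from typing import Dict, Iterator, List

BATCH_SIZE = 256  # batch size stated in the PR


def batched(items: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def import_canonical_sync(
    store: Dict[str, dict],
    docs: List[dict],
    scope: str,
    snapshot_id: str,
    replace_scope: bool = False,
) -> Dict[str, int]:
    """Upsert `docs` into `store` in batches; optionally delete stale docs."""
    summary = {"inserted": 0, "updated": 0, "deleted": 0, "batches": 0}

    for batch in batched(docs, BATCH_SIZE):
        summary["batches"] += 1
        for doc in batch:
            record = {
                "content": doc["content"],
                "scope": scope,
                "snapshot_id": snapshot_id,
            }
            previous = store.get(doc["external_id"])
            if previous is None:
                summary["inserted"] += 1
            elif previous != record:
                summary["updated"] += 1
            store[doc["external_id"]] = record

    if replace_scope:
        # Stale = tagged with this scope but absent from the current snapshot.
        current_ids = {doc["external_id"] for doc in docs}
        stale = [
            eid for eid, rec in store.items()
            if rec["scope"] == scope and eid not in current_ids
        ]
        for eid in stale:
            del store[eid]
        summary["deleted"] = len(stale)

    return summary
```

Deletion is limited to the given scope, so documents belonging to other scopes are untouched even when replace_scope is set.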

Document the new import-canonical HTTP endpoint and CLI command with curl and CLI invocation examples. Note scope/snapshot synchronization semantics and replace_scope behavior.

Intrinsical-AI · 2026-03-14T21:25:53Z

Unstable - merging to dev first

Intrinsical-AI added 13 commits March 13, 2026 09:17

feat(rag): add elasticsearch persistence backend

ecb3556

feat(rag): wire elasticsearch backend through runtime and api

83a5297

test(rag): cover elasticsearch backend wiring and storage semantics

5e2acd9

docs(rag): document elasticsearch backend configuration and behavior

2e52036

feat(persistence): add scope and snapshot_id columns to document stores

e23692b

feat(use-cases): pass scope and snapshot_id through atomic and saga e…

30c7023

docs: add canonical import examples to README and USAGE guide

d722dce

(mutate): commit pending file to last batch - missing

530fbca

fix(lint-format): formatted files

094881c

Intrinsical-AI closed this Mar 14, 2026

Intrinsical-AI wants to merge 13 commits into master from feat/elastic-backend
