
feat(filter): metadata retrieval filter #53

Merged
Intrinsical-AI merged 6 commits into develop from feat/metadata-retrieval-filter
Mar 15, 2026


@Intrinsical-AI
Owner

ef745ff docs: update README and USAGE for metadata filter fields
cd82ff0 test: update unit/integration/e2e for new retriever API and docs filter
11f11e0 feat(docs-api): add filter support to docs listing endpoint
46306c1 refactor(retrievers): drop legacy str-query overloads across all backends
5ec14fd refactor(domain): centralise filter helpers, add snapshot_id, drop legacy metadata fields

Intrinsical-AI and others added 6 commits March 15, 2026 15:13
…gacy metadata fields

Promotes normalize_filter_values, document_field_values, document_matches_filters
into the domain module (retrieval.py). Adds snapshot_id to TOP_LEVEL_FILTER_FIELDS.
Removes LEGACY_METADATA_FILTER_FIELDS (path, language, unit_type). Drops legacy
(str, k) overload from RetrieverPort and EvalRetrieverPort — protocol is now
retrieve(request: RetrievalRequest) -> RetrievalResult only. Renames list_docs_page
→ query_docs with filters param in DocsReadPort.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
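
For readers who have not opened the diff, the three helpers promoted into the domain module might look roughly like the sketch below. Only the names normalize_filter_values, document_field_values, document_matches_filters, TOP_LEVEL_FILTER_FIELDS and snapshot_id come from the commit message. The signatures, the dict-shaped document, the "metadata" key, and the matching semantics (AND across fields, OR within one field) are assumptions made for illustration, not the code in retrieval.py.

```python
from collections.abc import Iterable, Mapping
from typing import Any

# Assumption: only snapshot_id is confirmed by the commit message; the real
# set very likely contains more fields.
TOP_LEVEL_FILTER_FIELDS = frozenset({"snapshot_id"})


def normalize_filter_values(value: Any) -> frozenset[str]:
    """Coerce None, a scalar, or an iterable of scalars into a set of strings."""
    if value is None:
        return frozenset()
    if isinstance(value, str) or not isinstance(value, Iterable):
        return frozenset({str(value)})
    return frozenset(str(v) for v in value)


def document_field_values(doc: Mapping[str, Any], field: str) -> frozenset[str]:
    """Read a field from the document's top level, else from its metadata."""
    if field in TOP_LEVEL_FILTER_FIELDS:
        return normalize_filter_values(doc.get(field))
    metadata = doc.get("metadata") or {}
    return normalize_filter_values(metadata.get(field))


def document_matches_filters(doc: Mapping[str, Any], filters: Mapping[str, Any]) -> bool:
    """Return True when every filtered field shares at least one value with the document."""
    for field, wanted in filters.items():
        allowed = normalize_filter_values(wanted)
        # Assumption: an empty filter value places no constraint on the field.
        if allowed and not (document_field_values(doc, field) & allowed):
            return False
    return True
```

Keeping this logic in one domain-level function is what lets a backend such as local_split.py drop its own copy of the matching rules.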
…ends

All retrievers (dense, sparse BM25, hybrid, elastic-like, local-split, Solr,
reranking, elastic-lexical) now implement retrieve(request: RetrievalRequest)
only. Removes all @overload stubs and legacy dispatch branches. local_split.py
switches to domain-level document_matches_filters. Elastic/Solr filter fields
updated (snapshot_id in, path/language/unit_type out). _ElasticLexicalRetriever
and _RepoDocsReadPort.query_docs gain filter support.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
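
A rough picture of the single-method protocol described in this commit, assuming RetrieverPort is a typing.Protocol. The field names on RetrievalRequest and RetrievalResult (query, top_k, filters, doc_ids, scores) and the KeywordRetriever class are hypothetical; only the retrieve(request: RetrievalRequest) -> RetrievalResult signature is taken from the PR.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Protocol


@dataclass(frozen=True)
class RetrievalRequest:
    # Field names are assumptions; the PR does not list them.
    query: str
    top_k: int = 5
    filters: Mapping[str, Any] = field(default_factory=dict)


@dataclass(frozen=True)
class RetrievalResult:
    doc_ids: tuple[str, ...]
    scores: tuple[float, ...]


class RetrieverPort(Protocol):
    """Single entry point: no (str, k) overload and no dispatch on argument type."""

    def retrieve(self, request: RetrievalRequest) -> RetrievalResult: ...


class KeywordRetriever:
    """Toy backend that scores each document by how many query terms it contains."""

    def __init__(self, docs: Mapping[str, str]) -> None:
        self._docs = docs

    def retrieve(self, request: RetrievalRequest) -> RetrievalResult:
        terms = request.query.lower().split()
        scored = [
            (sum(term in text.lower() for term in terms), doc_id)
            for doc_id, text in self._docs.items()
        ]
        # Keep only matching documents, best score first, ties broken by id.
        ranked = sorted(
            (pair for pair in scored if pair[0] > 0),
            key=lambda pair: (-pair[0], pair[1]),
        )[: request.top_k]
        return RetrievalResult(
            doc_ids=tuple(doc_id for _, doc_id in ranked),
            scores=tuple(float(score) for score, _ in ranked),
        )
```

With one request object, adding a field such as filters does not change any backend's method signature.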
New DocsQueryRequest schema with limit, offset, and filters fields.
DocumentInDB gains external_id, source_id, and metadata. Routers
wired to the new query_docs port method.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
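
As an illustration of the docs query contract, the sketch below models DocsQueryRequest and DocumentInDB with plain dataclasses and applies filters before paging. The limit, offset, filters, external_id, source_id and metadata names come from the commit message. Everything else is an assumption: the id and content fields, the default values, the exact-match filter semantics, and the free function query_docs standing in for the port method. The project may well use a different schema library.

```python
from dataclasses import dataclass, field
from typing import Any, Mapping, Optional, Sequence


@dataclass(frozen=True)
class DocsQueryRequest:
    limit: int = 20  # default is an assumption
    offset: int = 0
    filters: Mapping[str, Sequence[str]] = field(default_factory=dict)


@dataclass(frozen=True)
class DocumentInDB:
    id: int  # hypothetical field
    content: str  # hypothetical field
    external_id: Optional[str] = None
    source_id: Optional[str] = None
    metadata: Mapping[str, Any] = field(default_factory=dict)


def query_docs(docs: Sequence[DocumentInDB], request: DocsQueryRequest) -> list[DocumentInDB]:
    """Filter first, then apply offset and limit to the filtered list."""

    def matches(doc: DocumentInDB) -> bool:
        for name, allowed in request.filters.items():
            if name in ("external_id", "source_id"):
                value = getattr(doc, name)
            else:
                value = doc.metadata.get(name)
            if value not in allowed:
                return False
        return True

    hits = [doc for doc in docs if matches(doc)]
    return hits[request.offset : request.offset + request.limit]
```

Filtering before slicing keeps offset and limit meaningful relative to the filtered result set rather than the full table.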
Adapts all tests to the unified retrieve(RetrievalRequest) signatures
and the new docs query/filter contract.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Updates test_dense_edgecases, test_hybrid_weighting, test_retrievers,
test_sparse_empty, test_sparse_in_memory_cache, test_sparse_tokenization,
test_composition, and the dense/hybrid e2e to use RetrievalRequest instead
of the legacy (str, k) overload. Edge-case tests for blank query and
top_k=0 now assert ValueError at RetrievalRequest construction.
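
The construction-time validation mentioned above could be sketched as follows, assuming RetrievalRequest is a dataclass that checks its fields in __post_init__. The field names, the default top_k, and the error messages are assumptions; the PR only states that a blank query and top_k=0 raise ValueError when the request is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievalRequest:
    query: str
    top_k: int = 5  # default is an assumption

    def __post_init__(self) -> None:
        # Reject invalid input here so no retriever backend has to re-check it.
        if not self.query.strip():
            raise ValueError("query must not be blank")
        if self.top_k < 1:
            raise ValueError("top_k must be >= 1")


def raises_value_error(**kwargs) -> bool:
    """Return True if building a RetrievalRequest with kwargs raises ValueError."""
    try:
        RetrievalRequest(**kwargs)
    except ValueError:
        return True
    return False
```

In a pytest suite the same check is usually written as `with pytest.raises(ValueError): RetrievalRequest(query="   ")`, which means the edge-case tests no longer need to call a retriever at all.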
@Intrinsical-AI Intrinsical-AI merged commit cd05ada into develop Mar 15, 2026
10 checks passed
@Intrinsical-AI Intrinsical-AI deleted the feat/metadata-retrieval-filter branch March 15, 2026 14:22
