
search: precompute intent once, reuse across all semantic_search calls #76

Open
donsummerwind wants to merge 1 commit into sopaco:main from donsummerwind:main

Conversation

@donsummerwind

Summary

Precompute the LLM intent analysis once per search and reuse it across all semantic_search calls. This removes the intent-analysis bottleneck of 5 serial LLM calls, reducing LLM calls from 5 to 1 per search.

Changes

  • SearchOptions gains precomputed_intent: Option<Arc<EnhancedQueryIntent>>
  • search_handler calls analyze_intent() once, before layered_semantic_search runs
  • The layered search and the 4 semantic_search calls all reuse the same intent (5 LLM calls → 1); see the sketch after this list
  • Result: roughly 3-5x search speedup, depending on whether the model is warm or cold
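
For reference, here is a minimal sketch of the new flow. It uses the names from this PR (SearchOptions, precomputed_intent, EnhancedQueryIntent, analyze_intent, semantic_search, search_handler), but the struct contents, signatures, and synchronous style are simplified assumptions; the real implementations in cortex-mem-core and cortex-mem-service are async and more involved.

```rust
use std::sync::Arc;

// Hypothetical, simplified stand-ins for the real types in
// cortex-mem-core/src/types.rs; the fields are illustrative only.
#[derive(Debug)]
struct EnhancedQueryIntent {
    keywords: Vec<String>,
}

struct SearchOptions {
    // New field introduced by this PR: an intent computed once by the caller.
    precomputed_intent: Option<Arc<EnhancedQueryIntent>>,
}

// Stand-in for the LLM-backed intent analysis (one LLM call per invocation).
// The real function is async; kept synchronous here to stay self-contained.
fn analyze_intent(query: &str) -> EnhancedQueryIntent {
    EnhancedQueryIntent {
        keywords: query.split_whitespace().map(String::from).collect(),
    }
}

// semantic_search reuses the shared intent when present instead of
// re-running analyze_intent, so only one LLM call happens per search.
fn semantic_search(query: &str, opts: &SearchOptions) -> Vec<String> {
    let intent = match &opts.precomputed_intent {
        Some(intent) => Arc::clone(intent),       // reuse, no extra LLM call
        None => Arc::new(analyze_intent(query)),  // fallback: previous behavior
    };
    intent.keywords.clone() // placeholder for the actual vector lookup
}

fn search_handler(query: &str) -> Vec<String> {
    // Analyze intent once, before the layered search fans out.
    let intent = Arc::new(analyze_intent(query));
    let opts = SearchOptions {
        precomputed_intent: Some(Arc::clone(&intent)),
    };

    // The layered search issues several semantic_search calls;
    // every one of them sees the same precomputed intent.
    (0..4).flat_map(|_| semantic_search(query, &opts)).collect()
}

fn main() {
    println!("{:?}", search_handler("rust vector search"));
}
```

Wrapping the intent in an Arc lets every downstream call share one immutable analysis result without cloning it, which is presumably why the field is Option<Arc<EnhancedQueryIntent>> rather than Option<EnhancedQueryIntent>.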

Performance

| Before | After |
| --- | --- |
| 5× LLM calls (~16 s) | 1× LLM call (~3 s warm; ~8 s cold with a 7B model) |

Files changed

  • cortex-mem-core/src/search/vector_engine.rs
  • cortex-mem-core/src/types.rs
  • cortex-mem-core/src/vector_store/qdrant.rs
  • cortex-mem-service/src/handlers/filesystem.rs
  • cortex-mem-service/src/handlers/search.rs
  • cortex-mem-service/src/main.rs

