
fix(search): wire search_skills to SkillRanker embedding cache #81

Open

fabioscarsi wants to merge 2 commits into HKUDS:main from fabioscarsi:feat/wire-skill-ranker-cache-to-search

Conversation


@fabioscarsi fabioscarsi commented Apr 18, 2026

Problem

hybrid_search_skills in openspace/cloud/search.py generates candidate embeddings via generate_embedding on every invocation, and SkillSearchEngine._bm25_phase instantiates a fresh SkillRanker per call, reloading the pickle cache each time. As a result, the persistent embedding cache in openspace/skill_engine/skill_ranker.py is never used by the MCP search_skills tool path.

On a 28-skill local registry with text-embedding-3-small via OpenRouter, this produces 8-14 seconds of latency per query.

Additionally, SkillRanker._embedding_cache is keyed by skill_id alone, so any edit to a SKILL.md body or description would leave stale embeddings in place until a manual invalidate_cache call. This behaviour was masked while the cache was only used by select_skills_with_llm, but is exposed by wiring it into search_skills.

Fix

Commit 1 — wire the cache (fc95348). Route both paths through a shared SkillRanker singleton, reusing the persistent pickle at .openspace/skill_embedding_cache/skill_embeddings_v1.pkl across invocations. Candidates without a stable skill_id are skipped to avoid cache key collisions.

Commit 2 — content-addressed cache key (3708359). Change the cache key from skill_id to "{skill_id}:{sha256(embedding_text)[:16]}". Any change to the text that feeds the embedding (name, description, body, truncated to SKILL_EMBEDDING_MAX_CHARS) flips the key and forces a fresh embedding. Both get_or_compute_embedding and _embedding_rank use this pattern; invalidate_cache(skill_id) was updated to remove every entry with that skill_id prefix plus any legacy-format entry.
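The key scheme described above can be sketched like this; `SKILL_EMBEDDING_MAX_CHARS` is given an illustrative value here (the real constant lives in the repo), and `embedding_cache_key` is an illustrative helper name:

```python
import hashlib

SKILL_EMBEDDING_MAX_CHARS = 8000  # illustrative; the repo defines the real limit

def embedding_cache_key(skill_id: str, embedding_text: str) -> str:
    """Content-addressed cache key: any change to the embedded text flips the key."""
    text = embedding_text[:SKILL_EMBEDDING_MAX_CHARS]
    digest = hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]
    return f"{skill_id}:{digest}"
```

Because the hash covers exactly the truncated text that feeds the embedding call, an edit to a SKILL.md name, description, or body produces a cache miss with no explicit invalidation step.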

Backward compatibility: existing pickle files keyed by skill_id alone are migrated in place on first lookup (no API call). The old key is dropped after migration, so no double bookkeeping persists.
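A minimal sketch of that in-place migration, assuming the cache is a plain dict mapping keys to embedding vectors (the function name is illustrative):

```python
def migrate_legacy_entry(cache: dict, skill_id: str, new_key: str) -> None:
    """Move a legacy skill_id-keyed embedding under its content-addressed key.

    No API call: the cached vector is reused as-is, and the legacy key is
    dropped so no double bookkeeping persists.
    """
    if skill_id in cache and new_key not in cache:
        cache[new_key] = cache.pop(skill_id)
```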

Bounded cache growth: when a new embedding is computed for a skill, older entries carrying the same "{skill_id}:" prefix are pruned in the same write. Net result: at most one cached embedding per skill_id in steady state.
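The prune-on-write behaviour can be sketched as follows, over the same dict-shaped cache (`store_embedding` is an illustrative name, not the PR's actual method):

```python
def store_embedding(cache: dict, skill_id: str, new_key: str, vector: list) -> None:
    """Write a fresh embedding and prune older entries for the same skill.

    Keys follow the '{skill_id}:{hash}' scheme, so in steady state at most
    one cached embedding per skill_id survives each write.
    """
    prefix = f"{skill_id}:"
    for key in [k for k in cache if k.startswith(prefix) and k != new_key]:
        del cache[key]
    cache.pop(skill_id, None)  # drop any legacy-format entry too
    cache[new_key] = vector
```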

Measured impact

On 28 local skills, text-embedding-3-small via OpenRouter:

|  | before | after |
| --- | --- | --- |
| query latency (warm) | 8-14 s | ~300 ms |
| top-1 match identity |  | preserved on all test queries |
| score drift |  | <0.001 |
| cache staleness after SKILL.md edit | persists until manual invalidation | automatically invalidated on next lookup |

Per-query latency (warm) on 6 test queries after both commits: 259, 292, 308, 333, 298, 272 ms (mean 294 ms).

Out of scope

  • Cloud-only paths (server-ranked _search_rank): unchanged. Cloud candidates carry _embedding from the server and are skipped by the new code path.
  • Thread-safety of SkillRanker._save_cache: pre-existing; asyncio single-thread usage makes races unlikely in practice.

No new dependency; SkillRanker already ships in the repo and is exercised by select_skills_with_llm. The only stdlib addition is hashlib.

Both paths in search_skills/hybrid_search_skills now go through a
shared SkillRanker singleton:

- SkillSearchEngine._bm25_phase: previously instantiated a fresh
  SkillRanker per call, reloading the pickle cache each time.
- hybrid_search_skills candidate loop: previously generated
  embeddings via generate_embedding on every query, ignoring the
  persistent cache entirely.

The persistent pickle at
.openspace/skill_embedding_cache/skill_embeddings_v1.pkl is
reused across invocations and survives process restarts.
Candidates without a stable skill_id are skipped to avoid cache
key collisions.

On a 28-skill local registry with text-embedding-3-small via
OpenRouter, query latency drops from 8-14s to ~300ms after
warm-up. Top-1 match identity is preserved on all test queries
(score drift <0.001).

Cloud candidates that already carry _embedding from the
server-side search endpoint are skipped and unchanged.

The embedding cache was keyed by skill_id alone, so any edit to a
SKILL.md body or description produced stale embeddings that
get_or_compute_embedding kept serving until a manual invalidate_cache
call or a file deletion. Previously this was mostly invisible because
select_skills_with_llm was the only caller exercising the cache; after
the preceding commit wires search_skills through the same path the
staleness becomes observable on every MCP query.

Use "{skill_id}:{sha256(embedding_text)[:16]}" as the cache key, so
any change to the text produced by _build_embedding_text (name +
description + body, truncated to SKILL_EMBEDDING_MAX_CHARS) causes
an automatic cache miss and a fresh embedding. Both
get_or_compute_embedding and _embedding_rank are updated.

Bounded growth: on each successful new compute, older entries with
the same "{skill_id}:" prefix are pruned in the same write. Net
result: at most one cached embedding per skill_id at any time, aside
from transient migration state.

Backward compatibility: existing pickle files keyed by skill_id alone
are migrated in place on first lookup (no API call needed); the old
key is dropped after migration.

invalidate_cache(skill_id) now removes every content-addressed entry
and any legacy entry for that skill_id, so historical versions do
not leak across evolutions.
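A sketch of that invalidation over a dict-shaped cache (`invalidate_cache_entries` is an illustrative free function standing in for the actual method):

```python
def invalidate_cache_entries(cache: dict, skill_id: str) -> int:
    """Remove every content-addressed entry and any legacy entry for skill_id.

    Returns the number of entries removed.
    """
    prefix = f"{skill_id}:"
    doomed = [k for k in cache if k == skill_id or k.startswith(prefix)]
    for key in doomed:
        del cache[key]
    return len(doomed)
```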

Functional benchmark on a 28-skill local registry with
text-embedding-3-small via OpenRouter: top-1 match identity preserved
on all test queries, score drift below 0.001, warm latency
~260-400ms/query (unchanged from the previous commit).
