feat(docs): enhance search with TF-IDF relevance ranking (#40) #656

Closed
diberry wants to merge 1 commit into bradygaster:dev from diberry:squad/40-semantic-search

diberry commented Mar 27, 2026

What this improves

Adds an optional relevance ranking toggle to the existing Pagefind search (Ctrl+K / Cmd+K). When enabled, results are re-ranked using TF-IDF cosine similarity with title/heading boost — surfacing the best-match page first instead of showing every page that contains the keyword.
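To make the scoring idea concrete, here is a minimal sketch of TF-IDF cosine similarity with a title boost. It is not the code in `Search.astro`: the `Chunk` shape, the `TITLE_BOOST` value, the smoothed IDF formula, and all function names are assumptions chosen for illustration.

```typescript
// Hypothetical chunk shape; the PR's real index schema is not shown here.
interface Chunk {
  title: string;
  text: string;
}

// Lowercase alphanumeric tokens.
const tokenize = (s: string): string[] =>
  s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Smoothed inverse document frequency over all chunks (never zero).
function buildIdf(chunks: Chunk[]): Map<string, number> {
  const df = new Map<string, number>();
  chunks.forEach((c) => {
    new Set(tokenize(`${c.title} ${c.text}`)).forEach((term) => {
      df.set(term, (df.get(term) ?? 0) + 1);
    });
  });
  const idf = new Map<string, number>();
  df.forEach((n, term) => {
    idf.set(term, Math.log((1 + chunks.length) / (1 + n)) + 1);
  });
  return idf;
}

// TF-IDF vector: each occurrence adds the term's IDF weight.
// Terms that are not in the index vocabulary are dropped.
function vectorize(tokens: string[], idf: Map<string, number>): Map<string, number> {
  const vec = new Map<string, number>();
  tokens.forEach((t) => {
    const w = idf.get(t);
    if (w !== undefined) vec.set(t, (vec.get(t) ?? 0) + w);
  });
  return vec;
}

// Cosine similarity between two sparse vectors; 0 if either is empty.
function cosine(a: Map<string, number>, b: Map<string, number>): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  a.forEach((w, t) => {
    na += w * w;
    dot += w * (b.get(t) ?? 0);
  });
  b.forEach((w) => {
    nb += w * w;
  });
  return na === 0 || nb === 0 ? 0 : dot / Math.sqrt(na * nb);
}

const TITLE_BOOST = 2; // assumed weight, not taken from the PR

// Rank chunks by body similarity plus boosted title similarity, best first.
function rankChunks(query: string, chunks: Chunk[]): Chunk[] {
  const idf = buildIdf(chunks);
  const q = vectorize(tokenize(query), idf);
  return chunks
    .map((chunk) => ({
      chunk,
      score:
        cosine(q, vectorize(tokenize(chunk.text), idf)) +
        TITLE_BOOST * cosine(q, vectorize(tokenize(chunk.title), idf)),
    }))
    .sort((x, y) => y.score - x.score)
    .map((s) => s.chunk);
}
```

With this weighting, a page whose title matches the query outranks a page that only mentions the term in passing, which is the behavior the description is aiming for.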

How it works

  • Toggle off (default): Pagefind keyword search — unchanged behavior
  • Toggle on: Pagefind finds candidates, TF-IDF re-ranks by relevance, best match first
  • Checkbox persists in localStorage
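The toggle behavior listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the storage key, the `Result` shape with a precomputed `score`, and the function names are all hypothetical. The storage is passed in as a small interface so the sketch also runs outside a browser; in the page itself one would pass `window.localStorage`.

```typescript
// Minimal subset of the Web Storage API used by this sketch.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const STORAGE_KEY = "search-relevance-ranking"; // hypothetical key name

// A missing key means the toggle is off, which is the default.
function isRelevanceEnabled(store: KeyValueStore): boolean {
  return store.getItem(STORAGE_KEY) === "true";
}

function setRelevanceEnabled(store: KeyValueStore, on: boolean): void {
  store.setItem(STORAGE_KEY, String(on));
}

// Hypothetical candidate shape: `score` stands in for a TF-IDF relevance score.
interface Result {
  url: string;
  score: number;
}

// Toggle off: keep the keyword-search order untouched.
// Toggle on: re-rank the same candidates by score, best match first.
function orderResults(candidates: Result[], store: KeyValueStore): Result[] {
  if (!isRelevanceEnabled(store)) return candidates;
  return [...candidates].sort((a, b) => b.score - a.score);
}
```

Re-ranking only the candidates the keyword search already returned keeps the default path unchanged and limits the scoring work to a small result set.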

Changes (6 files)

  • Search.astro — relevance ranking toggle + TF-IDF scoring engine merged into existing Pagefind search
  • build-search-index.mjs — build-time script: chunks ~108 markdown files into ~1638 segments for TF-IDF scoring
  • docs-search.test.ts — 9 search quality tests (schema, coverage, relevance, data quality)
  • global.css — search result styling cleanup
  • docs/package.json — build:search script
  • docs/.gitignore — exclude generated search-index.json
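For readers unfamiliar with build-time chunking, the sketch below shows one plausible way to split a markdown file into segments. The PR does not say how `build-search-index.mjs` chunks its input, so splitting on headings is an assumption here, as are the `Segment` shape and the `chunkMarkdown` name.

```typescript
// Hypothetical segment shape for a search index entry.
interface Segment {
  file: string;
  heading: string;
  text: string;
}

// Split markdown into one segment per heading section (assumed strategy).
// Caveat: lines starting with '#' inside fenced code blocks would also be
// treated as headings by this simple version.
function chunkMarkdown(file: string, markdown: string): Segment[] {
  const segments: Segment[] = [];
  let heading = "";
  let lines: string[] = [];

  // Emit the section collected so far, skipping empty ones.
  const flush = (): void => {
    const text = lines.join("\n").trim();
    if (text.length > 0) segments.push({ file, heading, text });
    lines = [];
  };

  for (const line of markdown.split("\n")) {
    const m = /^#{1,6}\s+(.*)$/.exec(line);
    if (m) {
      flush(); // close the previous section under its own heading
      heading = m[1].trim();
    } else {
      lines.push(line);
    }
  }
  flush();
  return segments;
}
```

A build script could run a function like this over every docs file and serialize the combined array with `JSON.stringify` into the generated index file.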

Testing

  • 9/9 search quality tests pass locally
  • docs-quality CI check passes
  • Zero new dependencies — pure JS scoring

Decisions (posted on #40)

  • Static JSON index for MVP (not SQLite)
  • Fuse.js as upgrade path (PR 2)
  • Extractable as standalone astro-search-quality plugin
  • Single search UX with feature flag toggle

Relates to diberry#40

@diberry diberry marked this pull request as draft March 27, 2026 18:33
@diberry diberry force-pushed the squad/40-semantic-search branch 5 times, most recently from 455dc4b to cbe69de Compare March 27, 2026 20:30
@diberry diberry closed this Mar 27, 2026
@diberry diberry reopened this Mar 27, 2026
@diberry diberry changed the title feat(docs): enhance search with TF-IDF relevance ranking (#40) wip-feat(docs): enhance search with TF-IDF relevance ranking (#40) Mar 27, 2026

diberry commented Mar 27, 2026

Live Squad site: the regular search is keyword-based

image

Live Squad site: a chat-style query

image

Proposed change in this PR: keyword search remains the default and still works

image

Proposed change in this PR: a chat-style query with relevance ranking checked

image

@diberry diberry force-pushed the squad/40-semantic-search branch from cbe69de to a8f2dce Compare March 27, 2026 22:56
@diberry diberry changed the title wip-feat(docs): enhance search with TF-IDF relevance ranking (#40) feat(docs): enhance search with TF-IDF relevance ranking (#40) Mar 27, 2026
@diberry diberry force-pushed the squad/40-semantic-search branch from a8f2dce to 26af2fb Compare March 27, 2026 23:02
@diberry diberry changed the title feat(docs): enhance search with TF-IDF relevance ranking (#40) [wip] feat(docs): enhance search with TF-IDF relevance ranking (#40) Mar 27, 2026
Search improvements:
- TF-IDF relevance ranking toggle (localStorage persistent)
- Pagefind basePath fix for dev mode
- Conditional keyword highlights (on when relevance off)
- Playwright e2e tests with self-contained webServer
- build:pagefind convenience script for dev
- CI: docs search index build step for test suite

Team review: Flight approve, FIDO approve
Relates to #40

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@diberry diberry force-pushed the squad/40-semantic-search branch from 26af2fb to 120ad0c Compare March 27, 2026 23:09
@diberry diberry marked this pull request as ready for review March 27, 2026 23:14
@diberry diberry changed the title [wip] feat(docs): enhance search with TF-IDF relevance ranking (#40) feat(docs): enhance search with TF-IDF relevance ranking (#40) Mar 28, 2026
@diberry diberry marked this pull request as draft March 28, 2026 14:13

diberry commented Mar 28, 2026

Closing -- will re-open via fork-first pipeline when fully polished.

@diberry diberry closed this Mar 28, 2026