PRD: SQLite vector search for docs chat — feasibility and architecture #40

@diberry

Problem

The Squad docs site needs a chat/search experience that lets users ask questions and get answers grounded in the documentation. The question is: can we do this with SQLite + vector search instead of requiring a cloud vector database?

Context

  • Squad already uses SQLite (node:sqlite in Node 22.5+) for the StorageProvider and session store
  • The docs site is custom Astro (not Starlight)
  • We want to avoid adding cloud dependencies (Azure AI Search, Pinecone, etc.) if possible
  • The docs content is markdown files in docs/src/content/

Research Questions

1. SQLite Vector Search Capabilities

  • Do sqlite-vec or sqlite-vss provide the vector similarity search we need (top-k nearest-neighbor queries over chunk embeddings)?
  • What's the Node.js integration story? (better-sqlite3 vs node:sqlite built-in)
  • Performance: how many vectors can SQLite handle before degrading? (we have ~100 docs pages, an estimated ~500 chunks)
  • Comparison: sqlite-vec vs sqlite-vss vs vectorlite — which is production-ready?
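As a baseline for the performance question, here is a minimal sketch of exhaustive (brute-force) cosine-similarity search in plain TypeScript. It is not the sqlite-vec or sqlite-vss API; the `Chunk` shape and the `topK` helper are hypothetical names used only for illustration. At a few hundred vectors, a scan like this may already be fast enough to serve as a reference point when comparing extensions.

```typescript
// Hypothetical in-memory record: one embedding per docs chunk.
interface Chunk {
  id: string;
  embedding: Float32Array;
}

// Cosine similarity between two vectors of equal length.
function cosineSimilarity(a: Float32Array, b: Float32Array): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk against the query and keep the k best matches.
function topK(
  query: Float32Array,
  chunks: Chunk[],
  k: number
): { id: string; score: number }[] {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

Any extension under evaluation should return the same ranking as this scan on a small test set, which makes it a convenient correctness check as well as a latency baseline.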

2. Embedding Pipeline

  • What embedding model to use? (local vs API; for local, all-MiniLM-L6-v2 via @xenova/transformers, since republished as @huggingface/transformers?)
  • Chunking strategy for markdown docs (by heading? by paragraph? fixed token windows?)
  • When to generate embeddings? (build time? squad docs:build step? CI?)
  • Storage: embed the SQLite DB in the docs build output? Or generate at runtime?
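To make the "by heading" chunking option concrete, the sketch below splits a markdown string into one chunk per heading. It is one possible strategy under simple assumptions (ATX `#` headings only, no frontmatter handling, no token-length cap); `DocChunk` and `chunkByHeading` are hypothetical names, not part of Squad or Astro.

```typescript
// Hypothetical chunk shape: the nearest heading plus the text under it.
interface DocChunk {
  heading: string;
  text: string;
}

function chunkByHeading(markdown: string): DocChunk[] {
  const chunks: DocChunk[] = [];
  let heading = ""; // text before the first heading gets an empty heading
  let body: string[] = [];
  let inFence = false;

  // Emit the accumulated body as a chunk, skipping empty sections.
  const flush = (): void => {
    const text = body.join("\n").trim();
    if (text.length > 0) chunks.push({ heading, text });
    body = [];
  };

  for (const line of markdown.split(/\r?\n/)) {
    // Track fenced code blocks so "# comment" lines inside code are not headings.
    if (/^\s*(```|~~~)/.test(line)) inFence = !inFence;
    const match = inFence ? null : /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      flush();
      heading = match[2].trim();
    } else {
      body.push(line);
    }
  }
  flush();
  return chunks;
}
```

Keeping the heading alongside the text lets the chat UI link an answer back to a specific section, and long sections could later be split further by paragraph or fixed token windows.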

3. Chat UI Architecture

  • Astro component for chat overlay/sidebar
  • Query flow: user question → embed → vector search → retrieve chunks → LLM answer
  • LLM for answer generation: use an API (OpenAI, Azure OpenAI) or keep it local?
  • Fallback: if no vector match, fall back to keyword search?
  • Streaming responses?
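The query flow and keyword fallback listed above can be sketched as a single function with the embedding model, vector store, keyword index, and LLM injected as dependencies. Everything here is hypothetical: the `ChatDeps` interface, the `minScore` threshold of 0.3, and the default `k` of 5 are placeholders to be tuned or replaced, and streaming is left out.

```typescript
interface Hit {
  chunkId: string;
  text: string;
  url: string;
  score: number;
}

// Hypothetical seams for the four pluggable parts of the pipeline.
interface ChatDeps {
  embed(text: string): Promise<Float32Array>;
  vectorSearch(query: Float32Array, k: number): Promise<Hit[]>;
  keywordSearch(text: string, k: number): Promise<Hit[]>;
  generate(question: string, context: Hit[]): Promise<string>;
}

interface Answer {
  text: string;
  sources: string[];
  usedFallback: boolean;
}

async function answerQuestion(
  question: string,
  deps: ChatDeps,
  minScore = 0.3, // assumed similarity cutoff, not a measured value
  k = 5
): Promise<Answer> {
  // user question -> embed -> vector search
  const queryVector = await deps.embed(question);
  const vectorHits = await deps.vectorSearch(queryVector, k);
  let hits = vectorHits.filter((h) => h.score >= minScore);

  // Fallback: no sufficiently similar chunk, so try keyword search.
  let usedFallback = false;
  if (hits.length === 0) {
    hits = await deps.keywordSearch(question, k);
    usedFallback = true;
  }
  if (hits.length === 0) {
    return { text: "No matching documentation found.", sources: [], usedFallback };
  }

  // retrieve chunks -> LLM answer, returning de-duplicated source links
  const text = await deps.generate(question, hits);
  const sources = Array.from(new Set(hits.map((h) => h.url)));
  return { text, sources, usedFallback };
}
```

Because each dependency is an interface, the full-local, hybrid, and cloud options can be compared by swapping implementations without touching the flow itself.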

4. Alternative Approaches to Evaluate

  • Full local: sqlite-vec + local embeddings + local LLM (Ollama); zero cloud deps
  • Hybrid: sqlite-vec + local embeddings + cloud LLM (OpenAI API); minimal cloud
  • Cloud: Azure AI Search + Azure OpenAI; full cloud, most capable
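  • Existing tools: Can we just use Pagefind (the static search library that Starlight bundles; it is not built into Astro itself, so a custom Astro site would add it separately)? Does it meet the need?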
  • GitHub Copilot integration: Can we leverage Copilot's knowledge base features instead of building custom?

5. Build vs Buy

  • Evaluate existing doc-chat solutions: Markprompt, Inkeep, Mendable
  • Cost comparison: self-hosted SQLite vs managed vector DB
  • Maintenance burden: keeping embeddings in sync with docs changes
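One way to bound the maintenance burden is to re-embed only the chunks whose text changed. The sketch below compares a hash of each current chunk against the hash stored with its embedding and returns what to re-embed and what to delete. The names `planSync` and `fnv1a` are hypothetical, and the 32-bit FNV-1a variant here (computed over UTF-16 code units) is used only to keep the example dependency-free; a real pipeline would more likely use a cryptographic hash such as SHA-256.

```typescript
// 32-bit FNV-1a over UTF-16 code units; illustrative, not collision-resistant.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

interface SyncPlan {
  toEmbed: string[]; // chunk ids that are new or whose text changed
  toDelete: string[]; // chunk ids no longer present in the docs
}

// current: chunk id -> chunk text from the docs source
// stored:  chunk id -> hash saved alongside the existing embedding
function planSync(current: Map<string, string>, stored: Map<string, string>): SyncPlan {
  const toEmbed: string[] = [];
  const toDelete: string[] = [];
  current.forEach((text, id) => {
    if (stored.get(id) !== fnv1a(text)) toEmbed.push(id);
  });
  stored.forEach((_hash, id) => {
    if (!current.has(id)) toDelete.push(id);
  });
  return { toEmbed, toDelete };
}
```

Run at build time or in CI, a plan like this keeps embedding cost proportional to the size of a docs change rather than the size of the whole site.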

Deliverables for the PRD

  1. Feasibility report: Can SQLite vector search handle our scale (~100 pages, ~500 chunks)?
  2. Architecture proposal: Recommended stack with trade-offs table
  3. Prototype scope: What's the MVP? (search only? chat? streaming?)
  4. Cost estimate: Per-query cost for each approach
  5. Implementation plan: Phased rollout with dependencies
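For the feasibility report, a rough size estimate helps frame the scale question. The sketch below assumes float32 storage and the 384-dimensional output of all-MiniLM-L6-v2; it counts raw vector bytes only and ignores SQLite page overhead, index structures, and the chunk text itself. `estimateIndex` is a hypothetical helper.

```typescript
// Back-of-envelope estimate for a flat (unindexed) vector table.
function estimateIndex(
  chunks: number,
  dimensions: number,
  bytesPerValue = 4 // float32
): { rawBytes: number; multiplyAddsPerQuery: number } {
  return {
    rawBytes: chunks * dimensions * bytesPerValue,
    // A brute-force scan does one multiply-add per dimension per chunk.
    multiplyAddsPerQuery: chunks * dimensions,
  };
}

const estimate = estimateIndex(500, 384);
// rawBytes = 768000 (750 KiB), multiplyAddsPerQuery = 192000
```

Under these assumptions the vectors for ~500 chunks occupy well under 1 MiB, which suggests the database could plausibly ship inside the docs build output; the report should confirm this with measurements.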

Success Criteria

  • User can ask a natural language question about Squad docs
  • Answers are grounded in actual docs content (with source links)
  • Latency < 3 seconds for first response
  • Works without requiring users to have API keys (for public docs site)
  • Embedding pipeline runs automatically on docs changes
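The grounding criterion (answers tied to docs content with source links) largely comes down to how retrieved chunks are presented to the LLM. As a minimal sketch, the hypothetical `buildGroundedPrompt` helper below numbers each retrieved chunk and asks the model to cite those numbers; the prompt wording is an assumption, not a tested template.

```typescript
// Hypothetical shape for a retrieved chunk with its docs link.
interface Source {
  title: string;
  url: string;
  text: string;
}

function buildGroundedPrompt(question: string, sources: Source[]): string {
  // Number each source so citations like [1] map back to a URL.
  const context = sources
    .map((s, i) => `[${i + 1}] ${s.title} (${s.url})\n${s.text}`)
    .join("\n\n");
  return [
    "Answer using only the numbered sources below and cite them like [1].",
    "If the sources do not contain the answer, say so.",
    "",
    context,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The chat UI can then resolve each cited number to its `url` and render it as a source link under the answer.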

Labels

squad, enhancement, go:needs-research

Metadata

Labels

  • enhancement: New feature or request
  • go:needs-research: Needs investigation
  • squad: Squad triage inbox (Lead will assign to a member)
  • squad:fido: Assigned to FIDO (Quality Owner)
