Implement Vector Storage with LanceDB and Transformers.js #4

@prosdev

Description

Implement a vector storage and retrieval system that uses LanceDB for embedded storage and @xenova/transformers for local embedding generation. This replaces the original plan to use Chroma DB, because an embedded store with local embeddings better supports a local-first, serverless architecture.

Acceptance Criteria

  • Vector storage system initializes using LanceDB (embedded)
  • Embedding generation uses @xenova/transformers (all-MiniLM-L6-v2)
  • System downloads and caches the embedding model on first run
  • Vector search returns relevant results ranked by cosine similarity
  • Metadata is stored efficiently alongside vectors
  • System handles standard repository sizes efficiently without a separate server process
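The ranking criterion above can be illustrated with a small, library-independent reference. This is only a sketch: `cosineSimilarity` and `rankBySimilarity` are hypothetical helper names, not part of LanceDB or Transformers.js, and in the real system the search itself would presumably be delegated to LanceDB. A brute-force reference like this may still be useful as an oracle in unit tests.

```typescript
// Cosine similarity of two equal-length vectors; returns 0 for zero vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vector length mismatch");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0;
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

interface ScoredItem<T> {
  item: T;
  score: number;
}

// Brute-force ranking (hypothetical helper): score every item against the
// query, sort by descending similarity, and keep the top `limit` results.
function rankBySimilarity<T extends { vector: number[] }>(
  query: number[],
  items: T[],
  limit: number,
): ScoredItem<T>[] {
  return items
    .map((item) => ({ item, score: cosineSimilarity(query, item.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

A unit test could compare the ids returned by the real store against `rankBySimilarity` on a small fixture, allowing for ties and for approximate search if an index is used.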

Technical Requirements

  • Integrate @lancedb/lancedb for serverless vector storage
  • Integrate @xenova/transformers for local embedding generation
  • Implement efficient batching for embedding generation
  • Define clear interfaces for 'VectorStore' and 'Embedder'
  • Ensure cross-platform compatibility for the native bindings
  • Add unit tests for storage and retrieval operations
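As a starting point for discussion, the interface and batching requirements above might look roughly like the sketch below. Only the names 'VectorStore' and 'Embedder' come from this issue; every method, field, the `chunk` and `embedInBatches` helpers, and the default batch size of 32 are assumptions made for illustration. The sketch deliberately avoids calling @lancedb/lancedb or @xenova/transformers, so concrete implementations would wrap those libraries behind these shapes.

```typescript
// Hypothetical interface shapes; the issue only names 'Embedder' and 'VectorStore'.
interface Embedder {
  // Length of each embedding vector (384 for all-MiniLM-L6-v2).
  readonly dimension: number;
  // Embeds a batch of texts; returns one vector per input, in input order.
  embedBatch(texts: string[]): Promise<number[][]>;
}

interface VectorRecord {
  id: string;
  vector: number[];
  metadata: Record<string, unknown>;
}

interface SearchResult {
  id: string;
  score: number;
  metadata: Record<string, unknown>;
}

interface VectorStore {
  add(records: VectorRecord[]): Promise<void>;
  search(query: number[], limit: number): Promise<SearchResult[]>;
}

// Splits an array into consecutive chunks of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new Error("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Embeds texts batch by batch, sequentially, so that only one batch is
// processed by the model at a time. The default of 32 is an assumed value.
async function embedInBatches(
  embedder: Embedder,
  texts: string[],
  batchSize = 32,
): Promise<number[][]> {
  const vectors: number[][] = [];
  for (const batch of chunk(texts, batchSize)) {
    vectors.push(...(await embedder.embedBatch(batch)));
  }
  return vectors;
}
```

Keeping the batching logic separate from the embedder lets the unit tests use a fake `Embedder` and exercise storage and retrieval without downloading the model.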

Branch: feat/vector-storage
Priority: High
Estimate: 4 days
Parent Epic: #1
