## Description
Unless we have an open-source LLM with a 1M+ token context length, we need vector search for the assistant API: #1273 (comment)
Even with a very large context length, it is still far cheaper to use vector search with embeddings, and it can all easily be done on the CPU.
## Implementation
I see three main options for adding vector search:
1. Simple in-memory brute-force search, regenerating the embeddings on startup instead of saving them to storage.
2. Add one or more vector databases as a backend.
3. Connect to an external database.
The first is easy to implement and has no upkeep because everything is flushed on restart; changing the chunk size or any other hyperparameter costs no more than a restart does. There is plenty of prior art in Go:
- https://github.com/marekgalovic/anndb
- https://github.com/aws-samples/gofast-hnsw/?tab=readme-ov-file#brute-search-performance
- Milvus, Weaviate, Gorse
Even implementing HNSW or Annoy would not be difficult; the main problems I see are the classic database issues. So I am in favor of the first or third option, with no in-between. Although saving embeddings to a flat file could be OK, just not in the first iteration.
I did experiment with BadgerDB, but talked myself out of it: https://github.com/richiejp/badger-cybertron-vector/blob/main/main.go. The problem is that it complicates comparing the vectors, and then we also have to maintain state between restarts.
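By contrast, the flat-file variant mentioned above could be little more than round-tripping the whole index with `encoding/gob`. This is only a sketch under the assumption of an in-memory `Chunk` slice (the type and file path are invented here); on any load error we can simply fall back to re-embedding everything, which preserves the flush-on-restart behavior:

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
)

// Chunk is a hypothetical index entry: a text fragment plus its embedding.
type Chunk struct {
	Text      string
	Embedding []float32
}

// SaveChunks writes the whole index to a single flat file with encoding/gob.
func SaveChunks(path string, chunks []Chunk) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return gob.NewEncoder(f).Encode(chunks)
}

// LoadChunks reads the index back. If this fails for any reason the
// caller can just regenerate the embeddings from scratch.
func LoadChunks(path string) ([]Chunk, error) {
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	var chunks []Chunk
	err = gob.NewDecoder(f).Decode(&chunks)
	return chunks, err
}

func main() {
	in := []Chunk{{"hello", []float32{0.1, 0.2}}}
	if err := SaveChunks("/tmp/index.gob", in); err != nil {
		panic(err)
	}
	out, err := LoadChunks("/tmp/index.gob")
	if err != nil {
		panic(err)
	}
	fmt.Println(out[0].Text)
}
```

Unlike the BadgerDB route, the vectors stay in ordinary slices once loaded, so the comparison code is untouched; the file is only a cache.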
## API
Obviously we will follow the OpenAI API as in #1273, but I think it would also make sense to have some API for doing simple search without an LLM, just so people can do fuzzy search with LocalAI instead of reaching for another tool. Suggestions for how this API should look are welcome.