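Make the embedding service detect a CUDA device at startup and load the model on the GPU when one is available, falling back to CPU otherwise. For SentenceTransformer this means passing device="cuda" to the constructor; FastEmbed runs on ONNX Runtime, so there the GPU is selected through the CUDA execution provider (providers=["CUDAExecutionProvider"]) rather than a device argument. No code change is needed on the Qdrant side, since Qdrant only stores and searches the vectors it receives. This should reduce ingestion time significantly for large corpora.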
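A minimal sketch of the detection step, assuming the service uses SentenceTransformer on top of PyTorch. The function names pick_device and load_model and the model name are hypothetical placeholders, not taken from the existing service.

```python
import importlib.util


def pick_device() -> str:
    """Return "cuda" if PyTorch can see a CUDA device, otherwise "cpu"."""
    # Without PyTorch installed there is nothing to probe, so use the CPU.
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    return "cuda" if torch.cuda.is_available() else "cpu"


def load_model(model_name: str = "sentence-transformers/all-MiniLM-L6-v2"):
    """Load the embedding model on the detected device.

    The default model name is an assumption made for illustration only.
    """
    # Imported lazily so device detection works even without this package.
    from sentence_transformers import SentenceTransformer

    return SentenceTransformer(model_name, device=pick_device())
```

Calling load_model() once at service startup keeps the device choice in one place, and the same code path runs unchanged on machines without a GPU.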