A common use case: a user stores tags in a text column. Among all records that contain a given tag (e.g. `city=shanghai`), they need to find the records nearest to a given vector (e.g. the embedding of a traffic light image).
Currently, `Scanner` does not support setting both an FTS query and a vector query. I think we can support full-text search as a filter in the vector scan, for better performance and usability.
Since FTS acts as a filter on the vector scan, its behavior is also affected by `prefilter`:
- If `prefilter` is true, we perform the vector search over the rows returned by FTS.
- If `prefilter` is false, we perform the vector search first, then drop the results that do not meet the FTS conditions.
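
To make the two modes concrete, here is a minimal brute-force sketch in plain Python. It does not use the Lance API. The function names (`fts_match`, `vector_search_with_fts`), the whitespace tokenization, and the sample rows are all assumptions for illustration only.

```python
from math import dist

# Hypothetical sample rows: a text column of tags plus a 2-d vector.
rows = [
    {"id": 1, "tags": "city=beijing", "vec": (0.0, 0.0)},
    {"id": 2, "tags": "city=shanghai", "vec": (1.0, 0.0)},
    {"id": 3, "tags": "city=beijing", "vec": (0.1, 0.0)},
    {"id": 4, "tags": "city=shanghai", "vec": (2.0, 0.0)},
]

def fts_match(text, term):
    # Toy stand-in for FTS: exact match on whitespace-separated tokens.
    return term in text.split()

def vector_search_with_fts(rows, query_vec, term, k, prefilter):
    if prefilter:
        # Prefilter: restrict to FTS matches, then take the k nearest.
        candidates = [r for r in rows if fts_match(r["tags"], term)]
        return sorted(candidates, key=lambda r: dist(r["vec"], query_vec))[:k]
    # Postfilter: take the k nearest overall, then drop non-matching rows.
    nearest = sorted(rows, key=lambda r: dist(r["vec"], query_vec))[:k]
    return [r for r in nearest if fts_match(r["tags"], term)]

pre = vector_search_with_fts(rows, (0.0, 0.0), "city=shanghai", 3, True)
post = vector_search_with_fts(rows, (0.0, 0.0), "city=shanghai", 3, False)
print([r["id"] for r in pre])
print([r["id"] for r in post])
```

With this data, the prefilter run can return every matching row up to `k`, while the postfilter run may return fewer than `k` rows, because nearby non-matching rows take up slots in the top-k before the filter is applied.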
LanceDB already supports hybrid search, a powerful feature that allows arbitrary sorting of query results through a Reranker. However, it requires executing two separate queries, even when we only want a vector search over the rows that match an FTS filter. I think Lance can handle this case better.