feat: add support for Valkey vector database [Internal Feedback] #1
rileydes-improving wants to merge 1 commit into
Conversation
Jonathan-Improving
left a comment
QA Review — Valkey Vector Database Backend
Verdict: REQUEST_CHANGES — 2 must-fix, 5 suggestions
🔴 Must Fix
- Use `valkey-glide` instead of the `valkey` package — GLIDE is the official recommended client with full FT.* module support, type-safe APIs, and better connection management. No blocking constraint exists; every operation in this file has a GLIDE equivalent. See inline comment on line 12.
- IP score normalization formula is incorrect (line 631) — treats `__vector_score` as a raw inner product, but valkey-search returns it as a distance. The re-sort workaround at line 616 confirms the formula is wrong.
🟡 Suggestions
- Missing `DIALECT 2` in `query()` and `delete()` FT.SEARCH calls (lines 414-427, 487-495)
- N+1 query pattern in `get()` — individual HGETALL per key instead of a pipeline (lines 443-445)
- `_escape_tag_value` missing `?` wildcard (line 46)
- No config validation for `VALKEY_DISTANCE_METRIC` at init (line 86)
- `query()` hardcoded 10000 limit may silently truncate (line 413)
✅ Positives
- Excellent startup version validation with clear error messages
- Dimension mismatch detection prevents silent data corruption
- Correct FT.CREATE schema construction
- Correct KNN query syntax with `DIALECT 2` in `search()`
- Good edge case handling (`$in: []`, empty items)
- Clean integration with existing factory/type patterns
- Thorough PR description with known workarounds documented
```python
try:
    result = self.client.execute_command(
        'FT.SEARCH',
        self._index_name(collection_name),
        query_str,
        'RETURN',
        3,
        'id',
        'text',
        'metadata_json',
        'LIMIT',
        0,
        effective_limit,
    )
```
🟡 Suggestion — Add DIALECT 2 to this FT.SEARCH call
search() correctly uses 'DIALECT', 2 but query() omits it. DIALECT 2 is required for correct parsing of filter expressions with negation (-@field:{...} from $ne). Without it, complex filters may be parsed incorrectly.
Add `'DIALECT', 2,` after the `effective_limit,` line (same pattern as `search()`).
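For concreteness, a sketch of the argument list with `DIALECT 2` appended — the helper name and structure are illustrative, not the PR's actual code:

```python
def build_query_args(index_name: str, query_str: str, effective_limit: int) -> list:
    # Hypothetical helper mirroring the FT.SEARCH call above,
    # with DIALECT 2 appended at the end.
    return [
        'FT.SEARCH', index_name, query_str,
        'RETURN', 3, 'id', 'text', 'metadata_json',
        'LIMIT', 0, effective_limit,
        'DIALECT', 2,  # needed for correct parsing of negation/union filters
    ]
```

The resulting list can be splatted into `execute_command(*args)` unchanged.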
```python
    'Set it to your Valkey server URL (e.g., valkey://localhost:6379).'
)

self.collection_prefix = VALKEY_COLLECTION_PREFIX
```
🟡 Suggestion — Validate VALKEY_DISTANCE_METRIC at init
Invalid VALKEY_INDEX_TYPE silently falls back to FLAT (line 237), but invalid VALKEY_DISTANCE_METRIC (e.g., "HAMMING") passes through to FT.CREATE and fails with a cryptic server error. Validate against {'COSINE', 'L2', 'IP'} here with a clear message.
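A minimal validation sketch — the helper name is hypothetical; the allowed set comes from the config options documented in this PR:

```python
_VALID_METRICS = {'COSINE', 'L2', 'IP'}

def validate_distance_metric(metric: str) -> str:
    # Fail fast with a clear message instead of a cryptic FT.CREATE server error.
    normalized = str(metric).upper()
    if normalized not in _VALID_METRICS:
        raise ValueError(
            f'Invalid VALKEY_DISTANCE_METRIC {metric!r}; '
            f'expected one of {sorted(_VALID_METRICS)}'
        )
    return normalized
```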
```python
def _escape_tag_value(value: str) -> str:
    """Escape special characters for RediSearch/Valkey-Search TAG field queries."""
    return re.sub(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~\\/| \t\n\r])', r'\\\1', str(value))
```
🟢 Suggestion — Missing ? wildcard in escape regex
The character class escapes * but not ?. In valkey-search TAG queries, ? is a single-character wildcard. A filter value containing ? could match unintended documents. Add ? to the character class.
```python
result = self.client.execute_command(
    'FT.SEARCH',
    index_name,
    filter_expr,
    'NOCONTENT',
    'LIMIT',
    0,
    page_size,
)
```
🟡 Suggestion — Add DIALECT 2 here as well
Same issue as query() — this FT.SEARCH call for delete-by-filter omits DIALECT 2. Negation filters from $ne may not parse correctly without it.
```python
# L2 distance: 0 (identical) to infinity.
return 1.0 / (1.0 + score)
# IP (Inner Product): assumes unit-normalized vectors.
return max(0.0, min(1.0, (score + 1.0) / 2.0))
```
🔴 Must Fix — IP score normalization formula is incorrect
valkey-search returns __vector_score as a distance (lower = more similar) for all metrics, including IP where it returns 1 - inner_product. This formula treats the raw score as an inner product in [-1, 1], but it's actually a distance in [0, 2] for unit vectors (same range as COSINE).
The re-sort at line 616-618 is a symptom — if normalization were correct and monotonic, the server's ascending-distance order would already be correct.
Suggested fix: `max(0.0, 1.0 - score)` — IP distance for unit vectors lies in [0, 2] (the same range as COSINE), so `1.0 - score` maps it to similarity in [-1, 1], and clamping yields [0, 1].
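The suggested fix as a runnable sketch — the helper name is hypothetical, and it assumes unit-normalized vectors and the distance semantics described above:

```python
def normalize_ip_score(score: float) -> float:
    # valkey-search returns 1 - inner_product as a distance in [0, 2]
    # for unit vectors; map to similarity and clamp to [0, 1].
    return max(0.0, min(1.0, 1.0 - score))
```

Because this mapping is monotonically decreasing in the distance, the server's ascending-distance order already yields descending similarity, making the client-side re-sort unnecessary.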
```python
cursor, keys = self.client.scan(cursor=cursor, match=f'{prefix}*', count=500)
for key in keys:
    fields = self.client.hgetall(key)
```
🟡 Suggestion — N+1 query pattern: use pipeline for HGETALL calls
Each key triggers a separate round-trip. For collections with hundreds of documents this is a significant perf issue. Batch with a pipeline:
```python
pipe = self.client.pipeline()
for key in keys:
    pipe.hgetall(key)
results = pipe.execute()
for fields in results:
    ...
```

(Note: if migrating to GLIDE, this becomes a `ClusterBatch` or similar pattern.)
```python
import struct
from urllib.parse import urlparse

import valkey
```
🟡 Advisory — Consider valkey-glide for typed FT.* APIs
The valkey package (redis-py fork) is the legacy client. The recommended path forward is valkey-glide — the official high-performance GLIDE client. GLIDE has full FT.* module support via module-level functions:
| Current (`valkey`) | GLIDE equivalent (`valkey-glide`) |
|---|---|
| `execute_command('FT.CREATE', ...)` | `ft.create(client, index_name, schema, options)` |
| `execute_command('FT.SEARCH', ...)` | `ft.search(client, index_name, query, options)` |
| `execute_command('FT.DROPINDEX', ...)` | `ft.dropindex(client, index_name)` |
| `execute_command('FT._LIST')` | `ft.list(client)` |
| `execute_command('FT.INFO', ...)` | `ft.info(client, index_name)` |
| `execute_command('MODULE', 'LIST')` | `client.custom_command(['MODULE', 'LIST'])` |
Benefits of switching:
- Type-safe schema construction — `VectorField`, `VectorFieldAttributesHnsw`, `FtCreateOptions`, etc. vs raw string arrays
- Cleaner response format — `ft.search()` returns `[count, {key: {field: value}}]` (a dict), eliminating the need for `_decode_kv_pairs` / `_parse_ft_result` manual parsing
- Better connection management — multiplexed, thread-safe, auto-reconnect
- Typed search options — `FtSearchOptions`, `FtSearchLimit` vs positional string args
Package: `valkey-glide` (async) or `valkey-glide-sync` (sync). Import pattern:

```python
from glide import ft, GlideClient, FtCreateOptions, VectorField, ...
# or sync:
from glide_sync import ft, GlideClient, ...
```

No blocking constraint exists — every operation in this file has a GLIDE equivalent.
Clarification on tone: This isn't necessarily "wrong" — there may be valid reasons to use valkey over valkey-glide here (e.g., matching the existing redis==7.4.0 dependency already in pyproject.toml, minimizing unfamiliar APIs for upstream reviewers, or GLIDE Python package maturity concerns).
The key ask is: please document the rationale in the PR description before submitting upstream. If it was intentional, a sentence explaining why is sufficient. If it was an oversight, it's worth evaluating — GLIDE would simplify the response parsing significantly and eliminate the manual _decode_kv_pairs / _parse_ft_result machinery.
MatthiasHowellYopp
left a comment
Python Code Review — Additional Findings
Verdict: Request changes — 5 items to address before merge.
1. IP score normalization is incorrect (critical)
Line 631 — _normalize_score for IP metric:
```python
# IP (Inner Product): assumes unit-normalized vectors.
return max(0.0, min(1.0, (score + 1.0) / 2.0))
```

valkey-search returns `1 - inner_product` as the distance (not the raw IP), so for unit vectors the score is in [0, 2]. The formula `(score + 1.0) / 2.0` maps [-1, 1] → [0, 1], which is the wrong input domain. This will produce incorrect similarity rankings for the IP metric.
Suggested fix: 1.0 - score for unit-normalized vectors (maps distance [0, 2] → similarity [-1, 1], then clamp), or document and validate the expected score semantics from valkey-search for your version.
2. Missing DIALECT 2 in query() and delete() FT.SEARCH calls (warning)
Lines 414-427, 487-495 — search() correctly uses DIALECT 2, but query() and delete() omit it. Without DIALECT 2, complex filter expressions (especially those with TAG unions like @field:{val1|val2}) may not parse correctly.
Add 'DIALECT', 2 to both FT.SEARCH calls.
3. N+1 in get() — pipeline the HGETALL calls (warning)
Line 443 — Each key from SCAN gets an individual HGETALL round trip:
```python
for key in keys:
    fields = self.client.hgetall(key)
```

For 500 keys per SCAN batch, that is 500 network round trips. Use a pipeline:
```python
pipe = self.client.pipeline()
for key in keys:
    pipe.hgetall(key)
results = pipe.execute()
for fields in results:
    ...
```

4. Broad exception swallowing hides connection failures (warning)
Lines 350, 420, 490 — search(), query(), and delete() all catch bare Exception and return None / silently continue:
```python
except Exception as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None
```

The caller interprets `None` as "no results" rather than "search failed." Catch `valkey.exceptions.ValkeyError` (or `valkey.exceptions.ConnectionError` + `valkey.exceptions.ResponseError`) specifically. Let unexpected exceptions propagate so transient failures are visible to the application layer.
5. _escape_tag_value — compile regex at module level, add missing ? wildcard (warning)
Line 46 — The regex is compiled on every call and is missing ? (a wildcard in valkey-search TAG queries):
```python
def _escape_tag_value(value: str) -> str:
    return re.sub(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~\\/| \t\n\r])', r'\\\1', str(value))
```

Suggested fix:
```python
_TAG_SPECIAL_RE = re.compile(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~?\\/| \t\n\r])')

def _escape_tag_value(value: str) -> str:
    """Escape special characters for valkey-search TAG field queries."""
    return _TAG_SPECIAL_RE.sub(r'\\\1', str(value))
```

Overall the implementation is well-structured, follows existing backend patterns, and has excellent startup validation. These 5 items are the blockers; the rest of the code is solid.
Notes on the PR description (for upstream submission prep):
Jonathan-Improving
left a comment
Follow-up review — 3 additional findings from a second pass with anti-pattern and security-focused reviewers.
One new critical (error swallowing), two new medium suggestions (unbounded SCAN, filter key validation). These complement the original 7 findings.
```python
ids, documents, metadatas = [], [], []
cursor = 0
while True:
```
🟡 Suggestion — Add a hard upper limit when limit is None
The VectorDBBase.get() signature has no limit parameter, so callers using the base interface will always hit this with limit=None. In that case, this loop scans the entire keyspace with no cap — for a collection with hundreds of thousands of documents, this could OOM the application server or block the Valkey instance with sustained SCAN + HGETALL traffic.
query() caps at 10000 (line 413). Consider the same here:
```python
effective_limit = limit if limit and limit > 0 else 10_000
```

Or at minimum, log a warning when returning large result sets so operators can tune.
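A sketch of the capped loop, written as a pure function over SCAN batches so the idea is testable in isolation (names are illustrative, not the PR's actual code):

```python
def collect_keys(scan_batches, limit=None):
    # scan_batches stands in for successive SCAN results (lists of keys).
    effective_limit = limit if limit and limit > 0 else 10_000  # hard cap
    keys = []
    for batch in scan_batches:
        keys.extend(batch)
        if len(keys) >= effective_limit:
            break  # stop scanning once the cap is reached
    return keys[:effective_limit]
```

In the real method, breaking out of the loop also stops issuing further SCAN/HGETALL round trips, which is the point of the cap.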
```python
    Raises ValueError on unsupported operators rather than silently matching nothing.
    """
    parts = []
    for key, value in filter.items():
```
🟡 Suggestion — Guard against => in constructed filter expressions
The => token in FT.SEARCH syntax delimits a filter from a KNN vector clause. This regex escapes = and > individually (both are in the character class), but if a filter value contains the literal sequence => and the escaping doesn't break it apart, it could inject a KNN clause into a filter-only query like query() or delete().
Looking more carefully: = IS escaped (it's in +=~) and > IS escaped (it's in <>). So => in a value becomes \=\> which is safe. However, the filter keys (line 56: f'@{key}:...') are NOT escaped or validated at all. If a future caller passes user-controlled keys, => injection becomes possible.
Defensive fix — validate keys in _build_filter_expression:
```python
_ALLOWED_FILTER_KEYS = frozenset({'id', 'hash', 'file_id', 'source', 'knowledge_base_id'})

for key in filter.keys():
    if key not in _ALLOWED_FILTER_KEYS:
        raise ValueError(f'Unsupported filter key: {key!r}')
```

This is defense-in-depth — current callers use hardcoded keys, but the function itself has no guard.
```python
except Exception as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None
```
🔴 Must Fix — except Exception swallows all errors silently
This catches everything — KeyboardInterrupt, SystemExit, MemoryError, and programming bugs like TypeError or AttributeError. The caller receives None with no way to distinguish "no results found" from "server is down" or "there's a bug in the code."
Same pattern appears in query() (line 428), delete() (line 471), and delete()-by-filter (line 496).
Fix: Catch only Valkey protocol/connection errors:
```python
except valkey.exceptions.ValkeyError as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None
```

This still handles connection failures, timeouts, and server errors gracefully while letting programming bugs propagate so they're caught during development.
LGTM
Signed-off-by: Riley Des <riley.desserre@improving.com>
c9e6a43 to 20252b5
Pull Request Checklist
Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.
This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.
Before submitting, make sure you've checked the following:
- `dev`.
- `open-webui/docs` covering `VECTOR_DB=valkey`, the new `VALKEY_*` environment variables, and module/version requirements. Backend brief included at `valkey-testing/VALKEY_VECTOR_STORE_BRIEF.md` for reviewers.
- `valkey==6.1.1` (wire-compatible with redis-py). Added to both `backend/requirements.txt` and `pyproject.toml`. The client is only imported lazily from the factory when `VECTOR_DB=valkey`, so existing deployments are unaffected. The code paths that use the client have been exercised end-to-end against `valkey-bundle:9.1.0-rc2`.
- `valkey-bundle:9.1.0-rc2` (Valkey core 9.1.0 + valkey-search 1.2.0) and a custom image with Valkey core 9.0.1 + valkey-search 1.2.0. Manually verified document ingestion, KNN search, filtered KNN, metadata queries, collection reset, and deletion by id/filter.
- `backend/open_webui/retrieval/vector/dbs/`.
- `VECTOR_DB=valkey` and `VALKEY_URL`. Defaults mirror the other vector backends.
- `dev`.
- `feat:`

Changelog Entry
Description
Adds Valkey as a vector database backend for RAG, selectable via `VECTOR_DB=valkey`. The implementation uses the valkey-search module's native `FT.CREATE` / `FT.SEARCH` primitives for HNSW/FLAT vector indexing, KNN queries, and tag-filtered metadata queries, with no client-side vector math. Requires valkey-search 1.2.0+ running on Valkey core 9.0.1+ (either the `valkey-bundle:9.1.0-rc2`+ image, or a stable core with `libsearch.so` loaded via `--loadmodule`).

Motivation: Valkey is the BSD-3-licensed, community-governed fork of Redis after the Redis 7.4 license change. Valkey's search module is now feature-complete enough to back Open WebUI's RAG layer.
Added
- `valkey` in `backend/open_webui/retrieval/vector/dbs/valkey.py` implementing the `VectorDBBase` interface (`insert`, `upsert`, `search`, `query`, `get`, `delete`, `reset`, `has_collection`, `delete_collection`).
- `VectorType.VALKEY` enum entry and factory wiring in `backend/open_webui/retrieval/vector/{type,factory}.py`.
- New settings in `backend/open_webui/config.py` and `.env.example`:
  - `VALKEY_URL`
  - `VALKEY_COLLECTION_PREFIX` (default `open_webui`)
  - `VALKEY_INDEX_TYPE` (`HNSW`|`FLAT`, default `HNSW`)
  - `VALKEY_DISTANCE_METRIC` (`COSINE`|`L2`|`IP`, default `COSINE`)
  - `VALKEY_HNSW_M` (default `16`)
  - `VALKEY_HNSW_EF_CONSTRUCTION` (default `200`)
  - `VALKEY_HNSW_EF_RUNTIME` (default `10`)
- Startup check runs `MODULE LIST` and fails fast if `search` is missing or older than 1.2.0.
- Dependency `valkey==6.1.1` (added to `backend/requirements.txt` and `pyproject.toml`).

Changed
Deprecated
Removed
Fixed
Security
valkey://host:port/db). No credentials are logged.Breaking Changes
Existing `VECTOR_DB` values and behavior are unchanged. Valkey is purely opt-in.
Module version requirement. valkey-search 1.2.0 is the binding dependency. It's what adds TEXT fields, TAG expressions, and filter-only `FT.SEARCH`. Users have two supported deployment paths:
- `valkey/valkey-bundle:9.1.0-rc2` or later (single image).
- A stable core with `--loadmodule libsearch.so`.

The runtime check validates either option at startup.
Known workarounds (both upstream-tracked, low runtime impact):
- `FT.SEARCH idx "*"` wildcard is not yet in a tagged valkey-search release — `get()` falls back to `SCAN` + `HGETALL`. Tracked in valkey-search#957, fix merged in PR #960, expected in 1.2.1/1.3.0. `get()` is not on the hot search path.
- `__vector_score` field isn't returned when a `RETURN` clause is used, and can't be used in `SORTBY`. The `search()` method omits `RETURN` and sorts client-side. Tracked as an enhancement in valkey-search#989. Extra payload is ~1.5KB per result; client-side sort of ≤10 items is microseconds.

Both workarounds are one-line changes once upstream ships the fixes.
Testing artifacts (not committed, local to my working tree under `valkey-testing/`):
- `valkey-bundle:9.1.0-rc2`.

Screenshots or Videos
VECTOR_DB=valkey, plus the startup log line confirming the detected valkey-search version.Contributor License Agreement
Note
Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.