feat: add support for Valkey vector database [Internal Feedback] by rileydes-improving · Pull Request #1 · rileydes-improving/open-webui

rileydes-improving · 2026-05-13T16:01:56Z

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.

Before submitting, make sure you've checked the following:

Changelog Entry

Description

Adds Valkey as a vector database backend for RAG, selectable via VECTOR_DB=valkey. The implementation uses the valkey-search module's native FT.CREATE / FT.SEARCH primitives for HNSW/FLAT vector indexing, KNN queries, and tag-filtered metadata queries, no client-side vector math. Requires valkey-search 1.2.0+ running on Valkey core 9.0.1+ (either the valkey-bundle:9.1.0-rc2+ image, or a stable core with libsearch.so loaded via --loadmodule).

Motivation: Valkey is the BSD-3-licensed, community-governed fork of Redis after the Redis 7.4 license change. Valkey's search module is now feature-complete enough to back Open WebUI's RAG layer.

Added

New vector backend valkey in backend/open_webui/retrieval/vector/dbs/valkey.py implementing the VectorDBBase interface (insert, upsert, search, query, get, delete, reset, has_collection, delete_collection).
VectorType.VALKEY enum entry and factory wiring in backend/open_webui/retrieval/vector/{type,factory}.py.
New environment variables in backend/open_webui/config.py and .env.example:
- VALKEY_URL
- VALKEY_COLLECTION_PREFIX (default open_webui)
- VALKEY_INDEX_TYPE (HNSW | FLAT, default HNSW)
- VALKEY_DISTANCE_METRIC (COSINE | L2 | IP, default COSINE)
- VALKEY_HNSW_M (default 16)
- VALKEY_HNSW_EF_CONSTRUCTION (default 200)
- VALKEY_HNSW_EF_RUNTIME (default 10)
Startup validation that inspects MODULE LIST and fails fast if search is missing or older than 1.2.0.
New runtime dependency: valkey==6.1.1 (added to backend/requirements.txt and pyproject.toml).

Changed

None.

Deprecated

None.

Removed

None.

Fixed

None.

Security

No new attack surface. Connection URL follows the same pattern as existing Redis configuration (valkey://host:port/db). No credentials are logged.

Breaking Changes

None. Existing VECTOR_DB values and behavior are unchanged. Valkey is purely opt-in.

Additional Information

Module version requirement. valkey-search 1.2.0 is the binding dependency. It's what adds TEXT fields, TAG expressions, and filter-only FT.SEARCH. Users have two supported deployment paths:

Option A: valkey/valkey-bundle:9.1.0-rc2 or later (single image).
Option B: any stable Valkey core 9.0.1+ with valkey-search 1.2.0 loaded via --loadmodule libsearch.so.

The runtime check validates either option at startup.

Known workarounds (both upstream-tracked, low runtime impact):

FT.SEARCH idx "*" wildcard is not yet in a tagged valkey-search release — get() falls back to SCAN + HGETALL. Tracked in valkey-search#957, fix merged in PR #960, expected in 1.2.1/1.3.0. get() is not on the hot search path.
The computed __vector_score field isn't returned when a RETURN clause is used, and can't be used in SORTBY. The search() method omits RETURN and sorts client-side. Tracked as an enhancement in valkey-search#989. Extra payload is ~1.5KB per result; client-side sort of ≤10 items is microseconds.

Both workarounds are one-line changes once upstream ships the fixes.

Testing artifacts (not committed, local to my working tree under valkey-testing/):

66 integration tests covering schema creation, HNSW/FLAT, all three distance metrics, insert/upsert/delete, KNN, tag-filtered KNN, filter-only queries, reset, and error paths.
43 end-to-end tests exercising the real Open WebUI RAG flow against a containerized valkey-bundle:9.1.0-rc2.

Screenshots or Videos

To be added before review: screenshots of a document uploaded and retrieved via RAG with VECTOR_DB=valkey, plus the startup log line confirming the detected valkey-search version.

Contributor License Agreement

By submitting this pull request, I confirm that I have read and fully agree to the Contributor License Agreement (CLA), and I am providing my contributions under its terms.

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.

Jonathan-Improving

QA Review — Valkey Vector Database Backend

Verdict: REQUEST_CHANGES — 2 must-fix, 5 suggestions

🔴 Must Fix

Use valkey-glide instead of valkey package — GLIDE is the official recommended client with full FT.* module support, type-safe APIs, and better connection management. No blocking constraint exists; every operation in this file has a GLIDE equivalent. See inline comment on line 12.
IP score normalization formula is incorrect (line 631) — treats __vector_score as raw inner product but valkey-search returns it as a distance. The re-sort workaround at line 616 confirms the formula is wrong.

🟡 Suggestions

Missing DIALECT 2 in query() and delete() FT.SEARCH calls (lines 414-427, 487-495)
N+1 query pattern in get() — individual HGETALL per key instead of pipeline (lines 443-445)
_escape_tag_value missing ? wildcard (line 46)
No config validation for VALKEY_DISTANCE_METRIC at init (line 86)
query() hardcoded 10000 limit may silently truncate (line 413)

✅ Positives

Excellent startup version validation with clear error messages
Dimension mismatch detection prevents silent data corruption
Correct FT.CREATE schema construction
Correct KNN query syntax with DIALECT 2 in search()
Good edge case handling ($in: [], empty items)
Clean integration with existing factory/type patterns
Thorough PR description with known workarounds documented

Jonathan-Improving · 2026-05-13T17:22:00Z

+        try:
+            result = self.client.execute_command(
+                'FT.SEARCH',
+                self._index_name(collection_name),
+                query_str,
+                'RETURN',
+                3,
+                'id',
+                'text',
+                'metadata_json',
+                'LIMIT',
+                0,
+                effective_limit,
+            )


🟡 Suggestion — Add DIALECT 2 to this FT.SEARCH call

search() correctly uses 'DIALECT', 2 but query() omits it. DIALECT 2 is required for correct parsing of filter expressions with negation (-@field:{...} from $ne). Without it, complex filters may be parsed incorrectly.

Add 'DIALECT', 2, after the effective_limit, line (same pattern as search()).

Jonathan-Improving · 2026-05-13T17:22:00Z

+                'Set it to your Valkey server URL (e.g., valkey://localhost:6379).'
+            )
+
+        self.collection_prefix = VALKEY_COLLECTION_PREFIX


🟡 Suggestion — Validate VALKEY_DISTANCE_METRIC at init

Invalid VALKEY_INDEX_TYPE silently falls back to FLAT (line 237), but invalid VALKEY_DISTANCE_METRIC (e.g., "HAMMING") passes through to FT.CREATE and fails with a cryptic server error. Validate against {'COSINE', 'L2', 'IP'} here with a clear message.

Jonathan-Improving · 2026-05-13T17:22:00Z

+
+def _escape_tag_value(value: str) -> str:
+    """Escape special characters for RediSearch/Valkey-Search TAG field queries."""
+    return re.sub(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~\\/| \t\n\r])', r'\\\1', str(value))


🟢 Suggestion — Missing ? wildcard in escape regex

The character class escapes * but not ?. In valkey-search TAG queries, ? is a single-character wildcard. A filter value containing ? could match unintended documents. Add ? to the character class.

Jonathan-Improving · 2026-05-13T17:22:00Z

+                result = self.client.execute_command(
+                    'FT.SEARCH',
+                    index_name,
+                    filter_expr,
+                    'NOCONTENT',
+                    'LIMIT',
+                    0,
+                    page_size,
+                )


🟡 Suggestion — Add DIALECT 2 here as well

Same issue as query() — this FT.SEARCH call for delete-by-filter omits DIALECT 2. Negation filters from $ne may not parse correctly without it.

Jonathan-Improving · 2026-05-13T17:22:00Z

+            # L2 distance: 0 (identical) to infinity.
+            return 1.0 / (1.0 + score)
+        # IP (Inner Product): assumes unit-normalized vectors.
+        return max(0.0, min(1.0, (score + 1.0) / 2.0))


🔴 Must Fix — IP score normalization formula is incorrect

valkey-search returns __vector_score as a distance (lower = more similar) for all metrics, including IP where it returns 1 - inner_product. This formula treats the raw score as an inner product in [-1, 1], but it's actually a distance in [0, 2] for unit vectors (same range as COSINE).

The re-sort at line 616-618 is a symptom — if normalization were correct and monotonic, the server's ascending-distance order would already be correct.

Suggested fix: return 1.0 - score (same as COSINE without the /2.0 divisor, since IP distance range for unit vectors is [0, 2] → similarity [−1, 1], but clamped to [0,1] it becomes max(0.0, 1.0 - score)).

Jonathan-Improving · 2026-05-13T17:22:00Z

+            cursor, keys = self.client.scan(cursor=cursor, match=f'{prefix}*', count=500)
+            for key in keys:
+                fields = self.client.hgetall(key)


🟡 Suggestion — N+1 query pattern: use pipeline for HGETALL calls

Each key triggers a separate round-trip. For collections with hundreds of documents this is a significant perf issue. Batch with a pipeline:

pipe = self.client.pipeline() for key in keys: pipe.hgetall(key) results = pipe.execute() for fields in results: ...

(Note: if migrating to GLIDE, this becomes a ClusterBatch or similar pattern.)

Jonathan-Improving · 2026-05-13T17:22:01Z

+import struct
+from urllib.parse import urlparse
+
+import valkey


🟡 Advisory — Consider valkey-glide for typed FT.* APIs

The valkey package (redis-py fork) is the legacy client. The recommended path forward is valkey-glide — the official high-performance GLIDE client. GLIDE has full FT.* module support via module-level functions:

Current (valkey) GLIDE equivalent (valkey-glide)

execute_command('FT.CREATE', ...) ft.create(client, index_name, schema, options)

execute_command('FT.SEARCH', ...) ft.search(client, index_name, query, options)

execute_command('FT.DROPINDEX', ...) ft.dropindex(client, index_name)

execute_command('FT._LIST') ft.list(client)

execute_command('FT.INFO', ...) ft.info(client, index_name)

execute_command('MODULE', 'LIST') client.custom_command(['MODULE', 'LIST'])

Benefits of switching:

Type-safe schema construction — VectorField, VectorFieldAttributesHnsw, FtCreateOptions etc. vs raw string arrays

Cleaner response format — ft.search() returns [count, {key: {field: value}}] (a dict), eliminating the need for _decode_kv_pairs / _parse_ft_result manual parsing

Better connection management — multiplexed, thread-safe, auto-reconnect

Typed search options — FtSearchOptions, FtSearchLimit vs positional string args

Package: valkey-glide (async) or valkey-glide-sync (sync). Import pattern:

from glide import ft, GlideClient, FtCreateOptions, VectorField, ... # or sync: from glide_sync import ft, GlideClient, ...

No blocking constraint exists — every operation in this file has a GLIDE equivalent.

Clarification on tone: This isn't necessarily "wrong" — there may be valid reasons to use valkey over valkey-glide here (e.g., matching the existing redis==7.4.0 dependency already in pyproject.toml, minimizing unfamiliar APIs for upstream reviewers, or GLIDE Python package maturity concerns).

The key ask is: please document the rationale in the PR description before submitting upstream. If it was intentional, a sentence explaining why is sufficient. If it was an oversight, it's worth evaluating — GLIDE would simplify the response parsing significantly and eliminate the manual _decode_kv_pairs / _parse_ft_result machinery.

MatthiasHowellYopp

Python Code Review — Additional Findings

Verdict: Request changes — 5 items to address before merge.

1. IP score normalization is incorrect (critical)

Line 631 — _normalize_score for IP metric:

# IP (Inner Product): assumes unit-normalized vectors.
return max(0.0, min(1.0, (score + 1.0) / 2.0))

valkey-search returns 1 - inner_product as the distance (not raw IP), so for unit vectors the score is in [0, 2]. The formula (score + 1.0) / 2.0 maps [-1, 1] → [0, 1], which is the wrong input domain. This will produce incorrect similarity rankings for the IP metric.

Suggested fix: 1.0 - score for unit-normalized vectors (maps distance [0, 2] → similarity [-1, 1], then clamp), or document and validate the expected score semantics from valkey-search for your version.

2. Missing `DIALECT 2` in `query()` and `delete()` FT.SEARCH calls (warning)

Lines 414-427, 487-495 — search() correctly uses DIALECT 2, but query() and delete() omit it. Without DIALECT 2, complex filter expressions (especially those with TAG unions like @field:{val1|val2}) may not parse correctly.

Add 'DIALECT', 2 to both FT.SEARCH calls.

3. N+1 in `get()` — pipeline the HGETALL calls (warning)

Line 443 — Each key from SCAN gets an individual HGETALL round trip:

for key in keys:
    fields = self.client.hgetall(key)

For 500 keys per SCAN batch, that is 500 network round trips. Use a pipeline:

pipe = self.client.pipeline()
for key in keys:
    pipe.hgetall(key)
results = pipe.execute()
for fields in results:
    ...

4. Broad exception swallowing hides connection failures (warning)

Lines 350, 420, 490 — search(), query(), and delete() all catch bare Exception and return None / silently continue:

except Exception as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None

The caller interprets None as "no results" rather than "search failed." Catch valkey.exceptions.ValkeyError (or valkey.exceptions.ConnectionError + valkey.exceptions.ResponseError) specifically. Let unexpected exceptions propagate so transient failures are visible to the application layer.

5. `_escape_tag_value` — compile regex at module level, add missing `?` wildcard (warning)

Line 46 — The regex is compiled on every call and is missing ? (a wildcard in valkey-search TAG queries):

def _escape_tag_value(value: str) -> str:
    return re.sub(r'([,.<>{}\[\]"\\'\\\\:;!@#$%^&*()\-+=~\\\\/| \t\n\r])', r'\\\1', str(value))

Suggested fix:

_TAG_SPECIAL_RE = re.compile(r'([,.<>{}\[\]"\\'\\\\:;!@#$%^&*()\-+=~?/| \t\n\r])')

def _escape_tag_value(value: str) -> str:
    """Escape special characters for valkey-search TAG field queries."""
    return _TAG_SPECIAL_RE.sub(r'\\\1', str(value))

Overall the implementation is well-structured, follows existing backend patterns, and has excellent startup validation. These 5 items are the blockers; the rest of the code is solid.

Jonathan-Improving · 2026-05-13T17:40:26Z

Notes on the PR description (for upstream submission prep):

valkey-testing/VALKEY_VECTOR_STORE_BRIEF.md not in PR — The Documentation checkbox references this file "for reviewers" but it's not committed. For the upstream PR, either include it or remove the reference.
Screenshots placeholder — Still says "To be added before review." Should be populated before the upstream submission.
RC image as primary recommendation — valkey-bundle:9.1.0-rc2 is a release candidate. Upstream reviewers may push back on recommending an RC for production. Clarify timeline for stable release or lead with Option B (stable core + --loadmodule).
Client library choice (valkey vs valkey-glide) — The PR uses valkey==6.1.1 (redis-py fork) rather than valkey-glide, which is the official recommended GLIDE client with native FT.* module support, type-safe APIs, and better connection management. If this was an intentional choice (e.g., minimizing new dependencies, matching existing redis usage in the codebase, avoiding GLIDE's current maturity status), that rationale should be documented in the PR description — upstream reviewers will likely ask. If it was an oversight, it's worth evaluating before the upstream submission.

Jonathan-Improving

Follow-up review — 3 additional findings from a second pass with anti-pattern and security-focused reviewers.

One new critical (error swallowing), two new medium suggestions (unbounded SCAN, filter key validation). These complement the original 7 findings.

Jonathan-Improving · 2026-05-13T19:16:09Z

+        ids, documents, metadatas = [], [], []
+        cursor = 0
+        while True:


🟡 Suggestion — Add a hard upper limit when limit is None

The VectorDBBase.get() signature has no limit parameter, so callers using the base interface will always hit this with limit=None. In that case, this loop scans the entire keyspace with no cap — for a collection with hundreds of thousands of documents, this could OOM the application server or block the Valkey instance with sustained SCAN + HGETALL traffic.

query() caps at 10000 (line 413). Consider the same here:

effective_limit = limit if limit and limit > 0 else 10_000

Or at minimum, log a warning when returning large result sets so operators can tune.

Jonathan-Improving · 2026-05-13T19:16:09Z

+    Raises ValueError on unsupported operators rather than silently matching nothing.
+    """
+    parts = []
+    for key, value in filter.items():


🟡 Suggestion — Guard against => in constructed filter expressions

The => token in FT.SEARCH syntax delimits a filter from a KNN vector clause. This regex escapes = and > individually (both are in the character class), but if a filter value contains the literal sequence => and the escaping doesn't break it apart, it could inject a KNN clause into a filter-only query like query() or delete().

Looking more carefully: = IS escaped (it's in +=~) and > IS escaped (it's in <>). So => in a value becomes \=\> which is safe. However, the filter keys (line 56: f'@{key}:...') are NOT escaped or validated at all. If a future caller passes user-controlled keys, => injection becomes possible.

Defensive fix — validate keys in _build_filter_expression:

_ALLOWED_FILTER_KEYS = frozenset({'id', 'hash', 'file_id', 'source', 'knowledge_base_id'}) for key in filter.keys(): if key not in _ALLOWED_FILTER_KEYS: raise ValueError(f"Unsupported filter key: {key!r}")

This is defense-in-depth — current callers use hardcoded keys, but the function itself has no guard.

Jonathan-Improving · 2026-05-13T19:16:09Z

+        except Exception as e:
+            log.error(f'Valkey search error on collection {collection_name}: {e}')
+            return None


🔴 Must Fix — except Exception swallows all errors silently

This catches everything — KeyboardInterrupt, SystemExit, MemoryError, and programming bugs like TypeError or AttributeError. The caller receives None with no way to distinguish "no results found" from "server is down" or "there's a bug in the code."

Same pattern appears in query() (line 428), delete() (line 471), and delete()-by-filter (line 496).

Fix: Catch only Valkey protocol/connection errors:

except valkey.exceptions.ValkeyError as e: log.error(f'Valkey search error on collection {collection_name}: {e}') return None

This still handles connection failures, timeouts, and server errors gracefully while letting programming bugs propagate so they're caught during development.

MatthiasHowellYopp · 2026-05-15T15:22:24Z

LGTM

Signed-off-by: Riley Des <riley.desserre@improving.com>

Jonathan-Improving suggested changes May 13, 2026

View reviewed changes

MatthiasHowellYopp suggested changes May 13, 2026

View reviewed changes

Jonathan-Improving reviewed May 13, 2026

View reviewed changes

feat: add support for Valkey vector database

20252b5

Signed-off-by: Riley Des <riley.desserre@improving.com>

rileydes-improving force-pushed the feat/add-valkey-vector-database branch from c9e6a43 to 20252b5 Compare May 15, 2026 18:40

Current (`valkey`)	GLIDE equivalent (`valkey-glide`)
`execute_command('FT.CREATE', ...)`	`ft.create(client, index_name, schema, options)`
`execute_command('FT.SEARCH', ...)`	`ft.search(client, index_name, query, options)`
`execute_command('FT.DROPINDEX', ...)`	`ft.dropindex(client, index_name)`
`execute_command('FT._LIST')`	`ft.list(client)`
`execute_command('FT.INFO', ...)`	`ft.info(client, index_name)`
`execute_command('MODULE', 'LIST')`	`client.custom_command(['MODULE', 'LIST'])`

Conversation

rileydes-improving commented May 13, 2026

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

Changelog Entry

Description

Added

Changed

Deprecated

Removed

Fixed

Security

Breaking Changes

Additional Information

Screenshots or Videos

Contributor License Agreement

Uh oh!

Jonathan-Improving left a comment

Choose a reason for hiding this comment

QA Review — Valkey Vector Database Backend

🔴 Must Fix

🟡 Suggestions

✅ Positives

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Jonathan-Improving May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Jonathan-Improving May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MatthiasHowellYopp left a comment

Choose a reason for hiding this comment

Python Code Review — Additional Findings

1. IP score normalization is incorrect (critical)

2. Missing DIALECT 2 in query() and delete() FT.SEARCH calls (warning)

3. N+1 in get() — pipeline the HGETALL calls (warning)

4. Broad exception swallowing hides connection failures (warning)

5. _escape_tag_value — compile regex at module level, add missing ? wildcard (warning)

Uh oh!

Jonathan-Improving commented May 13, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Jonathan-Improving left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

MatthiasHowellYopp commented May 15, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Jonathan-Improving May 13, 2026 •

edited

Loading

Jonathan-Improving May 13, 2026 •

edited

Loading

2. Missing `DIALECT 2` in `query()` and `delete()` FT.SEARCH calls (warning)

3. N+1 in `get()` — pipeline the HGETALL calls (warning)

5. `_escape_tag_value` — compile regex at module level, add missing `?` wildcard (warning)

Jonathan-Improving commented May 13, 2026 •

edited

Loading