feat: add support for Valkey vector database [Internal Feedback]#1

Open
rileydes-improving wants to merge 1 commit into dev from feat/add-valkey-vector-database

Conversation

@rileydes-improving
Owner

Pull Request Checklist

Note to first-time contributors: Please open a discussion post in Discussions to discuss your idea/fix with the community before creating a pull request, and describe your changes before submitting a pull request.

This is to ensure large feature PRs are discussed with the community first, before starting work on it. If the community does not want this feature or it is not relevant for Open WebUI as a project, it can be identified in the discussion before working on the feature and submitting the PR.

Before submitting, make sure you've checked the following:

  • Linked Issue/Discussion: Closes # 24623, relates to discussion # 24624.
  • Target branch: This PR targets dev.
  • Description: See below.
  • Changelog: Entry added below.
  • Documentation: A companion docs PR will be opened in open-webui/docs covering VECTOR_DB=valkey, the new VALKEY_* environment variables, and module/version requirements. Backend brief included at valkey-testing/VALKEY_VECTOR_STORE_BRIEF.md for reviewers.
  • Dependencies: Adds valkey==6.1.1 (wire-compatible with redis-py). Added to both backend/requirements.txt and pyproject.toml. The client is only imported lazily from the factory when VECTOR_DB=valkey, so existing deployments are unaffected. The code paths that use the client have been exercised end-to-end against valkey-bundle:9.1.0-rc2.
  • Testing: 66/66 integration tests and 43/43 end-to-end application tests pass against both valkey-bundle:9.1.0-rc2 (Valkey core 9.1.0 + valkey-search 1.2.0) and a custom image (Valkey core 9.0.1 + valkey-search 1.2.0). Manually verified document ingestion, KNN search, filtered KNN, metadata queries, collection reset, and deletion by id/filter.
  • Agentic AI Code: AI Assistance was used while writing this feature. Every line has been human-reviewed by myself and others and I manually tested against a running Valkey instance.
  • Code review: Self-reviewed and internally reviewed by my team. Matches the pattern used by other backends in backend/open_webui/retrieval/vector/dbs/.
  • Design & Architecture: No new user-facing settings - operators opt in by setting VECTOR_DB=valkey and VALKEY_URL. Defaults mirror the other vector backends.
  • Git Hygiene: Single logical change on top of dev.
  • Title Prefix: feat:

Changelog Entry

Description

Adds Valkey as a vector database backend for RAG, selectable via VECTOR_DB=valkey. The implementation uses the valkey-search module's native FT.CREATE / FT.SEARCH primitives for HNSW/FLAT vector indexing, KNN queries, and tag-filtered metadata queries, with no client-side vector math. Requires valkey-search 1.2.0+ running on Valkey core 9.0.1+ (either the valkey-bundle:9.1.0-rc2+ image, or a stable core with libsearch.so loaded via --loadmodule).
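
As a rough illustration of the KNN path described above, the sketch below assembles the argument list for a vector FT.SEARCH call. The field name `vector`, the parameter name `vec`, both helper names, and the little-endian FLOAT32 packing are assumptions made for this example; they are not taken from the PR's code.

```python
import struct
from typing import List, Sequence, Union

Arg = Union[str, int, bytes]


def pack_float32(vector: Sequence[float]) -> bytes:
    """Serialize a vector as little-endian FLOAT32 bytes (assumed byte order)."""
    return struct.pack(f'<{len(vector)}f', *vector)


def build_knn_args(
    index_name: str,
    vector: Sequence[float],
    k: int,
    filter_expr: str = '*',
    vector_field: str = 'vector',  # hypothetical field name
) -> List[Arg]:
    """Build FT.SEARCH arguments for a KNN query, optionally pre-filtered."""
    # A non-wildcard filter is parenthesized so it binds as a single unit.
    prefilter = filter_expr if filter_expr == '*' else f'({filter_expr})'
    query = f'{prefilter}=>[KNN {k} @{vector_field} $vec]'
    return [
        'FT.SEARCH', index_name, query,
        'PARAMS', 2, 'vec', pack_float32(vector),
        'LIMIT', 0, k,
        'DIALECT', 2,
    ]
```

The returned list could be splatted into `client.execute_command(*args)`; keeping the builder pure makes the query shape easy to unit-test without a server.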

Motivation: Valkey is the BSD-3-licensed, community-governed fork of Redis created after the Redis 7.4 license change. Valkey's search module is now feature-complete enough to back Open WebUI's RAG layer.

Added

  • New vector backend valkey in backend/open_webui/retrieval/vector/dbs/valkey.py implementing the VectorDBBase interface (insert, upsert, search, query, get, delete, reset, has_collection, delete_collection).
  • VectorType.VALKEY enum entry and factory wiring in backend/open_webui/retrieval/vector/{type,factory}.py.
  • New environment variables in backend/open_webui/config.py and .env.example:
    • VALKEY_URL
    • VALKEY_COLLECTION_PREFIX (default open_webui)
    • VALKEY_INDEX_TYPE (HNSW | FLAT, default HNSW)
    • VALKEY_DISTANCE_METRIC (COSINE | L2 | IP, default COSINE)
    • VALKEY_HNSW_M (default 16)
    • VALKEY_HNSW_EF_CONSTRUCTION (default 200)
    • VALKEY_HNSW_EF_RUNTIME (default 10)
  • Startup validation that inspects MODULE LIST and fails fast if search is missing or older than 1.2.0.
  • New runtime dependency: valkey==6.1.1 (added to backend/requirements.txt and pyproject.toml).
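
To make the defaults listed above concrete, here is a minimal sketch of reading the VALKEY_* variables. The `ValkeySettings` dataclass and `load_valkey_settings` function are hypothetical names for illustration; the PR's actual wiring in config.py may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ValkeySettings:
    """Hypothetical container for the VALKEY_* environment variables."""
    url: Optional[str]
    collection_prefix: str
    index_type: str
    distance_metric: str
    hnsw_m: int
    hnsw_ef_construction: int
    hnsw_ef_runtime: int


def load_valkey_settings(env: Mapping[str, str] = os.environ) -> ValkeySettings:
    """Read settings from `env`, applying the defaults from the changelog."""
    return ValkeySettings(
        url=env.get('VALKEY_URL'),
        collection_prefix=env.get('VALKEY_COLLECTION_PREFIX', 'open_webui'),
        index_type=env.get('VALKEY_INDEX_TYPE', 'HNSW').upper(),
        distance_metric=env.get('VALKEY_DISTANCE_METRIC', 'COSINE').upper(),
        hnsw_m=int(env.get('VALKEY_HNSW_M', '16')),
        hnsw_ef_construction=int(env.get('VALKEY_HNSW_EF_CONSTRUCTION', '200')),
        hnsw_ef_runtime=int(env.get('VALKEY_HNSW_EF_RUNTIME', '10')),
    )
```

Passing the environment as a mapping keeps the loader testable without touching the real process environment.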

Changed

  • None.

Deprecated

  • None.

Removed

  • None.

Fixed

  • None.

Security

  • No new attack surface. Connection URL follows the same pattern as existing Redis configuration (valkey://host:port/db). No credentials are logged.

Breaking Changes

  • None. Existing VECTOR_DB values and behavior are unchanged. Valkey is purely opt-in.

Additional Information

Module version requirement. valkey-search 1.2.0 is the binding dependency. It's what adds TEXT fields, TAG expressions, and filter-only FT.SEARCH. Users have two supported deployment paths:

  • Option A: valkey/valkey-bundle:9.1.0-rc2 or later (single image).
  • Option B: any stable Valkey core 9.0.1+ with valkey-search 1.2.0 loaded via --loadmodule libsearch.so.

The runtime check validates either option at startup.
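
A startup check of this kind might look roughly like the sketch below. It takes the already-decoded MODULE LIST entries as plain dicts. The integer version packing (major * 10000 + minor * 100 + patch) is an assumption and should be verified against what valkey-search actually reports; the decoder is injectable for that reason. All function names are hypothetical.

```python
from typing import Any, Callable, Iterable, Mapping, Tuple

Version = Tuple[int, int, int]
MIN_SEARCH_VERSION: Version = (1, 2, 0)


def decode_packed_version(raw: int) -> Version:
    """ASSUMPTION: version is packed as major * 10000 + minor * 100 + patch."""
    return raw // 10000, (raw // 100) % 100, raw % 100


def require_search_module(
    modules: Iterable[Mapping[str, Any]],
    minimum: Version = MIN_SEARCH_VERSION,
    decode: Callable[[int], Version] = decode_packed_version,
) -> Version:
    """Return the detected search module version or raise with a clear message."""
    for module in modules:
        if module.get('name') == 'search':
            version = decode(int(module['ver']))
            if version < minimum:
                raise RuntimeError(
                    f'valkey-search {version} is older than the required {minimum}'
                )
            return version
    raise RuntimeError('valkey-search module is not loaded (MODULE LIST has no "search" entry)')
```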

Known workarounds (both upstream-tracked, low runtime impact):

  1. FT.SEARCH idx "*" wildcard is not yet in a tagged valkey-search release — get() falls back to SCAN + HGETALL. Tracked in valkey-search#957, fix merged in PR #960, expected in 1.2.1/1.3.0. get() is not on the hot search path.
  2. The computed __vector_score field isn't returned when a RETURN clause is used, and can't be used in SORTBY. The search() method omits RETURN and sorts client-side. Tracked as an enhancement in valkey-search#989. Extra payload is ~1.5KB per result; client-side sort of ≤10 items is microseconds.

Both workarounds are one-line changes once upstream ships the fixes.
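
For reviewers unfamiliar with the second workaround, the sketch below shows one way to parse a flat FT.SEARCH reply and order it by distance on the client. It assumes the RESP2 reply shape `[total, key, [field, value, ...], ...]`; the helper names are hypothetical and this is not the PR's `_parse_ft_result`.

```python
from typing import Any, Dict, List, Tuple

Doc = Tuple[str, Dict[str, Any]]


def _text(value: Any) -> str:
    """Decode bytes to str; leave other values as their string form."""
    return value.decode('utf-8', errors='replace') if isinstance(value, bytes) else str(value)


def parse_search_reply(reply: List[Any]) -> List[Doc]:
    """Turn [total, key1, [f, v, ...], key2, [...]] into (key, fields) pairs."""
    docs: List[Doc] = []
    for i in range(1, len(reply) - 1, 2):
        key, flat = reply[i], reply[i + 1]
        # Field names are decoded; values stay raw because vectors are binary.
        fields = {_text(flat[j]): flat[j + 1] for j in range(0, len(flat) - 1, 2)}
        docs.append((_text(key), fields))
    return docs


def sort_by_distance(docs: List[Doc], score_field: str = '__vector_score') -> List[Doc]:
    """Order documents by ascending distance (lower means more similar)."""
    return sorted(docs, key=lambda doc: float(_text(doc[1][score_field])))
```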

Testing artifacts (not committed, local to my working tree under valkey-testing/):

  • 66 integration tests covering schema creation, HNSW/FLAT, all three distance metrics, insert/upsert/delete, KNN, tag-filtered KNN, filter-only queries, reset, and error paths.
  • 43 end-to-end tests exercising the real Open WebUI RAG flow against a containerized valkey-bundle:9.1.0-rc2.

Screenshots or Videos

  • To be added before review: screenshots of a document uploaded and retrieved via RAG with VECTOR_DB=valkey, plus the startup log line confirming the detected valkey-search version.

Contributor License Agreement

Note

Deleting the CLA section will lead to immediate closure of your PR and it will not be merged in.

@Jonathan-Improving Jonathan-Improving left a comment

QA Review — Valkey Vector Database Backend

Verdict: REQUEST_CHANGES — 2 must-fix, 5 suggestions

🔴 Must Fix

  1. Use valkey-glide instead of valkey package — GLIDE is the official recommended client with full FT.* module support, type-safe APIs, and better connection management. No blocking constraint exists; every operation in this file has a GLIDE equivalent. See inline comment on line 12.

  2. IP score normalization formula is incorrect (line 631) — treats __vector_score as raw inner product but valkey-search returns it as a distance. The re-sort workaround at line 616 confirms the formula is wrong.

🟡 Suggestions

  1. Missing DIALECT 2 in query() and delete() FT.SEARCH calls (lines 414-427, 487-495)
  2. N+1 query pattern in get() — individual HGETALL per key instead of pipeline (lines 443-445)
  3. _escape_tag_value missing ? wildcard (line 46)
  4. No config validation for VALKEY_DISTANCE_METRIC at init (line 86)
  5. query() hardcoded 10000 limit may silently truncate (line 413)

✅ Positives

  • Excellent startup version validation with clear error messages
  • Dimension mismatch detection prevents silent data corruption
  • Correct FT.CREATE schema construction
  • Correct KNN query syntax with DIALECT 2 in search()
  • Good edge case handling ($in: [], empty items)
  • Clean integration with existing factory/type patterns
  • Thorough PR description with known workarounds documented

Comment on lines +414 to +427
try:
    result = self.client.execute_command(
        'FT.SEARCH',
        self._index_name(collection_name),
        query_str,
        'RETURN',
        3,
        'id',
        'text',
        'metadata_json',
        'LIMIT',
        0,
        effective_limit,
    )

🟡 Suggestion — Add DIALECT 2 to this FT.SEARCH call

search() correctly uses 'DIALECT', 2 but query() omits it. DIALECT 2 is required for correct parsing of filter expressions with negation (-@field:{...} from $ne). Without it, complex filters may be parsed incorrectly.

Add 'DIALECT', 2, after the effective_limit, line (same pattern as search()).
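
One way to keep the dialect consistent across search(), query(), and delete() is to route every filter-only call through a single argument builder. The sketch below is illustrative only; `build_filter_search_args` is a hypothetical helper, not part of the PR.

```python
from typing import List, Sequence, Union

Arg = Union[str, int]


def build_filter_search_args(
    index_name: str,
    query_str: str,
    return_fields: Sequence[str],
    limit: int,
    offset: int = 0,
) -> List[Arg]:
    """Build FT.SEARCH arguments that always end with DIALECT 2."""
    args: List[Arg] = ['FT.SEARCH', index_name, query_str]
    if return_fields:
        args += ['RETURN', len(return_fields), *return_fields]
    else:
        # No fields requested: ask only for matching keys.
        args.append('NOCONTENT')
    args += ['LIMIT', offset, limit, 'DIALECT', 2]
    return args
```

With this shape, query() would pass its three return fields and delete() would pass an empty sequence to get the NOCONTENT form.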

'Set it to your Valkey server URL (e.g., valkey://localhost:6379).'
)

self.collection_prefix = VALKEY_COLLECTION_PREFIX

🟡 Suggestion — Validate VALKEY_DISTANCE_METRIC at init

Invalid VALKEY_INDEX_TYPE silently falls back to FLAT (line 237), but invalid VALKEY_DISTANCE_METRIC (e.g., "HAMMING") passes through to FT.CREATE and fails with a cryptic server error. Validate against {'COSINE', 'L2', 'IP'} here with a clear message.
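
A minimal sketch of such a check, assuming it runs once during client initialization (the function and constant names are hypothetical):

```python
SUPPORTED_DISTANCE_METRICS = frozenset({'COSINE', 'L2', 'IP'})


def validate_distance_metric(value: str) -> str:
    """Return the normalized metric name or raise a descriptive ValueError."""
    metric = value.strip().upper()
    if metric not in SUPPORTED_DISTANCE_METRICS:
        raise ValueError(
            f'Invalid VALKEY_DISTANCE_METRIC {value!r}; '
            f'expected one of {sorted(SUPPORTED_DISTANCE_METRICS)}'
        )
    return metric
```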


def _escape_tag_value(value: str) -> str:
    """Escape special characters for RediSearch/Valkey-Search TAG field queries."""
    return re.sub(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~\\/| \t\n\r])', r'\\\1', str(value))

@Jonathan-Improving Jonathan-Improving May 13, 2026

🟢 Suggestion — Missing ? wildcard in escape regex

The character class escapes * but not ?. In valkey-search TAG queries, ? is a single-character wildcard. A filter value containing ? could match unintended documents. Add ? to the character class.

Comment on lines +487 to +495
result = self.client.execute_command(
    'FT.SEARCH',
    index_name,
    filter_expr,
    'NOCONTENT',
    'LIMIT',
    0,
    page_size,
)

🟡 Suggestion — Add DIALECT 2 here as well

Same issue as query() — this FT.SEARCH call for delete-by-filter omits DIALECT 2. Negation filters from $ne may not parse correctly without it.

# L2 distance: 0 (identical) to infinity.
return 1.0 / (1.0 + score)
# IP (Inner Product): assumes unit-normalized vectors.
return max(0.0, min(1.0, (score + 1.0) / 2.0))

🔴 Must Fix — IP score normalization formula is incorrect

valkey-search returns __vector_score as a distance (lower = more similar) for all metrics, including IP where it returns 1 - inner_product. This formula treats the raw score as an inner product in [-1, 1], but it's actually a distance in [0, 2] for unit vectors (same range as COSINE).

The re-sort at line 616-618 is a symptom — if normalization were correct and monotonic, the server's ascending-distance order would already be correct.

Suggested fix: return max(0.0, 1.0 - score). For unit vectors the IP distance lies in [0, 2], so 1.0 - score maps it to a similarity in [-1, 1], and clamping at 0 keeps the result in [0, 1]. This is the COSINE formula without the /2.0 divisor.
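
Putting the three metrics side by side, a sketch of a normalizer under the distance semantics described in this comment could look like the following. The COSINE branch (1 - d/2) is inferred from the remark about the /2.0 divisor, and the L2 branch mirrors the quoted code; both are assumptions about the PR's `_normalize_score`, not a copy of it.

```python
def normalize_score(distance: float, metric: str) -> float:
    """Map a valkey-search distance (lower = closer) to a similarity in [0, 1]."""
    if metric == 'COSINE':
        # Cosine distance for unit vectors lies in [0, 2].
        similarity = 1.0 - distance / 2.0
    elif metric == 'L2':
        # L2 distance: 0 (identical) to infinity.
        similarity = 1.0 / (1.0 + distance)
    elif metric == 'IP':
        # Assumes the server returns 1 - inner_product, as stated in this review.
        similarity = 1.0 - distance
    else:
        raise ValueError(f'Unsupported distance metric: {metric!r}')
    return max(0.0, min(1.0, similarity))
```

Each branch is monotonically decreasing in the distance, so the server's ascending-distance order should already match descending similarity.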

Comment on lines +443 to +445
cursor, keys = self.client.scan(cursor=cursor, match=f'{prefix}*', count=500)
for key in keys:
    fields = self.client.hgetall(key)

🟡 Suggestion — N+1 query pattern: use pipeline for HGETALL calls

Each key triggers a separate round-trip. For collections with hundreds of documents this is a significant perf issue. Batch with a pipeline:

pipe = self.client.pipeline()
for key in keys:
    pipe.hgetall(key)
results = pipe.execute()
for fields in results:
    ...

(Note: if migrating to GLIDE, this becomes a ClusterBatch or similar pattern.)

import struct
from urllib.parse import urlparse

import valkey

@Jonathan-Improving Jonathan-Improving May 13, 2026

🟡 Advisory — Consider valkey-glide for typed FT.* APIs

The valkey package (redis-py fork) is the legacy client. The recommended path forward is valkey-glide — the official high-performance GLIDE client. GLIDE has full FT.* module support via module-level functions:

Current (valkey)                        | GLIDE equivalent (valkey-glide)
execute_command('FT.CREATE', ...)       | ft.create(client, index_name, schema, options)
execute_command('FT.SEARCH', ...)       | ft.search(client, index_name, query, options)
execute_command('FT.DROPINDEX', ...)    | ft.dropindex(client, index_name)
execute_command('FT._LIST')             | ft.list(client)
execute_command('FT.INFO', ...)         | ft.info(client, index_name)
execute_command('MODULE', 'LIST')       | client.custom_command(['MODULE', 'LIST'])

Benefits of switching:

  1. Type-safe schema construction: VectorField, VectorFieldAttributesHnsw, FtCreateOptions, etc. instead of raw string arrays
  2. Cleaner response format: ft.search() returns [count, {key: {field: value}}] (a dict), eliminating the need for _decode_kv_pairs / _parse_ft_result manual parsing
  3. Better connection management: multiplexed, thread-safe, auto-reconnect
  4. Typed search options: FtSearchOptions, FtSearchLimit instead of positional string args

Package: valkey-glide (async) or valkey-glide-sync (sync). Import pattern:

from glide import ft, GlideClient, FtCreateOptions, VectorField, ...
# or sync:
from glide_sync import ft, GlideClient, ...

No blocking constraint exists — every operation in this file has a GLIDE equivalent.

Clarification on tone: This isn't necessarily "wrong" — there may be valid reasons to use valkey over valkey-glide here (e.g., matching the existing redis==7.4.0 dependency already in pyproject.toml, minimizing unfamiliar APIs for upstream reviewers, or GLIDE Python package maturity concerns).

The key ask is: please document the rationale in the PR description before submitting upstream. If it was intentional, a sentence explaining why is sufficient. If it was an oversight, it's worth evaluating — GLIDE would simplify the response parsing significantly and eliminate the manual _decode_kv_pairs / _parse_ft_result machinery.

@MatthiasHowellYopp MatthiasHowellYopp left a comment

Python Code Review — Additional Findings

Verdict: Request changes — 5 items to address before merge.


1. IP score normalization is incorrect (critical)

Line 631, _normalize_score for the IP metric:

# IP (Inner Product): assumes unit-normalized vectors.
return max(0.0, min(1.0, (score + 1.0) / 2.0))

valkey-search returns 1 - inner_product as the distance (not raw IP), so for unit vectors the score is in [0, 2]. The formula (score + 1.0) / 2.0 maps [-1, 1] → [0, 1], which is the wrong input domain. This will produce incorrect similarity rankings for the IP metric.

Suggested fix: 1.0 - score for unit-normalized vectors (maps distance [0, 2] → similarity [-1, 1], then clamp), or document and validate the expected score semantics from valkey-search for your version.


2. Missing DIALECT 2 in query() and delete() FT.SEARCH calls (warning)

Lines 414-427, 487-495: search() correctly uses DIALECT 2, but query() and delete() omit it. Without DIALECT 2, complex filter expressions (especially those with TAG unions like @field:{val1|val2}) may not parse correctly.

Add 'DIALECT', 2 to both FT.SEARCH calls.


3. N+1 in get() — pipeline the HGETALL calls (warning)

Line 443 — Each key from SCAN gets an individual HGETALL round trip:

for key in keys:
    fields = self.client.hgetall(key)

For 500 keys per SCAN batch, that is 500 network round trips. Use a pipeline:

pipe = self.client.pipeline()
for key in keys:
    pipe.hgetall(key)
results = pipe.execute()
for fields in results:
    ...

4. Broad exception swallowing hides connection failures (warning)

Lines 350, 420, 490: search(), query(), and delete() all catch bare Exception and return None / silently continue:

except Exception as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None

The caller interprets None as "no results" rather than "search failed." Catch valkey.exceptions.ValkeyError (or valkey.exceptions.ConnectionError + valkey.exceptions.ResponseError) specifically. Let unexpected exceptions propagate so transient failures are visible to the application layer.


5. _escape_tag_value — compile regex at module level, add missing ? wildcard (warning)

Line 46 — The regex is compiled on every call and is missing ? (a wildcard in valkey-search TAG queries):

def _escape_tag_value(value: str) -> str:
    return re.sub(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~\\/| \t\n\r])', r'\\\1', str(value))

Suggested fix:

import re

# Same character class as the current code, with ? added; compiled once at import time.
_TAG_SPECIAL_RE = re.compile(r'([,.<>{}\[\]"\':;!@#$%^&*()\-+=~?\\/| \t\n\r])')

def _escape_tag_value(value: str) -> str:
    """Escape special characters for valkey-search TAG field queries."""
    return _TAG_SPECIAL_RE.sub(r'\\\1', str(value))

Overall the implementation is well-structured, follows existing backend patterns, and has excellent startup validation. These 5 items are the blockers; the rest of the code is solid.

@Jonathan-Improving

Jonathan-Improving commented May 13, 2026

Notes on the PR description (for upstream submission prep):

  1. valkey-testing/VALKEY_VECTOR_STORE_BRIEF.md not in PR — The Documentation checkbox references this file "for reviewers" but it's not committed. For the upstream PR, either include it or remove the reference.

  2. Screenshots placeholder — Still says "To be added before review." Should be populated before the upstream submission.

  3. RC image as primary recommendation: valkey-bundle:9.1.0-rc2 is a release candidate. Upstream reviewers may push back on recommending an RC for production. Clarify timeline for stable release or lead with Option B (stable core + --loadmodule).

  4. Client library choice (valkey vs valkey-glide) — The PR uses valkey==6.1.1 (redis-py fork) rather than valkey-glide, which is the official recommended GLIDE client with native FT.* module support, type-safe APIs, and better connection management. If this was an intentional choice (e.g., minimizing new dependencies, matching existing redis usage in the codebase, avoiding GLIDE's current maturity status), that rationale should be documented in the PR description — upstream reviewers will likely ask. If it was an oversight, it's worth evaluating before the upstream submission.

@Jonathan-Improving Jonathan-Improving left a comment

Follow-up review — 3 additional findings from a second pass with anti-pattern and security-focused reviewers.

One new critical (error swallowing), two new medium suggestions (unbounded SCAN, filter key validation). These complement the original 7 findings.

Comment on lines +440 to +442
ids, documents, metadatas = [], [], []
cursor = 0
while True:

🟡 Suggestion — Add a hard upper limit when limit is None

The VectorDBBase.get() signature has no limit parameter, so callers using the base interface will always hit this with limit=None. In that case, this loop scans the entire keyspace with no cap — for a collection with hundreds of thousands of documents, this could OOM the application server or block the Valkey instance with sustained SCAN + HGETALL traffic.

query() caps at 10000 (line 413). Consider the same here:

effective_limit = limit if limit and limit > 0 else 10_000

Or at minimum, log a warning when returning large result sets so operators can tune.
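
Combining the cap with the pipelining suggestion from the earlier comment, a bounded fetch loop might look like the sketch below. It relies only on the `scan`, `pipeline`, `hgetall`, and `execute` calls already quoted in this review; `fetch_hashes` and `DEFAULT_MAX_DOCS` are hypothetical names, and the 10,000 default simply mirrors the cap in query().

```python
import logging
from typing import Any, Dict, List, Optional

log = logging.getLogger(__name__)
DEFAULT_MAX_DOCS = 10_000  # mirrors the cap used by query()


def fetch_hashes(
    client: Any,
    prefix: str,
    limit: Optional[int] = None,
    scan_count: int = 500,
) -> List[Dict[Any, Any]]:
    """SCAN keys under `prefix` and read them in pipelined batches, up to a cap."""
    effective_limit = limit if limit and limit > 0 else DEFAULT_MAX_DOCS
    rows: List[Dict[Any, Any]] = []
    cursor = 0
    while True:
        cursor, keys = client.scan(cursor=cursor, match=f'{prefix}*', count=scan_count)
        # Trim the batch so the total never exceeds the cap.
        keys = list(keys)[: effective_limit - len(rows)]
        if keys:
            pipe = client.pipeline()
            for key in keys:
                pipe.hgetall(key)
            # One round trip per SCAN batch instead of one per key.
            rows.extend(pipe.execute())
        if cursor == 0 or len(rows) >= effective_limit:
            break
    if limit is None and len(rows) >= effective_limit:
        log.warning('get() hit the default cap of %d documents for prefix %r', effective_limit, prefix)
    return rows
```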

Raises ValueError on unsupported operators rather than silently matching nothing.
"""
parts = []
for key, value in filter.items():

🟡 Suggestion — Guard against => in constructed filter expressions

The => token in FT.SEARCH syntax delimits a filter from a KNN vector clause. This regex escapes = and > individually (both are in the character class), but if a filter value contains the literal sequence => and the escaping doesn't break it apart, it could inject a KNN clause into a filter-only query like query() or delete().

Looking more carefully: = IS escaped (it's in +=~) and > IS escaped (it's in <>). So => in a value becomes \=\> which is safe. However, the filter keys (line 56: f'@{key}:...') are NOT escaped or validated at all. If a future caller passes user-controlled keys, => injection becomes possible.

Defensive fix — validate keys in _build_filter_expression:

_ALLOWED_FILTER_KEYS = frozenset({'id', 'hash', 'file_id', 'source', 'knowledge_base_id'})

for key in filter.keys():
    if key not in _ALLOWED_FILTER_KEYS:
        raise ValueError(f"Unsupported filter key: {key!r}")

This is defense-in-depth — current callers use hardcoded keys, but the function itself has no guard.

Comment on lines +397 to +399
except Exception as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None

🔴 Must Fix — except Exception swallows all errors silently

This catches every Exception subclass, including MemoryError and programming bugs like TypeError or AttributeError (KeyboardInterrupt and SystemExit derive from BaseException, so they are not caught). The caller receives None with no way to distinguish "no results found" from "server is down" or "there's a bug in the code."

Same pattern appears in query() (line 428), delete() (line 471), and delete()-by-filter (line 496).

Fix: Catch only Valkey protocol/connection errors:

except valkey.exceptions.ValkeyError as e:
    log.error(f'Valkey search error on collection {collection_name}: {e}')
    return None

This still handles connection failures, timeouts, and server errors gracefully while letting programming bugs propagate so they're caught during development.
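
If callers also need to tell "no results" apart from "backend failed", a variant is to re-raise backend errors as a dedicated exception rather than returning None. This is a different technique from the fix above, shown only as a sketch: `VectorBackendError` and `run_backend_call` are hypothetical, and the error tuple would be `(valkey.exceptions.ValkeyError,)` in the real module.

```python
import logging
from typing import Callable, Tuple, Type, TypeVar

log = logging.getLogger(__name__)
T = TypeVar('T')


class VectorBackendError(RuntimeError):
    """Hypothetical error type signalling that the vector backend call failed."""


def run_backend_call(
    operation: Callable[[], T],
    backend_errors: Tuple[Type[Exception], ...],
    context: str,
) -> T:
    """Run `operation`, translating only known backend errors."""
    try:
        return operation()
    except backend_errors as exc:
        # Connection and protocol failures are logged and surfaced explicitly.
        log.error('%s failed: %s', context, exc)
        raise VectorBackendError(context) from exc
    # Any other exception (TypeError, AttributeError, ...) propagates untouched.
```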

@MatthiasHowellYopp

LGTM

Signed-off-by: Riley Des <riley.desserre@improving.com>
@rileydes-improving rileydes-improving force-pushed the feat/add-valkey-vector-database branch from c9e6a43 to 20252b5 Compare May 15, 2026 18:40

3 participants