@zaidoon1 zaidoon1 commented Oct 30, 2025

Problem

Today, users who only need to know “does any key with this prefix exist?” typically:

  • Create a new iterator
  • Enable prefix_same_as_start
  • Seek to the prefix
  • Tear down the iterator

This pattern incurs expensive iterator lifecycle overhead per prefix, touches many index/filter blocks with poor locality, and forces every user to reinvent the same existence check.
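
To make the check itself concrete, here is a minimal sketch of the seek-then-compare logic the pattern above boils down to. It is not RocksDB code: a sorted `std::vector<std::string>` stands in for the keyspace, and `PrefixExistsBySeek` is a hypothetical name used only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of RocksDB): `keys` is a sorted vector that
// stands in for the database keyspace. Returns true if any key starts with
// `prefix`.
inline bool PrefixExistsBySeek(const std::vector<std::string>& keys,
                               const std::string& prefix) {
  assert(std::is_sorted(keys.begin(), keys.end()));
  // "Seek to the prefix": position on the first key >= prefix.
  auto it = std::lower_bound(keys.begin(), keys.end(), prefix);
  // The landing key is the only candidate: keys sharing a prefix are
  // contiguous in sorted order, so if it does not match, none do.
  return it != keys.end() && it->compare(0, prefix.size(), prefix) == 0;
}
```

In the real pattern, each such call also pays for building and tearing down an iterator, which this model does not capture.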

Solution

Introduce prefix-existence APIs that:

  1. Avoid value materialization and scan only as far as the first matching key.
  2. Reuse one SuperVersion and one set of iterators (memtable, immutable memtables, and SSTs) for the entire batch via PrefixExistsMulti.
  3. Offer a prefixes_sorted hint so callers who pre-sort/dedup can skip internal sorting; otherwise the implementation sorts/dedups to maximize locality and reduce seeks.
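
The batching idea in points 2 and 3 can be sketched roughly as follows. This is an illustrative model only, not the implementation in this PR: a sorted `std::vector<std::string>` stands in for the database, a single forward-moving cursor stands in for the reused iterators, and the name and signature of `PrefixExistsMultiSketch` are assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical sketch (not the PR's code): `keys` is a sorted vector standing
// in for the DB. Results are returned in the caller's original prefix order.
inline std::vector<bool> PrefixExistsMultiSketch(
    const std::vector<std::string>& keys,
    const std::vector<std::string>& prefixes, bool prefixes_sorted) {
  assert(std::is_sorted(keys.begin(), keys.end()));
  // The hint is a promise from the caller; this sketch relies on it.
  assert(!prefixes_sorted ||
         std::is_sorted(prefixes.begin(), prefixes.end()));

  // order[i] is an index into `prefixes`, visited in ascending prefix order.
  std::vector<std::size_t> order(prefixes.size());
  std::iota(order.begin(), order.end(), std::size_t{0});
  if (!prefixes_sorted) {
    std::sort(order.begin(), order.end(), [&](std::size_t a, std::size_t b) {
      return prefixes[a] < prefixes[b];
    });
  }

  std::vector<bool> found(prefixes.size(), false);
  auto cursor = keys.begin();  // one cursor reused for the whole batch
  for (std::size_t i = 0; i < order.size(); ++i) {
    const std::string& p = prefixes[order[i]];
    if (i > 0 && p == prefixes[order[i - 1]]) {
      // Duplicate prefix: reuse the previous answer instead of seeking again.
      const bool previous = found[order[i - 1]];
      found[order[i]] = previous;
      continue;
    }
    // Ascending prefixes mean the cursor only ever moves forward.
    cursor = std::lower_bound(cursor, keys.end(), p);
    found[order[i]] =
        cursor != keys.end() && cursor->compare(0, p.size(), p) == 0;
  }
  return found;
}
```

Sorting an index array rather than the prefixes themselves keeps the results aligned with the caller's input order, at the cost of one extra vector per batch.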

Performance

  • Compared to creating/destroying an iterator per prefix (no reuse), PrefixExistsMulti substantially reduces overhead by amortizing iterator setup across the batch and improving locality via sorting/dedup.
  • Compared to single-prefix PrefixExists (per-call setup), PrefixExistsMulti is faster once the batch reaches roughly 4–8 prefixes, due to amortized setup and fewer overall seeks.
  • Qualitative expectations in common setups (prefix extractor + partitioned filters + prefix_same_as_start):
    • Old pattern (new iterator per prefix) → Baseline (slowest).
    • Single-prefix PrefixExists → Moderately faster than baseline due to existence-only short-circuit.
    • Batched PrefixExistsMulti → Substantially faster than baseline for N>1; the internal sort/dedup helps most when inputs arrive unsorted or contain duplicates.

For context, my workload currently performs over 100K prefix checks per second, and those checks can be batched, so optimizing the batched path is critical for us. I'm very open to any other changes that would make this path more efficient, so suggestions are welcome.

@meta-cla meta-cla bot added the CLA Signed label Oct 30, 2025
@zaidoon1 zaidoon1 force-pushed the zaidoon/prefix-exists-feature branch 17 times, most recently from 0f55da4 to 125ade0 October 31, 2025 00:13
@zaidoon1 zaidoon1 force-pushed the zaidoon/prefix-exists-feature branch from 125ade0 to fa53433 October 31, 2025 00:34

zaidoon1 commented Oct 31, 2025

@jaykorean would love feedback on this change from you and the team please.
