feat: Rust API safety, serialization, filtering, and upstream bug fixes #1

Merged
jagan-nuvai merged 11 commits into main from nuvai/rust-serialize-and-lifetime-safety on Mar 23, 2026
@mohanprasand-nuvai
Collaborator

Summary

Cherry-picks and adapts 11 unmerged upstream PRs from rapidsai/cuvs (listed under "Upstream PR references"), plus Nuvai-authored IVF-PQ/IVF-Flat serialization, into a single integration branch for Knuckles' cuVS dependency.

Rust API improvements

Bug fixes (C++/CUDA)

VRAM optimization

Upstream PR references

| Commit | Upstream PR | Status |
| --- | --- | --- |
| Lifetime safety | rapidsai#1869 | Open (CI style failure) |
| CAGRA serialize | rapidsai#1840 | Open (merge conflicts) |
| IVF-PQ/Flat serialize | Nuvai-authored | N/A |
| Multi-threaded OOB | rapidsai#1781 | Open |
| FP16 overflow | rapidsai#1084 | Open |
| Assert fix | rapidsai#1890 | Open |
| Uninitialized members | rapidsai#1922 | Open |
| cudaFuncSetAttribute | rapidsai#1851 | Open |
| attach_dataset_on_build | rapidsai#1842 | Open |
| Recall degradation | rapidsai#1841 | Open |
| Validated builders | rapidsai#1861 | Open |
| Filtering support | rapidsai#1478 | Open (WIP) |

Test plan

  • Verify CAGRA build + search round-trip on GPU
  • Verify IVF-PQ and IVF-Flat serialize/deserialize round-trip
  • Verify filtered search with Bitset and Bitmap filters
  • Verify concurrent CAGRA search does not crash (OOB fix)
  • Verify FP16 datasets produce finite distance values
  • Verify large batch sizes maintain recall with search_width > 1
  • Run cargo test in rust/cuvs/ (requires CUDA toolkit)

🤖 Generated with Claude Code

Ludu-nuvai and others added 11 commits March 23, 2026 12:13
…ypes

Cherry-picked from upstream PR rapidsai#1869 (rapidsai/cuvs).

Introduces Index<'a> with DatasetOwnership enum (Borrowed/Owned) across
brute_force, cagra, ivf_flat, and ivf_pq. Adds build() for borrowed
datasets (compiler enforces lifetime) and build_owned() for self-contained
indexes. Fixes use-after-free when C library dereferences freed dataset.

Original author: zbennett10
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
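For reviewers unfamiliar with the borrowed/owned split, here is a minimal host-only sketch of the idea. It is not the code from rapidsai#1869: the field layout, the `dim` parameter, and the helpers `rows`, `is_owned`, and `make_self_contained` are assumptions made for illustration, and no GPU or FFI call is involved.

```rust
/// Hypothetical stand-in for the ownership enum named in the commit message.
#[derive(Debug)]
pub enum DatasetOwnership<'a> {
    /// The index only references the caller's data.
    Borrowed(&'a [f32]),
    /// The index keeps its own copy alive.
    Owned(Vec<f32>),
}

/// Illustrative index that cannot outlive the dataset it refers to.
#[derive(Debug)]
pub struct Index<'a> {
    dataset: DatasetOwnership<'a>,
    dim: usize,
}

impl<'a> Index<'a> {
    /// Borrowing build: the returned index is tied to the lifetime `'a`,
    /// so the compiler rejects code that drops `dataset` while the index is live.
    pub fn build(dataset: &'a [f32], dim: usize) -> Index<'a> {
        assert!(dim > 0, "dim must be non-zero");
        Index { dataset: DatasetOwnership::Borrowed(dataset), dim }
    }

    /// Number of rows, computed from whichever storage backs the index.
    pub fn rows(&self) -> usize {
        let data: &[f32] = match &self.dataset {
            DatasetOwnership::Borrowed(d) => *d,
            DatasetOwnership::Owned(v) => v.as_slice(),
        };
        data.len() / self.dim
    }

    /// True when the index holds its own dataset.
    pub fn is_owned(&self) -> bool {
        matches!(self.dataset, DatasetOwnership::Owned(_))
    }
}

impl Index<'static> {
    /// Owning build: the dataset moves into the index, so no borrow is needed.
    pub fn build_owned(dataset: Vec<f32>, dim: usize) -> Index<'static> {
        assert!(dim > 0, "dim must be non-zero");
        Index { dataset: DatasetOwnership::Owned(dataset), dim }
    }
}

/// An owned index can be returned from the scope that created its data.
pub fn make_self_contained() -> Index<'static> {
    let data = vec![0.0_f32; 8];
    Index::build_owned(data, 4)
}
```

Writing the same function with `Index::build(&data, 4)` would fail to compile, because `data` is dropped at the end of the function while the returned index still borrows it. That compile-time rejection is the kind of use-after-free this commit aims to rule out.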
… indexes

Cherry-picks CAGRA serialization from upstream PR rapidsai#1840 (zbennett10),
adapted for the Index<'a> lifetime API from PR rapidsai#1869.

Adds new Nuvai-authored serialize/deserialize methods for IVF-PQ and
IVF-Flat, following the same pattern (CString filename, FFI call to
cuvsIvfPqSerialize/Deserialize and cuvsIvfFlatSerialize/Deserialize).

Includes round-trip tests for CAGRA and IVF-PQ serialization, plus
hnswlib export test for CAGRA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
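The "CString filename, FFI call" pattern mentioned above can be sketched roughly as follows. This is an illustration only: `mock_serialize` is a hypothetical stand-in for the C entry point (the real bindings call into libcuvs and also take resource and index handles), and `SerializeError` is an assumed error type, not the crate's.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::{c_char, c_int};

/// Hypothetical stand-in for a C serialize function. It writes no file and
/// only reports success (0) for a non-empty, non-null filename.
unsafe extern "C" fn mock_serialize(filename: *const c_char) -> c_int {
    if filename.is_null() {
        return 1;
    }
    // Safety: the caller passes a valid NUL-terminated string.
    let name = unsafe { CStr::from_ptr(filename) };
    if name.to_bytes().is_empty() { 1 } else { 0 }
}

/// Assumed error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum SerializeError {
    /// The path contained an interior NUL byte and cannot become a C string.
    InteriorNul,
    /// The C side returned a non-zero status code.
    Ffi(c_int),
}

/// Converts the path to a C string, calls the C function, and maps the status.
pub fn serialize(path: &str) -> Result<(), SerializeError> {
    // `c_path` must stay alive until the FFI call returns, so bind it to a
    // local instead of calling `.as_ptr()` on a temporary.
    let c_path = CString::new(path).map_err(|_| SerializeError::InteriorNul)?;
    let status = unsafe { mock_serialize(c_path.as_ptr()) };
    if status == 0 {
        Ok(())
    } else {
        Err(SerializeError::Ffi(status))
    }
}
```

Rejecting interior NUL bytes before crossing the FFI boundary keeps the C side from silently seeing a truncated filename.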
Cherry-picked from upstream PR rapidsai#1781.
Fixes cudaErrorIllegalAddress under concurrent search by ensuring
per-thread distance buffer indexing stays within bounds.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cherry-picked from upstream PR rapidsai#1084.
Uses float accumulator for FP16 distance computation to prevent
overflow when distance values exceed FP16 range.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
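A small host-side calculation shows why the accumulator width matters. The sketch below does not reproduce the CUDA kernel from rapidsai#1084. It only checks, in `f32`, that a squared L2 distance built from FP16-representable inputs can exceed the largest finite FP16 value (65504). The function names and the example vectors are assumptions chosen for illustration.

```rust
/// Largest finite value of IEEE 754 binary16 (FP16).
pub const FP16_MAX: f32 = 65504.0;

/// Squared L2 distance accumulated in f32, the wider accumulator type.
pub fn squared_l2_f32(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

/// True when `value` is finite in f32 but too large to be finite in FP16.
pub fn overflows_fp16(value: f32) -> bool {
    value.is_finite() && value.abs() > FP16_MAX
}

/// Example: 128 dimensions, every component differing by 30.0.
/// Each input is exactly representable in FP16, yet the sum is
/// 128 * 30^2 = 115200, which is above FP16_MAX.
pub fn example_distance() -> f32 {
    let a = vec![30.0_f32; 128];
    let b = vec![0.0_f32; 128];
    squared_l2_f32(&a, &b)
}
```

With an FP16 accumulator the running sum in this example would pass 65504 partway through and become infinite. Accumulating in `f32` keeps the result finite, which is what the "finite distance values" item in the test plan checks for.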
Cherry-picked from upstream PR rapidsai#1890.
Removes incorrect assertion that triggered crashes in valid
IVF-Flat fused kernel use cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cherry-picked from upstream PR rapidsai#1922.
Adds default member initializers for public header structs
(distance, grammian, ball_cover, hnsw, vamana, spectral_embedding)
to prevent undefined behavior when fields are unset.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cherry-picked from upstream PR rapidsai#1851.
Tracks kernel function pointer changes and re-applies shared memory
attribute when CAGRA search switches between kernel variants, preventing
silent performance degradation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cherry-picked from upstream PR rapidsai#1842.
Adds C API functions to control dataset attachment during CAGRA build,
halving VRAM usage for large datasets. Also adds cuvsCagraUpdateDataset
for replacing the dataset after build without rebuilding the graph.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

… batch sizes

Cherry-picked from upstream PR rapidsai#1841.
Fixes silent recall drop when AUTO algorithm selector switches from
MULTI_CTA to SINGLE_CTA at large batch sizes with search_width > 1.
Includes regression benchmark and test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cherry-picked from upstream PR rapidsai#1861 (fixes rapidsai#1859).
Adds builder pattern with validation for CAGRA, IVF-Flat, IVF-PQ,
and Vamana index/search params. Invalid parameters now return errors
instead of silently causing GPU crashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
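The "builder pattern with validation" could look roughly like the sketch below. It is not the code from rapidsai#1861: the type names, the default values (64 and 128), and both validation rules are assumptions made for illustration. Only the parameter names `graph_degree` and `intermediate_graph_degree` come from CAGRA itself.

```rust
/// Validated parameters; values of this type have passed `build()`.
#[derive(Debug, Clone, PartialEq)]
pub struct IndexParams {
    pub graph_degree: usize,
    pub intermediate_graph_degree: usize,
}

/// Assumed error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum ParamError {
    ZeroGraphDegree,
    IntermediateBelowGraphDegree {
        graph_degree: usize,
        intermediate_graph_degree: usize,
    },
}

/// Collects settings without validating them until `build()`.
#[derive(Debug, Clone)]
pub struct IndexParamsBuilder {
    graph_degree: usize,
    intermediate_graph_degree: usize,
}

impl Default for IndexParamsBuilder {
    fn default() -> Self {
        // Assumed defaults, chosen for the example.
        Self { graph_degree: 64, intermediate_graph_degree: 128 }
    }
}

impl IndexParamsBuilder {
    pub fn graph_degree(mut self, value: usize) -> Self {
        self.graph_degree = value;
        self
    }

    pub fn intermediate_graph_degree(mut self, value: usize) -> Self {
        self.intermediate_graph_degree = value;
        self
    }

    /// Checks the (assumed) invariants and returns an error on the host
    /// instead of passing bad values to a GPU kernel.
    pub fn build(self) -> Result<IndexParams, ParamError> {
        if self.graph_degree == 0 {
            return Err(ParamError::ZeroGraphDegree);
        }
        if self.intermediate_graph_degree < self.graph_degree {
            return Err(ParamError::IntermediateBelowGraphDegree {
                graph_degree: self.graph_degree,
                intermediate_graph_degree: self.intermediate_graph_degree,
            });
        }
        Ok(IndexParams {
            graph_degree: self.graph_degree,
            intermediate_graph_degree: self.intermediate_graph_degree,
        })
    }
}
```

Deferring validation to `build()` lets callers set fields in any order and still get a single, typed error before any device work starts.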
Cherry-picked from upstream PR rapidsai#1478, adapted for Index<'a>
lifetime API from PR rapidsai#1869.

Adds a Filter trait and implementations (NoFilter, Bitset, Bitmap) in Rust, and
updates search() signatures across CAGRA, IVF-PQ, and IVF-Flat to accept
Option<&dyn Filter>. Includes C API filter plumbing, C tests for
filtered search, and Go/Python binding updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
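To make the filter shapes concrete, here is a host-only sketch of a `Filter` trait with the three implementations named above and a toy 1-D nearest-neighbour `search`. It is not the code from rapidsai#1478. The `allows` method, the bit layouts (one bit per row for `Bitset`, one bit per query/row pair for `Bitmap`), and the `search` signature are all assumptions made for illustration.

```rust
/// Decides whether dataset row `row` may be returned for query `query`.
pub trait Filter {
    fn allows(&self, query: usize, row: usize) -> bool;
}

/// Accepts every row.
pub struct NoFilter;
impl Filter for NoFilter {
    fn allows(&self, _query: usize, _row: usize) -> bool { true }
}

/// One bit per dataset row, shared by all queries (assumed semantics).
pub struct Bitset { words: Vec<u32>, len: usize }
impl Bitset {
    pub fn new(len: usize) -> Self {
        Bitset { words: vec![0; (len + 31) / 32], len }
    }
    pub fn set(&mut self, i: usize) {
        assert!(i < self.len, "bit index out of range");
        self.words[i / 32] |= 1u32 << (i % 32);
    }
    pub fn get(&self, i: usize) -> bool {
        i < self.len && ((self.words[i / 32] >> (i % 32)) & 1) == 1
    }
}
impl Filter for Bitset {
    fn allows(&self, _query: usize, row: usize) -> bool { self.get(row) }
}

/// One bit per (query, row) pair, stored row-major (assumed semantics).
pub struct Bitmap { bits: Bitset, n_rows: usize }
impl Bitmap {
    pub fn new(n_queries: usize, n_rows: usize) -> Self {
        Bitmap { bits: Bitset::new(n_queries * n_rows), n_rows }
    }
    pub fn set(&mut self, query: usize, row: usize) {
        assert!(row < self.n_rows, "row out of range");
        self.bits.set(query * self.n_rows + row);
    }
}
impl Filter for Bitmap {
    fn allows(&self, query: usize, row: usize) -> bool {
        row < self.n_rows && self.bits.get(query * self.n_rows + row)
    }
}

/// Toy exact search over scalar data: returns the nearest allowed row per
/// query, or `None` when the filter rejects every row.
pub fn search(
    dataset: &[f32],
    queries: &[f32],
    filter: Option<&dyn Filter>,
) -> Vec<Option<usize>> {
    queries.iter().enumerate().map(|(q, &qv)| {
        let mut best: Option<(usize, f32)> = None;
        for (row, &v) in dataset.iter().enumerate() {
            // `None` means "no filtering", same as passing `NoFilter`.
            if !filter.map_or(true, |f| f.allows(q, row)) {
                continue;
            }
            let dist = (v - qv).abs();
            if best.map_or(true, |(_, best_dist)| dist < best_dist) {
                best = Some((row, dist));
            }
        }
        best.map(|(row, _)| row)
    }).collect()
}
```

A caller would pass `Some(&bitset as &dyn Filter)` to restrict results, or `None` to search the whole dataset. Taking a trait object keeps a single `search` signature for every filter kind.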
@jagan-nuvai jagan-nuvai merged commit 52772b8 into main Mar 23, 2026
@jagan-nuvai jagan-nuvai deleted the nuvai/rust-serialize-and-lifetime-safety branch March 23, 2026 08:10