feat(rust)!: compile-time dataset lifetime safety for all index types #1869

Open
zbennett10 wants to merge 1 commit into rapidsai:main from zbennett10:feat/rust-dataset-lifetime-safety

@zbennett10
Contributor

@zbennett10 zbennett10 commented Mar 3, 2026

This is a follow-on to #1839, based on comments from @benfred.

Introduces compile-time dataset lifetime enforcement across all 4 Rust index types (brute_force, cagra, ivf_flat, ivf_pq), assuming we want to do it for all of them! It addresses the use-after-free risk that arises when ManagedTensor::from(&ndarray) creates a non-owning view and the C library stores a pointer to the dataset, and who knows what shenanigans may go on "down there" 😄

Main Changes

  • DatasetOwnership<'a> enum in dlpack.rs — tracks whether an index borrows or owns its dataset
  • Index<'a> with dual constructors for all 4 index types:
    • build(&'a ManagedTensor) — borrowed path, compiler enforces dataset outlives index
    • build_owned(ManagedTensor) — owned path ('static), index is self-contained
  • create_handle() helper — avoids double-free by separating FFI handle creation from Index struct construction

Design

pub struct Index<'a> { // Ties the lifetime of the index to its underlying data
    inner: ffi::cuvsXxxIndex_t,
    _data: DatasetOwnership<'a>,
}

// Borrowed — compiler enforces arr outlives index
let tensor = ManagedTensor::from(&arr);
let index = Index::build(&res, &params, &tensor)?;

// Owned — 'static, self-contained
let device = ManagedTensor::from(&arr).to_device(&res)?;
let index = Index::build_owned(&res, &params, device)?;
drop(arr); // Fine — index owns the GPU copy
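
To make the two paths concrete without the cuVS crate, here is a minimal, self-contained sketch. Everything in it is a stand-in: `ManagedTensor` is mocked as a plain `Vec<f32>` wrapper, the payloads of the `DatasetOwnership` variants are assumptions, and the raw pointer only plays the role of the pointer the C library would keep. It is not the PR's actual implementation.

```rust
/// Stand-in for the real `ManagedTensor` (assumption: here it simply owns a Vec).
pub struct ManagedTensor {
    data: Vec<f32>,
}

/// Records whether an index borrows its dataset or owns it.
/// The variant payloads are assumptions made for this sketch.
pub enum DatasetOwnership<'a> {
    Borrowed(&'a ManagedTensor),
    Owned(ManagedTensor),
}

/// Mock index: the raw pointer plays the role of the pointer the C library keeps.
pub struct Index<'a> {
    dataset_ptr: *const f32,
    len: usize,
    _data: DatasetOwnership<'a>,
}

impl<'a> Index<'a> {
    /// Borrowed path: the returned index cannot outlive `dataset`.
    pub fn build(dataset: &'a ManagedTensor) -> Index<'a> {
        Index {
            dataset_ptr: dataset.data.as_ptr(),
            len: dataset.data.len(),
            _data: DatasetOwnership::Borrowed(dataset),
        }
    }

    /// Reads through the stored pointer, as the C side would during a search.
    pub fn first(&self) -> Option<f32> {
        if self.len == 0 {
            return None;
        }
        // SAFETY: `_data` keeps the buffer alive (owned) or borrowed for `'a`.
        Some(unsafe { *self.dataset_ptr })
    }
}

impl Index<'static> {
    /// Owned path: the tensor moves into the index, so no outside borrow remains.
    pub fn build_owned(dataset: ManagedTensor) -> Index<'static> {
        // Moving a Vec does not move its heap buffer, so the pointer stays valid.
        let (dataset_ptr, len) = (dataset.data.as_ptr(), dataset.data.len());
        Index { dataset_ptr, len, _data: DatasetOwnership::Owned(dataset) }
    }
}

/// Exercises both paths and returns what each index reads.
pub fn demo() -> (Option<f32>, Option<f32>) {
    let tensor = ManagedTensor { data: vec![1.0, 2.0] };
    let borrowed = Index::build(&tensor);
    // drop(tensor); // would be rejected here (E0505): `tensor` is still borrowed
    let from_borrowed = borrowed.first();

    let owned = Index::build_owned(ManagedTensor { data: vec![3.0] });
    (from_borrowed, owned.first())
}
```

Uncommenting the `drop(tensor)` line turns the use-after-free into a compile error, which is the whole point of the `'a` parameter.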

Breaking Change

Index::build() now takes &ManagedTensor instead of impl Into<ManagedTensor>. Users should:

  • Use build(&tensor) with an explicit ManagedTensor reference (borrowed)
  • Use build_owned(tensor) for the old move semantics (owned)

@zbennett10 zbennett10 changed the base branch from branch-25.06 to main March 3, 2026 21:56
@zbennett10
Contributor Author

zbennett10 commented Mar 3, 2026

@benfred This does introduce a breaking change, which I'm not super happy about. We could take a backward-compatible approach instead: keep build() as the owned path and add build_borrowed() as the new, safe path. But idk, maybe it's best to bite the bullet there.

// Keep the old build() signature; it becomes the "owned" path
impl Index<'static> {
    pub fn build<T: Into<ManagedTensor>>(
        res: &Resources,
        params: &IndexParams,
        dataset: T,
    ) -> Result<Index<'static>> {
        let dataset: ManagedTensor = dataset.into();
        let inner = Self::create_handle()?;
        unsafe { check_cuvs(ffi::cuvsCagraBuild(res.0, params.0, dataset.as_ptr(), inner))?; }
        Ok(Index { inner, _data: DatasetOwnership::Owned(dataset) })
    }
}

// New safe borrowed path alongside it
impl<'a> Index<'a> {
    pub fn build_borrowed(
        res: &Resources,
        params: &IndexParams,
        dataset: &'a ManagedTensor,
    ) -> Result<Index<'a>> { ... }
}

Existing code like Index::build(&res, &params, &dataset) would still compile — the ndarray ref goes through Into<ManagedTensor> and gets owned.

However, the struct still has <'a>, so a signature like fn get_index() -> Index no longer compiles. For this to work we would need a type alias, something like pub type OwnedIndex = Index<'static>;
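
A small sketch of how such an alias could keep return-type signatures short. The `Index` struct here is a mock with a hypothetical `id` field, and `get_index` and `index_id` are illustrative names only; the alias itself is the one suggested above.

```rust
use std::marker::PhantomData;

/// Mock index carrying a lifetime parameter; `id` is a hypothetical payload.
pub struct Index<'a> {
    id: u32,
    _data: PhantomData<&'a ()>,
}

/// Alias for an index that is not tied to any borrowed dataset.
pub type OwnedIndex = Index<'static>;

/// Writing `-> Index` here would fail with E0106 (missing lifetime specifier),
/// because there is no input lifetime to elide from. The alias avoids that.
pub fn get_index() -> OwnedIndex {
    Index { id: 7, _data: PhantomData }
}

/// Functions that only read an index can accept any lifetime.
pub fn index_id(index: &Index<'_>) -> u32 {
    index.id
}
```

Callers that previously wrote `Index` in a return position would switch to `OwnedIndex` (or spell out `Index<'static>`), while argument positions can keep using elision.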

@zbennett10 zbennett10 force-pushed the feat/rust-dataset-lifetime-safety branch from 6a66208 to e006925 Compare March 4, 2026 14:01
@aamijar aamijar added the non-breaking (Introduces a non-breaking change), Rust, and improvement (Improves an existing functionality) labels Mar 4, 2026
@aamijar aamijar requested review from benfred and removed request for a team and AyodeAwe March 4, 2026 21:41
@aamijar
Member

aamijar commented Mar 4, 2026

/ok to test e006925

@benfred
Contributor

benfred commented Mar 11, 2026

/ok to test 994f408

Ludu-nuvai added a commit to Nuvai/cuvs that referenced this pull request Mar 23, 2026
…ypes

Cherry-picked from upstream PR rapidsai#1869 (rapidsai/cuvs).

Introduces Index<'a> with DatasetOwnership enum (Borrowed/Owned) across
brute_force, cagra, ivf_flat, and ivf_pq. Adds build() for borrowed
datasets (compiler enforces lifetime) and build_owned() for self-contained
indexes. Fixes use-after-free when C library dereferences freed dataset.

Original author: zbennett10
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ludu-nuvai added a commit to Nuvai/cuvs that referenced this pull request Mar 23, 2026
… indexes

Cherry-picks CAGRA serialization from upstream PR rapidsai#1840 (zbennett10),
adapted for the Index<'a> lifetime API from PR rapidsai#1869.

Adds new Nuvai-authored serialize/deserialize methods for IVF-PQ and
IVF-Flat, following the same pattern (CString filename, FFI call to
cuvsIvfPqSerialize/Deserialize and cuvsIvfFlatSerialize/Deserialize).

Includes round-trip tests for CAGRA and IVF-PQ serialization, plus
hnswlib export test for CAGRA.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Ludu-nuvai added a commit to Nuvai/cuvs that referenced this pull request Mar 23, 2026
Cherry-picked from upstream PR rapidsai#1478, adapted for Index<'a>
lifetime API from PR rapidsai#1869.

Adds Filter trait and implementations (NoFilter, Bitset, Bitmap) in Rust,
updates search() signatures across CAGRA, IVF-PQ, and IVF-Flat to accept
Option<&dyn Filter>. Includes C API filter plumbing, C tests for
filtered search, and Go/Python bindings updates.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
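
For readers unfamiliar with the shape described in that commit message, here is a rough sketch of a `Filter` trait passed as `Option<&dyn Filter>`. The commit only names the trait and its implementations; the `allows` method, the `Bitset::from_allowed` constructor, the "set bit means the row is kept" convention, and the toy `search` function are all assumptions made for illustration and are not the cuVS API.

```rust
/// Decides whether a dataset row may appear in search results.
/// The method name `allows` is hypothetical.
pub trait Filter {
    fn allows(&self, row: usize) -> bool;
}

/// Accepts every row.
pub struct NoFilter;

impl Filter for NoFilter {
    fn allows(&self, _row: usize) -> bool {
        true
    }
}

/// One bit per row; in this sketch a set bit means the row is kept.
pub struct Bitset {
    bits: Vec<u64>,
}

impl Bitset {
    /// Builds a bitset over `n_rows` rows with only `allowed` rows set.
    pub fn from_allowed(n_rows: usize, allowed: &[usize]) -> Self {
        let mut bits = vec![0u64; (n_rows + 63) / 64];
        for &row in allowed {
            if row < n_rows {
                bits[row / 64] |= 1u64 << (row % 64);
            }
        }
        Bitset { bits }
    }
}

impl Filter for Bitset {
    fn allows(&self, row: usize) -> bool {
        // Rows past the end of the bitset are treated as filtered out.
        self.bits
            .get(row / 64)
            .map_or(false, |&word| (word >> (row % 64)) & 1 == 1)
    }
}

/// Toy "search": returns the ids of all rows that pass the optional filter.
pub fn search(n_rows: usize, filter: Option<&dyn Filter>) -> Vec<usize> {
    (0..n_rows)
        .filter(|&row| filter.map_or(true, |f| f.allows(row)))
        .collect()
}
```

Passing `None` and passing `Some(&NoFilter as &dyn Filter)` behave the same in this sketch; the trait object lets one `search` signature accept any filter type.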
@cjnolet
Member

cjnolet commented Mar 24, 2026

/ok to test 563762c

@yan-zaretskiy
Contributor

Hey Zachary, thank you for your contribution — the current state of the Rust bindings is indeed unsound for certain index types and this is a very important area of improvement. I've spent some time investigating the C++ internals to understand exactly which index types need lifetime enforcement, and here's what I found:

  • BruteForce always stores a non-owning view of the dataset
  • CAGRA behavior is runtime-dependent. make_aligned_dataset() in common.hpp stores a non_owning_dataset (just an mdspan view) when the input is device-accessible, row-major, and 16-byte aligned, and performs a copy otherwise. Hence at the type level we have no choice but to also enforce a lifetime.
  • IVF-Flat & IVF-PQ always copy the dataset, so they don't need to reference the original data.
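
The CAGRA point can be illustrated with a small Rust analogy. This is not the C++ code: `make_aligned_view` is a hypothetical function that models only the alignment condition (not device accessibility or row-major layout), and it uses `std::borrow::Cow` to stand in for "view or copy". The thing to notice is that the return type must carry `'a` even though some calls end up owning their data.

```rust
use std::borrow::Cow;

/// Borrows the input when it happens to be 16-byte aligned, copies otherwise.
/// Because the borrow is possible at runtime, the signature must carry `'a`.
pub fn make_aligned_view<'a>(data: &'a [f32]) -> Cow<'a, [f32]> {
    if data.as_ptr() as usize % 16 == 0 {
        Cow::Borrowed(data)
    } else {
        Cow::Owned(data.to_vec())
    }
}

/// Tries the four possible 4-byte offsets into `buf` and counts how many
/// of the resulting sub-slices were borrowed rather than copied.
pub fn count_borrowed(buf: &[f32]) -> usize {
    (0..4)
        .map(|offset| make_aligned_view(&buf[offset..]))
        .filter(|view| matches!(view, Cow::Borrowed(_)))
        .count()
}
```

Since `f32` data is 4-byte aligned, exactly one of four consecutive offsets lands on a 16-byte boundary, so the same buffer takes the borrowing path or the copying path depending on where the slice starts. A caller cannot know which one it got, which is why the lifetime has to be enforced unconditionally.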

Our Rust bindings should retain the same ownership semantics as the underlying library. For this reason I feel that I'd need to push back on the proposed design.

Besides, the way it is implemented in this PR is too fragile. ManagedTensor by itself does not indicate at the type level whether it actually owns the underlying dataset, even if it has a non-null deleter. So the Owned variant of DatasetOwnership<'a> does not express true ownership.

Instead, I propose to simplify this PR as follows:

  1. Revert IVF-Flat and IVF-PQ, since these always copy.
  2. Remove the build_owned method altogether.
  3. Use PhantomData<&'a ()> directly on the index structs and delete DatasetOwnership. For the borrowed path, the caller keeps the tensor alive and the compiler enforces it via 'a.
  4. deserialize should return Index<'static>, since deserialized indices own their data.
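
A minimal sketch of what the simplified design might look like, under stated assumptions: `ManagedTensor` is mocked as a `Vec<f32>` wrapper, the `rows` field stands in for the FFI handle, and the byte format used by `serialize`/`deserialize` is invented purely so the example runs. Only the `PhantomData<&'a ()>` marker and the `Index<'static>` return type come from the proposal.

```rust
use std::marker::PhantomData;

/// Stand-in for the real `ManagedTensor`.
pub struct ManagedTensor {
    data: Vec<f32>,
}

/// The marker ties the index to the dataset borrow without storing a reference.
pub struct Index<'a> {
    rows: usize, // placeholder for the FFI handle
    _dataset: PhantomData<&'a ()>,
}

impl<'a> Index<'a> {
    /// The `'a` on `dataset` flows into the returned `Index<'a>`.
    pub fn build(dataset: &'a ManagedTensor) -> Index<'a> {
        Index { rows: dataset.data.len(), _dataset: PhantomData }
    }

    /// Available on any index, borrowed or deserialized.
    pub fn serialize(&self) -> Vec<u8> {
        (self.rows as u64).to_le_bytes().to_vec()
    }
}

impl Index<'static> {
    /// A deserialized index owns its data, so it is not tied to any input.
    pub fn deserialize(bytes: &[u8]) -> Option<Index<'static>> {
        if bytes.len() != 8 {
            return None;
        }
        let mut raw = [0u8; 8];
        raw.copy_from_slice(bytes);
        Some(Index { rows: u64::from_le_bytes(raw) as usize, _dataset: PhantomData })
    }
}

/// Builds a borrowed index, serializes it, and restores it after the
/// original tensor has gone out of scope.
pub fn round_trip() -> usize {
    let bytes = {
        let tensor = ManagedTensor { data: vec![0.0; 3] };
        let index = Index::build(&tensor);
        let out = index.serialize();
        out
    }; // `tensor` and the borrowed index are gone here
    let restored: Index<'static> = Index::deserialize(&bytes).expect("8 bytes");
    restored.rows
}
```

The restored index outlives the tensor it was originally built from, which the borrow checker accepts only because `deserialize` returns `Index<'static>`.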

zbennett10 added a commit to zbennett10/cuvs that referenced this pull request Apr 14, 2026
Simplify the dataset-lifetime-safety work per review feedback on
rapidsai#1869:

- IVF-Flat and IVF-PQ always copy their input in the C++ library, so
  their Rust bindings do not need a lifetime parameter and have been
  reverted to the upstream signatures.
- CAGRA's `make_aligned_dataset()` stores a non-owning view when the
  input is device-accessible, row-major and 16-byte aligned, so the
  Rust `Index` now carries a lifetime tied to the `ManagedTensor` it
  was built from. The borrow checker then prevents use-after-free
  regardless of which runtime path the C++ side takes.
- Brute Force always stores a non-owning view and gets the same
  lifetime treatment.
- Dropped `DatasetOwnership` and `build_owned` — the `ManagedTensor`
  type itself doesn't express ownership in the type system, so the
  `Owned` variant was fragile. `Index<'a>` now holds a
  `PhantomData<&'a ()>` directly, which matches the semantics of the
  underlying C++ library exactly.
- Test suites for both index types have been reorganised so the
  dataset + self-neighbor search boilerplate is factored into shared
  helpers, matching the pattern being adopted in the CAGRA
  serialization PR.

This is still a breaking change for `cagra::Index` and
`brute_force::Index::build`, which now take `&ManagedTensor` instead of
`impl Into<ManagedTensor>`. IVF-Flat and IVF-PQ are unchanged.

Signed-off-by: Zach Bennett <zach@worldflowai.com>
@zbennett10 zbennett10 force-pushed the feat/rust-dataset-lifetime-safety branch from 563762c to 3c090d6 Compare April 14, 2026 13:01
@zbennett10
Contributor Author

zbennett10 commented Apr 14, 2026

Appreciate you @yan-zaretskiy — thanks a lot for the great feedback

The public API change is narrower now — just cagra::Index::build and brute_force::Index::build taking &ManagedTensor instead of impl Into<ManagedTensor>. IVF-Flat and IVF-PQ are completely untouched. Let me know what you think...

Comment thread rust/cuvs/src/cagra/index.rs Outdated
@yan-zaretskiy
Contributor

Looks good to me otherwise!

zbennett10 added a commit to zbennett10/cuvs that referenced this pull request Apr 15, 2026
@zbennett10 zbennett10 force-pushed the feat/rust-dataset-lifetime-safety branch from 3c090d6 to a6c5e46 Compare April 15, 2026 14:23
@yan-zaretskiy
Contributor

@zbennett10 Could you resolve the branch conflict so that we can merge this PR?

Simplify the dataset-lifetime-safety work per review feedback on
rapidsai#1869:

- IVF-Flat and IVF-PQ always copy their input in the C++ library, so
  their Rust bindings do not need a lifetime parameter and have been
  left on their upstream signatures.
- CAGRA's `make_aligned_dataset()` stores a non-owning view when the
  input is device-accessible, row-major and 16-byte aligned, so the
  Rust `Index` now carries a lifetime tied to the `ManagedTensor` it
  was built from. The borrow checker then prevents use-after-free
  regardless of which runtime path the C++ side takes.
- Brute Force always stores a non-owning view and gets the same
  lifetime treatment.
- `DatasetOwnership` and `build_owned` are dropped — the
  `ManagedTensor` type itself doesn't express ownership in the type
  system, so the `Owned` variant was fragile. `Index<'a>` now holds a
  `PhantomData<&'a ()>` directly, which matches the semantics of the
  underlying C++ library exactly.
- `cagra::Index::deserialize` now returns `Index<'static>`; the
  deserialized index owns its data and is not tied to any input
  `ManagedTensor`. `serialize` and `serialize_to_hnswlib` live on
  `impl<'a> Index<'a>` so they work against both borrowed and
  deserialized indices.
- `pub fn new()` is gone from both `cagra::Index` and
  `brute_force::Index`. An empty index has no real dataset lifetime
  and the builders call the private `create_handle` helper directly.
- Test suites for both index types have been reorganised so the
  dataset + self-neighbor search boilerplate is factored into shared
  helpers, including a negative test that confirms a path containing
  an interior NUL byte surfaces as `Error::InvalidArgument`.

Breaking change: `cagra::Index::build` and `brute_force::Index::build`
now take `&ManagedTensor` instead of `impl Into<ManagedTensor>`, and
the public `new()` constructors have been removed. IVF-Flat and IVF-PQ
are unchanged.

Signed-off-by: Zach Bennett <zach@worldflowai.com>
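
The interior-NUL negative test mentioned above relies on standard-library behavior: `CString::new` rejects input containing a zero byte. A hedged sketch of how a path could be converted before an FFI call follows. The `Error` enum here is a local mock (the commit message names `Error::InvalidArgument` but not its payload), and `path_to_cstring` is a hypothetical helper name.

```rust
use std::ffi::CString;

/// Local mock of the bindings' error type; the `String` payload is an assumption.
#[derive(Debug, PartialEq)]
pub enum Error {
    InvalidArgument(String),
}

/// Converts a path to a C string, reporting interior NUL bytes as an error
/// instead of silently truncating the path on the C side.
pub fn path_to_cstring(path: &str) -> Result<CString, Error> {
    CString::new(path).map_err(|e| {
        Error::InvalidArgument(format!(
            "path contains a NUL byte at position {}",
            e.nul_position()
        ))
    })
}
```

A serialize or deserialize wrapper would call a helper like this first and return early on `Err`, so the FFI function never sees a malformed filename.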
@zbennett10 zbennett10 force-pushed the feat/rust-dataset-lifetime-safety branch from a6c5e46 to defcc60 Compare April 15, 2026 18:52
@zbennett10
Contributor Author

@zbennett10 Could you resolve the branch conflict so that we can merge this PR?

@yan-zaretskiy 👍
