[Rust] Add validated IndexParamsBuilder / SearchParamsBuilder (fixes #1859) #1861
zbennett10 wants to merge 2 commits into rapidsai:main
Add `IndexParams::builder()` / `SearchParams::builder()` entry points
to the CAGRA, IVF-PQ, IVF-Flat, and Vamana Rust bindings.
## Problem
The existing `IndexParams::new()?.set_graph_degree(0)` setter chain
accepts invalid values silently. Errors surface as an opaque CUDA
assertion inside `Index::build()` — up to 1.8s after the bad config
was written, after GPU memory is already allocated.
## Solution
Each `XxxParamsBuilder`:
- Stores parameters as Rust-native values (no FFI struct allocated yet)
- Exposes `validate() -> Result<()>`: pure-Rust constraint checks,
no GPU work; callable before any device is present
- Exposes `build() -> Result<XxxParams>`: calls `validate()`, then
allocates the FFI struct via the existing `new()?.set_*()` chain
- Fully additive — the existing `new()` + setter API is unchanged
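
The validate-then-build split described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual code: `ValidationError` and `MockIndexParams` are hypothetical stand-ins for the crate's `Error` type and the FFI-backed `IndexParams`, and the builder takes its starting values from the caller so that no library defaults have to be assumed.

```rust
use std::fmt;

/// Hypothetical error type standing in for the crate's `Error`.
#[derive(Debug, PartialEq)]
pub struct ValidationError(pub String);

impl fmt::Display for ValidationError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "invalid parameter: {}", self.0)
    }
}

impl std::error::Error for ValidationError {}

/// Hypothetical stand-in for the FFI-backed params; the real type wraps a C handle.
#[derive(Debug, PartialEq)]
pub struct MockIndexParams {
    pub graph_degree: usize,
    pub intermediate_graph_degree: usize,
    pub nn_descent_niter: usize,
}

/// Holds plain Rust values; nothing is allocated until `build()`.
#[derive(Debug, Clone)]
pub struct IndexParamsBuilder {
    graph_degree: usize,
    intermediate_graph_degree: usize,
    nn_descent_niter: usize,
}

impl IndexParamsBuilder {
    /// Starting values come from the caller (an assumption made for this sketch).
    pub fn new(
        graph_degree: usize,
        intermediate_graph_degree: usize,
        nn_descent_niter: usize,
    ) -> Self {
        Self { graph_degree, intermediate_graph_degree, nn_descent_niter }
    }

    /// Overrides `graph_degree`, returning the builder for chaining.
    pub fn graph_degree(mut self, v: usize) -> Self {
        self.graph_degree = v;
        self
    }

    /// Pure-Rust checks mirroring the CAGRA `IndexParams` rules; no GPU work.
    pub fn validate(&self) -> Result<(), ValidationError> {
        if self.graph_degree == 0 {
            return Err(ValidationError("graph_degree must be > 0".to_string()));
        }
        if self.intermediate_graph_degree < self.graph_degree {
            return Err(ValidationError(format!(
                "intermediate_graph_degree ({}) must be >= graph_degree ({})",
                self.intermediate_graph_degree, self.graph_degree
            )));
        }
        if self.nn_descent_niter == 0 {
            return Err(ValidationError("nn_descent_niter must be > 0".to_string()));
        }
        Ok(())
    }

    /// Validates first, then constructs the params (mocked here, FFI in the real crate).
    pub fn build(&self) -> Result<MockIndexParams, ValidationError> {
        self.validate()?;
        Ok(MockIndexParams {
            graph_degree: self.graph_degree,
            intermediate_graph_degree: self.intermediate_graph_degree,
            nn_descent_niter: self.nn_descent_niter,
        })
    }
}
```

Because `validate()` touches only Rust values, a call such as `IndexParamsBuilder::new(32, 64, 20).graph_degree(0).validate()` fails on a machine with no CUDA device at all.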
Validation rules enforced:
- CAGRA `IndexParams`: graph_degree > 0, intermediate_graph_degree >=
graph_degree, nn_descent_niter > 0
- CAGRA `SearchParams`: itopk_size is a power of 2 (or 0 for auto),
team_size ∈ {0,4,8,16,32}, hashmap_max_fill_rate ∈ (0.1, 0.9)
- IVF-PQ / IVF-Flat `IndexParams`: n_lists > 0,
kmeans_trainset_fraction ∈ (0, 1]
- Vamana `IndexParams`: graph_degree > 0, visited_size >= graph_degree,
alpha > 0
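
As a rough illustration of the set and interval rules listed above, each one reduces to a small predicate. The function names below are hypothetical and are not part of the bindings; the intervals follow the notation in the list, so `(0.1, 0.9)` is open on both ends and `(0, 1]` includes 1.

```rust
/// `itopk_size`: 0 means "auto"; otherwise it must be a power of two.
pub fn itopk_size_ok(v: usize) -> bool {
    v == 0 || v.is_power_of_two()
}

/// `team_size`: must be one of {0, 4, 8, 16, 32}.
pub fn team_size_ok(v: usize) -> bool {
    matches!(v, 0 | 4 | 8 | 16 | 32)
}

/// `hashmap_max_fill_rate`: strictly inside (0.1, 0.9). NaN fails both comparisons.
pub fn hashmap_max_fill_rate_ok(v: f32) -> bool {
    v > 0.1 && v < 0.9
}

/// `kmeans_trainset_fraction`: inside (0, 1], so 1.0 is accepted and 0.0 is not.
pub fn kmeans_trainset_fraction_ok(v: f64) -> bool {
    v > 0.0 && v <= 1.0
}
```

Writing the float checks as positive comparisons means a NaN input is rejected without a separate `is_nan()` test.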
## Changes
- `src/error.rs`: add `impl From<String> for Error` for ergonomic
validation errors (private-field CuvsError constructed in-module)
- `src/cagra/index_params.rs`: `IndexParamsBuilder` + `IndexParams::builder()`
- `src/cagra/search_params.rs`: `SearchParamsBuilder` + `SearchParams::builder()`
- `src/cagra/mod.rs`: re-export `IndexParamsBuilder`, `SearchParamsBuilder`
- `examples/cagra.rs`: updated to demonstrate builder API
- `src/ivf_pq/index_params.rs`: `IndexParamsBuilder`
- `src/ivf_flat/index_params.rs`: `IndexParamsBuilder`
- `src/vamana/index_params.rs`: `IndexParamsBuilder`
- `src/{ivf_pq,ivf_flat,vamana}/mod.rs`: re-export `IndexParamsBuilder`
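
To show why a `From<String>` impl makes validation errors ergonomic, here is a small sketch. `BindingError` and `check_n_lists` are hypothetical names used only for illustration; the crate's real `Error` type and its private fields are not reproduced here.

```rust
use std::fmt;

/// Hypothetical error type with a private field, constructed only in-module.
#[derive(Debug)]
pub struct BindingError {
    message: String,
}

impl From<String> for BindingError {
    fn from(message: String) -> Self {
        BindingError { message }
    }
}

impl fmt::Display for BindingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.message)
    }
}

impl std::error::Error for BindingError {}

/// Hypothetical check for the IVF `n_lists > 0` rule.
pub fn check_n_lists(n_lists: u32) -> Result<(), BindingError> {
    if n_lists == 0 {
        // `.into()` converts the formatted String through the `From` impl.
        return Err(format!("n_lists must be > 0, got {}", n_lists).into());
    }
    Ok(())
}
```

With the conversion in place, a validation function can return a formatted message with `.into()` instead of constructing the error struct by hand.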
## Tests added (all 6 PR gate tests from issue spec)
- `builder_rejects_zero_graph_degree` (validate only, no GPU)
- `builder_rejects_invalid_intermediate_degree` (validate only)
- `builder_rejects_zero_niter` (validate only)
- `builder_accepts_valid_params` (validate only)
- `builder_round_trips_to_ffi` (requires GPU — compares builder output
to manual setter chain at FFI struct level)
- `existing_setter_api_unchanged` (requires GPU — confirms no regression)
a6f6a50 to 46b5c40

/ok to test 46b5c40
Cherry-picked from upstream PR rapidsai#1861 (fixes rapidsai#1859). Adds builder pattern with validation for CAGRA, IVF-Flat, IVF-PQ, and Vamana index/search params. Invalid parameters now return errors instead of silently causing GPU crashes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hey Zachary! Thanks for the thorough work here. I agree that the current bindings lack Rust-side validation that we can report as proper Rust errors before they cross the FFI boundary. I'd like to steer the implementation in a slightly different direction. We've been prototyping the next iteration of these Rust bindings, and we've landed on a pattern that avoids some of the pitfalls common with hand-rolled builders. I think we can get there without too much rework.

One of the problems with your proposal is the duplication of defaults between Rust and C. Each builder has a manual `impl Default`:

```rust
impl Default for IndexParamsBuilder {
    fn default() -> Self {
        Self {
            graph_degree: 64,               // must match C
            intermediate_graph_degree: 128, // must match C
            // ...
        }
    }
}
```

To avoid this, we should allocate the C params handle first, which gives us all the defaults, and then override only the explicitly set fields. To avoid writing builders manually, I suggest bringing in `bon`:

```rust
use bon::bon;

#[bon]
impl IndexParams {
    #[builder]
    pub fn new(
        graph_degree: Option<usize>,
        intermediate_graph_degree: Option<usize>,
        nn_descent_niter: Option<usize>,
        build_algo: Option<BuildAlgo>,
        compression: Option<CompressionParams>,
    ) -> Result<Self> {
        // 1. Validate (only the fields that were set)
        if let Some(gd) = graph_degree {
            if gd == 0 {
                return Err(/* ... */);
            }
        }
        // Cross-field validation where both are set, etc.

        // 2. Allocate C params (gets exact library defaults)
        let params = Self::allocate_ffi()?;

        // 3. Override only what the caller specified
        unsafe {
            if let Some(v) = graph_degree {
                (*params.0).graph_degree = v;
            }
            // ...
        }
        Ok(params)
    }
}
```

With this approach, calling

Next, the power-of-2 check on

And it rounds up the provided value to a multiple of 32.

And the final suggestion is about error handling. Rust-side validation errors should be distinguishable from C library errors. Right now the
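
One way the distinction between Rust-side validation errors and C library errors might look is an enum with a separate variant for each source. This is only a sketch under assumptions: `BindingError`, its variants, the integer status code, and `check_graph_degree` are all hypothetical and are not taken from the bindings or from the review.

```rust
use std::fmt;

/// Hypothetical error enum separating the two failure sources.
#[derive(Debug, PartialEq)]
pub enum BindingError {
    /// Raised by Rust-side checks before any FFI call is made.
    InvalidParameter { field: &'static str, reason: String },
    /// Raised when the C library reports a failure; `code` is a placeholder.
    Library { code: i32 },
}

impl fmt::Display for BindingError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            BindingError::InvalidParameter { field, reason } => {
                write!(f, "invalid parameter `{}`: {}", field, reason)
            }
            BindingError::Library { code } => {
                write!(f, "cuVS library error (code {})", code)
            }
        }
    }
}

impl std::error::Error for BindingError {}

/// Hypothetical Rust-side check that reports the offending field by name.
pub fn check_graph_degree(v: usize) -> Result<(), BindingError> {
    if v == 0 {
        return Err(BindingError::InvalidParameter {
            field: "graph_degree",
            reason: "must be > 0".to_string(),
        });
    }
    Ok(())
}
```

Callers can then match on the variant to tell a bad configuration from a device-side failure, without parsing message strings.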
## Summary
Closes #1859.

Adds `IndexParams::builder()` / `SearchParams::builder()` entry points to the
CAGRA, IVF-PQ, IVF-Flat, and Vamana Rust bindings. Invalid parameter values
(e.g. `graph_degree=0`) now produce a clear Rust error naming the offending
field and its valid range, before any FFI allocation or GPU work begins.
## Problem
The existing setter chain is silent about invalid values.
## Solution
The builder provides two methods:
- `validate() -> Result<()>`: pure-Rust constraint checks, no GPU needed
- `build() -> Result<IndexParams>`: validates, then allocates the FFI struct

The existing `IndexParams::new()?.set_*()` API is completely unchanged.
## Changes
- `src/error.rs`: `impl From<String> for Error` for validation errors
- `src/cagra/index_params.rs`: `IndexParamsBuilder` + `IndexParams::builder()`
- `src/cagra/search_params.rs`: `SearchParamsBuilder` + `SearchParams::builder()`
- `src/cagra/mod.rs`
- `examples/cagra.rs`
- `src/ivf_pq/index_params.rs`: `IndexParamsBuilder`
- `src/ivf_flat/index_params.rs`: `IndexParamsBuilder`
- `src/vamana/index_params.rs`: `IndexParamsBuilder`
- `src/{ivf_pq,ivf_flat,vamana}/mod.rs`: `IndexParamsBuilder`
## Validation rules enforced
- CAGRA `IndexParams`: `graph_degree > 0`, `intermediate_graph_degree >= graph_degree`, `nn_descent_niter > 0`
- CAGRA `SearchParams`: `itopk_size` is a power of 2 (or 0 for auto), `team_size` ∈ {0, 4, 8, 16, 32}, `hashmap_max_fill_rate` ∈ (0.1, 0.9)
- IVF-PQ / IVF-Flat `IndexParams`: `n_lists > 0`, `kmeans_trainset_fraction` ∈ (0, 1]
- Vamana `IndexParams`: `graph_degree > 0`, `visited_size >= graph_degree`, `alpha > 0`
## Tests
All 6 specified PR gate tests are implemented. Tests 1–4 use `validate()` only
(no GPU required). Tests 5–6 call `build()` and require a CUDA device (will
run on RAPIDS CI):
- `builder_rejects_zero_graph_degree` (validate only)
- `builder_rejects_invalid_intermediate_degree` (validate only)
- `builder_rejects_zero_niter` (validate only)
- `builder_accepts_valid_params` (validate only)
- `builder_round_trips_to_ffi` (requires GPU; compares builder vs. setter chain at FFI level)
- `existing_setter_api_unchanged` (requires GPU; confirms no regression)

All 32 validation-logic tests verified locally via a standalone mock-FFI
test harness (pure Rust, no CUDA needed).
## Checklist
- `IndexParams::new()` + setter API is unchanged
- `validate()` tests pass without a GPU (pure Rust logic)
- `build()` / FFI round-trip tests require CUDA (RAPIDS CI)
- `cargo +nightly fmt --check` clean