Skip to content

[Rust] Add validated IndexParamsBuilder / SearchParamsBuilder (fixes #1859)#1861

Open
zbennett10 wants to merge 2 commits intorapidsai:mainfrom
zbennett10:feat/rust-validated-index-params-builder
Open

[Rust] Add validated IndexParamsBuilder / SearchParamsBuilder (fixes #1859)#1861
zbennett10 wants to merge 2 commits intorapidsai:mainfrom
zbennett10:feat/rust-validated-index-params-builder

Conversation

@zbennett10
Copy link
Copy Markdown
Contributor

Summary

Closes #1859.

Adds IndexParams::builder() / SearchParams::builder() entry points to the
CAGRA, IVF-PQ, IVF-Flat, and Vamana Rust bindings. Invalid parameter values
(e.g. graph_degree=0) now produce a clear Rust error naming the offending
field and its valid range, before any FFI allocation or GPU work begins.

Problem

The existing setter chain is silent about invalid values:

// Compiles fine. 1.823s later, inside Index::build():
// cuvsError { code: CUVS_ERROR, text: "raft::exception: assertion failed" }
let params = IndexParams::new()?
    .set_graph_degree(0)
    .set_intermediate_graph_degree(0)
    .set_nn_descent_niter(0);
let _index = Index::build(&res, &params, &dataset)?; // ← error here, no field name

Solution

// Error fires immediately, before any GPU work:
// Err("graph_degree must be > 0; got 0")
let params = IndexParams::builder()
    .graph_degree(32)
    .intermediate_graph_degree(64)
    .nn_descent_niter(20)
    .build()?;

The builder provides two methods:

  • validate() -> Result<()> — pure-Rust constraint checks, no GPU needed
  • build() -> Result<IndexParams> — validates then allocates the FFI struct

The existing IndexParams::new()?.set_*() API is completely unchanged.

Changes

File Change
src/error.rs Add impl From<String> for Error for validation errors
src/cagra/index_params.rs IndexParamsBuilder + IndexParams::builder()
src/cagra/search_params.rs SearchParamsBuilder + SearchParams::builder()
src/cagra/mod.rs Re-export builder types
examples/cagra.rs Demonstrate builder API
src/ivf_pq/index_params.rs IndexParamsBuilder
src/ivf_flat/index_params.rs IndexParamsBuilder
src/vamana/index_params.rs IndexParamsBuilder
src/{ivf_pq,ivf_flat,vamana}/mod.rs Re-export IndexParamsBuilder

Validation rules enforced

CAGRA IndexParams:

  • graph_degree > 0
  • intermediate_graph_degree >= graph_degree
  • nn_descent_niter > 0

CAGRA SearchParams:

  • itopk_size is a power of 2 (or 0 for auto)
  • team_size ∈ {0, 4, 8, 16, 32}
  • hashmap_max_fill_rate ∈ (0.1, 0.9)

IVF-PQ / IVF-Flat IndexParams:

  • n_lists > 0
  • kmeans_trainset_fraction ∈ (0, 1]

Vamana IndexParams:

  • graph_degree > 0
  • visited_size >= graph_degree
  • alpha > 0

Tests

All 6 specified PR gate tests are implemented. Tests 1–4 use validate() only
(no GPU required). Tests 5–6 call build() and require a CUDA device (will
run on RAPIDS CI):

  1. builder_rejects_zero_graph_degree — validate only
  2. builder_rejects_invalid_intermediate_degree — validate only
  3. builder_rejects_zero_niter — validate only
  4. builder_accepts_valid_params — validate only
  5. builder_round_trips_to_ffi — requires GPU; compares builder vs. setter chain at FFI level
  6. existing_setter_api_unchanged — requires GPU; confirms no regression

All 32 validation-logic tests verified locally via a standalone mock-FFI
test harness (pure Rust, no CUDA needed).

Checklist

  • Existing IndexParams::new() + setter API is unchanged
  • validate() tests pass without a GPU (pure Rust logic)
  • build() / FFI round-trip tests require CUDA (RAPIDS CI)
  • cargo +nightly fmt --check clean
  • No breaking changes

@copy-pr-bot
Copy link
Copy Markdown

copy-pr-bot bot commented Mar 1, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Add `IndexParams::builder()` / `SearchParams::builder()` entry points
to the CAGRA, IVF-PQ, IVF-Flat, and Vamana Rust bindings.

## Problem

The existing `IndexParams::new()?.set_graph_degree(0)` setter chain
accepts invalid values silently. Errors surface as an opaque CUDA
assertion inside `Index::build()` — up to 1.8s after the bad config
was written, after GPU memory is already allocated.

## Solution

Each `XxxParamsBuilder`:
- Stores parameters as Rust-native values (no FFI struct allocated yet)
- Exposes `validate() -> Result<()>`: pure-Rust constraint checks,
  no GPU work; callable before any device is present
- Exposes `build() -> Result<XxxParams>`: calls `validate()`, then
  allocates the FFI struct via the existing `new()?.set_*()` chain
- Fully additive — the existing `new()` + setter API is unchanged

Validation rules enforced:
- CAGRA `IndexParams`: graph_degree > 0, intermediate_graph_degree >=
  graph_degree, nn_descent_niter > 0
- CAGRA `SearchParams`: itopk_size is a power of 2 (or 0 for auto),
  team_size ∈ {0,4,8,16,32}, hashmap_max_fill_rate ∈ (0.1, 0.9)
- IVF-PQ / IVF-Flat `IndexParams`: n_lists > 0,
  kmeans_trainset_fraction ∈ (0, 1]
- Vamana `IndexParams`: graph_degree > 0, visited_size >= graph_degree,
  alpha > 0

## Changes

- `src/error.rs`: add `impl From<String> for Error` for ergonomic
  validation errors (private-field CuvsError constructed in-module)
- `src/cagra/index_params.rs`: `IndexParamsBuilder` + `IndexParams::builder()`
- `src/cagra/search_params.rs`: `SearchParamsBuilder` + `SearchParams::builder()`
- `src/cagra/mod.rs`: re-export `IndexParamsBuilder`, `SearchParamsBuilder`
- `examples/cagra.rs`: updated to demonstrate builder API
- `src/ivf_pq/index_params.rs`: `IndexParamsBuilder`
- `src/ivf_flat/index_params.rs`: `IndexParamsBuilder`
- `src/vamana/index_params.rs`: `IndexParamsBuilder`
- `src/{ivf_pq,ivf_flat,vamana}/mod.rs`: re-export `IndexParamsBuilder`

## Tests added (all 6 PR gate tests from issue spec)

- `builder_rejects_zero_graph_degree` (validate only, no GPU)
- `builder_rejects_invalid_intermediate_degree` (validate only)
- `builder_rejects_zero_niter` (validate only)
- `builder_accepts_valid_params` (validate only)
- `builder_round_trips_to_ffi` (requires GPU — compares builder output
  to manual setter chain at FFI struct level)
- `existing_setter_api_unchanged` (requires GPU — confirms no regression)
@zbennett10 zbennett10 force-pushed the feat/rust-validated-index-params-builder branch from a6f6a50 to 46b5c40 Compare March 4, 2026 14:02
@aamijar
Copy link
Copy Markdown
Member

aamijar commented Mar 4, 2026

/ok to test 46b5c40

@aamijar aamijar removed request for a team March 4, 2026 21:45
@aamijar aamijar added non-breaking Introduces a non-breaking change Rust improvement Improves an existing functionality labels Mar 4, 2026
Ludu-nuvai added a commit to Nuvai/cuvs that referenced this pull request Mar 23, 2026
Cherry-picked from upstream PR rapidsai#1861 (fixes rapidsai#1859).
Adds builder pattern with validation for CAGRA, IVF-Flat, IVF-PQ,
and Vamana index/search params. Invalid parameters now return errors
instead of silently causing GPU crashes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@yan-zaretskiy
Copy link
Copy Markdown
Contributor

Hey Zachary! Thanks for the thorough work here. I agree that the current bindings lack Rust-side validation that we can report as proper Rust errors before they cross the FFI boundary.

I'd like to steer the implementation is a slightly different direction. We've been prototyping the next iteration of these Rust bindings, and we've landed on a pattern that avoids some of the pitfalls common with the hand-rolled builders. I think we can get there without too much rework.

One of the problems with your proposal is the default duplication between Rust and C. Each build has a manual Default impl like so:

impl Default for IndexParamsBuilder {                                                                                                                
    fn default() -> Self {                                
        Self {
            graph_degree: 64,               // must match C
            intermediate_graph_degree: 128, // must match C                                                                                          
            ...
        }                                                                                                                                            
    }                                                     
}

To avoid this, we should allocate the C params handle first, which gives us all the defaults, and then override explicitly set fields.

To avoid writing builders manually, I suggest to bring in the bon crate, and use its #[builder] macro on the new function, like so:

use bon::bon;                                             

#[bon]                                                                                                                                               
impl IndexParams {
    #[builder]                                                                                                                                       
    pub fn new(                                           
        graph_degree: Option<usize>,
        intermediate_graph_degree: Option<usize>,
        nn_descent_niter: Option<usize>,                                                                                                             
        build_algo: Option<BuildAlgo>,
        compression: Option<CompressionParams>,                                                                                                      
    ) -> Result<Self> {                                   
        // 1. Validate (only the fields that were set)                                                                                               
        if let Some(gd) = graph_degree {                                                                                                             
            if gd == 0 {                                                                                                                             
                return Err(/* ... */);                                                                                                               
            }                                                                                                                                        
        }                                                 
        // Cross-field validation where both are set, etc.
                                                                                                                                                     
        // 2. Allocate C params (gets exact library defaults)                                                                                     
        let params = Self::allocate_ffi()?;                                                                                                          
                                                                                                                                                     
        // 3. Override only what the caller specified     
        unsafe {                                                                                                                                     
            if let Some(v) = graph_degree {               
                (*params.0).graph_degree = v;
            }                                                                                                                                        
            // ...
        }                                                                                                                                            
        Ok(params)                                        
    }
}

With this approach, calling IndexParams::builder().graph_degree(32).build() just works, with proper validation defined in one place.

Next, the power-of-2 check on itopk_size is too strict. The C++ logic in search_plan.cuh only requires:

  • >= k (the number of neighbors requested)
  • <= 1024 for the SINGLE_CTA algorithm

And it rounds up the provided value to a multiple of 32.

And the final suggestion is about error handling. Rust-side validation errors should be distinguishable from C library errors. Right now the From<String> for Error impl wraps them as CuvsError. A dedicated variant (e.g. Error::Validation(String)) would be cleaner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

improvement Improves an existing functionality non-breaking Introduces a non-breaking change Rust

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

[Rust] IndexParams/SearchParams setters silently accept invalid values — add validated builder

4 participants