Add filtering for CAGRA to C API#452
Conversation
benfred
left a comment
There was a problem hiding this comment.
thanks for the PR! It'd be great to get this functionality into the cagra c-api
Co-authored-by: Micka <ide.mickael@gmail.com>
|
/merge |
|
/ok to test |
| cuvs::neighbors::cagra::search( | ||
| *res_ptr, search_params, *index_ptr, queries_mds, neighbors_mds, distances_mds); | ||
| } else if (filter.type == BITSET) { | ||
| using filter_mdspan_type = raft::device_vector_view<std::uint32_t, int64_t, raft::row_major>; |
There was a problem hiding this comment.
I think the build fails because of the std::uint32_t as the first type argument. The type arguments should be <index_t, index_t>, see raft bitset.hpp:
template <typename bitset_t = uint32_t, typename index_t = uint32_t>
struct bitset {
static constexpr index_t bitset_element_size = sizeof(bitset_t) * 8;
/**
* @brief Construct a new bitset object with a list of indices to unset.
*
* @param res RAFT resources
* @param mask_index List of indices to unset in the bitset
* @param bitset_len Length of the bitset
* @param default_value Default value to set the bits to. Default is true.
*/
bitset(const raft::resources& res,
raft::device_vector_view<const index_t, index_t> mask_index,
index_t bitset_len,
bool default_value = true);
There was a problem hiding this comment.
Hmm actually I missed this. The function you are using here is creating a bitset from a list of indices and I don't think it is the workflow that we expect.
The C++ function accepts a bitset_view, not a bitset, so at this point the memory for the bitset should already allocated and we just need to transfer the pointer and the length of the bitset. The C function should also assume that the filter given in input is a bitset already allocated and filled, instead of a list of neighbors to filter. So the filter taken as a parameter in this function should be manipulated as a bitset_view object.
| cuvs::neighbors::cagra::search( | ||
| *res_ptr, search_params, *index_ptr, queries_mds, neighbors_mds, distances_mds); | ||
| } else if (filter.type == BITSET) { | ||
| using filter_mdspan_type = raft::device_vector_view<std::uint32_t, int64_t, raft::row_major>; |
There was a problem hiding this comment.
Hmm actually I missed this. The function you are using here is creating a bitset from a list of indices and I don't think it is the workflow that we expect.
The C++ function accepts a bitset_view, not a bitset, so at this point the memory for the bitset should already allocated and we just need to transfer the pointer and the length of the bitset. The C function should also assume that the filter given in input is a bitset already allocated and filled, instead of a list of neighbors to filter. So the filter taken as a parameter in this function should be manipulated as a bitset_view object.
|
/ok to test |
|
/ok to test |
| DLManagedTensor* neighbors, | ||
| DLManagedTensor* distances); | ||
| DLManagedTensor* distances, | ||
| cuvsFilter filter); |
There was a problem hiding this comment.
This API change needs to be propagated to:
- the python package
- the example C project (
cuvs/example/c) - probably the rust package
| distances=None, | ||
| resources=None): | ||
| resources=None, | ||
| filter=None): |
There was a problem hiding this comment.
Add this parameter to the python documentation
|
|
||
| Parameters | ||
| ---------- | ||
| bitmap : numpy.ndarray |
There was a problem hiding this comment.
Update this docstring for bitset instead of bitmap
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/merge |
|
/ok to test |
|
/ok to test |
|
/ok to test |
|
/ok to test |
Adds the CAGRA filtering feature to the C API using DLPack Tensor as blocklist Authors: - Ajit Mistry (https://github.com/ajit283) - Ben Frederickson (https://github.com/benfred) - Micka (https://github.com/lowener) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Micka (https://github.com/lowener) - Ben Frederickson (https://github.com/benfred) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#452
Adds the CAGRA filtering feature to the C API using DLPack Tensor as blocklist