fix: support hamming distance in IndicesBuilder #6295
Xuanwo merged 1 commit into lance-format:main
IndicesBuilder rejected uint8 vector columns and didn't allow "hamming" as a distance type, even though the underlying Rust IVF training supports hamming via k-modes. This relaxes the Python-side validation to accept unsigned integer value types and adds "hamming" to the allowed distance types.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
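For readers unfamiliar with the metric, here is a minimal sketch of hamming distance over uint8 vectors, assuming each byte is treated as 8 packed bits and the distance is the number of differing bits. The helper name `hamming_distance` is hypothetical and is not part of Lance; the Rust implementation may differ in detail.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length uint8 vectors.

    Hypothetical helper for illustration only; not Lance's API.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # XOR leaves a 1 in every position where the two bytes disagree;
    # counting the 1s per byte and summing gives the total distance.
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Two 2-byte vectors: the first bytes differ in all 8 bits, the second bytes match.
left = bytes([0b11110000, 0b00000001])
right = bytes([0b00001111, 0b00000001])
distance = hamming_distance(left, right)
```

Because the result is an integer bit count rather than a float norm, it only makes sense for integer-typed vectors, which is why the value-type check had to be relaxed alongside the distance-type list.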
PR Review

Clean, focused change. The Rust layer already validates that uint8 vectors are only used with hamming distance.

One minor suggestion: add a comment where unsigned integer types are accepted, for example:

# Unsigned integer types (e.g. uint8) are accepted here for hamming distance;
# dtype/distance-type compatibility is validated downstream in train_ivf.

Otherwise LGTM: test coverage is good, docstrings are updated, and the change is minimal.
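The layering discussed in the review (a permissive front-end check, with the stricter dtype/distance pairing enforced downstream) can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: the function names, the string type names, and the contents of the allowed sets are all hypothetical.

```python
# Hypothetical sketch; names and allowed sets are assumptions, not Lance's code.
FLOAT_TYPES = {"float16", "float32", "float64"}
UNSIGNED_TYPES = {"uint8", "uint16", "uint32", "uint64"}
DISTANCE_TYPES = {"l2", "cosine", "dot", "hamming"}


def normalize_distance_type(distance_type: str) -> str:
    """Lowercase the distance type and reject anything not in the allowed set."""
    normalized = distance_type.lower()
    if normalized not in DISTANCE_TYPES:
        raise ValueError(f"unsupported distance type: {distance_type!r}")
    return normalized


def check_value_type(value_type: str) -> None:
    """Permissive front-end check: accept floats and unsigned integers."""
    if value_type not in FLOAT_TYPES | UNSIGNED_TYPES:
        raise TypeError(f"unsupported vector value type: {value_type!r}")


def check_compatibility(value_type: str, distance_type: str) -> None:
    """Stricter downstream check: uint8 vectors only pair with hamming."""
    if value_type == "uint8" and distance_type != "hamming":
        raise ValueError("uint8 vectors require the hamming distance type")


# uint8 + hamming passes both layers.
metric = normalize_distance_type("Hamming")
check_value_type("uint8")
check_compatibility("uint8", metric)
```

Keeping the front-end check permissive avoids duplicating the pairing rule in two places, at the cost of surfacing some invalid combinations later, during training.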
## Summary

- `IndicesBuilder` rejected uint8 vector columns and didn't include "hamming" in its allowed distance types, even though the underlying Rust `train_ivf_model` supports hamming via k-modes
- Relaxes `_normalize_column` to accept unsigned integer value types alongside floats
- Adds "hamming" to `_normalize_distance_type`'s allowed list
- Adds `test_ivf_centroids_hamming` test with uint8 vectors

## Test plan

- [ ] `test_ivf_centroids_hamming`: end-to-end IVF training with uint8 vectors and hamming distance
- [ ] Existing `test_ivf_centroids` tests still pass (float path unchanged)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>