fix: schema isn't expected for IVF_PQ #3606
Signed-off-by: BubbleCal <bubble-cal@outlook.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3606      +/-   ##
==========================================
+ Coverage   78.67%   78.68%   +0.01%
==========================================
  Files         258      258
  Lines       96817    96890      +73
  Branches    96817    96890      +73
==========================================
+ Hits        76172    76242      +70
+ Misses      17578    17576       -2
- Partials     3067     3072       +5
westonpace left a comment
Good find. A few small questions because I'm not sure why we need extra code to generate an invalid state?
        .unwrap()
    }

    async fn create_pq_storage_with_extra_column() -> ProductQuantizationStorage {
Can we still get PQ storage with an extra column from a real workflow? Or is this just generating some kind of invalid input for testing?
It's just for testing; we shouldn't see any extra column in a real workflow.
    }

    #[tokio::test]
    async fn test_remap_with_extra_column() {
Is this because some old indices will have this extra column and we need to make sure they are supported?
Right, we got some feedback about this, so I added this test to make sure old indices work with this fix.
We now drop the `__ivf_part_id` column when shuffling. The corner case is `num_partitions=1`:

1. if `num_partitions=1`, no shuffling is needed
2. the shuffler reader returns the data directly
3. so `__ivf_part_id` is not dropped, and it is written into the index file as well

---------

Signed-off-by: BubbleCal <bubble-cal@outlook.com>
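The corner case above can be sketched with a toy model. This is only an illustration under stated assumptions, not Lance's actual code: `Batch`, `read_shuffled_buggy`, `read_shuffled_fixed`, and `without_part_id` are hypothetical names, a batch is reduced to its list of column names, and the actual fix in this PR may be implemented differently. Only `__ivf_part_id` and `num_partitions` come from the description.

```rust
// Hypothetical, simplified model of the corner case. None of these types or
// functions are Lance's real API; a "batch" is just its ordered column names.
const PART_ID_COLUMN: &str = "__ivf_part_id";

#[derive(Debug, Clone, PartialEq)]
struct Batch {
    columns: Vec<String>,
}

impl Batch {
    fn new(columns: &[&str]) -> Self {
        Self {
            columns: columns.iter().map(|c| c.to_string()).collect(),
        }
    }

    /// Return a copy without the partition-id helper column.
    /// This is a no-op when the column is already absent.
    fn without_part_id(&self) -> Self {
        Self {
            columns: self
                .columns
                .iter()
                .filter(|c| c.as_str() != PART_ID_COLUMN)
                .cloned()
                .collect(),
        }
    }
}

/// The behaviour described above: with a single partition the shuffle is
/// skipped and the data is returned directly, so the helper column survives.
fn read_shuffled_buggy(batch: &Batch, num_partitions: usize) -> Batch {
    if num_partitions == 1 {
        return batch.clone(); // fast path: column is NOT dropped
    }
    batch.without_part_id() // normal path: column dropped while shuffling
}

/// One possible way to close the gap (an assumption, not necessarily the
/// PR's approach): strip the helper column on every path.
fn read_shuffled_fixed(batch: &Batch, _num_partitions: usize) -> Batch {
    batch.without_part_id()
}
```

In this sketch, calling `read_shuffled_buggy` with `num_partitions = 1` returns the batch with `__ivf_part_id` still present, while `read_shuffled_fixed` removes it for any partition count. The other column names used when exercising it (for example `_rowid` and `__pq_code`) are illustrative.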