@GuyAv46
Collaborator

@GuyAv46 GuyAv46 commented Apr 9, 2023

Describe the changes in the pull request

Implementation of the batch-iterator of the tiered HNSW.

This PR also adds the new BY_SCORE_THEN_ID order. In most cases we already return results in this order (the standard priority queue and our updatable heap guarantee it). In the BF batch iterator, however, this may not hold, depending on the heuristic we choose.
The new order may still be redundant; we could instead make the BY_SCORE order implicitly handle ties by comparing the labels.
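
To make the intended ordering concrete, here is a minimal sketch of a score-then-id comparison. The `Result` struct, its field names, and the helper functions are hypothetical and only illustrate the tie-breaking rule; they are not the library's actual types or API.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical result record (illustrative names, not the library's types).
struct Result {
    std::size_t id; // label of the vector
    double score;   // distance to the query, lower is better
};

// Order by ascending score; break ties by ascending id.
inline bool byScoreThenId(const Result &a, const Result &b) {
    if (a.score != b.score) {
        return a.score < b.score;
    }
    return a.id < b.id;
}

// Return a copy of the results sorted by (score, id).
inline std::vector<Result> sortedByScoreThenId(std::vector<Result> results) {
    std::sort(results.begin(), results.end(), byScoreThenId);
    return results;
}
```

With this comparison, results that share a score always come back in the same order regardless of how they were collected, which is what a score-only order does not promise.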

Main objects this PR modified

  1. Implemented the batch iterator
  2. Added the BY_SCORE_THEN_ID order
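
As a rough illustration of why a deterministic order matters for a batch iterator over two tiers, the sketch below merges two batches that are each already sorted by (score, id) and skips labels that were already returned. This is an assumption about how such a merge could look, not the code in this PR; `Result`, `mergeBatches`, and the `returned` set are hypothetical names.

```cpp
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical result record (illustrative names, not the library's types).
struct Result {
    std::size_t id; // label of the vector
    double score;   // distance to the query, lower is better
};

// Order by ascending score; break ties by ascending id.
inline bool byScoreThenId(const Result &a, const Result &b) {
    if (a.score != b.score) {
        return a.score < b.score;
    }
    return a.id < b.id;
}

// Merge two batches, each sorted by (score, id), into at most `limit` results.
// Labels present in `returned` are skipped, and every emitted label is added
// to it, so a label found in both tiers is reported only once.
inline std::vector<Result> mergeBatches(const std::vector<Result> &flat,
                                        const std::vector<Result> &hnsw,
                                        std::size_t limit,
                                        std::unordered_set<std::size_t> &returned) {
    std::vector<Result> merged;
    std::size_t i = 0, j = 0;
    while (merged.size() < limit && (i < flat.size() || j < hnsw.size())) {
        // Take from `flat` when `hnsw` is exhausted or its head is not smaller.
        const bool takeFlat =
            j == hnsw.size() || (i < flat.size() && !byScoreThenId(hnsw[j], flat[i]));
        const Result &next = takeFlat ? flat[i++] : hnsw[j++];
        if (returned.insert(next.id).second) {
            merged.push_back(next);
        }
    }
    return merged;
}
```

Because both inputs use the same (score, id) order, a label that appears in both tiers with the same score meets itself at the heads of the two lists, and the `returned` set drops the second copy.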

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

@GuyAv46 GuyAv46 requested a review from alonre24 April 9, 2023 16:27
@codecov

codecov bot commented Apr 9, 2023

Codecov Report

❗ No coverage uploaded for pull request base (feature_HNSW_tiered_index@14bd263).
Patch has no changes to coverable lines.

Additional details and impacted files
@@                     Coverage Diff                      @@
##             feature_HNSW_tiered_index     #350   +/-   ##
============================================================
  Coverage                             ?   96.77%           
============================================================
  Files                                ?       66           
  Lines                                ?     4527           
  Branches                             ?        0           
============================================================
  Hits                                 ?     4381           
  Misses                               ?      146           
  Partials                             ?        0           


@GuyAv46 GuyAv46 marked this pull request as ready for review April 16, 2023 16:02
@GuyAv46 GuyAv46 changed the title Guyav tiered hnsw batch iterator Tiered HNSW - batch iterator [MOD-4323] Apr 16, 2023
@GuyAv46 GuyAv46 force-pushed the guyav-tiered_hnsw_batch_iterator branch from 10e3342 to fe17545 Compare April 18, 2023 09:03
@GuyAv46 GuyAv46 force-pushed the guyav-tiered_hnsw_batch_iterator branch from d403961 to 264212c Compare April 20, 2023 16:38
@GuyAv46 GuyAv46 requested a review from DvirDukhan April 20, 2023 16:39
@GuyAv46 GuyAv46 merged commit 3a8ac2d into feature_HNSW_tiered_index Apr 24, 2023
@GuyAv46 GuyAv46 deleted the guyav-tiered_hnsw_batch_iterator branch April 24, 2023 15:17
meiravgri pushed a commit that referenced this pull request May 8, 2023
* small modification to bf batch iterator

* remove promise of perfect score in HNSW multi batch

* implement batch iterator for tiered and some needed helpers

* fix for merge results

* make the iterator a nested class, fix and modify logic

* added first unit test

* some fixes and more tests

* another test

* first overlapping vector tests

* fix a bug on reallocation

* added an edge cases test

* added comments

* added `BY_SCORE_THEN_ID` order and sorter

* make BF batch iterator use it in select-base search

* modification to the BI to handle resize while alive, and use BY_SCORE_THEN_ID

* added dynamic parallel test

* move iterator from generic vec_sim_tiered to hnsw_tiered

* leak fixes

* fix clang build

* minor test refactor

* review fixes

* decrease index size

* move some array logic to arr_cpp.h

* after rebase fixes

* review fixes

4 participants