The code in dataset/scanner.rs has become so complicated that it is hard to test. Before we make any improvements, we need to refactor it so that it is easier to test and extend.
In addition, outside codebases may wish to extend Lance's capabilities by modifying or composing plans. For example, in LanceDB, we'll want to add a separate WAL that needs to be queried during KNN queries and scans.
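To make the "modifying or composing plans" idea concrete, here is a minimal sketch of what an outside codebase might do if plans were exposed as plain data. Everything in it is hypothetical: the `Plan` enum, its variants, and `with_wal` are illustrative names, not existing Lance or LanceDB APIs, and a real design would more likely build on the existing execution plan types.

```rust
// Hypothetical, simplified plan model. None of these names are Lance APIs.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    /// Scan of the persisted dataset.
    DatasetScan { columns: Vec<String> },
    /// Scan of an external source such as a write-ahead log (WAL).
    ExternalScan { name: String, columns: Vec<String> },
    /// Concatenate the rows produced by all inputs.
    Union(Vec<Plan>),
    /// Keep the k nearest rows from the input (stand-in for a KNN node).
    TopK { k: usize, input: Box<Plan> },
}

/// Rewrites a plan so that every dataset scan also reads from the WAL.
/// Each `DatasetScan` becomes a `Union` of the original scan and an
/// `ExternalScan` over the same columns; other nodes are rebuilt around
/// their rewritten inputs.
fn with_wal(plan: Plan, wal_name: &str) -> Plan {
    match plan {
        Plan::DatasetScan { columns } => Plan::Union(vec![
            Plan::DatasetScan { columns: columns.clone() },
            Plan::ExternalScan { name: wal_name.to_string(), columns },
        ]),
        Plan::Union(inputs) => Plan::Union(
            inputs.into_iter().map(|p| with_wal(p, wal_name)).collect(),
        ),
        Plan::TopK { k, input } => Plan::TopK {
            k,
            input: Box::new(with_wal(*input, wal_name)),
        },
        // External scans are left untouched.
        other @ Plan::ExternalScan { .. } => other,
    }
}
```

With a shape like this, a KNN plan (`TopK` over a `DatasetScan`) would be rewritten so the top-k step sees rows from both the dataset and the WAL, without the scanner itself knowing about the WAL.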
Tasks
Scanner::create_plan() in terms of that.
_rowid and _distance less awkward to deal with -- so that select([]).with_row_id() is just select("_rowid"). We should reserve that column name.
_distance unless select * or select _distance
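One possible reading of the _rowid and _distance tasks, sketched as a standalone projection resolver. This is an assumption-heavy illustration: `Select`, `output_columns`, and the `is_knn` flag are hypothetical names, and the exact rules (for example, that `select *` on a KNN query includes _distance but not _rowid) are guesses at the intent rather than settled behavior.

```rust
/// Hypothetical projection request; not an existing Lance type.
#[derive(Debug, Clone, PartialEq)]
enum Select {
    /// `select *`
    All,
    /// An explicit column list.
    Columns(Vec<String>),
}

const ROW_ID: &str = "_rowid";
const DISTANCE: &str = "_distance";

/// Resolves the output columns for a query.
/// Assumed rules:
/// - `_rowid` is a reserved name, so user schemas may not contain it.
/// - `_rowid` is returned only when selected by name.
/// - `_distance` is returned only for `select *` on a KNN query, or when
///   selected by name on a KNN query.
fn output_columns(
    select: &Select,
    schema: &[&str],
    is_knn: bool,
) -> Result<Vec<String>, String> {
    if schema.iter().any(|c| *c == ROW_ID) {
        return Err(format!("column name {} is reserved", ROW_ID));
    }
    match select {
        Select::All => {
            let mut out: Vec<String> = schema.iter().map(|c| c.to_string()).collect();
            if is_knn {
                out.push(DISTANCE.to_string());
            }
            Ok(out)
        }
        Select::Columns(cols) => {
            for c in cols {
                let known = schema.iter().any(|s| *s == c.as_str())
                    || c.as_str() == ROW_ID
                    || (c.as_str() == DISTANCE && is_knn);
                if !known {
                    return Err(format!("unknown column: {}", c));
                }
            }
            Ok(cols.clone())
        }
    }
}
```

Under these assumed rules, selecting only "_rowid" plays the role of select([]).with_row_id(), and a KNN query that selects only "id" would not carry _distance in its output.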