Open · No due date · Last updated Apr 8, 2026
We are migrating the merge insert implementation to make it more efficient, particularly in memory use. As part of the refactor, we are replacing the implementation that manually manipulates streams with one that uses DataFusion to generate and optimize the whole plan.
PRD: Retire v1 Code Path
Three categories of operations still fall back to v1:
- `WhenMatched::DoNothing` (find-or-create pattern)
- Partial schema upserts (source has a subset of the target columns)
- Upserts when a scalar index exists on the join key
Goal: Migrate all three to v2, then delete v1. This is a correctness/parity migration, not a performance optimization pass.
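To make the three fallback categories concrete, here is a minimal sketch of a routing check that reports why a request would still land on v1. Every name in it (`MergeRequest`, `FallbackReason`, `fallback_reasons`) is hypothetical and chosen for illustration; it is not the actual `can_use_create_plan` logic, which may consider more conditions than the three listed above.

```rust
// Hypothetical model of the v1/v2 routing decision. None of these types or
// functions are taken from the Lance codebase.

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WhenMatched {
    UpdateAll,
    DoNothing,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum FallbackReason {
    DoNothing,
    PartialSchema,
    ScalarIndexOnJoinKey,
}

pub struct MergeRequest<'a> {
    pub when_matched: WhenMatched,
    pub source_columns: &'a [&'a str],
    pub target_columns: &'a [&'a str],
    pub join_key_has_scalar_index: bool,
}

/// Returns every reason the request would fall back to v1.
/// An empty vector means the request is eligible for v2.
pub fn fallback_reasons(req: &MergeRequest<'_>) -> Vec<FallbackReason> {
    let mut reasons = Vec::new();

    // Category 1: find-or-create (matched rows are left untouched).
    if req.when_matched == WhenMatched::DoNothing {
        reasons.push(FallbackReason::DoNothing);
    }

    // Category 2: the source is missing at least one target column.
    let is_partial = req
        .target_columns
        .iter()
        .any(|col| !req.source_columns.contains(col));
    if is_partial {
        reasons.push(FallbackReason::PartialSchema);
    }

    // Category 3: a scalar index exists on the join key.
    if req.join_key_has_scalar_index {
        reasons.push(FallbackReason::ScalarIndexOnJoinKey);
    }

    reasons
}
```

Under this model, each slice of the migration removes one variant of `FallbackReason`; once the enum is empty, the check itself can be deleted along with v1.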
Vertical Slices
1. DoNothing on v2: Add `DoNothing` to `can_use_create_plan` eligibility. Simplest slice; establishes the pattern.
2. Partial schema upsert on v2: Fill missing columns via projection in the logical plan. Write full rows. Does NOT replicate v1's column-rewrite optimization (future ticket).
3. Research spike, indexed join strategy: Time-boxed evaluation of `ScalarIndexJoinExec` (#3480) vs finger search (#4648) vs wrapping v1 logic. Output: design for slice 4.
4. Scalar-indexed joins on v2: Implement the approach from the spike. Remove the scalar index fallback. Retain `use_index` as an escape hatch.
5. Remove v1 code: Delete `Merger`, the v1 branch in `execute_uncommitted_impl`, `can_use_create_plan`, and v1-only helpers.
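The partial schema slice can be pictured as building one projection entry per target column. The sketch below assumes that, for matched rows, a column missing from the source is carried over from the existing target row; the plan above does not spell out the fill rule, so treat that as an assumption. `ColumnOrigin` and `plan_full_row_projection` are hypothetical names, and the code works on plain column-name lists rather than DataFusion expressions.

```rust
// Hypothetical sketch: decide, per target column, where the written value
// comes from. Not taken from the Lance or DataFusion codebases.

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum ColumnOrigin {
    /// The source provides this column, so use the incoming value.
    Source(String),
    /// The source lacks this column, so keep the existing target value
    /// (assumed fill rule for matched rows).
    Target(String),
}

/// Builds one projection entry per target column, in target-schema order,
/// so that the write step always sees full rows.
pub fn plan_full_row_projection(
    target_columns: &[&str],
    source_columns: &[&str],
) -> Result<Vec<ColumnOrigin>, String> {
    // A source column that the target does not know about is an error here.
    if let Some(extra) = source_columns
        .iter()
        .find(|col| !target_columns.contains(*col))
    {
        return Err(format!(
            "source column `{extra}` is not in the target schema"
        ));
    }

    Ok(target_columns
        .iter()
        .map(|col| {
            if source_columns.contains(col) {
                ColumnOrigin::Source(col.to_string())
            } else {
                ColumnOrigin::Target(col.to_string())
            }
        })
        .collect())
}
```

Because every target column appears in the output, this approach rewrites full rows, which matches the stated trade-off of not replicating v1's column-rewrite optimization in this slice.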
Existing Issue Disposition
| Issue | Action |
|---|---|
| #4194 (optimize merge insert with delete) | Split into two post-retirement tickets: `WhenMatched::Delete` optimization and `WhenNotMatchedBySource::Delete` optimization |
| #4193 (partial schema / TakeExec) | Reframe as follow-up optimization after slice 2 |
| #4266 (TakeExec logical node) | Deferred until #4193 optimization |
| #3480 (indexed merge insert) | Spike (slice 3) updates description; slice 4 implements |
| #4583 (streaming vs materialized) | Out of scope |
| #4648 (finger search) | Feeds into slice 3 research |
Out of Scope
- Ergonomics improvements (milestone 8)
- Performance optimizations: TakeExec column-rewrite avoidance, streaming vs materialized inputs, join order optimization
26% complete
- #4583 (Open, lance-format/lance)
- #3480 (Open, lance-format/lance)
- #4193 (Open, lance-format/lance)
- #4194 (Open, lance-format/lance)
- #4266 (Open, lance-format/lance)
- #4648 (Open, lance-format/lance)
- #6441 (Open, lance-format/lance)
- #6442 (Open, lance-format/lance)
- #6443 (Open, lance-format/lance)
- #6444 (Open, lance-format/lance)
- #6445 (Open, lance-format/lance)