This epic captures the evolution from shallow_clone to branch in Lance, based on the design discussions in #4256 and #3861. It clarifies how shallow_clone provides the foundational mechanics for branch, explains how branch differs from shallow_clone, and outlines the tracked implementation work across runtimes and storage management.
Relationship between shallow_clone and branch
shallow_clone creates a dataset derived from a base dataset without copying the underlying data files. It references the base data and maintains its own metadata and snapshots, enabling quick duplication with copy-on-write behavior for subsequent writes.
- A branch is effectively a shallow_clone with additional branch metadata.
- shallow_clone is an operation across datasets, whereas branch operations (create/delete/checkout) happen within one dataset.
- Branch makes global file lifetime management possible, as discussed in "From shallow clone to branching" (#4256).
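The distinction above can be illustrated with a toy model. This is only a sketch under stated assumptions: `ToyDataset`, `shallow_clone`, `create_branch`, `append`, and `unreferenced_files` are hypothetical names invented for illustration and are not the Lance API. Datasets are modeled as plain lists of file paths so that the difference between cross-dataset cloning and in-dataset branching is visible.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Hypothetical stand-in for a dataset (not the Lance API)."""
    name: str
    # Maps a branch name to the data files its latest snapshot references.
    branches: dict = field(default_factory=dict)


def shallow_clone(base: ToyDataset, base_branch: str, new_name: str) -> ToyDataset:
    """Create a NEW dataset that references the base's files without copying them."""
    return ToyDataset(new_name, {"main": list(base.branches[base_branch])})


def create_branch(ds: ToyDataset, source: str, new_branch: str) -> None:
    """Create a branch INSIDE the same dataset, referencing the source's files."""
    ds.branches[new_branch] = list(ds.branches[source])


def append(ds: ToyDataset, branch: str, path: str) -> None:
    """Copy-on-write style append: only the written branch sees the new file."""
    ds.branches[branch] = ds.branches[branch] + [path]


def unreferenced_files(ds: ToyDataset, stored_files: list) -> set:
    """Files that no branch of this dataset references.

    All branches live in one dataset, so the full set of references is
    visible in one place. A base dataset has no such view of its clones.
    """
    referenced = set()
    for files in ds.branches.values():
        referenced.update(files)
    return set(stored_files) - referenced


base = ToyDataset("base", {"main": ["data/0.lance", "data/1.lance"]})
clone = shallow_clone(base, "main", "clone")   # a separate dataset
create_branch(base, "main", "dev")             # same dataset, new branch
append(base, "dev", "data/2.lance")            # does not affect "main" or the clone

stored = ["data/0.lance", "data/1.lance", "data/2.lance", "data/stale.lance"]
print(sorted(unreferenced_files(base, stored)))  # ['data/stale.lance']
```

In this model the clone and the branch start from the same file references, but only the branch is recorded in the base dataset's own metadata, which is what allows a single cleanup pass to see every reference.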
Scope
- Deliver the branch feature, built atop shallow_clone mechanics, across supported runtimes (Java, Python).
- Ensure shallow_clone supports fragment operations and indexes consistently.
- Implement cross-branch file lifetime management with safe retention and cleanup policies.
Out of Scope
- This epic does not directly track discussions or design work. Only issues with meaningful PRs are listed in the tracked checklists.
clone issues
Branch issues