feat(compaction): binary copy capability for compaction #5434
westonpace merged 29 commits into lance-format:main
Conversation
Pull request overview
This PR introduces a binary copy optimization for compaction operations in Lance. The feature enables page-level copying of data files during compaction, bypassing the decode-recode cycle for compatible files. This significantly improves compaction performance when merging small Lance files with identical schemas and versions.
Key changes:
- Added binary copy capability with `enable_binary_copy` and related configuration options in `CompactionOptions`
- Implemented a `rewrite_files_binary_copy` function that directly copies page data and buffers from source files to output files with proper alignment
- Added version-aware handling for v2_0 vs v2_1+ file format differences (structural column materialization)
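To make the idea concrete, here is a minimal, dependency-free sketch of the kind of eligibility check binary copy implies: all input files must share the same format version and carry no deletions before their pages can be copied verbatim. The names (`FileInfo`, `can_binary_copy`) are illustrative, not the actual Lance API.

```rust
// Hypothetical sketch: binary copy is only safe when every input file has the
// same file format version and no deleted rows (deletions would require
// filtering, which defeats a raw byte copy).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct FileVersion {
    major: u32,
    minor: u32,
}

struct FileInfo {
    version: FileVersion,
    has_deletions: bool,
}

fn can_binary_copy(files: &[FileInfo]) -> bool {
    let Some(first) = files.first() else {
        return false;
    };
    files
        .iter()
        .all(|f| f.version == first.version && !f.has_deletions)
}

fn main() {
    let v = FileVersion { major: 2, minor: 1 };
    let ok = vec![
        FileInfo { version: v, has_deletions: false },
        FileInfo { version: v, has_deletions: false },
    ];
    assert!(can_binary_copy(&ok));

    let mixed = vec![
        FileInfo { version: v, has_deletions: false },
        FileInfo { version: FileVersion { major: 2, minor: 0 }, has_deletions: false },
    ];
    assert!(!can_binary_copy(&mixed));
    println!("ok");
}
```

The real check in this PR also rejects deletion files and extra global buffers, as later comments in the thread discuss.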
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| rust/lance/src/dataset/optimize.rs | Core implementation: added binary copy validation, page-level copy algorithm, footer writing, and comprehensive test coverage |
| rust/lance-file/src/writer.rs | Added initialize_with_external_metadata method to support writing Lance files with pre-built column metadata |
| rust/lance-datagen/src/generator.rs | Fixed bug: corrected TimestampMillisecondArray usage for millisecond timestamps (was using TimestampMicrosecondArray) |
```diff
- Box::new(FnGen::<i64, TimestampMicrosecondArray, _>::new_known_size(
+ Box::new(FnGen::<i64, TimestampMillisecondArray, _>::new_known_size(
```
```rust
if !self.column_writers.is_empty() {
    self.finish_writers().await?;
}
```
In binary copy, data and metadata are pre-written by direct copying, without going through `FileWriter`. However, when flushing the footer, we mock a file writer and call its `finish` method so that the existing footer-writing code can be reused as much as possible. This check is therefore needed to skip `finish_writers` in scenarios like binary copy, where `column_writers` is empty.
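The guard described above can be sketched as follows. `MockFooterWriter` is an illustrative stand-in for the real lance `FileWriter`, reduced to the one behavior under discussion: when binary copy has already written all pages and column metadata by hand, `column_writers` is empty and `finish` must skip the per-column flush while still emitting the footer.

```rust
// Illustrative stand-in, not the real lance FileWriter.
struct MockFooterWriter {
    column_writers: Vec<String>, // placeholder for real column writer state
    footer_written: bool,
    columns_flushed: bool,
}

impl MockFooterWriter {
    fn finish(&mut self) {
        if !self.column_writers.is_empty() {
            // Normal path: flush the encoders' buffered pages first.
            self.columns_flushed = true;
        }
        // Both paths converge on writing the footer.
        self.footer_written = true;
    }
}

fn main() {
    // Binary-copy path: no column writers were ever created.
    let mut binary_copy = MockFooterWriter {
        column_writers: Vec::new(),
        footer_written: false,
        columns_flushed: false,
    };
    binary_copy.finish();
    assert!(binary_copy.footer_written && !binary_copy.columns_flushed);
    println!("footer written without flushing columns");
}
```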
```rust
match object_store::aws::resolve_bucket_region(bucket, &client_options).await {
    Ok(bucket_region) => Ok(Some(bucket_region)),
    Err(e) => {
        log::debug!(
            "Failed to resolve S3 bucket region for '{}': {:?}; defaulting to provider chain",
            bucket,
            e
        );
        // Fallback to region provider chain; let downstream choose a default
        Ok(None)
    }
}
```
This change seems unrelated (but is fairly minor). Will let @jackye1995 or @wjones127 weigh in on whether this is the best thing to do here or not.
Not related to this PR. Removed.
```rust
let first_data_file_version = LanceFileVersion::try_from_major_minor(
    fragments[0].files[0].file_major_version,
    fragments[0].files[0].file_minor_version,
)
.map(|v| v.resolve())
.unwrap();
```
I think you can just use `dataset.manifest.data_storage_format`, like the check above.
```rust
let compaction_plan: CompactionPlan = plan_compaction(dataset, &options).await?;

if compaction_plan.tasks().is_empty() && options.enable_binary_copy_force {
```
Why is this an error instead of a no-op?
Actually, it's not needed. This logic has already been removed.
```rust
} else {
    let data = SendableRecordBatchStream::from(scanner.try_into_stream().await?);
    (None, data)
// ...
prepare_reader(
```
Why are you preparing a reader in this case? If we can use binary copy, we shouldn't need a scan, right?
This reader is indeed not needed during the binary copy process. The original design included a safety feature: if any problem caused a panic during the rewrite phase of binary copy, it would roll back to the normal compaction logic, so this object was initialized in advance. On reflection, that design was cumbersome and has been simplified: the logic that initialized the reader in the binary copy path has been removed.
```rust
});
}

if new_fragments.is_empty() {
```
Why would this be empty? Didn't we already verify that binary copy is supported if we reach this point?
As mentioned before, this logic has been removed.
```rust
Ok(())
}

async fn rewrite_files_binary_copy(
```
This (and `flush_footer`) are big methods. We might want some kind of `BinaryCopier` utility struct instead? It could live in its own sub-module, e.g. `lance::dataset::optimize::binary_copy`.
Hi @westonpace, thanks a lot for your review. All comments are addressed. PTAL :)

Hi @westonpace — hope you're doing well! Just a gentle ping on this PR. Your review would be really helpful. When you have time, could you take another look? Appreciate your help!

Hi @jackye1995, would you mind taking a look at your convenience? Really appreciate any feedback. Thanks in advance!
```rust
let mut frag = Fragment::new(0);
let mut df = DataFile::new_unstarted(current_filename.take().unwrap(), maj, min);
// v2_0 vs v2_1+ field-to-column index mapping for the final file
let is_structural = version >= LanceFileVersion::V2_1;
```
this is duplicated logic with L378
```rust
if total_rows_in_current > 0 {
    // Flush remaining rows as a final output file
    // v2_0 compatibility: same single-page enforcement applies for the final file close
```
this is duplicate logic with L346
```rust
for i in 0..count {
    let addr =
        lance_core::utils::address::RowAddress::new_from_parts(frag_id, i as u32);
    addrs.insert(u64::from(addr));
```
This would be pretty inefficient; can we use `insert_range`?
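The reason a range insert suffices here: a Lance row address packs the fragment id into the high 32 bits and the row offset into the low 32 bits, so rows `0..count` of one fragment form a single contiguous `u64` range. A stdlib-only sketch of the arithmetic, with hypothetical helper names (the real code would pass the range to a set type with `insert_range`, such as the `roaring` crate's bitmaps):

```rust
// Rows 0..count of a fragment occupy one contiguous u64 address range, so a
// single range insertion can replace `count` individual inserts.
fn row_address(frag_id: u32, row: u32) -> u64 {
    ((frag_id as u64) << 32) | row as u64
}

fn fragment_row_range(frag_id: u32, count: u32) -> std::ops::Range<u64> {
    let start = row_address(frag_id, 0);
    start..start + count as u64
}

fn main() {
    let range = fragment_row_range(7, 100);
    assert_eq!(range.start, 7u64 << 32);
    assert_eq!(range.end - range.start, 100);
    // Equivalent to inserting row_address(7, i) for each i in 0..100.
    assert!(range.contains(&row_address(7, 99)));
    assert!(!range.contains(&row_address(7, 100)));
    println!("range covers {} addresses", range.end - range.start);
}
```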
```rust
version: LanceFileVersion,
) -> Result<()> {
    let pos = writer.tell().await? as u64;
    let _new_pos = apply_alignment_padding(&mut writer, pos, version).await?;
```
Why is the result discarded?
`apply_alignment_padding` may write zero-byte padding via `writer.write_all(...)`, which advances the writer's internal position. The `u64` it returns (`new_pos`) is just the theoretical new position after writing the padding. This value is not needed for subsequent calculations in `flush_footer` (`FileWriter::finish()` starts writing the footer directly from the writer's current position), so the return value is unused.
(See `lance/rust/lance-file/src/writer.rs`, line 605 at e7540d7.)
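The padding arithmetic being discussed can be sketched with plain integers: pad the current position up to the next multiple of the alignment by emitting zero bytes. The function names and the in-memory `Vec<u8>` "writer" here are illustrative; the real `apply_alignment_padding` writes through the async file writer and derives the alignment from the file version.

```rust
// Bytes needed to round `pos` up to the next multiple of `align`.
fn padding_needed(pos: u64, align: u64) -> u64 {
    (align - pos % align) % align
}

// Illustrative padding writer over an in-memory buffer.
fn apply_alignment_padding(buf: &mut Vec<u8>, align: u64) -> u64 {
    let pad = padding_needed(buf.len() as u64, align);
    buf.extend(std::iter::repeat(0u8).take(pad as usize));
    buf.len() as u64 // the new position; callers may legitimately ignore it
}

fn main() {
    let mut buf = vec![1u8; 10];
    let new_pos = apply_alignment_padding(&mut buf, 8);
    assert_eq!(new_pos, 16);
    // Already aligned: no further padding needed.
    assert_eq!(padding_needed(16, 8), 0);
    println!("padded to {new_pos}");
}
```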
```rust
Err(_) => return false,
};
// Capture schema mapping baseline from first data file
let ref_fields = &fragments[0].files[0].fields;
```
Edge case: to be safe, we should check `!fragments[0].files.is_empty()`.
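One way to handle that edge case is `Option` chaining instead of direct indexing, so an empty `files` vec yields `None` rather than a panic. A stdlib-only sketch with hypothetical mirror types (the real `Fragment`/`DataFile` live in lance's table format crate):

```rust
// Hypothetical mirrors of the fragment/data-file shape.
struct DataFile {
    fields: Vec<i32>,
}

struct Fragment {
    files: Vec<DataFile>,
}

// fragments[0].files[0] would panic on an empty `files` vec; chained
// `first()` calls turn the edge case into a clean `None`.
fn ref_fields(fragments: &[Fragment]) -> Option<&Vec<i32>> {
    fragments.first()?.files.first().map(|f| &f.fields)
}

fn main() {
    let empty = vec![Fragment { files: vec![] }];
    assert!(ref_fields(&empty).is_none()); // no panic on the edge case

    let ok = vec![Fragment {
        files: vec![DataFile { fields: vec![0, 1, 2] }],
    }];
    assert_eq!(ref_fields(&ok).map(|f| f.len()), Some(3));
    println!("ok");
}
```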
```rust
/// - `fragments`: fragments to merge via binary copy (assumed consistent versions).
/// - `params`: write parameters (uses `max_rows_per_file`).
/// - `read_batch_bytes_opt`: optional I/O batch size when coalescing page reads.
pub async fn rewrite_files_binary_copy(
```
As we discussed in the original GitHub issue, we should reject binary copy if the Lance file has a global buffer. We can do that check after reading `file_meta`.
Nice catch. Added global buffer checking in the `can_use_binary_copy` function.
Hi @jackye1995, just a gentle reminder — could you please take another look when it's convenient? Thanks!
westonpace left a comment:
I'm fine moving forwards with this. I'll give a bit for @jackye1995 to weigh in or any comments to be addressed.
I think this is something of a niche use case, because we generally want the underlying data pages to be large, but there are cases where it is useful (compacting large files into huge files, or search-only use cases where everything is random access and we don't care as much about page size).
It doesn't add too much complexity (mainly some if/else checks in the compaction code) and the rest is hidden in a dedicated module.
Suggested change:

```diff
  /// Whether to enable binary copy optimization when eligible.
+ ///
+ /// This skips re-encoding the data and can lead to faster compaction
+ /// times. However, it cannot merge pages together and should not be
+ /// used when compacting small files together because the pages in the
+ /// compacted file will be too small and this could lead to poor I/O patterns.
+ ///
  /// Defaults to false.
  pub enable_binary_copy: bool,
```
```rust
for fragment in fragments {
    if fragment.deletion_file.is_some() {
        return Ok(false);
```
Whenever we return `false`, it might be nice to log a debug message explaining why.
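One way to make every rejection self-explaining is to route all the `return false` paths through a small helper that records the reason. A sketch under that assumption; `eprintln!` stands in for `log::debug!` to keep it dependency-free, and the two boolean parameters are stand-ins for the real per-fragment checks:

```rust
// Helper: log the rejection reason, then reject.
fn reject(reason: &str) -> bool {
    eprintln!("binary copy not eligible: {reason}");
    false
}

// Stand-in for can_use_binary_copy: each ineligibility path names its reason.
fn can_use_binary_copy(has_deletion_file: bool, extra_global_buffers: bool) -> bool {
    if has_deletion_file {
        return reject("fragment has a deletion file");
    }
    if extra_global_buffers {
        return reject("input file carries extra global buffers");
    }
    true
}

fn main() {
    assert!(!can_use_binary_copy(true, false));
    assert!(can_use_binary_copy(false, false));
}
```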
```rust
// Binary copy only preserves page and column-buffer bytes. The output file's footer
// (including global buffers) is re-generated, not copied from inputs.
//
// Therefore, we reject input files that contain any additional global buffers beyond
// the required schema / file descriptor global buffer (global buffer index 0).
if file_meta.file_buffers.len() > 1 {
    return Ok(false);
}
```
At some point I think we are going to start writing file stats in the footer which might interfere with this check.
Hi @westonpace, thanks a lot for your help. All comments are addressed and CI passed :) cc @jackye1995
Hi @jackye1995, any concerns?
Closes: #5433