
feat(compaction): binary copy capability for compaction#5434

Merged
westonpace merged 29 commits into lance-format:main from zhangyue19921010:binary-copy-final
Feb 3, 2026

Conversation

@zhangyue19921010
Contributor

Closes: #5433

Copilot AI review requested due to automatic review settings December 8, 2025 10:01

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a binary copy optimization for compaction operations in Lance. The feature enables page-level copying of data files during compaction, bypassing the decode-recode cycle for compatible files. This significantly improves compaction performance when merging small Lance files with identical schemas and versions.

Key changes:

  • Added binary copy capability with enable_binary_copy and related configuration options in CompactionOptions
  • Implemented rewrite_files_binary_copy function that directly copies page data and buffers from source files to output files with proper alignment
  • Added version-aware handling for v2_0 vs v2_1+ file format differences (structural column materialization)
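
The eligibility rules implied by these changes (identical schemas and file versions across inputs, plus the restrictions discussed later in this thread: no deletion files, no extra global buffers) can be modeled with a small std-only sketch. The struct and function names here are hypothetical stand-ins, not the Lance API; the real check lives in `can_use_binary_copy`:

```rust
// Hypothetical model of the binary-copy eligibility check described in this PR.
// Names are illustrative only.
#[derive(Clone)]
struct FileProfile {
    major: u32,
    minor: u32,
    schema_fingerprint: u64,
    has_deletions: bool,
    extra_global_buffers: usize, // buffers beyond the schema buffer (index 0)
}

fn can_binary_copy(files: &[FileProfile]) -> bool {
    let Some(first) = files.first() else {
        return false; // nothing to copy
    };
    files.iter().all(|f| {
        (f.major, f.minor) == (first.major, first.minor)
            && f.schema_fingerprint == first.schema_fingerprint
            && !f.has_deletions
            && f.extra_global_buffers == 0
    })
}
```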

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 10 comments.

File Description
rust/lance/src/dataset/optimize.rs Core implementation: added binary copy validation, page-level copy algorithm, footer writing, and comprehensive test coverage
rust/lance-file/src/writer.rs Added initialize_with_external_metadata method to support writing Lance files with pre-built column metadata
rust/lance-datagen/src/generator.rs Fixed bug: corrected TimestampMillisecondArray usage for millisecond timestamps (was using TimestampMicrosecondArray)


Comment thread rust/lance/src/dataset/optimize.rs
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance/src/dataset/optimize.rs
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment thread rust/lance-file/src/writer.rs
@codecov

codecov Bot commented Dec 9, 2025

Codecov Report

❌ Patch coverage is 81.95616% with 107 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
rust/lance/src/dataset/optimize/binary_copy.rs 84.74% 34 Missing and 22 partials ⚠️
rust/lance/src/dataset/optimize.rs 76.41% 33 Missing and 17 partials ⚠️
rust/lance-file/src/writer.rs 92.85% 0 Missing and 1 partial ⚠️


Member

@westonpace westonpace left a comment

Some initial thoughts

Comment thread rust/lance-datagen/src/generator.rs Outdated
Comment on lines +2521 to +2522
Box::new(FnGen::<i64, TimestampMicrosecondArray, _>::new_known_size(
Box::new(FnGen::<i64, TimestampMillisecondArray, _>::new_known_size(
Member

Good catch 😄

Comment thread rust/lance-file/src/writer.rs
Comment on lines +600 to +602
if !self.column_writers.is_empty() {
self.finish_writers().await?;
}
Member

Why is this change needed?

Contributor Author

In binary copy, we pre-write data and metadata by copying bytes directly, without going through FileWriter. However, when flushing the footer, to reuse existing code as much as possible, we construct a FileWriter and call its finish method to write the footer. This check is therefore needed to skip finish_writers in scenarios like binary copy, where column_writers is empty.
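
The control flow being described can be sketched with a simplified std-only model (this is not the real FileWriter, just the shape of the guard):

```rust
// Simplified model: finish() writes the footer; finish_writers() is only
// needed when column writers exist. Binary copy constructs a writer with no
// column writers, so the guard lets finish() go straight to the footer.
struct MockFileWriter {
    column_writers: Vec<String>, // stand-in for real column writers
    footer_written: bool,
    writers_finished: bool,
}

impl MockFileWriter {
    fn finish(&mut self) {
        // The guard discussed above: skip finish_writers when there are
        // no column writers (the binary-copy case).
        if !self.column_writers.is_empty() {
            self.finish_writers();
        }
        self.footer_written = true;
    }

    fn finish_writers(&mut self) {
        self.writers_finished = true;
    }
}
```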

Comment on lines +217 to +228
match object_store::aws::resolve_bucket_region(bucket, &client_options).await {
Ok(bucket_region) => Ok(Some(bucket_region)),
Err(e) => {
log::debug!(
"Failed to resolve S3 bucket region for '{}': {:?}; defaulting to provider chain",
bucket,
e
);
// Fallback to region provider chain; let downstream choose a default
Ok(None)
}
}
Member

This change seems unrelated (but is fairly minor). Will let @jackye1995 or @wjones127 weigh in on whether this is the best thing to do here or not.

Contributor Author

Not related to this PR; removed.

Comment thread rust/lance/src/dataset/optimize.rs Outdated
Comment on lines +257 to +262
let first_data_file_version = LanceFileVersion::try_from_major_minor(
fragments[0].files[0].file_major_version,
fragments[0].files[0].file_minor_version,
)
.map(|v| v.resolve())
.unwrap();
Member

I think you can just use dataset.manifest.data_storage_format like you check above.

Contributor Author

Changed.

Comment thread rust/lance/src/dataset/optimize.rs Outdated

let compaction_plan: CompactionPlan = plan_compaction(dataset, &options).await?;

if compaction_plan.tasks().is_empty() && options.enable_binary_copy_force {
Member

Why is this an error instead of a no-op?

Contributor Author

Actually, it's not needed. This logic has already been removed.

Comment thread rust/lance/src/dataset/optimize.rs Outdated
} else {
let data = SendableRecordBatchStream::from(scanner.try_into_stream().await?);
(None, data)
prepare_reader(
Member

Why are you preparing a reader in this case? If we can use binary copy we shouldn't need a scan right?

Contributor Author
The reason will be displayed to describe this comment to others. Learn more.

This reader is indeed not needed during binary copy. The original design included a fallback: if anything caused a panic during the binary-copy rewrite phase, compaction would roll back to the normal path, so the reader was initialized in advance. On reflection, that design was cumbersome, so it has been simplified.

The reader initialization for the binary-copy path has been removed.

Comment thread rust/lance/src/dataset/optimize.rs Outdated
});
}

if new_fragments.is_empty() {
Member

Why would this be empty? Didn't we already verify that binary copy is supported if we reach this point?

Contributor Author

As mentioned above, this logic has been removed.

Comment thread rust/lance/src/dataset/optimize.rs Outdated
Ok(())
}

async fn rewrite_files_binary_copy(
Member

This (and flush_footer) are big methods. We might want some kind of BinaryCopier utility struct instead? It could be in its own sub-module e.g. lance::dataset::optimize::binary_copy

Contributor Author

Done.

@github-actions github-actions Bot added the java label Dec 11, 2025
@zhangyue19921010
Contributor Author

Hi @westonpace Thanks a lot for your review. All comments are addressed. PTAL :)

@zhangyue19921010
Contributor Author

Hi @westonpace — hope you’re doing well!

Just a gentle ping on this PR. Your review would be really helpful. When you have time, could you take another look?

Appreciate your help!

@zhangyue19921010
Contributor Author

Hi @jackye1995, would you mind taking a look at your convenience? I'd really appreciate any feedback. Thanks in advance!

let mut frag = Fragment::new(0);
let mut df = DataFile::new_unstarted(current_filename.take().unwrap(), maj, min);
// v2_0 vs v2_1+ field-to-column index mapping for the final file
let is_structural = version >= LanceFileVersion::V2_1;
Contributor

this is duplicated logic with L378


if total_rows_in_current > 0 {
// Flush remaining rows as a final output file
// v2_0 compatibility: same single-page enforcement applies for the final file close
Contributor

this is duplicate logic with L346

Comment thread rust/lance/src/dataset/optimize.rs Outdated
for i in 0..count {
let addr =
lance_core::utils::address::RowAddress::new_from_parts(frag_id, i as u32);
addrs.insert(u64::from(addr));
Contributor

this would be pretty inefficient, can we use insert_range
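
The addresses in that loop are contiguous: a row address packs frag_id into the upper 32 bits and the row offset into the lower 32, so rows 0..count of one fragment form one contiguous u64 range. A bitmap type with an insert_range method (e.g. roaring's RoaringTreemap) can ingest the whole range in one call instead of count individual inserts. A std-only sketch of the address arithmetic:

```rust
// Row addresses pack (frag_id, row_offset) as (frag_id << 32) | offset,
// so rows 0..count of one fragment form the contiguous range
// base..base + count. A bitmap supporting insert_range can ingest this
// in a single call rather than `count` per-row inserts.
fn row_address_range(frag_id: u64, count: u64) -> std::ops::Range<u64> {
    let base = frag_id << 32;
    base..base + count
}
```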

version: LanceFileVersion,
) -> Result<()> {
let pos = writer.tell().await? as u64;
let _new_pos = apply_alignment_padding(&mut writer, pos, version).await?;
Contributor

Why is the result discarded?

Contributor Author

  1. apply_alignment_padding may write zero pad bytes via writer.write_all(...), which advances the writer's internal position.
  2. The u64 (new_pos) it returns is just the theoretical position after writing the padding. This value is not needed for subsequent calculations in flush_footer (FileWriter::finish() starts writing the footer from the writer's current position), so the return value is unused:
    let column_metadata_start = self.writer.tell().await? as u64;
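
A minimal sketch of the padding computation being discussed (the helper name apply_alignment_padding comes from this PR; this std-only version just computes the pad bytes and the resulting position, which callers may ignore since the writer tracks its own position):

```rust
// Pad `pos` up to the next multiple of `align`, returning the zero pad
// bytes to write and the resulting position. Mirrors the shape of the PR's
// apply_alignment_padding helper.
fn alignment_padding(pos: u64, align: u64) -> (Vec<u8>, u64) {
    let rem = pos % align;
    let pad = if rem == 0 { 0 } else { align - rem };
    (vec![0u8; pad as usize], pos + pad)
}
```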

Err(_) => return false,
};
// Capture schema mapping baseline from first data file
let ref_fields = &fragments[0].files[0].fields;
Contributor

edge case: to be safe, should check !fragments[0].files.is_empty()
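
The defensive access being suggested can be written with Option chaining so an empty files list yields None instead of a panic. The struct shapes below are hypothetical minimal stand-ins mirroring the fields used in the snippet:

```rust
// Hypothetical minimal shapes mirroring fragments[0].files[0].fields.
#[derive(Debug)]
struct DataFile { fields: Vec<i32> }
struct FragmentMeta { files: Vec<DataFile> }

// Returns None instead of panicking when there are no fragments or the
// first fragment has no files.
fn reference_fields(fragments: &[FragmentMeta]) -> Option<&Vec<i32>> {
    fragments.first()?.files.first().map(|f| &f.fields)
}
```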

/// - `fragments`: fragments to merge via binary copy (assumed consistent versions).
/// - `params`: write parameters (uses `max_rows_per_file`).
/// - `read_batch_bytes_opt`: optional I/O batch size when coalescing page reads.
pub async fn rewrite_files_binary_copy(
Contributor

as we discussed in original github issue, we should reject the binary copy if the Lance file has global buffer. we can do that check after reading file_meta.

Contributor Author

Nice catch. Added a global buffer check in the can_use_binary_copy function.

@zhangyue19921010
Contributor Author

Hi @jackye1995 . Just a gentle reminder — could you please take another look when it’s convenient? Thanks!

Member

@westonpace westonpace left a comment

I'm fine moving forwards with this. I'll give a bit for @jackye1995 to weigh in or any comments to be addressed.

I think this is something of a niche use case because we want the underlying data pages to be large but there could be some cases where it is useful (compacting large files into huge files, or search-only use cases where everything is random access and we don't care as much about page size).

It doesn't add too much complexity (mainly some if/else checks in the compaction code) and the rest is hidden in a dedicated module.

Comment on lines +162 to +164
/// Whether to enable binary copy optimization when eligible.
/// Defaults to false.
pub enable_binary_copy: bool,
Member

Suggested change
/// Whether to enable binary copy optimization when eligible.
/// Defaults to false.
pub enable_binary_copy: bool,
/// Whether to enable binary copy optimization when eligible.
///
/// This skips re-encoding the data and can lead to faster compaction
/// times. However, it cannot merge pages together and should not be
/// used when compacting small files together because the pages in the
/// compacted file will be too small and this could lead to poor I/O patterns.
///
/// Defaults to false.
pub enable_binary_copy: bool,


for fragment in fragments {
if fragment.deletion_file.is_some() {
return Ok(false);
Member

Whenever we return false it might be nice to log a debug message explaining why
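
One way to implement this suggestion is to return a reason string the caller can log before falling back. The function name and check list below are a hypothetical sketch mirroring the rejection conditions discussed in this thread; the log::debug! call is shown only in a comment since log-crate setup is outside this sketch:

```rust
// Returns Some(reason) when binary copy must be rejected, so the caller can
// emit e.g. log::debug!("binary copy disabled: {}", reason) before falling
// back to normal compaction.
fn binary_copy_rejection(
    has_deletion_file: bool,
    extra_global_buffers: usize,
    schemas_match: bool,
) -> Option<&'static str> {
    if has_deletion_file {
        Some("fragment has a deletion file")
    } else if extra_global_buffers > 0 {
        Some("file has global buffers beyond the schema buffer")
    } else if !schemas_match {
        Some("data file schemas differ")
    } else {
        None
    }
}
```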

Comment on lines +311 to +318
// Binary copy only preserves page and column-buffer bytes. The output file's footer
// (including global buffers) is re-generated, not copied from inputs.
//
// Therefore, we reject input files that contain any additional global buffers beyond
// the required schema / file descriptor global buffer (global buffer index 0).
if file_meta.file_buffers.len() > 1 {
return Ok(false);
}
Member

At some point I think we are going to start writing file stats in the footer which might interfere with this check.

@zhangyue19921010
Contributor Author

Hi @westonpace, thanks a lot for your help. All comments are addressed and CI passed :) cc @jackye1995

@yanghua
Collaborator

yanghua commented Feb 2, 2026

Hi @jackye1995 , any concerns?

@westonpace westonpace merged commit 601bd91 into lance-format:main Feb 3, 2026
28 checks passed

Labels

enhancement New feature or request java

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Up to 20x: Binary Copy Capability For Compaction

5 participants