feat(java): expose merge_insert api#4685

Merged
jackye1995 merged 9 commits into lance-format:main from fangbo:java-merge-insert
Sep 12, 2025

@fangbo
Contributor

@fangbo fangbo commented Sep 10, 2025

Related Issue #4050

github-actions bot added the enhancement and java labels Sep 10, 2025
@fangbo
Contributor Author

fangbo commented Sep 10, 2025

@jackye1995 @majin1102 This PR is ready. Could you please review it? Thank you.

@codecov-commenter

codecov-commenter commented Sep 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 80.58%. Comparing base (a1b0438) to head (bf57354).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4685      +/-   ##
==========================================
- Coverage   80.58%   80.58%   -0.01%     
==========================================
  Files         317      317              
  Lines      119671   119671              
  Branches   119671   119671              
==========================================
- Hits        96435    96432       -3     
- Misses      19777    19780       +3     
  Partials     3459     3459              
Flag Coverage Δ
unittests 80.58% <ø> (-0.01%) ⬇️

Contributor

@jackye1995 jackye1995 left a comment

mostly looks good to me! A few nit comments

Comment thread java/lance-jni/src/merge_insert.rs Outdated
#[no_mangle]
pub extern "system" fn Java_com_lancedb_lance_Dataset_nativeMergeInsert<'a>(
mut env: JNIEnv<'a>,
jdataset: JObject,

nit: we should always add a comment stating the exact type of any JObject


import java.util.List;

public class MergeInsert {

I think this would be better named MergeInsertParams

Comment thread java/lance-jni/src/merge_insert.rs Outdated
.parse_expr()
.unwrap();

let expr = SqlToRel::new(&LanceContextProvider::default())

@jackye1995 jackye1995 Sep 10, 2025

I was thinking about the best way to do this. Here is what I have in mind:

  1. In MergeInsertParams in Java, we should expose not just the String SQL expression API but also a ByteBuffer API that accepts a Substrait expression, just as we do in ScanOptions. For example: withMatchedUpdateIf(String sqlExpr) and withMatchedUpdateIf(ByteBuffer substraitExpr).
  2. In JNI, we should then be able to call LanceFilter::to_datafusion for both cases.
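
A minimal sketch of the overload pair suggested in point 1 might look like the following. Only the names MergeInsertParams, withMatchedUpdateIf(String sqlExpr) and withMatchedUpdateIf(ByteBuffer substraitExpr) come from the comment above; the class name MergeInsertParamsSketch, its fields, the `on` key list, and the accessors are hypothetical and are not the code merged in this PR.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Objects;
import java.util.Optional;

// Hypothetical sketch only: NOT the class merged in this PR.
final class MergeInsertParamsSketch {
  // Join key columns (assumed field, for illustration).
  private final List<String> on;
  // At most one of the two condition forms is set; null means "unset".
  private String matchedUpdateSqlExpr;
  private ByteBuffer matchedUpdateSubstraitExpr;

  public MergeInsertParamsSketch(List<String> on) {
    this.on = Collections.unmodifiableList(new ArrayList<>(on));
  }

  // SQL-string form of the "when matched, update if" condition.
  public MergeInsertParamsSketch withMatchedUpdateIf(String sqlExpr) {
    this.matchedUpdateSqlExpr = Objects.requireNonNull(sqlExpr, "sqlExpr");
    this.matchedUpdateSubstraitExpr = null; // the last call wins
    return this;
  }

  // Substrait form of the same condition, passed as serialized bytes.
  public MergeInsertParamsSketch withMatchedUpdateIf(ByteBuffer substraitExpr) {
    Objects.requireNonNull(substraitExpr, "substraitExpr");
    // Store a read-only view so later writes through this object are impossible.
    this.matchedUpdateSubstraitExpr = substraitExpr.asReadOnlyBuffer();
    this.matchedUpdateSqlExpr = null; // the last call wins
    return this;
  }

  public List<String> on() {
    return on;
  }

  public Optional<String> matchedUpdateSqlExpr() {
    return Optional.ofNullable(matchedUpdateSqlExpr);
  }

  public Optional<ByteBuffer> matchedUpdateSubstraitExpr() {
    return Optional.ofNullable(matchedUpdateSubstraitExpr);
  }
}
```

In this sketch the two forms are mutually exclusive and the most recent call replaces the other; whether the real API should instead reject a second call is a design choice the comment above leaves open.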

Contributor

@majin1102 majin1102 left a comment

Thanks for raising this PR. I left some comments.
Please take a look when you have time.

}

@Override
public String toString() {

Please use Guava's MoreObjects.toStringHelper() for toString()

import java.util.List;
import java.util.TreeMap;

public class MergeInsertTest extends OperationTestBase {

I'm not sure this test case belongs under the operation package.

  1. This is a data manipulation operation, unlike the other metadata operations.
  2. It is more like a test case for the dataset.

I think putting it under DatasetTest, or under the same package, would be more reasonable. What do you think?

newDataset.allocator = allocator;
}

return new MergeInsertResult(newDataset, result.stats());

Just a little curious:
why return a new MergeInsertResult, since the old one already has the right allocator?

Comment thread java/lance-jni/src/merge_insert.rs Outdated
"DeleteIf" => {
let sql_expr = DFParserBuilder::new(when_not_matched_by_source_delete_expr)
.build()
.unwrap()

I wonder whether we can use ? instead of unwrap() here. IMO unwrap() should only be used in test cases.

@fangbo
Contributor Author

fangbo commented Sep 11, 2025

@jackye1995 @majin1102 Greatly appreciate your suggestions. I have made some modifications according to your comments. Could you please review it again? Thank you.

Contributor

@jackye1995 jackye1995 left a comment

mostly looks good to me, @majin1102 any further comments?

Comment thread rust/lance-datafusion/src/planner.rs Outdated
@jackye1995 jackye1995 merged commit 3091488 into lance-format:main Sep 12, 2025
7 checks passed
@fangbo fangbo deleted the java-merge-insert branch November 4, 2025 07:06
jackye1995 pushed a commit to jackye1995/lance that referenced this pull request Jan 21, 2026
Related Issue lance-format#4050

---------

Co-authored-by: fangbo.0511 <fangbo.0511@bytedance.com>
Labels

enhancement New feature or request java

4 participants