feat(java): expose merge_insert api#4685
Conversation
018b131 to
ba5126d
Compare
|
@jackye1995 @majin1102 This PR is ready. Could you please review it. Thank you. |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #4685 +/- ##
==========================================
- Coverage 80.58% 80.58% -0.01%
==========================================
Files 317 317
Lines 119671 119671
Branches 119671 119671
==========================================
- Hits 96435 96432 -3
- Misses 19777 19780 +3
Partials 3459 3459
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
jackye1995
left a comment
There was a problem hiding this comment.
mostly looks good to me! A few nit comments
| #[no_mangle] | ||
| pub extern "system" fn Java_com_lancedb_lance_Dataset_nativeMergeInsert<'a>( | ||
| mut env: JNIEnv<'a>, | ||
| jdataset: JObject, |
There was a problem hiding this comment.
nit: we should always add comment about the exact type for any JObject
|
|
||
| import java.util.List; | ||
|
|
||
| public class MergeInsert { |
There was a problem hiding this comment.
I think this should better be called MergeInsertParams
| .parse_expr() | ||
| .unwrap(); | ||
|
|
||
| let expr = SqlToRel::new(&LanceContextProvider::default()) |
There was a problem hiding this comment.
I was thinking what's the best way to do this, this is what I am thinking:
- in MergeInsertParams in Java, we should expose not just the String SQL expression API, but also a ByteBuffer API to accept Substrait expression, just like what we do in ScanOptions. So for example
withMatchedUpdateIf(String sqlExpr)andwithMatchedUpdateIf(ByteBuffer substraitExpr) - in JNI, we should just be able to call
LanceFilter::to_datafusionfor both cases
majin1102
left a comment
There was a problem hiding this comment.
Thanks for raising this PR. left some comments.
Please take a look when you have time
| } | ||
|
|
||
| @Override | ||
| public String toString() { |
There was a problem hiding this comment.
Please use guava MoreObjects.toStringHelper() for toString()
| import java.util.List; | ||
| import java.util.TreeMap; | ||
|
|
||
| public class MergeInsertTest extends OperationTestBase { |
There was a problem hiding this comment.
I'm not sure if this testcase should be put under operation package.
- This is a data manipulate unlike other metadata operations.
- This is more like a testcase of dataset.
I think put it under DatasetTest or under the same package would be more reasonable. What do you think
| newDataset.allocator = allocator; | ||
| } | ||
|
|
||
| return new MergeInsertResult(newDataset, result.stats()); |
There was a problem hiding this comment.
Just a little curious.
Why return a new MergeInsertResult since the old one has got the right allocator.
| "DeleteIf" => { | ||
| let sql_expr = DFParserBuilder::new(when_not_matched_by_source_delete_expr) | ||
| .build() | ||
| .unwrap() |
There was a problem hiding this comment.
I don't know if we can use ? instead of unwrap() here. IMO unwrap() should only be used in testcases
|
@jackye1995 @majin1102 Greatly appreciate your suggestions. I have made some modifications according to your comments. Could you please review it again? Thank you. |
jackye1995
left a comment
There was a problem hiding this comment.
mostly looks good to me, @majin1102 any further comments?
Related Issue lance-format#4050 --------- Co-authored-by: fangbo.0511 <fangbo.0511@bytedance.com>
Related Issue #4050