ARROW-5290: [Java] Provide a flag to enable/disable null-checking in vector's get methods #4288
…ectors' get methods
Codecov Report
@@            Coverage Diff             @@
##           master    #4288      +/-   ##
==========================================
+ Coverage   88.46%   88.95%   +0.48%
==========================================
  Files         773      628     -145
  Lines       95602    84909   -10693
  Branches     1251        0    -1251
==========================================
- Hits        84578    75528    -9050
+ Misses      10788     9381    -1407
+ Partials      236        0     -236
Hi @jacques-n and @emkornfield, would you please take a look at this PR?
I think this generally looks fine. It seems like the patch should also include the changes to the vectors, along with the assembly/microbenchmark observations of their impact.
@jacques-n Good suggestion. I have included them as well. Thanks a lot.
It seems like maybe this belongs in the vector package, not memory?
Sounds reasonable. I have moved them accordingly.
I think we should refactor the check method into the parent class, but not as part of this PR. I filed https://issues.apache.org/jira/browse/ARROW-5305 to track this. I'm going to merge the change. Thanks for working through this on the mailing list.
Hi @emkornfield, thanks a lot for all your comments and helpful suggestions, and thank you for opening ARROW-5305 to refine the changes.
jacques-n
left a comment
-1 until we see JIT analysis of both cases (check enabled and not enabled). I want to make sure this condition disappears as it should.
@jacques-n Sorry, I misunderstood the original question. I will hold off on the merge until @liyafan82 links the results (or do you want to handle the merge when you are happy with the change?).
@jacques-n and @emkornfield, thanks a lot for your comments.
When null checking is enabled, we have:
Benchmark Mode Cnt Score Error Units
This is consistent with the 30% performance difference we observed before. When the null-checking flag is disabled, we have:
Benchmark Mode Cnt Score Error Units
We can see that with the flag disabled, the performance is almost identical. In addition, inspecting the assembly code shows that the code for the safe API (with null checking disabled) is almost identical to the code for the unsafe API:
The code for the safe API (with null-checking disabled)
The code for the unsafe API
So the flag solves the performance problem completely.
@jacques-n Sure. Thanks for your kind reminder.
Benchmark Mode Cnt Score Error Units
When the static is removed:
Benchmark Mode Cnt Score Error Units
So the performance is almost identical. Assembly code analysis is ongoing ...
@jacques-n The assembly code is also almost identical:
+1 Looks good. Thanks for checking the benchmark, etc!
…vector's get methods
This PR adds a flag to enable/disable null checking in the "get" methods of vectors. With null checking disabled, the vector APIs can perform better.
To make this PR easier to review, it is split into two parts: this first part provides the flag. If it is OK, another PR will be opened for the second part, which uses the flag in the vector "get" methods.
Author: liyafan82 <fan_li_ya@foxmail.com>
Closes apache#4288 from liyafan82/fly_5290 and squashes the following commits:
588ddbf <liyafan82> Move the option to vector package
456e4ee <liyafan82> Modify get methods and provide the performance results
0c6850e <liyafan82> Provide a flag to enable/disable null-checking in vectors' get methods
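For readers who want a concrete picture of the idea, here is a minimal sketch of a null-checking flag guarding a "get" method. It is not the code from this PR: the class `ToyIntVector`, the field `NULL_CHECKING_ENABLED`, and the system property `toy.vector.null_check_for_get` are all hypothetical names chosen for illustration. The sketch assumes the flag is a `static final` boolean resolved once at class initialization, which is the kind of value a JIT compiler can usually treat as a constant, so the branch may be removed entirely when the flag is off.

```java
import java.util.BitSet;

/** Hypothetical toy vector for illustration only; not an Arrow class. */
final class ToyIntVector {
    // Hypothetical property name. Null checking stays on unless the
    // property is explicitly set to "false" before this class is loaded.
    static final boolean NULL_CHECKING_ENABLED =
        !"false".equals(System.getProperty("toy.vector.null_check_for_get"));

    private final int[] values;
    private final BitSet validity; // a set bit means the slot holds a value

    ToyIntVector(int capacity) {
        this.values = new int[capacity];
        this.validity = new BitSet(capacity);
    }

    void set(int index, int value) {
        values[index] = value;
        validity.set(index);
    }

    boolean isNull(int index) {
        return !validity.get(index);
    }

    int get(int index) {
        // The flag is tested first, so when it is false the validity
        // lookup is never reached.
        if (NULL_CHECKING_ENABLED && isNull(index)) {
            throw new IllegalStateException("Value at index " + index + " is null");
        }
        return values[index];
    }
}
```

With the property left unset, `get` on a null slot throws; starting the JVM with `-Dtoy.vector.null_check_for_get=false` would skip the check and return whatever the underlying storage holds for that slot. Whether the branch really disappears from the compiled code is something to confirm by inspecting the JIT output, as was done in the discussion above.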



