-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-9300: [Java] Separate Netty Memory to its own module #7619
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d8a34cc to
6ea28a3
Compare
faa517f to
7dbb59f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is a frequently called method, I think it would be beneficial to reuse the empty ArrowBuf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is only in tests, but I agree. Fixed
java/vector/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we also need the memory-unsafe dependency here?
Otherwise, our integeration tests will not be able to pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the integration tests use UnsafeAllocator do they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran mvn clean install -Pintegration-tests and it was clean. So the integration tests went ok w/ Netty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your feedback.
It is weird. I thought only unsafe allocator supports allocating a buffer larger than 2GB, because the netty allocator uses int32 internally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the netty allocator can allocate 64-bit spaces but netty buffers can't directly reference them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should run the unit tests once for each of the 2 allocators?
|
Thanks for the comments @liyafan82 . I have updated and rebased to pull in your fix from #7628 |
Sorry for the trouble, and thank you for the effort. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have 3 classes in the code base named "DefaultAllocationManagerFactory". Do we need all of them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @liyafan82, yes i think we do:
- lives in the netty module.
- lives in the unsafe module
- lives in the tests portion of the core memory module.
On startup we first check if a specific memory allocator has been requested. If yes we instantiate that. If no specific impl has been requested then we search the classpath for a DefaultAllocationManagerFactory. If we find one then we instantiate it, else we throw a Runtime exception (can't continue without an allocation manager).
So each concrete impl of AllocationManager requires a factory to create it and we have a trivial allocation manager in arrow-memory-core tests to help the core tests finish.
BryanCutler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks fine for the most part, but I'm not really sure why we need to separate arrow-memory-core and arrow-memory-unsafe? Couldn't those be combined since it wouldn't add any other dependencies, and that would also simplify things. Plus, it doesn't really make sense to have a module arrow-memory-core without a default allocator that would probably build fine with arrow-vector, but then blow up at runtime. What do you think @rymurr and @liyafan82 ?
java/adapter/orc/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was this intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, don't know how that snuck through. Fixed
java/vector/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be removed now right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
java/memory/memory-netty/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add newline?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it still looks like it needs a newline at the end of the file
java/vector/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we should run the unit tests once for each of the 2 allocators?
java/memory/memory-core/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this necessary here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
This is modeled after the slf4j pattern where the logging implementation is separated from the core apis. That way the default pattern can people sourcing the desired allocator via dependency without having to configure one. This seems much cleaner that the previous approaches where people had to manually configure or override via system properties. Additionally, I'd add that for new users I think we would suggest using the Netty one, not the unsafe one. It is much more complete/comprehensive and intelligent. So having a default implementation that we always tell people to override seems worse than they having a hard failure if they don't source any. If we want to make things easier, we could also introduce some vector + allocator depedency poms and then use those in documentation, etc. |
java/memory/memory-core/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems duplicate with the system property defined above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@BryanCutler I agree with you that it makes things simpler. Since we may need to continue to use netty implementation as the default one (as indicated by @jacques-n ), maybe it is beneficial to keep the arrow-memory-unsafe module, at least for now. |
|
I agree that the recommended allocator should still be the netty one for now, so I guess it wouldn't be good to bundle the unsafe allocator as a possible default. I'm good with raising an error then if a default allocator is not on the classpath, as long as it's clear to the user what they need to do. Right now it looks like the error is: @rymurr would you mind adding something to the message like "it is recommended to add |
|
On a related note, it seems like our netty version 4.1.27 is pretty old now, ~2 years, do you all think it would be good to upgrade this before the 1.0.0 release? It looks like there is a security vulnerability pre 4.1.44 https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2019-20445 |
|
Hey all, Thanks @BryanCutler and @liyafan82 for the comments and thanks @jacques-n for the reasoning behind unsafe/netty module split. I have addressed your comments in the most recent update and have created two new tickets:
I will raise PRs for these soon |
As per apache#7619 (comment) there is a security vulnerability in the current version of Netty. This upgrades to the latest version. A compatible upgrade of grpc was also required
BryanCutler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a minor nit on making sure there are newlines at the end of the new pom.xml files for consistency. Also, would you mind improving the error message if no DefaultAllocationManagerFactor mentioned at #7619 (comment)? Otherwise LGTM
java/memory/memory-netty/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it still looks like it needs a newline at the end of the file
java/memory/memory-unsafe/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How come you want a newline? I notice that most/all poms have it but it isn't picked up by the style checker. Not a problem either way for me, just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's usually conventional to have it at the end of text files and some command line tools expect it. Not sure why it's not caught by the style checker, but that's why github marks it with a red icon.
java/memory/memory-core/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
newline here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have added a new line in all 3 locations
5dff1f8 to
928d971
Compare
Both are done now. I mistakenly missed the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, this one is a little different. It looks like it could be from getUnsafeFactory(), getNettyFactory(), or some custom class. I would leave the original exception with clazzName, then in getUnsafeFactory(), getNettyFactory() catch the RuntimeException and add a message like "Please ensure [arrow-memory-netty, arrow-memory-unsafe] or equivalent class containing the DefaultAllocationManager is on the classpath"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea @BryanCutler that is much cleaner. Fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this comment can be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
java/memory/memory-netty/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems this is duplicate with arrow.vector.max_allocation_bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed, removed
java/performance/pom.xml
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also need the dependency of arrow-memory-netty here?
@rymurr Mostly look good to me. Thanks for your effort. |
As per #7619 (comment) there is a security vulnerability in the current version of Netty. This upgrades to the latest version. A compatible upgrade of grpc was also required Closes #7677 from rymurr/ARROW-9370 Authored-by: Ryan Murray <rymurr@dremio.com> Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
|
Thanks again @liyafan82 and @BryanCutler have addressed your comments. |
liyafan82
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
BryanCutler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the update, just need to fix the typo in the error message and I think this will be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be arrow-memory-netty and NettyAllocationManager below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sigh....sorry. Copy paste for the win. Fixed.
This finishes the netty split in arrow-memory and creates 3 new modules * memory-core: core memory implementation * memory-netty: netty allocation manager * memory-unsafe: unsafe allocation manager The bulk of the changes here are moving files and adding the correct dependencies to poms
BryanCutler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
merged to master, thanks @rymurr ! |
As per apache#7619 (comment) the vector tests should be run for both netty and unsafe allocators
As per #7619 (comment) the vector tests should be run for both netty and unsafe allocators. The default tests are run with the Netty allocator, and `run-unsafe` tests are done with the Unsafe Allocator. Closes #7676 from rymurr/ARROW-9371 Authored-by: Ryan Murray <rymurr@dremio.com> Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
As per apache/arrow#7619 (comment) there is a security vulnerability in the current version of Netty. This upgrades to the latest version. A compatible upgrade of grpc was also required Closes #7677 from rymurr/ARROW-9370 Authored-by: Ryan Murray <rymurr@dremio.com> Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
This finishes the netty split in arrow-memory and creates 3 new modules * memory-core: core memory implementation * memory-netty: netty allocation manager * memory-unsafe: unsafe allocation manager The bulk of the changes here are moving files and adding the correct dependencies to poms Closes apache#7619 from rymurr/ARROW-9300 Authored-by: Ryan Murray <rymurr@dremio.com> Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
As per apache#7619 (comment) the vector tests should be run for both netty and unsafe allocators. The default tests are run with the Netty allocator, and `run-unsafe` tests are done with the Unsafe Allocator. Closes apache#7676 from rymurr/ARROW-9371 Authored-by: Ryan Murray <rymurr@dremio.com> Signed-off-by: Bryan Cutler <cutlerb@gmail.com>
This finishes the netty split in arrow-memory and creates 3 new modules
The bulk of the changes here are moving files and adding the correct dependencies to poms