-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-16941: [Java] Consolidate Dataset JNI compilation #13481
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ARROW-16941: [Java] Consolidate Dataset JNI compilation #13481
Conversation
|
|
|
@lidavidm @davisusanibar Would you please review these changes? Thank you. |
|
@github-actions crossbow submit java-jars |
|
Revision: 558ed60 Submitted crossbow builds: ursacomputing/crossbow @ actions-58e01587a4
|
lidavidm
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the CI fails. We need to update the CI:
arrow/ci/scripts/java_jni_manylinux_build.sh
Line 120 in 8b298be
| cp -L ${build_dir}/lib/libarrow_dataset_jni.so ${dist_dir} |
Also, we should presumably update the install instructions to match.
I will take a look. I see the Conda integration test also failed. |
|
@lidavidm The error message I see (many linux build) doesn't look to me to have any direct connection to Java. (see below) Does this kind of thing happen randomly at all? There's also a failure on Conda integration, which I'll look at now. |
|
@lwhite1 ignore the integration test. The important one is the JNI pipeline. After this reorganization, what is the expected process to build and test the JNI libraries? |
oh that. IDK where that came from. When I built locally, it failed so I switched to master and did a clean build and it failed again. I assumed it was a local issue and the CI builds were unaffected. The issue is that the c-data maven module is expecting a variable with one name (for the build dir) and the other three cpp modules expect a different name for the same value. If you run the c-data-module on its own with the expected argument, it works fine. However, the build that creates the other JNI libs also calls the c-data module under the hood, and then it fails - I believe it's because it can't find the directory where the lib file should be. Locally, I got it to work in two ways: First, after building c-data separately, I deleted that step from the other JNI build and everything was fine as it used the previously compiled code. However, I was afraid to commit that change, thinking it would break the CI build. Instead I called the "other JNI" build passing the same value using both argument names. That also worked. I spent some time trying to figure out why this was happening. I couldn't see any related code changes since May so I thought it might just be my environment. We talked about the fact that the c-data-interface was built redundantly in the command line instructions, and whether one version should be removed, but it seemed risky to make that change.
I didn't (consciously) change anything that would require process changes. I will look at the maven build again and see if I can't figure out how to pass the build dir to the c-data module using the name it expects. |
|
Ok. I'm not sure what I was thinking earlier. Based on the description that @davisusanibar gave in #13458, I think he has the right approach to this and that explains why the other PR passes CI. |
|
The CMake project currently under java/dataset is just for generating the JNI header so I don't think it needs modifications right now |
cpp/CMakeLists.txt
Outdated
|
|
||
| if(ARROW_JNI) | ||
| add_subdirectory(src/jni) | ||
| add_subdirectory(../java/dataset/src/main/cpp ./java/jni) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to the other PR, this still needs to be conditioned on if(ARROW_DATASET) right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
the change here was to reset it to the original state |
cpp/CMakeLists.txt
Outdated
| if(ARROW_DATASET) | ||
| add_subdirectory(../java/dataset/src/main/cpp ./java/jni) | ||
| endif() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please use spaces, not tabs, I think that's why the indentation is off:
| if(ARROW_DATASET) | |
| add_subdirectory(../java/dataset/src/main/cpp ./java/jni) | |
| endif() | |
| if(ARROW_DATASET) | |
| add_subdirectory(../java/dataset/src/main/cpp ./java/jni) | |
| endif() |
|
@github-actions crossbow submit java-jars |
|
Revision: f5226ec Submitted crossbow builds: ursacomputing/crossbow @ actions-6e6c6a526d
|
This is a follow-up of apache#13481. We also need to update jni_util.h path in java/dataset/src/main/cpp/jni_util_test.cc. Removing "./" is needless but it's redundant.
This is a follow-up of #13481. We also need to update jni_util.h path in java/dataset/src/main/cpp/jni_util_test.cc. Removing "./" is needless but it's redundant. Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Moves the Dataset JNI C++ code from the arrow/cpp project to the arrow/java project, updating the CMakeLists.txt files as needed. Authored-by: Larry White <lwhite1@users.noreply.github.com> Signed-off-by: David Li <li.davidm96@gmail.com>
This is a follow-up of apache#13481. We also need to update jni_util.h path in java/dataset/src/main/cpp/jni_util_test.cc. Removing "./" is needless but it's redundant. Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Moves the Dataset JNI C++ code from the arrow/cpp project to the arrow/java project, updating the CMakeLists.txt files as needed. Authored-by: Larry White <lwhite1@users.noreply.github.com> Signed-off-by: David Li <li.davidm96@gmail.com>
This is a follow-up of apache#13481. We also need to update jni_util.h path in java/dataset/src/main/cpp/jni_util_test.cc. Removing "./" is needless but it's redundant. Authored-by: Sutou Kouhei <kou@clear-code.com> Signed-off-by: Sutou Kouhei <kou@clear-code.com>
Moves the Dataset JNI C++ code from the arrow/cpp project to the arrow/java project, updating the CMakeLists.txt files as needed.