ARROW-12981: [R] Install source package from CRAN alone #11001
|
These builds are failing because they set |
nealrichardson
left a comment
Thanks for tackling this! Some initial notes, will give it a closer read later
|
In the latest commit, I removed I wasn't positive I got the logic right in this section of |
I think so, but we'll know for sure once we set up CI.
Yes, looks right, I just suggested a further simplification now that we can. As for CI, there will be an arrow-r-nightly change needed in order to do the rsync etc. that you added to r/Makefile, but the regular CI we want will be in arrow, in our "crossbow" nightly and on-demand builds. There's a bunch of yaml that configures templates here, if you want to take a stab at it. @jonkeane is back from vacation next week and can help with setting that up too. Also, I just want to reiterate: this is great, thank you very much for taking the initiative on this. |
- That is, use the `with_mimalloc()` and `with_s3_support()` functions
- Remove an unused function
- Add a task `test-r-offline-minimal` that sets `TEST_OFFLINE_BUILD`
nealrichardson
left a comment
I think we're getting closer! Need to look into CI next so we can confirm that this is actually doing what we think it is :)
|
Looking through the logs, I'm still downloading XSIMD when |
- Move `download_optional_dependencies()` function
- Change output of `download_optional_dependencies` to directory used, and input to `ARROW_THIRDPARTY_DEPENDENCY_DIR` if it's set.
- Enable `ARROW_VERBOSE_THIRDPARTY_BUILD` if `ARROW_R_DEV` is true and we're setting `*_SOURCE_URL` flags so the printed log shows the file used.
- Change env var management in `nixlibs.R` to work with a vector, rather than adding to one long string.
- Add checks to env vars: names must be standard, values can't contain `'`
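For illustration, the env var checks mentioned in the last item could look roughly like the shell sketch below. This is not the code from the PR (which is R): the function name `is_safe_env_var` is hypothetical, and "standard" names are assumed here to mean letters, digits, and underscores, not starting with a digit.

```sh
#!/bin/sh
# Hypothetical helper (not from the PR): returns 0 when NAME and VALUE pass
# the two checks, 1 otherwise.
is_safe_env_var() {
  name=$1
  value=$2
  # Assumed meaning of "standard" names: non-empty, only letters, digits,
  # and underscores, and not starting with a digit.
  case $name in
    '' | [0-9]* | *[!A-Za-z0-9_]*) return 1 ;;
  esac
  # Values must not contain a single quote, so they can be wrapped in '...'.
  case $value in
    *\'*) return 1 ;;
  esac
  return 0
}

# Usage: only report variables that pass both checks.
if is_safe_env_var ARROW_R_DEV true; then
  echo "ARROW_R_DEV is safe to pass along"
fi
```

Rejecting single quotes outright is simpler than escaping them, at the cost of refusing some otherwise valid values.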
|
@github-actions crossbow submit -g test-r-offline-minimal |
Looking at https://github.com/apache/arrow/blob/master/cpp/cmake_modules/ThirdpartyToolchain.cmake#L1929-L1935, it looks like you have to set the RUNTIME level to NONE also. |
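As a rough sketch of that suggestion: the linked condition appears to pull in xsimd unless both the compile-time and the runtime SIMD levels are `NONE`. The option names below (`ARROW_SIMD_LEVEL`, `ARROW_RUNTIME_SIMD_LEVEL`) are the Arrow C++ CMake options as I understand them, and the helper `simd_cmake_flags` is hypothetical. How the flags reach `cmake` depends on the build script in use.

```sh
#!/bin/sh
# Hypothetical helper: prints the two CMake definitions that together
# disable SIMD. Setting only the first one is not enough to skip xsimd.
simd_cmake_flags() {
  printf '%s %s\n' "-DARROW_SIMD_LEVEL=NONE" "-DARROW_RUNTIME_SIMD_LEVEL=NONE"
}

# Show the flags that would be appended to the cmake invocation.
echo "extra cmake flags: $(simd_cmake_flags)"
```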
|
@github-actions crossbow submit test-r-offline-minimal |
|
Revision: 5a13cbf Submitted crossbow builds: ursacomputing/crossbow @ actions-802
|
|
@github-actions crossbow submit test-r-offline-maximal |
|
Revision: 939cb87 Submitted crossbow builds: ursacomputing/crossbow @ actions-816
|
|
Aaah, actually the original configuration was fine (though I've adjusted the "Dump test logs" step to always run, so that it's easier to confirm without downloading the artifacts). This took a bit of RTFM, but the output of |
|
Ok, this looks good, though I'm seeing that S3 was disabled. I'll make sure we've got the dependencies installed on the host so that we can catch that too. Here are the skips: |
|
@github-actions crossbow submit -g r
I'm running the full suite again since we're close and want to make sure this didn't (accidentally) do anything to our other builds.
|
Revision: 6bf7b85 Submitted crossbow builds: ursacomputing/crossbow @ actions-817 |
|
@github-actions crossbow submit test-r-offline-minimal |
|
Revision: ec726d6 Submitted crossbow builds: ursacomputing/crossbow @ actions-818
|
|
I'll give this one last read through before merging, but I think this is good to go. Thank you for all this work + taking the journey with us as we found the best way to accomplish this. |
|
Thanks! I learned a bunch doing this. I had a couple minor questions, following up on comments from @nealrichardson:
|
jonkeane
left a comment
I went through this and it's looking good. I've made a few suggestions (mostly wording) and one question about a possible failure mode.
- Should I swap the argument order for create_package_with_all_dependencies?
I commented on the `create_package_with_all_dependencies()` argument order inline with my comments (I think it's fine as it is, but also fine the other way).
- Should create_package_with_all_dependencies check ARROW_THIRDPARTY_DEPENDENCY_DIR?
I don't think it needs to. If one already had the dependencies, one could use the hands-on approach. I also suspect that for most people a full new download will give the best experience and avoid version mismatches and the like (yeah, it's a little wasteful of bandwidth, but not too much).
- Before this is finalized, do you want to change any of the new names I've made up? (edit)
I think the names are good personally. `create_package_with_all_dependencies()` is the main UX entry point; it's long but nice and descriptive. We could 🚲 🏠 a bit and change `package` to `bundle`, but I'm not sure that's a user-experience enhancement.
nealrichardson
left a comment
Beautiful work, thank you for doing this!
e3e4f60 to 7a0bdd4
|
Thank you both for all your help and patience getting this across the finish line! |
I took a stab at implementing the approach @nealrichardson laid out in [ARROW-12981](https://issues.apache.org/jira/browse/ARROW-12981?focusedCommentId=17400415#comment-17400415). Please let me know what you think, and if you'd like any changes!

I wrote some basic tests for the `download_optional_dependencies()` helper function, but it would be good to have more comprehensive install tests. These could be something like:

```sh
export LIBARROW_BINARY=false
export LIBARROW_BUILD=true
export LIBARROW_DOWNLOAD=false
export LIBARROW_MINIMAL=false

# Make sure offline, feature-light installation works
R -e "install.packages('arrow_x.y.z.p.tar.xz')"
R -e 'stopifnot(arrow::arrow_available(), isFALSE(arrow::arrow_info()$capabilities["parquet"]))'

# Download and install the thirdparty features
R -e "arrow::download_optional_dependencies('arrow-thirdparty')"
source arrow-thirdparty/DEFINE_ENV_VARS.sh
R -e "install.packages('arrow_x.y.z.p.tar.xz')"
R -e 'stopifnot(arrow::arrow_available(), isTRUE(arrow::arrow_info()$capabilities["parquet"]))'
```

Closes apache#11001 from karldw/fix-12981

Lead-authored-by: karldw <karldw@users.noreply.github.com>
Co-authored-by: Jonathan Keane <jkeane@gmail.com>
Signed-off-by: Jonathan Keane <jkeane@gmail.com>