-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-11460: [R] Use system libraries if present on Linux #10710
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@github-actions crossbow submit test-r-depsource |
|
Revision: 58dc9db0deb842189306982ee87629d00faf0d86 Submitted crossbow builds: ursacomputing/crossbow @ actions-595
|
|
@github-actions crossbow submit test-r-depsource |
|
Revision: e3ffb69de9799801d5c7434405ca907867c2795e Submitted crossbow builds: ursacomputing/crossbow @ actions-596
|
|
@github-actions crossbow submit test-r-depsource |
|
Revision: 55e28b6a55c417124d7a3b48c571eb473fa63136 Submitted crossbow builds: ursacomputing/crossbow @ actions-597
|
|
@github-actions crossbow submit test-r-depsource |
1 similar comment
|
@github-actions crossbow submit test-r-depsource |
|
Revision: 309842e1e8e21c0cf8b226bf8b4ba93b4fea5667 Submitted crossbow builds: ursacomputing/crossbow @ actions-598
|
|
@github-actions crossbow submit test-r-depsource |
|
Revision: aae65707cd2beed2ab38147b82c43a7f08af4ee6 Submitted crossbow builds: ursacomputing/crossbow @ actions-599
|
|
@github-actions crossbow submit test-r-depsource |
|
Revision: 525abeafa20107eb521a34ee7247d3e8284c8b9c Submitted crossbow builds: ursacomputing/crossbow @ actions-643
|
|
@github-actions crossbow submit test-r-depsource |
|
Revision: 16b5bbc0b936621d1cc3f2cb75f81ca7c05c129a Submitted crossbow builds: ursacomputing/crossbow @ actions-662
|
|
@github-actions crossbow submit test-r-depsource |
|
Revision: 0846bd785632cfc3c78d3f8a09dde4d40d50e1d9 Submitted crossbow builds: ursacomputing/crossbow @ actions-663
|
r/inst/build_arrow_static.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it makes sense to try and concoct a CI configuration that hits this else, but I'm curious if you've tried to do it locally and check that you get the expected behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not but happy to do so to test it. What do you think is the most straightforward approach?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should I just spin up a docker image?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this I would ARROW_R_DEV=TRUE R_STATIC_DEPENDENCY_SOURCE=AUTO docker-compose build r and then interactively enter the docker container that is built there, uninstall pkg-config and then run /arrow/ci/scripts/r_test.sh or try installing manually
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, didn't see your second comment before I sent mine, yeah, exactly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want ARROW_R_DEV=TRUE docker run -it apache/arrow-dev:r-rhub-ubuntu-gcc-release-latest -e ARROW_DEPENDENCY_SOURCE=AUTO /bin/bash (get the right variable name, and pass it into the runtime environment with -e)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh you also need -v $(pwd):/arrow or something like that, that's what mounts the /arrow directory based on your local dir
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Neal!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify then, the reason why ARROW_DEPENDENCY_SOURCE needs to be passed in via -e but ARROW_R_DEV doesn't is:
- 'r' (defined in docker-compose.yml) is a service
- specifying environment variables ahead of calling
docker runallows you to set env vars for running that service - with
ARROW_R_DEV, it doesn't matter whether you specify it via-eor via prefixing the call todocker runas it gets passed through anyway via theenvironmentfield of therservice - with
ARROW_DEPENDENCY_SOURCE, it must be set via-eas there's nothing defined in the service to explicitly pass it through
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now tested and working as expected when pkg-config isn't installed
.env
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this here. We have to put env vars here that affect what the Dockerfiles build, but this is a runtime env var, so we can just pass it with -e and don't need to declare it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To add to that, here's one example of how that's done for a different CI job: https://github.com/apache/arrow/blob/master/.github/workflows/r-without-arrow.yml#L76
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And for here, I would do ~the opposite of what I said yesterday, we want something like -e ARROW_DEPENDENCY_SOURCE=... in the docker call since that's the environment variable that we advertise and use throughout Arrow to determine that source.
Yesterday I mentioned calling it something like R_STATIC_DEP... in the .env file, which if we needed to pass it there would be good to mark that it only was getting passed to the R builds and wasn't a parameter that would help on a python build. But Neal's right that we should just put it in the docker command in the GHA workflow file and not worry about passing it around in docker-compose or the .env file. Sorry for my misdirection there yesterday!
r/inst/build_arrow_static.sh
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't quite right/complete, and I think if we look at what is required to be complete, we'll see that we don't actually want this logic in this script. Working backwards, the point of Kou's change is that the pkgconfig files that the arrow C++ build does now contain all of the -l libs that the build requires from the system (which shouldn't be any if we're doing a BUNDLED build, but with AUTO there will be libs in it). So we need to make sure the PKG_LIBS env var in r/configure gets them. And since this script, and even r/tools/nixlibs.R, can't return anything or set anything in the env that r/configure is running in, we need to call pkg-config on the arrow C++ install dir from r/configure. (This is the missing par.) But that will only work if pkg-config exists, so we need to check for it like you do here--but I think we need to check for it in r/configure so that that script know whether to set AUTO or BUNDLED.
r/configure
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't want this change, this is for checking whether libarrow already exists on your system. (If you want to make a change here, edit the comments to clarify which case this is.)
| if [ $? -eq 0 ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ] && [ "$ARROW_DEPENDENCY_SOURCE" == "AUTO" ]; then | |
| if [ $? -eq 0 ] && [ "$ARROW_USE_PKG_CONFIG" != "false" ]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
However, you may want to pull Kou's suggestion from #10404 (comment) in here (though I'm not certain about the --static part in this block, may have to experiment locally)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these comments are still valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't disagree! I'm still working on this - converted it to a draft for the moment whilst I go read up on things as I realised I was just changing code as per all the helpful suggestions, but not understanding why.
r/configure
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't want this here. This is only relevant if you've run nixlibs.R to build libarrow. Put this somewhere around L176.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, before L160 (the call to nixlibs.R), add something like:
if [ "${ARROW_DEPENDENCY_SOURCE}" = "" ]; then
# TODO: BUNDLED is still default for now, but we plan to change it to AUTO
ARROW_DEPENDENCY_SOURCE=BUNDLED; export ARROW_DEPENDENCY_SOURCE
fi
if [ "${ARROW_DEPENDENCY_SOURCE}" = "AUTO" ]; then
which pkg-config >/dev/null 2>&1
if [ $? -eq 0 ]; then
export ARROW_DEPENDENCY_SOURCE=BUNDLED
# echo something?
fi
fi
Finally, you also need to edit build_arrow_static.sh to take ARROW_DEPENDENCY_SOURCE from the environment (default arg there BUNDLED)
r/configure
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you want something like this, though we may need to adapt PKG_CONFIG_NAME based on whether dataset and parquet are built (i.e. listed in BUNDLED_LIBS, defined right above where this goes)
| PKG_LIBS = "pkgconfig ${LIB_DIR} $PKG_LIBS" | |
| PKG_LIBS="$PKG_LIBS `PKG_CONFIG_PATH=${LIB_DIR} pkg-config --libs-only-l --libs-only-other --static --silence-errors ${PKG_CONFIG_NAME}`" |
r/configure
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you still need #10710 (comment) here, and also adding a comment explaining what's up would help our future selves.
…ENCY_SOURCE in linux-apt-r
|
@github-actions crossbow submit conda-linux-gcc-py36-cpu-r40 conda-linux-gcc-py37-cpu-r41 test-r-depsource-auto test-r-depsource-system test-r-ubuntu-21.04 test-r-gcc-11 |
|
Revision: 2cf3a26 Submitted crossbow builds: ursacomputing/crossbow @ actions-781
|
|
@github-actions crossbow -g r |
|
|
@github-actions crossbow submit -g r |
|
Revision: 2cf3a26 Submitted crossbow builds: ursacomputing/crossbow @ actions-782 |
|
AFAICT, conda-osx-clang-py36-r40, conda-osx-clang-py37-r41, homebrew-r-autobrew, test-r-rhub-debian-gcc-devel-lto-latest, and test-ubuntu-18.04-r-sanitizer are all failing on crossbow already independent of this change. |
|
Ok, I checked the So, +1 from me again on this. Will merge. |
No description provided.