Closed
Commits
31 commits
bba466c
Start on offline build
karldw Aug 24, 2021
0cac874
Add checks for features
karldw Aug 24, 2021
318b166
Fixes to offline build
karldw Aug 25, 2021
adef21d
Re-enable cmake download, add to pkgdown
karldw Aug 25, 2021
5626cb7
Remove LIBARROW_DOWNLOAD, add TEST_OFFLINE_BUILD + minor fixes
karldw Aug 25, 2021
562176f
Simplify turning features off, downloading thirdparty
karldw Aug 26, 2021
81cd30a
Merge branch 'master' into fix-12981
karldw Aug 26, 2021
c03d7dc
Re-add system requirements in nixlibs
karldw Aug 27, 2021
4e2ef52
Tweaks to offline build
karldw Aug 27, 2021
5a13cbf
Set ARROW_RUNTIME_SIMD_LEVEL=NONE as well
karldw Aug 28, 2021
98b5601
Fix identify_os() logic, clarify offline text
karldw Aug 30, 2021
5b9bb77
Clarify build/download explanations
karldw Aug 30, 2021
e08b306
Allow overriding download_dependencies_sh, first stab at CI job for m…
jonkeane Aug 30, 2021
479b054
a few more tweaks
jonkeane Aug 31, 2021
2410d55
docs
jonkeane Aug 31, 2021
adb089e
Change to packing all dependencies into one file
karldw Aug 31, 2021
b560f0b
Merge remote-tracking branch 'upstream/master' into fix-12981
karldw Aug 31, 2021
6daff45
Add narration to create_package_with_all_dependencies
karldw Aug 31, 2021
74093de
Fix check warnings
karldw Aug 31, 2021
2d73ac1
Cleanup create_package_with_all_dependencies
karldw Sep 2, 2021
130683a
Update dev/tasks/r/github.linux.offline.build.yml
jonkeane Sep 3, 2021
13d8c4e
fix testthat output display/uploading
jonkeane Sep 3, 2021
ad9adb4
oops, revert back to original setup
jonkeane Sep 3, 2021
939cb87
always print the testthat output
jonkeane Sep 3, 2021
0fc5600
disable building on git tags
jonkeane Sep 3, 2021
6bf7b85
add openssl + libssl dependencies
jonkeane Sep 3, 2021
1ac2f54
Merge remote-tracking branch 'upstream/master' into fix-12981
karldw Sep 3, 2021
ec726d6
Fix JSON comment
karldw Sep 3, 2021
c0b7f96
Docs tweaks
karldw Sep 4, 2021
7a0bdd4
Clarify binary packages
karldw Sep 6, 2021
2e736f3
Merge branch 'master' into fix-12981
jonkeane Sep 7, 2021
117 changes: 117 additions & 0 deletions dev/tasks/r/github.linux.offline.build.yml
@@ -0,0 +1,117 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# NOTE: must set "Crossbow" as name to have the badge links working in the
# github comment reports!
name: Crossbow

on:
  push:
    branches:
      - "*-github-*"

jobs:
  grab-dependencies:
    name: "Download thirdparty dependencies"
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
    env:
      ARROW_R_DEV: "TRUE"
      RSPM: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"
    steps:
      - name: Checkout Arrow
        run: |
          git clone --no-checkout {{ arrow.remote }} arrow
          git -C arrow fetch -t {{ arrow.remote }} {{ arrow.branch }}
          git -C arrow checkout FETCH_HEAD
          git -C arrow submodule update --init --recursive
      - name: Free Up Disk Space
        shell: bash
        run: arrow/ci/scripts/util_cleanup.sh
      - name: Fetch Submodules and Tags
        shell: bash
        run: cd arrow && ci/scripts/util_checkout.sh
      - uses: r-lib/actions/setup-r@v1
      - name: Pull Arrow dependencies
        run: |
          cd arrow/r
          # This is `make build`, but with no vignettes and not running `make doc`
          cp ../NOTICE.txt inst/NOTICE.txt
          rsync --archive --delete ../cpp tools/
          cp -p ../.env tools/
          cp -p ../NOTICE.txt tools/
          cp -p ../LICENSE.txt tools/
          R CMD build --no-build-vignettes --no-manual .
          built_tar=$(ls -1 arrow*.tar.gz | head -n 1)
          R -e "source('R/install-arrow.R'); create_package_with_all_dependencies(dest_file = 'arrow_with_deps.tar.gz', source_file = \"${built_tar}\")"
        shell: bash
      - name: Upload the third party dependency artifacts
        uses: actions/upload-artifact@v2
        with:
          name: thirdparty_deps
          path: arrow/r/arrow_with_deps.tar.gz

  install-offline:
    name: "Install offline"
    needs: [grab-dependencies]
    runs-on: ubuntu-20.04
    strategy:
      fail-fast: false
    env:
      ARROW_R_DEV: "TRUE"
      RSPM: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"
    steps:
      - name: Checkout Arrow
        run: |
          git clone --no-checkout {{ arrow.remote }} arrow
          git -C arrow fetch -t {{ arrow.remote }} {{ arrow.branch }}
          git -C arrow checkout FETCH_HEAD
          git -C arrow submodule update --init --recursive
      - uses: r-lib/actions/setup-r@v1
      - name: Download artifacts
        uses: actions/download-artifact@v2
        with:
          name: thirdparty_deps
          path: arrow/r/
      - name: Install system dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y libcurl4-openssl-dev libssl-dev
      - name: Install dependencies
        run: |
          install.packages(c("remotes", "glue", "sys"))
          remotes::install_deps("arrow/r", dependencies = TRUE)
        shell: Rscript {0}
      - name: Install
        env:
          TEST_OFFLINE_BUILD: true
          LIBARROW_MINIMAL: false
        run: |
          cd arrow/r
          R CMD INSTALL --install-tests --no-test-load --no-docs --no-help --no-byte-compile arrow_with_deps.tar.gz
      - name: Run the tests
        run: R -e 'if(tools::testInstalledPackage("arrow") != 0L) stop("There was a test failure.")'
      - name: Dump test logs
        run: cat arrow-tests/testthat.Rout*
        if: always()
      - name: Save the test output
        uses: actions/upload-artifact@v2
        with:
          name: test-output
          path: arrow-tests/testthat.Rout*
        if: always()
13 changes: 13 additions & 0 deletions dev/tasks/tasks.yml
@@ -1033,6 +1033,19 @@ tasks:
      flags: '-e ARROW_SOURCE_HOME="/arrow" -e FORCE_BUNDLED_BUILD=TRUE -e LIBARROW_BUILD=TRUE -e ARROW_DEPENDENCY_SOURCE=SYSTEM'
      image: ubuntu-r-only-r

  test-r-offline-minimal:
    ci: azure
    template: r/azure.linux.yml
    params:
      r_org: rocker
      r_image: r-base
      r_tag: latest
      flags: '-e TEST_OFFLINE_BUILD=true'

  test-r-offline-maximal:
    ci: github
    template: r/github.linux.offline.build.yml


{% for r_org, r_image, r_tag in [("rhub", "ubuntu-gcc-release", "latest"),
("rocker", "r-base", "latest"),
8 changes: 8 additions & 0 deletions r/.gitignore
@@ -18,3 +18,11 @@ vignettes/nyc-taxi/
arrow_*.tar.gz
arrow_*.tgz
extra-tests/files

# C++ sources for an offline build. They're copied from the ../cpp directory, so ignore them here.
/tools/cpp/
# cmake expects .env, NOTICE.txt, and LICENSE.txt to be available one level up
# from cpp/, but again, they're just copies
/tools/.env
/tools/LICENSE.txt
/tools/NOTICE.txt
7 changes: 7 additions & 0 deletions r/Makefile
@@ -36,8 +36,14 @@ test:
deps:
	R -s -e 'lib <- Sys.getenv("R_LIB", .libPaths()[1]); install.packages("devtools", repo="https://cloud.r-project.org", lib=lib); devtools::install_dev_deps(lib=lib)'

# Note: files in tools are available at build time, but not at run time. The thirdparty
# cmake expects .env, NOTICE.txt, and LICENSE.txt to be available one level up from cpp/
build: doc
	cp ../NOTICE.txt inst/NOTICE.txt
	rsync --archive --delete ../cpp tools/
	cp -p ../.env tools/
	cp -p ../NOTICE.txt tools/
	cp -p ../LICENSE.txt tools/
	R CMD build .

check: build
@@ -56,4 +62,5 @@ clean:
	-rm src/Makevars.win
	-rm -rf arrow.Rcheck/
	-rm -rf libarrow/
	-rm -rf tools/cpp/ tools/.env tools/NOTICE.txt tools/LICENSE.txt
	-find . -name "*.orig" -delete
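The staging that the `build` target performs, and the layout the thirdparty cmake is said to expect (`.env`, `NOTICE.txt`, and `LICENSE.txt` one level up from `cpp/`), can be sketched in plain shell. This is only an illustration under assumptions: it fabricates a throwaway repository layout in a temporary directory, and it uses `rm -rf` plus `cp -R` as a stand-in for `rsync --archive --delete` so that it runs without rsync.

```shell
#!/bin/sh
# Sketch only: stage C++ sources into tools/ and confirm the expected layout.
# The repository created here is a hypothetical stand-in, not the Arrow tree.
set -e

repo=$(mktemp -d)
trap 'rm -rf "$repo"' EXIT

# Fake top-level files and directories that the Makefile copies from.
mkdir -p "$repo/cpp/thirdparty" "$repo/r/tools"
: > "$repo/.env"
: > "$repo/NOTICE.txt"
: > "$repo/LICENSE.txt"

cd "$repo/r"

# Stand-in for `rsync --archive --delete ../cpp tools/`:
# remove any stale copy first, then copy the tree.
rm -rf tools/cpp
cp -R ../cpp tools/
cp -p ../.env ../NOTICE.txt ../LICENSE.txt tools/

# Check that the three files sit one level above tools/cpp/.
missing=""
for f in tools/cpp tools/.env tools/NOTICE.txt tools/LICENSE.txt; do
  [ -e "$f" ] || missing="$missing $f"
done

if [ -z "$missing" ]; then
  echo "staging complete"
else
  echo "missing:$missing"
fi
```

Because `tools/` is only available at build time, a check like this belongs before `R CMD build`, not in code that runs after installation.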
1 change: 1 addition & 0 deletions r/NAMESPACE
@@ -201,6 +201,7 @@ export(codec_is_available)
export(contains)
export(copy_files)
export(cpu_count)
export(create_package_with_all_dependencies)
export(dataset_factory)
export(date32)
export(date64)
102 changes: 101 additions & 1 deletion r/R/install-arrow.R
@@ -70,7 +70,6 @@ install_arrow <- function(nightly = FALSE,
}
} else {
Sys.setenv(
LIBARROW_DOWNLOAD = "true",
LIBARROW_BINARY = binary,
LIBARROW_MINIMAL = minimal,
ARROW_R_DEV = verbose,
@@ -137,3 +136,104 @@ reload_arrow <- function() {
message("Please restart R to use the 'arrow' package.")
}
}


#' Create a source bundle that includes all thirdparty dependencies
#'
#' @param dest_file File path for the new tar.gz package. Defaults to
#' `arrow_V.V.V_with_deps.tar.gz` in the current directory (`V.V.V` is the version)
#' @param source_file File path for the input tar.gz package. Defaults to
#' downloading the package from CRAN (or whatever you have set as the first in
#' `getOption("repos")`)
#' @return The full path to `dest_file`, invisibly
#'
#' This function is used for setting up an offline build. If it's possible to
#' download at build time, don't use this function. Instead, let `cmake`
#' download the required dependencies for you.
#' These downloaded dependencies are only used in the build if
#' `ARROW_DEPENDENCY_SOURCE` is unset, `BUNDLED`, or `AUTO`. For details, see
#' https://arrow.apache.org/docs/developers/cpp/building.html#offline-builds
#'
#' If you're using binary packages you shouldn't need to use this function. You
#' should download the appropriate binary from your package repository, transfer
#' that to the offline computer, and install that. Any OS can create the source
#' bundle, but it cannot be installed on Windows. (Instead, use a standard
#' Windows binary package.)
#'
#' Note if you're using RStudio Package Manager on Linux: If you still want to
#' make a source bundle with this function, make sure to set the first repo in
#' `options("repos")` to be a mirror that contains source packages (that is:
#' something other than the RSPM binary mirror URLs).
#'
#' ## Steps for an offline install with optional dependencies:
#'
#' ### Using a computer with internet access, pre-download the dependencies:
#' * Install the `arrow` package _or_ run
#' `source("https://raw.githubusercontent.com/apache/arrow/master/r/R/install-arrow.R")`
#' * Run `create_package_with_all_dependencies("my_arrow_pkg.tar.gz")`
#' * Copy the newly created `my_arrow_pkg.tar.gz` to the computer without internet access
#'
#' ### On the computer without internet access, install the prepared package:
#' * Install the `arrow` package from the copied file
#'   * `install.packages("my_arrow_pkg.tar.gz", dependencies = c("Depends", "Imports", "LinkingTo"))`
#'   * This installation will build from source, so `cmake` must be available
#' * Run [arrow_info()] to check installed capabilities
#'
#'
#' @examples
#' \dontrun{
#' new_pkg <- create_package_with_all_dependencies()
#' # Note: this works when run in the same R session, but it's meant to be
#' # copied to a different computer.
#' install.packages(new_pkg, dependencies = c("Depends", "Imports", "LinkingTo"))
#' }
#' @export
create_package_with_all_dependencies <- function(dest_file = NULL, source_file = NULL) {
  if (is.null(source_file)) {
    pkg_download_dir <- tempfile()
    dir.create(pkg_download_dir)
    on.exit(unlink(pkg_download_dir, recursive = TRUE), add = TRUE)
    message("Downloading Arrow source file")
    downloaded <- utils::download.packages("arrow", destdir = pkg_download_dir, type = "source")
    source_file <- downloaded[1, 2, drop = TRUE]
  }
  if (!file.exists(source_file) || !endsWith(source_file, "tar.gz")) {
    stop("Arrow package .tar.gz file not found")
  }
  if (is.null(dest_file)) {
    # e.g. convert /path/to/arrow_5.0.0.tar.gz to ./arrow_5.0.0_with_deps.tar.gz
    # (add 'with_deps' for clarity if the file was downloaded locally)
    # The dots are escaped so that only a literal ".tar.gz" suffix is stripped.
    dest_file <- paste0(gsub("\\.tar\\.gz$", "", basename(source_file)), "_with_deps.tar.gz")
  }
  untar_dir <- tempfile()
  on.exit(unlink(untar_dir, recursive = TRUE), add = TRUE)
  utils::untar(source_file, exdir = untar_dir)
  tools_dir <- file.path(untar_dir, "arrow/tools")
  download_dependencies_sh <- file.path(tools_dir, "cpp/thirdparty/download_dependencies.sh")
  # If you change this path, also need to edit nixlibs.R
  download_dir <- file.path(tools_dir, "thirdparty_dependencies")
  dir.create(download_dir)

  message("Downloading files to ", download_dir)
  download_successful <- system2(download_dependencies_sh, download_dir, stdout = FALSE) == 0
  if (!download_successful) {
    stop("Failed to download thirdparty dependencies")
  }
  # Need to change directory to untar_dir so tar() will use relative paths. That
  # means we'll need a full, non-relative path for dest_file. (extra_flags="-C"
  # doesn't work with R's internal tar)
  orig_wd <- getwd()
  on.exit(setwd(orig_wd), add = TRUE)
  # normalizePath() may return the input unchanged if dest_file doesn't exist,
  # so create it first.
  file.create(dest_file)
  dest_file <- normalizePath(dest_file, mustWork = TRUE)
  setwd(untar_dir)

  message("Repacking tar.gz file to ", dest_file)
  tar_successful <- utils::tar(dest_file, compression = "gz") == 0
  if (!tar_successful) {
    stop("Failed to create new tar.gz file")
  }
  invisible(dest_file)
}
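The unpack, add, and repack flow of `create_package_with_all_dependencies()` can be approximated in shell. The sketch below is an illustration under assumptions, not the package's implementation: it builds a stand-in tarball instead of downloading from CRAN, writes a placeholder file instead of running `download_dependencies.sh`, and uses `tar -C` where the R code changes the working directory. The file names `arrow_0.0.0.tar.gz` and `dummy.tar.gz` are hypothetical.

```shell
#!/bin/sh
# Sketch only: mimic the unpack / add files / repack flow with plain tar.
# All file names here are hypothetical placeholders.
set -e

work_dir=$(mktemp -d)
trap 'rm -rf "$work_dir"' EXIT

# 1. Build a stand-in for the source package (the real one comes from CRAN).
mkdir -p "$work_dir/src/arrow/tools"
echo "Package: arrow" > "$work_dir/src/arrow/DESCRIPTION"
tar -czf "$work_dir/arrow_0.0.0.tar.gz" -C "$work_dir/src" arrow

# 2. Derive the default output name: foo.tar.gz -> foo_with_deps.tar.gz
source_file="$work_dir/arrow_0.0.0.tar.gz"
dest_file="$work_dir/$(basename "$source_file" .tar.gz)_with_deps.tar.gz"

# 3. Unpack, then add a directory standing in for the downloaded dependencies.
untar_dir="$work_dir/untar"
mkdir -p "$untar_dir"
tar -xzf "$source_file" -C "$untar_dir"
mkdir -p "$untar_dir/arrow/tools/thirdparty_dependencies"
echo "placeholder" > "$untar_dir/arrow/tools/thirdparty_dependencies/dummy.tar.gz"

# 4. Repack with paths relative to the unpack directory, so the archive
#    keeps "arrow/" as its top-level entry.
tar -czf "$dest_file" -C "$untar_dir" arrow

# List the repacked archive to confirm the extra directory is included.
repacked_listing=$(tar -tzf "$dest_file")
echo "$repacked_listing"
```

Keeping the archive paths relative is the point of step 4: `R CMD INSTALL` expects the package directory at the top of the tarball, which is why the R function switches into `untar_dir` before calling `utils::tar()`.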
1 change: 1 addition & 0 deletions r/_pkgdown.yml
@@ -175,6 +175,7 @@ reference:
- arrow_available
- install_arrow
- install_pyarrow
- create_package_with_all_dependencies

repo:
jira_projects: [ARROW]
21 changes: 9 additions & 12 deletions r/configure
@@ -39,7 +39,7 @@ FORCE_AUTOBREW=`echo $FORCE_AUTOBREW | tr '[:upper:]' '[:lower:]'`
FORCE_BUNDLED_BUILD=`echo $FORCE_BUNDLED_BUILD | tr '[:upper:]' '[:lower:]'`
ARROW_USE_PKG_CONFIG=`echo $ARROW_USE_PKG_CONFIG | tr '[:upper:]' '[:lower:]'`
LIBARROW_MINIMAL=`echo $LIBARROW_MINIMAL | tr '[:upper:]' '[:lower:]'`
LIBARROW_DOWNLOAD=`echo $LIBARROW_DOWNLOAD | tr '[:upper:]' '[:lower:]'`
TEST_OFFLINE_BUILD=`echo $TEST_OFFLINE_BUILD | tr '[:upper:]' '[:lower:]'`
NOT_CRAN=`echo $NOT_CRAN | tr '[:upper:]' '[:lower:]'`

VERSION=`grep '^Version' DESCRIPTION | sed s/Version:\ //`
@@ -129,18 +129,15 @@ else
# autobrew sets `PKG_LIBS`, `PKG_DIRS`, and `PKG_CFLAGS`
fi
else

# Set some default values/backwards compatibility
if [ "${LIBARROW_DOWNLOAD}" = "" ] && [ "${NOT_CRAN}" != "" ]; then
LIBARROW_DOWNLOAD=$NOT_CRAN; export LIBARROW_DOWNLOAD
fi
if [ "${LIBARROW_BINARY}" = "" ] && [ "${LIBARROW_DOWNLOAD}" != "" ]; then
LIBARROW_BINARY=$LIBARROW_DOWNLOAD; export LIBARROW_BINARY
fi
if [ "${LIBARROW_MINIMAL}" = "" ] && [ "${LIBARROW_DOWNLOAD}" = "true" ]; then
LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
fi
if [ "${LIBARROW_MINIMAL}" = "" ] && [ "${NOT_CRAN}" = "true" ]; then
LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
if [ "${NOT_CRAN}" = "true" ]; then
if [ "${LIBARROW_BINARY}" = "" ]; then
LIBARROW_BINARY=true; export LIBARROW_BINARY
fi
if [ "${LIBARROW_MINIMAL}" = "" ]; then
LIBARROW_MINIMAL=false; export LIBARROW_MINIMAL
fi
fi

# find openssl on macos. macOS ships with libressl. openssl is installable
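The simplified defaulting rule in this hunk (when `NOT_CRAN` is `true`, fill in `LIBARROW_BINARY=true` and `LIBARROW_MINIMAL=false` only if they are empty) can be exercised in isolation. The following is a sketch, assuming the rule reads as shown in the new lines above; `resolve_defaults` is a hypothetical helper that takes the three values as arguments and is not part of `r/configure`.

```shell
#!/bin/sh
# Sketch only: resolve_defaults is a hypothetical helper, not part of configure.
# Arguments: $1 = NOT_CRAN, $2 = LIBARROW_BINARY, $3 = LIBARROW_MINIMAL
resolve_defaults() {
  # Lowercase NOT_CRAN the same way configure normalizes it.
  not_cran=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  binary=$2
  minimal=$3
  if [ "$not_cran" = "true" ]; then
    # Only fill in values the user left empty.
    [ -n "$binary" ] || binary=true
    [ -n "$minimal" ] || minimal=false
  fi
  printf 'LIBARROW_BINARY=%s LIBARROW_MINIMAL=%s\n' "$binary" "$minimal"
}

# NOT_CRAN set, nothing else: both defaults are filled in.
case_defaults=$(resolve_defaults TRUE "" "")
# An explicit LIBARROW_BINARY is kept; only the empty value is filled in.
case_explicit=$(resolve_defaults true false "")
# Without NOT_CRAN, both values stay empty.
case_cran=$(resolve_defaults "" "" "")

echo "$case_defaults"
echo "$case_explicit"
echo "$case_cran"
```

The explicit-value case is the one worth noticing: a user who sets `LIBARROW_BINARY=false` keeps that choice even when `NOT_CRAN=true`.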
1 change: 1 addition & 0 deletions r/inst/build_arrow_static.sh
@@ -70,6 +70,7 @@ ${CMAKE} -DARROW_BOOST_USE_SHARED=OFF \
-DARROW_WITH_UTF8PROC=${ARROW_WITH_UTF8PROC:-ON} \
-DARROW_WITH_ZLIB=${ARROW_WITH_ZLIB:-$ARROW_DEFAULT_PARAM} \
-DARROW_WITH_ZSTD=${ARROW_WITH_ZSTD:-$ARROW_DEFAULT_PARAM} \
-DARROW_VERBOSE_THIRDPARTY_BUILD=${ARROW_VERBOSE_THIRDPARTY_BUILD:-OFF} \
-DCMAKE_BUILD_TYPE=${CMAKE_BUILD_TYPE:-Release} \
-DCMAKE_INSTALL_LIBDIR=lib \
-DCMAKE_INSTALL_PREFIX=${DEST_DIR} \
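The added flag relies on the shell's `${VAR:-default}` expansion, so thirdparty build output stays quiet unless `ARROW_VERBOSE_THIRDPARTY_BUILD` is set in the environment. A minimal sketch of that behavior follows; `verbose_flag` is a hypothetical helper used only to print the resulting cmake argument.

```shell
#!/bin/sh
# Sketch only: verbose_flag is a hypothetical helper that prints the cmake
# argument the way the build script assembles it.
verbose_flag() {
  echo "-DARROW_VERBOSE_THIRDPARTY_BUILD=${ARROW_VERBOSE_THIRDPARTY_BUILD:-OFF}"
}

# Unset: the default applies.
unset ARROW_VERBOSE_THIRDPARTY_BUILD
flag_default=$(verbose_flag)

# Set by the user: the user's value wins.
ARROW_VERBOSE_THIRDPARTY_BUILD=ON
flag_on=$(verbose_flag)

# Set but empty: ":-" treats an empty value like an unset one.
ARROW_VERBOSE_THIRDPARTY_BUILD=
flag_empty=$(verbose_flag)

echo "$flag_default"
echo "$flag_on"
echo "$flag_empty"
```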