Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
38 commits
Select commit Hold shift + click to select a range
babe5b7
Basic forwards compatibility testing
jonkeane Jan 5, 2021
f0ce0f1
Trying on GHA
jonkeane Jan 6, 2021
8ef0c8d
run on all pushes
jonkeane Jan 6, 2021
2dba6c7
hardcode some crossbowery
jonkeane Jan 6, 2021
90ff23f
oops
jonkeane Jan 6, 2021
ba73af7
oh and needs too
jonkeane Jan 6, 2021
c792d4b
Don't cd
jonkeane Jan 6, 2021
4102ec5
Try running directly on GHA
jonkeane Jan 6, 2021
fcf56cd
Make sure the dir exists
jonkeane Jan 6, 2021
bfb96a4
Add tests
jonkeane Jan 6, 2021
cdf162d
try a different reference
jonkeane Jan 6, 2021
ac71865
oops!
jonkeane Jan 6, 2021
a426014
clean up
jonkeane Jan 6, 2021
596d94d
Can we skip known-bad versions?
jonkeane Jan 6, 2021
83d2c07
What happens in 1.0.1?
jonkeane Jan 6, 2021
6696029
oops
jonkeane Jan 6, 2021
3f55255
Some more cleanup
jonkeane Jan 6, 2021
0961b2b
Cleanup
jonkeane Jan 6, 2021
afb0aee
oops
jonkeane Jan 6, 2021
350c483
Oy
jonkeane Jan 6, 2021
b4c23df
still more changes
jonkeane Jan 6, 2021
d9e7e61
Move workflow to crossbox
jonkeane Jan 6, 2021
8b92050
Add a crossbow task
jonkeane Jan 6, 2021
5cd59d5
Escape jinja
jonkeane Jan 7, 2021
4814d4a
More indirection?
jonkeane Jan 7, 2021
faa9f37
Oops extra typo
jonkeane Jan 7, 2021
b6ec09f
A slightly different way of loading the crossbow...
jonkeane Jan 7, 2021
3959ed3
Add feather tests
jonkeane Jan 7, 2021
afed5b6
More filetypes, will probably fail on 1.0.1 for streams + metadata.
jonkeane Jan 7, 2021
71e74ea
Helpers need some copying too
jonkeane Jan 7, 2021
54d8d2d
Add backwards compatibility testing
jonkeane Jan 8, 2021
9871abc
Add skips if snappy isn't available
jonkeane Jan 8, 2021
5cab1b2
Add feather files
jonkeane Jan 8, 2021
70abb3e
Add apache licensing where it should be.
jonkeane Jan 8, 2021
2323e8f
PR comments, add 0.17.0 fixture that exhibits the case mis-match on c…
jonkeane Jan 9, 2021
8bcb580
Add skip for upcoming compatibility change.
jonkeane Jan 11, 2021
5c9fe3f
Remove the skip now that ARROW-11163 is merged
jonkeane Jan 11, 2021
e134e60
Make a helper, test some (but not all) of the 0.17.0 reading.
jonkeane Jan 11, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
109 changes: 109 additions & 0 deletions dev/tasks/r/github.linux.version.compatibility.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,109 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

# NOTE: must set "Crossbow" as name to have the badge links working in the
# github comment reports!
name: Crossbow

on:
push

jobs:
write-files:
name: "Write files"
runs-on: ubuntu-20.04
strategy:
fail-fast: false
env:
ARROW_R_DEV: "TRUE"
RSPM: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"
steps:
- name: Checkout Arrow
run: |
git clone --no-checkout {{ arrow.remote }} arrow
git -C arrow fetch -t {{ arrow.remote }} {{ arrow.branch }}
git -C arrow checkout FETCH_HEAD
git -C arrow submodule update --init --recursive
- name: Free Up Disk Space
shell: bash
run: arrow/ci/scripts/util_cleanup.sh
- name: Fetch Submodules and Tags
shell: bash
run: cd arrow && ci/scripts/util_checkout.sh
- uses: r-lib/actions/setup-r@v1
- name: Install dependencies
run: |
install.packages(c("remotes", "glue", "sys"))
remotes::install_deps("arrow/r", dependencies = TRUE)
shell: Rscript {0}
- name: Install Arrow
run: |
cd arrow/r
R CMD INSTALL .
shell: bash
- name: Write files
run: |
cd arrow/r
R -f extra-tests/write-files.R
shell: bash

- name: Upload the parquet artifacts
uses: actions/upload-artifact@v2
with:
name: files
path: arrow/r/extra-tests/files

read-files:
name: "Read files with Arrow {{ '${{ matrix.old_arrow_version }}' }}"
needs: [write-files]
runs-on: ubuntu-20.04
strategy:
fail-fast: false
matrix:
old_arrow_version:
- "2.0.0"
- "1.0.1"
env:
RSPM: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"
OLD_ARROW_VERSION: {{ '${{ matrix.old_arrow_version }}' }}
steps:
- name: Checkout Arrow
run: |
git clone --no-checkout {{ arrow.remote }} arrow
git -C arrow fetch -t {{ arrow.remote }} {{ arrow.branch }}
git -C arrow checkout FETCH_HEAD
git -C arrow submodule update --init --recursive
- uses: r-lib/actions/setup-r@v1
- name: Install old Arrow
run: |
install.packages(c("remotes", "testthat"))
remotes::install_version("arrow", "{{ '${{ matrix.old_arrow_version }}' }}")
shell: Rscript {0}
- name: Setup our testing directory, copy only the tests to it.
run: |
mkdir -p extra-tests/files
cp arrow/r/extra-tests/helper*.R extra-tests/
cp arrow/r/extra-tests/test-*.R extra-tests/
- name: Download artifacts
uses: actions/download-artifact@v2
with:
name: files
path: extra-tests/files
- name: Test reading
run: |
testthat::test_dir("extra-tests")
shell: Rscript {0}
4 changes: 4 additions & 0 deletions dev/tasks/tasks.yml
Original file line number Diff line number Diff line change
Expand Up @@ -1752,6 +1752,10 @@ tasks:
template: r/github.linux.cran.yml
params:
MATRIX: "${{ matrix.r_image }}"

test-r-version-compatibility:
ci: github
template: r/github.linux.version.compatibility.yml

test-r-rhub-ubuntu-gcc-release:
ci: azure
Expand Down
1 change: 1 addition & 0 deletions r/.Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -23,3 +23,4 @@ clang_format.sh
^autobrew$
^apache-arrow.rb$
^.*\.Rhistory$
^extra-tests
1 change: 1 addition & 0 deletions r/.gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -17,3 +17,4 @@ revdep/
vignettes/nyc-taxi/
arrow_*.tar.gz
arrow_*.tgz
extra-tests/files
32 changes: 32 additions & 0 deletions r/extra-tests/helpers.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

if_version <- function(version, op = `==`) {
op(packageVersion("arrow"), version)
}

skip_if_version_less_than <- function(version, msg) {
if(if_version(version, `<`)) {
skip(msg)
}
}

skip_if_version_equals <- function(version, msg) {
if(if_version(version, `==`)) {
skip(msg)
}
}
165 changes: 165 additions & 0 deletions r/extra-tests/test-read-files.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,165 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

library(arrow)
library(testthat)

pq_file <- "files/ex_data.parquet"

test_that("Can read the file (parquet)", {
# We can read with no error, we assert metadata below
expect_error(
df <- read_parquet(pq_file),
NA
)
})

### Parquet
test_that("Can see the metadata (parquet)", {
skip_if_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")

df <- read_parquet(pq_file)
expect_s3_class(df, "tbl")

expect_equal(
attributes(df),
list(
names = letters[1:4],
row.names = 1L,
top_level = list(
field_one = 12,
field_two = "more stuff"
),
class = c("tbl_df", "tbl", "data.frame")
)
)

# column-level attributes
expect_equal(attributes(df$a), list(class = "special_string"))
expect_equal(
attributes(df$c),
list(
row.names = 1L,
names = c("c1", "c2", "c3"),
class = c("tbl_df", "tbl", "data.frame")
)
)
})
Comment on lines +36 to +61
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about factoring this out to an expect_something() helper in this file, such that your tests here are just expect_something(df)?

Of course, the simplest one would be expect_identical(df, example_with_metadata), right? Though there are tradeoffs with that.


### Feather
for (comp in c("lz4", "uncompressed", "zstd")) {
feather_file <- paste0("files/ex_data_", comp, ".feather")

test_that(paste0("Can read the file (feather ", comp, ")"), {
# We can read with no error, we assert metadata below
expect_error(
df <- read_feather(feather_file),
NA
)
})

test_that(paste0("Can see the metadata (feather ", comp, ")"), {
skip_if_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")

df <- read_feather(feather_file)
expect_s3_class(df, "tbl")

expect_equal(
attributes(df),
list(
names = letters[1:4],
row.names = 1L,
top_level = list(
field_one = 12,
field_two = "more stuff"
),
class = c("tbl_df", "tbl", "data.frame")
)
)

# column-level attributes
expect_equal(attributes(df$a), list(class = "special_string"))
expect_equal(
attributes(df$c),
list(
row.names = 1L,
names = c("c1", "c2", "c3"),
class = c("tbl_df", "tbl", "data.frame")
)
)
})
}

test_that("Can read feather version 1", {
feather_v1_file <- "files/ex_data_v1.feather"

df <- read_feather(feather_v1_file)
expect_s3_class(df, "tbl")

expect_equal(
attributes(df),
list(
names = c("a", "b", "d"),
class = c("tbl_df", "tbl", "data.frame"),
row.names = 1L
)
)
})

### IPC Stream
stream_file <- "files/ex_data.stream"

test_that("Can read the file (parquet)", {
# We can read with no error, we assert metadata below
expect_error(
df <- read_ipc_stream(stream_file),
NA
)
})

test_that("Can see the metadata (stream)", {
skip_if_version_less_than("2.0.0", "Version 1.0.1 can't read new version metadata.")
df <- read_ipc_stream(stream_file)

expect_s3_class(df, "tbl")

expect_equal(
attributes(df),
list(
names = letters[1:4],
row.names = 1L,
top_level = list(
field_one = 12,
field_two = "more stuff"
),
class = c("tbl_df", "tbl", "data.frame")
)
)

# column-level attributes
expect_equal(attributes(df$a), list(class = "special_string"))
expect_equal(
attributes(df$c),
list(
row.names = 1L,
names = c("c1", "c2", "c3"),
class = c("tbl_df", "tbl", "data.frame")
)
)
})


39 changes: 39 additions & 0 deletions r/extra-tests/write-files.R
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

library(arrow)

if (!dir.exists("extra-tests/files")) {
dir.create("extra-tests/files")
}

source("tests/testthat/helper-data.R")

write_parquet(example_with_metadata, "extra-tests/files/ex_data.parquet")

for (comp in c("lz4", "uncompressed", "zstd")) {
if(!codec_is_available(comp)) break

name <- paste0("extra-tests/files/ex_data_", comp, ".feather")
write_feather(example_with_metadata, name, compression = comp)
}

example_with_metadata_v1 <- example_with_metadata
example_with_metadata_v1$c <- NULL
write_feather(example_with_metadata_v1, "extra-tests/files/ex_data_v1.feather", version = 1)

write_ipc_stream(example_with_metadata, "extra-tests/files/ex_data.stream")
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
11 changes: 5 additions & 6 deletions r/tests/testthat/helper-data.R
Original file line number Diff line number Diff line change
Expand Up @@ -34,12 +34,11 @@ example_with_metadata <- tibble::tibble(
),
d = "four"
)
# TODO: collect top-level dataset metadata
# https://issues.apache.org/jira/browse/ARROW-9271
# attr(example_with_metadata, "top_level") <- list(
# field_one = 12,
# field_two = "more stuff"
# )

attr(example_with_metadata, "top_level") <- list(
field_one = 12,
field_two = "more stuff"
)

haven_data <- tibble::tibble(
num = structure(c(5.1, 4.9),
Expand Down
Loading