Skip to content
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
2957655
update NEWS
dragosmg Dec 7, 2021
387ae07
added tests for `decimal256()` and updated older tests to use ` inste…
dragosmg Dec 7, 2021
7ab86fc
added `decimal256()` and `check_decimal_args()` functions
dragosmg Dec 7, 2021
1253ba3
docs
dragosmg Dec 7, 2021
e17401c
update datatype + generate `arrowExports`
dragosmg Dec 7, 2021
df96c16
typo + namespace `vec_assert()` + tests checking the `precision` and …
dragosmg Dec 7, 2021
7cf85db
added test on type and dplyr funcs type
dragosmg Dec 7, 2021
c0bd279
updated `decimal()` to call either `decimal128()` or `decimal256()` b…
dragosmg Dec 7, 2021
1cc6b20
updated NEWS with changes to `decimal()`
dragosmg Dec 7, 2021
619a3d6
updated `decimal` docs & tests
dragosmg Dec 7, 2021
6162dee
docs + lintr complaints
dragosmg Dec 8, 2021
c623389
news change
dragosmg Dec 8, 2021
e737c1e
Merge branch 'master' into ARROW-14844_decimal256_take2
dragosmg Dec 8, 2021
e9c40bc
catch `RPrimitiveConverter` error and then cast
dragosmg Dec 8, 2021
36f566f
reoder decimal definitions
dragosmg Dec 8, 2021
9d0b317
reoder decimals take 2
dragosmg Dec 8, 2021
f8bfacf
attempt to return original error if casting fails
dragosmg Dec 8, 2021
d476428
revert to `cnd` (not `cnd2`)
dragosmg Dec 8, 2021
307195d
update array$create to correctly capture and return the initial error…
dragosmg Dec 8, 2021
36b996d
docs + skipped the `decimal256()` unit test for empty ChunkedArrays
dragosmg Dec 9, 2021
ea19fd3
`Array$create()` tries casting and gives the user a hint if it works
dragosmg Dec 9, 2021
20f054f
slightly improved hint
dragosmg Dec 9, 2021
c4442fb
lint
dragosmg Dec 9, 2021
84879af
removing the unused DecimalArray
dragosmg Dec 9, 2021
7eb138b
removed the line as the C++ linter doesn't like commented code
dragosmg Dec 9, 2021
068755f
add back in the definition for DecimalArray and use it
dragosmg Dec 10, 2021
1f02407
add tests for the `decimal` precision range
dragosmg Dec 15, 2021
e9a27f9
added unit test to check the manual casting hint
dragosmg Dec 20, 2021
fb1c39a
improve comment for the `tryCatch()` + `try()` attempts
dragosmg Dec 20, 2021
4622e22
add TODO and reference the C++ unit tests Jira ticket (ARROW-15162)
dragosmg Dec 20, 2021
8ec06ed
removed superfluous unit tests
dragosmg Dec 20, 2021
9f9031e
added Jira ticket number to empty array filtering tests for `decimal2…
dragosmg Dec 20, 2021
6640e8c
test `Array$create()` fallback is as expected when casting doesn't work
dragosmg Dec 20, 2021
db7148d
use `expect_equal` when comparing messages
dragosmg Dec 20, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions r/NAMESPACE
Original file line number Diff line number Diff line change
Expand Up @@ -215,6 +215,7 @@ export(date32)
export(date64)
export(decimal)
export(decimal128)
export(decimal256)
export(default_memory_pool)
export(dictionary)
export(ends_with)
Expand Down
3 changes: 2 additions & 1 deletion r/NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -19,11 +19,12 @@

# arrow 6.0.1.9000

* Added `decimal256()`. Updated `decimal()`, which now calls `decimal256()` or `decimal128()` based on the value of the `precision` argument.
* updated `write_csv_arrow()` to follow the signature of `readr::write_csv()`. The following arguments are supported:
* `file` identical to `sink`
* `col_names` identical to `include_header`
* other arguments are currently unsupported, but the function errors with a meaningful message.
* Added `decimal128()` (identical to `decimal()`) as the name is more explicit and updated docs to encourage its use.
* Added `decimal128()` (~~identical to `decimal()`~~) as the name is more explicit and updated docs to encourage its use.
* Source builds now by default use `pkg-config` to search for system dependencies (such as `libz`) and link to them
if present. To retain the previous behaviour of downloading and building all dependencies, set `ARROW_DEPENDENCY_SOURCE=BUNDLED`.

Expand Down
24 changes: 23 additions & 1 deletion r/R/array.R
Original file line number Diff line number Diff line change
Expand Up @@ -187,8 +187,30 @@ Array$create <- function(x, type = NULL) {
}
return(out)
}
vec_to_Array(x, type)

if (is.null(type)) {
return(vec_to_Array(x, type))
}

# when a type is given, try to create a vector of the desired type. If that
# fails, attempt to cast and if casting is successful, suggest to the user
# to try casting manually. If the casting fails, return the original error
# message.
tryCatch(
vec_to_Array(x, type),
error = function(cnd) {
attempt <- try(vec_to_Array(x, NULL)$cast(type), silent = TRUE)
abort(
c(conditionMessage(cnd),
i = if (!inherits(attempt, "try-error")) {
"You might want to try casting manually with `Array$create(...)$cast(...)`."
}
)
)
}
)
}

#' @include arrowExports.R
Array$import_from_c <- ImportArray

Expand Down
4 changes: 4 additions & 0 deletions r/R/arrowExports.R

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

73 changes: 55 additions & 18 deletions r/R/type.R
Original file line number Diff line number Diff line change
Expand Up @@ -150,8 +150,11 @@ DecimalType <- R6Class("DecimalType",
scale = function() DecimalType__scale(self)
)
)

Decimal128Type <- R6Class("Decimal128Type", inherit = DecimalType)

Decimal256Type <- R6Class("Decimal256Type", inherit = DecimalType)

NestedType <- R6Class("NestedType", inherit = DataType)

#' Apache Arrow data types
Expand Down Expand Up @@ -181,7 +184,7 @@ NestedType <- R6Class("NestedType", inherit = DataType)
#' `bit64::integer64` object) by setting `options(arrow.int64_downcast =
#' FALSE)`.
#'
#' `decimal128()` creates a `decimal128` type. Arrow decimals are fixed-point
#' `decimal128()` creates a `Decimal128Type`. Arrow decimals are fixed-point
#' decimal numbers encoded as a scalar integer. The `precision` is the number of
#' significant digits that the decimal type can represent; the `scale` is the
#' number of digits after the decimal point. For example, the number 1234.567
Expand All @@ -197,21 +200,30 @@ NestedType <- R6Class("NestedType", inherit = DataType)
#' negative, `scale` causes the number to be expressed using scientific notation
#' and power of 10.
#'
#' `decimal()` is identical to `decimal128()`, defined for backward compatibility.
#' Use `decimal128()` as the name is more informative and `decimal()` might be
#' deprecated in the future.
#' `decimal256()` creates a `Decimal256Type`, which allows for higher maximum
#' precision. For most use cases, the maximum precision offered by `Decimal128Type`
#' is sufficient, and it will result in a more compact and more efficient encoding.
#'
#' #' `decimal()` creates either a `Decimal128Type` or a `Decimal256Type`
#' depending on the value for `precision`. If `precision` is greater than 38 a
#' `Decimal256Type` is returned, otherwise a `Decimal128Type`.
#'
#' Use `decimal128()` or `decimal256()` as the names are more informative than
#' `decimal()`.
#'
#' @param unit For time/timestamp types, the time unit. `time32()` can take
#' either "s" or "ms", while `time64()` can be "us" or "ns". `timestamp()` can
#' take any of those four values.
#' @param timezone For `timestamp()`, an optional time zone string.
#' @param byte_width byte width for `FixedSizeBinary` type.
#' @param list_size list size for `FixedSizeList` type.
#' @param precision For `decimal()`, `decimal128()` the number of significant
#' digits the arrow `decimal` type can represent. The maximum precision for
#' `decimal()` and `decimal128()` is 38 significant digits.
#' @param scale For `decimal()` and `decimal128()`, the number of digits after
#' the decimal point. It can be negative.
#' @param precision For `decimal()`, `decimal128()`, and `decimal256()` the
#' number of significant digits the arrow `decimal` type can represent. The
#' maximum precision for `decimal128()` is 38 significant digits, while for
#' `decimal256()` it is 76 digits. `decimal()` will use it to choose which
#' type of decimal to return.
#' @param scale For `decimal()`, `decimal128()`, and `decimal256()` the number
#' of digits after the decimal point. It can be negative.
#' @param type For `list_of()`, a data type to make a list-of-type
#' @param ... For `struct()`, a named list of types to define the struct columns
#'
Expand Down Expand Up @@ -375,25 +387,49 @@ timestamp <- function(unit = c("s", "ms", "us", "ns"), timezone = "") {
Timestamp__initialize(unit, timezone)
}

#' @rdname data-type
#' @export
decimal <- function(precision, scale) {
args <- check_decimal_args(precision, scale)

if (args$precision > 38) {
decimal256(args$precision, args$scale)
} else {
decimal128(args$precision, args$scale)
}
}

#' @rdname data-type
#' @export
decimal128 <- function(precision, scale) {
args <- check_decimal_args(precision, scale)
Decimal128Type__initialize(args$precision, args$scale)
}

#' @rdname data-type
#' @export
decimal256 <- function(precision, scale) {
args <- check_decimal_args(precision, scale)
Decimal256Type__initialize(args$precision, args$scale)
}

check_decimal_args <- function(precision, scale) {
if (is.numeric(precision)) {
precision <- as.integer(precision)
precision <- vec_cast(precision, to = integer())
vctrs::vec_assert(precision, size = 1L)
} else {
stop('"precision" must be an integer', call. = FALSE)
stop("`precision` must be an integer", call. = FALSE)
}

if (is.numeric(scale)) {
scale <- as.integer(scale)
scale <- vec_cast(scale, to = integer())
vctrs::vec_assert(scale, size = 1L)
} else {
stop('"scale" must be an integer', call. = FALSE)
stop("`scale` must be an integer", call. = FALSE)
}
Decimal128Type__initialize(precision, scale)
}

#' @rdname data-type
#' @export
decimal <- decimal128
list(precision = precision, scale = scale)
}

StructType <- R6Class("StructType",
inherit = NestedType,
Expand Down Expand Up @@ -496,6 +532,7 @@ canonical_type_str <- function(type_str) {
null = "null",
timestamp = "timestamp",
decimal128 = "decimal128",
decimal256 = "decimal256",
struct = "struct",
list_of = "list",
list = "list",
Expand Down
34 changes: 23 additions & 11 deletions r/man/data-type.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

11 changes: 8 additions & 3 deletions r/src/array_to_vector.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -905,6 +905,7 @@ class Converter_Timestamp : public Converter_Time<value_type, TimestampType> {
}
};

template <typename Type>
class Converter_Decimal : public Converter {
public:
explicit Converter_Decimal(const std::shared_ptr<ChunkedArray>& chunked_array)
Expand All @@ -919,8 +920,9 @@ class Converter_Decimal : public Converter {

Status Ingest_some_nulls(SEXP data, const std::shared_ptr<arrow::Array>& array,
R_xlen_t start, R_xlen_t n, size_t chunk_index) const {
using DecimalArray = typename TypeTraits<Type>::ArrayType;
auto p_data = REAL(data) + start;
const auto& decimals_arr = checked_cast<const arrow::Decimal128Array&>(*array);
const auto& decimals_arr = checked_cast<const DecimalArray&>(*array);

auto ingest_one = [&](R_xlen_t i) {
p_data[i] = std::stod(decimals_arr.FormatValue(i).c_str());
Expand Down Expand Up @@ -1217,7 +1219,10 @@ std::shared_ptr<Converter> Converter::Make(
}

case Type::DECIMAL128:
return std::make_shared<arrow::r::Converter_Decimal>(chunked_array);
return std::make_shared<arrow::r::Converter_Decimal<Decimal128Type>>(chunked_array);

case Type::DECIMAL256:
return std::make_shared<arrow::r::Converter_Decimal<Decimal256Type>>(chunked_array);

// nested
case Type::STRUCT:
Expand Down Expand Up @@ -1245,7 +1250,7 @@ std::shared_ptr<Converter> Converter::Make(
break;
}

cpp11::stop("cannot handle Array of type ", type->name().c_str());
cpp11::stop("cannot handle Array of type <%s>", type->name().c_str());
}

std::shared_ptr<ChunkedArray> to_chunks(const std::shared_ptr<Array>& array) {
Expand Down
17 changes: 17 additions & 0 deletions r/src/arrowExports.cpp

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

9 changes: 9 additions & 0 deletions r/src/datatype.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -82,6 +82,8 @@ const char* r6_class_name<arrow::DataType>::get(

case Type::DECIMAL128:
return "Decimal128Type";
case Type::DECIMAL256:
return "Decimal256Type";

case Type::LIST:
return "ListType";
Expand Down Expand Up @@ -180,6 +182,13 @@ std::shared_ptr<arrow::DataType> Decimal128Type__initialize(int32_t precision,
return ValueOrStop(arrow::Decimal128Type::Make(precision, scale));
}

// [[arrow::export]]
std::shared_ptr<arrow::DataType> Decimal256Type__initialize(int32_t precision,
int32_t scale) {
// Use the builder that validates inputs
return ValueOrStop(arrow::Decimal256Type::Make(precision, scale));
}

// [[arrow::export]]
std::shared_ptr<arrow::DataType> FixedSizeBinary__initialize(R_xlen_t byte_width) {
if (byte_width == NA_INTEGER) {
Expand Down
18 changes: 18 additions & 0 deletions r/tests/testthat/test-Array.R
Original file line number Diff line number Diff line change
Expand Up @@ -795,6 +795,24 @@ test_that("Array$create() should have helpful error", {
expect_error(Array$create(list()), "Requires at least one element to infer")
expect_error(Array$create(list(lgl, lgl, int)), "Expecting a logical vector")
expect_error(Array$create(list(char, num, char)), "Expecting a character vector")

# hint at casting if direct fails and casting looks like it might work
expect_error(
Array$create(as.double(1:10), type = decimal(4, 2)),
"You might want to try casting manually"
)

expect_error(
Array$create(1:10, type = decimal(12, 2)),
"You might want to try casting manually"
)

a <- expect_error(Array$create("one", int32()))
b <- expect_error(vec_to_Array("one", int32()))
# the captured conditions (errors) are not identical, but their messages should be
expect_s3_class(a, "rlang_error")
expect_s3_class(b, "simpleError")
expect_equal(a$message, b$message)
})

test_that("Array$View() (ARROW-6542)", {
Expand Down
Loading