ARROW-12745: [C++][Compute] Add floor, ceiling, and truncate kernels #10727
Food for thought: Tests fail for An example test is: and the error message is: The meaning of these numbers is: There are two alternatives to handle min/max tests:
lidavidm
left a comment
compute.rst needs to be updated as well.
This means we truncate towards negative infinity? (e.g. truncate(-1.1) = -2 since -1 > -1.1?)
Ah, not greater in magnitude according to cppreference. Maybe that should be clarified.
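For reference, a minimal sketch of the three behaviors on a negative input, using the C++ standard library functions rather than the Arrow kernels themselves. `RoundingResults`, `RoundAll`, and `DemoNegativeInput` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper type (not part of Arrow): the three results for one input.
struct RoundingResults {
  double floored;    // rounds toward negative infinity
  double ceiled;     // rounds toward positive infinity
  double truncated;  // rounds toward zero (result is not greater in magnitude)
};

// Applies std::floor, std::ceil, and std::trunc to the same value.
RoundingResults RoundAll(double x) {
  return {std::floor(x), std::ceil(x), std::trunc(x)};
}

// The case raised in the review: truncation of -1.1 gives -1, not -2.
inline void DemoNegativeInput() {
  const RoundingResults r = RoundAll(-1.1);
  assert(r.floored == -2.0);
  assert(r.ceiled == -1.0);
  assert(r.truncated == -1.0);
}
```

For negative inputs truncation agrees with ceil, and for positive inputs it agrees with floor, which is why "not greater in magnitude" is the more precise description.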
Force-pushed: 720e80f to 1df410e
cpp/src/arrow/compute/api_scalar.cc
Outdated
Suggested change:
- SCALAR_EAGER_UNARY(Ceiling, "ceiling")
+ SCALAR_EAGER_UNARY(Ceil, "ceil")
Is there a reason to have it "ceiling" instead of "ceil"? (The C++ function is ceil, as is NumPy's; SQL seems to have both.)
I agree that ceil should be the name used to invoke the function. The only reason for using the long form for the internal variable names is to be somewhat consistent with other compute functions.
@jorisvandenbossche Compute function names (for both the C++ API and the CallFunction name) are inconsistent w.r.t. short vs. long form. For example, Ceiling, Negate, Power, etc. I think this should be revisited in another JIRA issue and will require updating the Python and R bindings.
I used the short form for compute function names in this PR, and opened this JIRA to revisit the names of compute functions.
Suggested change:
- "Calculate the greatest integer in magnitude less than or equal to the "
- "argument element-wise",
- "",
+ "Round down to the nearest integer",
+ "Calculate the greatest integer in magnitude less than or equal to the "
+ "argument element-wise",
(That's how you explained it in the api_scalar.h doc comments, which I find easier to understand as a short summary of the function.)
Force-pushed: 1df410e to 3a657c4
lidavidm
left a comment
LGTM. One comment about the tests, one comment about the docs.
docs/source/cpp/compute.rst
Outdated
This is worded a little confusingly in my opinion. If we're going to reference rounding strategy here, the notes column should describe the rounding behavior for each function (even if it's just the 'obvious' or 'expected' one).
Or alternatively, something like 'rounding functions find the nearest integer (as a floating-point value) to the argument based on a rounding strategy'.
I added the rounding functions section/text based on the soon-to-be-ready round and mround functions. Although floor, ceil, and trunc do round to the nearest integer, round/mround do not necessarily: they have options to specify fractional precision and to select among various rounding strategies.
Ah, ok, sounds good.
This drops the tests for atan2?
Thanks for catching this, I had moved atan2 to the binary DispatchBest but apparently skipped it during a rebase/merge. I will add them back.
Force-pushed: 3a657c4 to c0b1041
lidavidm
left a comment
LGTM
Thanks @edponce!
This PR adds floor, ceiling, and truncate scalar kernels. For all integral inputs, output is a 64-bit floating-point value.
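The integral-input rule above can be pictured as a cast to double followed by the rounding call. This is only a sketch of the observable behavior, assuming a plain `static_cast`; `FloorAsDouble` and `DemoIntegralInput` are hypothetical names, and the actual kernels may arrange the promotion differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <type_traits>

// Hypothetical helper (not Arrow code): floor of an integral value,
// always returned as a 64-bit floating-point value.
template <typename T>
double FloorAsDouble(T value) {
  static_assert(std::is_integral<T>::value, "integral input expected");
  // Assumption: promote to double first, then round. For an integer the
  // rounding step leaves the value unchanged; only the type changes.
  return std::floor(static_cast<double>(value));
}

// Small integers are exactly representable as double, so equality is safe here.
inline void DemoIntegralInput() {
  assert(FloorAsDouble(std::int8_t{-3}) == -3.0);
  assert(FloorAsDouble(std::int64_t{42}) == 42.0);
}
```

One consequence of this design is that callers always get a floating-point result from these functions regardless of the input's integer width.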