ARROW-15639 [C++][Python] UDF Scalar Function Implementation #12590
Closed
131 commits
db634fb
updating submodule
vibhatha 9ad5c8f
temp commit to remove files in submodule
vibhatha 12010e8
adding submodule
vibhatha d7c4593
updating testing submodule
vibhatha 2a2ec21
revert to uupstream version
vibhatha 7aa5fc4
nltk example step 1
vibhatha 994b629
rebase
vibhatha 86632be
function registry example enhanced for udf implementation
vibhatha c49ece3
adding udf poc interfaces
vibhatha f2f35cb
testing udf
vibhatha 5598cdc
adding cpp interfaces in cython and creating a basic UDF synthesizer
vibhatha dc9c8df
adding cython binding for funcptr init
vibhatha e5dc4b3
initial version of function registry WIP
vibhatha dc15bda
updating call back API
vibhatha 78ca0a9
func registry with a cython udf
vibhatha 9531409
testing udf python expose
vibhatha 7400faf
initial version of end-to-end pycallable
vibhatha 5ec38ed
adding and end-to-end udf for scalar array
vibhatha bd3f1ce
reformat
vibhatha 9f5e865
minor changes
vibhatha 6d98c17
removing inspect func
vibhatha b4a8fd3
cleaning up the current python API
vibhatha e13259e
cleaning up the current code
vibhatha 541fffd
temp checkin
vibhatha 82b71b7
minor changes
vibhatha c31ccc1
updating cmakelist
vibhatha 92caca7
updating cmakelist(examples)
vibhatha 1ff043f
minor fix for python
vibhatha 460e2c9
refactor code v1
vibhatha a82ecd9
adding scalar unary and array unary ops
vibhatha 846cb6c
adding initial macro component
vibhatha 2ef0e9b
adding exceptions and refactor
vibhatha c238ba1
updating example
vibhatha 7e0ea90
moved udf example
vibhatha a987068
fix varargs function registration issue
vibhatha f73fe0c
fix memory issue
vibhatha e839616
adding udf example
vibhatha 35a94c2
refactor code and adding python test cases
vibhatha f1f9687
cleaning up udf C++ example
vibhatha 534592e
cleaning up examples cmake file
vibhatha a7abaf8
cleaning up temp test
vibhatha bff2be8
reformat tests
vibhatha b2df0c3
cleaning code
vibhatha 4c6efc2
acleaning spacing
vibhatha 286629e
adding doc string for registration function
vibhatha facf36b
update function call
vibhatha d8ac3a6
updating registration code
vibhatha 06e042d
refactor python bindings and func docs
vibhatha a35066f
addressing reviews
vibhatha 427ef1d
adding test cases for negative cases
vibhatha bd1e74b
fixing an issue in func docs passing
vibhatha be1e59f
minor check for appveyour
vibhatha a9228e9
removing aggregate example
vibhatha a34a455
addressing reviews on udf example
vibhatha 09d126b
addressing reviews p1
vibhatha 113d35f
addressing reviews p2
vibhatha 1b8183a
fixing a typo
vibhatha 58e8b90
fixing typo
vibhatha 9c68525
fixing a formatting typo
vibhatha 0eff947
removing custom exceptions
vibhatha 493426d
cmake formatting
vibhatha 24c1d40
removing arity python interface
vibhatha 20ebc30
refactor the udf builder API and add options
vibhatha 6d0215f
rebase
vibhatha 1247558
addressing reviews on function docs, and api modifications
vibhatha e327e4a
reformatting python docs and testing output type of udf execution
vibhatha 3f35820
removing udf example
vibhatha 221700d
format doc strings
vibhatha 2a2d649
refactor util functions and docstrings
vibhatha 323e845
updating test cases
vibhatha f8be4c7
updating test case format
vibhatha 5f537e3
updating capture clause for udf caller
vibhatha 707b910
adding function input validation
vibhatha 9eba9cb
updating function signature and usage
vibhatha 24fdd31
adding validation and test cases
vibhatha 7a17d02
adding doc check test cases
vibhatha 5769f49
updating test cases
vibhatha 228d43a
updating udf c++ kernel usage
vibhatha 60bfd7f
refactor test suite
vibhatha 545925d
simplifying udf options and fix arity usage
vibhatha b47924a
removing num_args and updating test cases
vibhatha 4566078
refactor test cases exception message validation
vibhatha 0158514
updating function doc usage and input type usage in function registra…
vibhatha 91a410b
adding validation for output type
vibhatha df99ace
doc format and add python error check
vibhatha 31a9ae4
adding todo docstring and add python check error
vibhatha 9056a0a
formatting exception messages
vibhatha 27b55cb
addressing reviews
vibhatha 440e19f
func doc usage fix
vibhatha e21c82d
fcleaning up the udf builder into a plain function
vibhatha 5ec35ca
cleaning unused element
vibhatha 5b2c982
address minor issues in formatting and usage in tests
vibhatha 196b66e
update documentation for InputType funcs
vibhatha 6aed6ad
minor documentation fix
vibhatha 22a48de
remove const
vibhatha dfa60bd
update function args
vibhatha ac844e1
update test case and validations for output types
vibhatha 5dfe7ea
update test cases and usage
vibhatha 086d803
fixing a memory issue and adding strongly typed function API
vibhatha a2c309e
fixing error messages, updating test cases
vibhatha bc1953c
minor change to error message
vibhatha 34238ca
adding context to udfs
vibhatha a52b8dd
adding documentation for scalar udf context
vibhatha 88bf789
minor formatting
vibhatha 869b0fb
addressing code doc reviews
vibhatha 49441c9
updating test cases with fixtures and updating memory pool example
vibhatha eb6aaf4
updating validity scalar class
vibhatha 3023124
converting scalar udf options to a struct
vibhatha 5a3ff2e
refactor the test suite for fixture usage in registration
vibhatha 30b2f8f
removing the risky usage of memory pool
vibhatha 494f610
remove InputType and update correspondeces
vibhatha bc91482
format test cases
vibhatha 2358a95
addressed minor nits and formatting
vibhatha d54da8b
addressing review comments
vibhatha 39a9cae
minor changes
vibhatha e48b94c
addressing missed pr feedback
vibhatha 47329ec
minor modifications
vibhatha 0d1a1a8
updating docstrings of scalar function registration
vibhatha 0f03d1f
fixed typo on input_types doctoring
vibhatha f294d53
addressing review comments
vibhatha 705e4df
fixed the definition for batch_length extraction
vibhatha 0ca7303
fixing a typo
vibhatha 8208ace
added a missing move op
vibhatha 29b5121
minor nit
vibhatha c961c4e
Update cpp/src/arrow/compute/function.h
vibhatha fc354e2
Apply suggestions from code review
vibhatha 8425e57
addressing reviews
vibhatha eef3896
avoid a copying input types
vibhatha a51a6a0
addressing move issue
vibhatha 706087a
Fix Python reference leaks and improve tests
pitrou 08880e7
Fix subtract_checked doc
pitrou
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,103 @@ | ||
| // Licensed to the Apache Software Foundation (ASF) under one | ||
| // or more contributor license agreements. See the NOTICE file | ||
| // distributed with this work for additional information | ||
| // regarding copyright ownership. The ASF licenses this file | ||
| // to you under the Apache License, Version 2.0 (the | ||
| // "License"); you may not use this file except in compliance | ||
| // with the License. You may obtain a copy of the License at | ||
| // | ||
| // http://www.apache.org/licenses/LICENSE-2.0 | ||
| // | ||
| // Unless required by applicable law or agreed to in writing, | ||
| // software distributed under the License is distributed on an | ||
| // "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | ||
| // KIND, either express or implied. See the License for the | ||
| // specific language governing permissions and limitations | ||
| // under the License. | ||
| | ||
| #include <arrow/api.h> | ||
| #include <arrow/compute/api.h> | ||
| | ||
| #include <cstdlib> | ||
| #include <iostream> | ||
| #include <memory> | ||
| #include <string> | ||
| #include <type_traits> | ||
| #include <utility> | ||
| #include <vector> | ||
| | ||
| // Demonstrate registering a user-defined Arrow compute function outside of the Arrow | ||
| // source tree | ||
| | ||
| namespace cp = ::arrow::compute; | ||
| | ||
| #define ABORT_ON_FAILURE(expr) \ | ||
| do { \ | ||
| arrow::Status status_ = (expr); \ | ||
| if (!status_.ok()) { \ | ||
| std::cerr << status_.message() << std::endl; \ | ||
| abort(); \ | ||
| } \ | ||
| } while (0) | ||
| | ||
| template <typename TYPE, | ||
| typename = typename std::enable_if<arrow::is_number_type<TYPE>::value | | ||
| arrow::is_boolean_type<TYPE>::value | | ||
| arrow::is_temporal_type<TYPE>::value>::type> | ||
| arrow::Result<std::shared_ptr<arrow::Array>> GetArrayDataSample( | ||
| const std::vector<typename TYPE::c_type>& values) { | ||
| using ArrowBuilderType = typename arrow::TypeTraits<TYPE>::BuilderType; | ||
| ArrowBuilderType builder; | ||
| ARROW_RETURN_NOT_OK(builder.Reserve(values.size())); | ||
| ARROW_RETURN_NOT_OK(builder.AppendValues(values)); | ||
| return builder.Finish(); | ||
| } | ||
| | ||
| const cp::FunctionDoc func_doc{ | ||
| "User-defined-function usage to demonstrate registering an out-of-tree function", | ||
| "returns x + y + z", | ||
| {"x", "y", "z"}, | ||
| "UDFOptions"}; | ||
| | ||
| arrow::Status SampleFunction(cp::KernelContext* ctx, const cp::ExecBatch& batch, | ||
| arrow::Datum* out) { | ||
| // temp = x + y; return temp + z | ||
| ARROW_ASSIGN_OR_RAISE(auto temp, cp::CallFunction("add", {batch[0], batch[1]})); | ||
| return cp::CallFunction("add", {temp, batch[2]}).Value(out); | ||
| } | ||
| | ||
| arrow::Status Execute() { | ||
| const std::string name = "add_three"; | ||
| auto func = std::make_shared<cp::ScalarFunction>(name, cp::Arity::Ternary(), func_doc); | ||
| cp::ScalarKernel kernel( | ||
| {cp::InputType::Array(arrow::int64()), cp::InputType::Array(arrow::int64()), | ||
| cp::InputType::Array(arrow::int64())}, | ||
| arrow::int64(), SampleFunction); | ||
| | ||
| kernel.mem_allocation = cp::MemAllocation::NO_PREALLOCATE; | ||
| kernel.null_handling = cp::NullHandling::COMPUTED_NO_PREALLOCATE; | ||
| | ||
| ARROW_RETURN_NOT_OK(func->AddKernel(std::move(kernel))); | ||
| | ||
| auto registry = cp::GetFunctionRegistry(); | ||
| ARROW_RETURN_NOT_OK(registry->AddFunction(std::move(func))); | ||
| | ||
| ARROW_ASSIGN_OR_RAISE(auto x, GetArrayDataSample<arrow::Int64Type>({1, 2, 3})); | ||
| ARROW_ASSIGN_OR_RAISE(auto y, GetArrayDataSample<arrow::Int64Type>({4, 5, 6})); | ||
| ARROW_ASSIGN_OR_RAISE(auto z, GetArrayDataSample<arrow::Int64Type>({7, 8, 9})); | ||
| | ||
| ARROW_ASSIGN_OR_RAISE(auto res, cp::CallFunction(name, {x, y, z})); | ||
| auto res_array = res.make_array(); | ||
| std::cout << "Result" << std::endl; | ||
| std::cout << res_array->ToString() << std::endl; | ||
| return arrow::Status::OK(); | ||
| } | ||
| | ||
| int main(int argc, char** argv) { | ||
| auto status = Execute(); | ||
| if (!status.ok()) { | ||
| std::cerr << "Error occurred : " << status.message() << std::endl; | ||
| return EXIT_FAILURE; | ||
| } | ||
| return EXIT_SUCCESS; | ||
| } | ||
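
The example above follows a register-then-call-by-name pattern: a kernel is attached to a named function, the function is added to a registry, and callers later look it up by its string name. The sketch below mirrors that flow using only the C++ standard library, which may help when reasoning about the control flow without an Arrow build. Every name in it (`ToyRegistry`, `GetToyRegistry`, `RegisterToyFunctions`, `Column`, `Kernel`) is hypothetical and is not part of the Arrow API; it also ignores nulls, types, and memory pools, which the real kernel machinery handles.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Toy stand-ins for illustration only; none of these names are Arrow APIs.
using Column = std::vector<std::int64_t>;
using Kernel = std::function<Column(const std::vector<Column>&)>;

class ToyRegistry {
 public:
  // Register a kernel under a unique name with a fixed number of arguments.
  void AddFunction(const std::string& name, std::size_t arity, Kernel kernel) {
    if (!functions_.emplace(name, Entry{arity, std::move(kernel)}).second) {
      throw std::invalid_argument("already registered: " + name);
    }
  }

  // Look the function up by name, check the argument count, then run it.
  Column CallFunction(const std::string& name,
                      const std::vector<Column>& args) const {
    auto it = functions_.find(name);
    if (it == functions_.end()) {
      throw std::out_of_range("no such function: " + name);
    }
    if (args.size() != it->second.arity) {
      throw std::invalid_argument("wrong number of arguments for " + name);
    }
    return it->second.kernel(args);
  }

 private:
  struct Entry {
    std::size_t arity;
    Kernel kernel;
  };
  std::map<std::string, Entry> functions_;
};

// Process-wide registry, loosely analogous in role to a global registry.
ToyRegistry& GetToyRegistry() {
  static ToyRegistry registry;
  return registry;
}

// Registers a built-in style "add" and a user-defined "add_three" that is
// composed from two calls to "add", as SampleFunction does in the diff.
void RegisterToyFunctions() {
  ToyRegistry& registry = GetToyRegistry();
  registry.AddFunction("add", 2, [](const std::vector<Column>& args) {
    if (args[0].size() != args[1].size()) {
      throw std::invalid_argument("length mismatch");
    }
    Column out(args[0].size());
    for (std::size_t i = 0; i < out.size(); ++i) {
      out[i] = args[0][i] + args[1][i];
    }
    return out;
  });
  registry.AddFunction("add_three", 3, [](const std::vector<Column>& args) {
    // temp = x + y; return temp + z
    Column temp = GetToyRegistry().CallFunction("add", {args[0], args[1]});
    return GetToyRegistry().CallFunction("add", {temp, args[2]});
  });
}
```

With the same inputs as the diff (`{1, 2, 3}`, `{4, 5, 6}`, `{7, 8, 9}`), calling `RegisterToyFunctions()` once and then `GetToyRegistry().CallFunction("add_three", ...)` yields the elementwise sums 12, 15, 18. Registering the same name twice is rejected in this sketch, a choice made here so that accidental duplicate registration is visible to the caller.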