Description
Describe the enhancement requested
Description
Our team has been leveraging Gandiva in our projects, and its performance and capabilities have been commendable. However, we've identified a constraint concerning the registration of functions. At present, Gandiva necessitates that functions be registered directly within its codebase. This method, while functional, is not the most user-friendly and presents hurdles for those aiming to incorporate third-party functions. Direct modifications to Gandiva's source code for such integrations can inadvertently introduce maintenance challenges and potential versioning conflicts down the line.
Proposal
To address this limitation, I propose the introduction of an external function registry mechanism in Gandiva. This would allow users and developers to register and integrate custom functions without directly modifying Gandiva's core source code.
Two major changes are proposed for supporting this capability:
- Add a new API, `AddFunction(NativeFunction native_function)`, to `FunctionRegistry`, where the parameter `native_function` stores the external function's metadata, so that developers can register external functions by calling this API. Discovering the external function metadata is outside the scope of this proposal; developers can come up with their own mechanism.
- Add a new class for storing external functions' LLVM IR buffers (likely a singleton class), and merge the external IRs with the built-in function IR into the LLVM module, so that third-party pre-compiled functions can be integrated via LLVM bitcode.
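To make the first change more concrete, here is a minimal, self-contained sketch of what registering an external function through such an API could look like. The `NativeFunction` and `FunctionRegistry` types below are simplified, hypothetical stand-ins written only for illustration; they are not Gandiva's actual classes, and the field names, the `Lookup` helper, and the `bool` return value are assumptions.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for the metadata of one function.
// Field names are assumptions made for this sketch.
struct NativeFunction {
  std::string base_name;                 // name used in expressions
  std::vector<std::string> param_types;  // e.g. {"int32"}
  std::string return_type;               // e.g. "int64"
  std::string pc_name;                   // symbol of the pre-compiled implementation
};

// Hypothetical stand-in for a registry that accepts external functions.
class FunctionRegistry {
 public:
  // Registers an external function. Returns false if a function with the
  // same name and parameter types has already been registered.
  bool AddFunction(NativeFunction native_function) {
    std::string key =
        SignatureKey(native_function.base_name, native_function.param_types);
    return functions_.emplace(std::move(key), std::move(native_function)).second;
  }

  // Looks up a function by name and parameter types; nullptr if absent.
  const NativeFunction* Lookup(const std::string& base_name,
                               const std::vector<std::string>& param_types) const {
    auto it = functions_.find(SignatureKey(base_name, param_types));
    return it == functions_.end() ? nullptr : &it->second;
  }

  std::size_t size() const { return functions_.size(); }

 private:
  // Builds a key such as "multiply_by_two(int32)".
  static std::string SignatureKey(const std::string& base_name,
                                  const std::vector<std::string>& param_types) {
    std::string key = base_name + "(";
    for (std::size_t i = 0; i < param_types.size(); ++i) {
      if (i > 0) key += ",";
      key += param_types[i];
    }
    key += ")";
    return key;
  }

  std::unordered_map<std::string, NativeFunction> functions_;
};
```

With this sketch, a caller would build a `NativeFunction` describing the external function and pass it to `AddFunction`; how that metadata is discovered (config files, directories, and so on) is left to the caller, as in the proposal.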
I've come up with a PR (#37787) with more details, including the JSON schema and a minimal external C++ function registered via this approach to demonstrate and test how it works.
The latest PR implementing this proposal is #38116.
Benefits
- Flexibility: This feature would grant users the autonomy to effortlessly integrate third-party functions or bespoke logic, eliminating the need to navigate the core codebase.
- Maintainability: Segregating external functions ensures that Gandiva's primary code remains streamlined, facilitating easier updates and maintenance.
- Polyglot programming: The proposed registry could pave the way for function authoring beyond C++, potentially embracing languages like C, Rust, or Zig. These could then be integrated via a standard C API or even at the LLVM IR level.
- Community Engagement: Encouraging external function registration could bolster community participation. This would enable developers to not only contribute but also disseminate their unique functions. Consequently, specialized functions could be curated and shared more widely.
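As a rough illustration of the polyglot point above, an external function only needs to expose a stable C ABI symbol for a registry to refer to it by name. The function below is a hypothetical example (its name and signature are assumptions, not part of Gandiva); an equivalent symbol could be produced from C, Rust, or Zig.

```cpp
#include <cstdint>

// Hypothetical external function exposed with C linkage, so that its symbol
// name is not mangled and can be referenced from function metadata.
extern "C" int64_t multiply_by_two_int32(int32_t x) {
  // Widen before multiplying to avoid overflow in 32-bit arithmetic.
  return static_cast<int64_t>(x) * 2;
}
```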
While the intricate design specifics are still under consideration, I'm keen to understand the community's perspective on this proposal. Would an external function registry align with Gandiva's future roadmap? Your insights would be invaluable. Thank you.
Component(s)
C++ - Gandiva
UPDATE:
- 2023-10-07: Submitted a new PR (GH-37753: [C++][Gandiva] Add external function registry support #38116) and closed the original PR (GH-37753: [C++][Gandiva] Add external function registry support #37787) after discussion on the dev mailing list (related mailing list discussion: https://lists.apache.org/thread/lm4sbw61w9cl7fsmo7tz3gvkq0ox6rod)
- 2023-09-28: Following feedback from the mailing list, revised the proposal by removing the JSON/directory metadata population logic from it
- 2023-09-25: Added more description to the Proposal section