[C++][Gandiva] Support external function registry #37753

@niyue

Describe the enhancement requested

Description

Our team has been leveraging Gandiva in our projects, and its performance and capabilities have been commendable. However, we've identified a constraint concerning the registration of functions. At present, Gandiva necessitates that functions be registered directly within its codebase. This method, while functional, is not the most user-friendly and presents hurdles for those aiming to incorporate third-party functions. Direct modifications to Gandiva's source code for such integrations can inadvertently introduce maintenance challenges and potential versioning conflicts down the line.

Proposal

To address this limitation, I propose the introduction of an external function registry mechanism in Gandiva. This would allow users and developers to register and integrate custom functions without directly modifying Gandiva's core source code.
Two major changes are proposed for supporting this capability:

  • Add a new API, AddFunction(NativeFunction native_function), to FunctionRegistry, where the native_function parameter holds the external function's metadata, so that developers can register external functions by calling it. Discovering external function metadata is outside the scope of this proposal; developers can come up with their own mechanism.
  • Add a new class for storing external functions' LLVM IR buffers (likely a singleton class), and merge the external IRs with the built-in function IR into the LLVM module, so that third-party pre-compiled functions can be integrated via LLVM bitcode.
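
To make the two changes more concrete, here is a minimal, self-contained sketch of how they might fit together. It is not Gandiva's actual code: the fields of NativeFunction, the Lookup helper, the ExternalIRStore class name and its methods, and the example function multiply_by_two are all assumptions made for illustration. Merging the stored buffers into the LLVM module is deliberately left out, since that depends on LLVM internals not described here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified metadata for one external function.
// Field names are assumptions, not Gandiva's real NativeFunction layout.
struct NativeFunction {
  std::string name;                      // name used in expressions
  std::vector<std::string> param_types;  // e.g. {"int32"}
  std::string return_type;               // e.g. "int64"
  std::string pc_name;                   // symbol of the pre-compiled function
};

// Sketch of a registry exposing the proposed AddFunction API.
class FunctionRegistry {
 public:
  // Registers an external function; returns false if the same
  // name/parameter-type signature has already been registered.
  bool AddFunction(NativeFunction native_function) {
    std::string key =
        SignatureKey(native_function.name, native_function.param_types);
    return functions_.emplace(std::move(key), std::move(native_function)).second;
  }

  // Returns the registered metadata, or nullptr if the signature is unknown.
  const NativeFunction* Lookup(const std::string& name,
                               const std::vector<std::string>& param_types) const {
    auto it = functions_.find(SignatureKey(name, param_types));
    return it == functions_.end() ? nullptr : &it->second;
  }

  std::size_t size() const { return functions_.size(); }

 private:
  // Builds a key such as "multiply_by_two(int32,)" from a signature.
  static std::string SignatureKey(const std::string& name,
                                  const std::vector<std::string>& param_types) {
    std::string key = name + "(";
    for (const auto& type : param_types) key += type + ",";
    return key + ")";
  }

  std::map<std::string, NativeFunction> functions_;
};

// Sketch of the singleton that holds external LLVM bitcode buffers.
// A real implementation would later link these into the LLVM module.
class ExternalIRStore {
 public:
  static ExternalIRStore& Instance() {
    static ExternalIRStore store;
    return store;
  }

  void AddBitcode(std::vector<std::uint8_t> bitcode) {
    buffers_.push_back(std::move(bitcode));
  }

  const std::vector<std::vector<std::uint8_t>>& buffers() const {
    return buffers_;
  }

 private:
  ExternalIRStore() = default;
  std::vector<std::vector<std::uint8_t>> buffers_;
};
```

In this sketch, a caller would build a NativeFunction such as {"multiply_by_two", {"int32"}, "int64", "multiply_by_two_int32"}, pass it to AddFunction, and hand the matching bitcode to ExternalIRStore::Instance().AddBitcode(...). Keeping metadata and IR in separate classes mirrors the split between the two proposed changes.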

I've come up with a PR (#37787) with more details, including the JSON schema and a minimal external C++ function registered via this approach to demonstrate and test how it works.
The latest PR implementing this proposal is #38116.

Benefits

  • Flexibility: This feature would grant users the autonomy to effortlessly integrate third-party functions or bespoke logic, eliminating the need to navigate the core codebase.
  • Maintainability: Segregating external functions ensures that Gandiva's primary code remains streamlined, facilitating easier updates and maintenance.
  • Polyglot programming: The proposed registry could pave the way for function authoring beyond C++, potentially embracing languages like C, Rust, or Zig. These could then be integrated via a standard C API or even at the LLVM IR level.
  • Community Engagement: Encouraging external function registration could bolster community participation. This would enable developers to not only contribute but also disseminate their unique functions. Consequently, specialized functions could be curated and shared more widely.

While the intricate design specifics are still under consideration, I'm keen to understand the community's perspective on this proposal. Would an external function registry align with Gandiva's future roadmap? Your insights would be invaluable. Thank you.

Component(s)

C++ - Gandiva

UPDATE:
