Make WebGPU EP compatible with EP API #26907
Pull request overview
This PR extends WebGPU EP to support building as both a bundled EP (static library) and an EP API-based plugin EP (shared library). The changes introduce a new adapter infrastructure in include/onnxruntime/ep/ that bridges C-API objects to simulate ORT internal class behaviors, enabling compile-time switching between build modes with minimal code changes.
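As a rough illustration of the compile-time switching described above, the sketch below selects between two type sets with a namespace alias. All names here (`native`, `adapter`, `CKernelInfo`, `BUILD_AS_PLUGIN_EP`, `IsMul`) are hypothetical stand-ins chosen for the example, not the identifiers used in this PR.

```cpp
#include <string>

// Hypothetical stand-in for an ORT-internal class used by the bundled build.
namespace native {
struct OpKernelInfo {
  std::string op_type;
  const std::string& GetOpType() const { return op_type; }  // hypothetical accessor
};
}  // namespace native

namespace adapter {
// Stand-in for an opaque C-API handle.
struct CKernelInfo {
  const char* op_type;
};

// Wraps the C-style handle and exposes the same method names as native::OpKernelInfo.
class OpKernelInfo {
 public:
  explicit OpKernelInfo(const CKernelInfo* handle) : op_type_(handle->op_type) {}
  const std::string& GetOpType() const { return op_type_; }

 private:
  std::string op_type_;
};
}  // namespace adapter

// BUILD_AS_PLUGIN_EP is a hypothetical macro name for the build-mode switch.
#if defined(BUILD_AS_PLUGIN_EP)
namespace ep_types = adapter;
#else
namespace ep_types = native;
#endif

// Kernel code is written once against ep_types and compiles in either mode.
inline bool IsMul(const ep_types::OpKernelInfo& info) {
  return info.GetOpType() == "Mul";
}
```

Because both classes expose the same member names, code such as `IsMul` does not need to change when the build mode changes; only the alias does.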
Key changes:
- New EP adapter infrastructure with header-only wrapper classes for kernel info, context, registry, and EP interface
- WebGPU EP API implementation (api.cc, factory.cc/h, ep.cc/h) in onnxruntime/core/providers/webgpu/ep/
- Templated CPU tensor operator base classes to work with both native and adapter OpKernelInfo types
- Enhanced C API with KernelInfo_GetOperatorType, KernelInfo_GetSinceVersion, and KernelInfo_GetEp
- Test infrastructure updates including example kernel registry changes (Mul kernel renamed to BinaryOp supporting both Add and Mul)
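One way to read the "templated base classes" item: the attribute-parsing logic is written once and accepts any kernel-info type that offers the same accessor. The sketch below assumes a hypothetical `GetAttrOrDefault` accessor and hypothetical types `NativeKernelInfo`, `AdapterKernelInfo`, and `ConcatBaseSketch`; the real classes in the PR may be shaped differently.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the native kernel-info type.
struct NativeKernelInfo {
  std::map<std::string, int64_t> attrs;
  int64_t GetAttrOrDefault(const std::string& name, int64_t fallback) const {
    auto it = attrs.find(name);
    return it == attrs.end() ? fallback : it->second;
  }
};

// Hypothetical stand-in for the adapter type. In a plugin build this would
// query a C-API handle; a map stands in for that handle here.
struct AdapterKernelInfo {
  std::map<std::string, int64_t> c_api_attrs;
  int64_t GetAttrOrDefault(const std::string& name, int64_t fallback) const {
    auto it = c_api_attrs.find(name);
    return it == c_api_attrs.end() ? fallback : it->second;
  }
};

// The constructor is templated on the kernel-info type, so the same
// attribute parsing serves both build modes.
class ConcatBaseSketch {
 public:
  template <typename KernelInfoT>
  explicit ConcatBaseSketch(const KernelInfoT& info)
      : axis_(info.GetAttrOrDefault("axis", 0)) {}

  int64_t axis() const { return axis_; }

 private:
  int64_t axis_;
};
```

Constructing `ConcatBaseSketch` from either info type yields the same parsed `axis`, which is the property the templating is meant to preserve.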
Reviewed changes
Copilot reviewed 75 out of 75 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| include/onnxruntime/ep/*.h | New EP adapter infrastructure headers providing C-API wrappers for kernel registration and execution |
| onnxruntime/core/providers/webgpu/ep/* | WebGPU EP API implementation for factory, EP instance, and plugin entry points |
| onnxruntime/core/providers/cpu/tensor/*.h | Templated base classes for tensor ops to support both native and adapter kernel info types |
| onnxruntime/test/autoep/* | Test updates including BinaryOp generalization and additional test coverage |
| onnxruntime/core/session/*.cc | Core API additions for kernel info operator type, version, and EP retrieval |
### Description
This PR adds a few headers for supporting building WebGPU EP and CUDA EP as plugin EPs. See summary of #26907
Pull request overview
Copilot reviewed 32 out of 32 changed files in this pull request and generated 5 comments.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
AI Summary
The overall direction is good: the PR adds a real EP API surface for WebGPU instead of relying on the legacy internal factory path, and most of the capability logic is intentionally mirrored from the native WebGPU EP. The blocking issue is that the new plugin path is still hard-wired to the default WebGPU context/device, so non-default device selection and the shared environment plumbing do not actually work once you leave the device-0 happy path.
Review
1. Device Routing And Allocator Metadata (
| # | Severity | Component | Issue |
|---|---|---|---|
| 1 | High | EP factory | Device selection is ignored and all allocators/memory infos are exported as device 0. |
| 2 | High | Data transfer | Shared WebGPU data transfer only works for context/device 0. |
| 3 | High | Capability selection | Preassigned WebGPU nodes are incorrectly included in CPU-preference fallback. |
Verdict
REQUEST CHANGES: the EP API path is still functionally tied to the default WebGPU device, and it also regresses capability selection behavior versus the native WebGPU EP.
Explanation of (1) and (2) from @tianleiwu's review: ORT device discovery has never worked with WebGPU, mainly for 2 reasons:
Based on this fact, we create at most one WebGPU device inside the WebGPU EP. The real "missing" features are the WebGPU options, which are planned to be supported in future changes.
    }

    uint32_t ORT_API_CALL Factory::GetVendorIdImpl(const OrtEpFactory* /*this_ptr*/) noexcept {
      return 0;
0 is None.
Should this return the actual vendor ID here?
For WebGPU, the vendor ID is explicitly defined as 0. See onnxruntime/include/onnxruntime/core/framework/ortdevice.h, lines 56 to 58 in b280801.
Pull request overview
Copilot reviewed 32 out of 32 changed files in this pull request and generated 2 comments.
    // currently only option "gpu_graph_id" is used
    auto graph_annotation_str = Api().ort.GetRunConfigEntry(run_options, kOrtRunOptionsConfigCudaGraphAnnotation);
Is this only handling this specific run option because we're missing a public C API to get all config entries from an OrtRunOptions? If so, I suppose that would be a good thing to add in a separate PR. It would allow this function to pass along all configs.
I think it is OK for WebGPU. The current design seems to require a deep copy to convert an OrtRunOptions into an onnxruntime::RunOptions, so even if we could get all the entries, we may still want to look only at the ones we are interested in.
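The trade-off discussed in this thread can be sketched as follows: rather than copying every entry, the EP reads only the keys it understands through a single-key lookup. `RunConfig`, `GetEntry`, and `PickRelevantEntries` are hypothetical names for this illustration and do not correspond to ORT APIs; the only key taken from the thread is "gpu_graph_id".

```cpp
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a key/value run-options store that only
// supports single-key lookup.
using RunConfig = std::map<std::string, std::string>;

std::optional<std::string> GetEntry(const RunConfig& config, const std::string& key) {
  auto it = config.find(key);
  if (it == config.end()) return std::nullopt;
  return it->second;
}

// Copies only the keys the EP cares about instead of deep-copying every entry.
RunConfig PickRelevantEntries(const RunConfig& source,
                              const std::vector<std::string>& keys_of_interest) {
  RunConfig picked;
  for (const auto& key : keys_of_interest) {
    if (auto value = GetEntry(source, key)) {
      picked.emplace(key, *value);
    }
  }
  return picked;
}
```

Under this approach, entries the EP does not recognize are simply never read, so an "enumerate all entries" API is not required for the current behavior.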
    OrtStatus* ORT_API_CALL Factory::CreateEpImpl(
        OrtEpFactory* this_ptr,
        const OrtHardwareDevice* const* /*devices*/,
Should this function return a status error if there is more than one hardware device specified here? For example, an application may try to create a session with two OrtEpDevice instances: a discrete gpu and an integrated gpu.
I'm assuming this is not supported by webgpu EP, so perhaps this should return an error to the user.
This is a totally valid usage scenario; however, WebGPU does not work this way.
Please check my reply at #26907 (comment). In short, ORT device discovery does not work with WebGPU. Given a specific OrtHardwareDevice, I cannot pass any info from the OrtHardwareDevice to WebGPU device creation, and I cannot tell whether the WebGPU backend will use it or not.
This is why I just ignore the parameter.
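To make the two positions in this thread concrete, here is a small sketch of both behaviors behind a flag. `HardwareDeviceSketch` and `CheckDevices` are hypothetical names, and the error text is invented for the example; this is not the code in factory.cc.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for OrtHardwareDevice.
struct HardwareDeviceSketch {
  std::string name;
};

// Returns an empty string on success, or an error message.
// strict == false mirrors the behavior described in the reply: the device list is ignored.
// strict == true mirrors the reviewer's suggestion: more than one device is rejected.
std::string CheckDevices(const HardwareDeviceSketch* const* /*devices*/,
                         size_t num_devices,
                         bool strict) {
  if (strict && num_devices > 1) {
    return "WebGPU EP sketch: expected at most one hardware device";
  }
  return {};
}
```

With `strict` set to false, a call with two devices succeeds and the list is never inspected, which matches the stated reason that nothing from the hardware device can be forwarded to WebGPU device creation.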
Description
This PR makes it possible to build WebGPU EP as an EP API based plugin EP.
Requirements
The goal of this PR is to support both building WebGPU EP as a bundled EP and an EP API based plugin EP. This approach allows:
Design & Implementation
Instead of changing WebGPU EP from a bundled EP to an EP API based plugin EP in one shot, this PR extends WebGPU EP to also support building as a plugin EP.
- add a new folder include/onnxruntime/ep with a bunch of header files. Those files are not WebGPU specific. They are used for:
  - onnxruntime::ep::Ep to inherit from

  These header files allow a compile-time "switch" to a different set of types to minimize changes to existing code. Specifically, pch.h is required to be included as PCH to make sure the "override" takes place correctly.
- add a new folder onnxruntime/core/providers/webgpu/ep for EP API implementation, specifically:
  - api.cc: implements CreateEpFactories and ReleaseEpFactory
  - ep.cc, ep.h: implement class onnxruntime::webgpu::ep::Ep
  - factory.cc, factory.h: implement class onnxruntime::webgpu::ep::Factory
Dependencies and Prerequisites
(unmerged changes are included as a part of current PR)
Missing Parts