n-to-1 kernel fusion via KernelConfig #45363
michaelbenayoun wants to merge 10 commits into huggingface:main
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
To be removed before merging. Temporary file.
[For maintainers] Suggested jobs to run (before merge): run-slow: qwen3
    _FUSION_PATTERNS_REGISTRY[model_class_or_instance] = patterns


    class FusedModuleBase(nn.Module):
@ArthurZucker the API is way simpler than #44979 because the whole purpose of fusion here is to replace the forward with a kernel, so we do not need all the complex machinery. All we need is a module whose forward can be replaced with the kernel's forward.
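To make the "module plus swappable forward" idea concrete, here is a minimal, torch-free sketch. Everything in it is hypothetical: `FusedNormMLP` and `kernel_forward` are illustrative stand-ins, not the PR's `FusedModuleBase` or a real hub kernel, and plain Python callables take the place of `nn.Module` children.

```python
import types


class FusedNormMLP:
    """Hypothetical fused module: holds two sub-steps and runs them in order."""

    def __init__(self, norm, mlp):
        self.norm = norm
        self.mlp = mlp

    def forward(self, x):
        # Reference path: two separate calls.
        return self.mlp(self.norm(x))

    def __call__(self, x):
        # Looks up `forward` on the instance first, so an instance-level
        # override takes precedence over the method defined on the class.
        return self.forward(x)


def kernel_forward(self, x):
    # Stand-in for a fused kernel: same arithmetic as norm followed by mlp,
    # computed in a single call.
    return x / 2 + 1


fused = FusedNormMLP(norm=lambda x: x / 2, mlp=lambda x: x + 1)
reference = fused(4.0)

# Swap the forward for the kernel's forward on this instance only.
fused.forward = types.MethodType(kernel_forward, fused)
swapped = fused(4.0)
```

The caller keeps invoking the same object; only the function behind `forward` changes, which is presumably why no heavier machinery is needed.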
ArthurZucker left a comment
Much simpler, much better! I like it 🤗 (not an in-depth review, as we have to discuss some stuff internally!)
    module.add_module(child_names[0], fused_instance)
    for child_name in child_names[1:]:
        module.add_module(child_name, nn.Identity())
We're probably going to have concerns with hooks, especially with TP, but also with accelerate!
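The slot replacement in the snippet above can be sketched without torch as follows. This is only an illustration under stated assumptions: `Block`, `Identity`, and `fuse_children` are hypothetical stand-ins (the real code operates on `nn.Module` and `nn.Identity`), and the parent is assumed to call its children strictly in sequence, which a real model's forward may not do.

```python
class Identity:
    """Stand-in for torch.nn.Identity: returns its input unchanged."""

    def __call__(self, x):
        return x


class Block:
    """Hypothetical parent that calls its named children in a fixed order."""

    def __init__(self):
        self.children = {"norm": lambda x: x / 2, "mlp": lambda x: x + 1}

    def add_module(self, name, module):
        # Re-assigning an existing key keeps its position in the dict,
        # so the call order of the children is preserved.
        self.children[name] = module

    def __call__(self, x):
        for child in self.children.values():
            x = child(x)
        return x


def fuse_children(module, child_names, fused_instance):
    # The first slot receives the fused module; the remaining slots become
    # pass-throughs so the parent's own forward does not need to change.
    module.add_module(child_names[0], fused_instance)
    for child_name in child_names[1:]:
        module.add_module(child_name, Identity())


block = Block()
before = block(4.0)
fuse_children(block, ["norm", "mlp"], lambda x: x / 2 + 1)
after = block(4.0)
```

Because the original child objects are replaced outright, anything attached to them (hooks, for example) no longer runs, which seems to be the concern raised in the comment.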
What does this PR do?
This PR adds support for fusing multiple modules into a single kernel. The motivating case is fused RMSNorm+MLP kernels, but the API is generic.
What changed
- FusedModuleBase, fuse_modules, unfuse_modules, register_fusion_patterns added to hub_kernels.py
- KernelConfig now accepts tuple keys that trigger fusion before kernelization

Two ways to use it
Option A: inline (no model changes needed)
Embed the glob patterns directly in the KernelConfig key as (name, path) pairs:
Option B: via registry (model declares its patterns)
The model class declares where its fusable modules live:
Or externally without touching the class:
Then the KernelConfig key is just the kernel name:
Example
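As a rough illustration of how glob patterns could select the modules to fuse, the sketch below keeps a registry keyed by model class and resolves (name, path) pairs against dotted module paths. It is not the PR's implementation: `REGISTRY`, `register`, `match_paths`, `ToyModel`, the kernel name `"fused-rmsnorm-mlp"`, the module paths, and the shape of the pattern dictionary are all assumptions made for the example, and `fnmatchcase` from the standard library stands in for whatever matching the PR actually uses.

```python
from fnmatch import fnmatchcase

# Hypothetical registry: model class -> {kernel name: [(name, glob), ...]}
REGISTRY = {}


def register(model_class, patterns):
    # Mirrors the idea of storing patterns per model class.
    REGISTRY[model_class] = patterns


def match_paths(module_paths, pairs):
    """Return {name: [matching dotted paths]} for each (name, glob) pair."""
    return {
        name: [path for path in module_paths if fnmatchcase(path, glob)]
        for name, glob in pairs
    }


class ToyModel:
    """Placeholder model class used only as a registry key."""


# Registry route: patterns are declared once for the class.
register(
    ToyModel,
    {"fused-rmsnorm-mlp": [("norm", "layers.*.post_norm"), ("mlp", "layers.*.mlp")]},
)

paths = [
    "layers.0.post_norm",
    "layers.0.mlp",
    "layers.1.post_norm",
    "layers.1.mlp",
    "lm_head",
]

# With the registry populated, the kernel name alone is enough to find
# the modules that would be fused together.
matches = match_paths(paths, REGISTRY[ToyModel]["fused-rmsnorm-mlp"])
```

In the inline route, the same (name, glob) pairs would travel with the config key instead of being looked up in the registry.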
Question
How is this related to #45041, and does it serve the same purpose/needs?