Skip compiling fusion segments when using expression evaluator #1930
@@ -890,9 +890,7 @@ class PredicateChcker : public IterVisitor {
} // namespace

PredicateElimination::PredicateElimination(Fusion* fusion) {
This was a WAR. Removing now, since we are skipping compilation.
@@ -1030,12 +1030,7 @@ void validateSizeMemoryOp(LoadStoreOp* ldst) {
//! Validate data format and GPU arch compatibility of scheduled
//! mma operators on the fusion.
void validateMma(Fusion* fusion) {
This was a WAR. Removing now, since we are skipping compilation.
csrc/serde/fusion_cache.fbs (outdated)

// Each Fusion Executor maps to a lowered and compiled kernel.
table FusionExecutor {
  is_compilation_skipped : bool;
Why is this needed? isCompilationSkipped is a method that describes the state of the FusionExecutor. Making it a field in table FusionExecutor seems to suggest it's part of the state itself. cc @rdspring1
It is a way to delineate when we're using the Aten function. Otherwise, we will recompile the fusion to create lowered_ during deserialization.
isCompilationSkipped indicates that we will use an Aten function and the specific fusion is passed to the FusionExecutor in compileFusion. You could separate the first part into a bool field.
Sorry for the delay -- I missed the notification.
I understood how it's used here to avoid unnecessary compilation. However, can we replace that check with a check on the fusion itself instead of introducing yet another flag? If that works, it avoids an additional piece of state that has to be kept consistent with the existing state of the fusion.
csrc/kernel_cache.h (outdated)
return std::all_of(
    executors_.begin(), executors_.end(), [](const auto& executor) {
-     return executor.isCompiled();
+     return executor.isCompiled() || executor.isCompilationSkipped();
nitpick: This feels a little bit confusing to me.
i.e. FusionKernelRuntime::isCompiled means something different from FusionExecutor::isCompiled (similarly, we have FusionExecutorCache::isCompiled).
We don't have to do it in this PR, but maybe another renaming PR to clean up the isCompiled vs isCompilationSkipped in FusionExecutor.
I should rename it to FusionKernelRuntime::isCompiledOrCompilationSkipped.
This function is used to check if compilation was already attempted which is true if either the executors are compiled or compilation was skipped for them.
Any other naming suggestions for this function?
- isCompilationSkipped() --- Returns True when we use the Aten function and FusionExecutor is never compiled.
- isCompiled() --- Returns False when FusionExecutor has yet to be compiled.
Maybe we should rename isCompilationSkipped() to isAtenExecuted() or isEagerModeExecutor()
I agree with @rdspring1's suggestion. We can keep isCompiled to denote if compilation was attempted -- either a kernel was compiled or we are using Aten evaluation. Additionally, isCompilationSkipped is renamed to isEagerModeExecutor.
In this naming scheme, isCompiled always means whether or not we attempted to compile a fusion. FusionExecutor::isCompiled which tests for a compiled kernel will need to be renamed though -- maybe isNvfuserExecutor?
Wdyt?
When isEagerModeExecutor is false, we're using the nvfuser executor.
isCompiled implies that we have a cubin for the given fusion. There is an implicit assumption that the nvfuser executor is being used.
I'd consider renaming isCompiled to isCubinCompiled, but leaving it alone is also fine.
Okay, so the renaming is:
- FusionExecutor::isCompiled -> FusionExecutor::isCubinCompiled
- FusionExecutor::isCompilationSkipped -> FusionExecutor::isEagerModeExecutor
- FusionKernelRuntime/FusionExecutorCache::isCompiled -> no change. Returns true if either of the above is true, indicating that compilation was attempted.
I'll go with this naming if it clears the confusion. @rdspring1 @jjsjann123
sounds very reasonable to me.
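The naming scheme agreed on in this thread can be sketched roughly as follows. This is a minimal illustration, not the nvFuser implementation: `ExecutorSketch`, its two boolean fields, and the free function `isCompiled` are hypothetical stand-ins for `FusionExecutor` and `FusionKernelRuntime::isCompiled`.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for FusionExecutor; only the two state queries
// discussed in the thread are modeled.
struct ExecutorSketch {
  bool has_cubin = false;       // assumption: a kernel was lowered and compiled
  bool uses_aten_eval = false;  // assumption: segment runs through ATen evaluation

  bool isCubinCompiled() const { return has_cubin; }
  bool isEagerModeExecutor() const { return uses_aten_eval; }
};

// Hypothetical stand-in for FusionKernelRuntime::isCompiled(): true when
// compilation was attempted for every executor, either by producing a
// kernel or by deciding to skip compilation.
inline bool isCompiled(const std::vector<ExecutorSketch>& executors) {
  return std::all_of(
      executors.begin(), executors.end(), [](const ExecutorSketch& e) {
        return e.isCubinCompiled() || e.isEagerModeExecutor();
      });
}
```

Note that `std::all_of` returns true for an empty range, so a runtime with no executors would report itself as compiled under this sketch.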
…`. (#2061) Serialization assumes that `FusionExecutor` always contains a compiled kernel. This PR updates serialization to skip compilation correctly.
Co-authored-by: Jingyue Wu <wujingyue@gmail.com>
Co-authored-by: jjsjann123 <jiej@nvidia.com>
csrc/executor.h (outdated)
}

Fusion* fusion() const {
  NVF_ERROR((lowered_ && !fusion_) || (!lowered_ && fusion_));
wasn't this just lowered_ ^ fusion_? An error message would be nice.
Yes, but XOR does not work on these objects.
TIL. explicit operator bool()
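A small sketch of why `^` fails here, assuming `lowered_` and `fusion_` are `std::unique_ptr` members (the thread does not state their types). `std::unique_ptr` only provides an `explicit operator bool()`, which applies in contextual conversions such as `!`, `&&`, and `||` but not to operands of `^`. The helper `exactlyOneSet` is hypothetical.

```cpp
#include <memory>
#include <type_traits>

// unique_ptr is not implicitly convertible to bool...
static_assert(!std::is_convertible_v<std::unique_ptr<int>, bool>);
// ...but a bool can be direct-initialized from it via the explicit operator.
static_assert(std::is_constructible_v<bool, std::unique_ptr<int>>);

// Hypothetical helper: true when exactly one of the two pointers is set.
inline bool exactlyOneSet(
    const std::unique_ptr<int>& a,
    const std::unique_ptr<int>& b) {
  // `a ^ b` would not compile: no operator^ exists for unique_ptr and the
  // explicit conversion to bool is not considered for its operands.
  // `!`, `&&` and `||` contextually convert each operand to bool.
  return (a && !b) || (!a && b);
}
```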
jjsjann123 left a comment
Thanks for addressing my concerns.
Fusion* fusion() const {
  NVF_ERROR(
      (lowered_ && !fusion_) || (!lowered_ && fusion_),
I think it's clearer to say
(lowered_ == nullptr) != (fusion_ == nullptr)
which essentially uses != as a logical XOR.
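The suggested form can be sketched as below, together with the error message requested earlier in the review. All names are hypothetical (`HolderSketch`, `LoweredSketch`, `FusionSketch`), and a plain exception stands in for `NVF_ERROR`, whose exact behavior is not shown in this thread.

```cpp
#include <memory>
#include <stdexcept>

struct LoweredSketch {};  // hypothetical stand-in for the lowered kernel state
struct FusionSketch {};   // hypothetical stand-in for Fusion

struct HolderSketch {
  std::unique_ptr<LoweredSketch> lowered;
  std::unique_ptr<FusionSketch> fusion;

  // Throws unless exactly one of the two members is set.
  void checkInvariant() const {
    // Each comparison yields a bool, so != acts as a logical XOR.
    const bool exactly_one = (lowered == nullptr) != (fusion == nullptr);
    if (!exactly_one) {
      throw std::logic_error(
          "Expected exactly one of lowered and fusion to be non-null.");
    }
  }
};
```

Comparing against `nullptr` first sidesteps the explicit bool conversion entirely, which is why this form compiles where `lowered ^ fusion` does not.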
int64_t group_id);

//! Check if compilation was skipped (fusion segment marked for EE).
bool isExprEval() const {
How about isExpressionEvaluated for less ambiguity?
preseg_passes::OptimizationPassGuard<preseg_passes::MarkAliasesPreparePass>
    optimization_guard(false);

auto fusion_ptr = std::make_unique<Fusion>();
I know you didn't add this, but since you are here, do you mind doing a minor side refactor?
- Change this line to auto fusion = std::make_unique<Fusion>();
- Remove the next line, which says auto& fusion=*fusion_ptr.
- Change existing uses of fusion. to fusion->.
The gist is that unique_ptr comes with syntactic sugar that lets it be used as if it were a raw pointer. It's not necessary to create another variable just to hold its reference.
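The refactor described above amounts to the following before/after pair. This is a sketch with a hypothetical `FusionSketch` type and `setName` method, used only to have something to call; it is not the code from the PR.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for Fusion.
struct FusionSketch {
  std::string name;
  void setName(std::string n) { name = std::move(n); }
};

// Before: an extra reference variable exists only to allow `fusion.` syntax.
inline std::unique_ptr<FusionSketch> buildBefore() {
  auto fusion_ptr = std::make_unique<FusionSketch>();
  auto& fusion = *fusion_ptr;
  fusion.setName("segment");
  return fusion_ptr;
}

// After: unique_ptr's operator-> gives raw-pointer-like access directly.
inline std::unique_ptr<FusionSketch> buildAfter() {
  auto fusion = std::make_unique<FusionSketch>();
  fusion->setName("segment");
  return fusion;
}
```

Both functions build the same object; the second simply has one fewer name to keep track of.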
This PR resolves Issue #1812.
- predicate_elimination.cpp and validation.cpp (see comments in the PR)
- FusionExecutor::fusion_ is initialized only when compilation is skipped. So only one of lowered_ or fusion_ is non-null.
- FusionKernelRuntime/FusionExecutorCache/FusionExecutor::isCompiled() indicates whether compilation was attempted (either marked for EE or via nvFuser).