Conversation
|
!test |
|
Review updated until commit 41b077c Description
|
| Relevant files | |||||||||
|---|---|---|---|---|---|---|---|---|---|
| Enhancement |
|
PR Reviewer Guide
Here are some key observations to aid the review process:
| 🧪 PR contains tests |
| ⚡ Recommended focus areas for review |
Function signature compatibility
TensorView* tv to const TensorView* tv and a new ignore_empty_alloc parameter was added. While this is a breaking change, the existing caller in allocation.cpp has been updated. Verify that all other potential callers throughout the codebase are compatible with this change. |
Test failures
-
(Medium, 1)
Tensor numerical mismatches in nvFuser matmul testsTest Name H100 Source HopperMatmulTest.HSH_NT_UseScheduler_MultipleInstructionsPerWarpTile ❌ Link
Greptile OverviewGreptile SummaryRefactored Key changes:
Confidence Score: 5/5
Important Files ChangedFile Analysis
Sequence DiagramsequenceDiagram
participant AllocationPass as AllocationDomainSetup
participant ValidationPass as VectorizeValidator
participant Utils as ir_utils::canUsePresetAllocationDomain
participant TV as TensorView
Note over AllocationPass,ValidationPass: Before: Function existed only in AllocationDomainSetup
AllocationPass->>Utils: canUsePresetAllocationDomain(tv)
Utils->>TV: hasAllocation() [if ignore_empty_alloc=true]
TV-->>Utils: false
Utils-->>AllocationPass: false (skip preset domain)
AllocationPass->>Utils: canUsePresetAllocationDomain(tv)
Utils->>TV: hasAllocation()
TV-->>Utils: true
Utils->>TV: getMemoryType()
TV-->>Utils: Global
Utils-->>AllocationPass: true (use preset domain)
AllocationPass->>Utils: canUsePresetAllocationDomain(tv)
Utils->>TV: hasAllocation()
TV-->>Utils: true
Utils->>TV: getMemoryType()
TV-->>Utils: Shared
Utils->>TV: definition()
TV-->>Utils: nullptr [NULL CHECK ADDED]
Utils-->>AllocationPass: false (skip preset domain)
Note over ValidationPass,Utils: New: Validation now uses same function
ValidationPass->>Utils: canUsePresetAllocationDomain(tv)
Utils->>TV: hasAllocation()
TV-->>Utils: false
Utils-->>ValidationPass: false
ValidationPass->>ValidationPass: Early return (skip validation)
|
| // If a shared memory output produced by scatter has an | ||
| // allocation domain explicitly set, it's likely to be the | ||
| // valid allocation domain. | ||
| if (auto def = tv->definition(); def != nullptr && def->isA<ScatterOp>()) { |
There was a problem hiding this comment.
logic: same null pointer issue - tv->definition() may be null
| if (auto def = tv->definition(); def != nullptr && def->isA<ScatterOp>()) { | |
| if (auto def = tv->definition(); def != nullptr && def->isA<ScatterOp>()) { |
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
|
!test |
| // AliasTest.NotAllOutputAlias_Reduction has a tensor, tv6, that | ||
| // is a Local tensor with CA position of 4 but has an allocation |
There was a problem hiding this comment.
This sounds like a bug/limitation of alias analysis. The fix could be as simple as letting
Line 58 in 294ba81
How do I verify my assumption? This canUsePresetAllocationDomain workaround got merged in months ago -- it's unclear to me how to revert it.
There was a problem hiding this comment.
#5594 seems to pass existing tests. But I don't know how to trigger the issue that leads to canUsePresetAllocationDomain.
There was a problem hiding this comment.
I don't know how to trigger the issue that leads to canUsePresetAllocationDomain.
I wonder if we can adopt an existing test using the bulk load and fork the loaded cache to some other output that is safe alias.
always treat non-global tensors as undetermined.
TL;DR: I think that's probably a safe bet for the codegen status as-is.
If you look at this function, there's also cases where we want to respect allocation domain on shared memory buffers as well, if there's a swizzle on the transform path. I'm uncertain how that could affect the alias forwarding, i.e. Whether alias forwarding would correctly compute the output alias tensor properly handling or ignoring the allocation domain set on intermediate TVs. If alias analysis only forwards global tensors, I think it's safe to ignore anything else along the way?
It always mess with my head since that pass runs both on unsegmented and segmented groups.
There was a problem hiding this comment.
If alias analysis only forwards global tensors, I think it's safe to ignore anything else along the way?
Yes -- that's what #5594 tries to do. Is it sufficient to avoid this change? If so, is it even sufficient to avoid canUsePresetAllocationDomain entirely?
There was a problem hiding this comment.
That one looks good to me.
If the analysis only sticks to global tensors as anchor point, I agree that we don't need to stick with the same analysis that lowering uses for ignoring allocation domain.
|
closing in favor of #5594 |
Pull the function
canUsePresetAllocationDomainfrom allocation pass during lowering into a util function.The reason is to have a consistent way to check if a certain allocation domain is being ignored by codegen, which is useful for things like alias analysis, where they can safely skip the validation.
The refactored util function is used in follow up PR #5184