Enable TensorIndexer for the matmul tests #5574
!test --diff
Review updated until commit 47ea1e2

Description

| Relevant files |
|---|
| Tests |
PR Reviewer Guide
Here are some key observations to aid the review process:
- 🧪 PR contains tests
- 🔒 No security concerns identified
- ⚡ Recommended focus areas for review: Completeness Check
Test failures

(Medium, 2) Tensor numerical mismatches in nvFuser HopperMatmulTest on H100

| Test Name | 20 | H100 | Source |
|---|---|---|---|
| HopperMatmulTest.HSH_NT_UseScheduler_MultipleInstructionsPerWarpTile | ❌ | ❌ | Link |
This was found while I was experimenting with TensorIndexer on the matmul tests (#5574). Ldmatrix and stmatrix use a special domain as an alternative loop domain for indexing. IIUC, we should not use the alternative domains when initializing tensors. This happens, for example, when a tensor is defined by an stmatrix op but is also initialized to zero for predicate elimination. It looks like the initialization should not be done at all, but I think that's a separate issue. Please see https://github.com/NVIDIA/Fuser/pull/5645/files#r2600720284. The other changes are just due to this change.

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
!test --diff
Greptile Overview

Greptile Summary

Enables TensorIndexer for all matmul test classes.
Confidence Score: 5/5
Sequence Diagram

sequenceDiagram
participant Test as Test Class (MmaTest/HopperRS/HopperRSStmatrix/SSTest)
participant SetUp as SetUp() Method
participant Options as EnableOptionsGuard
participant IdModel as IdModel System
participant TensorIndexer as TensorIndexer
participant Tests as Test Execution
Test->>SetUp: Initialize test fixture
SetUp->>Options: getCurOptions().set(EnableOption::IdModel, {"all"})
Options->>IdModel: Enable IdModel for all indexing operations
IdModel->>TensorIndexer: Activate TensorIndexer
Note over TensorIndexer: Uses actual allocation domain<br/>instead of getMaybeAllocationDomain
SetUp->>Test: Continue with test-specific setup
Test->>Tests: Run matmul test cases
Tests->>TensorIndexer: Generate indices for tensors
TensorIndexer->>Tests: Return optimized indices based on loop domain
Note over Tests: Tests verify correctness<br/>with different index generation
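
The fixture flow in the diagram can be sketched in a self-contained way. The snippet below is only an illustration of the pattern: `Options`, `EnableOptionsGuard`, and `MatmulFixture` here are simplified stand-ins written for this example, not nvFuser's real classes, and the assumption that the guard restores the previous options when it goes out of scope is mine rather than something stated in this PR. Only the `getCurOptions().set(EnableOption::IdModel, {"all"})` call shape is taken from the diagram.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the option enum; only the entry used in this PR is modeled.
enum class EnableOption { IdModel };

// Stand-in option table: maps an option to its string arguments.
class Options {
 public:
  void set(EnableOption opt, std::vector<std::string> args) {
    enabled_[opt] = std::move(args);
  }
  bool has(EnableOption opt) const { return enabled_.count(opt) > 0; }
  bool hasArg(EnableOption opt, const std::string& arg) const {
    auto it = enabled_.find(opt);
    if (it == enabled_.end()) {
      return false;
    }
    return std::find(it->second.begin(), it->second.end(), arg) !=
        it->second.end();
  }

 private:
  std::map<EnableOption, std::vector<std::string>> enabled_;
};

// Stand-in RAII guard (assumed behavior): snapshot the current options on
// construction and restore them on destruction.
class EnableOptionsGuard {
 public:
  EnableOptionsGuard() : saved_(current()) {}
  ~EnableOptionsGuard() { current() = saved_; }
  static Options& getCurOptions() { return current(); }

 private:
  static Options& current() {
    static Options opts;
    return opts;
  }
  Options saved_;
};

// Hypothetical fixture: SetUp() turns on IdModel for "all", which is the
// step the diagram describes as activating TensorIndexer.
struct MatmulFixture {
  EnableOptionsGuard guard;
  void SetUp() {
    EnableOptionsGuard::getCurOptions().set(EnableOption::IdModel, {"all"});
  }
};

// Returns whether IdModel "all" is visible while a fixture is alive.
bool idModelAllEnabledInsideFixture() {
  MatmulFixture fixture;
  fixture.SetUp();
  return EnableOptionsGuard::getCurOptions().hasArg(
      EnableOption::IdModel, "all");
}

// Returns whether IdModel is enabled in the current global options.
bool idModelEnabledNow() {
  return EnableOptionsGuard::getCurOptions().has(EnableOption::IdModel);
}
```

Scoping the option to the fixture this way keeps the setting from leaking into unrelated tests, which is presumably why it is set in SetUp() rather than globally.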
This PR enables TensorIndexer for all matmul tests. The codegen diff shows quite a few tests with differences. I haven't examined everything, but it looks like the changes come from differences in how buffers are initialized. For example, in MmaTest/HopperRSStmatrix.SingleTileWithTMALoadStoreStMatrix/4, here's the diff result:

We probably should not need to do this initialization (see #5657), but for the sake of this PR, the change here should be benign. Previously, our indexing was always based on the domain returned by getMaybeAllocationDomain, whereas TensorIndexer uses the actual allocation domain, which can be the loop domain, so this difference in the generated code is expected.
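
To make the domain difference concrete, here is a toy model of the two choices. It is a sketch under stated assumptions, not nvFuser code: `ToyTensor`, `maybeAllocationDomain`, and `actualAllocationDomain` are hypothetical names, the fallback of the "maybe" accessor to the logical domain reflects my understanding of getMaybeAllocationDomain, and the rule "use the loop domain when no allocation domain is set" is a deliberate simplification of how the actual allocation is determined.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

using Domain = std::vector<std::string>;

// Toy tensor: NOT nvFuser's TensorView, just enough state to show the idea.
struct ToyTensor {
  Domain logical;                    // domain as defined by the op
  Domain loop;                       // domain after scheduling
  std::optional<Domain> allocation;  // present only when explicitly set
};

// Old-style choice (assumed): the explicit allocation domain if present,
// otherwise the logical domain.
const Domain& maybeAllocationDomain(const ToyTensor& tv) {
  return tv.allocation.has_value() ? *tv.allocation : tv.logical;
}

// Simplified stand-in for the TensorIndexer-style choice: the explicit
// allocation domain if present, otherwise the loop domain.
const Domain& actualAllocationDomain(const ToyTensor& tv) {
  return tv.allocation.has_value() ? *tv.allocation : tv.loop;
}

// Hypothetical buffer whose loop domain was split during scheduling and
// which has no explicit allocation domain.
ToyTensor makeScheduledBuffer() {
  ToyTensor tv;
  tv.logical = {"M", "N"};
  tv.loop = {"M/16", "16", "N"};
  return tv;
}
```

In this toy setup the two accessors disagree only for tensors without an explicit allocation domain, which matches the observation that the codegen differences show up in buffer initialization rather than everywhere.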