Previously, we only traversed the producers of the loop domain in `IterVisitor::traverseBetween`. That is a problem in cases where we schedule, for example, the producer of a reshape, or in exotic cases like #5345 where the domains are disconnected. This PR ensures that we traverse every ID in the TensorDomain regardless of the relations between the domains it contains. Note that it calls `TensorDomain::allIDs` when getting the "next" statements, which performs a redundant topological sort.
!test --diff
```cpp
std::vector<IterDomain*> TensorDomain::allIDs() const {
  const std::vector<const std::vector<IterDomain*>*> all_domains = allDomains();
```
I factored this out into another method so we can avoid running getExprsBetween, which can trigger an infinite recursion if we also call allIDs from IterVisitor.
```diff
-      next_stmts_.insert(
-          next_stmts_.end(), stmt->loop().begin(), stmt->loop().end());
+      for (const std::vector<IterDomain*>* dom : stmt->allDomains()) {
+        next_stmts_.insert(next_stmts_.end(), dom->begin(), dom->end());
+      }
```
The change looks good, but why so many test failures?
!test |
Looks like all the failures were due to CI infra issues over the past couple of days.
## Stacked PRs

- #5230 moe layer with nvfp4 grouped_mm
- #5345 exposing layout op at direct python binding <-- this PR
- #5198 refactor number of groups in layout op
- #5174 allow layout op in automatic scheduler

## This PR

Expose the layout op at the direct python binding. Added an nvfp4 grouped gemm python test.

Minor fixes:
1. ~Added support of allocation domain for the output of the layout op in the concretization pass, to maintain the dependency of the padded allocation domain on its logical domain.~ No longer needed; handled in #5384
2. Skipped validation for `setAllocationDomain`
3. Updated the reference implementation to match the math order in nvFuser's decomposed nvfp4 quantization.

TODO: python tests require the IdModel indexer in order to work. See issue #5200, as well as the suggested WAR in #5200 (comment)
This is split off from #5345, so I don't have a specific repro for any incorrect behavior that this fixes.

Co-authored-by: jjsjann123 <jiej@nvidia.com>