Skip to content

Shortest path between ValGroups in ValGraph#2270

Merged
naoyam merged 5 commits intomainfrom
val_graph_bfs
May 20, 2024
Merged

Shortest path between ValGroups in ValGraph#2270
naoyam merged 5 commits intomainfrom
val_graph_bfs

Conversation

@naoyam
Copy link
Collaborator

@naoyam naoyam commented May 18, 2024

(Extracted from #2238)

Traversal for finding the shortest path from ValGroups to another ValGroups. The algorithm is based on the standard BFS traversal, however, since ValGraph is not an undirected graph, the dependencies of ValGroups and ExprGroups need to be
satisfied. Specifically, when visiting an ExprGroup, either its inputs or outputs must be visited before. Similarly, when visiting
a ValGroup, there must be at least one defining ExprGroup or one use ExprGroup that is already visited.

The main use case is tensor indexing (#2238), where a typical traversal would be from loop domains to allocation domains. Some indexing-specific specialization would be needed, for example, dependencies with broadcast domains can be ignored as their index is always just zero. The indexing shortest-path traversal would be implemented by subclassing this class.

Please see the new unit tests for concrete examples.

@naoyam
Copy link
Collaborator Author

naoyam commented May 18, 2024

!build

@naoyam naoyam requested a review from jacobhinkle May 18, 2024 22:01
Copy link
Collaborator

@jacobhinkle jacobhinkle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I am still thinking about cases with loops but I don't see any problems with the algorithm as it is.

to_visit_.insert(to_visit_.end(), not_ready_.begin(), not_ready_.end());
};

if (!allToGroupsVisited()) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just want to check my understanding. If this line fires, it means we had an iteration where something was not processed. In that case, not_ready_ could be empty, such as when the graph is disconnected and we just cannot reach the target with any path. Or it might not be empty, indicating that we cannot progress to processing all the dependencies of some nodes. I am trying to understand this case. Since we might have cycles in the ValGraph as mentioned in comments, I guess we might have a dependency for some ExprGroup that is involved in a cycle. If that gets skipped due to the cycle then downstream nodes would never become ready so we would hit this I suppose.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not_ready_ is a temporary container to hold groups that are adjacent to the frontier groups but we are not yet able to visit due to missing dependencies. For example, if a Merge group only has one of the inputs, it would be in not_ready_. Those groups are connected with missing dependencies.

If not_ready_ is empty, that should mean there's no more connected group that isn't visited yet. So, some of the to_groups_ must be disconnected.

I'm not sure if this could be a problem with cycles. At least, it should be detected and result in a failure rather than just resulting in an infinite loop.

@naoyam naoyam added the idmodel label May 20, 2024
@naoyam naoyam merged commit b1ee309 into main May 20, 2024
@naoyam naoyam deleted the val_graph_bfs branch May 20, 2024 19:22
naoyam added a commit that referenced this pull request Jun 6, 2024
This is the first of a series of IdModel-based indexing PRs. The overall
algorithm is exactly what I presented at the nvfuser meeting and is
based on the shortest path traversal introduced at #2270. This PR only
includes minimum logic for basic indexing. However, pointwise and
reduction ops with TID/BID parallelization as well as reshape should
work.

Notable missing support:

- Broadcast
- Vectorize, Unswitch, Unroll
- Contiguous indexing

Note that the new indexing interface is not plugged into the lowering
yet. Currently, it's only used by the new unit tests. In subsequent PRs,
there'll be a switch to use the new indexer instead of the existing one.

Overall tracking PR: #2238
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants