Skip to content

Predicate indexing for circular buffering#2677

Merged
naoyam merged 8 commits intomainfrom
idmodel_indexing_circular_buffer_predicate
Jul 27, 2024
Merged

Predicate indexing for circular buffering#2677
naoyam merged 8 commits intomainfrom
idmodel_indexing_circular_buffer_predicate

Conversation

@naoyam
Copy link
Collaborator

@naoyam naoyam commented Jul 24, 2024

Adding support of predicate indexing with circular buffering.

Circular buffering itself doesn't need many changes, but circular buffering and unswitch/unroll is a bit more complicated. There's an existing bug as well, which is fixed here.

#2663 could simplify this PR but we probably don't want to enforce epilogue generation. This PR doesn't rely on it.

Fixes #2159

@naoyam
Copy link
Collaborator Author

naoyam commented Jul 24, 2024

!build

@naoyam naoyam added the idmodel label Jul 24, 2024
@naoyam naoyam marked this pull request as ready for review July 24, 2024 22:24
@naoyam naoyam requested a review from jacobhinkle July 24, 2024 22:25
@naoyam
Copy link
Collaborator Author

naoyam commented Jul 24, 2024

CC: @zasdfgbnm, @rdspring1

@naoyam
Copy link
Collaborator Author

naoyam commented Jul 26, 2024

!build

@naoyam
Copy link
Collaborator Author

naoyam commented Jul 26, 2024

!build

@naoyam
Copy link
Collaborator Author

naoyam commented Jul 26, 2024

Pinging @jacobhinkle

Copy link
Collaborator

@jacobhinkle jacobhinkle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Tests make sense.

Comment on lines +27 to +60
inline std::unordered_set<IdModelEnableOption> getIdModelEnabledOptions() {
std::unordered_set<IdModelEnableOption> opts;

if (hasEnableOptionArgument(EnableOption::IdModel, "consumer_index") ||
hasEnableOptionArgument(EnableOption::IdModel, "index") ||
hasEnableOptionArgument(EnableOption::IdModel, "all")) {
opts.insert(IdModelEnableOption::ConsumerIndex);
}

if (hasEnableOptionArgument(EnableOption::IdModel, "producer_index") ||
hasEnableOptionArgument(EnableOption::IdModel, "index") ||
hasEnableOptionArgument(EnableOption::IdModel, "all")) {
opts.insert(IdModelEnableOption::ProducerIndex);
}

if (hasEnableOptionArgument(EnableOption::IdModel, "inline_predicate") ||
hasEnableOptionArgument(EnableOption::IdModel, "predicate") ||
hasEnableOptionArgument(EnableOption::IdModel, "all")) {
opts.insert(IdModelEnableOption::InlinePredicate);
}

if (hasEnableOptionArgument(EnableOption::IdModel, "unswitch_predicate") ||
hasEnableOptionArgument(EnableOption::IdModel, "predicate") ||
hasEnableOptionArgument(EnableOption::IdModel, "all")) {
opts.insert(IdModelEnableOption::UnswitchPredicate);
}

return opts;
}

inline bool isIdModelOptionEnabled(IdModelEnableOption option) {
const auto opts = getIdModelEnabledOptions();
return opts.find(option) != opts.end();
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit. You may have other intended uses of this utility that would make use of getIdModelEnabledOptions() directly. If not, then you could instead replace this with a switch (option) in isIdModelOptionEnabled.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any particular plan for this except that it should be retired once everything about IdModel is enabled by default, so I'll leave it as is.

return nullptr;
}

// Epilog should not hit this part
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this specifically because there is now a different IterDomain mapped to each ForLoop that are not Loop mapped with one another and getCircularBufferAxis(tv) returns the IterDomain of the main loop only?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll expand the comment here, but that's just because predication of an expr is based on the consumer. If the control reaches here, it means the tensor is circular buffered. In the epilogue loop, the circular buffer tensor should never appear as a consumer but only as a producer. So, what this part means is that this tensor is a consumer tensor and circular buffered, so the loop stage should never be epilogue.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok. That makes sense. Thanks for the explanation.

// (number_of_stages - 1) elements ahead.
auto replace_for_circular_buffering = [&](ForLoop* fl,
Val* original_index) -> Val* {
auto circular_buffer_axis =
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: probably a good idea to check this here like in getCircularBufferLoopStage:

Suggested change
auto circular_buffer_axis =
NVF_ERROR(
GpuLower::hasCurrent(),
"Circular buffering info of GpuLower is required but GpuLower is missing");
auto circular_buffer_axis =

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I started this new indexing without assuming the existence of GpuLower, but I realized that's not reasonable, so hasCurrent() is assumed everywhere. Also, GpuLower::current() does the check too, so there should be an assertion error if the assumption is broken.

@naoyam naoyam merged commit 8f47e55 into main Jul 27, 2024
@naoyam naoyam deleted the idmodel_indexing_circular_buffer_predicate branch July 27, 2024 01:21
naoyam added a commit that referenced this pull request Jul 27, 2024
…through merge-inner or split-outer domains (#2689)

(Stacked on top of #2677)

The original issue is #681. It was addressed in #687.

This PR is NOT as comprehensive as #687, but my gut feeling is that this
should be good enough, in particular since contig indexing would avoid
backward traversals through merge in many cases.

I'll do final more comprehensive comparison with the legacy indexing
once contig indexing is done.

Since the original PR and issue were reviewed by @zasdfgbnm, could you
please review this too?
naoyam added a commit that referenced this pull request Jul 27, 2024
(Stacked on #2677)

Basically just porting what's already done with the current predication
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Invalid unswitch predicates with double buffering

2 participants