Merged
162 changes: 157 additions & 5 deletions csrc/id_model/id_model.cpp
@@ -627,10 +627,29 @@ std::unordered_map<ValGroup, IterDomain*> IdModel::buildLoopPromotionMap(
idGraph(IdMappingMode::LOOP),
inlining_info);

// At this point, most loop groups should have correct promoted
// IDs. However, non-inlined loop groups may miss promotions that
// should be propagated from parent ID groups, e.g., iS50 of T2 in
// Indexing19. Its parent ID loop group is promoted, but the promotion
// of the loop group of iS50 is not found yet.

// Step 4: In order to fully propagate the loop graph promotions, first
// propagate them to the IEL groups, which are then used to
// propagate back to the loop groups in Step 5. Unlike Step 2, the
// initial IEL promotion map is empty and is populated with the loop
// promotion map as we traverse down the IEL graph.
std::unordered_map<ValGroup, IterDomain*> final_iel_promotion_map;
propagatePromotionsInIELGraph(
iel_graph,
final_iel_promotion_map,
idGraph(IdMappingMode::LOOP),
loop_promotion_map,
true);
Comment on lines +641 to +647
Collaborator

Does it make sense to do the following instead?:

// Overwrite each IEL promotion with the promotion of its loop group, if any.
for (auto& entry : iel_promotion_map) {
  const ValGroup& iel_group = entry.first;
  const ValGroup& loop_group = idGraph(IdMappingMode::LOOP).toGroup(iel_group->front());
  auto it = loop_promotion_map.find(loop_group);
  if (it != loop_promotion_map.end()) {
    entry.second = it->second;
  }
}
propagatePromotionsInIELGraph(iel_graph, iel_promotion_map, /*require_loop_mapped_promotion=*/true);

My biggest problem with IdModel is that it is so complicated that I cannot fit it in my head. IIUC, the code above is equivalent to the current approach, but the mental model would be easier.

Collaborator Author

This is actually related to #2003 (comment).

Using the Step 3 results requires considering the condition checked by hasUniqueInputLoopGroups, so the suggested code would result in double propagation. We could avoid that by selectively updating iel_promotion_map, but that would mean looking at both inputs and outputs and, in some cases, propagating loop_promotion_map only to outputs. I'd say that would be almost as complicated as the current version.

Collaborator

I see, thanks for the explanation. Could you open an issue for #2003 (comment)?

Collaborator Author

> Could you open an issue for #2003 (comment)?

What issue are you referring to? The broadcast forwarding?

Collaborator

Yes

Collaborator Author

Added: #2030
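
The double-propagation concern in this thread can be illustrated with a toy model. The sketch below is not nvFuser code: it assumes loop groups are plain integer ids, and `hasUniqueInputLoopGroupsSketch` is a hypothetical stand-in that only mirrors the set subtraction performed by `hasUniqueInputLoopGroups` in this PR.

```cpp
#include <set>
#include <vector>

// Hypothetical stand-in: a loop group is identified by an integer id.
using LoopGroupId = int;

// Returns true when at least one input loop group is absent from the
// set of output loop groups. Only in that case would the loop graph
// promotion be applied to the inputs of the expression.
bool hasUniqueInputLoopGroupsSketch(
    const std::vector<LoopGroupId>& input_loop_groups,
    const std::vector<LoopGroupId>& output_loop_groups) {
  const std::set<LoopGroupId> outputs(
      output_loop_groups.begin(), output_loop_groups.end());
  for (LoopGroupId input : input_loop_groups) {
    if (outputs.count(input) == 0) {
      return true;
    }
  }
  return false;
}
```

In the merge(i0, i1) example from the code comment in this diff, both inputs and the output belong to the same loop group, so a call such as `hasUniqueInputLoopGroupsSketch({0, 0}, {0})` returns false and the loop promotion is not applied a second time.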


// This is not the right map to return but just a placeholder since
// the loop promotion map is not yet completely merged. It will be
// replaced by a proper map.
- return loop_promotion_map;
+ return final_iel_promotion_map;
}

std::unordered_map<ValGroup, IterDomain*> IdModel::buildInlineRootResolutionMap(
@@ -867,7 +886,9 @@ namespace {
Expr* findMatchingExpr(
const ExprGroup& iel_expr,
const ValGraph& iel_graph,
- const std::vector<IterDomain*>& maybe_promoted_inputs) {
+ const std::vector<IterDomain*>& maybe_promoted_inputs,
+ bool require_loop_mapped_promotion,
+ const ValGraph& loop_graph) {
// If any of the domains in maybe_promoted_inputs is not found in
// iel_graph, it means the domain is just replayed and by definition
// has no mapping with any existing domain, which means there's no
@@ -925,17 +946,96 @@ Expr* findMatchingExpr(
continue;
}

// For the final loop promotion map, we want to find
// promotions within the same loop groups. Note that that's
// guaranteed when a new domain is replayed instead of reusing an
// existing domain.
if (require_loop_mapped_promotion) {
if (!loop_graph.disjointExprSets().permissiveAreMapped(
iel_expr->front(), maybe_promoted_input_use_group->front())) {
continue;
}
// This is just an extra sanity check. Make sure all exprs in
// the use group are mapped
NVF_ERROR(
std::all_of(
maybe_promoted_input_use_group->vector().begin(),
maybe_promoted_input_use_group->vector().end(),
[&](Expr* iel_use) {
return loop_graph.disjointExprSets().permissiveAreMapped(
iel_expr->front(), iel_use);
}),
"Not all mapped: ",
nvfuser::toString(iel_expr),
"\n",
nvfuser::toString(maybe_promoted_input_use_group));
}

return maybe_promoted_input_use;
}

return nullptr;
}

// When propagating loop promotions from inputs to outputs of an IEL
// expr, we can't blindly apply loop promotion when all of the input
// domains are loop mapped with the outputs.
//
// i.e. if we have the inlined domains from:
// Inputs:
// T0[i0]
// T1[i0, i1]
//
// T2[i0, b2] = broadcast(T0)
// T3[i0, i1] = T2 + T1
//
// {T1, T2, T3}->merge(0, 1)
// inlineMost
//
// The inlined loop group would consist of:
//
// {i0, i1, b2, i0*b2, i0*i1}
//
// Note that all these domains would have promotion to i0*i1 at the
// end of Step 3. When the IEL expression of merge(i0, i1) is visited by
// propagatePromotionsInIELGraph again, the promotion to i0*i1 of both
// inputs would be propagated to its output, resulting in promotion of
// i0*i1 to (i0*i1)*(i0*i1), which is not the correct propagation.
//
// Therefore only promote i0*b2 to i0*i1, or i0*i1 to i0*i1 (i.e. don't
// promote an input to any transformation within the loop group).
//
// So if we have an iel_expr make sure its inputs and outputs are not in
// the same loop group.
bool hasUniqueInputLoopGroups(
const ExprGroup& iel_expr,
const ValGraph& iel_graph,
const ValGraph& loop_graph) {
const std::vector<ValGroup> iel_inp_groups = iel_graph.inputGroups(iel_expr);

const std::vector<ValGroup> iel_out_groups = iel_graph.outputGroups(iel_expr);

ValGroups inp_loop_groups;
for (const ValGroup& iel_inp_group : iel_inp_groups) {
inp_loop_groups.pushBack(loop_graph.toGroup(iel_inp_group->front()));
}
ValGroups out_loop_groups;
for (const ValGroup& iel_out_group : iel_out_groups) {
out_loop_groups.pushBack(loop_graph.toGroup(iel_out_group->front()));
}

// Check if there are input groups that are not included in the output group set
return !inp_loop_groups.computeSubtract(out_loop_groups).empty();
}

} // namespace

void IdModel::propagatePromotionsInIELGraph(
const ValGraph& iel_graph,
- std::unordered_map<ValGroup, IterDomain*>& iel_promotion_map) {
+ std::unordered_map<ValGroup, IterDomain*>& iel_promotion_map,
+ const ValGraph& loop_graph,
+ const std::unordered_map<ValGroup, IterDomain*>& loop_graph_promotion_map,
+ bool require_loop_mapped_promotion) {
// In order to make this traversal work, the traversal order must be
// topologically sorted.
ValGraphStmtSort iel_stmt_sort(iel_graph);
@@ -951,6 +1051,11 @@ void IdModel::propagatePromotionsInIELGraph(
std::vector<IterDomain*> maybe_promoted_inputs;
maybe_promoted_inputs.reserve(iel_inp_groups.size());

// Propagate loop graph promotion only when the inputs and outputs are
// not in the same loop group.
const bool loop_promote_inputs = !loop_graph_promotion_map.empty() &&
hasUniqueInputLoopGroups(iel_expr, iel_graph, loop_graph);

for (const ValGroup& iel_inp_group : iel_inp_groups) {
// Assumed all inputs are IterDomains
NVF_ERROR(iel_inp_group->front()->isA<IterDomain>());
@@ -963,6 +1068,19 @@ void IdModel::propagatePromotionsInIELGraph(
continue;
}

// Promote loops based on the loop promotion map. If the loop promotion
// map should be used and has an entry we should use that promotion.
if (loop_promote_inputs) {
const ValGroup& loop_copy_group =
loop_graph.toGroup(iel_inp_group->front());
auto inp_loop_promo_it = loop_graph_promotion_map.find(loop_copy_group);
if (inp_loop_promo_it != loop_graph_promotion_map.end()) {
maybe_promoted_inputs.push_back(inp_loop_promo_it->second);
an_input_was_promoted = true;
continue;
}
}

// No promotion found. Just use the non-promoted domain
maybe_promoted_inputs.push_back(iel_inp_group->front()->as<IterDomain>());
}
@@ -972,8 +1090,12 @@ void IdModel::propagatePromotionsInIELGraph(
continue;
}

- Expr* promoted_expr =
- findMatchingExpr(iel_expr, iel_graph, maybe_promoted_inputs);
+ Expr* promoted_expr = findMatchingExpr(
+ iel_expr,
+ iel_graph,
+ maybe_promoted_inputs,
+ require_loop_mapped_promotion,
+ idGraph(IdMappingMode::LOOP));

bool replayed = false;

@@ -1011,6 +1133,13 @@ void IdModel::propagatePromotionsInIELGraph(
}
}

void IdModel::propagatePromotionsInIELGraph(
const ValGraph& iel_graph,
std::unordered_map<ValGroup, IterDomain*>& iel_promotion_map) {
propagatePromotionsInIELGraph(
iel_graph, iel_promotion_map, idGraph(IdMappingMode::LOOP), {}, false);
}

// Replay Expr but with the inputs provided.
Expr* IdModel::addReplayAs(std::vector<IterDomain*> new_inputs, Expr* expr) {
// Figure out which graphs are already initialized to make sure we add the new
@@ -1332,4 +1461,27 @@ VectorOfUniqueEntries<IterDomain*> IdModel::computeTerminalLoopIds(
return terminal_loop_ids;
}

std::unordered_map<ValGroup, IterDomain*> updateValGroupIdMap(
const std::unordered_map<ValGroup, IterDomain*>& stale_map,
ValGraph& new_graph) {
std::unordered_map<ValGroup, IterDomain*> new_map;

for (const auto& [stale_group, mapped_id] : stale_map) {
const ValGroups& new_groups = new_graph.toGroups(*stale_group);
NVF_ERROR(
new_groups.size() == 1,
"\nUpdate map assumes that new graph is equivalent to old graph plus extra mappings.\n",
"i.e. all mappings in new_graph should exist in the graph stale_map was produced on.\n",
"old:",
nvfuser::toString(stale_group),
"new: ",
nvfuser::toString(new_groups));
NVF_ERROR(
new_map.emplace(new_groups.front(), mapped_id).second,
"Expected only a single mapping but multiple entries detected for ",
nvfuser::toString(new_groups.front()));
}
return new_map;
}

} // namespace nvfuser
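
The contract of `updateValGroupIdMap` (every stale group must land in exactly one new group, and no two stale groups may collapse onto the same new group) can be sketched with a simplified model. Everything below is an assumption made for illustration: `Group`, `Graph`, and `updateGroupMapSketch` are hypothetical stand-ins rather than nvFuser types, and exceptions replace `NVF_ERROR`.

```cpp
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-ins for ValGroup and ValGraph.
using Group = std::vector<std::string>;    // members of one stale group
using Graph = std::map<std::string, int>;  // member name -> new group id

// Re-keys a promotion map from stale groups to the groups of new_graph.
std::map<int, std::string> updateGroupMapSketch(
    const std::map<Group, std::string>& stale_map,
    const Graph& new_graph) {
  std::map<int, std::string> new_map;
  for (const auto& entry : stale_map) {
    // Collect the new groups that the members of the stale group map to.
    std::set<int> new_groups;
    for (const std::string& member : entry.first) {
      new_groups.insert(new_graph.at(member));
    }
    // The new graph may only add mappings, so a stale group must not split.
    if (new_groups.size() != 1) {
      throw std::logic_error("stale group maps to multiple new groups");
    }
    // Two stale groups merging into one new group would be ambiguous.
    if (!new_map.emplace(*new_groups.begin(), entry.second).second) {
      throw std::logic_error("multiple entries for one new group");
    }
  }
  return new_map;
}
```

As in the function in this diff, both failure modes are treated as hard errors rather than being resolved silently, since either one means the stale map cannot be translated unambiguously.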
39 changes: 37 additions & 2 deletions csrc/id_model/id_model.h
@@ -169,7 +169,7 @@ class IdModel : public PolymorphicBase {
// fusion.
void buildIterDomainDefinitionsAndUses();

- /// Start loop map by grouping inlined iter domains
+ // Start loop map by grouping inlined iter domains
void initializeLoopGraph(const StatefulInliningInfo& info);

// Build a map of loop groups to IterDomains that represent actual
@@ -192,7 +192,35 @@ class IdModel : public PolymorphicBase {
// input is promoted, the output needs to be promoted too. If
// there's already an equivalent expr that uses the promoted inputs,
// create a mapping from the outputs of the IEL expr to the outputs
- // of the equivalent expr.
+ // of the equivalent expr. When require_loop_mapped_promotion is
// true, the equivalent expr needs to be already loop mapped. If no
// such expr is found, the IEL expr is replayed with the promoted
// inputs. require_loop_mapped_promotion is true when this function
// is used for step 4.
//
// This is used twice when building the promotion map. The first time
// it is used there's no loop graph promotion yet, so only the IEL
// promotions are propagated. In that case, loop_graph_promotion_map
// should be just empty.
//
// Propagation uses iel_promotion_map and
// loop_graph_promotion_map. If both are available for an IEL group,
// the former has the precedence. This is because when this function
// is used for step 4, the given iel_promotion_map starts as an
// empty map and gets populated during this propagation, so any
// mapping in the map is guaranteed to be the correct final mapping,
// whereas the loop graph may have invalid mappings for partially
// inlined domains.
void propagatePromotionsInIELGraph(
const ValGraph& iel_graph,
std::unordered_map<ValGroup, IterDomain*>& iel_promotion_map,
const ValGraph& loop_graph,
const std::unordered_map<ValGroup, IterDomain*>& loop_promotion_map,
bool require_loop_mapped_promotion);

// Same as the other propagatePromotionsInIELGraph but without loop
// graph map. This is used for step 2, where there's no loop
// graph map yet.
void propagatePromotionsInIELGraph(
const ValGraph& iel_graph,
std::unordered_map<ValGroup, IterDomain*>& iel_promotion_map);
@@ -281,4 +309,11 @@ class IdModel : public PolymorphicBase {
std::unordered_map<ValGroup, IterDomain*> loop_promotion_map_;
};

// A utility function to update a map of ValGroups to IDs from an old
// ValGraph to a new ValGraph. The new graph must be a superset of the
// old graph.
std::unordered_map<ValGroup, IterDomain*> updateValGroupIdMap(
const std::unordered_map<ValGroup, IterDomain*>& stale_map,
ValGraph& new_graph);

} // namespace nvfuser
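
The precedence rule described in the header comment (an entry in iel_promotion_map wins over the loop graph promotion map) can be sketched as follows. This is a simplified model under stated assumptions, not the nvFuser implementation: groups are integers, promoted IDs are strings, and `findPromotionSketch` is a hypothetical helper.

```cpp
#include <map>
#include <string>

// Hypothetical stand-ins: groups are ints, promoted IDs are strings.
using PromotionMap = std::map<int, std::string>;

// Returns the promotion to use for an IEL input group, or an empty
// string when the non-promoted domain should be used as is.
std::string findPromotionSketch(
    int iel_group,
    int loop_group,
    const PromotionMap& iel_promotion_map,
    const PromotionMap& loop_promotion_map,
    bool loop_promote_inputs) {
  // The IEL promotion map is consulted first and has precedence.
  auto iel_it = iel_promotion_map.find(iel_group);
  if (iel_it != iel_promotion_map.end()) {
    return iel_it->second;
  }
  // Fall back to the loop promotion only when it is allowed for this
  // expression, i.e. inputs and outputs are not in the same loop group.
  if (loop_promote_inputs) {
    auto loop_it = loop_promotion_map.find(loop_group);
    if (loop_it != loop_promotion_map.end()) {
      return loop_it->second;
    }
  }
  return "";
}
```

Checking the IEL map first matches the rationale given in the comment: entries added during the step 4 traversal are final, whereas the loop graph map may still hold invalid mappings for partially inlined domains.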