Skip to content

Implement substitutions#1011

Merged
lockshaw merged 47 commits intoflexflow:repo-refactorfrom
wmdi:substitutions
Sep 19, 2023
Merged

Implement substitutions#1011
lockshaw merged 47 commits intoflexflow:repo-refactorfrom
wmdi:substitutions

Conversation

@wmdi
Copy link
Copy Markdown
Collaborator

@wmdi wmdi commented Aug 21, 2023

Description of changes:

  • Refactor the old namespaces
  • add OutputGraph and the related DSL to represent substitutions
  • add functions to apply substitutions
  • Refactor old codes to get tensor attributes from outputs instead of edges

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:


This change is Reviewable

@wmdi wmdi requested a review from lockshaw August 21, 2023 20:21
Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r12, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @lockshaw)

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 11 files at r13, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @lockshaw)

@wmdi wmdi linked an issue Sep 7, 2023 that may be closed by this pull request
Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 10 files at r14, 7 of 7 files at r15, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @lockshaw)

Copy link
Copy Markdown
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @wmdi)


lib/compiler/test/test_generator.h line 55 at r13 (raw file):

struct Arbitrary<ComputationGraph> {
  static Gen<ComputationGraph> arbitrary() {
    return gen::map(gen::cast<MultiDiGraphView>(serialParallelMultiDiGraph()),

Why this change? What does test_computataion (FYI typo) do?


lib/substitutions/CMakeLists.txt line 13 at r14 (raw file):

    utils
    op-attrs
    pcg

Why?


lib/substitutions/README.md line 5 at r13 (raw file):

## `Substitution`

A substitution is to replace a subgraph of the PCG by a new one. We refer to the subgraph to be replaced as the input graph, and the new subgraph to replace the input graph as the output graph.

It would help to explain not just what the various field of a Substitution are (which can usually be adequately captured by inline doxygen comments), but to give some insight/intuition into why the particular design was chosen (see the visitable README for an example)

The below suggestion adds a sample fix for the beginning of this README.

It's probably best to do this as a separate PR. I've created an issue here: #1121

Suggestion:

A _substitution_ takes a PCG as input and rewrites a subgraph of the given PCG in a way that maintains the correctness of the PCG. 
Equivalently, we can view a substitution as taking a subgraph of a PCG and replacing it with a different graph.
We refer to the subgraph that is replaced as the _input graph_, and the graph that replaces input graph we refer to as the _output graph_.

This intuitively defines what a substitution is, but does not tell us how to represent a substitution within FlexFlow. 
If we naively represent a substitution with a pair of graphs, we run into a few problems:

1. What happens to the edges between nodes in the input graph and nodes in the rest of the PCG? Since every edge in a graph ends in a node, the output graph would be completely isolated from the rest of the PCG.
2. If we represent each input and output graph concretely, we may end up with an enormous or even potentially infinite number of substitutions. To provide some intuition, consider a substitution that takes a node with integer value `a` and a node with integer value `b` and returns a node with integer value `c = a + b`. To represent this substitution for all input graphs would require enumerating every possible value of `a` and `b`, which is essentially infinite.

To solve (1) we instead represent both the input and output graphs as _open graphs_ (see [graph](lib/utils/include/graph/README.md) for more details).
To solve (2) we replace the input open graph with a _graph pattern_ which is similar to a regular expression in that it is capable of matching a potentially infinite number of input graphs, and we replace the output graph with a _graph expression_, which is essentially a function from the nodes matched by the input graph pattern to the resulting output graph.

lib/substitutions/README.md line 6 at r13 (raw file):

A substitution is to replace a subgraph of the PCG by a new one. We refer to the subgraph to be replaced as the input graph, and the new subgraph to replace the input graph as the output graph.

A basic example should be added here (not an example involving operators, but just some arbitrary graph with attributes). That would help clarify things as examples can then be given for each of the different pieces of Substitution (also made part of #1121)


lib/substitutions/include/substitutions/graph_pattern_match.h line 11 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Yes, subgraphs of PCGs are open. However, edges in PCGs are not. We should match the edges in the pattern graph with the actual edges in the PCG.

We would also need the information of the original edges in the PCG when applying the substitution. For example, we have to know the source node of an InputMultiDiEdge when replacing edges.

I'm not sure I understand the argument. If I split a PCG into multiple pieces (as Unity's splitting heuristic does), why shouldn't I be allowed to match a substitution against the source node in in each piece? The substitution will almost certainly involve an input edge, but if we treat the subgraph as closed then that input edge will never match


lib/substitutions/include/substitutions/output_graph.h line 14 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

I think it makes more sense to check it as part of Substitution. For each generated substitution, we examine whether the preconditions of successful node/edge access are included in node/edge constraints. We may need to add a constraint type to enable checking existence (or we actually don't need it since we always have constraints on operator types? not sure).

Sounds good. I do think we'll need an existence check as there's no obvious reason I can see to mandate having an operator type constraint for every node


lib/substitutions/include/substitutions/output_graph.h line 41 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

It seems that tensor attribute calculation is not necessary in the output graph?

Another problem is do we need tensor information for matching, i.e., are there any constraints we may have on the tensor attributes in any substitution?

I agree with that overall loop, though I think it makes more sense to view the loop as follows:

  • Apply a substitution to the PCG (which consists of)
    • Match input pattern
    • Replace with output expr
    • Perform shape inference
  • Evaluate the PCG cost
  • Update the search states and explore more

I think we want to maintain the invariant that PCGs always satisfy shape inference, so a substitution should take in a valid PCG and return a valid PCG (thus requiring shape inference to be done logically as part of the substitution application)


lib/substitutions/include/substitutions/parallel_tensor_pattern.h line 27 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

I am planning to remove ParallelTensorPattern as it is sufficient to check operators.

Correct me if I'm wrong but I believe we decided that having ParallelTensorPattern was necessary to avoid applying invalid substitutions

Copy link
Copy Markdown
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @wmdi)

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 55 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why this change? What does test_computataion (FYI typo) do?

Done.


lib/substitutions/include/substitutions/output_graph.h line 14 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Sounds good. I do think we'll need an existence check as there's no obvious reason I can see to mandate having an operator type constraint for every node

It is done by is_valid_substitution.


lib/substitutions/include/substitutions/output_graph.h line 41 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I agree with that overall loop, though I think it makes more sense to view the loop as follows:

  • Apply a substitution to the PCG (which consists of)
    • Match input pattern
    • Replace with output expr
    • Perform shape inference
  • Evaluate the PCG cost
  • Update the search states and explore more

I think we want to maintain the invariant that PCGs always satisfy shape inference, so a substitution should take in a valid PCG and return a valid PCG (thus requiring shape inference to be done logically as part of the substitution application)

Done.


lib/substitutions/include/substitutions/parallel_tensor_pattern.h line 27 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Correct me if I'm wrong but I believe we decided that having ParallelTensorPattern was necessary to avoid applying invalid substitutions

Done.

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw)


lib/substitutions/CMakeLists.txt line 13 at r14 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why?

What change I made for this line?

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw)


lib/substitutions/README.md line 5 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

It would help to explain not just what the various field of a Substitution are (which can usually be adequately captured by inline doxygen comments), but to give some insight/intuition into why the particular design was chosen (see the visitable README for an example)

The below suggestion adds a sample fix for the beginning of this README.

It's probably best to do this as a separate PR. I've created an issue here: #1121

Let do this in a separate PR.


lib/substitutions/README.md line 6 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

A basic example should be added here (not an example involving operators, but just some arbitrary graph with attributes). That would help clarify things as examples can then be given for each of the different pieces of Substitution (also made part of #1121)

Done.


lib/substitutions/include/substitutions/graph_pattern_match.h line 11 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I'm not sure I understand the argument. If I split a PCG into multiple pieces (as Unity's splitting heuristic does), why shouldn't I be allowed to match a substitution against the source node in in each piece? The substitution will almost certainly involve an input edge, but if we treat the subgraph as closed then that input edge will never match

I think the key argument there is we apply matching to a PCG (not a subgraph of a PCG) and a graph pattern, which is defined by finding a subgraph of the PCG homomorphic to the graph pattern. The reason of this design decision is a match does not only provide information for matching but also replacement. In replacement, the surroundings of the subgraph matter.

We can split a standard edge into input/output edges when we try to match it with an input/output edge.

Copy link
Copy Markdown
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @wmdi)


lib/compiler/test/test_generator.h line 55 at r13 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Done.

Why was the cast to MultiDiGraphView added? Shouldn't a MultiDiGraph already be coercible to a MultiDiGraphView?


lib/substitutions/CMakeLists.txt line 13 at r14 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

What change I made for this line?

The additional PCG dependency, but it makes sense that you'd need that dependency


lib/substitutions/include/substitutions/graph_pattern_match.h line 11 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

I think the key argument there is we apply matching to a PCG (not a subgraph of a PCG) and a graph pattern, which is defined by finding a subgraph of the PCG homomorphic to the graph pattern. The reason of this design decision is a match does not only provide information for matching but also replacement. In replacement, the surroundings of the subgraph matter.

We can split a standard edge into input/output edges when we try to match it with an input/output edge.

But we need to be able to apply matching to a subgraph of a PCG. Let's consider the following example PCG:

digraph G {
    a0 [label="Input"]
    a1 [label="Op B"];
    a2 [label="Op C"];
    a3 [label="Op B"];
    a4 [label="Op A"];
    a5 [label="Op A"];
    a6 [label="Op A"];
    a7 [label="Op D"];
    a8 [label="Op E"];
    a9 [label="Op F"];
    a10 [label="Op C"];
    
    a0 -> a1;
    a1 -> a2;
    a2 -> a3;
    a3 -> a4;
    a4 -> a5;
    a5 -> a6;
    a6 -> a7;
    a7 -> a8;
    a7 -> a9;
    a8 -> a10;
    a9 -> a10;
}

for the sake of argument, we will assume that Unity's heuristic graph splitting performs cuts at the edge from "Op B" -> "Op A" and from "Op A" -> "Op D", resulting in the following PCG subgraph:

digraph G {
   in [label="", color="white"];
   out [label="", color="white"];
   
   a4 [label="Op A"];
   a5 [label="Op A"];
   a6 [label="Op A"];
   
   in -> a4;
   a4 -> a5;
   a5 -> a6;
   a6 -> out;
}

Now, Unity will optimize the PCG subgraph by applying substitutions. Let's assume that the following substitution is in our substitution set:

digraph G {
    subgraph cluster_2 {
        in [label="", color="white"];
    }
    
    subgraph cluster_3 {
        out [label="", color="white"];
    }
    
    subgraph cluster_0 { 
        label="input pattern";
        a1 [label="Op ?= A"];
        a2 [label="Op ?= A"];
   
        in -> a1;
        a1 -> a2;
        a2 -> out;
    }
   
    subgraph cluster_1 {
        label="output expr"
        b [label="Op B"];
        
        in -> b;
        b -> out;
    }
}

If we allow matching on open graphs, it is possible to apply this substitution at two possible locations in the PCG subgraph that we are optimizing, matching either the first two "Op A" nodes or the second two "Op A" nodes. However, if we represent the PCG subgraph as a closed graph, we get

   a4 [label="Op A"];
   a5 [label="Op A"];
   a6 [label="Op A"];
   
   a4 -> a5;
   a5 -> a6;
}

Now if we try to apply the substitution to {a4, a5}, then the substitution will fail to apply as it cannot match the pattern's input edge. Similarly, if we try to apply it to {a5, a6}, the substitution will fail as it cannot match the pattern's output edge.

Thoughts?

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 3 unresolved discussions (waiting on @lockshaw)


lib/substitutions/CMakeLists.txt line 13 at r14 (raw file):

Previously, lockshaw (Colin Unger) wrote…

The additional PCG dependency, but it makes sense that you'd need that dependency

Done.

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 40 of 45 files reviewed, 2 unresolved discussions (waiting on @lockshaw)


lib/substitutions/include/substitutions/graph_pattern_match.h line 11 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

But we need to be able to apply matching to a subgraph of a PCG. Let's consider the following example PCG:

digraph G {
    a0 [label="Input"]
    a1 [label="Op B"];
    a2 [label="Op C"];
    a3 [label="Op B"];
    a4 [label="Op A"];
    a5 [label="Op A"];
    a6 [label="Op A"];
    a7 [label="Op D"];
    a8 [label="Op E"];
    a9 [label="Op F"];
    a10 [label="Op C"];
    
    a0 -> a1;
    a1 -> a2;
    a2 -> a3;
    a3 -> a4;
    a4 -> a5;
    a5 -> a6;
    a6 -> a7;
    a7 -> a8;
    a7 -> a9;
    a8 -> a10;
    a9 -> a10;
}

for the sake of argument, we will assume that Unity's heuristic graph splitting performs cuts at the edge from "Op B" -> "Op A" and from "Op A" -> "Op D", resulting in the following PCG subgraph:

digraph G {
   in [label="", color="white"];
   out [label="", color="white"];
   
   a4 [label="Op A"];
   a5 [label="Op A"];
   a6 [label="Op A"];
   
   in -> a4;
   a4 -> a5;
   a5 -> a6;
   a6 -> out;
}

Now, Unity will optimize the PCG subgraph by applying substitutions. Let's assume that the following substitution is in our substitution set:

digraph G {
    subgraph cluster_2 {
        in [label="", color="white"];
    }
    
    subgraph cluster_3 {
        out [label="", color="white"];
    }
    
    subgraph cluster_0 { 
        label="input pattern";
        a1 [label="Op ?= A"];
        a2 [label="Op ?= A"];
   
        in -> a1;
        a1 -> a2;
        a2 -> out;
    }
   
    subgraph cluster_1 {
        label="output expr"
        b [label="Op B"];
        
        in -> b;
        b -> out;
    }
}

If we allow matching on open graphs, it is possible to apply this substitution at two possible locations in the PCG subgraph that we are optimizing, matching either the first two "Op A" nodes or the second two "Op A" nodes. However, if we represent the PCG subgraph as a closed graph, we get

   a4 [label="Op A"];
   a5 [label="Op A"];
   a6 [label="Op A"];
   
   a4 -> a5;
   a5 -> a6;
}

Now if we try to apply the substitution to {a4, a5}, then the substitution will fail to apply as it cannot match the pattern's input edge. Similarly, if we try to apply it to {a5, a6}, the substitution will fail as it cannot match the pattern's output edge.

Thoughts?

Done.

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 40 of 45 files reviewed, 2 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 55 at r13 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why was the cast to MultiDiGraphView added? Shouldn't a MultiDiGraph already be coercible to a MultiDiGraphView?

Done.

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 5 files at r16, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @lockshaw)

Copy link
Copy Markdown
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 16 of 16 files at r17, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @lockshaw)

Copy link
Copy Markdown
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @wmdi)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Finish testing substitutions Finish implementing substitutions

3 participants