Re-enable substitutions#1471
Conversation
wmdi
left a comment
There was a problem hiding this comment.
Reviewed 285 of 285 files at r1, all commit messages.
Reviewable status: all files reviewed, 5 unresolved discussions (waiting on @lockshaw and @Marsella8)
lib/op-attrs/include/op-attrs/ops/input.h line 14 at r1 (raw file):
TensorShape get_output_shape(InputAttrs const &); ParallelTensorShape get_output_parallel_tensor_shape(InputAttrs const &);
Why get_output_parallel_tensor_shape for some operators while get_output_shape for other operators? What's the difference between them?
lib/utils/include/utils/graph/open_dataflow_graph/open_dataflow_edge.h line 12 at r1 (raw file):
int get_open_dataflow_edge_dst_idx(OpenDataflowEdge const &); DataflowInput get_open_dataflow_edge_dst(OpenDataflowEdge const &); OpenDataflowValue get_open_dataflow_edge_source(OpenDataflowEdge const &);
Suggestion:
OpenDataflowValue get_open_dataflow_edge_src(OpenDataflowEdge const &);lib/utils/include/utils/graph/open_dataflow_graph/algorithms/are_isomorphic.h line 8 at r1 (raw file):
namespace FlexFlow { bool are_isomorphic(OpenDataflowGraphView const &,
How about having are_isomorphic for labelled dataflow graphs as well?
lib/utils/include/utils/graph/open_dataflow_graph/algorithms/get_inputs.h line 10 at r1 (raw file):
std::unordered_set<DataflowGraphInput> get_inputs(OpenDataflowGraphView const &);
Suggestion:
get_open_dataflow_graph_inputs(OpenDataflowGraphView const &);lib/utils/include/utils/graph/open_dataflow_graph/algorithms/open_dataflow_graph_data.struct.toml line 2 at r1 (raw file):
namespace = "FlexFlow" name = "OpenDataflowGraphData"
What is the definition of OpenDataflowGraphData?
Marsella8
left a comment
There was a problem hiding this comment.
Reviewed 285 of 285 files at r1, all commit messages.
Reviewable status: all files reviewed, 22 unresolved discussions (waiting on @lockshaw)
lib/op-attrs/include/op-attrs/dim_ordered/enumerate.h line 14 at r1 (raw file):
for (int raw_ff_dim : count(ff_ordered.size())) { ff_dim_t ff_dim = ff_dim_t{raw_ff_dim}; result.equate({ff_dim, ff_ordered.at(ff_dim)});
possible issues if FFOrdered has duplicate elements? Maybe better ro return std::pairs, so structured binding can also be used.
lib/op-attrs/include/op-attrs/ops/concat.h line 1 at r1 (raw file):
#ifndef _FLEXFLOW_CONCAT_ATTRS_H
_FLEXFLOW_OP_ATTRS_INCLUDE_OP_ATTRS_OPS_CONCAT_H ? And same for the other ops
lib/pcg/src/pcg/parallel_computation_graph/parallel_computation_graph_builder.cc line 430 at r1 (raw file):
ElementUnaryAttrs attrs = ElementUnaryAttrs{ OperatorType::TANH,
should be ELU (or more likely it was accidentally pasted?)
lib/substitutions/include/substitutions/pcg_pattern.h line 32 at r1 (raw file):
bool assignment_satisfies(SubParallelComputationGraph const &, PCGPattern const &, UnlabelledDataflowGraphPatternMatch const &);
PCGPatternMatch (?)
Code quote:
UnlabelledDataflowGraphPatternMatchlib/substitutions/include/substitutions/pcg_pattern_match.struct.toml line 15 at r1 (raw file):
"pcg/parallel_computation_graph/parallel_layer_guid_t.dtg.h", "substitutions/open_parallel_tensor_guid_t.dtg.h", "<unordered_map>",
"<unordered_map>` possibly superfluous because of bidict?
lib/substitutions/include/substitutions/pcg_pattern_match.struct.toml line 21 at r1 (raw file):
"utils/fmt/unordered_map.h", "utils/hash/unordered_map.h", ]
the 2 src_includes should probably be put directly into bidict.h (had the same issue in one of my branches).
lib/substitutions/src/substitutions/substitution.cc line 41 at r1 (raw file):
std::unordered_set<parallel_layer_guid_t> pre_nodes = keys(pre_data.node_data);
as an aside, what is the rationale between having keys returns a set and values return a vector? Feels like it should stay consistent between the 2.
lib/substitutions/src/substitutions/substitution.cc line 137 at r1 (raw file):
std::unordered_map<open_parallel_tensor_guid_t, ParallelTensorAttrs> post_value_data = [&] {
this part is a bit hard to parse (it's a lambda which is immediatelly called right)? Maybe move out into separate function / make it more sequential (just a suggestion)
lib/substitutions/src/substitutions/output_graph/materialize_operator_from_attrs_map.cc line 65 at r1 (raw file):
case OperatorType::LINEAR: return PCGOperatorAttrs{LinearAttrs{ acc.get<int>(OperatorAttributeKey::OUT_CHANNELS),
add /*...*/ for consistency.
lib/utils/include/utils/containers/filtrans.h line 26 at r1 (raw file):
typename In, typename Out = unwrap_optional_t<std::invoke_result_t<F, In>>> std::vector<Out> filtrans(std::vector<In> const &v, F f) {
better name? filtrans is a bit obscure. Also note that we have without_nullopts, so we could sub this with without_nullops(transform(...)) (?)
lib/utils/include/utils/graph/labelled_open_dataflow_graph/algorithms/find_isomorphism.h line 13 at r1 (raw file):
namespace FlexFlow {
small docstring to specify a couple of subtleties (e.g. if there are multiple isomorphism, any isomorphism is returned).
lib/utils/src/utils/graph/dataflow_graph/algorithms.cc line 14 at r1 (raw file):
} std::vector<DataflowOutput> get_inputs(DataflowGraphView const &g,
get_dataflow_outputs ?
Code quote:
get_inputslib/utils/src/utils/graph/dataflow_graph/algorithms.cc line 16 at r1 (raw file):
std::vector<DataflowOutput> get_inputs(DataflowGraphView const &g, Node const &n) { return transform(get_incoming_edges(g, n),
Also maybe get_outgoing_edges instead of get_incoming_edges? (not sure about the logic, just a guess). Also for get_dataflow_inputs which returns e.dst, double check the logic
lib/utils/src/utils/graph/open_dataflow_graph/open_dataflow_value.cc line 7 at r1 (raw file):
std::optional<DataflowOutput> try_get_dataflow_output(OpenDataflowValue const &v) {
try_get_dataflow_graph_output (or try_get_dataflow_graph_input -> try_get_dataflow_input)
Suggestion:
try_get_dataflow_outputlib/utils/src/utils/graph/open_dataflow_graph/algorithms/get_source_nodes.cc line 10 at r1 (raw file):
auto is_source_node = [&](Node const &n) { std::vector<OpenDataflowEdge> incoming_edges = get_incoming_edges(g, n); return incoming_edges.empty();
return get_incoming_edges(g, n).empty(); //or possibly remove the lambda entirely
Code quote:
std::vector<OpenDataflowEdge> incoming_edges = get_incoming_edges(g, n);
return incoming_edges.empty();lib/utils/test/src/utils/containers/filtrans.cc line 39 at r1 (raw file):
} TEST_CASE("filtrans(std::unorded_set<In>, F)") {
set
Code quote:
unorded_setlib/utils/test/src/utils/containers/get_all_permutations.cc line 33 at r1 (raw file):
CHECK(result.size() == 6); } }
I would add an extra subcase where we check for an vector with repeated elements (the STL does not produce duplicates, so for say {1,2,2} you would get {{1,2,2}, {2,1,2}, {2,2,1}} instead of {{1,2,2}, {1,2,2}, {2,1,2},{2,1,2}, {2,2,1},{2,2,1}} (so you can get less than n! permutations). I would also add a small docstring to clarify this. You can copy them from https://github.com/Marsella8/FlexFlow/blob/4b8600da25d64ae2c0bee61e5ab682f6b4c32fc4/lib/utils/test/src/utils/containers/permutations.cc.
lockshaw
left a comment
There was a problem hiding this comment.
Reviewable status: 249 of 337 files reviewed, 22 unresolved discussions (waiting on @Marsella8 and @wmdi)
lib/op-attrs/include/op-attrs/dim_ordered/enumerate.h line 14 at r1 (raw file):
Previously, Marsella8 wrote…
possible issues if
FFOrderedhas duplicate elements? Maybe better ro returnstd::pairs, so structured binding can also be used.
Done.
lib/op-attrs/include/op-attrs/ops/concat.h line 1 at r1 (raw file):
Previously, Marsella8 wrote…
_FLEXFLOW_OP_ATTRS_INCLUDE_OP_ATTRS_OPS_CONCAT_H? And same for the otherops
Done.
lib/op-attrs/include/op-attrs/ops/input.h line 14 at r1 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
Why
get_output_parallel_tensor_shapefor some operators whileget_output_shapefor other operators? What's the difference between them?
Most other op-attrs distinguish whether to do shape inference or parallel shape inference by the type of the input shape arguments (i.e., TensorShape vs ParallelTensorShape), but for operators that don't take inputs this isn't possible as the signatures would be the same (i.e., TensorShape get_output_shape(InputAttrs const &) and ParallelTensorShape get_output_shape(InputAttrs const &)), so we disambiguate them by changing the function name
lib/pcg/src/pcg/parallel_computation_graph/parallel_computation_graph_builder.cc line 430 at r1 (raw file):
Previously, Marsella8 wrote…
should be
ELU(or more likely it was accidentally pasted?)
Done.
lib/substitutions/include/substitutions/pcg_pattern.h line 32 at r1 (raw file):
Previously, Marsella8 wrote…
PCGPatternMatch (?)
Done.
lib/substitutions/include/substitutions/pcg_pattern_match.struct.toml line 15 at r1 (raw file):
Previously, Marsella8 wrote…
"<unordered_map>` possibly superfluous because of bidict?
If you assume bidict.h includes <unordered_map> then sure, but was not relying on that assumption here and input_assignment is an unordered_map, not a bidict
lib/substitutions/include/substitutions/pcg_pattern_match.struct.toml line 21 at r1 (raw file):
Previously, Marsella8 wrote…
the 2
src_includesshould probably be put directly intobidict.h(had the same issue in one of my branches).
They're here for hashing and fmting the input_assignment which is an unordered_map
lib/substitutions/src/substitutions/substitution.cc line 41 at r1 (raw file):
Previously, Marsella8 wrote…
as an aside, what is the rationale between having
keysreturns a set andvaluesreturn a vector? Feels like it should stay consistent between the 2.
Because an unordered_map can have duplicate values but not duplicate keys. We could change values to return a std::unordered_multiset, it's just not currently that way because I didn't realize std::unordered_multiset existed until recently. I'm totally open to a PR to do this change, but I'd rather not do it as part of this PR to avoid making it any larger 🙂
lib/substitutions/src/substitutions/substitution.cc line 137 at r1 (raw file):
Previously, Marsella8 wrote…
this part is a bit hard to parse (it's a lambda which is immediatelly called right)? Maybe move out into separate function / make it more sequential (just a suggestion)
Is it just the immediately-called lambda that's difficult to parse? If so, I'll probably not change that, as that's a pretty common pattern throughout the codebase (we generally use it to logically group code that all exists to just initialize a single value, and it also serves to limit the scope of any temporary variables we use in the process of initializing the value).
If it's something else that you'd finding difficult to read, let me know and I'm totally open to working on making it more readable!
lib/substitutions/src/substitutions/output_graph/materialize_operator_from_attrs_map.cc line 65 at r1 (raw file):
Previously, Marsella8 wrote…
add
/*...*/for consistency.
Done.
lib/utils/include/utils/containers/filtrans.h line 26 at r1 (raw file):
Previously, Marsella8 wrote…
better name?
filtransis a bit obscure. Also note that we havewithout_nullopts, so we could sub this withwithout_nullops(transform(...))(?)
I added an additional function because it's a pretty common operation, and I feel like nesting the without_nullopts(transform(...)) harms readability a bit. I'm open to suggestions for a better name, I spent some time thinking and couldn't come up with a name that concisely explained what the function does, so I went with choosing something that's at least unambiguous
lib/utils/include/utils/graph/labelled_open_dataflow_graph/algorithms/find_isomorphism.h line 13 at r1 (raw file):
Previously, Marsella8 wrote…
small docstring to specify a couple of subtleties (e.g. if there are multiple isomorphism, any isomorphism is returned).
Done.
lib/utils/include/utils/graph/open_dataflow_graph/open_dataflow_edge.h line 12 at r1 (raw file):
int get_open_dataflow_edge_dst_idx(OpenDataflowEdge const &); DataflowInput get_open_dataflow_edge_dst(OpenDataflowEdge const &); OpenDataflowValue get_open_dataflow_edge_source(OpenDataflowEdge const &);
Done.
lib/utils/include/utils/graph/open_dataflow_graph/algorithms/get_inputs.h line 10 at r1 (raw file):
std::unordered_set<DataflowGraphInput> get_inputs(OpenDataflowGraphView const &);
Done.
lib/utils/include/utils/graph/open_dataflow_graph/algorithms/open_dataflow_graph_data.struct.toml line 2 at r1 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
What is the definition of
OpenDataflowGraphData?
Right exists for two main purposes:
-
We don't have
operator==in tests and there's no way to print a graph, so it is useful for tests (i.e., if I have two graphsOpenDataflowGraphView aandOpenDataflowGraphView b, I can doOpenDataflowGraphData a_data = get_graph_data(a); OpenDataflowGraphData b_data = get_graph_data(b); CHECK(a_data == b_data); -
Some operations that we do on views are a bit difficult to express by hooking into the
queryAPI and control over exactNodeids is a bit limited through the standard graph class interfaces right now, soOpenDataflowGraphDatain conjunction with a function to take anOpenDataflowGraphDataand construct aOpenDataflowGraphViewout of it provides a low-level mutable interface for manipulating graphs which can be helpful for some operations like modifying node ids, etc.
lib/utils/src/utils/graph/dataflow_graph/algorithms.cc line 14 at r1 (raw file):
Previously, Marsella8 wrote…
get_dataflow_outputs ?
Renamed to get_input_values for clarity
lib/utils/src/utils/graph/dataflow_graph/algorithms.cc line 16 at r1 (raw file):
Previously, Marsella8 wrote…
Also maybe
get_outgoing_edgesinstead ofget_incoming_edges? (not sure about the logic, just a guess). Also forget_dataflow_inputswhich returnse.dst, double check the logic
The logic of these are correct, hopefully the new name helps clarify?
lib/utils/src/utils/graph/open_dataflow_graph/open_dataflow_value.cc line 7 at r1 (raw file):
Previously, Marsella8 wrote…
try_get_dataflow_graph_output (or
try_get_dataflow_graph_input->try_get_dataflow_input)
These are actually correct, but I'll agree the naming is a bit weird. There are three related types: DataflowInput, DataflowOutput, and DataflowGraphInput. DataflowInput is essentially an input slot/port for a node, DataflowOutput is essentially an output slot/port for a node, and DataflowGraphInput is an incoming value for an open graph. Thus, if we think about DataflowGraphs as representing computations over values, an OpenDataflowGraph has values that either come from nodes within itself (i.e., DataflowOutput) or values that come from outside (i.e., DataflowGraphInput)
lib/utils/src/utils/graph/open_dataflow_graph/algorithms/get_source_nodes.cc line 10 at r1 (raw file):
Previously, Marsella8 wrote…
return get_incoming_edges(g, n).empty(); //or possibly remove the lambda entirely
This is actually intentional, to confirm that get_incoming_edges(OpenDataflowGraphView, Node) is getting called, and not get_incoming_edges(DataflowGraphView, Node). What should probably happen is we should remove the inheritance structure from utils/graph to avoid the need for checks like this, but that's a large change that I'd rather think about more before making
lib/utils/test/src/utils/containers/filtrans.cc line 39 at r1 (raw file):
Previously, Marsella8 wrote…
set
Done.
lib/utils/test/src/utils/containers/get_all_permutations.cc line 33 at r1 (raw file):
Previously, Marsella8 wrote…
I would add an extra subcase where we check for an vector with repeated elements (the STL does not produce duplicates, so for say
{1,2,2}you would get{{1,2,2}, {2,1,2}, {2,2,1}}instead of{{1,2,2}, {1,2,2}, {2,1,2},{2,1,2}, {2,2,1},{2,2,1}}(so you can get less than n! permutations). I would also add a small docstring to clarify this. You can copy them from https://github.com/Marsella8/FlexFlow/blob/4b8600da25d64ae2c0bee61e5ab682f6b4c32fc4/lib/utils/test/src/utils/containers/permutations.cc.
Done.
lib/utils/include/utils/graph/open_dataflow_graph/algorithms/are_isomorphic.h line 8 at r1 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
How about having
are_isomorphicfor labelled dataflow graphs as well?
Done.
wmdi
left a comment
There was a problem hiding this comment.
Reviewed 3 of 88 files at r2, 1 of 46 files at r4, all commit messages.
Reviewable status: 252 of 337 files reviewed, 17 unresolved discussions (waiting on @Marsella8)
wmdi
left a comment
There was a problem hiding this comment.
Reviewed 40 of 88 files at r2, 45 of 46 files at r4.
Reviewable status: all files reviewed, 17 unresolved discussions (waiting on @Marsella8)
Marsella8
left a comment
There was a problem hiding this comment.
Reviewed 1 of 88 files at r2.
Reviewable status: 317 of 338 files reviewed, 5 unresolved discussions (waiting on @lockshaw and @wmdi)
lib/substitutions/include/substitutions/pcg_pattern_match.struct.toml line 21 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
They're here for hashing and fmting the
input_assignmentwhich is anunordered_map
sorry, missed input_assignment
lib/substitutions/src/substitutions/substitution.cc line 41 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Because an
unordered_mapcan have duplicate values but not duplicate keys. We could changevaluesto return astd::unordered_multiset, it's just not currently that way because I didn't realizestd::unordered_multisetexisted until recently. I'm totally open to a PR to do this change, but I'd rather not do it as part of this PR to avoid making it any larger 🙂
got it
lib/substitutions/src/substitutions/substitution.cc line 137 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Is it just the immediately-called lambda that's difficult to parse? If so, I'll probably not change that, as that's a pretty common pattern throughout the codebase (we generally use it to logically group code that all exists to just initialize a single value, and it also serves to limit the scope of any temporary variables we use in the process of initializing the value).
If it's something else that you'd finding difficult to read, let me know and I'm totally open to working on making it more readable!
Makes sense with your explanation, is clear and doesn't need to be changed imo.
lib/utils/include/utils/containers/filtrans.h line 26 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I added an additional function because it's a pretty common operation, and I feel like nesting the
without_nullopts(transform(...))harms readability a bit. I'm open to suggestions for a better name, I spent some time thinking and couldn't come up with a name that concisely explained what the function does, so I went with choosing something that's at least unambiguous
Yeah, thought about it for a bit, and haven't been able to come up with a better name
lib/utils/src/utils/graph/open_dataflow_graph/open_dataflow_value.cc line 7 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
These are actually correct, but I'll agree the naming is a bit weird. There are three related types:
DataflowInput,DataflowOutput, andDataflowGraphInput.DataflowInputis essentially an input slot/port for a node,DataflowOutputis essentially an output slot/port for a node, andDataflowGraphInputis an incoming value for an open graph. Thus, if we think aboutDataflowGraphs as representing computations over values, anOpenDataflowGraphhas values that either come from nodes within itself (i.e.,DataflowOutput) or values that come from outside (i.e.,DataflowGraphInput)
Ah okay that makes sense, I'll incorporate this explanation into the Graph docs.
lockshaw
left a comment
There was a problem hiding this comment.
Reviewable status: 316 of 339 files reviewed, 5 unresolved discussions (waiting on @Marsella8 and @wmdi)
lib/substitutions/include/substitutions/pcg_pattern_match.struct.toml line 21 at r1 (raw file):
Previously, Marsella8 wrote…
sorry, missed input_assignment
Done.
lib/substitutions/src/substitutions/substitution.cc line 41 at r1 (raw file):
Previously, Marsella8 wrote…
got it
Done.
lib/substitutions/src/substitutions/substitution.cc line 137 at r1 (raw file):
Previously, Marsella8 wrote…
Makes sense with your explanation, is clear and doesn't need to be changed imo.
Done.
lib/utils/include/utils/containers/filtrans.h line 26 at r1 (raw file):
Previously, Marsella8 wrote…
Yeah, thought about it for a bit, and haven't been able to come up with a better name
Done.
lib/utils/src/utils/graph/open_dataflow_graph/open_dataflow_value.cc line 7 at r1 (raw file):
Previously, Marsella8 wrote…
Ah okay that makes sense, I'll incorporate this explanation into the Graph docs.
Done.
Marsella8
left a comment
There was a problem hiding this comment.
Reviewed 1 of 285 files at r1, 1 of 88 files at r2, 1 of 46 files at r4, 12 of 14 files at r5, 9 of 9 files at r6, 2 of 2 files at r7, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @lockshaw and @wmdi)
lib/utils/include/utils/containers/enumerate_vector.h line 1 at r7 (raw file):
#ifndef _FLEXFLOW_LIB_UTILS_INCLUDE_UTILS_CONTAINERS_ENUMERATE_VECTOR_H
maybe get rid of enumerate_vector.h and move it to enumerate.h?
lib/utils/test/src/utils/containers/enumerate.cc line 49 at r7 (raw file):
{3, "three"}, }; }
No CHECK, I guess because its unordered? Does it make sense to have enumeration for an unordered collection?
lib/utils/test/src/utils/containers/get_all_permutations.cc line 10 at r7 (raw file):
using namespace ::FlexFlow;
use of unordered_multiset here is superfluous (?) since get_all_permutations returns an unordered_set anyways
lib/op-attrs/src/op-attrs/ops/dropout.cc line 16 at r5 (raw file):
if (get_sum_degree(input_shape) != 1) { return tl::unexpected( fmt::format("Expected sum degree 1, but receieved sum degree {}",
received
Code quote:
receieved
wmdi
left a comment
There was a problem hiding this comment.
Reviewed all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @lockshaw)
Description of changes:
Main features added in this PR:
apply_substitution. More testing is needed to be super-confident, but the tests added in this PR are already stronger than those that existed before.O(2^num graph outputs)) isomorphism checking ofOpenDataflowGraphs (and their labelled counterparts)get_graph_datainfrastructure for working with graphs in a more mutable/value-centric way, not intended for broad use but useful for some tests and for some more complex graph operations like substitution applicationdotrendering of PCGs (currently very rough, will need some improvement)rewrite_node_labels,rewrite_value_labels) and for dealing with node ids (permute_node_ids,permute_input_ids)ParallelComputationGraphBuilder(i.e.,generate_weight_transform)Outstanding issues to be addressed following this PR:
substitutionsREADME #1121SubstitutionBuilderto make creatingSubstitutions less verbose and error-prone #1473ParallelComputationGraphBuilderandComputationGraphBuilder#1474as_dotfor graphs inutils/graphin a less hacky way #1476is_valid_substitution#1477Related Issues:
Linked Issues:
lib/substitutions#1294 (done for external functions, waiting on Add ability to document dtgen structs using doxygen #1475 for data structures)Issues closed by this PR:
This change is