Skip to content

add unit tests for compiler#870

Merged
lockshaw merged 21 commits intoflexflow:repo-refactorfrom
wmdi:repo-refactor
Aug 18, 2023
Merged

add unit tests for compiler#870
lockshaw merged 21 commits intoflexflow:repo-refactorfrom
wmdi:repo-refactor

Conversation

@wmdi
Copy link
Collaborator

@wmdi wmdi commented Jul 19, 2023

Description of changes:

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #

Before merging:

  • Did you update the flexflow-third-party repo, if modifying any of the Cmake files, the build configs, or the submodules?

This change is Reviewable

@wmdi wmdi requested a review from lockshaw July 19, 2023 17:00
Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we merge the base of this and then merge this? Because repo-refactor doesn't have some of the changes from my fork this diff shows all of them at once. It would be nice to do this as two separate PRs (one for the changes from my fork with a base of repo-refactor and one for the changes from your fork with a base of my fork so the diffs work out)

Reviewable status: 0 of 40 files reviewed, all discussions resolved

@lockshaw lockshaw changed the base branch from repo-refactor to re/search July 26, 2023 03:55
@lockshaw lockshaw changed the base branch from re/search to repo-refactor July 26, 2023 03:55
Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 12 files at r2, 33 of 33 files at r3, 1 of 1 files at r4, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @wmdi)


lib/compiler/src/machine_mapping.cc line 204 at r4 (raw file):

                                     cached_subgraph_costs),
                         post_decompn)),
               MachineMappingRuntimeCmp{});

Suggestion:

      minimize_runtime(optimal_result,
               MachineMapping::sequential_combine(
                   visit(OptimalCost(pre_graph,
                                     cost_estimator,
                                     resource,
                                     source_machine_view,
                                     pre_sink_mv,
                                     allowed_machine_views,
                                     cached_subgraph_costs),
                         pre_decompn),
                   visit(OptimalCost(post_graph,
                                     cost_estimator,
                                     resource,
                                     post_source_mv,
                                     sink_machine_view,
                                     allowed_machine_views,
                                     cached_subgraph_costs),
                         post_decompn)));

lib/compiler/test/test_generator.h line 21 at r4 (raw file):

  static Gen<MachineView> arbitrary() {
    return gen::apply(make_1d_machine_view,
                      gen::construct<gpu_id_t>(gen::nonZero<int>()),

Why not add an arbitrary instance for gpu_id_t? (or even strong_typedef in general)


lib/compiler/test/test_generator.h line 41 at r4 (raw file):

struct Arbitrary<ComputationGraph> {
  static Gen<ComputationGraph> arbitrary() {
    return gen::apply(

If would be really nice to clean up this code a bit -- it's currently really hard to read/understand (and some of this may be complexity we can't avoid, but I think there's still some simplifying to be done). At the very least it should be possible to pull some of it out into functions


lib/compiler/test/test_generator.h line 47 at r4 (raw file):

           std::vector<int> const &compn_type) {
          auto g =
              ::create<UnorderedOutputLabelledMultiDiGraph<Layer, Tensor>>();

Why is there nothing to the left of::create?


lib/compiler/test/test_generator.h line 78 at r4 (raw file):

            RC_ASSERT(source0 != source1 && sink0 != sink1);

            if (source0 == sink0 || t == 0) {

Would probably be a good idea to add parallel and serial composition as actual functions in the codebase as they're pretty core functionality


lib/compiler/test/test_generator.h line 117 at r4 (raw file):

template <>
struct Arbitrary<ParallelComputationGraph> {

Why do ParallelComputationGraph and ComputationGraph have separate Arbitrary instances?


lib/compiler/test/test_generator.h line 170 at r4 (raw file):

            } else {
              // parallel composition
              g.add_edge(MultiDiOutput{source0, NodePort(0)},

Should be constructed by add_node_port

Code quote:

NodePort(0)

lib/compiler/test/test_machine_mapping.cc line 10 at r4 (raw file):

    MachineMapping comb = MachineMapping::sequential_combine(mp0, mp1);

    RC_ASSERT(comb.runtime == mp0.runtime + mp1.runtime);

We should pull runtime out of MachineMapping for the sake of immutability


lib/compiler/test/test_machine_mapping.cc line 21 at r4 (raw file):

TEST_CASE("MachineMapping::parallel_combine") {
  rc::check([](MachineMapping const &mp0, MachineMapping const &mp1) {
    RC_PRE(are_disjoint(keys(mp0), keys(mp1)));

Should this be keys(mp0.machine_views), or is there a keys(...) instance for MachineMapping somewhere I missed. Also might be nice to just add a are_disjoint(MachineMapping const &, MachineMapping const &) or nodes_are_disjoint(MachineMapping const &, MachineMapping const &)


lib/compiler/test/test_machine_mapping.cc line 30 at r4 (raw file):

    for (auto p : mp0.machine_views) {
      RC_ASSERT(p.second == comb.machine_views.at(p.first));
    }

Also, might be nice to just define an is_submap in containers.h

Suggestion:

    RC_ASSERT(restrict_keys(comb.machine_views, keys(mp0)) == mp0);

lib/compiler/test/test_machine_mapping.cc line 34 at r4 (raw file):

}

TEST_CASE("MachieMapping::infinity") {

Suggestion:

MachineMapping

lib/compiler/test/test_optimal_cost.cc line 5 at r4 (raw file):

TEST_CASE("optimal_cost") {
  rc::check([](ParallelComputationGraph const &g) {

Can you add a quick explanation of what these tests are checking? It's a little nontrivial to decode them currently


lib/compiler/test/test_optimal_cost.cc line 13 at r4 (raw file):

        },
        TestCostEstimator{},
        MachineSpecification{1, 1, 4, 0.1, 0.2},

Add strong_typedefs for the args of MachineSpecification--it's quite unclear what these numbers mean currently


lib/compiler/test/test_optimal_cost.cc line 18 at r4 (raw file):

    for (auto node : get_nodes(g)) {
      RC_ASSERT(contains_key(machine_mapping.machine_views, node));
    }

Suggestion:

    RC_ASSERT(keys(machine_mapping.machine_views) == get_nodes(g));

lib/compiler/test/test_unity_algorithm.cc line 23 at r4 (raw file):

    for (auto node : get_nodes(s.pcg)) {
      RC_ASSERT(contains_key(s.machine_mapping.machine_views, node));
    }

Suggestion:

    RC_ASSERT(keys(s.machine_mapping.machine_Views) == get_nodes(s.pcg));

lib/utils/include/utils/graph/cow_ptr_t.h line 14 at r4 (raw file):

struct cow_ptr_t {
  // static_assert(is_clonable<T>::value,
  //               "cow_ptr_t requires the type to have a clone() method");

And create an issue, assign it to me, and link it in the comment after the TODO

Suggestion:

"cow_ptr_t requires the type to have a clone() method"); TODO FIXME @lockshaw

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we merge #842 before merging this pr, because there are some dependencies in graphs.

Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @lockshaw)


lib/compiler/src/machine_mapping.cc line 204 at r4 (raw file):

                                     cached_subgraph_costs),
                         post_decompn)),
               MachineMappingRuntimeCmp{});

Done.


lib/compiler/test/test_generator.h line 21 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why not add an arbitrary instance for gpu_id_t? (or even strong_typedef in general)

Done.


lib/compiler/test/test_generator.h line 41 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

If would be really nice to clean up this code a bit -- it's currently really hard to read/understand (and some of this may be complexity we can't avoid, but I think there's still some simplifying to be done). At the very least it should be possible to pull some of it out into functions

Done.


lib/compiler/test/test_generator.h line 47 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why is there nothing to the left of::create?

Done.


lib/compiler/test/test_generator.h line 78 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Would probably be a good idea to add parallel and serial composition as actual functions in the codebase as they're pretty core functionality

I agree. But we should have a consistent definition of parallel and serial composition first, i.e., 1)different appearance of parallel and serial composition should have the same meaning, and 2) serial-parallel composition and serial-parallel decomposition should be the inverse of each other.

There might be two (or even more) different ways to define it: (I only describe serial composition here, as parallel composition should be similar.)

  1. For two graphs 'g_1' and 'g_2', a serial composition means we merge the sink node of 'g_1' and the source node of 'g_2' as a single node. This definition assumes all the graphs are perfect two-terminal series–parallel graphs.
  2. For two graphs 'g_1' and 'g_2', a serial composition means we add edges between the sink nodes of 'g_1' and the source nodes of 'g_2'. This definition allows we have multiple source/sink nodes (perhaps with additional conventions of how to add edges).

The first one is the definition in graph theory, while the second one is the inverse of the sp decomposition we have now in the codebase.


lib/compiler/test/test_generator.h line 117 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why do ParallelComputationGraph and ComputationGraph have separate Arbitrary instances?

Done.


lib/compiler/test/test_generator.h line 170 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Should be constructed by add_node_port

Done.


lib/compiler/test/test_machine_mapping.cc line 10 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

We should pull runtime out of MachineMapping for the sake of immutability

I don't quite get it. Could you explain more?


lib/compiler/test/test_machine_mapping.cc line 21 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Should this be keys(mp0.machine_views), or is there a keys(...) instance for MachineMapping somewhere I missed. Also might be nice to just add a are_disjoint(MachineMapping const &, MachineMapping const &) or nodes_are_disjoint(MachineMapping const &, MachineMapping const &)

Done.


lib/compiler/test/test_machine_mapping.cc line 30 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Also, might be nice to just define an is_submap in containers.h

Done.


lib/compiler/test/test_machine_mapping.cc line 34 at r4 (raw file):

}

TEST_CASE("MachieMapping::infinity") {

Done.


lib/compiler/test/test_optimal_cost.cc line 13 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Add strong_typedefs for the args of MachineSpecification--it's quite unclear what these numbers mean currently

Done.


lib/compiler/test/test_unity_algorithm.cc line 23 at r4 (raw file):

    for (auto node : get_nodes(s.pcg)) {
      RC_ASSERT(contains_key(s.machine_mapping.machine_views, node));
    }

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 29 of 45 files reviewed, 16 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_optimal_cost.cc line 18 at r4 (raw file):

    for (auto node : get_nodes(g)) {
      RC_ASSERT(contains_key(machine_mapping.machine_views, node));
    }

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 29 of 45 files reviewed, 16 unresolved discussions (waiting on @lockshaw)


lib/utils/include/utils/graph/cow_ptr_t.h line 14 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

And create an issue, assign it to me, and link it in the comment after the TODO

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 29 of 45 files reviewed, 16 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 78 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

I agree. But we should have a consistent definition of parallel and serial composition first, i.e., 1)different appearance of parallel and serial composition should have the same meaning, and 2) serial-parallel composition and serial-parallel decomposition should be the inverse of each other.

There might be two (or even more) different ways to define it: (I only describe serial composition here, as parallel composition should be similar.)

  1. For two graphs 'g_1' and 'g_2', a serial composition means we merge the sink node of 'g_1' and the source node of 'g_2' as a single node. This definition assumes all the graphs are perfect two-terminal series–parallel graphs.
  2. For two graphs 'g_1' and 'g_2', a serial composition means we add edges between the sink nodes of 'g_1' and the source nodes of 'g_2'. This definition allows we have multiple source/sink nodes (perhaps with additional conventions of how to add edges).

The first one is the definition in graph theory, while the second one is the inverse of the sp decomposition we have now in the codebase.

It seems that we cannot use the definition in graph theory, because it duplicates some nodes in decomposition, which is not good for our algorithm.

If we use the second definition, we have to solve the 'additional conventions': According to the current sp decomposition we have, a parallel composition means we put g_1 and g_2 in parallel. As a result, we will have a graph with more than 1 source/sink nodes. Then how do we serially compose the following two graphs?

g_1: n0 -> n1
      n2 -> n3
g_2: n4 -> n5
      n6 -> n7

In decomposition, we never encounter this case, because we will have a parallel decomposition before. However, in composition, we have to deal with it because we are building the graph in a bottom-up way.

Another way is to use definition 1 for serial and parallel composition to build a perfect two-terminal series–parallel graph, and use definition 2 for decomposition for the convenience of our algorithm.

Any thoughts?

Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 15 of 16 files at r5, 1 of 1 files at r6, all commit messages.
Reviewable status: all files reviewed, 13 unresolved discussions (waiting on @wmdi)


lib/compiler/test/test_generator.h line 41 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Done.

Well done, the new code is already much easier to understand 🙂


lib/compiler/test/test_generator.h line 78 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

It seems that we cannot use the definition in graph theory, because it duplicates some nodes in decomposition, which is not good for our algorithm.

If we use the second definition, we have to solve the 'additional conventions': According to the current sp decomposition we have, a parallel composition means we put g_1 and g_2 in parallel. As a result, we will have a graph with more than 1 source/sink nodes. Then how do we serially compose the following two graphs?

g_1: n0 -> n1
      n2 -> n3
g_2: n4 -> n5
      n6 -> n7

In decomposition, we never encounter this case, because we will have a parallel decomposition before. However, in composition, we have to deal with it because we are building the graph in a bottom-up way.

Another way is to use definition 1 for serial and parallel composition to build a perfect two-terminal series–parallel graph, and use definition 2 for decomposition for the convenience of our algorithm.

Any thoughts?

I would use definition 2, with the serial composition of the graphs you gave being

digraph G {
  n0 -> n1;
  n2 -> n3;
  n1 -> n4;
  n1 -> n6;
  n3 -> n4;
  n3 -> n6;
  n4 -> n5;
  n6 -> n7;
}

(i.e., add all-to-all edges between all sink nodes of g_1 and all source nodes of g_2--I think this is the only way you can get the guarantees that everything in g_1 will execute before anything in g_2 can execute)


lib/compiler/test/test_generator.h line 10 at r6 (raw file):

using namespace FlexFlow;

enum class CompnType { SERIAL, PARALLEL };

What does Compn stand for? Composition? If so I'd just write that out as Compn is a little ambiguous


lib/compiler/test/test_generator.h line 31 at r6 (raw file):

  Generates a series-parallel graph according to the composition sequence
  described by `composition`. A series-parallel graph can be generated as
  follows: 1) Initially, we have E (E is the length of `composition`+1)

Suggestion:

E is `composition.size()+1`

lib/compiler/test/test_generator.h line 32 at r6 (raw file):

  described by `composition`. A series-parallel graph can be generated as
  follows: 1) Initially, we have E (E is the length of `composition`+1)
  components, each containing a single edge; 2) In iteration `i`, we compose two

It feels like it would be easier to have each of the starting graphs just be a single node and have composition be adding edges, not merging nodes. What do you think? I also wonder if this would make it easy to do SP graph generation via recursion rather than the current method (i.e., generate_sp_graph(int num_compositions) = compose(randomComposition(), generate_sp_graph(randomNumLessThan(num_compositions)), generate_sp_graph(<remaining num_compositions>))

Code quote:

each containing a single edge

lib/compiler/test/test_generator.h line 47 at r6 (raw file):

      dstIdx; // src and dst node ports for each edge (I assume it is sufficient
              // to make different edges have different NodePort. Correct me if
              // I am wrong. @lockshaw)

Yes, that works


lib/compiler/test/test_generator.h line 48 at r6 (raw file):

              // to make different edges have different NodePort. Correct me if
              // I am wrong. @lockshaw)
  AdjacencyMultiDiGraph g(0, 0, {});

MultiDiGraph g = MultiDiGraph::create<AdjacencyMultiDiGraph>();


lib/compiler/test/test_generator.h line 93 at r6 (raw file):

    generate_test_labelled_sp_graph() {
  NOT_IMPLEMENTED();
  // Is there a way to construct a labelled graph from a MultiDiGraph and the

I don't think so right now, but it should be relatively easy to add a view that takes a MultiDiGraph and a std::function<NodeLabel(Node const &)> and a std::function<OutputLabel(Output const &)> and implements IOutputLabelledMultiDiGraph


lib/compiler/test/test_generator.h line 104 at r6 (raw file):

template <typename Tag, typename T>
struct Arbtrary<Tag> {

Suggestion:

Arbitrary

lib/compiler/test/test_generator.h line 113 at r6 (raw file):

template <>
struct Arbitrary<Node> {

Why does this exist in addition to the generic Arbitrary instance for strong_typedef?


lib/compiler/test/test_machine_mapping.cc line 10 at r4 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

I don't quite get it. Could you explain more?

Conceptually I view the cost of a MachineMapping as separate from the views it holds. For example, if I just want to construct a MachineMapping in some other part of the code, or as part of a test, etc., what value would I assign runtime? I was thinking that it might be nice to make MachineMapping just be a std::unordered_map<Node, MachineView>, and have a separate component that calculates and caches runtime_cost(MachineView const &). Would that work with the current design?


lib/compiler/test/test_machine_mapping.cc line 4 at r6 (raw file):

#include "test_generator.h"

bool nodes_are_disjoint(MachineMapping const &m1, MachineMapping const &m2) {

This seems general enough to justify moving out of tests and into the standard codebase


lib/utils/include/utils/test_types.h line 84 at r6 (raw file):

template <
    ::FlexFlow::test_types::
        capability... CAPABILITIES> //, typename = typename

Commented out code here should be removed

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 78 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I would use definition 2, with the serial composition of the graphs you gave being

digraph G {
  n0 -> n1;
  n2 -> n3;
  n1 -> n4;
  n1 -> n6;
  n3 -> n4;
  n3 -> n6;
  n4 -> n5;
  n6 -> n7;
}

(i.e., add all-to-all edges between all sink nodes of g_1 and all source nodes of g_2--I think this is the only way you can get the guarantees that everything in g_1 will execute before anything in g_2 can execute)

Good idea. In this way our definitions of composition and decomposition would be consistent.


lib/compiler/test/test_generator.h line 10 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

What does Compn stand for? Composition? If so I'd just write that out as Compn is a little ambiguous

It stands for composition. I think I overused 'compn'.


lib/compiler/test/test_generator.h line 31 at r6 (raw file):

  Generates a series-parallel graph according to the composition sequence
  described by `composition`. A series-parallel graph can be generated as
  follows: 1) Initially, we have E (E is the length of `composition`+1)

Done.


lib/compiler/test/test_generator.h line 32 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

It feels like it would be easier to have each of the starting graphs just be a single node and have composition be adding edges, not merging nodes. What do you think? I also wonder if this would make it easy to do SP graph generation via recursion rather than the current method (i.e., generate_sp_graph(int num_compositions) = compose(randomComposition(), generate_sp_graph(randomNumLessThan(num_compositions)), generate_sp_graph(<remaining num_compositions>))

Starting with edges and merging nodes align with the graph theory definition (definition 1 in our discussion). But since we decide to use definition 2, it will be good to start with a single node.


lib/compiler/test/test_generator.h line 48 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

MultiDiGraph g = MultiDiGraph::create<AdjacencyMultiDiGraph>();

Done.


lib/compiler/test/test_generator.h line 104 at r6 (raw file):

template <typename Tag, typename T>
struct Arbtrary<Tag> {

Done.


lib/compiler/test/test_generator.h line 113 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why does this exist in addition to the generic Arbitrary instance for strong_typedef?

Done.


lib/compiler/test/test_optimal_cost.cc line 5 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Can you add a quick explanation of what these tests are checking? It's a little nontrivial to decode them currently

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 43 of 45 files reviewed, 12 unresolved discussions (waiting on @lockshaw)


lib/utils/include/utils/test_types.h line 84 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Commented out code here should be removed

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 42 of 45 files reviewed, 12 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 93 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think so right now, but it should be relatively easy to add a view that takes a MultiDiGraph and a std::function<NodeLabel(Node const &)> and a std::function<OutputLabel(Output const &)> and implements IOutputLabelledMultiDiGraph

Done.


lib/compiler/test/test_machine_mapping.cc line 4 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

This seems general enough to justify moving out of tests and into the standard codebase

For now I keep the new codes as local as possible for convenience. After you review the serial parallel composition codes I will move all the general codes into proper position together.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 41 of 45 files reviewed, 12 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 93 at r6 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Done.

And where are the serialization methods, which are to serialize a graph view into a graph?


lib/compiler/test/test_machine_mapping.cc line 10 at r4 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Conceptually I view the cost of a MachineMapping as separate from the views it holds. For example, if I just want to construct a MachineMapping in some other part of the code, or as part of a test, etc., what value would I assign runtime? I was thinking that it might be nice to make MachineMapping just be a std::unordered_map<Node, MachineView>, and have a separate component that calculates and caches runtime_cost(MachineView const &). Would that work with the current design?

Actually, the current MachineMapping should be viewed as the type for dp result, which is the cost with the corresponding strategy(machine mapping). I have separated MachineMapping from the dp result.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 12 files at r2, 24 of 33 files at r3, 11 of 16 files at r5, 1 of 1 files at r8, 3 of 6 files at r9, 5 of 5 files at r10, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @lockshaw)

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 40 of 46 files reviewed, 12 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_machine_mapping.cc line 4 at r6 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

For now I keep the new codes as local as possible for convenience. After you review the serial parallel composition codes I will move all the general codes into proper position together.

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r11, all commit messages.
Reviewable status: all files reviewed, 12 unresolved discussions (waiting on @lockshaw)

Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 11 files at r12, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw and @wmdi)


lib/compiler/test/test_generator.h line 93 at r6 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

And where are the serialization methods, which are to serialize a graph view into a graph?

materialize_* (https://github.com/flexflow/FlexFlow/blob/405ddf44f1214e0bba6ac3f41914749e49977bbe/lib/utils/include/utils/graph/views.h#L322-L347)


lib/compiler/test/test_generator.h line 120 at r7 (raw file):

template <>
struct Arbitrary<variant<Serial, Node>> {

Does there not exist automatic Arbitrary instances for variant? If not this should be added


lib/compiler/test/test_generator.h line 138 at r7 (raw file):

template <>
struct Arbitrary<Serial> {

Aren't Serial and Parallel visitable? Doesn't visitable automatically define an arbitrary instance?


lib/utils/src/graph/serialparallel.cc line 221 at r12 (raw file):

}

std::unordered_map<Node, Node> parallel_extend(MultiDiGraph &g,

Can you add an issue to refactor these once global-nodes is merged? (and link it in the code). Feel free to assign the issue to me and add it to the project board


lib/utils/src/graph/serialparallel.cc line 267 at r12 (raw file):

}

struct MultiDiGraphFromSPDecomposition {

Suggestion:

struct MultiDiGraphFromSPDecompositionFunctor {

lib/utils/src/graph/serialparallel.cc line 291 at r12 (raw file):

MultiDiGraph multidigraph_from_sp_decomposition(Serial const &serial) {
  MultiDiGraph g = MultiDiGraph::create<AdjacencyMultiDiGraph>();
  for (auto child : serial.children) {

Ideally write out the type here rather than use auto unless the type name is really long (also, const &)


lib/utils/src/graph/serialparallel.cc line 299 at r12 (raw file):

MultiDiGraph multidigraph_from_sp_decomposition(Parallel const &parallel) {
  MultiDiGraph g = MultiDiGraph::create<AdjacencyMultiDiGraph>();
  for (auto child : parallel.children) {

Ideally write out the type here rather than use auto unless the type name is really long (also, const &)

Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw and @wmdi)

Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @wmdi)


lib/compiler/test/test_generator.h line 32 at r6 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Starting with edges and merging nodes align with the graph theory definition (definition 1 in our discussion). But since we decide to use definition 2, it will be good to start with a single node.

Sounds good. Do you want to implement this in this PR or do a follow-up PR?

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 11 of 11 files at r12, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw)


lib/compiler/test/test_generator.h line 32 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Sounds good. Do you want to implement this in this PR or do a follow-up PR?

I think it is done in this PR, by constructing graphs from sp decomposition (without using global nodes).


lib/compiler/test/test_generator.h line 93 at r6 (raw file):

Previously, lockshaw (Colin Unger) wrote…

materialize_* (https://github.com/flexflow/FlexFlow/blob/405ddf44f1214e0bba6ac3f41914749e49977bbe/lib/utils/include/utils/graph/views.h#L322-L347)

Done.


lib/compiler/test/test_generator.h line 120 at r7 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Does there not exist automatic Arbitrary instances for variant? If not this should be added

No. And we can easily imagine that there isn't a clear way to define an Arbitrary instance for variant.


lib/compiler/test/test_generator.h line 138 at r7 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Aren't Serial and Parallel visitable? Doesn't visitable automatically define an arbitrary instance?

Not for now. I tried to make it but had trouble with the macro stuffs. I will open an issue for this.


lib/utils/src/graph/serialparallel.cc line 221 at r12 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Can you add an issue to refactor these once global-nodes is merged? (and link it in the code). Feel free to assign the issue to me and add it to the project board

Done.


lib/utils/src/graph/serialparallel.cc line 267 at r12 (raw file):

}

struct MultiDiGraphFromSPDecomposition {

Done.


lib/utils/src/graph/serialparallel.cc line 291 at r12 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally write out the type here rather than use auto unless the type name is really long (also, const &)

Done.


lib/utils/src/graph/serialparallel.cc line 299 at r12 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally write out the type here rather than use auto unless the type name is really long (also, const &)

Done.

Copy link
Collaborator Author

@wmdi wmdi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r13, all commit messages.
Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @lockshaw)

Copy link
Collaborator

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @wmdi)


lib/compiler/test/test_generator.h line 120 at r7 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

No. And we can easily imagine that there isn't a clear way to define an Arbitrary instance for variant.

I'll work on adding one. #981


lib/compiler/test/test_generator.h line 138 at r7 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Not for now. I tried to make it but had trouble with the macro stuffs. I will open an issue for this.

#976


lib/utils/src/graph/serialparallel.cc line 221 at r12 (raw file):

Previously, wmdi (Mengdi Wu) wrote…

Done.

#976

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants