Skip to content

Add ParallelComputationGraphBuilder#1411

Merged
lockshaw merged 13 commits intoflexflow:repo-refactorfrom
lockshaw:pcg-builder
Jun 19, 2024
Merged

Add ParallelComputationGraphBuilder#1411
lockshaw merged 13 commits intoflexflow:repo-refactorfrom
lockshaw:pcg-builder

Conversation

@lockshaw
Copy link
Collaborator

@lockshaw lockshaw commented Jun 10, 2024

Description of changes:

  • Add interface for building PCGs using parallel shape inference, analogous to ParallelComputationGraphBuilder.
  • Modify dtgen to set struct constructors to be explicit, both for clarity and to remove a bug in string conversion.

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #

This change is Reviewable

@lockshaw lockshaw changed the base branch from inference to repo-refactor June 10, 2024 08:01
@lockshaw lockshaw marked this pull request as ready for review June 10, 2024 08:02
@lockshaw lockshaw requested a review from reyna-abhyankar June 10, 2024 08:02
@codecov
Copy link

codecov bot commented Jun 10, 2024

Codecov Report

Attention: Patch coverage is 93.69880% with 100 lines in your changes missing coverage. Please review.

Project coverage is 39.79%. Comparing base (8633945) to head (0b2c9a4).

Additional details and impacted files
@@                Coverage Diff                 @@
##           repo-refactor    #1411       +/-   ##
==================================================
+ Coverage          25.60%   39.79%   +14.19%     
==================================================
  Files                228      248       +20     
  Lines               7484     8744     +1260     
  Branches             294      320       +26     
==================================================
+ Hits                1916     3480     +1564     
+ Misses              5568     5264      -304     
Flag Coverage Δ
unittests 39.79% <93.69%> (+14.19%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
lib/op-attrs/include/op-attrs/get_output_shapes.h 0.00% <ø> (ø)
...s/attention/multihead_attention_parallel_inputs.cc 48.64% <100.00%> (+0.70%) ⬆️
lib/op-attrs/src/op-attrs/ops/batch_matmul.cc 64.04% <100.00%> (+0.82%) ⬆️
lib/op-attrs/src/op-attrs/ops/conv_2d.cc 100.00% <100.00%> (+65.16%) ⬆️
...-attrs/ops/conv_2d/conv_2d_parallel_input_shape.cc 100.00% <100.00%> (+100.00%) ⬆️
lib/op-attrs/src/op-attrs/ops/embedding.cc 81.81% <100.00%> (+0.27%) ⬆️
lib/op-attrs/src/op-attrs/ops/linear.cc 91.04% <100.00%> (ø)
lib/op-attrs/test/src/datatype.cc 100.00% <100.00%> (ø)
lib/op-attrs/test/src/ops/batch_matmul.cc 97.84% <100.00%> (ø)
lib/op-attrs/test/src/ops/cast.cc 100.00% <100.00%> (ø)
... and 35 more

... and 38 files with indirect coverage changes

@lockshaw lockshaw enabled auto-merge (squash) June 10, 2024 08:21
Copy link
Collaborator

@reyna-abhyankar reyna-abhyankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 283 of 285 files at r1.
Reviewable status: 283 of 285 files reviewed, 8 unresolved discussions (waiting on @lockshaw)


lib/op-attrs/include/op-attrs/get_output_shapes.h line 119 at r1 (raw file):

ParallelTensorShape get_output_shape(ConcatAttrs const &,
                                     std::vector<ParallelTensorShape> const &);
ParallelTensorShape get_output_shape(Conv2DAttrs const &,

Why delete?


lib/op-attrs/include/op-attrs/ops/cast.h line 14 at r1 (raw file):

CHECK_VALID_OP_ATTR(CastAttrs);

tl::expected<TensorShape, std::string> get_output_shape(CastAttrs const &,

Why is this moved over from the other file?


lib/pcg/include/pcg/computation_graph_builder.h line 285 at r1 (raw file):

      std::optional<std::string> const &name = std::nullopt);
  tensor_guid_t
      element_unary(ElementUnaryAttrs const &,

Why delete?


lib/pcg/include/pcg/file_format/v1/graphs/v1_operator_graph.struct.toml line 15 at r1 (raw file):

  "<vector>",
  "<unordered_set>",
  "pcg/file_format/v1/graphs/v1_graph_edge.dtg.h",

Why is there a v1?


lib/pcg/include/pcg/parallel_computation_graph/parallel_computation_graph.h line 9 at r1 (raw file):

namespace FlexFlow {

I think an interface here for getting the topological / reverse topological ordering would be nice since thats what the eventual legion backing will use


lib/pcg/include/pcg/parallel_computation_graph/parallel_computation_graph_builder.h line 10 at r1 (raw file):

namespace FlexFlow {

struct ParallelComputationGraphBuilder {

What about the rest of the operators? Is that a future todo


lib/utils/include/utils/fmt.h line 15 at r1 (raw file):

namespace fmt {

template <typename T, typename Char>

Why are these chunks deleted / moved over?


lib/op-attrs/src/op-attrs/datatype.cc line 24 at r1 (raw file):

}

bool can_strictly_promote_datatype_from_to(DataType src, DataType dst) {

Can we do this more explicitly? If someone changes the ordering of datatypes in the original enum, this can be wrong.

Copy link
Collaborator Author

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 269 of 307 files reviewed, 8 unresolved discussions (waiting on @reyna-abhyankar)


lib/op-attrs/include/op-attrs/get_output_shapes.h line 119 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Why delete?

They're now declared in the corresponding header in op-attrs/ops/ (e.g., get_output_shape(CastAttrs const &, ParallelTensorShape const & is now declared in op-attrs/ops/cast.h)


lib/op-attrs/include/op-attrs/ops/cast.h line 14 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Why is this moved over from the other file?

To me it seemed to make more sense to define these per-operator (so that all of the functions for a single operator type are grouped together) rather than define these per-capability (so that all of parallel shape inference is together) since each operator has a lot of shared stuff between the functions and may have extra functions (get_kernel_shape, get_bias_shape, etc.). If we need a common interface to them all that can always be added later by wrapping all of the per-operator functions (it's actually not unlikely this will get added later, but I'd prefer to wait until the use-case is clear)


lib/pcg/include/pcg/computation_graph_builder.h line 285 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Why delete?

With the ElementScalarUnary and ElementUnaryAttrs combined the additional functions felt necessary. I actually should have probably deleted ComputationGraphBuilder::element_scalar_unary too, which I will do unless you object


lib/pcg/include/pcg/file_format/v1/graphs/v1_operator_graph.struct.toml line 15 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Why is there a v1?

V1 refers to the actual json file format--there's separate V1 types for most of PCG to try to make the file format more explicit


lib/pcg/include/pcg/parallel_computation_graph/parallel_computation_graph.h line 9 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

I think an interface here for getting the topological / reverse topological ordering would be nice since thats what the eventual legion backing will use

Done.


lib/pcg/include/pcg/parallel_computation_graph/parallel_computation_graph_builder.h line 10 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

What about the rest of the operators? Is that a future todo

Yeah, this is just the list of operators we have shape inference for at the moment


lib/utils/include/utils/fmt.h line 15 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Why are these chunks deleted / moved over?

The problem with grouping everything into a single fmt.h file is it means that anything that uses a single of these implementations gets rebuilt anytime any of them change. Having them per-file is a bit less convenient (as users have to choose which types they want to format in the includes) but leads to a lot less "rebuild the world" behavior. If this becomes too impractical I can move it back or add a wrapper header that just pulls the rest in (which may be the best solution), but for now I think utils could use more divided up headers rather than less


lib/op-attrs/src/op-attrs/datatype.cc line 24 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Can we do this more explicitly? If someone changes the ordering of datatypes in the original enum, this can be wrong.

The best I can come up with is to added wrapper types for the arguments (i.e., SrcDataType and DstDataType), but I'm not sure the extra code is worth it as the whole "from, to" order is explicitly in the function name. The real answer here is to use a language with named arguments. If you think it's worth the extra code I'm happy to add it, but taking that approach might get pretty crazy pretty fast

Copy link
Collaborator

@reyna-abhyankar reyna-abhyankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 285 files at r1, 36 of 36 files at r2, all commit messages.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @lockshaw)


lib/op-attrs/include/op-attrs/get_output_shapes.h line 119 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

They're now declared in the corresponding header in op-attrs/ops/ (e.g., get_output_shape(CastAttrs const &, ParallelTensorShape const & is now declared in op-attrs/ops/cast.h)

Should that also happen for other operators?


lib/op-attrs/include/op-attrs/ops/cast.h line 14 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

To me it seemed to make more sense to define these per-operator (so that all of the functions for a single operator type are grouped together) rather than define these per-capability (so that all of parallel shape inference is together) since each operator has a lot of shared stuff between the functions and may have extra functions (get_kernel_shape, get_bias_shape, etc.). If we need a common interface to them all that can always be added later by wrapping all of the per-operator functions (it's actually not unlikely this will get added later, but I'd prefer to wait until the use-case is clear)

Then we should do this for the rest of the operators (that can be a future todo, along with some of the other operator-specific changes made in this PR)


lib/pcg/include/pcg/computation_graph_builder.h line 285 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

With the ElementScalarUnary and ElementUnaryAttrs combined the additional functions felt necessary. I actually should have probably deleted ComputationGraphBuilder::element_scalar_unary too, which I will do unless you object

Works for me.


lib/pcg/test/src/pcg/parallel_computation_graph/parallel_computation_graph.cc line 5 at r2 (raw file):

TEST_SUITE(FF_TEST_SUITE) {
  TEST_CASE("topological_ordering") {

What is the difference between this test and the test above?


lib/op-attrs/src/op-attrs/datatype.cc line 24 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

The best I can come up with is to added wrapper types for the arguments (i.e., SrcDataType and DstDataType), but I'm not sure the extra code is worth it as the whole "from, to" order is explicitly in the function name. The real answer here is to use a language with named arguments. If you think it's worth the extra code I'm happy to add it, but taking that approach might get pretty crazy pretty fast

Instead of doing that, you can just manually compare each possibility. There's only 6 of them so it shouldn't be that much code

...
} else if (src == DataType::INT32) {
    return dst == DataType::INT32 || dst == DataType::INT64 ...
} ...

And you can assert if the datatype is not one of the listed options (so this breaks if a new datatype is added a new datatype). Where is this function used btw?


lib/pcg/test/src/pcg/dataflow_graph/algorithms.cc line 49 at r2 (raw file):

  }

  TEST_CASE("topological_ordering") {

Should this PR test more complex cases or leave that for a future PR?

@lockshaw lockshaw requested a review from reyna-abhyankar June 18, 2024 15:44
Copy link
Collaborator Author

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 304 of 307 files reviewed, 4 unresolved discussions (waiting on @reyna-abhyankar)


lib/op-attrs/include/op-attrs/get_output_shapes.h line 119 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Should that also happen for other operators?

Eventually--this is being done incrementally as shape inference gets added for various operators


lib/op-attrs/include/op-attrs/ops/cast.h line 14 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Then we should do this for the rest of the operators (that can be a future todo, along with some of the other operator-specific changes made in this PR)

Yeah it's being incrementally done--each PR that adds shape inference moves the corresponding functions


lib/pcg/include/pcg/computation_graph_builder.h line 285 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Works for me.

Done


lib/pcg/test/src/pcg/parallel_computation_graph/parallel_computation_graph.cc line 5 at r2 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

What is the difference between this test and the test above?

This is for ParallelComputationGraph, not DataflowGraph


lib/op-attrs/src/op-attrs/datatype.cc line 24 at r1 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Instead of doing that, you can just manually compare each possibility. There's only 6 of them so it shouldn't be that much code

...
} else if (src == DataType::INT32) {
    return dst == DataType::INT32 || dst == DataType::INT64 ...
} ...

And you can assert if the datatype is not one of the listed options (so this breaks if a new datatype is added a new datatype). Where is this function used btw?

In ParallelComputationGraphBuilder::as_type and in shape inference for Cast: https://github.com/lockshaw/FlexFlow/blob/a9cee72c2d317c1dd36c63b0bb66880530f5691a/lib/pcg/src/pcg/parallel_computation_graph/parallel_computation_graph_builder.cc#L425-L427


lib/pcg/test/src/pcg/dataflow_graph/algorithms.cc line 49 at r2 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Should this PR test more complex cases or leave that for a future PR?

Leave it for now, DataflowGraph is getting moved into utils/graph and there is currently work on improving unit testing there, so this can get dealt with then

@lockshaw
Copy link
Collaborator Author

lib/op-attrs/src/op-attrs/datatype.cc line 24 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

In ParallelComputationGraphBuilder::as_type and in shape inference for Cast: https://github.com/lockshaw/FlexFlow/blob/a9cee72c2d317c1dd36c63b0bb66880530f5691a/lib/pcg/src/pcg/parallel_computation_graph/parallel_computation_graph_builder.cc#L425-L427

I misread your initial comment, this is now done

Copy link
Collaborator

@reyna-abhyankar reyna-abhyankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @lockshaw)


lib/op-attrs/test/src/datatype.cc line 30 at r3 (raw file):

        RC_PRE(can_strictly_promote_datatype_from_to(d1, d2));
        RC_PRE(can_strictly_promote_datatype_from_to(d2, d3));
        RC_ASSERT(!can_strictly_promote_datatype_from_to(d1, d3));

Maybe I don't know the exact syntax here, but shouldn't this be
RC_ASSERT(can_strictly_promote_datatype_from_to(d1, d3));?


lib/op-attrs/src/op-attrs/datatype.cc line 31 at r3 (raw file):

      break;
    case DataType::INT32:
      allowed = {DataType::INT64};

why not float or double for ints? or are we only allowing exact precision

Copy link
Collaborator Author

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 301 of 319 files reviewed, 2 unresolved discussions (waiting on @reyna-abhyankar)


lib/op-attrs/test/src/datatype.cc line 30 at r3 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Maybe I don't know the exact syntax here, but shouldn't this be
RC_ASSERT(can_strictly_promote_datatype_from_to(d1, d3));?

Yup! Fixed


lib/op-attrs/src/op-attrs/datatype.cc line 31 at r3 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

why not float or double for ints? or are we only allowing exact precision

That's what I'm currently doing. Tbh I don't have a super clear idea of exactly what the correct behavior here is, so I'm starting with this and I'll figure it out later 🙂

Copy link
Collaborator

@reyna-abhyankar reyna-abhyankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 18 of 18 files at r4, all commit messages.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @lockshaw)


lib/op-attrs/test/src/datatype.cc line 30 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Yup! Fixed

And did you figure out why it was passing before?


lib/op-attrs/src/op-attrs/datatype.cc line 31 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

That's what I'm currently doing. Tbh I don't have a super clear idea of exactly what the correct behavior here is, so I'm starting with this and I'll figure it out later 🙂

Fair enough. Though if bool can be promoted to anything, then shouldn't int32 also be promotable to everything except bool?

@lockshaw
Copy link
Collaborator Author

lib/op-attrs/src/op-attrs/datatype.cc line 31 at r3 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

Fair enough. Though if bool can be promoted to anything, then shouldn't int32 also be promotable to everything except bool?

Why should it?

Copy link
Collaborator

@reyna-abhyankar reyna-abhyankar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @lockshaw)


lib/op-attrs/src/op-attrs/datatype.cc line 31 at r3 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Why should it?

I guess if you're going for exact precision then it's fine as is.

Copy link
Collaborator Author

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @reyna-abhyankar)


lib/op-attrs/test/src/datatype.cc line 30 at r3 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

And did you figure out why it was passing before?

Yes, that's why stuff is now using RC_SUBCASE--the error wasn't getting properly handed over to doctest


lib/op-attrs/src/op-attrs/datatype.cc line 31 at r3 (raw file):

Previously, reyna-abhyankar (Reyna Abhyankar) wrote…

I guess if you're going for exact precision then it's fine as is.

I'm not entirely sure what the right answer here is, so I figured I'd err on the side of of being too conservative about the behavior and we can tweak if it ends up being unnecessarily restrictive

@reyna-abhyankar reyna-abhyankar self-requested a review June 19, 2024 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants