Skip to content

Pool2D parallel shape inference#9

Closed
Marsella8 wants to merge 12 commits intolockshaw:pcg-serializationfrom
Marsella8:pcg-serialization
Closed

Pool2D parallel shape inference#9
Marsella8 wants to merge 12 commits intolockshaw:pcg-serializationfrom
Marsella8:pcg-serialization

Conversation

@Marsella8
Copy link
Copy Markdown

@Marsella8 Marsella8 commented May 27, 2024

Description of changes:
Added Pool2D parallel shape inference

Related Issues:

Linked Issues:

  • Issue #

Issues closed by this PR:

  • Closes #

Before merging:

  • Did you update the flexflow-third-party repo, if modifying any of the Cmake files, the build configs, or the submodules?

This change is Reviewable

@lockshaw lockshaw self-requested a review May 27, 2024 16:22
Copy link
Copy Markdown
Owner

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add tests? You can find an example in https://github.com/lockshaw/FlexFlow/blob/88709f056892bdd90ce3a6a6c5060d0b91966df6/lib/op-attrs/test/src/test_batch_matmul.cc

Reviewed 8 of 8 files at r1, all commit messages.
Reviewable status: :shipit: complete! all files reviewed, all discussions resolved (waiting on @Marsella8)

Copy link
Copy Markdown
Author

@Marsella8 Marsella8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test has been added

Reviewable status: 8 of 9 files reviewed, all discussions resolved (waiting on @lockshaw)

Copy link
Copy Markdown
Owner

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Marsella8)


lib/op-attrs/test/src/test_pool_2d.cc line 63 at r2 (raw file):

    CHECK(result == correct_output_shape);
  }
}

Tests for parallel shape inference (i.e., get_output_shape(Pool2DAttrs, ParallelTensorShape)) are also needed--see https://github.com/lockshaw/FlexFlow/blob/1a6d1f8d9de10f1d9159686c61b13e07b71671b6/lib/op-attrs/test/src/test_attention.cc for example

Copy link
Copy Markdown
Owner

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Marsella8)

Copy link
Copy Markdown
Author

@Marsella8 Marsella8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 10 files reviewed, 1 unresolved discussion (waiting on @lockshaw)


lib/op-attrs/test/src/test_pool_2d.cc line 63 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Tests for parallel shape inference (i.e., get_output_shape(Pool2DAttrs, ParallelTensorShape)) are also needed--see https://github.com/lockshaw/FlexFlow/blob/1a6d1f8d9de10f1d9159686c61b13e07b71671b6/lib/op-attrs/test/src/test_attention.cc for example

added tests for get_output_shape(Pool2DAttrs, ParallelTensorShape). I'm still not entirely confident in how the shapes change when there is parallelization involved, so let me know if you see any glaring issues. I can briefly work on it later tonight if some additional changes are needed.

Copy link
Copy Markdown
Owner

@lockshaw lockshaw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r4, all commit messages.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @Marsella8)


lib/op-attrs/test/src/test_pool_2d.cc line 63 at r2 (raw file):

Previously, Marsella8 wrote…

added tests for get_output_shape(Pool2DAttrs, ParallelTensorShape). I'm still not entirely confident in how the shapes change when there is parallelization involved, so let me know if you see any glaring issues. I can briefly work on it later tonight if some additional changes are needed.

Ideally have tests checking each of the individual parallelization schemes, as in

SUBCASE("data parallelism") {
int degree = 4;
ParallelTensorShape par_input = make_input(SumDegree{1}, DiscardCopyDegree{1}, degree, 1);
{
tl::expected<ParallelTensorShape, std::string> result = get_output_shape(attrs, par_input);
tl::expected<ParallelTensorShape, std::string> correct = make_output(SumDegree{1}, DiscardCopyDegree{1}, degree, 1);
CHECK(result == correct);
}
{
tl::expected<ParallelTensorShape, std::string> result = get_weights_shape(attrs, par_input);
tl::expected<ParallelTensorShape, std::string> correct = make_weights(SumDegree{1}, DiscardCopyDegree{degree}, 1, 1);
CHECK(result == correct);
}
}
SUBCASE("input features parallelism") {
int degree = 4;
ParallelTensorShape input = make_input(SumDegree{1}, DiscardCopyDegree{1}, 1, degree);
{
tl::expected<ParallelTensorShape, std::string> result = get_output_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_output(SumDegree{degree}, DiscardCopyDegree{1}, 1, 1);
CHECK(result == correct);
}
{
tl::expected<ParallelTensorShape, std::string> result = get_weights_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_weights(SumDegree{1}, DiscardCopyDegree{degree}, 1, 1);
CHECK(result == correct);
}
}
SUBCASE("output channel shard parallelism") {
// NOTE (@lockshaw): in the current (parallel shape inference from just input tensor) representation we have to choose between
// either parallelism in the weight channel dimension or in the weight entry dimension. For now we choose to represent
// parallelism in the channel dimension, but partitioning in the entry dimension is also potentially useful as it produces
// sum parallelism in the output
int degree = 4;
ParallelTensorShape input = make_input(SumDegree{1}, DiscardCopyDegree{degree}, 1, 1);
{
tl::expected<ParallelTensorShape, std::string> result = get_output_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_output(SumDegree{1}, DiscardCopyDegree{1}, 1, degree);
CHECK(result == correct);
}
{
tl::expected<ParallelTensorShape, std::string> result = get_weights_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_weights(SumDegree{1}, DiscardCopyDegree{1}, 1, degree);
CHECK(result == correct);
}
}
, as it makes it clearer what is actually being tested

Copy link
Copy Markdown
Author

@Marsella8 Marsella8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 8 of 10 files reviewed, 1 unresolved discussion (waiting on @lockshaw)


lib/op-attrs/test/src/test_pool_2d.cc line 63 at r2 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Ideally have tests checking each of the individual parallelization schemes, as in

SUBCASE("data parallelism") {
int degree = 4;
ParallelTensorShape par_input = make_input(SumDegree{1}, DiscardCopyDegree{1}, degree, 1);
{
tl::expected<ParallelTensorShape, std::string> result = get_output_shape(attrs, par_input);
tl::expected<ParallelTensorShape, std::string> correct = make_output(SumDegree{1}, DiscardCopyDegree{1}, degree, 1);
CHECK(result == correct);
}
{
tl::expected<ParallelTensorShape, std::string> result = get_weights_shape(attrs, par_input);
tl::expected<ParallelTensorShape, std::string> correct = make_weights(SumDegree{1}, DiscardCopyDegree{degree}, 1, 1);
CHECK(result == correct);
}
}
SUBCASE("input features parallelism") {
int degree = 4;
ParallelTensorShape input = make_input(SumDegree{1}, DiscardCopyDegree{1}, 1, degree);
{
tl::expected<ParallelTensorShape, std::string> result = get_output_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_output(SumDegree{degree}, DiscardCopyDegree{1}, 1, 1);
CHECK(result == correct);
}
{
tl::expected<ParallelTensorShape, std::string> result = get_weights_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_weights(SumDegree{1}, DiscardCopyDegree{degree}, 1, 1);
CHECK(result == correct);
}
}
SUBCASE("output channel shard parallelism") {
// NOTE (@lockshaw): in the current (parallel shape inference from just input tensor) representation we have to choose between
// either parallelism in the weight channel dimension or in the weight entry dimension. For now we choose to represent
// parallelism in the channel dimension, but partitioning in the entry dimension is also potentially useful as it produces
// sum parallelism in the output
int degree = 4;
ParallelTensorShape input = make_input(SumDegree{1}, DiscardCopyDegree{degree}, 1, 1);
{
tl::expected<ParallelTensorShape, std::string> result = get_output_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_output(SumDegree{1}, DiscardCopyDegree{1}, 1, degree);
CHECK(result == correct);
}
{
tl::expected<ParallelTensorShape, std::string> result = get_weights_shape(attrs, input);
tl::expected<ParallelTensorShape, std::string> correct = make_weights(SumDegree{1}, DiscardCopyDegree{1}, 1, degree);
CHECK(result == correct);
}
}
, as it makes it clearer what is actually being tested

I'm still unsure on how to model the kernel tensor (and how to infer the shape). Is it only (width, height) or do I also need to include the input channel size as a dimension?

@lockshaw
Copy link
Copy Markdown
Owner

lockshaw commented Jun 7, 2024

lib/op-attrs/test/src/test_pool_2d.cc line 63 at r2 (raw file):

Previously, Marsella8 wrote…

I'm still unsure on how to model the kernel tensor (and how to infer the shape). Is it only (width, height) or do I also need to include the input channel size as a dimension?

I don't think you have a kernel tensor here, as Pool2D doesn't have any weights, so all you'd need to check is get_output_shape. Other operators, e.g., Conv2D, have one or more weights, so they have additional functions to check (for Conv2D, get_kernel_shape, get_bias_shape, get_output_shape)

@lockshaw
Copy link
Copy Markdown
Owner

Superseded by flexflow#1495

@lockshaw lockshaw closed this Sep 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants