[Docs] update the PT 2.0 optimization doc with latest findings #3370

Conversation
The documentation is not available anymore as the PR was closed or merged.
> We conducted a comprehensive benchmark with PyTorch 2.0's efficient attention implementation and `torch.compile` across different GPUs and different batch sizes for five of our most used pipelines.
> In the following tables, we report our findings in terms of the number of iterations processed per second.
>
> ### A100 (batch size: 1)
Let's also add RTX 4090 and T4, no? I think people will be quite interested in the home GPUs here.
There is a reason why the PR is in draft mode :-)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
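As background on the hunk quoted above: PyTorch 2.0's efficient attention is the fused `torch.nn.functional.scaled_dot_product_attention` kernel. A minimal sketch of calling it directly (the tensor shapes here are illustrative, not taken from the benchmark):

```python
import torch
import torch.nn.functional as F

# Illustrative shapes: (batch, heads, sequence length, head dim).
q = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
k = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)
v = torch.randn(2, 8, 1024, 64, device="cuda", dtype=torch.float16)

# PyTorch dispatches to a fused implementation (FlashAttention or
# memory-efficient attention) when the inputs allow it.
out = F.scaled_dot_product_attention(q, k, v)
```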
@patrickvonplaten @pcuenca this is now ready for review!
pcuenca left a comment:
Awesome! Do you want me to regenerate some of the plots we used for the blog post? Let me know if you'd like to use any of them here and I'll prepare them.
```diff
-## Using accelerated transformers and torch.compile.
+## Using accelerated transformers and `torch.compile`.
```
I think the backquotes were not properly shown in headings when the docs were generated? Let's just keep an eye on it and remove them if we see anything weird after merging :)
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
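For readers skimming the thread: the section being edited documents compiling the pipeline's UNet with `torch.compile`. A minimal sketch of that pattern (the checkpoint ID and prompt are illustrative):

```python
import torch
from diffusers import DiffusionPipeline

# Illustrative checkpoint; any Stable Diffusion model works the same way.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Compile the UNet, the main compute bottleneck of the denoising loop.
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

# The first call triggers compilation; subsequent calls reuse the compiled graph.
image = pipe("a photo of an astronaut riding a horse on mars").images[0]
```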
Sure, that'd be very helpful!
patrickvonplaten left a comment:
Love it 😍 Super clean and nicely written.
I'm sure the community will love this!
Some graphs could definitely help (maybe we could use those as well to update the official PyTorch blog post, cc @pcuenca).
Co-authored-by: Pedro <pedro@huggingface.co>
For the failing test: #3397 (comment)
pcuenca left a comment:
Amazing, thanks a lot @sayakpaul!
```diff
-Depending on the type of GPU, `compile()` can yield between 2-9% of _additional speed-up_ over the accelerated transformer optimizations. Note, however, that compilation is able to squeeze more performance improvements in more recent GPU architectures such as Ampere (A100, 3090), Ada (4090) and Hopper (H100).
+Depending on the type of GPU, `compile()` can yield between **3% - 56%** of _additional speed-up_ over the accelerated transformer optimizations. Note, however, that compilation is able to squeeze more performance improvements in more recent GPU architectures such as Ampere (A100, 3090), Ada (4090) and Hopper (H100).
```
It's actually much more than that, especially in the case of IF. I think the percent computation was wrong in the Excel sheet, so I changed it. For example, on A100 (batch size 1), txt2img in the stable version goes from 21.66 it/s to 44.03 it/s, which is a bit more than double, so the percent improvement is 103%. Can you please double-check, @sayakpaul?
We could, for example, say that we get up to twice as many iterations per second, or almost 5x in the case of IF Stage I.
I'm also curious and surprised that the improvement is so big (especially in IF)! Do you have any insight on that?
I computed `100 * (b - a) / b`, following https://docs.google.com/spreadsheets/d/1LrltKSgZyOZiLQ7n7_GvIl_BoED-AHeVOJYBc_QqxXA/edit#gid=0

> I'm also curious and surprised that the improvement is so big (especially in IF)! Do you have any insight on that?

It's probably just better suited for tiling and fusion, which together might improve the overall arithmetic density while minimizing memory transfers. But we'd need to profile to say for sure.
You're actually right. Let me edit and merge.
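To make the discrepancy concrete, a quick sketch contrasting the two formulas with the A100 txt2img numbers quoted above:

```python
baseline = 21.66  # it/s without compilation (quoted in the thread)
compiled = 44.03  # it/s with torch.compile

# Spreadsheet formula: change measured relative to the *new* value.
pct_of_new = 100 * (compiled - baseline) / compiled   # ~50.8%

# Conventional percent improvement: change relative to the *baseline*.
pct_of_old = 100 * (compiled - baseline) / baseline   # ~103.3%

print(f"{pct_of_new:.1f}% vs. {pct_of_old:.1f}%")
```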
We may want to update this section in a follow-up PR: https://huggingface.co/docs/diffusers/stable_diffusion#next-steps. Writing it down here so I don't forget.
Cool, should we merge this one?
Waiting for @pcuenca to clarify: