2975 Fix the perf issue of RandCropByPosNegLabel #3050

Nic-Ma · 2021-09-30T03:06:56Z

Fixes #2975 .

Description

This PR is a follow-up to #3038 and fixes the training slowdown.
Training speed now matches the numpy-version benchmark of the 0.7 release (56s-58s with the 21.08 docker, 52s-54s with the 21.06 docker).
The main change is to avoid storing the indices on the GPU, because we need to read each value on the CPU with item() and use it to index the image for cropping.
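
To illustrate the idea, here is a minimal sketch of computing foreground / background indices and keeping them on the CPU. It is not the code changed in this PR: the helper name `fg_bg_indices_on_cpu` and the `threshold` parameter are hypothetical, and the label may live on any device.

```python
import torch


def fg_bg_indices_on_cpu(label: torch.Tensor, threshold: float = 0.0):
    """Hypothetical helper: flat fg / bg indices of `label`, returned as CPU tensors."""
    flat = label.reshape(-1)
    fg = torch.nonzero(flat > threshold).squeeze(1)
    bg = torch.nonzero(flat <= threshold).squeeze(1)
    # Transfer once here, so that later per-sample .item() reads
    # do not each force a device-to-host copy.
    return fg.cpu(), bg.cpu()


label = torch.tensor([[0, 1, 0], [1, 1, 0]])
fg, bg = fg_bg_indices_on_cpu(label)
```

With indices already on the CPU, picking a crop center only needs a plain `fg[i].item()` read, whatever device the image is on.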

Status

Ready

Types of changes

Non-breaking change (fix or new feature that would not break existing functionality).
Breaking change (fix or new feature that would cause existing functionality to change).
New tests added to cover the changes.
Integration tests passed locally by running ./runtests.sh -f -u --net --coverage.
Quick tests passed locally by running ./runtests.sh --quick --unittests.
In-line docstrings updated.
Documentation updated, tested make html command in the docs/ folder.

merge master

Signed-off-by: Nic Ma <nma@nvidia.com>

Nic-Ma · 2021-09-30T03:07:08Z

/black

Signed-off-by: Nic Ma <nma@nvidia.com>

Nic-Ma · 2021-09-30T05:54:56Z

/black

Nic-Ma · 2021-09-30T06:00:20Z

BTW, the numpy version of unravel_index is slightly faster than the PyTorch version, so running the fast training tutorial with numpy data makes the total time about 1s faster.

Thanks.

Signed-off-by: Nic Ma <nma@nvidia.com>

Nic-Ma · 2021-09-30T06:23:22Z

/black

rijobro · 2021-09-30T09:04:46Z

> Now the training speed is same as the numpy version benchmark of 0.7 release

So with the data on the GPU, we're only as fast as the numpy implementation with all on the CPU?

> BTW, as the numpy version unravel_index is slightly faster than PyTorch version

We could change the logic to only use torch if the data is already on the GPU. If on the CPU, use numpy regardless of whether input was torch or numpy:

if isinstance(x, torch.Tensor) and x.device.type != "cpu":
    ...  # data already on the GPU: use the torch unravel_index
else:
    ...  # CPU tensor or numpy array: use np.unravel_index
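
A runnable sketch of that dispatch might look like the following. Both function names are hypothetical, and the torch path is a hand-written unravel rather than any particular library call, so treat it as an illustration of the suggestion and not as the project's implementation.

```python
import numpy as np
import torch


def unravel_torch(idx: torch.Tensor, shape):
    """Hypothetical torch-only unravel: flat index -> per-dimension coordinates."""
    coords = []
    for dim in reversed(shape):
        coords.append(idx % dim)  # coordinate along the current (last) dimension
        idx = torch.div(idx, dim, rounding_mode="floor")
    return torch.stack(coords[::-1])


def unravel_index_any(idx, shape):
    """Hypothetical dispatcher: torch only for non-CPU tensors, numpy otherwise."""
    if isinstance(idx, torch.Tensor) and idx.device.type != "cpu":
        return unravel_torch(idx, shape)
    if isinstance(idx, torch.Tensor):
        idx = idx.numpy()  # CPU tensor: hand the data to numpy
    return np.asarray(np.unravel_index(idx, shape))
```

Note that the device check compares `device.type` with a string; an identity test such as `x.device is not torch.device("cpu")` would always be true, because each call creates a new device object. The return type also depends on the branch taken, which is one reason a single backend per input type can be simpler.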

Nic-Ma · 2021-09-30T14:18:30Z

Hi @rijobro ,

Thanks for your review.
There are two kinds of data in this case: the image data and the bg / fg indices.
Moving the image to the GPU helps accelerate training.
But putting the bg / fg indices on the GPU doesn't help, because when we use the indices to randomly crop the image, we need to bring each index value back to the CPU with Tensor.item(). That's why I think we should always keep the indices on the CPU.
As for the unravel_index question, I suggest keeping the logic simple and straightforward until the backends are complete for all the transforms: just use torch APIs for tensors and numpy APIs for numpy arrays. The difference is slight for now, and a later PyTorch version may improve it.

What do you think?

Thanks.
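
The point about needing the index value on the CPU can be sketched as follows. This is a simplified illustration, not the transform's actual code: `crop_around_flat_index` is a hypothetical helper, the image is assumed to be channel-first, and each ROI size is assumed to fit inside the image.

```python
import numpy as np


def crop_around_flat_index(image, flat_index, roi_size):
    """Hypothetical helper: crop a channel-first image around a flat spatial index."""
    spatial_shape = image.shape[1:]
    # int(...) is where a tensor index would be read on the host (like .item()).
    center = np.unravel_index(int(flat_index), spatial_shape)
    slices = [slice(None)]  # keep all channels
    for c, r, s in zip(center, roi_size, spatial_shape):
        start = min(max(c - r // 2, 0), s - r)  # clamp so the ROI stays in bounds
        slices.append(slice(start, start + r))
    return image[tuple(slices)]


image = np.arange(36).reshape(1, 6, 6)
patch = crop_around_flat_index(image, 14, (2, 2))
```

Slice bounds are plain Python integers, so the chosen index has to be available on the host whether the image is a numpy array or a GPU tensor.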

rijobro · 2021-09-30T15:57:09Z

@Nic-Ma sounds good, thanks for the explanations!

Signed-off-by: Nic Ma <nma@nvidia.com>

Nic-Ma and others added 7 commits February 1, 2021 19:15

Merge pull request #19 from Project-MONAI/master

42a45e0

merge master

Merge pull request #32 from Project-MONAI/master

cd16a13

merge master

Merge pull request #180 from Project-MONAI/dev

6f87afd

merge master

Merge pull request #214 from Project-MONAI/dev

f398298

merge master

Merge pull request #230 from Project-MONAI/dev

ac59586

merge master

Merge pull request #231 from Project-MONAI/dev

2a9d018

merge master

[DLMED] fix crop performance issue

1159655

Signed-off-by: Nic Ma <nma@nvidia.com>

Nic-Ma added 2 commits September 30, 2021 11:52

[DLMED] fix CI test

3f45312

Signed-off-by: Nic Ma <nma@nvidia.com>

[DLMED] use default to avoid overflow

5117aaf

Signed-off-by: Nic Ma <nma@nvidia.com>

[DLMED] fix CI error

ca674c4

Signed-off-by: Nic Ma <nma@nvidia.com>

rijobro approved these changes Sep 30, 2021

Nic-Ma enabled auto-merge (squash) September 30, 2021 14:18

Merge branch 'dev' into 2975-fix-spatial-crop

d20ffb9

Nic-Ma added 2 commits October 1, 2021 08:22

Merge branch 'dev' into 2975-fix-spatial-crop

01cf1ad

[DLMED] fix CI test issue

66c8ffa

Signed-off-by: Nic Ma <nma@nvidia.com>

Nic-Ma merged commit 7c46f8e into Project-MONAI:dev Oct 1, 2021
