Skip to content

Conversation

@ericspod
Copy link
Member

Signed-off-by: Eric Kerfoot eric.kerfoot@kcl.ac.uk

Fixes #1160.

Description

This introduces a number of changes to networks and their components to be compatible with Torchscript. Every network is being tested, except DynUNet, by converting it to a Torchscript script then back to a network before running a forward pass of synthetic data through it. This should ensure that a saved network represents a valid Pytorch object which can be reloaded and used without needing the original codebase. Going forward new networks need to check for Torchscript compatibility by including the sort of tests that have been added here, avoid using super in their definition, avoiding inheritance definitions that require super, and iterating over indices (eg. for i in range(t.shape[0])) in favour of using enumerate and zip when traversing lists.

The DynUNet is not currently compatible because it relies on iteration through lists of modules or the ListModule class. These are not currently compatible with Torchscript so a different architecture for this class is needed. It's possible to inherit from the existing UNet which is structured differently for this reason, however Torchscript does also have issues with the super keyword so a little more design thought is needed to account for the variation in the DynUNet.forward method which would need to use super in such a case.

Changes were made to the squeeze and excitation classes to also account for Torchscript's lack of support for forward. Other things not supported and removed include the reverse function and Tensor.ndimension method.

The return types for some methods of UNet were also made more general to accommodate future subtypes which might override these methods to return different subtypes of nn.Module. Other classes might need to be modified in a similar way in the future for the same reason.

Status

Ready

Types of changes

  • Non-breaking change (fix or new feature that would not break existing functionality).
  • Breaking change (fix or new feature that would cause existing functionality to change).
  • New tests added to cover the changes.
  • Integration tests passed locally by running ./runtests.sh --codeformat --coverage.
  • Quick tests passed locally by running ./runtests.sh --quick.
  • In-line docstrings updated.
  • Documentation updated, tested make html command in the docs/ folder.

Signed-off-by: Eric Kerfoot <eric.kerfoot@kcl.ac.uk>
@ericspod ericspod requested review from Nic-Ma and wyli November 18, 2020 13:56
@wyli
Copy link
Contributor

wyli commented Nov 18, 2020

/integration-test

@wyli
Copy link
Contributor

wyli commented Nov 18, 2020

@ericspod the timeout limit is set to 60min, so I don't think the windows test fails because of that. I suspect it's multiple test cases open the same pretrained model files but not closing them properly which might cause issue on Windows

@ericspod ericspod merged commit 71b6794 into master Nov 18, 2020
@ericspod ericspod deleted the torchscript_fix branch November 18, 2020 14:43
@rijobro
Copy link
Contributor

rijobro commented Nov 19, 2020

@wyli where is the timeout set? I've just run locally and the failing test takes over 60 seconds, I wondered if there was a mix up between minutes and seconds in the limit? Not likely, I know, but just thought I'd check!

For info:

test_script_4 (test_senet.TestSENET) (25.7s)
test_script_1 (test_senet.TestSENET) (26.6s)
test_script_3 (test_senet.TestSENET) (46.1s)
test_script_2 (test_senet.TestSENET) (49.5s)
test_script_5 (test_senet.TestSENET) (50.4s)
test_script_0 (test_senet.TestSENET) (84.1s)

@wyli
Copy link
Contributor

wyli commented Nov 19, 2020

shouldn't be the issue, the setting is here:

timeout-minutes: 60
@rijobro

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add TorchScript support for networks

4 participants