Load modules lazily#575
Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
@hakonanes This is very nice, thanks for helping with this! Is orix.quaternion then compiled the first time that it is used? I'm trying to remember exactly how that works. So something like this? (Edit: I guess the numba code isn't really the slow part. It's just matplotlib.)
from orix.quaternion import Rotation  # this is now really fast
r = Rotation([1, 2, 3, 4])  # this is now slow because the numba code is compiling
r2 = Rotation([0, 0, 0, 1])  # this is now fast
r * r2  # this is also fast
|
I guess my one gripe with lazy loading is that it feels like it just moves the blame around for who takes a long time to import. People used to get frustrated with how long it took hyperspy to load, when most of the time was spent in matplotlib, scipy, etc. But all of those packages were using lazy loading, so they weren't being blamed, because importing them didn't take any time at all! I guess the last package to not implement lazy loading gets stuck "holding the bag", so to speak.
Most of the time is spent loading I guess my question is are we actually lazily loading anything or does almost every class import The other question is: does it make sense to make contributing to the package more difficult just to make users feel better?
Just some thoughts. I like the idea of lazy loading, but sometimes I think it's almost like a shell game of hiding your imports.
I started to look at lazy loading in the last couple of days because pyxem is slow to load, which is particularly inconvenient when testing code changes, running the test suite, etc. In my case, there are several reasons for it to be slow:
Generally, lazy imports are very useful because they defer importing libraries until they are actually needed, and quite often these libraries are not needed at all. Also, they are usually needed in functions which are not particularly fast.
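To make the "defer the import until it is needed" idea concrete, here is a minimal sketch using only the standard library and a module-level `__getattr__` (PEP 562). It is an illustration of the general mechanism, not how lazy-loader or orix implement it; `make_lazy_namespace` and the name mapping are hypothetical.

```python
import importlib
import sys
import types


def make_lazy_namespace(name, submodules):
    """Return a module object that imports the listed modules on first access.

    `submodules` maps an attribute name to the module that should be imported
    when that attribute is first requested.
    """
    namespace = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails
        if attr in submodules:
            module = importlib.import_module(submodules[attr])
            setattr(namespace, attr, module)  # cache for later accesses
            return module
        raise AttributeError(f"module {name!r} has no attribute {attr!r}")

    namespace.__getattr__ = __getattr__
    return namespace


# Nothing is imported when the namespace is created ...
lazy = make_lazy_namespace("demo", {"colors": "colorsys"})
# ... the import only happens on first attribute access
hsv = lazy.colors.rgb_to_hsv(1.0, 0.0, 0.0)
```

The cost of importing does not disappear; it is paid on first use instead of at `import` time, which is the trade-off discussed above.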
Valid points, @CSSFrancis. Introducing lazy loading may not be revolutionary for the end user who uses functionality that uses most "heavy" modules. But as @ericpre points out, it would be a benefit to downstream packages, like pyxem.
Yes. Keeping the |
CSSFrancis
left a comment
Some small comments, otherwise this is good! I must say a lot of my hesitation is gone with the .pyi files. I hadn't realized that was another option for defining lazy imports. I don't know if it is less complicated than it was, but it feels less complicated.
I think I'm convinced :) I'll try to make the changes to pyxem/diffsims as well, which should help.
| author = "orix developers" | ||
| copyright = f"2018-{str(datetime.now().year)}, {author}" | ||
| release = orix.__version__ | ||
| release = orix_version |
Full project version: https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-release
| New imports go in the ``__init__.pyi`` "stub files", *not* in the ``__init__.py`` files. | ||
| Basically, nothing should go in the ``__init__.py`` files except the lazy loading | ||
| functionality. | ||
|
|
Maybe add something like:
| In the __init__.py file: | |
| .. code-block:: python | |
| import lazy_loader | |
| __getattr__, __dir__, __all__ = lazy_loader.attach_stub(__name__, __file__) | |
| del lazy_loader |
It might also be good to explain how the __init__.pyi files should be set up. __getattr__ is for attribute access and __dir__ is for tab completion, right?
I've added the following explanation:
The returns from lazy loader are:
- ``__getattr__``: function to access names defined by the module
- ``__dir__``: list of names a module defines
- ``__all__``: list of module, class, or function names that should be imported when
``from package import *`` is encountered
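On how an ``__init__.pyi`` stub is set up: it lists ordinary relative imports, and the loader reads those to learn which names to resolve on demand. The following is a simplified standard-library sketch of that reading step, not lazy-loader's actual implementation; the stub contents and the `parse_stub` helper are illustrative and not copied from this PR.

```python
import ast

# Illustrative contents of a package's __init__.pyi (hypothetical layout)
STUB = '''
from . import io
from .quaternion import Orientation, Rotation

__all__ = ["io", "Orientation", "Rotation"]
'''


def parse_stub(source):
    """Collect lazily loadable submodules and attributes from stub source.

    Returns (submodules, attributes), where `attributes` maps a submodule
    name to the names that should be importable from it.
    """
    submodules = []
    attributes = {}
    for node in ast.walk(ast.parse(source)):
        # Only single-dot relative imports are considered in this sketch
        if isinstance(node, ast.ImportFrom) and node.level == 1:
            names = [alias.name for alias in node.names]
            if node.module is None:  # "from . import io"
                submodules.extend(names)
            else:  # "from .quaternion import Rotation"
                attributes[node.module] = names
    return submodules, attributes


submodules, attributes = parse_stub(STUB)
```

Because the stub is a regular type stub, editors and type checkers can also read it, so tab completion and static analysis keep working even though ``__init__.py`` itself contains no imports.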
|
Thank you for the review, @CSSFrancis! I'll merge once tests pass. |
|
Actually, before that, @argerlt, do you want to have a look? |
…m#572) Signed-off-by: Håkon Wiik Ånes <hwaanes@gmail.com>
Description of the change
This PR addresses #572 by loading modules lazily using lazy-loader. This becomes a new required dependency.
A new contributor guide briefly explains lazy loading and why we use it.
Running the following line, which requires the tuna package,
python -X importtime -c "import orix.quaternion" 2> oqu.txt && tuna oqu.txt
produces the following browser output without lazy loading:
And with lazy loading (this PR):
We see a speed-up of roughly 33x for the quaternion module.
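For a number instead of a browser view, the same `-X importtime` log can be read directly. This is a rough sketch assuming the log format CPython writes to stderr ("import time: self | cumulative | name"); the helper name `cumulative_import_time_us` is hypothetical, and the example uses a standard-library module so it runs without orix installed.

```python
import subprocess
import sys


def cumulative_import_time_us(module_name):
    """Return the cumulative import time of a module in microseconds.

    Runs a fresh interpreter with -X importtime and returns None if the
    module does not show up in the log.
    """
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    prefix = "import time:"
    for line in result.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        # Fields are: self time [us] | cumulative time [us] | module name
        fields = line[len(prefix):].split("|")
        if len(fields) == 3 and fields[2].strip() == module_name:
            return int(fields[1])
    return None


# With orix installed, "orix.quaternion" could be passed instead
print(cumulative_import_time_us("json"))
```

Running this on the branch with and without lazy loading would give the two numbers behind the reported speed-up, although timings vary between runs and machines.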
Other minor changes:
Progress of the PR
For reviewers
__init__.py.
section in CHANGELOG.rst.
__credits__ in orix/__init__.py and in .zenodo.json.