@larsoner
Member

@larsoner larsoner commented Dec 20, 2019

I think we should move to using tqdm instead of our own internal implementation. I think it's more stable in the long run and less code for us to maintain. It also gets us nice things like time estimates for completion for free.

This PR puts tqdm in externals and uses it for progress bars. Some sample code to test:

import os
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
import mne
from mne.datasets import sample
from mne.decoding import GeneralizingEstimator

# All the functions that currently use ProgressBar
try:
    os.remove('/tmp/null.part')
except Exception:
    pass
mne.utils._fetch_file('https://github.com/mne-tools/mne-python/archive/v0.19.1.zip', '/tmp/null', verbose=True)

data_path = sample.data_path()
raw_fname = data_path + '/MEG/sample/sample_audvis_filt-0-40_raw.fif'
events_fname = data_path + '/MEG/sample/sample_audvis_filt-0-40_raw-eve.fif'
raw = mne.io.read_raw_fif(raw_fname, preload=True)
picks = mne.pick_types(raw.info, meg=True, exclude='bads')  # Pick MEG channels
events = mne.read_events(events_fname)
event_id = {'Auditory/Left': 1, 'Auditory/Right': 2,
            'Visual/Left': 3, 'Visual/Right': 4}
epochs = mne.Epochs(raw, events, event_id, picks=picks, preload=True)
clf = make_pipeline(StandardScaler(), LogisticRegression(solver='lbfgs'))
time_gen = GeneralizingEstimator(clf, scoring='roc_auc', n_jobs=2,
                                 verbose=True)
time_gen.fit(X=epochs['Left'].get_data(),
             y=epochs['Left'].events[:, 2] > 2)
scores = time_gen.score(X=epochs['Right'].get_data(),
                        y=epochs['Right'].events[:, 2] > 2)

Produces on master:

[............................................................] 100.00% ( 60.7 MB,  20.9 MB/s) -
File saved as /tmp/null.

[............................................................] 100.00% Fitting GeneralizingEstimator |
[............................................................] 100.00% Scoring GeneralizingEstimator |

And on this PR:

Downloading https://codeload.github.com/mne-tools/mne-python/zip/v0.19.1 (60.7 MB)
100%|██████████████████████████████████████████| 60.7M/60.7M [00:03<00:00, 16.2MB/s]
File saved as /tmp/null.

100%|██████████████| Fitting GeneralizingEstimator :106/106 [00:01<00:00, 70.92it/s]
100%|████████| Scoring GeneralizingEstimator : 11236/11236 [00:10<00:00, 1034.17it/s]

tqdm is very flexible in terms of how it displays, so we can tweak this aesthetically if we want.

Closes #7418

@agramfort
Member

agramfort commented Dec 20, 2019 via email

@larsoner
Member Author

From a quick look, it appears to be "try to import tqdm, and if we can't, use our own basic implementation". The advantages I see over us vendoring tqdm are that:

  1. We would benefit from tqdm updates without us having to update our own copies
  2. We don't add ~3k lines of code to externals

The downsides would be:

  1. We end up maintaining our own version of some progress-bar-like thing
  2. No guaranteed end-user experience (in fact, guaranteed sub-optimal experience if they don't have tqdm)

So on balance I think it's better to add to externals.

There is a third approach -- if we're going to have some fallback experience, I'd rather run mv mne/externals/tqdm mne/externals/_tqdm in this PR and then do this in mne/externals:

try:
    import tqdm
except ImportError:
    import _tqdm as tqdm

i.e., make our fallback version some known vendored version of tqdm. I don't see any advantage of the pytorch approach over this third one.

@codecov

codecov bot commented Dec 20, 2019

Codecov Report

Merging #7155 into master will decrease coverage by 0.00%.
The diff coverage is 98.13%.

@@            Coverage Diff             @@
##           master    #7155      +/-   ##
==========================================
- Coverage   90.08%   90.08%   -0.01%     
==========================================
  Files         454      453       -1     
  Lines       82610    82566      -44     
  Branches    13066    13060       -6     
==========================================
- Hits        74422    74379      -43     
+ Misses       5361     5359       -2     
- Partials     2827     2828       +1     

@cbrnr
Contributor

cbrnr commented Dec 20, 2019

👍 for using tqdm. What is the idea of mne/externals? To avoid dependencies that need to be user-installed? Are packages in mne/externals ever synced with upstream?

@larsoner
Member Author

What is the idea of mne/externals? To avoid dependencies that need to be user-installed?

Yes

Are packages in mne/externals ever synced with upstream?

Usually we only bother when we find bugs

@jasmainak
Member

tqdm is a very light dependency and I have almost never had any issues installing it.

@larsoner
Member Author

I guess a fourth option would be to add tqdm to our requirements.txt and environment.yml, and then in ProgressBar try importing it and if it's not present, just don't show a progressbar. That's a simple way to go.

@jasmainak
Member

@larsoner this will likely break autoreject because it uses mne.ProgressBar.

@larsoner
Member Author

@jasmainak the API is only changed a little bit. Looks like the autoreject use is:

pbar = ProgressBar(iterable, mesg=desc, spinner=True)

spinner isn't relevant anymore but it can easily be discarded from kwargs, so I'll just do that. Then you should be okay, right?
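
A minimal sketch of what "discarding spinner from kwargs" could look like, for illustration only. The class name CompatProgressBar is hypothetical and this is not MNE's actual ProgressBar code. Rejecting any other unknown keyword is also an assumption made for the sketch.

```python
class CompatProgressBar:
    """Hypothetical stand-in for a ProgressBar that tolerates old call sites."""

    def __init__(self, iterable, mesg=None, **kwargs):
        # 'spinner' was accepted by the old implementation; drop it silently
        # so existing callers (such as the autoreject call above) keep working.
        kwargs.pop("spinner", None)
        if kwargs:
            # Assumption: any other unexpected keyword is still an error.
            raise TypeError(f"Unexpected keyword arguments: {sorted(kwargs)}")
        self.iterable = iterable
        self.mesg = mesg

    def __iter__(self):
        return iter(self.iterable)


# Same call shape as the autoreject usage quoted above.
pbar = CompatProgressBar(range(3), mesg="Fitting", spinner=True)
items = list(pbar)
```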

@jasmainak
Member

okay I see and tqdm now becomes a dependency?

@larsoner
Member Author

That's one of the four(ish?) options discussed above. I think the cleanest is probably to make tqdm a soft dependency: if you want progressbars in MNE you need it, without it you don't get any. And we can add it to requirements.txt and environment.yml.

@larsoner
Member Author

I think the cleanest is probably to make tqdm a soft dependency: if you want progressbars in MNE you need it, without it you don't get any. And we can add it to requirements.txt and environment.yml.

I started implementing this and I actually don't like it -- it makes us put in logic for whether or not tqdm actually exists because we need things like ProgressBar.__iter__ to work. Currently we can just wrap to tqdm's iter, but if tqdm isn't there, we have to iterate ourselves, etc.
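
To illustrate the kind of branching described here, a rough sketch under stated assumptions follows. SoftProgressBar is a hypothetical name, not the class in this PR. It only covers __iter__, and a real wrapper would need a similar branch in every other method.

```python
# Hypothetical wrapper, not MNE's actual ProgressBar implementation.
try:
    from tqdm import tqdm  # optional (soft) dependency
except ImportError:
    tqdm = None


class SoftProgressBar:
    """Iterate with a tqdm bar when tqdm is installed, silently otherwise."""

    def __init__(self, iterable, mesg=None):
        self.iterable = iterable
        self.mesg = mesg

    def __iter__(self):
        if tqdm is not None:
            # Easy case: delegate iteration (and display) to tqdm.
            yield from tqdm(self.iterable, desc=self.mesg)
        else:
            # Fallback case: iterate ourselves and show nothing.
            yield from self.iterable


squares = [x * x for x in SoftProgressBar(range(5), mesg="Squaring")]
```

The result is the same with or without tqdm installed; only the display differs, which is the "soft dependency" behavior being weighed in this comment.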

So I'd prefer to try to use the system one, and fall back to the copy in mne/externals if the system one is not present.

@larsoner larsoner changed the title from "MAINT, ENH: Move to tqdm" to "MRG, ENH: Move to tqdm" Dec 21, 2019
@jasmainak
Member

I see, since you are putting a fallback option in externals I guess we don't need to do anything on autoreject end!

@agramfort
Member

@larsoner are you sure that our ProgressBar was the cause of the travis hanging?

@larsoner
Member Author

I don't think that it was.

I still think we get both a maintainability and a usability benefit from this PR.

spinner : bool
Show a spinner. Useful for long-running processes that may not
increment the progress bar very often. This provides the user with
feedback that the progress has not stalled.
Member

Do we lose this benefit with tqdm?

Member Author

It doesn't do a spinner. It does continuous time estimates, which are better and still update continuously.
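
As a rough illustration of why a continuously updated time estimate gives the same "not stalled" feedback as a spinner, here is a naive estimate based on the average rate so far. This is an assumption-laden sketch and not tqdm's actual algorithm; the function name estimate_remaining is hypothetical.

```python
def estimate_remaining(n_done, n_total, elapsed):
    """Naive remaining time in seconds, from the average rate so far."""
    if n_done <= 0 or elapsed <= 0:
        return None  # no rate information yet
    rate = n_done / elapsed  # iterations per second
    return (n_total - n_done) / rate


# Halfway through 100 iterations after 2 seconds of work.
remaining = estimate_remaining(50, 100, 2.0)
```

Because elapsed time keeps growing even when n_done does not, an estimate like this can be refreshed on a timer and still changes between iterations.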

@jasmainak
Member

@larsoner does the progressbar still work in notebooks?

@larsoner
Member Author

@larsoner does the progressbar still work in notebooks?

Haven't tested it. I'm willing to trust tqdm to do whatever is best here. But feel free to try it. I know they have some fancy stylized HTML stuff. Maybe it's automagical?

@agramfort
Member

agramfort commented Dec 26, 2019 via email

@larsoner
Member Author

Why fix something that is not broken?

I actually started out updating ProgressBar to do some nicer things (such as have better aesthetics), found one or two bugs (spinner symbols showing up when they shouldn't; couldn't change the characters or spacing easily), and realized that tqdm did not have these bugs.

I see little benefit and a 3000-line PR with a new dependency.

There are more red than green lines to code we maintain. The +3000 is externals. So this really should lower our maintenance burden.

So from my perspective:

  1. Less code for us to maintain. Yes we have to do externals but this is easier than fixing bugs and adding features ourselves.
  2. Additional features that we don't have. These include time estimates (which are awesome IMO) and notebook support (which you have to activate, but still).

Having stuff in externals is not free either, as it generates extra packaging cost for Debian maintainers.

Sure, that's understandable. I still think taking everything into account it's a lower maintenance cost for us overall.

@cbrnr
Contributor

cbrnr commented Jan 6, 2020

Re packaging costs, would it be a good idea to enable using (importing) the real external package if installed? Only if the package is not installed would we resort to the version from externals.

@larsoner
Member Author

larsoner commented Jan 6, 2020

Re packaging costs, would it be a good idea to enable using (importing) the real external package if installed? Only if the package is not installed would we resort to the version from externals.

That is what's currently done in this version of this PR.

I think overall this approach reduces our maintenance costs compared to what we currently have in master, as we can rely on tqdm to maintain progress bar code (and just copy-paste updates to our vendored version if we need them, which should be rare and easy when needed) and brings us usability improvements along with it.

Member

@drammock drammock left a comment

Other than a couple comments LGTM. In general I would prefer a soft dependency where missing system tqdm would just not show progress bars at all, but @larsoner said that added a bunch of ugly conditionals into MNE code so I'm fine with having a frozen version of tqdm in externals instead, as it makes the code that we actively maintain cleaner.

@cbrnr
Contributor

cbrnr commented Jan 8, 2020

Re packaging costs, would it be a good idea to enable using (importing) the real external package if installed? Only if the package is not installed would we resort to the version from externals.

That is what's currently done in this version of this PR.

@larsoner alright, so this is perfect then! I didn't check this PR for whatever reason but some other externals dependency (I think it was pymatreader) where this is not the case. I think it would be nice if all dependencies in externals were really optional and the real packages would be preferred if installed.

@larsoner
Member Author

Looks like this was fixed in 4.36 so we just need to set some minimum version:

tqdm/tqdm@5d373de

@drammock can you verify that you're on 4.35 or lower?

@drammock
Member

oops. Yes, I'm on an old version of tqdm (version 4.32.1). Updating tqdm fixed it.

@larsoner
Member Author

Pushed a commit to do a version check (use our externals if the system version is too old) and also style "fixes" to unify how the tqdm and native versions look, and update description of #7418 to reflect the changes

@cbrnr
Contributor

cbrnr commented Mar 11, 2020

I'm 👍 for merging this PR (instead of #7418). This seems to be ready so let's do it (I'm already waiting really hard for 0.20 to be released)!

@larsoner
Member Author

Okay used a simpler mechanism for our parallel progress bar support, this is how it looks on Jupyter notebooks now (doesn't even work in master currently, let alone a nice HTML representation):

Screenshot from 2020-03-18 14-11-45

@agramfort
Member

ok no more bugs to report to @larsoner on my side.

Anyone else want to test before we merge?

@larsoner
Member Author

@drammock do you want to try the snippet above in terminal and notebook and merge if you're happy?

@drammock drammock merged commit 361fa5f into mne-tools:master Mar 19, 2020
@drammock
Member

thanks @larsoner!

@larsoner larsoner deleted the tqdm branch March 19, 2020 17:42
@agramfort
Member

agramfort commented Mar 19, 2020 via email

@massich
Contributor

massich commented Mar 19, 2020

I liked reading this one !! kudos to everyone !! :)

6 participants