@larsoner
Member

@larsoner larsoner commented Oct 9, 2019

At the sprint @olafhauk mentioned that not having -autobad support was what he was missing in his Python pipeline. This adds support for it.

@larsoner
Member Author

larsoner commented Oct 9, 2019

@drammock
Member

drammock commented Oct 9, 2019

tutorial LGTM. I didn't really have time for more than a skim of the code changes.

@codecov

codecov bot commented Oct 9, 2019

Codecov Report

Merging #6940 into master will increase coverage by 0.04%.
The diff coverage is 95.53%.

@@            Coverage Diff             @@
##           master    #6940      +/-   ##
==========================================
+ Coverage   89.92%   89.96%   +0.04%     
==========================================
  Files         453      451       -2     
  Lines       82111    81731     -380     
  Branches    12999    12968      -31     
==========================================
- Hits        73840    73531     -309     
+ Misses       5445     5374      -71     
  Partials     2826     2826

@agramfort
Member

@larsoner did you test on a number of datasets to see how closely this implementation matches the MaxFilter output?

@agramfort
Member

@olafhauk maybe you can give it a try?

@larsoner
Member Author

I tried it on sample and got the same result. On other files I tested, in general one or two channels can differ but I think it mostly has to do with:

  1. MaxFilter sometimes processing multiple buffers even though it says it processes them one at a time (or somehow treating time differently)
  2. They probably band-pass differently (maybe just a frequency-domain zeroing?)

If I tweak duration and limit a bit I can usually make things match. One or two channel differences for channels near the limit boundary are to be expected I think.

@agramfort
Member

I can test on cam-can data but I would need the bads found by original maxfilter program. @olafhauk @dengemann @SherazKhan do you know where I can find this?

Contributor

@wmvanvliet wmvanvliet left a comment

Cool PR!

@agramfort
Member

@wmvanvliet if you have access to the original MaxFilter program, can you test this PR on some of your datasets? Thanks.

@wmvanvliet
Contributor

What procedure do you want me to use to check MaxFilter vs. MNE output? Here is my first attempt, which gives me different bad channels: https://gist.github.com/wmvanvliet/a3944ad425f28e33fba836315f78169d

@agramfort
Member

agramfort commented Oct 16, 2019 via email

@wmvanvliet
Contributor

Here are some results: https://pastebin.com/KafKgLH3
Output is quite different.

@larsoner
Member Author

For flats you need to use mne.preprocessing.mark_flat as a first step. Without it the reconstructions will be different so differences are expected. I can add this to the instructions. BTW it's much faster if you do raw.resample(100).filter(0.1, None) rather than just raw.filter(0.1, 50).

@larsoner
Member Author

@wmvanvliet you might also find that changing the duration and/or limit arguments in MNE will get you closer to what MaxFilter produces. When I ran things it seemed like sometimes it was using a duration=5 type of processing, and the deviation values it found were always larger than the ones I got in MNE, so maybe something like duration=5, limit=5 would get you closer.

@wmvanvliet
Contributor

With limit=5, MNE was detecting too many bads compared to maxfilter. So I tried duration=5, limit=7. Results are here: https://pastebin.com/KV9Yzgjn. Code for the script is here: https://gist.github.com/wmvanvliet/a3944ad425f28e33fba836315f78169d

@agramfort
Member

agramfort commented Oct 17, 2019 via email

@wmvanvliet
Contributor

How's this? https://pastebin.com/vcKwRbSf

@larsoner
Member Author

A 12% absolute match is not so great :(

I'll ping Jukka and see if he has any insight into why things might be different. It's possible that it's the filtering. Are these European / 50 Hz data? The line noise might be messing things up, and if they use a brick wall FFT that stops at 49 Hz, that will be very different from what raw.filter or even raw.resample(100) would do.

If it's easy enough to re-run, you could try raw.resample(98) to see if it helps, as that should kill the 50 Hz line noise and eliminate that possibility.
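To make the Nyquist reasoning concrete, here is a minimal arithmetic sketch. `line_noise_margin` is a hypothetical helper, not part of MNE. It only illustrates why a 98 Hz target rate places the 50 Hz line above the new Nyquist frequency (so the anti-aliasing low-pass applied during resampling has to attenuate it), whereas at 100 Hz the line sits exactly at Nyquist.

```python
LINE_FREQ = 50.0  # Hz, European power line frequency


def line_noise_margin(sfreq, line_freq=LINE_FREQ):
    """Return Nyquist minus line frequency, in Hz (hypothetical helper).

    Zero means the line sits exactly at Nyquist; a negative value means
    the line is above Nyquist after resampling to ``sfreq``.
    """
    nyquist = sfreq / 2.0
    return nyquist - line_freq


for sfreq in (100.0, 98.0):
    margin = line_noise_margin(sfreq)
    print(f"sfreq={sfreq:.0f} Hz: Nyquist={sfreq / 2.0:.0f} Hz, "
          f"margin to line noise={margin:+.0f} Hz")
```

How strongly the line is actually suppressed still depends on the anti-aliasing filter used by the resampler, which this sketch does not model.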

@wmvanvliet
Contributor

Powerline is at 50Hz, yes

@larsoner
Member Author

Also did you make sure the params were the same like origin, cross_talk, fine_calibration, etc.? I doubt it would make a big difference but it might make some difference

@wmvanvliet
Contributor

@larsoner
Member Author

  • You should probably use origin=(0., 0., 0.04) in MNE (assuming these are subject files and not empty room) -- we use a dig fit by default and they use a fixed pos
  • Either pass fine_calibration and cross_talk in MNE or (easier) pass -ctc off -cal off in the MF call -- by default MF for a given site will use built-in cross-talk and fine cal
  • If you really want to reduce differences in the expansions, use regularize=None, bad_condition='ignore' in MNE and -regularize off in MF because our regularization behaves slightly differently
  • If you do decide to re-run, I'd also do resample(98) just in case some line noise is sticking around right at Nyquist, since I'm not sure how MF actually does its lowpass.
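The MNE side of the settings listed above could be collected roughly as follows. This is a sketch under assumptions: it uses the function name this PR ends up with (`find_bad_channels_maxwell`), it spells the fine-calibration keyword `calibration` as in released MNE, and both `MF_MATCHING_KWARGS` and `find_bads_like_maxfilter` are hypothetical names made up for illustration. Leaving `cross_talk` and `calibration` as `None` is meant to pair with `-ctc off -cal off` on the MaxFilter side.

```python
# Keyword arguments for the MNE call, taken from the bullets above.
MF_MATCHING_KWARGS = dict(
    origin=(0., 0., 0.04),   # fixed head origin (m) instead of the dig fit
    regularize=None,         # pairs with "-regularize off" in MaxFilter
    bad_condition="ignore",  # do not raise on a poorly conditioned basis
    cross_talk=None,         # pairs with "-ctc off"
    calibration=None,        # pairs with "-cal off"
)


def find_bads_like_maxfilter(raw_fname, duration=5.0, limit=7.0):
    """Run the MNE detector with settings meant to mirror the MF call.

    ``raw_fname`` is a path to a raw FIF file (hypothetical input).
    """
    import mne  # imported lazily so the settings can be inspected without MNE

    raw = mne.io.read_raw_fif(raw_fname, allow_maxshield=True)
    noisy, flat = mne.preprocessing.find_bad_channels_maxwell(
        raw, duration=duration, limit=limit, **MF_MATCHING_KWARGS)
    return noisy, flat
```

The `duration=5.0, limit=7.0` defaults here simply echo the values tried earlier in this thread; they are not a recommendation.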

@olafhauk
Contributor

> @olafhauk maybe you can give it a try?

I've just come back from another trip. You want me to test the autobad option in maxfilter?

@olafhauk
Contributor

> I can test on cam-can data but I would need the bads found by original maxfilter program. @olafhauk @dengemann @SherazKhan do you know where I can find this?

Do you already have an answer to this? They have just appointed a new CamCan administrator who may be able to provide this info.

@wmvanvliet
Contributor

That improved things a little: https://pastebin.com/M1tdsz6f

@larsoner
Member Author

> That improved things a little:

80 hits, 14 misses, 46 false alarms (and 9958 correct rejections) -- getting better at least! I emailed Jukka to see if he has some insight into where other potential differences may arise, but it might take a bit for him to get back to me. I'll mark this WIP in the meantime

> Do you already have an answer to this? They have just appointed a new CamCan administrator who may be able to provide this info.

@olafhauk no we do not have this info yet
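For reference, the hit / miss / false alarm / correct rejection tally quoted above can be computed from two bad-channel lists along these lines. This is a minimal sketch, not the script used in this thread: `compare_bads` is a hypothetical helper, and it treats the MaxFilter result as the reference.

```python
def compare_bads(maxfilter_bads, mne_bads, all_channels):
    """Tally agreement between two bad-channel lists (hypothetical helper).

    MaxFilter's list is treated as the reference, so a "hit" is a channel
    both programs flag and a "false alarm" is one only MNE flags.
    """
    mf, mne_found, every = set(maxfilter_bads), set(mne_bads), set(all_channels)
    hits = len(mf & mne_found)
    misses = len(mf - mne_found)
    false_alarms = len(mne_found - mf)
    correct_rejections = len(every - mf - mne_found)
    return hits, misses, false_alarms, correct_rejections


# Rates implied by the counts reported in this thread.
hits, misses, false_alarms = 80, 14, 46
hit_rate = hits / (hits + misses)          # share of MF bads that MNE also finds
precision = hits / (hits + false_alarms)   # share of MNE bads that MF also finds
print(f"hit rate {hit_rate:.3f}, precision {precision:.3f}")
```

Summing the per-file tuples returned by `compare_bads` over all recordings would give totals of the kind reported here.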

@larsoner larsoner changed the title MRG, ENH: Add autobad detection like MF WIP, ENH: Add autobad detection like MF Oct 18, 2019
@olafhauk
Contributor

> > That improved things a little:
>
> 80 hits, 14 misses, 46 false alarms (and 9958 correct rejections) -- getting better at least! I emailed Jukka to see if he has some insight into where other potential differences may arise, but it might take a bit for him to get back to me. I'll mark this WIP in the meantime
>
> > Do you already have an answer to this? They have just appointed a new CamCan administrator who may be able to provide this info.
>
> @olafhauk no we do not have this info yet

If you have already applied for access to CamCan MEG data via the CamCan web-site, you should be able to find the log-files with bad channel information on your temporary CBU account. Otherwise you need to apply for access to these data. I don't work with CamCan data myself at the moment I'm afraid. For queries it's probably best for the person who applied for CamCan access to write to rik.henson@mrc-cbu.cam.ac.uk directly.

@larsoner
Member Author

@wmvanvliet I don't have any lines that look like "Detected 2 flat channels". What MaxFilter version are you using? I have Revision: 2.2.15.

@wmvanvliet
Contributor

wmvanvliet commented Oct 22, 2019 via email

@larsoner
Member Author

So it turns out that MaxFilter does some stuff that we can and probably should avoid. Its reading, filtering, and downsampling are tied to the number of tags and the number of samples per tag, which we don't need to replicate because of our reading abstractions. Moreover, we can just load and filter all data at once to avoid edge artifacts.

It seems like the hits/misses/false alarms are now most strongly tied to how the filtering is done (windowing, steepness, corner frequency) and I think we should just make a sensible choice (probably low-pass below the line freq) and go with that, noting that MF behavior will be different. For now I've pushed something that gets me 53/6/14/7271 for hit/miss/FA/CR on 32 files, with the numbers bouncing around depending mostly on how I change our filtering choices.

@wmvanvliet can you try re-running on your data? You can remove any flat or bad marking steps at the Python end, e.g. you no longer need to mark_flat as we now internally mimic what MF does.

@larsoner
Member Author

larsoner commented Feb 5, 2020

ping @wmvanvliet :

> can you try re-running on your data? You can remove any flat or bad marking steps at the Python end, e.g. you no longer need to mark_flat as we now internally mimic what MF does.

See my explanation above, but I think at this point even with some mismatches we are doing about as well as we should try to do in matching what MaxFilter does. See if you agree with my reasoning above.

With this and #7290 I'll finally be able to have everything in Python (no need for pushing files to a MaxFilter workstation)...

@larsoner larsoner added this to the 0.20 milestone Feb 5, 2020
@larsoner larsoner changed the title WIP, ENH: Add autobad detection like MF MRG, ENH: Add autobad detection like MF Feb 5, 2020
@larsoner
Member Author

larsoner commented Mar 3, 2020

ping @wmvanvliet, it would be nice to move forward on this if possible

@wmvanvliet
Contributor

@larsoner
Member Author

larsoner commented Mar 4, 2020

@wmvanvliet I think that's pretty good! It's not perfect, but based on what I wrote above I think this is expected. Are you (sufficiently) convinced?

@wmvanvliet
Contributor

At the very least, it's a useful addition :) It's ok for me if things are not 100% MF compatible, as other portions such as the head position tracking also vary a little.

Member

@agramfort agramfort left a comment

We also have a

def find_outliers(X, threshold=3.0, max_iter=2):

in a file called preprocessing/bads.py

maybe we could harmonize APIs between functions that aim to find bad channels?

find_outliers -> find_bads_zscore
here then find_bads_maxwell ?

thinking out loud...

@larsoner
Member Author

larsoner commented Mar 5, 2020

> We also have a def find_outliers(X, threshold=3.0, max_iter=2): in a file called preprocessing/bads.py

This looks like it's supposed to be a private function -- it's undocumented in python_reference.rst and only used in ica code. I think we should do a deprecation cycle to make it private (very easy by renaming to _find_outliers and adding a find_outliers with a deprecation decorator).

That frees us up from worrying about that function at all. I'm fine with find_bads_maxwell as it does make it so that future find_bads_* methods could be added.

@larsoner
Member Author

larsoner commented Mar 5, 2020

Pushed a commit to rename to find_bad_channels_maxwell

@agramfort
Member

agramfort commented Mar 5, 2020 via email

@larsoner
Member Author

larsoner commented Mar 5, 2020

CI failures are unrelated

@agramfort agramfort merged commit 76d8495 into mne-tools:master Mar 5, 2020
@agramfort
Member

awesome @larsoner !

@larsoner larsoner deleted the autobad branch March 6, 2020 04:13
AdoNunes pushed a commit to AdoNunes/mne-python that referenced this pull request Apr 6, 2020
* ENH: Add autobad detection like MF

* API: Rename to find_bad_channels_maxwell

* DOC: Missed a few omissions [ci skip]
5 participants