Conversation

@bloyl
Contributor

@bloyl bloyl commented May 31, 2018

The current CTF IO doesn't read bad channels or bad segments from the *.ds/BadChannels and *.ds/bad.segments files.

This PR should address that.

I stuck the new methods in info.py without a strong reason, so if anyone thinks they make more sense somewhere else I'm happy to move them.

How should I test this?

  1. I could make a new version of MNE-testing-data/CTF/testdata_ctf_short.ds with a bad segment and a bad channel marked.

  2. I could alter an existing ds folder in MNE-testing-data/CTF/ to have a bad segment and channel marked.

The downside to option 1 is that it'll add ~10 MB to MNE-testing-data. The downside to option 2 is that it could break/alter existing tests.
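For reference, a rough sketch of the parsing involved (the function names here are my own illustration, assuming the usual CTF text formats: BadChannels lists one channel name per line; bad.segments has one whitespace-separated "label start stop" row per segment, in seconds):

import os.path as op


def _read_bad_channels(directory):
    """Read channel names from a .ds/BadChannels file (sketch)."""
    fname = op.join(directory, 'BadChannels')
    if not op.exists(fname):
        return []
    with open(fname, 'r') as fid:
        return [line.strip() for line in fid if line.strip()]


def _read_bad_segments_table(directory):
    """Read (label, start, stop) rows in seconds from bad.segments (sketch)."""
    fname = op.join(directory, 'bad.segments')
    if not op.exists(fname):
        return []
    segments = []
    with open(fname, 'r') as fid:
        for line in fid:
            parts = line.strip().split()
            if len(parts) < 3:  # skip blank/malformed lines
                continue
            label, start, stop = parts[:3]
            segments.append((label, float(start), float(stop)))
    return segments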

@agramfort
Member

I would go with 2 and see if it actually breaks any test. Make the bad segments really short and make only one channel bad?

@bloyl bloyl mentioned this pull request Jun 4, 2018
@bloyl
Contributor Author

bloyl commented Jun 4, 2018

It seems that mne-python also doesn't read in the time of the first data sample from CTF data.
For instance, DataEditor (the CTF program), Brainstorm, and FieldTrip all think the first data sample of 'MNE-testing-data/CTF/testdata_ctf_short.ds' is at 2 seconds; mne-python has it at 0.

From

mne-python/mne/io/base.py

Lines 629 to 633 in ab4571a

def _update_times(self):
    """Update times."""
    self._times = np.arange(self.n_times) / float(self.info['sfreq'])
    # make it immutable
    self._times.flags.writeable = False

it seems like adjusting this time is not possible. Is that correct?
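To make the gap concrete in plain numbers (illustrative values only, not an existing API):

import numpy as np

sfreq = 1200.0  # sampling rate in Hz (illustrative values)
n_times = 4801
s_time = 2.0    # first-sample time that the CTF tools report

times = np.arange(n_times) / sfreq  # what _update_times produces; starts at 0
ctf_times = times + s_time          # what the other CTF tools would show
print(times[0], ctf_times[0])       # 0.0 vs. 2.0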

@codecov

codecov bot commented Jun 5, 2018

Codecov Report

Merging #5255 into master will increase coverage by 0.03%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #5255      +/-   ##
==========================================
+ Coverage   88.22%   88.25%   +0.03%     
==========================================
  Files         358      358              
  Lines       66228    66284      +56     
  Branches    11254    11268      +14     
==========================================
+ Hits        58429    58502      +73     
+ Misses       4981     4959      -22     
- Partials     2818     2823       +5

# Add measurement id
if info['meas_id'] is not None:
    write_id(fid, FIFF.FIFF_PARENT_BLOCK_ID, info['meas_id'])

Contributor Author

Is this check needed or wanted?

mne-python/mne/io/write.py

Lines 224 to 226 in ab4571a

def write_id(fid, kind, id_=None):
    """Write fiff id."""
    id_ = _generate_meas_id() if id_ is None else id_

will put in the default (DATE_NONE) id.

I guess the question is: do you always want a meas_id in the fiff structure?

Member

The FIF file itself will need one, but I don't know if meas_info needs one. Can you try writing a file with an info without meas_id in it and see if the C tools can read it?

Contributor Author

I don't have the C tools running or any experience with them.

For what it's worth, without this line (i.e., the info['meas_id'] is None case) it would fall back to the previous behavior of the write_info method, so if that played nicely with the C code I imagine this would be fine as well.

In those cases (i.e., info['meas_id'] is None), read_info in mne-python returned an info['meas_id'] with the DATE_NONE usecs in it.

Also, the file gets the default (None/DATE_NONE) id.

    assert_array_equal(raw.info['meas_id']['machid'],
                       raw3.info['meas_id']['machid'])
else:
    logger.warn('Reader does not set meas_id')
Contributor Author

Do we want this to raise an error? It would catch IO readers that don't set meas_id.

Currently it only raises errors when the reader doesn't set meas_id and in tests that expect the warning messages to be a specific thing.

Member

We probably don't need to warn. I don't think we are guaranteed that all formats will have it.

Contributor Author

Ok I can pull it. What is info['meas_id'] used for... anything?


# Add bad segments as Annotations (correct for start time)
s_time = -res4['pre_trig_pts'] / float(info['sfreq'])
self.annotations = _read_bad_segments(directory, s_time)
Contributor Author

Ideally this s_time would be the start of the 'raw.times' array. Is that possible with the current fif structures?
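For concreteness, a minimal sketch of what that correction does, using the public mne.Annotations API (the segments list here is made-up example data; the parsing itself is as in the diff above):

from mne import Annotations

# made-up parsed rows from bad.segments: (label, start, stop) in seconds
segments = [('1', 1.25, 1.75)]
s_time = -2.0  # -res4['pre_trig_pts'] / sfreq, as in the diff above

onset = [start - s_time for _, start, _ in segments]
duration = [stop - start for _, start, stop in segments]
description = ['bad_%s' % label for label, _, _ in segments]
annotations = Annotations(onset=onset, duration=duration,
                          description=description)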

Member

raw.times always starts at 0, but there is raw.first_samp, which is an offset. Is that what you mean?

Contributor Author

From what I saw, raw.first_samp doesn't change raw.times[0], but it does skip the first raw.first_samp samples of the data array. Is that the correct understanding?

Member

I'm not sure what it does in the context of CTF. But for Neuromag data it is the offset between when the acquisition started and the recording began. In MNE we use it when cropping files, too.

Contributor Author

The problem with setting raw.first_samp is that it skips those samples as stored on disk, so while it makes sense for the timing issue, it needs to stay set to 0 to read in all of the data.

Member

first_samp should not impact the reading of the first samples on disk. It's basically an offset for events, and it lets you know when samples were recorded w.r.t. the original absolute time.

Contributor Author

That doesn't match what I am seeing.
For instance, running this test code

import os.path as op
from mne.io import read_raw_ctf
from mne.datasets import testing

ctf_dir = op.join(testing.data_path(download=True), 'CTF')
ctf_fname = op.join(ctf_dir, 'testdata_ctf_short.ds')
raw = read_raw_ctf(ctf_fname, preload=True)

print('first_samp - %d' % raw.first_samp)
print('_data.shape', raw._data.shape)
print('_data[3,0]', raw._data[3, 0])
print('_data[3,5]', raw._data[3, 5])

print('len(raw.times) - %d' % len(raw.times))
print('raw.times.min() - %f' % raw.times.min())
print('raw.times.max() - %f' % raw.times.max())

on the current master (not this branch) returns

first_samp - 0
('_data.shape', (356, 4801))
('_data[3,0]', 16442.588333333333)
('_data[3,5]', 16442.592499999999)
len(raw.times) - 4801
raw.times.min() - 0.000000
raw.times.max() - 4.000000

if I change

first_samps = [0] * len(last_samps)

to
first_samps = [5] * len(last_samps)

returns

first_samp - 5
('_data.shape', (356, 4796))
('_data[3,0]', 16442.592499999999)
('_data[3,5]', 16442.596666666665)
len(raw.times) - 4796
raw.times.min() - 0.000000
raw.times.max() - 3.995833

so it certainly seems like setting first_samp skips the opening samples of the data array.

@agramfort
Member

@larsoner ok for you?

Can someone replicate the Travis failure?

@larsoner
Member

larsoner commented Jun 9, 2018

test_cov.py has been timing out on all PRs for the last week-ish. I have not had time to try to replicate it yet.

Member

@larsoner larsoner left a comment

LGTM aside from the seemingly unresolved first_samp issue.

assert_equal(raw.info['meas_date'], raw_read.info['meas_date'])

for key in ['secs', 'usecs', 'version']:
    assert_equal(raw.info['meas_id'][key], raw_read.info['meas_id'][key])
Member

FYI you can now do the simpler assert raw.info['meas_id'][key] == raw_read.info['meas_id'][key]. The codebase is slowly migrating toward this pytest way of doing things (except for comparisons that require numpy.testing). Feel free to change it here or not, I'm happy either way
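That is, the same check in the two styles (raw, raw_read, and key as in the snippet above):

from numpy.testing import assert_equal

# numpy.testing style (as written above)
assert_equal(raw.info['meas_id'][key], raw_read.info['meas_id'][key])

# plain pytest style (suggested)
assert raw.info['meas_id'][key] == raw_read.info['meas_id'][key]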

@agramfort
Member

agramfort commented Jun 10, 2018 via email

@bloyl
Contributor Author

bloyl commented Jun 10, 2018

All the raw.times stuff makes sense.

My concern is the differences in raw._data: when first_samp=5 the data array is 5 samples shorter (4796) than when first_samp=0 (4801).

Similarly, the first data sample ('_data[3,0]', 16442.592499999999) when first_samp=5 is equal to the 5th data sample ('_data[3,5]', 16442.592499999999) when first_samp=0.

@bloyl
Contributor Author

bloyl commented Jun 11, 2018

This is where first_samp is used to skip the opening samples of the data on disk.

mne-python/mne/io/base.py

Lines 506 to 520 in da1d329

offset = 0
for fi in np.nonzero(files_used)[0]:
    start_file = self._first_samps[fi]
    # first iteration (only) could start in the middle somewhere
    if offset == 0:
        start_file += start - cumul_lens[fi]
    stop_file = np.min([stop - cumul_lens[fi] + self._first_samps[fi],
                        self._last_samps[fi] + 1])
    if start_file < self._first_samps[fi] or stop_file < start_file:
        raise ValueError('Bad array indexing, could be a bug')
    n_read = stop_file - start_file
    this_sl = slice(offset, offset + n_read)
    self._read_segment_file(data[:, this_sl], idx, fi,
                            int(start_file), int(stop_file),
                            cals, mult)

Readers then implement _read_segment_file.

I assume the fif IO sorts this all out for you when you use crop?
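For reference, a reader-side _read_segment_file typically looks something like this schematic sketch (not MNE's actual CTF implementation; the header size, dtype, calibration shape, and _raw_extras contents are assumptions for illustration):

import numpy as np


def _read_segment_file(self, data, idx, fi, start, stop, cals, mult):
    """Read and scale samples [start, stop) of file `fi` (schematic)."""
    extras = self._raw_extras[fi]  # per-file metadata (assumed layout)
    n_chan = extras['n_channels']
    with open(self._filenames[fi], 'rb') as fid:
        # skip the header plus the first `start` samples of this file
        fid.seek(extras['header_size'] + start * n_chan * 4)
        block = np.fromfile(fid, dtype='>i4', count=(stop - start) * n_chan)
    block = block.reshape(-1, n_chan).T  # -> (n_chan, n_samples)
    # pick the requested channels and apply per-channel calibration
    data[:] = block[idx] * cals[idx][:, np.newaxis]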

@agramfort
Member

agramfort commented Jun 11, 2018 via email

@bloyl
Contributor Author

bloyl commented Jun 15, 2018

Do we want to add keywords to read_raw_ctf that activate/deactivate this behavior? If so, what should the defaults be?

I would vote for adding keywords that default to the old behavior (ignoring bad channels and segments), and maybe plan to change the default at some later release, as sketched below.
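Concretely, the proposal would look something like this (the read_bads keyword is purely hypothetical, named here just to make the discussion concrete; the rest of the signature matches the current reader):

def read_raw_ctf(directory, system_clock='truncate', preload=False,
                 read_bads=False, verbose=None):  # read_bads is hypothetical
    """Read a CTF .ds directory.

    read_bads=False would keep the old behavior (ignore BadChannels and
    bad.segments); flipping the default in a later release would make
    reading them standard.
    """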

@larsoner
Member

I see it as more of a bugfix / something we should have been doing all along. Do you think it's likely that people had bad channels and/or segments marked during acq that they wouldn't want marked during offline analysis?

@bloyl
Contributor Author

bloyl commented Jun 15, 2018

I agree that this is something that should have been done the whole time, so it should become the default behavior at some point. I am perhaps being overly conservative after the comp channels stuff :). I am fine either way.

I do think it's possible that there are people who didn't know there were bad channels/segments marked in the data they were using, and now their pipelines will behave slightly differently.

@agramfort
Member

agramfort commented Jun 15, 2018 via email

@agramfort
Member

agramfort commented Jul 9, 2018

@bloyl can you resurrect this?

@bloyl
Contributor Author

bloyl commented Jul 9, 2018

+1 for merging on my end

I'll be away the next two weeks with limited email so feel free to make any changes you feel are needed.

@larsoner
Member

@bloyl have you tested this with some of your own files and found it to work okay?

I'll probably test it locally just by using the testing dataset and visually inspecting the Annotations.

@bloyl
Contributor Author

bloyl commented Jul 10, 2018 via email

Member

@larsoner larsoner left a comment

LGTM +1 for merge

@larsoner larsoner changed the title CTF support for ds bad channels and bad segments [MRG+1] CTF support for ds bad channels and bad segments Jul 11, 2018
start_block(fid, FIFF.FIFFB_MEAS)
write_id(fid, FIFF.FIFF_BLOCK_ID)
if info['meas_id'] is not None:
    write_id(fid, FIFF.FIFF_PARENT_BLOCK_ID, info['meas_id'])
Member

why this?

Member

@agramfort I know @bloyl is out for a bit so I'm working on this now.

Maybe we should just revert this bit for now?

Member

Reverted

@agramfort
Member

@bloyl can you rebase to fix CIs?

@larsoner
Member

Green

@agramfort agramfort merged commit 1aea53a into mne-tools:master Jul 20, 2018
@agramfort
Member

thx @bloyl and @larsoner
