Skip to content

Conversation

@massich
Copy link
Contributor

@massich massich commented Mar 12, 2019

In this PR:

  • Fix CNT date parsing
  • Add CNT support in mne.read_annotations
  • Add stim_channel parameter to mne.io.read_raw_cnt (to later remove stim synthesis)
  • Fix tests to use stim_channel=False

This needs to go before #6025

@codecov
Copy link

codecov bot commented Mar 12, 2019

Codecov Report

Merging #6047 into master will increase coverage by 0.04%.
The diff coverage is 83.13%.

@@            Coverage Diff             @@
##           master    #6047      +/-   ##
==========================================
+ Coverage   88.91%   88.95%   +0.04%     
==========================================
  Files         401      402       +1     
  Lines       73013    73123     +110     
  Branches    12135    12151      +16     
==========================================
+ Hits        64922    65050     +128     
+ Misses       5201     5188      -13     
+ Partials     2890     2885       -5

@massich massich force-pushed the deprecate_cnt_stim_channel branch from cd14d74 to 40fa0e3 Compare March 12, 2019 15:54
@massich massich marked this pull request as ready for review March 12, 2019 16:04
@massich
Copy link
Contributor Author

massich commented Mar 12, 2019

@wmvanvliet you added the type 3 events in #3911, can you test that events_from_annotations brings you the right onsets?

@massich
Copy link
Contributor Author

massich commented Mar 12, 2019

for the record, I've checked with files that have cnt and eeglab versions:

  • 32 bits is not working (we can open another PR)
  • events are ok.
  • channel locations seem to be ok.

It would be good if we can get cnt files with events of type1 and type3 to be added to the test suit (I only have type2).

massich pushed a commit to massich/mne-python that referenced this pull request Mar 12, 2019
@massich
Copy link
Contributor Author

massich commented Mar 12, 2019

TL;DR

Just for the record, testing with some confidential files + some files on the web running the following

import mne
import pytest
import os.path as op
from datetime import datetime
from mne import __file__ as _mne_file
from mne.tests.test_annotations import _assert_annotations_equal

regular_path = op.join(op.dirname(_mne_file), '..', 'sandbox', 'data')
confidential_path = op.join(regular_path, 'confidential', 'cnt')

flankers_path = op.join(regular_path, '914flankers.cnt')

cnt_files_with_eeglab_pair = [
    op.join(confidential_path, 'BoyoAEpic1_16bit.cnt'),
    op.join(confidential_path, 'cont_67chan_resp_32bit.cnt'),
    op.join(confidential_path, 'SampleCNTFile_16bit.cnt')
]


def test_ensure_meas_date(recwarn):
    raw = mne.io.read_raw_cnt(flankers_path, montage=None,
                              date_format='dd/mm/yy')
    meas_date = (datetime
                 .fromtimestamp(raw.info['meas_date'][0])
                 .strftime('%d/%m/%y %H:%M:%S'))

    assert meas_date == '23/09/07 12:22:15'


@pytest.mark.parametrize('fname', cnt_files_with_eeglab_pair, ids=op.basename)
def test_check_cnt_eeglab_pairs(fname, recwarn):
    raw_eeglab = mne.io.read_raw_eeglab(fname.replace('.cnt', '.set'),
                                        montage=None)
    raw_cnt = mne.io.read_raw_cnt(fname, montage=None)

    assert raw_cnt.info.keys() == raw_eeglab.info.keys()
    xx = object_diff(raw_cnt.info, raw_eeglab.info)
    print(xx)

I get:

  • The date work properly,
  • the info difference between cnt and its transformed eeglab version are
    • the subject_info anonymization (eeglab is none cnt has empty fields)
    • digitization mismatch (but this is beeing addressed somwhere else)
    • for one file cnt has a trigger channel marked as bad that does not appear in the eeglab version of the file.
['chs'][63]['cal'] type mismatch (<class 'numpy.ndarray'>, <class 'float'>)
['chs'][63]['coord_frame'] value mismatch (4, 0)
['chs'][63]['loc'] array mismatch
['chs'][63]['unit_mul'] type mismatch (<class 'float'>, <class 'int'>)
['subject_info'] type mismatch (<class 'dict'>, <class 'NoneType'>)

@agramfort
Copy link
Member

agramfort commented Mar 12, 2019 via email

@agramfort
Copy link
Member

event_id = list(cfg_local['eventvalue'].astype('int'))
else:
event_id = [int(cfg_local['eventvalue'])]
if 'event_id' not in locals():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

calling locals() here seems dangerous makes the code harder to read. Can you see how to avoid it? thx

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forget it it's in tests. Not strongly necessary then but just nice to do.

def test_read_annotations():
"""Test reading for annotations from a .CNT file."""
annot = read_annotations(fname)
assert len(annot) == 6
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would merge this test with test_compare_events_and_annotations by replacing above the call to _read_annotations_cnt by read_annotations(fname). It will cover the same amount of code and will save time.

@wmvanvliet
Copy link
Contributor

@wmvanvliet you added the type 3 events in #3911, can you test that events_from_annotations brings you the right onsets?

@jona-sassenhagen did the actual tests back then. I don't have any CNT files with type 3 events.

massich pushed a commit to massich/mne-python that referenced this pull request Mar 13, 2019
@massich massich requested review from jaeilepp and larsoner March 13, 2019 13:40
@larsoner
Copy link
Member

@jona-sassenhagen can you test with your event-type-3 CNT file(s)?

teeg_parser = Struct('<Bll')

f.seek(teeg_offset)
return Teeg._make(teeg_parser.unpack(f.read(teeg_parser.size)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using private _make does not look clean / safe (using private method). Why not just return Teeg(*teeg_parser.unpack(f.read(teeg_parser.size)))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'Cos I did not know how to write it! Thx.

def parser(buffer):
struct = Struct(struct_pattern)
for chunk in struct.iter_unpack(buffer):
yield event_maker._make(chunk)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same


with open(fname, 'rb') as fid:
fid.seek(SETUP_NCHANNELS_OFFSET)
(n_channels,) = unpack('<H', fid.read(calcsize('<H')))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typically (and more compactly) we use and prefer np.frombuffer instead of fid.read + unpack

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if np.frombuffer took a file descriptor (or the file was lodaded in a buffer) this could be written like this which is really readable:

   with open(fname, 'rb') as fid:
       n_channels = np.frombuffer(
           fid, dtype=np.uint16, offset=SETUP_NCHANNELS_OFFSET)
       sfreq = np.frombuffer(
           fid, dtype=np.uint16, offset=SETUP_RATE_OFFSET)
       event_table_pos = np.frombuffer(
           fid, dtype=np.int32, offset=SETUP_EVENTTABLEPOS_OFFSET)

Otherwise np.fromfile but you can not get read of the seek.

    with open(fname, 'rb') as fid:
        fid.seek(SETUP_NCHANNELS_OFFSET)
        n_channels = np.fromfile(fid, dtype='<u2', count=1)[0]

        fid.seek(SETUP_RATE_OFFSET)
        sfreq =  np.fromfile(fid, dtype='<u2', count=1)[0]

        fid.seek(SETUP_EVENTTABLEPOS_OFFSET)
        event_table_pos = np.fromfile(fid, dtype='<i4', count=1)[0]

But I've no idea which of the 3 is preferable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

frombuffer is what we use in mne/io/tag.py and it works with a file descriptor, we changed it to that from np.fromfile a year or so ago so I think it's the preferred way

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(or rather I guess we do frombuffer(fid.read(...), ...) in tag.py)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say do whichever works. Having a fid.seek in there is expected/fine

def read_raw_cnt(input_fname, montage, eog=(), misc=(), ecg=(), emg=(),
data_format='auto', date_format='mm/dd/yy', preload=False,
verbose=None):
stim_channel=True, verbose=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it should be None

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't default to True to not break people's code? At the beginning I just removed the whole stim. But talking to @agramfort it seems that we cannot get read of the stim in cnt until 0.20, unless we backport the addition of this stim_channel=True to 0.17.3 and then 0.18 to false/none and remove the stim channel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make None an alias for True, where None emits a warning but True does not. But maybe it's not necessary.

stim_channel[event_time - 1] = event_id

else: # when stim_channel_toggle is False
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see much value in else: pass because it's implied by just not having it there; and in this case the comment does not add anything because it's just the conditional checked above

stim_channel=stim_channel, n_bytes=n_bytes)
else:
cnt_info.update(baselines=np.array(baselines), n_samples=n_samples,
n_bytes=n_bytes)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is redundancy in several conditionals, including here, which could instead be (no need for else at all):

if stim_channel_toggle:
    ...
    cnt_info.update(stim_channel=stim_channel)
cnt_info.update(baselines=np.array(baselines), n_samples=n_samples,
                n_bytes=n_bytes)

Seems simpler (makes it clear no matter what, cnt_info is updated with these entries) and is more DRY, no?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

totally.

_data_stop = start + sample_stop
data_[-1] = stim_ch[_data_start:_data_stop]
else:
pass
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no need for else: pass

description : array of str, shape (n_annotations,)
Array of strings containing description for each annotation. If a
string, all the annotations are given the same description. To reject
epochs, use description starting with keyword 'bad'. See example above.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring is wrong. You return an annotations

large amount of memory). If preload is a string, preload is the
file name of a memory-mapped file which is used to store the data
on the hard drive (slower, requires less memory).
stim_channel : bool (default True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

None not true

@massich
Copy link
Contributor Author

massich commented Mar 13, 2019

@larsoner merge if you are happy. The code synthesizing the stim channel from the events type 3 is still the same, and there's an upcoming PR fixing the overflow + the 32 bits annotations.

@massich massich merged commit 9d71dc0 into mne-tools:master Mar 14, 2019
@massich massich deleted the deprecate_cnt_stim_channel branch March 18, 2019 07:38
jeythekey pushed a commit to jeythekey/mne-python that referenced this pull request Apr 27, 2019
In this PR:
- [x] Fix CNT date parsing
- [x] Add CNT support in `mne.read_annotations`
- [x] Add `stim_channel` parameter to `mne.io.read_raw_cnt` (to later remove stim synthesis)
- [x] Fix tests to use `stim_channel=False`
raise IOError('Unexpected event size.')

# XXX long NumEvents is available, why are not we using it?
n_events = event_size // event_bytes
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess that #6535 is why.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants