Skip to content

Conversation

@massich
Copy link
Contributor

@massich massich commented Nov 22, 2018

Things that need to happen:

  • load edf files with annotations, and find events
  • new deprecation messages
  • remove 'EDF Annotations' channel

Nice to have:

  • preload=false, loads the annotations but not the data.

@codecov
Copy link

codecov bot commented Nov 22, 2018

Codecov Report

Merging #5741 into master will decrease coverage by 0.02%.
The diff coverage is 94.3%.

@@            Coverage Diff             @@
##           master    #5741      +/-   ##
==========================================
- Coverage   88.55%   88.53%   -0.03%     
==========================================
  Files         369      369              
  Lines       69139    68955     -184     
  Branches    11638    11614      -24     
==========================================
- Hits        61228    61048     -180     
- Misses       5046     5048       +2     
+ Partials     2865     2859       -6

@massich massich changed the title Deprecate EDF [WIP] Deprecate EDF Nov 22, 2018
@massich massich force-pushed the edf branch 2 times, most recently from 7a3e80e to a545623 Compare November 26, 2018 16:31
cals = np.append(cals, 1)
if stim_channel is not None:
stim_channel = _check_stim_channel(stim_channel, ch_names, sel)
stim_channel = _check_stim_channel(stim_channel, ch_names, sel)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this could be change by just chekcing if any of the valid stim keywords is present in ch_names

# # Test for stim channel
# events = find_events(raw, shortest_event=1)
# # The events are overlapping.
# assert_array_equal(events[:, 0], raw._raw_extras[0]['events'][1][::2])
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows, that actually there are events in gdf1_path and these should be their onsets:

ipdb> raw._raw_extras[0]['events'][1][::2]                                    
array([ 5127,  5697,  5728,  5850,  5936,  5957,  5992,  6059,  6084,
        6132,  6165,  6256,  6389,  6423,  6459,  6572,  6657,  6706,
        6740,  6805,  6833,  6857,  6951,  6982,  7008,  7043,  7139,
        7205,  7314,  7434,  7474,  7540,  7607,  7636,  7660,  7686,
        7788,  7844,  7879,  7911,  8050,  8089,  8128,  8196,  8257,
        8281,  8308,  8349,  8373,  8415,  8458,  8496,  8582,  8620,
        8644,  8711,  8737,  8780,  8904,  8935,  8979,  9014,  9107,
        9152,  9247,  9275,  9392,  9457,  9512,  9556,  9667,  9729,
        9755,  9853,  9939,  9968, 10045, 10153, 10290, 10357, 10413,
       10504, 10567, 10642, 10666, 10708, 10738, 10845, 10896, 10945,
       11042, 11109, 11220, 11246, 11291, 11335, 11385, 11931, 11974,
       12026, 12093, 12168, 12199, 12264, 12346, 12398, 12515, 12564,
       12590, 12625, 12738, 12762, 12803, 12848, 12873, 12939, 12980,
       13010, 13075, 13147, 13186, 13219, 13247, 13298, 13328, 13355,
       13400, 13466, 13497, 13569, 13594, 13656, 13727, 13764, 13798,
       13821, 13844, 13896, 13946, 14012, 14087, 14198, 14335, 14384,
       14458, 14509, 14531, 14556, 14614, 14673, 14752, 14819, 14854,
       14877, 14930, 14956, 14989, 15009, 15052, 15142, 15186, 15207,
       15287, 15354, 15386, 15416, 15453, 15488, 15589, 15675, 15760,
       15785, 15860, 15890, 15918, 15984, 16014, 16086, 16203, 16280,
       16366, 16458, 16486, 16574, 16602, 16721, 16746, 16812, 16888,
       16920, 17009, 17058, 17126, 17673, 17752, 17776, 17810, 17944,
       18034, 18106, 18138, 18212, 18236, 18358, 18388, 18418, 18448,
       18474, 18528, 18552, 18582, 18655, 18698, 18732, 18813, 18898,
       18930, 18955, 19062, 19115, 19200, 19334, 19363, 19430, 19484,
       19516, 19551, 19589, 19616, 19649, 19726, 19752, 19796, 19922,
       19950, 20024, 20045, 20091, 20212, 20240, 20295, 20319, 20350,
       20393, 20487, 20594, 20647, 20715, 20748, 20776, 20845, 20878,
       20912, 20978, 21010, 21048, 21152, 21187, 21212, 21279, 21304,
       21336, 21366, 21428, 21468, 21552, 21658, 21733, 21772, 21869,
       21895, 21960, 21985, 22018, 22087, 22156, 22178, 22208, 22259,
       22307, 22447, 22472, 22512, 22578, 22612, 22658, 22741, 22817,
       22838, 23388, 23416, 23491, 23523, 23546, 23657, 23724, 23850,
       23968, 24060, 24178, 24314, 24397, 24468, 24493, 24529, 24557,
       24585, 24654, 24677, 24739, 24847, 24876, 24927, 24957, 24984,
       25027, 25072, 25131, 25172, 25239, 25272, 25386, 25470, 25581,
       25618, 25645, 25667, 25703, 25752, 25782, 25810, 25876, 25908,
       25954, 26032, 26076, 26098, 26160, 26185, 26304, 26334, 26386,
       26411, 26478, 26530, 26570, 26618, 26653, 26716, 26748, 26877,
       26906, 26985, 27032, 27057, 27125, 27263, 27293, 27420, 27476,
       27505, 27544, 27565, 27596, 27628, 27650, 27692, 27757, 27854,
       27962, 28058, 28080, 28141, 28165, 28311, 28333, 28374, 28477,
       28511, 28576, 28608, 28720, 28785, 28868, 28913, 29462, 29489,
       29579, 29698, 29767, 29790, 29862, 29946, 30009, 30033, 30062,
       30108, 30176, 30213, 30236, 30261, 30304, 30325, 30392, 30413,
       30479, 30535, 30563, 30585, 30629, 30690, 30718, 30786, 30844,
       30974, 31042, 31079, 31161, 31215, 31236, 31270, 31292, 31367,
       31392, 31420, 31454, 31478, 31546, 31621, 31752, 31824, 31867,
       31907, 31932, 31975, 32024, 32054, 32128, 32157, 32223, 32250,
       32286, 32312, 32348, 32442, 32519, 32587, 32610, 32634, 32705,
       32798, 32864, 32907, 32993, 33037, 33146, 33180, 33248, 33308,
       33333, 33363, 33476, 33500, 33564, 33591, 33637, 33741, 33813,
       33841, 33990, 34101, 34176, 34207, 34298, 34323, 34398, 34455,
       34483, 34534, 34565, 34609, 34880], dtype=uint32)

however, I cannot see to which channel these events should be encoded:

ipdb> raw._raw_extras[0]['ch_names']                                          
['FP1', 'FP2', 'F5', 'AFz', 'F6', 'T7', 'Cz', 'T8', 'P7', 'P3', 'Pz', 'P4', 'P8', 'O1', 'Oz', 'O2']

Any idea ??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do we do with this?

Copy link
Contributor

@nbara nbara Nov 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@massich The file in question was provided by @alexandrebarachant in #2136, I am not sure what events it should contain beyond those that were found in the unit test previously.

however, I cannot see to which channel these events should be encoded

IIRC the GDF format supports passing events as annotations without a physical stim/trigger/status channel, so that could explain why you find events in the file.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, events in GDF are usually stored in a separate event table and not in a channel.

# actually one of the requested channels
idx = np.arange(self.info['nchan'])[idx] # slice -> ints
read_size = len(r_lims) * buf_len
_idx = np.arange(self.info['nchan'])[idx] # slice -> ints
Copy link
Contributor Author

@massich massich Nov 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was overwriting the idx parameter that was actually needed by the scaling.

data *= cal.T[idx]
data += offsets[idx]
data *= gains.T[idx]


Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you need this change. It should have worked fine as it is, this just converts it from a slice to an array of int, which should allow for things like np.where(idx == stim_channel) but otherwise work the same way in the operations you had above.

# - add it as self.annotations
# - Repeat for all TAL channels (if Any file format allows for
# multiple TAL annotations, as it
# seems to indicate the old code.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@agramfort
Copy link
Member

agramfort commented Nov 28, 2018 via email

@massich
Copy link
Contributor Author

massich commented Nov 28, 2018

yes, I broke it. I'm fixing it. But if more people can test (and share) that would be awesome!

@massich
Copy link
Contributor Author

massich commented Nov 28, 2018

There would be a subsequent PR improving the test suit etc.. tests would be able to be added.

@massich massich changed the title [WIP] Deprecate EDF [MRG] Deprecate EDF Nov 28, 2018
@massich
Copy link
Contributor Author

massich commented Nov 28, 2018

@larsoner can you take a look at this one?

@cbrnr
Copy link
Contributor

cbrnr commented Nov 29, 2018

I tested a couple of EDF/BDF (and one GDF) files and everything seems to work. Am I right that currently, we still keep the stim channel in EDF+ files with TAL channels as an analog channel? I think we could probably get rid of the analog channel in these cases.

Also please check out this EDF+ file from this website. The file contained in the ZIP file is called ma0844az_1-1+.edf. When I open it, I get a RuntimeWarning: Omitted 21 annotation(s) that were outside data range. raw = mne.io.read_raw_edf("ma0844az_1-1+.edf"). Only 20 out of the 40 annotations end up in raw.annotations, although other viewers correctly import all annotations.

Furthermore, the same website has a BDF+ file (which contains a TAL channel). Here, the TAL channel is not recognized, presumably because it is called "BDF Annotations" (and we only search for "EDF Annotations"). Maybe you can fix this as well.

@massich massich changed the title [MRG] Deprecate EDF [WIP] Deprecate EDF Nov 29, 2018
@massich
Copy link
Contributor Author

massich commented Nov 29, 2018

Ok. First of all, thanks a lot for the examples they bring light to some shady things. (+ I really loved the BDF. I think we should have the same file for all the formats we have)

Here follows, some tests and thougts.

EDF

Here we compare if master and the PR read the same things or not

def test_edf_compare_sotred_annotations_when_reading(recwarn):
    from mne.io.pick import channel_indices_by_type
    from mne import find_events

    raw = read_raw_edf(edf, preload=True)
    annot = read_annotations(edf)
    found_types = [k for k, v in
                   channel_indices_by_type(raw.info, picks=None).items()
                   if v]
    events = find_events(raw, shortest_event=1)

    print(f'\n Annotations load in read_raw_edf: {raw.annotations}')
    print(f'The read types are {found_types}')
    print(f'Annotations in the file {annot}')
    print(f'Having {events.shape[0]} evnts for {raw.get_data().shape[1]}'
          f' samples is a ratio of {events.shape[0]/raw.get_data().shape[1]}')

For master:

Annotations load in read_raw_edf: <Annotations  |  41 segments : a1+a2 (1), aanspreken (1), ademh (1)..., orig_time : 2005-09-15 10:18:42>
The read types are ['eeg', 'stim']
Annotations in the file <Annotations  |  41 segments : a1+a2 (1), aanspreken (1), ademh (1)..., orig_time : None>
Having 39 evnts for 363620 samples is a ratio of 0.00010725482646719103

For the PR case:

Annotations load in read_raw_edf: <Annotations  |  20 segments : a1+a2 (1), aanspreken (1), ademh (1)..., orig_time : 2005-09-15 10:18:42>
The read types are ['eeg', 'stim']
Annotations in the file <Annotations  |  41 segments : a1+a2 (1), aanspreken (1), ademh (1)..., orig_time : None>
Having 270100 evnts for 363620 samples is a ratio of 0.7428084263791871

I broke something. My guess is that the stim channel and the annotations represent the same data (39~=41) but something is really broken in the PR.
Plus, from the 41 annotations in the EDF file, 21 get kicked out 'cos they are out of the span of the raw object.

def test_edf_times(recwarn):
    raw = read_raw_edf(edf, preload=True)
    annot = read_annotations(edf)

    # Print annotation boundaries
    for name, a in [('raw.annotations', raw.annotations),
                    ('annot', annot)]:
        print(f'\nin {name} onsets go from {min(a.onset)}s to {max(a.onset)}s')
        print(f'which {max(a.onset)} < {raw.times[-1]}'
              f' is evaluated to {max(a.onset)<raw.times[-1]}')

For master:

in raw.annotations onsets go from 0.0s to 1800.0s
which 1800.0 < 1818.095 is evaluated to True

in annot onsets go from 0.0s to 1800.0s
which 1800.0 < 1818.095 is evaluated to True

For the PR case:

in raw.annotations onsets go from 0.0s to 1450.0s
which 1450.0 < 1454.476 is evaluated to True

in annot onsets go from 0.0s to 1800.0s
which 1800.0 < 1454.476 is evaluated to False

This suggest that there's something fishy, not with the annotations but in the reading of the sfreq.

BDF

Well, here is clear that 'BDF Annotations' is not picked as stim.

def test_plot_bdf_master(recwarn):
    from mne.io.pick import channel_indices_by_type
    from mne import find_events
    raw = read_raw_edf(bdf)
    # XXX: There's no read_annotations for bdf (only edf)
    # annot = read_annotations(bdf)
    found_types = [k for k, v in
                   channel_indices_by_type(raw.info, picks=None).items()
                   if v]
    # XXX: find_events with no stim channel would break in this PR 
    # events = find_events(raw, shortest_event=1)

    print(f'\n Annotations load in read_raw_edf: {raw.annotations}')
    print(f'The read types are {found_types}')
    print(f'The shape of the data is {raw.get_data().shape}')
    print(f"Channel names in _raw_extras: {raw._raw_extras[0]['ch_names']}")

    my_scaling = {k: 'auto' for k in found_types}
    raw.plot(scalings=my_scaling)

in master:

Annotations load in read_raw_edf: <Annotations  |  0 segments , orig_time : None>
The read types are ['eeg', 'stim']
The shape of the data is (12, 120000)
Channel names in _raw_extras: ['squarewave', 'ramp', 'pulse', 'ECG', 'noise', 'sine 1 Hz', 'sine 8 Hz', 'sine 8.5 Hz', 'sine 15 Hz', 'sine 17 Hz', 'sine 50 Hz', 'STI 014']

image

in PR:

Annotations load in read_raw_edf: <Annotations  |  0 segments , orig_time : None>
The read types are ['eeg']
The shape of the data is (12, 120000)
Channel names in _raw_extras: ['squarewave', 'ramp', 'pulse', 'ECG', 'noise', 'sine 1 Hz', 'sine 8 Hz', 'sine 8.5 Hz', 'sine 15 Hz', 'sine 17 Hz', 'sine 50 Hz', 'BDF Annotations']

image
image
image

So it does not realize that 'BDF Annotations' is a tal_ch despite the fact
that it does for our BDF file in the testing. Which it should 'cos the rule is that
it picks the last channel in edf_info (then
stored in raw._raw_extras) as the annotation channel.
It does not get picked up as a stim, and added to annotations (or stim) is just treated as eeg.


To sum up, I'm taking a look at all this.

@cbrnr
Copy link
Contributor

cbrnr commented Nov 30, 2018

Thanks @massich, let me know if I can help. The BDF issue should be easy to fix, you only need to add "BDF Annotations" to the list of valid TAL channel names. Maybe allow "EDF Annotations" only for EDF files and "BDF Annotations" only for BDF files.

@massich
Copy link
Contributor Author

massich commented Nov 30, 2018

I think this is driving me crazy. 'BDF Annotations' in the example does not seem to follow the same annotations pattern.

Is there any dumping tool?

@massich massich force-pushed the edf branch 2 times, most recently from 54d6901 to 4a36ded Compare November 30, 2018 13:29
Copy link
Member

@agramfort agramfort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for MRG when CIs are happy

@massich
Copy link
Contributor Author

massich commented Dec 17, 2018

@larsoner, @cbrnr can any of you give it a last look and merge if you are happy?

@larsoner
Copy link
Member

I already commented a few minutes ago

@larsoner larsoner merged commit f85de59 into mne-tools:master Dec 17, 2018
@larsoner
Copy link
Member

Thanks @massich !

@massich massich deleted the edf branch January 14, 2019 16:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants