@bloyl
Contributor

@bloyl bloyl commented Nov 17, 2019

Fixes #7067

Always writing meas_date seems like the most straightforward fix. Hopefully it doesn't break anything.

The downside of this fix: if a fiff file already exists on disk without a meas_date tag and with a meas_id that isn't equal to DATE_NONE, then meas_date will still be guessed when the file is read.

@codecov

codecov bot commented Nov 17, 2019

Codecov Report

Merging #7071 into master will decrease coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #7071      +/-   ##
==========================================
- Coverage   89.74%   89.74%   -0.01%     
==========================================
  Files         442      442              
  Lines       77786    77790       +4     
  Branches    12621    12620       -1     
==========================================
- Hits        69812    69810       -2     
- Misses       5163     5170       +7     
+ Partials     2811     2810       -1

m2 = m2.hexdigest()
assert m1 == m2

# check for bug 7067
Member

Suggested change
- # check for bug 7067
+ # check for bug #7067

Member

@jasmainak jasmainak left a comment

Fix looks straightforward and the test seems to check that. LGTM

if info.get('meas_date') is not None:
    write_int(fid, FIFF.FIFF_MEAS_DATE, info['meas_date'])
else:
    write_int(fid, FIFF.FIFF_MEAS_DATE, DATE_NONE)
Member

@bloyl can you help me understand why the writer should be fixed? It seems normal not to write anything if meas_date is None. Is it the reader that does not set the date to None when it's not present in the file on disk?

Contributor Author

Over my morning coffee, I agree the writer shouldn't be fixed. It seemed like the most straightforward fix at the time.

The current fix will have non-MNE-Python readers showing strange dates, or faulting when they try to convert DATE_NONE to a timestamp.

@agramfort
Member

agramfort commented Nov 17, 2019 via email

@bloyl
Contributor Author

bloyl commented Nov 17, 2019

Posted this to the issue by mistake...

Potential Fixes

  1. During reading, don't guess meas_date from meas_id. The guess happens here:

    mne-python/mne/io/meas_info.py

    Lines 1229 to 1230 in ef7d7c1

    if meas_date is None:
        meas_date = (info['meas_id']['secs'], info['meas_id']['usecs'])

    I don't really know why this behavior was desirable but I'm sure it was/is for something.

  2. Stop writing the meas_id directly in the meas_info tree of the fiff. This would revert to the behavior from before #5255 ([MRG+1] CTF support for ds bad channels and bad segments). We'd also have to remove some of the tests that were added in #5255. This would work because somewhere in here:

    mne-python/mne/io/meas_info.py

    Lines 1211 to 1224 in ef7d7c1

    # Make the most appropriate selection for the measurement id
    if meas_info['parent_id'] is None:
        if meas_info['id'] is None:
            if meas['id'] is None:
                if meas['parent_id'] is None:
                    info['meas_id'] = info['file_id']
                else:
                    info['meas_id'] = meas['parent_id']
            else:
                info['meas_id'] = meas['id']
        else:
            info['meas_id'] = meas_info['id']
    else:
        info['meas_id'] = meas_info['parent_id']

    the meas_id gets found and set to DATE_NONE.

  3. When writing the meas_id to the meas_info fiff tree, I could set meas_id = DATE_NONE if meas_date is None.

  4. Do nothing. This issue indicates that meas_date and meas_id are out of sync (i.e., anonymized without going through anonymize_info), and maybe it's not surprising to get strange results?

Finally, I think that only option 1 would preserve round-trip IO of both meas_date and meas_id in a situation where meas_date is None and meas_id != DATE_NONE. The other options will all guess one from the other in some way.
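
To make the trade-off between these options concrete, here is a toy model of the round trip. It is a sketch under stated assumptions, not MNE-Python code: a plain dict stands in for the fiff tree, toy_write and toy_read are hypothetical helpers, DATE_NONE is assumed to be (0, 2**31 - 1), and the reader is assumed to map that sentinel back to None.

```python
# Toy model of the meas_date / meas_id round trip (not MNE-Python code).
# Assumption: DATE_NONE is the sentinel pair (0, 2**31 - 1).
DATE_NONE = (0, 2**31 - 1)


def toy_write(info, sync_meas_id=False):
    """Write `info` to a dict standing in for the fiff tree.

    sync_meas_id=True models option 3: overwrite meas_id with DATE_NONE
    whenever meas_date is None.
    """
    tree = {}
    meas_id = dict(info['meas_id'])
    if info.get('meas_date') is None:
        if sync_meas_id:
            meas_id['secs'], meas_id['usecs'] = DATE_NONE
    else:
        tree['meas_date'] = info['meas_date']  # tag only written when set
    tree['meas_id'] = meas_id
    return tree


def toy_read(tree, guess=True):
    """Read the dict back; guess=False models option 1 (no guessing)."""
    info = {'meas_id': dict(tree['meas_id']),
            'meas_date': tree.get('meas_date')}
    if info['meas_date'] is None and guess:
        info['meas_date'] = (info['meas_id']['secs'],
                             info['meas_id']['usecs'])
    # Assumption: the reader maps the sentinel back to None.
    if info['meas_date'] == DATE_NONE:
        info['meas_date'] = None
    return info


# meas_date cleared by hand, meas_id left untouched (the out-of-sync case)
info = {'meas_date': None, 'meas_id': {'secs': 1000, 'usecs': 5}}

current = toy_read(toy_write(info))                      # date is guessed
option_1 = toy_read(toy_write(info), guess=False)        # both preserved
option_3 = toy_read(toy_write(info, sync_meas_id=True))  # meas_id rewritten
```

In this toy, only option_1 compares equal to the original info; the current behavior recovers a date from meas_id, and option 3 keeps meas_date as None at the cost of overwriting meas_id.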

----
I should note that removing

mne-python/mne/io/meas_info.py

Lines 1229 to 1230 in ef7d7c1

if meas_date is None:
    meas_date = (info['meas_id']['secs'], info['meas_id']['usecs'])

still passes the IO tests.
So whatever its original need was, it isn't explicitly tested for :)

@agramfort
Member

agramfort commented Nov 17, 2019 via email

@jasmainak
Member

I think the issue is that you can erase meas_date but there is no way to erase meas_id from the fif tree. So if meas_date is None it can be removed from the fif tree but when reading it is guessed again from the meas_id. So it's an inconsistency we have to live with that is imposed on us by virtue of the fif file format. What @larsoner did was to come up with a magic number so that we can get around this issue for MNE-Python

@agramfort
Member

agramfort commented Nov 18, 2019 via email

@larsoner
Member

larsoner commented Nov 18, 2019

My preference would be to (in rough PR order):

  1. Restore writing our special "meas date". It's backward compatible with what we used to do (i.e., merge this). We can have a separate issue about phasing it out if need be.
  2. Use datetime | None objects for info['meas_date'].
  3. Add code for set_meas_date that takes care of annotations and such. The anonymization code can use this / be refactored to use it. It only takes datetime | None as an argument.
  4. Extend set_meas_date and/or anonymize functions to kill some of these other dates (such as meas_id) that are probably causing our anonymization to be incomplete.
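
Steps 2 to 4 might look roughly like the sketch below. This is a hypothetical standalone helper, not the eventual MNE-Python API: toy_set_meas_date, the dict layout, and the DATE_NONE value (assumed to be (0, 2**31 - 1)) are illustrative only, and the annotation handling mentioned in step 3 is left out.

```python
from datetime import datetime, timezone

DATE_NONE = (0, 2**31 - 1)  # assumed sentinel value
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def toy_set_meas_date(info, meas_date):
    """Set info['meas_date'] to a datetime or None, keeping meas_id in sync."""
    if meas_date is None:
        secs, usecs = DATE_NONE
    elif isinstance(meas_date, datetime):
        if meas_date.tzinfo is None:
            raise ValueError('meas_date must be timezone-aware')
        # Integer arithmetic avoids float rounding of the microseconds.
        delta = meas_date - EPOCH
        secs = delta.days * 86400 + delta.seconds
        usecs = delta.microseconds
    else:
        raise TypeError('meas_date must be a datetime or None, got %r'
                        % type(meas_date))
    info['meas_date'] = meas_date
    info['meas_id']['secs'] = secs
    info['meas_id']['usecs'] = usecs
    return info


info = {'meas_date': None, 'meas_id': {'secs': 1000, 'usecs': 5}}
toy_set_meas_date(info, datetime(2019, 11, 18, tzinfo=timezone.utc))
toy_set_meas_date(info, None)  # meas_id now holds the sentinel as well
```

Routing every change through one setter is what would keep the two fields from drifting apart; assigning info['meas_date'] directly bypasses it.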

@agramfort
Member

agramfort commented Nov 18, 2019 via email

@larsoner
Member

@bloyl okay for you?

@bloyl
Contributor Author

bloyl commented Nov 19, 2019

I'm +1 on steps 2, 3 and 4.

I'm -1 on step 1. I don't see the purpose in reengineering #5255 to support the ability to set meas_date to None without adjusting the meas_id, particularly since steps 2, 3, and 4 will make doing that impossible anyway. The BIDS PR should call anonymize if the purpose is to remove the date/time info.

I'd suggest just doing steps 2, 3 and 4 as somewhat of a priority and closing this PR without merging anything.

@larsoner
Member

Your opposition to 1 sounds like it's related to BIDS anonymization. I don't see the relationship between BIDS anonymization and None/special date writing. BIDS anon will never use it, so what problems will it cause there?

Basically, in my proposal people can continue to use the old None style of anonymization (and work with files they've already processed this way!), or use the new date shift method.

Steps 2/3/4 to me would allow datetime or None as options, sorry for not listing that.

@agramfort
Member

agramfort commented Nov 19, 2019 via email

@larsoner
Member

Agreed @agramfort. The question for @bloyl is why no (1) in the proposal above? I think we want 1/this PR if and only if we allow info['meas_date'] = None. We allowed this previously so for backward compat at least it seems like the round-trip should pass.

@bloyl
Contributor Author

bloyl commented Nov 19, 2019

There seems to be some confusion.

I am fine with supporting info['meas_date'] = None and with info['meas_id'] using DATE_NONE.
I think steps 2/3/4 should work to ensure that those two don't get out of sync, or maybe even get rid of info['meas_id'] altogether.

That is all great.

The original issue (#7067) calls
raw.info['meas_date'] = None
This sets up a situation where raw.info['meas_date'] and raw.info['meas_id'] are out of sync. In the current codebase raw.info['meas_id'] gets written to one place in the fiff tree (DATE_NONE gets written to the other nodes) and raw.info['meas_date'] doesn't get written to disk. Then on reading, raw.info['meas_date'] gets set from what had been in raw.info['meas_id'].

We all agree (I think) that following steps 2/3/4 this situation shouldn't occur, because raw.info['meas_date'] and raw.info['meas_id'] will be kept in sync. So hopefully the whole issue is moot by the next release.

My issue with step 1 is what is meant by

backward compatible with what we used to do

What is the backward compatible thing to do in the out of sync case?

Assuming it's the pre-#5255 behaviour, then after reading back from disk you'd get raw.info['meas_date'] == None with info['meas_id'] using DATE_NONE.

To do that I think you'd have to do either option 2 or 3 from my comment above.

  1. When writing, don't write the actual meas_id anywhere. This is fine in principle but I feel like it will be more work than we think. For instance, the current state of the PR is the beginning of this approach and has failing tests in the CTF reader from the other parts of #5255 ([MRG+1] CTF support for ds bad channels and bad segments). I think the effort to make this work would be better spent on steps 2/3/4.

  2. Option 3 from my comment seems like it might work but is also hacky, and I think it will be confusing for future maintenance. Although maybe we could tag it for removal when rewriting things in steps 2/3/4.

I guess I'm willing to give option 3 (the Hacky one) a shot and if it works then great. But I don't have time to redo PR #5255, particularly to support an edge case that will be eliminated before the next release.

@jasmainak
Member

Does following 2/3/4 actually guarantee info['meas_date'] = None gets set correctly in IO? What would be the logic if there is no special date? You cannot remove meas_id from FIF tree as far as I remember.

@agramfort
Member

agramfort commented Nov 20, 2019 via email

@jasmainak
Member

But DATE_NONE is our special date so what you propose amounts to 1. in @larsoner 's proposal, no?

@agramfort
Member

agramfort commented Nov 20, 2019 via email

@larsoner
Member

Okay I think I have a bit better handle on the problem. From 9 years ago Alex did in read_meas_info:

    if meas_date is None:
        meas_date = (info['meas_id']['secs'], info['meas_id']['usecs'])

This is reasonable if, e.g., some old files don't have a meas_date in the info struct.

But then on write we do:

    if info.get('meas_date') is not None:
        write_int(fid, FIFF.FIFF_MEAS_DATE, info['meas_date'])

So we only write if not None.

I propose a variant of (1): we change it so that this is always written, like:

    if 'meas_date' in info:
        meas_date = info['meas_date']
        meas_date = DATE_NONE if meas_date is None else meas_date
        write_int(fid, FIFF.FIFF_MEAS_DATE, meas_date)

This will not be totally backward compatible, as those old files were not written this way. But it is at least forward compatible with allowing info['meas_date'] = None to I/O round-trip properly. Then we can continue with steps 2/3/4. This passes the test added here about I/O round-trip as well as all existing mne/io tests locally. In case people want to look in diff form (and to get CIs going) it's up at #7090.

@bloyl
Contributor Author

bloyl commented Nov 20, 2019

The reason not to write DATE_NONE to the fiff file is that other readers (mne-c, mne-cpp, fieldtrip, brainstorm etc) don't know that DATE_NONE is special and will interpret it as an actual date.
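
To illustrate the concern, here is what a reader that knows nothing about the sentinel might do with it. This is only a sketch, not the behavior of any of the packages named above: naive_meas_date is a hypothetical helper, and DATE_NONE is assumed to be the (seconds, microseconds) pair (0, 2**31 - 1).

```python
from datetime import datetime, timedelta, timezone

DATE_NONE = (0, 2**31 - 1)  # assumed sentinel: (seconds, microseconds)


def naive_meas_date(secs, usecs):
    """Convert (secs, usecs) since the Unix epoch to a UTC datetime."""
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    return epoch + timedelta(seconds=secs, microseconds=usecs)


# A reader unaware of the sentinel reports a plausible-looking timestamp
# roughly 36 minutes after the Unix epoch instead of "no date".
print(naive_meas_date(*DATE_NONE).isoformat())
```

Under these assumptions nothing crashes, but the user sees a measurement date in 1970 with no hint that it is a placeholder.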

@agramfort
Member

agramfort commented Nov 20, 2019 via email

@larsoner
Member

Looking back at 0.17, what we used to do is not write meas_date when meas_date is None, but then we also set meas_id and file_id fields to DATE_NONE. And then on read we did:

    if meas_date is None:
        meas_date = (info['meas_id']['secs'], info['meas_id']['usecs'])
    if np.array_equal(meas_date, DATE_NONE):
        meas_date = None

So we could continue not writing meas_date (since we never did), but make sure that whatever gets used as a surrogate meas_date in that case gets set to the magical DATE_NONE values (point 4). It seems like these probably need to be done as part of anonymization anyway.

So maybe move point (4) up, add the np.array_equal(meas_date, DATE_NONE) conditional back, and then we don't need (1) anymore.
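
A pure-Python sketch of that combination, with a dict standing in for the fiff tree. The helper names are hypothetical and DATE_NONE is assumed to be (0, 2**31 - 1); tuple comparison replaces np.array_equal only to keep the example free of dependencies.

```python
DATE_NONE = (0, 2**31 - 1)  # assumed sentinel value


def toy_anonymize_ids(tree):
    """Point (4): set the surrogate date fields to the sentinel."""
    for key in ('meas_id', 'file_id'):
        tree[key]['secs'], tree[key]['usecs'] = DATE_NONE
    return tree


def toy_read_meas_date(tree):
    """0.17-style read: fall back to meas_id, then map the sentinel to None."""
    meas_date = tree.get('meas_date')  # the tag may be absent on disk
    if meas_date is None:
        meas_date = (tree['meas_id']['secs'], tree['meas_id']['usecs'])
    if tuple(meas_date) == DATE_NONE:
        meas_date = None
    return meas_date


# A file without a meas_date tag whose ids still carry a real timestamp
tree = {'meas_id': {'secs': 1000, 'usecs': 5},
        'file_id': {'secs': 1000, 'usecs': 5}}
before = toy_read_meas_date(tree)                    # guessed from meas_id
after = toy_read_meas_date(toy_anonymize_ids(tree))  # None
```

The fallback still serves old files that lack the tag, while files whose ids were reset to the sentinel read back with no date.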

The reason not to write DATE_NONE to the fiff file is that other readers (mne-c, mne-cpp, fieldtrip, brainstorm etc) don't know that DATE_NONE is special and will interpret it as an actual date.

Regarding this point, I'm not sure what these other readers do when they encountered missing meas_date before, but if they looked elsewhere in the tree, they already have this problem (and have had it for a while) so we probably need to fix that separately in MNE-MATLAB anyway.

@bloyl
Contributor Author

bloyl commented Nov 20, 2019

Is there any information on the FIFF file format? Specifically, what are meas_date and meas_id supposed to encode? Are they supposed to be the same?

@bloyl
Contributor Author

bloyl commented Nov 20, 2019

If we aren't concerned about other packages/platforms supporting DATE_NONE as a date, then I'm fine with the proposed fix currently sitting on #7090.

@larsoner
Member

  • meas_date
  • file_id. In our code this comes in start_file(fname) -> write_id(fid, FIFF.FIFF_FILE_ID, id_) and if id_ is None it generates one that uses DATE_NONE.

As for the motivations, I can only infer them from these files. But it does look like DATE_NONE is something that we've been writing for a couple of years, as it's in _generate_meas_id.

@larsoner
Member

Okay now all #7090 does is ensure that doing this works:

info['meas_date'] = None
anonymize_info(info)

It round-trips properly. Basically what we did before with info['meas_date'] = None was always incomplete anonymization, so people should do anonymize_info(info) afterward to ensure things work.

So I think we should merge #7090, then proceed with steps 2-4 above. @bloyl this sounds like it's compatible with what you're thinking (right?) and I think it stays true to what @agramfort and I talked about regarding info['meas_date'] = None round-tripping and so forth.

@bloyl
Contributor Author

bloyl commented Nov 21, 2019

This seems fine to me although I'm pretty sure that

info['meas_date'] = None
anonymize_info(info)

already works. It's even tested:

# test with meas_date = None
base_info['meas_date'] = None
exp_info_3['meas_date'] = None
exp_info_3['file_id']['secs'] = DATE_NONE[0]
exp_info_3['file_id']['usecs'] = DATE_NONE[1]
exp_info_3['meas_id']['secs'] = DATE_NONE[0]
exp_info_3['meas_id']['usecs'] = DATE_NONE[1]
exp_info_3['subject_info'].pop('birthday', None)
new_info = anonymize_info(base_info.copy(), daysback=delta_t_2.days)
assert_object_equal(new_info, exp_info_3)
new_info = anonymize_info(base_info.copy())
assert_object_equal(new_info, exp_info_3)

That being said the additional test in this PR seems fine.

My only suggestion would be to change the api of anonymize_info to take something like a remove_dates flag (default=False). Essentially this would set info['meas_date'] to None and then proceed through the rest of anonymize_info

That way user code would look like this.

raw.anonymize(remove_dates=True)

which I think is cleaner and less prone to error (i.e., only doing the info['meas_date'] = None part).
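
A rough sketch of what such a flag could do. It is purely illustrative: toy_anonymize_info is a stand-in rather than the real anonymize_info, remove_dates is only the parameter proposed here, and the fixed replacement date and the DATE_NONE value are assumptions.

```python
DATE_NONE = (0, 2**31 - 1)     # assumed sentinel value
DEFAULT_DATE = (946684800, 0)  # 2000-01-01 00:00:00 UTC, illustrative only


def toy_anonymize_info(info, remove_dates=False):
    """Toy anonymizer; remove_dates=True clears meas_date before proceeding."""
    if remove_dates:
        info['meas_date'] = None
    if info['meas_date'] is None:
        stamp = DATE_NONE          # no date: ids get the sentinel
    else:
        stamp = DEFAULT_DATE       # otherwise replace with a fixed date
        info['meas_date'] = stamp
    for key in ('meas_id', 'file_id'):
        info[key]['secs'], info[key]['usecs'] = stamp
    return info


def make_info():
    """Build a small example info dict with a real-looking timestamp."""
    return {'meas_date': (1574035200, 0),
            'meas_id': {'secs': 1574035200, 'usecs': 0},
            'file_id': {'secs': 1574035200, 'usecs': 0}}


kept = toy_anonymize_info(make_info())
removed = toy_anonymize_info(make_info(), remove_dates=True)
```

With the flag, clearing the date and resetting the ids happen in one call, so the two can't end up out of sync.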

@larsoner
Member

My only suggestion would be to change the api of anonymize_info to take something like a remove_dates flag (default=False). Essentially this would set info['meas_date'] to None and then proceed through the rest of anonymize_info

Let's get #7090 in, and then do steps 2-4, then see if it's necessary. I don't think it will be since people will be able to do the following if they really want info['meas_date'] = None:

raw.set_meas_date(None)
raw.anonymize()


Development

Successfully merging this pull request may close these issues.

meas_date is guessed during I/O when it is None

4 participants