Skip to content

Conversation

@jackz314
Copy link
Contributor

Reference issue

N/A

What does this implement/fix?

This adds the ability to export Raw or Epochs instances as EEGLAB set files. As well as the ability to import channel location data from separate CSV files or another file (using read_raw).

Additional information

The exported to set function (save_set) was based on mnelab's write_set. I added channel location export and events/epochs data export so that Epochs exports can be read by EEGLAB.

The channel location expansion function (xyz to all, used for save_set) referenced MATLAB's cart2sph, as well as EEGLAB's relevant functions.

It has been roughly tested on a few data sets on both EEGLAB and MNE.

@jackz314
Copy link
Contributor Author

Seems like the tests are failing due to the lack of docstrings. Should I add them?

@mmagnuski
Copy link
Member

Welcome on your first mne PR, @jackz314!
Thanks for contributing, beinging able to save files in .set format would be a nice addition (at least from my perspective).

Seems like the tests are failing due to the lack of docstrings. Should I add them?

Yes, we follow the numpy docstring format. Take a look around in the codebase - you will see the pattern. The docstrings are required for all public functions.
Adding a functionality like that also requires writing tests that cover > 90% of the added code.
Also - make sure you follow the PEP8 code style.
You will find more guidelines here.

@agramfort
Copy link
Member

thx @jackz314 for the work

we should discuss how to approach this. So far we have been reluctant to support writing for something different from .fif as it's very likely that the save to something else will not persist to disk all the informations in the mne instance.

Also you add a lot of public functions to mne and we are support careful about this. Should i had started this PR I would have tried to support writing without any new public function: just use .save method and detect the file extension to know what to do.

@hoechenberger
Copy link
Member

Nice job, @jackz314!
Like @agramfort said, ideally all functions except for save_set() should be private.

Also I'm wondering if instead of adding save_set(), we should consider adding a parameter to save(), allowing to save in EEGLAB format. Or it could even auto-pick "EEGLAB .set" if the output file extension is .set.
But this really needs a bit of discussion (and, most certainly, much more testing…)

@jackz314
Copy link
Contributor Author

Thanks guys. I'm currently writing tests on this, but my expertise in both EEGLAB and MNE is limited, so I would probably need help from other people to test this more thoroughly - As I'm writing these tests, I'm already discovering multiple bugs in my code. I agree maybe merging save_set with save would be better.

The only two user-facing functions would probably be save_set and import_eeg_chan_location_from_csv. get_eeglab_full_cords and cart_to_eeglab_full_coords doesn't seem to be useful outside of conversion between eeglab and mne, so I can change them to private.

As for information loss, unfortunately, it seems like there will be some unavoidable info loss, so far from my limited testing, things like re-reference, baseline, and high/low pass freq info are lost when converting between the two. For re-referencing and hi/lo pass, the modified data is preserved, but in EEGLAB there's no way of knowing what kind of re-referencing is done, and when converting back to MNE, there's no way of getting the filtering parameters.

Info lost is definitely not ideal, but in a lot of cases it's tolerable (as long as data is intact and as expected), so I think maybe some warning could be given to users who try to export as set files, that way they are aware of the info lost, but can still benefit from the conversion. I'm currently working with a lab where EEGLAB is heavily used, but they also want to use Python as well, since some things like ML is easier on Python, this is why I wrote these functions, and I think other people in similar situations would benefit from this as well.

@hoechenberger
Copy link
Member

import_eeg_chan_location_from_csv

This one should probably also be integrated into read_custom_montage()

@jackz314
Copy link
Contributor Author

import_eeg_chan_location_from_csv

This one should probably also be integrated into read_custom_montage()

I just realized that channel location data in MNE is called Montage, I will integrate it into read_custom_montage(), but it should probably be a separate PR.

@hoechenberger
Copy link
Member

I will integrate it into read_custom_montage(), but it should probably be a separate PR.

This actually sounds like a good idea!!

@jackz314
Copy link
Contributor Author

I also would like to know where I should put the tests for these functions. Currently, I put the tests for Epochs' save set to tests/test_epochs.py, and I plan on putting the tests for Raw's save set to io/tests/test_raw.py, would this be appropriate? Or should I put both of them into io/eeglab/tests/test_eeglab.py since they are related to EEGLAB?

@jackz314
Copy link
Contributor Author

I noticed that channel location data is actually stored in the format of -y, x, z instead of x, y, z. Is there a particular reason for this? Additionally, is this a standard that applies to all location data or specific to some formats like EEGLAB set files?

@hoechenberger
Copy link
Member

hoechenberger commented Mar 27, 2021

Hello @jackz314,

I briefly talked with @agramfort about this effort and we had the feeling that adding support for writing to new formats could end up as opening Pandora's box: it would be very challenging to ensure that no data is lost (or even to quantify as to which data is lost, specifically) during an I/O roundtrip.

We felt that maybe this functionality would be better suited in an external package, which could very well be maintained under the MNE umbrella (mne-tools GitHub organization), but could be released separately from MNE, which would make a lot of sense as this would allow to push out bugfixes on an independent schedule.

Another idea I had and which is kind of contradictory to what I said above is adding a new method for exports that possibly lead to data loss. This is essentially what GIMP does: "save/save as" saves to their native format; "export" converts the data, possibly losing information. So I would vote for calling a method that exports to non-FIFF files export().

I'd love to hear your and the other developers' thoughts!

@jackz314
Copy link
Contributor Author

Personally, I think this feature would be helpful even if there is some data loss, like I mentioned above, I think it would be better to provide warnings to users who use this function (and maybe on relevant docs as well) about potential data loss than to not provide the feature at all. As you said, I think using a separate and distinct name like export instead of save or save_set could further make sure that the user knows what they are doing.

Some data will be lost unfortunately due to some features being implemented very differently in EEGLAB and MNE (e.g., reference channel/method, filtering configuration, etc.). However, since the core data and some other fields like events/trials/epochs/annotations (the events "family") and channel info can mostly be exchanged without data loss, exporting will still be helpful even if other fields might not survive the i/o cycle.

I'm leaning more towards your second idea, because I feel like since other functions that read from EEGLAB are already maintained in the main MNE package, any changes to EEGLAB that requires any updates to MNE's I/O should occur naturally together. I also think that maintaining a separate package just for exporting to EEGLAB seems a bit overkill and shouldn't be necessary, especially since reading from EEGLAB is already in MNE. Still, I'm also fine with starting a new package, if the consensus later agrees that this feature should be in a separate package, I'd be happy to port my code over.

@hoechenberger
Copy link
Member

hoechenberger commented Mar 27, 2021

Pinging @larsoner, @jasmainak, @adam2392, @sappelhoff, and @cbrnr

@agramfort
Copy link
Member

agramfort commented Mar 27, 2021 via email

@adam2392
Copy link
Member

adam2392 commented Mar 27, 2021

Don't we have a "pybv" package for brainvision? Perhaps we have:

  • Pyedf (I think there's pyedflib but writing was non trivial)
  • Pyeeglab?

Maybe "raw.export" just calls these underlying packages? Unit testing should primarily be in those sub packages to address @agramfort objection?

the only reason I even use eeglab is cuz their ICA is more advanced then mne. Don't use them for anything else :p. Imo these kinds of write/export functionality will be very attractive for cross-platform usage. Could even have things like:

Pypersyst
Pynihonkohden

? This can be useful for mne-bids too for writing other supported BIDS formats of EEG/IEEG.

@agramfort
Copy link
Member

agramfort commented Mar 27, 2021 via email

@hoechenberger
Copy link
Member

@adam2392

Maybe "raw.export" just calls these underlying packages? Unit testing should primarily be in those sub packages to address @agramfort objection?

Yeah that sounds like a possibility. Raw.export() would just be a few lines then, passing self to one of those packages and emitting a warning – that's about it.

the only reason I even use eeglab is cuz their ICA is more advanced then mne.

It's off-topic alright, but could you elaborate? Do you mean the interactive exploration of components..?

@adam2392
Copy link
Member

They have this nice feature which plugs into a database of features I think that can automatically estimate ICA labels in case you have 100s of subjects and can't feasibly run manual ICA, they actually have a cool feature that will match it against known labels to try to estimate if it's eye blink, heart beat, etc.

Sorry off topic yeah :p.

@hoechenberger
Copy link
Member

hoechenberger commented Mar 27, 2021

They have this nice feature which plugs into a database of features I think that can automatically estimate ICA labels in case you have 100s of subjects and can't feasibly run manual ICA, they actually have a cool feature that will match it against known labels to try to estimate if it's eye blink, heart beat, etc.

Sorry off topic yeah :p.

Ah, that one. Yes we should add this to MNE too. Actually shouldn't be too difficult, I believe! We already do have a pattern matching infrastructure.

Copy link
Member

@hoechenberger hoechenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more nitpicks and then I'm happy 😃


if fmt not in supported_formats:
raise ValueError(f"Format '{fmt}' is not supported. "
f"Supported formats are {supported_formats}.")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we only want to print the keys here, separated by commas, i.e. something like

', '.join(supported_formats.keys()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well someone could attempt to use file extensions to infer automatically, and if the extension is not supported I think printing out the dictionary might be more helpful compared to just the formats.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can also print the dictionary in a better way (e.g. eeglab (.set), brainvision (.eeg, .vmrk, ...))

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great idea!

Co-authored-by: Richard Höchenberger <richard.hoechenberger@gmail.com>
Copy link
Member

@hoechenberger hoechenberger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌👌👌👌👌👌👌❤️

@hoechenberger
Copy link
Member

@cbrnr Feel free to merge if you're happy!

Fantastic work, and thank you so much for your endless patience, @jackz314!

Supported formats: EEGLAB (set, uses :mod:`eeglabio`)
%(export_warning)s
%(export_params_base)s
fmt : 'auto' | 'eeglab'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fmt : 'auto' | 'eeglab'

Supported formats: EEGLAB (set, uses :mod:`eeglabio`)
%(export_warning)s
%(export_params_base)s
fmt : 'auto' | 'eeglab'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
fmt : 'auto' | 'eeglab'

@cbrnr
Copy link
Contributor

cbrnr commented Apr 19, 2021

@hoechenberger can you check what needs to be done for @agramfort's latest comments? I think we're not yet ready.

@jackz314
Copy link
Contributor Author

@hoechenberger can you check what needs to be done for @agramfort's latest comments? I think we're not yet ready.

Yeah I removed the redundant code but I'm not sure about the docstring part.

@cbrnr cbrnr added this to the 0.23 milestone Apr 19, 2021
@cbrnr cbrnr changed the title Add functionality to export to set files. Import channel location. Add functionality to export to EEGLAB .set Apr 19, 2021
@jackz314 jackz314 requested a review from larsoner April 22, 2021 00:00
Copy link
Member

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code looks good, just minor suggestions really!

However, I don't think we should merge until this is solved, i.e., eeglabio makes a PyPi release:

$ pip install --user eeglabio
ERROR: Could not find a version that satisfies the requirement eeglabio
ERROR: No matching distribution found for eeglabio

and then we add eeglabio to github_actions_dependencies.sh in the same place as nitime requirements_testing_extra.txt. I suspect that none of these lines are really being tested by CIs, and codecov agrees:

codecov/patch — 22.32% of diff hit (target 95.00%)

I know that there is already a pip install -i https://test.pypi.org/simple/ eeglabio in the eeglabio instructions, but I think users will rightly assume they should be able to pip install eeglabio, and we should wait until that's available. Also you might consider setting up a conda-forge recipe for it, maybe @hoechenberger can help with that end?

@larsoner
Copy link
Member

@jackz314 now that #9337 is merged, if you rebase or merge with upstream/main you should be able to add eeglabio to requirements_testing_extra.txt once you make a PyPi release, and it should make CIs actually test the new functionality

@jackz314
Copy link
Contributor Author

@jackz314 now that #9337 is merged, if you rebase or merge with upstream/main you should be able to add eeglabio to requirements_testing_extra.txt once you make a PyPi release, and it should make CIs actually test the new functionality

I moved eeglabio from Test PyPI to PyPI and added it as an requirement. The tests should now run on CIs as well.

@agramfort
Copy link
Member

agramfort commented Apr 23, 2021 via email

@jackz314
Copy link
Contributor Author

you have some remaining comments from @larsoner to address. thx !

For some reason, I didn't see the comments before. I just addressed them.

@larsoner larsoner merged commit c6b22e8 into mne-tools:main Apr 23, 2021
@larsoner
Copy link
Member

Thanks for this @jackz314 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants