WIP: Inverse-transform ReceptiveField #4546
Conversation
Codecov Report

    @@            Coverage Diff            @@
    ##           master    #4546     +/-   ##
    =========================================
    - Coverage   87.05%   85.85%    -1.2%
    =========================================
      Files         349      349
      Lines       65780    65830      +50
      Branches    10203    10208       +5
    =========================================
    - Hits        57263    56518     -745
    - Misses       5634     6456     +822
    + Partials     2883     2856      -27
I would change
+1 for computing them directly. How much longer does it take?
Should be identical if no regularization is applied
yes
I would just center X and Y a priori, same as in
mne/decoding/receptive_field.py (outdated)
                                        np.linalg.inv(np.dot(y.T, y))])
    else:
        inv_coef = np.linalg.multi_dot([self.estimator_.x_xt_, coef,
                                        np.linalg.inv(np.dot(y.T, y))])
You need to ensure that y is centered on 0.
Prefer using np.cov when possible to simplify the equation.
The autocorrelation calculation can be very slow, especially for long recordings.
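Taken together, these three points amount to something like the following sketch (the helper name, shapes, and the use of pinv are illustrative assumptions, not the PR's actual code):

```python
import numpy as np

def _patterns_sketch(X, y, coef):
    """Haufe-style patterns; X: (n_times, n_feats), y: (n_times, n_outputs),
    coef: (n_feats, n_outputs). A sketch only, not the PR's implementation."""
    # Center both X and y a priori, as suggested above
    X = X - X.mean(0, keepdims=True)
    y = y - y.mean(0, keepdims=True)
    y_hat = X @ coef  # model predictions
    # np.cov keeps the equation readable compared to raw X.T @ X products
    cov_X = np.cov(X.T)
    # pinv (rather than inv) guards against a singular prediction covariance
    inv_cov_yhat = np.linalg.pinv(np.atleast_2d(np.cov(y_hat.T)))
    return cov_X @ coef @ inv_cov_yhat
```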
Thanks for the comments @kingjr.
It's exactly the same computation; the only difference is that I wouldn't expose [...]
mne/decoding/receptive_field.py (outdated)
    coef = np.reshape(self.coef_, (n_feats * n_delays, n_outputs))
    if not isinstance(self.estimator_, TimeDelayingRidge):
        inv_coef = np.linalg.multi_dot([np.dot(X.T, X), coef,
                                        np.linalg.inv(np.dot(y.T, y))])
Use `inv` from scipy.linalg, not numpy.
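For reference, the requested swap is mechanical; a minimal sketch with placeholder data:

```python
import numpy as np
from scipy import linalg  # preferred over np.linalg in MNE

y = np.random.randn(100, 2)           # placeholder data
inv_yTy = linalg.inv(np.dot(y.T, y))  # instead of np.linalg.inv
```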
Force-pushed 0f5f9f5 to 72df78f
OK, I rebased after #4543. I fixed your first few comments, but I still need more tests. I'm still unclear as to what the consensus is regarding the API: are you OK with the way things are now, or should [...]? If it's the latter, should it be an extra parameter to [...]? The current code doesn't add more than a couple of seconds tops, but I guess this can add up when doing CVs and such.
Force-pushed 8ba21f8 to e6ced48
I'd rather compute them because I worry about the speed of recalculating. But @kingjr understands the [...]
Sorry, I mean I'd rather store the XtX and Xy than recompute them.
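A minimal sketch of that caching idea with a toy estimator; `x_xt_` is the attribute this PR exposes, while the class and `x_y_` are hypothetical:

```python
import numpy as np
from scipy import linalg

class _CachingEstimatorSketch:
    """Hypothetical: cache X'X and X'y at fit time for later reuse."""
    def fit(self, X, y):
        self.x_xt_ = X.T @ X  # stored so pattern computation can reuse it
        self.x_y_ = X.T @ y
        self.coef_ = linalg.solve(self.x_xt_, self.x_y_).T
        return self
```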
Can some example be updated to show the utility of this?
Yep, sure. I can update the existing example to show backward modelling + inverse patterns on the same data.
I'm in favour of [...]
Default False to save time, or True for convenience?
Default False for time.
Force-pushed a776fcd to a402d19
@nbara the positive-lags PR was merged, so feel free to rebase
(and update the examples/tutorials as necessary)
Force-pushed a402d19 to 9405aea
@larsoner I've rebased again and updated the example. Let me know what you think. FWIW, I can't reproduce Fig. 5 from Crosse et al. I've tried using sklearn's Ridge regression, as well as different hyperparameter values to match the paper, but it doesn't look similar. Maybe I've missed something.
    -Y, _ = raw[:]  # Outputs for the model
    -Y = Y.T
    +EEG, _ = raw[:]  # Outputs for the model
    +EEG = EEG.T
Please don't rename variables, as the PR can become controversial for little gain.
We usually use data (not EEG), and here it is written Y because that is the standard name for the predicted target variable in ML.
OK. The EEG is not the predicted variable in the backward case, so I thought it'd make more sense.
If you have an idea of what a good test would be to check that the inverse coefficients have the right values, besides checking that [...], let me know.
PS: I squeezed PR #4205 into whats_new, as I'd forgotten back then. Let me know if that's not kosher.
Force-pushed 87eb7a7 to cac4223
larsoner left a comment
Looks like good progress, some comments for you.
In case others want to look (@choldgraf?):
    -linear encoding model using the continuously-varying speech envelope to
    -predict activity of a 128 channel EEG system.
    +can perform a similar function along with scikit-learn. We will first fit a
    +linear encoding (or forward) model using the continuously-varying speech
You are going to get an angry GIF from @choldgraf; I had a similar "encoding (or forward)" in my PR and it did not go over well.
Actually I am fine with it if they're both specified, as long as encoding/decoding come first ;-)
also now I feel like a curmudgeonly old academic
    # Initialize the model. Here the features are the EEG data. We also specify
    # ``patterns=True`` to compute forward-transformed coefficients during model
    # fitting (cf. next section). We'll use a Ridge regression estimator with a
ridge regression (no caps)
    # recreate `figure 5`_ from [1]_. The backward model weights reflect the
    # channels that contribute most toward reconstructing the stimulus signal, but
    # are not directly interpretable in a neurophysiological sense. Here we also
    # look at the inversed model weights, which are easier to understand [2]_.
This last sentence should probably start another section (so the plot is separate) and also give some explanation of what you mean by "easier to understand" -- one or two sentences is fine.
agree - we want one figure per section
        y = y.reshape(-1, y.shape[-1], order='F')
    else:
        X = X - X.mean(0, keepdims=True)
        cov_ = np.cov(X.T)
Can you change the TimeDelayingRidge class to have a patterns option? Then this can be stored instead of being recomputed.
You could create a _compute_patterns(model, X, y) that would be shared between them.
@larsoner Not sure I understand. What is being recomputed here? I'm only computing the pattern once either way...
Do you want the pattern/inverse coef to be computed inside TDR if pattern=True? That means there's going to be some code redundancy between receptive_field.py and TDR.py; is that OK?
@larsoner Sorry if I'm being slow, but can you elaborate on what it is you want to see?
Or are you happy with the current state of the code?
It looks like you've already changed the code to be fast for TDR so I think we're good
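A rough shape for the shared helper suggested above, reusing a cached cross-product when the estimator exposes one; everything here is a sketch under assumed shapes, not the merged code:

```python
import numpy as np

def _compute_patterns(model, X, y):
    """Sketch of a helper shared by ReceptiveField and TimeDelayingRidge.
    y is unused here; kept only to match the suggested signature."""
    Xc = X - X.mean(0, keepdims=True)
    if hasattr(model, 'x_xt_'):
        # TimeDelayingRidge already stored the cross-product at fit time
        # (assuming X was centered there, this approximates the covariance)
        cov_X = model.x_xt_ / (len(X) - 1)
    else:
        # plain estimators (e.g. sklearn Ridge): compute it from the data
        cov_X = np.cov(Xc.T)
    # assumes coef is stored as (n_feats, n_outputs); real estimators may
    # need a transpose first
    coef = np.reshape(model.coef_, (X.shape[1], -1))
    y_hat = Xc @ coef
    inv_cov = np.linalg.pinv(np.atleast_2d(np.cov(y_hat.T)))
    return cov_X @ coef @ inv_cov
```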
mne/decoding/receptive_field.py (outdated)
    del X

    # Inverse output covariance
    inv_Y = 1. / float(n_times * n_epochs - 1)
this should go in an else clause because it's not used in the y.ndim == 2 and y.shape[1] != 1 case
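Concretely, the restructure being requested looks like this (illustrative shapes; the multi-output branch's actual body is elided):

```python
import numpy as np

n_times, n_epochs = 1000, 1
y = np.zeros((n_times * n_epochs, 1))  # placeholder single-output target

if y.ndim == 2 and y.shape[1] != 1:
    y = y.reshape(-1, y.shape[-1], order='F')  # inv_Y is not used here
else:
    # Inverse output covariance: only compute it on the branch that uses it
    inv_Y = 1. / float(n_times * n_epochs - 1)
```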
    # assert_allclose(inv_rf.coef_, np.transpose(rf.patterns_, (1, 0, 2)),
    #                 rtol=0.1)
    # assert_allclose(rf.coef_, np.transpose(inv_rf.patterns_, (1, 0, 2)),
    #                 atol=0.1)
This might only be true if regularization is 0., can you check?
Yeah, my sense is that the regularization will do some strange things. Plus, we're including extra information when we do the inverse transform: if we just do encoding-model ridge, we're getting weights by normalizing with the stimulus covariance matrix; if we do an inverted decoding model, we're getting weights normalized with the channels' covariance matrix, then normalized again by the stimulus covariance matrix, no?
@larsoner Yeah, I checked and this also fails without regularization. Like @choldgraf, I tend to think there's something going on with the regularisation, although I can't put my finger on what exactly.
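For what it's worth, the Haufe identity itself can be verified in a self-contained simulation with a plain least-squares decoder and no regularization (all names and values below are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)
n, p = 20000, 4
s = rng.randn(n, 1)                          # stimulus ("source")
a = np.array([[1.], [0.5], [-0.3], [0.]])    # true forward pattern
x = s @ a.T + 0.1 * rng.randn(n, p)          # simulated channels

w = np.linalg.lstsq(x, s, rcond=None)[0]     # OLS decoder, no regularization
s_hat = x @ w
# Haufe (2014): A = cov(x) @ W @ inv(cov(s_hat))
A = np.cov(x.T) @ w @ np.linalg.pinv(np.atleast_2d(np.cov(s_hat.T)))
print(np.allclose(A / A[0], a / a[0], atol=0.05))  # True (pattern up to scale)
```

This recovers the true pattern up to scale, which suggests the failing round-trip above may stem from the delay embedding or scaling conventions rather than from the identity itself.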
    mne.viz.plot_topomap(np.mean(mean_patterns[:, ix_plot], axis=1),
                         pos=info, axes=ax[1],
                         show=False, vmin=-max_patterns, vmax=max_patterns)
    ax[1].set(title="Forward-transformed coefficients")
Can you add the "for delay ..." so there is a better correspondence with the earlier section?
Force-pushed 786983e to 6d1b6e4
@kingjr left another block on this PR... I'm hesitant to merge until he unblocks it.
Here's a link to the rendered page so that it's a top-level link:
LGTM - my only question is the delays in the final plot (-.14 and -.125). I think they're negative because the stimulus leads the brain activity, which in the model will show up as positive delays (i.e., from the perspective of the stimulus, brain activity in the future is predicting stimulus activity now). I think we should mention this so that it doesn't confuse people.
@choldgraf How about this as a comment just prior to specifying the lags: [...]?
IMO [...]
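One possible wording for such a comment (purely illustrative phrasing and values, not the proposed text):

```python
# Lags for the decoding model. They are negative because the stimulus
# precedes the brain activity that encodes it: from the stimulus's
# perspective, brain activity in the future predicts the stimulus now,
# so the decoder looks at EEG samples *after* each stimulus sample.
tmin, tmax = -0.2, 0.  # illustrative values, not the example's actual lags
```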
I can't (because it's been rebranched?), but my comment has been addressed |
@nbara I like the proposed change; let's see if @choldgraf agrees, then merge.
Commit messages:
- only works with estimator=Ridge() at the moment
- demean prior to computing coefficients; rename inverse_coef_ to patterns_; use scipy.linalg; some tests (incomplete)
- Can't reproduce Lalor's figures.
- squeezed in GDF support in whatsnew
- replace `backward` by `decoding`; rewrite example; fix section headers in example; TODO: add patterns option to TimeDelayingRidge
Force-pushed 3a2310d to 9d39873
Force-pushed 9d39873 to 4bb7955
@larsoner OK, it's pushed, but CircleCI fails. I'm not sure why. Let me know if there's anything to do on my end.
Thanks @nbara!
woooo way to go! nice addition!
This is an attempt to implement inverse coefficient computation according to [1], which has been discussed in #3884. This is mostly useful to visualise weights for backward models (aka stimulus reconstruction).
ATM the computation is done in ReceptiveField.fit(), so I had to expose the data covariance x_xt_ in TimeDelayingRidge(). I can compute the inverse coefficients directly inside TDR if you prefer.
These are specific points where your input would be appreciated:
- rename inverse_coefs_ to patterns_? [...]
Thanks in advance.
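For reference, the transformation from [1] being implemented here: for a linear backward model with weights W applied to data x, yielding source estimates \hat{s} = W^T x, the corresponding forward patterns are

```latex
A = \Sigma_x W \Sigma_{\hat{s}}^{-1}
```

where \Sigma_x is the data covariance and \Sigma_{\hat{s}} is the covariance of the estimated sources.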
[1] Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.-D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96–110. doi:10.1016/j.neuroimage.2013.10.067