MNT readd modopt #45

Merged
agramfort merged 21 commits into main from readd_modopt on Apr 15, 2022

Conversation

@mathurinm
Collaborator

Once #44 is merged, this will allow us to re-add the modopt solvers as they were before removal.

@agramfort
Contributor

CIs don't complain here. I am surprised.

@mathurinm can you report whether this branch https://github.com/CEA-COSMIC/ModOpt/tree/develop fixes the issue?

@agramfort
Contributor

cc @sfarrens

@tomMoral
Member

This issue should have been fixed in CEA-COSMIC/ModOpt#189.
Let's run the CI to see if everything works fine.

@mathurinm
Collaborator Author

The solvers themselves work now, but they break benchopt since they make arrays non-writeable, and sklearn's and celer's Cython code does not like that:
CEA-COSMIC/ModOpt#204
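For context, a minimal illustration of the failure mode in plain NumPy (not benchopt's actual code): once an array is flagged read-only, any code that writes into it, such as Cython kernels that take writable buffers, raises an error.

```python
import numpy as np

X = np.ones((3, 2))
X.flags.writeable = False  # what ModOpt does to its input arrays

try:
    X[0, 0] = 2.0  # any in-place write now fails
except ValueError as err:
    print(err)  # assignment destination is read-only
```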

@mathurinm
Collaborator Author

Maybe it's worth getting someone from modopt to help here, because their solvers don't seem bad at all:
image

@agramfort (Contributor) left a comment:

Otherwise, at the end of the modopt run, do:

```python
X.flags.writeable = True
y.flags.writeable = True
```

to revert what modopt does internally.
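A minimal sketch of that suggestion as a small helper (hypothetical name, not the PR's actual code); this works because the benchmark's input arrays own their data, so the flag can be flipped back:

```python
import numpy as np

def restore_writeable(*arrays):
    """Re-enable writes on arrays that a solver flagged read-only.

    Hypothetical helper sketching the suggestion above: call it after
    the modopt run to revert what modopt does internally.
    """
    for a in arrays:
        a.flags.writeable = True

X = np.zeros((4, 3))
y = np.zeros(4)
X.flags.writeable = False  # simulate what ModOpt does internally
y.flags.writeable = False

restore_writeable(X, y)
X[0, 0] = 1.0  # writes work again
```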

mathurinm and others added 3 commits December 20, 2021 16:08
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
Co-authored-by: Alexandre Gramfort <alexandre.gramfort@m4x.org>
@mathurinm
Collaborator Author

OK for 1 rep:
image
KO for 3 reps:
image

probably a warm restart issue

@tomMoral
Member

Indeed, there is no initialization of the variable in `iterate`, which calls `_update`, where the state is kept as `_x_old`.
We need to reset the internal state manually (maybe in `get_result`, to avoid the extra cost?).
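To make the warm-restart bug concrete, here is a toy stand-in (names illustrative, not ModOpt's actual API) showing why the internal state has to be reset between repetitions:

```python
import numpy as np

class ToyFISTA:
    """Toy stand-in for a solver with internal state `_x_old`.

    If `_x_old` survives between benchmark repetitions, the second run
    silently warm-starts from the previous solution instead of x0.
    """
    def __init__(self, x0):
        self._x_old = x0.copy()

    def iterate(self, n_iter):
        for _ in range(n_iter):
            self._x_old = 0.5 * self._x_old  # dummy update step

    def reset(self, x0):
        # manual reset of the internal state (e.g. done in get_result)
        self._x_old = x0.copy()

x0 = np.ones(3)
algo = ToyFISTA(x0)
algo.iterate(2)   # first repetition
algo.reset(x0)    # without this, the next rep warm-starts
algo.iterate(2)   # second repetition now matches the first
```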

@mathurinm
Collaborator Author

Thanks for the pointer @tomMoral, resetting `_x_old` seems OK for POGM.

For FISTA, I don't know; even for 1 run I get a divergence for lambda_max / 2:

```
(base) ➜  lasso git:(readd_modopt) ✗ benchopt run -e . -d simulated -f modopt -s cd -s sklearn -r1
WARNING: astropy not found, will default to scipy for convolution
Benchopt is running
Simulated[n_samples=100,n_features=500,rho=0.6]
|--Lasso Regression[fit_intercept=False,reg=0.5]
|----cd: done
WARNING: Making input data immutable.
|----ModOpt-FISTA[restart_strategy=greedy]: diverged
```

@mathurinm
Collaborator Author

@tbng can you take over this PR? It seems the divergence is caused by `min_beta` being set to 1 instead of 1 / L.
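The 1/L point can be checked with a toy gradient loop in plain NumPy, independent of ModOpt: a step size of 1 blows up as soon as the Lipschitz constant of the gradient exceeds 2, while a step of 1/L stays stable.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
y = rng.standard_normal(50)
L = np.linalg.norm(X, ord=2) ** 2  # Lipschitz constant of w -> X.T @ (X @ w - y)

def gradient_descent(step, n_iter=200):
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        w = w - step * (X.T @ (X @ w - y))
    return w

w_good = gradient_descent(1.0 / L)  # stable
with np.errstate(over="ignore", invalid="ignore"):
    w_bad = gradient_descent(1.0)   # diverges: here L is far above 2
```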

@tbng
Contributor

tbng commented Mar 16, 2022

> @tbng can you take over this PR? It seems the divergence is caused by min_beta being set to 1 instead of 1 / L

Sure, no problem -- just to be sure: does this mean fixing this param and rerunning the whole benchmark?

@mathurinm
Collaborator Author

Yes, just fix the param and check that the modopt solvers behave nicely, for example by running them along with celer and cd on dense and sparse data.

@tbng
Contributor

tbng commented Mar 17, 2022

The benchmark with modopt_fista greedy and min_beta=1/L went fine, but there is a divergence case for the leukemia dataset with the adaptive restarting scheme:

Screenshot_2022-03-17_16-47-49

ping @zaccharieramzi I guess?

Edit: actually in this case the red curve converges (although slowly) until the very last time step, which is also confusing.

@tbng
Contributor

tbng commented Mar 29, 2022

I managed to avoid the divergence in FISTA restart-adaptive by increasing the initial value of beta_param (\gamma in Algorithm 4 of the citation). However, it is quite slow to converge when running the benchmark with 5 repetitions. @zaccharieramzi do you think this is consistent with what you've observed for ModOpt-Fista-Adaptive-1 (the red line)?
Screenshot_2022-03-29_09-34-56
Screenshot_2022-03-29_09-36-21

@zaccharieramzi
Contributor

Hi @tbng ,

Indeed, according to this: https://hal.inria.fr/hal-02298569/document, FISTA with adaptive restart is rather slow compared to the best performers.

@mathurinm
Collaborator Author

mathurinm commented Mar 31, 2022

Hi @tbng did you push all of your changes or did I miss something? I still get divergence:

reg = 0.5 | reg = 0.1
image image

Even when it converges, it looks like something is wrong given the time it takes:
image

Greedy also diverges on other data... :
image

@tbng
Contributor

tbng commented Mar 31, 2022

> Hi @tbng did you push all of your changes or did I miss something? I still get divergence:
> reg = 0.5 | reg = 0.1
> image image
>
> Even when it converges, it looks like something is wrong given the time it takes: image
>
> Greedy also diverges on other data... : image

Sorry, I could not push to this PR! What I did is just change the parameters to p_lazy = 1/20 and q_lazy = 1/5 -- but I think this is why FISTA with adaptive restart converges much slower than before (one can see that from Algorithm 5 of Liang 2018, https://arxiv.org/abs/1811.01430), although with no divergence.

But from some exchanges I think @mathurinm has arrived at much better results and filed issues: CEA-COSMIC/ModOpt#221
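For intuition on what p_lazy and q_lazy change, here is the generalized FISTA t-sequence as I understand it from the discussion, t_{k+1} = (p + sqrt(q + r t_k^2)) / 2, where (p, q, r) = (1, 1, 4) recovers classical FISTA (this parameterization is my reading; check ModOpt's source for the exact formula). Smaller "lazy" values grow t more slowly, i.e. apply less momentum early on:

```python
import numpy as np

def momentum_sequence(p, q, r=4.0, n=10):
    """t-sequence t_{k+1} = (p + sqrt(q + r * t_k^2)) / 2, starting at t_0 = 1.

    Assumed generalization of the classical FISTA update; with lazy values
    like p = 1/20, q = 1/5 the sequence grows much more slowly.
    """
    t = 1.0
    ts = [t]
    for _ in range(n):
        t = 0.5 * (p + np.sqrt(q + r * t * t))
        ts.append(t)
    return np.array(ts)

classical = momentum_sequence(1.0, 1.0)
lazy = momentum_sequence(1 / 20, 1 / 5)
```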

@mathurinm
Collaborator Author

@agramfort a few curves:

image image image

@mathurinm mathurinm changed the title NOMRG readd modopt MNT readd modopt Apr 11, 2022
@mathurinm mathurinm requested a review from tomMoral April 11, 2022 08:33
```python
else:
    def op(w):
        return self.X @ w
    # TODO implement correct gradient if fit_intercept
```
Contributor

@mathurinm do you want to merge with this?

Contributor

Hasn't the correct gradient already been defined for the fit_intercept case at lines 66-68?

Collaborator Author

@tbng that's the "prediction"/forward step (`X @ w`). I have now implemented the gradient (`trans_op`) too.
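A sketch of such a forward/adjoint pair with the intercept handled as an extra trailing coefficient (illustrative names, not the PR's exact code); the adjoint identity <op(w), r> = <w, trans_op(r)> is what makes trans_op the right building block for the gradient:

```python
import numpy as np

def make_ops(X, fit_intercept):
    """Forward / adjoint pair for the design matrix, with the intercept
    stored as the last coefficient when fit_intercept is True.
    (Illustrative sketch, not the PR's exact code.)
    """
    if fit_intercept:
        def op(w):
            return X @ w[:-1] + w[-1]  # predictions with intercept
        def trans_op(r):
            return np.concatenate([X.T @ r, [r.sum()]])  # adjoint
    else:
        def op(w):
            return X @ w
        def trans_op(r):
            return X.T @ r
    return op, trans_op

# Sanity check: <op(w), r> must equal <w, trans_op(r)>
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))
w = rng.standard_normal(4)
r = rng.standard_normal(5)
op, trans_op = make_ops(X, fit_intercept=True)
assert np.isclose(op(w) @ r, w @ trans_op(r))
```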

@mathurinm
Collaborator Author

I have added support for fit_intercept:
image

@agramfort agramfort merged commit 6b655c9 into main Apr 15, 2022
@agramfort agramfort deleted the readd_modopt branch April 15, 2022 08:40
@agramfort
Contributor

thx @mathurinm
