Conversation

@PonteIneptique (Contributor) commented Jun 26, 2020

It's a work in progress for now. I still need to implement multi-GPU support, but I don't have access to multiple GPUs right now.

@PonteIneptique (Contributor, Author)

Rather than issues, I'd be looking for feedback on my implementation :)
There is still some work left to do, but this looks promising.

@emanjavacas (Owner)

Hey. Nice that you got it wrapped up. I have two comments on this.

  • I'd really try to avoid duplicating classes/scripts just because of the optimization (in this case the Trainer class and the train script). The train script is perhaps less problematic, but the Trainer class is meant to be very general and I'd prefer not to have to maintain two classes that do almost the same thing. If the Trainer class has to adapt slightly for it, I'd be fine with that.

  • Many of the parameters we'd like to optimize are nested in the config file. You can have a look at how I deal with this in the optimize.py file and perhaps try something along those lines. There is already code to parse the opt.json files, so I think that could be reused.

@PonteIneptique (Contributor, Author)

Regarding 1., I agree, but I'm unsure about the way forward. Technically, I reused as much as I could from the Trainer class; most of the duplicated code comes directly from the train script. Maybe the Trainer class could have a setup(settings) method though, which would further reduce the need for duplication.

As for 2., I deal with nested parameters using paths ("lr/patience"). I did not include an example of this, though. I'll look at your implementation for it.
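
For illustration, a minimal sketch of how such a path could be applied to a nested settings dict; the helper name `set_by_path` is hypothetical and not code from this PR:

```python
def set_by_path(settings, path, value):
    # Hypothetical helper: walk a "/"-separated path through nested dicts
    # and set the leaf value, e.g. "lr/patience" -> settings["lr"]["patience"].
    *parents, leaf = path.split("/")
    node = settings
    for key in parents:
        node = node[key]
    node[leaf] = value


settings = {"lr": {"patience": 2, "factor": 0.5}}
set_by_path(settings, "lr/patience", 5)
assert settings["lr"]["patience"] == 5
```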

@PonteIneptique (Contributor, Author)

I think something along the following lines would be neat:

trainer, trainset, devset, encoder, models = Trainer.setup(settings)
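
To make the proposal concrete, a rough sketch of what such a classmethod could look like; `load_datasets`, `build_encoder` and `build_models` stand in for whatever the train script currently does and are not actual pie functions, and the constructor arguments are likewise placeholders:

```python
class Trainer:
    # ... existing Trainer code ...

    @classmethod
    def setup(cls, settings):
        # Hypothetical sketch: move the setup boilerplate that currently
        # lives in the train script here, so that train.py and the
        # optimization script can both reuse it.
        trainset, devset = load_datasets(settings)    # placeholder
        encoder = build_encoder(settings, trainset)   # placeholder
        models = build_models(settings, encoder)      # placeholder
        trainer = cls(settings, models, trainset, devset)  # placeholder args
        return trainer, trainset, devset, encoder, models
```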

@PonteIneptique (Contributor, Author)

I reworked the code to move towards a more unified training API. If you agree with it, I'll apply it to the other scripts.

// {
// "path": "cemb_dim",
// "type": "suggest_int",
// "args": [200, 600]
@emanjavacas (Owner)

Doesn't optuna allow you to define sampling distributions for the parameters?

@PonteIneptique (Contributor, Author)

The way I designed it, you can basically use any optuna distribution function; the config just makes things simpler, as you don't have to code it yourself.
I have not yet implemented the GridSampler and the like. That should probably be the next target... https://optuna.readthedocs.io/en/stable/reference/samplers.html?highlight=suggest#optuna.samplers.GridSampler
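
For readers following along, a simplified sketch of the dispatch idea, showing how an entry like the one above could be turned into an optuna suggestion inside the objective; the dummy objective is illustrative, not the PR code:

```python
import optuna

# One entry in the style of the opt.json example above.
entry = {"path": "cemb_dim", "type": "suggest_int", "args": [200, 600]}

def objective(trial):
    # Dispatch on "type": any trial.suggest_* method can be named in the
    # config, e.g. suggest_int, suggest_uniform, suggest_categorical.
    value = getattr(trial, entry["type"])(entry["path"], *entry["args"])
    # A real objective would write `value` into the settings, train a model
    # and return the dev score; here we just return a dummy number.
    return float(value)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=5)
```

A GridSampler would then only need to be passed at study creation, e.g. optuna.create_study(sampler=optuna.samplers.GridSampler({"cemb_dim": [200, 400, 600]})).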

@PonteIneptique (Contributor, Author)

Done ;)

@emanjavacas (Owner) commented Jun 27, 2020 via email

@PonteIneptique (Contributor, Author)

Thanks for the review ;)

@PonteIneptique (Contributor, Author)

Ok, this is ready for another look from you. I addressed all your concerns (I think). This should be neat :)

PonteIneptique and others added 4 commits July 3, 2020 09:47
… reset settings

The Trial would not reset the Model settings used for model training while still thinking it did, hence it kept training indefinitely on the same first params.
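
The fix described here amounts to not letting one trial's suggested values leak into the next; a minimal sketch of that idea (hypothetical names, not the actual diff):

```python
from copy import deepcopy

def objective(trial, base_settings):
    # Start every trial from a fresh copy of the original settings so that
    # values suggested in one trial never carry over to the next.
    settings = deepcopy(base_settings)
    settings["cemb_dim"] = trial.suggest_int("cemb_dim", 200, 600)
    # ... build and train the model from `settings` and return the dev score
    return train_and_score(settings)  # placeholder for the actual training call
```
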
@PonteIneptique (Contributor, Author)

All in all, I think this one is ready.
