Refactor Solver to allow interactive stepping #1228
|
@longjon I am currently playing on top of this branch. Right now I simply plan to expose these to Python, so as to have a Python re-implementation of Solver::Solve, and then add callbacks there for the Python plotting magic (see #481). |
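A minimal sketch of what such a Python-side solve loop with callbacks could look like. `ToySolver`, its `iter` and `loss` attributes, and the `stride` parameter are assumptions made up for illustration; the only thing taken from this PR is the idea that the solver exposes a step method that runs a given number of iterations.

```python
class ToySolver:
    """Hypothetical stand-in for a solver exposing step(n); not the real pycaffe class."""

    def __init__(self):
        self.iter = 0
        self.loss = 1.0

    def step(self, n):
        for _ in range(n):
            self.loss *= 0.9  # pretend each update shrinks the loss by 10%
            self.iter += 1


def solve(solver, max_iter, callbacks=(), stride=1):
    """Drive the solver from Python, calling every callback after each chunk of iterations."""
    while solver.iter < max_iter:
        # Never step past max_iter, even if stride does not divide it evenly.
        solver.step(min(stride, max_iter - solver.iter))
        for callback in callbacks:
            callback(solver)


history = []
solver = ToySolver()
# A plotting hook would go where this callback records (iteration, loss) pairs.
solve(solver, max_iter=10,
      callbacks=[lambda s: history.append((s.iter, s.loss))],
      stride=5)
```

Here the callback only records values; a plotting callback would have the same signature and read whatever it needs from the solver object.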
|
@rodrigob Cool, let me know (or PR this branch) if you find any errors. In my use, I don't worry about things happening at the end of But yes, I essentially use a Python implementation of |
|
LGTM |
|
I'm using this PR for implementing online reinforcement learning. I think it's very useful but needs a small fix to work safely:
This is how I fixed it: muupan@d098036 |
|
@muupan you're right, of course. I've updated this with a rebase of my most recent working version. |
5f43178 to 033bafe
Refactor Solver to allow interactive stepping
Refactor Solver to allow interactive stepping Conflicts: src/caffe/solver.cpp
|
Updated: the snapshot logic has been changed to properly snapshot after every k-th iteration, instead of snapshotting before the iteration following every k-th. (In other words, snapshotting has been moved to the end of the solver loop.) This simplifies the logic (except the "snapshot after train" logic), and ensures that |
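A rough sketch of the end-of-loop snapshot placement described above, assuming a simple iteration counter. The function `run` and its parameters are hypothetical names for illustration, not the actual solver code; the point is only that the check happens after the counter is incremented.

```python
def run(num_iters, snapshot_every, start_iter=0):
    """Run num_iters iterations; return (final_iter, iterations at which a snapshot was taken)."""
    it = start_iter
    snapshots = []
    for _ in range(num_iters):
        # ... forward/backward pass and parameter update would happen here ...
        it += 1
        # Snapshot check sits at the end of the loop body, after the update.
        if snapshot_every and it % snapshot_every == 0:
            snapshots.append(it)
    return it, snapshots


# Stepping in two pieces: 7 iterations, then 5 more starting where we left off.
it, first = run(7, snapshot_every=5)
it, second = run(5, snapshot_every=5, start_iter=it)
```

With the check at the end of the loop, the snapshot at iteration 5 is taken during the first call, and the one at iteration 10 during the second.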
src/caffe/solver.cpp
IMO since this method is common setup specific to the Solver class itself, it should just be given a new name like Initialize() (and then called separately before PreSolve() @ line 236), so that anyone subclassing Solver doesn't need to remember to call it.
I agree that forcing subclassers to call PreSolve is awkward. Is there any reason we don't just call PreSolve from Init? That would simplify the logic and get rid of initialized_, as there would be no such thing as an uninitialized solver, and then it makes sense to do the base class initialization right in Init. I think I'll try it and update the PR.
Actually, we can't do that, because a virtual call made from a base-class constructor doesn't dispatch to the subclass override. Here's a slightly more radical idea, which I've implemented now: get rid of Solver::PreSolve and just use the constructors, since constructors now provide the same functionality. SGDSolver::PreSolve, which was the only actual PreSolve, remains as it was and gets called from the SGDSolver constructor.
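The resulting structure can be sketched roughly as follows. This is a toy illustration in Python, which does not share the C++ constructor restriction, so it shows only the shape of the design: base setup lives in the base constructor, and the subclass constructor calls its own setup method. The `param` dictionary, `setup_log`, and `history` are hypothetical names, not the real Caffe members.

```python
class Solver:
    def __init__(self, param):
        # Base-class setup happens here, so no solver is ever left uninitialized.
        self.param = param
        self.iter = 0
        self.setup_log = ["Solver.__init__"]


class SGDSolver(Solver):
    def __init__(self, param):
        super().__init__(param)
        # Subclass-specific setup is invoked from the subclass's own constructor.
        self.pre_solve()

    def pre_solve(self):
        # Allocate one history slot per parameter (illustrative only).
        self.history = [0.0] * self.param["num_params"]
        self.setup_log.append("SGDSolver.pre_solve")


sgd = SGDSolver({"num_params": 3})
```

Someone subclassing `Solver` in this arrangement has nothing extra to remember: constructing the object runs both the base setup and the subclass setup in order.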
147555a to 9478a48
can be removed since you changed the call to ForwardPrefilled?
The changed call is in Solve, this is in Step, where the call is to ForwardBackward. An alternative is to remove bottom_vec but call ForwardPrefilled and Backward separately; that's what I've now implemented.
Actually, I think I'll revert that, because it makes #1663 awkward. We should probably just clean up the Net interface to avoid these dummy vectors at some later point.
|
@jeffdonahue I think I'm satisfied with this now; merge when ready. |
Refactor Solver to allow interactive stepping
|
Cool, thanks Jon |
In this PR, the main loop of Solver::Solve is factored out into Solver::Step, which is exposed to Python. This allows interactive stepping of the solver.
Warning: I've only actually tested a parallel version of this on another branch.
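To illustrate the refactoring, here is a minimal sketch under stated assumptions: `ToySolver`, `max_iter`, and `log` are hypothetical names, and the real solver does actual training work in its loop. The sketch only shows solve delegating to step, so that interleaved step calls followed by solve cover the same iterations as a single solve.

```python
class ToySolver:
    """Toy model of a solver whose main loop lives in step(); not the Caffe class."""

    def __init__(self, max_iter):
        self.max_iter = max_iter
        self.iter = 0
        self.log = []

    def step(self, n):
        """Run n iterations of the main loop, then return control to the caller."""
        for _ in range(n):
            self.log.append(self.iter)  # stands in for one training iteration
            self.iter += 1

    def solve(self):
        """Run the remaining iterations by delegating to step()."""
        self.step(self.max_iter - self.iter)


# One uninterrupted run ...
a = ToySolver(6)
a.solve()

# ... versus stepping interactively and then finishing with solve().
b = ToySolver(6)
b.step(2)
b.step(1)
b.solve()
```

Between the `step` calls, the caller is free to inspect or modify state, which is what makes the stepping interactive.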