Conversation
0bd5360 to
3922737
Compare
|
thanks for handling this :) |
There was a problem hiding this comment.
These should be declared as constants (e.g. const Dtype kRMSDecay = 0) like the other args to make the meaning clear.
|
Thanks @erogol for the original work and thanks @ronghanghu for the rebase. This looks good except as noted above. |
|
@jeffdonahue OK, I'll handle them. Thanks for the comments! |
3e8ab30 to
fbd0533
Compare
|
Cool, LGTM. @ronghanghu feel free to merge whenever it's easiest for you, before or after the other two PRs. |
7f61b86 to
abe99e8
Compare
Implement RMSProp solver and cleaned up to adjust to new solver interface that uses accumulated gradients and refactored regularization.
There was a problem hiding this comment.
Could you set the RMS decay here, instead of introducing the decay argument to least squares and snapshotting tests? Since it is unique to this solver I think it is best handled here.
|
@ronghanghu Sorry I didn't catch this earlier, but I have a suggestion for the RMS decay parameter in the tests. Instead of introducing another argument and setting it for every test, this param could be set by the RMSProp test class for encapsulation. Could you send a follow-up PR to make this change? |
|
@shelhamer Yes, I can send another PR to do that. Adam solver is also going to introduce a Addressed in #2888. |
Rebased and adapted RMSprop implementation #1890 to the new solver interface #2518 and #1977. The original author is @erogol. Pulled against master instead of dev.
The RMSprop solver is based on G. Hinton's lecture (http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf). Param gradients are divided by average root mean square of gradients in recent batches. It can be seen as a mini-batch version of using only the sign of gradients.
Update rule:
Momentum is not supported for RMSprop solver, as in #1890.