Sparse Linear now does sparse updates from the last input #725
soumith merged 1 commit into torch:master
Conversation
(force-pushed 63fcaa6 to 1156255)
This is super awesome! Thanks @ebetica!
| function SparseLinear:updateOutput(input) | ||
| if self.sparseUpdate == ONE_LAST_INPUT then | ||
| self.sparseUpdate = ACC_MULTIPLE_TIMES |
this is wrong. afaik this has to be self.sparseUpdate = NO_LAST_INPUT
The rewritten logic in this file never resets self.sparseUpdate back to NO_LAST_INPUT, as lines 195 and 208 have been removed.
Practically, this PR will enable sparse optimizations only for the first mini-batch and take the dense path afterwards, because the layer will always be in ACC_MULTIPLE_TIMES mode.
However, I suspect that I misunderstand this PR; can you explain the state transitions and their expected behavior?
Yes, that is the intended behavior. After enough non-zero elements have been passed through, it becomes more efficient to simply update every parameter instead of finding the unique non-zeros first. A smarter method would probably be to make this transition after accumulating too many non-zero elements, but to simplify it, my design decision was to only do sparse updates after one mini-batch.
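The state transitions being discussed can be sketched as a small state machine. This is a hypothetical Python model for illustration only, not the actual Lua code; the constant names mirror the ones in SparseLinear.lua:

```python
# Illustrative model of the sparseUpdate state machine described above.
NO_LAST_INPUT, ONE_LAST_INPUT, ACC_MULTIPLE_TIMES = 0, 1, 2

class SparseUpdateState:
    def __init__(self):
        self.state = NO_LAST_INPUT

    def update_output(self):
        # A forward pass while one input is already remembered means we can
        # no longer rely on a single last input: accumulate densely from now on.
        if self.state == ONE_LAST_INPUT:
            self.state = ACC_MULTIPLE_TIMES

    def acc_grad_parameters(self):
        # After the first backward pass exactly one last input is known,
        # so the sparse update path is valid.
        if self.state == NO_LAST_INPUT:
            self.state = ONE_LAST_INPUT

    def sparse_path_available(self):
        return self.state == ONE_LAST_INPUT

m = SparseUpdateState()
m.update_output()        # first forward: still NO_LAST_INPUT
m.acc_grad_parameters()  # backward: ONE_LAST_INPUT, sparse path is valid
assert m.sparse_path_available()
m.update_output()        # another forward without a reset: dense from now on
assert m.state == ACC_MULTIPLE_TIMES
```

This also makes the reported problem visible: without some transition back to NO_LAST_INPUT, the machine is stuck in ACC_MULTIPLE_TIMES after the first cycle.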
yes, if I understand correctly, the intention is to only do sparse updates for a pattern of: forward + backward + update, forward + backward + update, ...
However, after processing two mini-batches of FW + BW + UP, the third mini-batch no longer sees sparse updates, as you never reset the state to NO_LAST_INPUT (lines 195 and 208 were removed).
(force-pushed b2c927b to 617aa0a)
Patched with the bugfix. State is reset on zeroGradParameters.
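Resetting the state in zeroGradParameters lets every FW + BW + UP cycle take the sparse path again. A minimal sketch of that fix, again as a hypothetical Python model rather than the actual Lua implementation:

```python
# Illustrative model of the bugfix: reset sparseUpdate in zeroGradParameters.
NO_LAST_INPUT, ONE_LAST_INPUT, ACC_MULTIPLE_TIMES = 0, 1, 2

class Layer:
    def __init__(self):
        self.sparse_update = NO_LAST_INPUT
        self.took_sparse_path = []

    def forward(self):
        if self.sparse_update == ONE_LAST_INPUT:
            self.sparse_update = ACC_MULTIPLE_TIMES

    def backward(self):
        if self.sparse_update == NO_LAST_INPUT:
            self.sparse_update = ONE_LAST_INPUT

    def update(self):
        # Record whether this update could use the sparse path.
        self.took_sparse_path.append(self.sparse_update == ONE_LAST_INPUT)

    def zero_grad_parameters(self):
        # The bugfix: reset here so the next mini-batch starts fresh.
        self.sparse_update = NO_LAST_INPUT

layer = Layer()
for _ in range(3):
    layer.forward(); layer.backward(); layer.update()
    layer.zero_grad_parameters()
assert layer.took_sparse_path == [True, True, True]
```

Without the reset in zero_grad_parameters, the same loop records [True, False, False], which is exactly the behavior reported in the review: sparse updates only on the first mini-batch.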
Sparse Linear now does sparse updates from the last input
See discussion @ #698
Speeds up updateGradParameters and zeroGradParameters when only one forward/backward pass has been run, by keeping track of the immediately previous input.
The following snippet
gives us