Skip to content

Comments

Should not modify index in im2col kernel loop#427

Merged
shelhamer merged 2 commits intoBVLC:devfrom
jamt9000:fix-kernel-index
Jun 25, 2014
Merged

Should not modify index in im2col kernel loop#427
shelhamer merged 2 commits intoBVLC:devfrom
jamt9000:fix-kernel-index

Conversation

@jamt9000
Copy link
Contributor

I assume this is a mistake, although I don't think it actually breaks anything in practice with the launch configuration used.

@jamt9000
Copy link
Contributor Author

I think there'll be a problem with the data pointers too

@jamt9000
Copy link
Contributor Author

Ok, I have checked if there is any chance of anything dangerous happening (ie, the loop body running more than once) with the current launch configuration. Fortunately, even though the index is decreased in the loop, the value of blockDim.x * gridDim.x is always going to be greater than or equal to n (since blockDim.x * gridDim.x = 1024 * ceil(n/1024)), so even if the index is 0 the loop will only run once.

However, if gridDim.x is changed (eg. halved to compute two elements in one kernel, which is what the grid-stride loop #225 is meant to make possible) then the loop will run an incorrect number of times (and write garbage data) or may never terminate.

Here is a python script showing the problem

nthreads = 1024

blockDim_x = nthreads
n = 6000

# ceil of n/nthreads
gridDim_x = ((n + nthreads - 1) / nthreads)

width_col = 2

def kernel(blockIdx_x = 1, threadIdx_x = 10):
    index = blockIdx_x * blockDim_x + threadIdx_x;
    while(index < n):
        print 'index0', index
        index /= width_col
        print 'index1', index

        # do stuff...

        index += blockDim_x * gridDim_x
        print 'index2', index

# With current launch config will run once
kernel()

# With different launch config will loop wrongly
gridDim_x /= 2
kernel()

# Or infinitely
width_col = 6
kernel()

@jamt9000
Copy link
Contributor Author

@shelhamer et al may want to check

@shelhamer
Copy link
Member

@jamt9000 thanks for investigating this and making an illustrative example! We'll be sure to review this and make changes post-NIPS deadline (or sooner, time permitting...).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor comment: maybe suppressing the above 4 lines to 2:

Dtype* data_col_ptr = data_col + (channel_out * height_col + h_out) * width_col + w_out;
const Dtype* data_im_ptr = data_im + (channel_in * height + h_in) * width + w_in;

@Yangqing
Copy link
Member

Yangqing commented Jun 6, 2014

I took a look and it looks great. Adding a unit test could be even better if you have time to write one :)

@shelhamer
Copy link
Member

Thanks for the kernel fix. Please follow-up on Yangqing's comments then we'll merge.

@jamt9000
Copy link
Contributor Author

I'm not sure how to go about writing a test for it. I probably want to test
the kernel in isolation with different launch configurations, but for that
I'd need to add a .cu file to the tests directory which would require
changes to the Makefile and maybe isn't what you want.

On Sun, Jun 8, 2014 at 7:38 AM, Evan Shelhamer notifications@github.com
wrote:

Thanks for the kernel fix. Please follow-up on Yangqing's comments then
we'll merge.


Reply to this email directly or view it on GitHub
#427 (comment).

@shelhamer
Copy link
Member

test the kernel in isolation with different launch configurations, but for that
I'd need to add a .cu file to the tests directory which would require
changes to the Makefile

I agree that is right kind of test. I don't see a problem with adding this as a further GPU test with the required Makefile change. @jeffdonahue thoughts?

@shelhamer
Copy link
Member

Merging, as this has been illustrated to be correct. Tests to verify launch configurations can follow in a future PR.

Thanks @jamt9000!

shelhamer added a commit that referenced this pull request Jun 25, 2014
Should not modify index in im2col kernel loop
@shelhamer shelhamer merged commit 9b45809 into BVLC:dev Jun 25, 2014
@jamt9000 jamt9000 mentioned this pull request Jun 26, 2014
mitmul pushed a commit to mitmul/caffe that referenced this pull request Sep 30, 2014
Should not modify index in im2col kernel loop
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants