addd CUDA_KERNEL_LOOP macro #225

Closed
blackball wants to merge 1 commit into BVLC:dev from blackball:add_kernel_loop_macro

Conversation

@blackball
Contributor

No description provided.

@blackball
Contributor Author

Please ignore my previous requests. I have been working from different places on different computers these past few days, so I had to redo the same work in each new place. I am also not very familiar with this git flow, so sorry for the noise. As for this macro, you know the idea; if there are still issues, please fix them on my behalf. Thank you.

@sguada
Contributor

sguada commented Mar 18, 2014

@blackball A loop macro could be useful in some cases, but in the places where you introduce it, it will create unnecessary loops where there are none.

Please review what you intend to do with it.

@blackball
Contributor Author

no loops.


@jamt9000
Contributor

@sguada With the current launch configurations it doesn't make any difference (each thread will execute the loop body at most once). But it actually makes the launch configuration much more flexible for future changes, since it allows a grid that is smaller than the data.

https://devblogs.nvidia.com/parallelforall/cuda-pro-tip-write-flexible-kernels-grid-stride-loops/
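
To illustrate the point about grids smaller than the data, here is a minimal sketch of a grid-stride loop macro. The macro body is an assumption written from the technique described in the linked post, not a copy of the code in this PR, and `scale_kernel` and `run_scale` are hypothetical names used only for the example.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Grid-stride loop: each thread starts at its global index and advances by
// the total number of threads in the grid until it passes n.
// (Sketch only; the exact definition in the PR is not shown in this thread.)
#define CUDA_KERNEL_LOOP(i, n)                          \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x;   \
       i < (n);                                         \
       i += blockDim.x * gridDim.x)

// Hypothetical kernel: y[index] = a * x[index] for every index in [0, n).
__global__ void scale_kernel(int n, float a, const float* x, float* y) {
  CUDA_KERNEL_LOOP(index, n) {
    y[index] = a * x[index];
  }
}

// Hypothetical host helper: runs scale_kernel with the given launch
// configuration and returns the result. Error checking is omitted for brevity.
std::vector<float> run_scale(const std::vector<float>& x, float a,
                             int blocks, int threads) {
  const int n = static_cast<int>(x.size());
  const size_t bytes = n * sizeof(float);
  float* dx = nullptr;
  float* dy = nullptr;
  cudaMalloc(&dx, bytes);
  cudaMalloc(&dy, bytes);
  cudaMemcpy(dx, x.data(), bytes, cudaMemcpyHostToDevice);

  scale_kernel<<<blocks, threads>>>(n, a, dx, dy);

  std::vector<float> y(n);
  // This blocking copy also waits for the kernel to finish.
  cudaMemcpy(y.data(), dy, bytes, cudaMemcpyDeviceToHost);
  cudaFree(dx);
  cudaFree(dy);
  return y;
}
```

With 1000 elements, `run_scale(x, 2.0f, 8, 128)` launches at least one thread per element, so each thread runs the body at most once. `run_scale(x, 2.0f, 2, 32)` launches only 64 threads, and each thread strides through the data to cover the rest; both calls should produce the same output.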

@jeffdonahue jeffdonahue self-assigned this Mar 18, 2014
Contributor

does not compile -- should be index, n, not index < n

@jeffdonahue
Contributor

@blackball, this isn't compiling due to multiple problems - please fix these and push to the same branch rather than creating a new PR. You might have to do git push -f if you rebase dev.

@jeffdonahue
Contributor

Never mind @blackball, I fixed these myself, rebased and merged in #239. Thanks.

@sguada
Contributor

sguada commented Mar 19, 2014

@blackball thanks for the explanation, now I see the benefit of the grid-stride loop

vfdev-5 pushed a commit to vfdev-5/caffe that referenced this pull request Oct 3, 2016