Please ignore my previous requests. I have been working in different places on different computers these past few days, so I had to redo the same thing in each new place. I was also not very familiar with this git flow; sorry for the extra operations. As for this macro, you know the idea; if there are still issues, please fix them on my behalf. Thank you. |
|
@blackball Although a loop macro could be useful in some cases, in the places where you introduce it, it will create unnecessary loops where there are none. Please review what you intend to do with it. |
|
No loops. |
|
@sguada With the current launch configurations it doesn't make any difference (the loop body will be executed once). But it actually makes the launch configuration much more flexible for future changes, since it allows a grid that is smaller than the data. https://devblogs.nvidia.com/parallelforall/cuda-pro-tip-write-flexible-kernels-grid-stride-loops/ |
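To illustrate the point above, here is a minimal host-side sketch of a grid-stride loop. It is not the macro from this PR: the name `GRID_STRIDE_LOOP`, the `Dim3` struct, and the `launch_scale` helper are all hypothetical, and the CUDA built-ins (`blockIdx`, `blockDim`, `threadIdx`, `gridDim`) are emulated with plain globals so the loop header can be compiled and checked without a GPU. In real device code the loop header would be the same, but the built-ins would come from CUDA itself.

```cpp
#include <cassert>
#include <vector>

// Host-side stand-ins for CUDA's built-in index variables (assumption:
// in real device code these are provided by CUDA, not declared by hand).
struct Dim3 { int x; };
static Dim3 gridDim{1}, blockDim{1}, blockIdx{0}, threadIdx{0};

// Hypothetical macro name. Each thread starts at its global index and
// advances by the total number of threads in the grid.
#define GRID_STRIDE_LOOP(i, n)                                  \
  for (int i = blockIdx.x * blockDim.x + threadIdx.x; i < (n);  \
       i += blockDim.x * gridDim.x)

// "Kernel" body: each simulated thread scales the elements it owns
// and records that it touched them.
static void scale_kernel(int n, float alpha, float* data, int* visits) {
  GRID_STRIDE_LOOP(index, n) {  // two arguments: index, n (not "index < n")
    data[index] *= alpha;
    ++visits[index];
  }
}

// Emulates a <<<grid, block>>> launch by running every (block, thread)
// pair one after another. Returns how often each element was visited.
std::vector<int> launch_scale(int grid, int block, float alpha,
                              std::vector<float>& data) {
  assert(grid > 0 && block > 0);
  const int n = static_cast<int>(data.size());
  std::vector<int> visits(n, 0);
  gridDim.x = grid;
  blockDim.x = block;
  for (int b = 0; b < grid; ++b) {
    for (int t = 0; t < block; ++t) {
      blockIdx.x = b;
      threadIdx.x = t;
      scale_kernel(n, alpha, data.data(), visits.data());
    }
  }
  return visits;
}
```

Calling `launch_scale(2, 2, 3.0f, data)` on a 10-element vector uses only 4 simulated threads, yet every element is visited exactly once because each thread strides through the data. With a launch that has at least as many threads as elements, for example `launch_scale(1, 8, 0.5f, data)` on 4 elements, the loop body runs at most once per thread, which matches the "no difference with the current launch configurations" observation.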
does not compile -- should be index, n, not index < n
|
@blackball, this isn't compiling due to multiple problems - please fix these and push to the same branch rather than creating a new PR. You might have to do |
|
Never mind @blackball, I fixed these myself, rebased and merged in #239. Thanks. |
|
@blackball thanks for the explanation, now I see the benefit of the grid-stride loop |
Merge v0.15.12 into caffe-0.16