@Lunderberg
Contributor

Previously, `conv2d_cudnn.cuda` used cudnn's benchmarking function to select a forward convolution algorithm when `cfg.is_fallback` was set, and `conv3d_cudnn.cuda` used cudnn's benchmarking at all times. After this commit, both expose the cudnn algorithm choice as an option. If `cfg.is_fallback`, the local device is benchmarked if one is present; otherwise, a default cudnn implementation is selected.

In the future, to better support RPC use-cases, the fallback config should be based on cudnn-specific parameters saved in the Target object.
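
For readers skimming the thread, the selection rule described above can be sketched in plain Python. This is only an illustration of the decision order, not the code in this PR: `select_conv_algo`, `DEFAULT_ALGO`, `tuned_algo`, `device_present`, and the `benchmark` callback are hypothetical names, and the value `0` is an assumed placeholder for "a default cudnn implementation".

```python
from typing import Callable, Optional

# Assumption: placeholder index standing in for "a default cudnn implementation".
DEFAULT_ALGO = 0


def select_conv_algo(
    is_fallback: bool,
    tuned_algo: Optional[int],
    device_present: bool,
    benchmark: Callable[[], int],
) -> int:
    """Pick a forward-convolution algorithm index (illustrative sketch only)."""
    if not is_fallback:
        # A tuned config exists, so the exposed algorithm option is used directly.
        if tuned_algo is None:
            raise ValueError("a non-fallback config must carry an algorithm choice")
        return tuned_algo
    if device_present:
        # Fallback config with a local GPU: benchmark and take the result.
        return benchmark()
    # Fallback config without a local GPU (e.g. building for a remote target).
    return DEFAULT_ALGO
```

The benchmark is passed in as a callback so that the no-device path never touches the GPU, which is the property that matters when building on a machine without one.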

@Lunderberg
Contributor Author

Related PR #8275 has the same goal of allowing CuDNN modules to be built on a local non-GPU machine for use on a remote GPU machine. The two implementations are independent and are submitted as separate PRs to make reviewing easier.

Potential reviewer: @mdw-octoml

… and conv3d_cudnn.cuda

@Lunderberg Lunderberg force-pushed the cudnn_conv_find_algo branch from 0cdeef6 to f7fa507 Compare June 17, 2021 18:13
Contributor

@jwfromm jwfromm left a comment

Really nice change, thanks Eric!

@masahi masahi merged commit bf3f000 into apache:main Jun 18, 2021
@Lunderberg Lunderberg deleted the cudnn_conv_find_algo branch June 18, 2021 12:28
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021
… and conv3d_cudnn.cuda (apache#8276)

Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022
… and conv3d_cudnn.cuda (apache#8276)

Co-authored-by: Eric Lunderberg <elunderberg@octoml.ai>
3 participants