This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Implement 3D deconvolution (cuDNN)#5615

Merged
sxjscience merged 2 commits into apache:master from leezu:3d_deconvolution
Apr 3, 2017

@leezu
Contributor

@leezu leezu commented Mar 29, 2017

This pull request adds support for 3D deconvolution when MXNet is compiled with cuDNN and run on a GPU. It leaves the existing non-cuDNN 2D Deconvolution code working as before; that code now simply throws an error if it is called with a 3D kernel.

Since the non-GPU version does not support 3D deconvolution, this pull request does not add test cases that check the CPU and GPU outputs against each other. Instead, I manually compared the output to the PyTorch 3D deconvolution implementation.

I have seen that the cuDNN convolution code uses DMLC_DECLARE_FIELD(workspace).set_default(1024).set_range(0, 8192). Should the cuDNN deconvolution code be changed to use a default of 1024 as well? Currently the default is 512.

The following code can be used to compare with PyTorch:

import mxnet as mx
import numpy as np
import torch

TEST = "3D_TM"

if TEST == "3D_TM":
    DATA_SHAPE = (1, 1, 3, 3, 3)

    weights = np.random.normal(scale=0.3, size=(1, 1, 3, 3, 3))
    # Note: for a 5-D weight this sets the whole first kernel row (3 values) to 1.
    weights[0, 0, 0, 0] = 1
    data = np.random.normal(size=DATA_SHAPE)

    ### pytorch
    t_c = torch.nn.ConvTranspose3d(1, 1, 3, bias=False)  # bias is a bool flag

    t_weights = torch.from_numpy(weights)
    t_c.weight.data = t_weights

    t_data = torch.from_numpy(data)
    t_v_data = torch.autograd.Variable(t_data)

    t_v_out = t_c.forward(t_v_data)
    t_out = t_v_out.data.numpy()

    ### mxnet
    ctx = mx.gpu(0)
    rand = mx.sym.Variable('rand')
    d = mx.sym.Deconvolution(rand, name='d', kernel=(3, 3, 3), num_filter=1)

    modG = mx.mod.Module(
        symbol=d, data_names=('rand', ), label_names=None, context=ctx)
    modG.bind(data_shapes=[("rand", DATA_SHAPE)])

    m_weights = mx.ndarray.array(weights)
    modG.init_params(arg_params = {'d_weight' : m_weights})

    m_data = mx.ndarray.array(data)

    batch = mx.io.DataBatch([m_data], None)
    modG.forward(batch)

    m_out = modG.get_outputs()[0].asnumpy()


elif TEST == "2D_TMM":
    DATA_SHAPE = (1, 1, 3, 3)

    weights = np.random.normal(scale=0.3, size=(1, 1, 1, 1))
    weights[0][0][0][0] = 1
    data = np.random.normal(size=DATA_SHAPE)

    ### pytorch
    t_c = torch.nn.ConvTranspose2d(1, 1, 1, bias=False)  # bias is a bool flag

    t_weights = torch.from_numpy(weights)
    t_c.weight.data = t_weights

    t_data = torch.from_numpy(data)
    t_v_data = torch.autograd.Variable(t_data)

    t_v_out = t_c.forward(t_v_data)
    t_out = t_v_out.data.numpy()

    ### mxnet gpu
    ctx = mx.gpu(0)
    rand = mx.sym.Variable('rand')
    d = mx.sym.Deconvolution(rand, name='d', kernel=(1, 1), num_filter=1)

    modG = mx.mod.Module(
        symbol=d, data_names=('rand', ), label_names=None, context=ctx)
    modG.bind(data_shapes=[("rand", DATA_SHAPE)])

    m_weights = mx.ndarray.array(weights)
    modG.init_params(arg_params={'d_weight' : m_weights})

    m_data = mx.ndarray.array(data)

    batch = mx.io.DataBatch([m_data], None)
    modG.forward(batch)

    m_out = modG.get_outputs()[0].asnumpy()

    ### mxnet cpu
    ctx = mx.cpu()
    modG_cpu = mx.mod.Module(
        symbol=d, data_names=('rand', ), label_names=None, context=ctx)
    modG_cpu.bind(data_shapes=[("rand", DATA_SHAPE)])
    modG_cpu.init_params(mx.init.Normal(0.02))

    modG_cpu.set_params(arg_params={'d_weight' : m_weights}, aux_params={})

    modG_cpu.forward(batch)

    m_c_out = modG_cpu.get_outputs()[0].asnumpy()
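
The script above produces t_out and m_out (and m_c_out in the 2D case) but leaves the comparison itself to the reader. As a framework-independent cross-check, a stride-1, unpadded 3D transposed convolution can be sketched in plain NumPy. The function name deconv3d_reference is hypothetical and not part of this pull request; the sketch assumes the (in_channels, out_channels, kD, kH, kW) weight layout, no bias, no dilation and a single group.

```python
import numpy as np


def deconv3d_reference(data, weight):
    """Naive 3D transposed convolution (stride 1, no padding, no bias).

    data:   (N, C_in, D, H, W)
    weight: (C_in, C_out, kD, kH, kW)  -- assumed layout
    Returns an array of shape (N, C_out, D + kD - 1, H + kH - 1, W + kW - 1).
    """
    n, c_in, d, h, w = data.shape
    c_in_w, c_out, kd, kh, kw = weight.shape
    assert c_in == c_in_w, "input channels of data and weight must match"

    out = np.zeros((n, c_out, d + kd - 1, h + kh - 1, w + kw - 1),
                   dtype=np.result_type(data, weight))
    # Each kernel tap (a, b, c) scatters a scaled copy of the input,
    # shifted by (a, b, c), into the output.
    for a in range(kd):
        for b in range(kh):
            for c in range(kw):
                tap = weight[:, :, a, b, c]  # (C_in, C_out)
                out[:, :, a:a + d, b:b + h, c:c + w] += np.einsum(
                    "nidhw,io->nodhw", data, tap)
    return out
```

With the variables from the 3D branch of the script, np.allclose(t_out, deconv3d_reference(data, weights)) and np.allclose(m_out, deconv3d_reference(data, weights), atol=1e-5) would be the corresponding checks. The looser tolerance for the MXNet output is an assumption that it may be computed in float32.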

@leezu leezu mentioned this pull request Mar 29, 2017
@leezu leezu force-pushed the 3d_deconvolution branch 9 times, most recently from 4addd13 to 6482f8e Compare March 31, 2017 10:21
@piiswrong
Contributor

please fix gpu test

@leezu leezu force-pushed the 3d_deconvolution branch from b52dc4f to 1ca2823 Compare April 2, 2017 05:12
@leezu
Contributor Author

leezu commented Apr 2, 2017

Tests are passing now.

The upsampling operator initialized target_shape to (0,0), which is not meaningful. This worked with the old Deconvolution code because (0,0) was the default target_shape. This pull request changes the default target_shape to (), which is more meaningful given that shapes of different dimensions can be supplied, and which matches how the convolution code handles its defaults.
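
To illustrate why an empty tuple is a natural "not set" value across dimensions, here is a minimal sketch of output-shape computation for a transposed convolution. The function deconv_output_shape is hypothetical and is not MXNet's actual shape-inference code. It uses the standard formula out = stride * (in - 1) + dilate * (kernel - 1) + 1 - 2 * pad + adj, and, as a simplifying assumption, returns target_shape unchanged when one is given, instead of deriving pad and adj from it.

```python
def deconv_output_shape(in_shape, kernel, stride=None, pad=None, adj=None,
                        dilate=None, target_shape=()):
    """Spatial output shape of a transposed convolution (illustrative only).

    in_shape and kernel are tuples of spatial sizes: 1D, 2D or 3D.
    An empty target_shape means "not set", regardless of dimensionality.
    """
    ndim = len(kernel)
    assert len(in_shape) == ndim, "in_shape and kernel must have equal rank"
    stride = stride or (1,) * ndim
    pad = pad or (0,) * ndim
    adj = adj or (0,) * ndim
    dilate = dilate or (1,) * ndim

    if len(target_shape) > 0:
        # Simplification: trust the requested shape as given.
        assert len(target_shape) == ndim, "target_shape rank mismatch"
        return tuple(target_shape)

    return tuple(
        s * (i - 1) + d * (k - 1) + 1 - 2 * p + a
        for i, k, s, p, a, d in zip(in_shape, kernel, stride, pad, adj, dilate)
    )
```

With a (0,0) default, a 3D kernel would meet a two-element target shape that is neither "unset" nor valid; with (), the same check, len(target_shape) > 0, works for every rank.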

leezu added 2 commits April 3, 2017 19:54
If no target shape is supposed to be set, then indeed no target shape shall be
set (and not (0,0) as before).
@leezu leezu force-pushed the 3d_deconvolution branch from 1ca2823 to 910a64b Compare April 3, 2017 11:55
@sxjscience
Member

I have decided to merge this in.

@leezu
Contributor Author

leezu commented Apr 3, 2017

I rebased on the current master, which restarted the tests. They have finished running now, so the status is green again.

wmat_ptr = wmat.dptr_;
gwmat_ptr = gwmat.dptr_;
data_ptr = data.dptr_;
gdata_ptr = gdata.dptr_;
Member


Why not use grad.dptr_ directly?

Contributor Author


It would be out of scope.

@sxjscience sxjscience merged commit d2f40d8 into apache:master Apr 3, 2017
@piiswrong
Contributor

I just noticed 2d deconv isn't tested. Could you add tests?

@leezu
Contributor Author

leezu commented Apr 4, 2017

Guneet-Dhillon pushed a commit to Guneet-Dhillon/mxnet that referenced this pull request Sep 13, 2017
* Implement 3D deconvolution for cudnn

* Adapt upsampling operator to new deconvolution API

If no target shape is supposed to be set, then indeed no target shape shall be
set (and not (0,0) as before).
@leezu leezu deleted the 3d_deconvolution branch September 28, 2020 18:32