This repository was archived by the owner on Nov 17, 2023. It is now read-only.
40 commits
fa25f6b
LBR-GRU integration
xziya Aug 3, 2019
fd1e214
Unit tests for RNN fullfilled
xziya Aug 3, 2019
87e58d9
Collapse for-loop, readable offset, size_t vars
xziya Aug 5, 2019
d1ced43
Fix OpenMP incompatible unsigned int with MSVC on windows
xziya Aug 5, 2019
44f3b79
Explicitly convert size_t to int
xziya Aug 5, 2019
095a294
Trigger CI
xziya Aug 5, 2019
89b459c
Merge branch 'master' and tigger CI
xziya Aug 5, 2019
5e81ace
Re-trigger CI
xziya Aug 6, 2019
2c2a29b
Using Resource to manage temp space, RNNOp public mem vars shift
xziya Aug 7, 2019
a46163e
Shift cudnn mem vars to private
xziya Aug 8, 2019
6fd9e04
Trigger CI
xziya Aug 8, 2019
22cba57
Shift cpu mem vars to private, trigger CI
xziya Aug 8, 2019
b2aaf96
Add notes for OpenMP collapse() directive
xziya Aug 9, 2019
8d126bf
Double check CI
xziya Aug 10, 2019
b04f70e
Merge branch 'master' of https://github.com/apache/incubator-mxnet in…
xziya Aug 10, 2019
1e0e553
Trigger CI
xziya Aug 11, 2019
43d9dd9
Resolve confict in rnn-inl.h & rnn.cc
xziya Aug 14, 2019
103fd47
Fix macro problem
xziya Aug 14, 2019
6e907eb
Fix cudnn macro mistake in rnn.cc
xziya Aug 14, 2019
271240e
Trigger CI
xziya Aug 15, 2019
2a7defe
Doc revision for MKL-DNN RNN operator
xziya Aug 15, 2019
b3cebfd
Merge 'master' & trigger CI
xziya Aug 20, 2019
ca0de1c
Trigger CI
xziya Aug 20, 2019
995ff1f
Weights memory bug fix
xziya Aug 20, 2019
92e5203
Bump cudnn version to 7.6.0.64
DickJC123 Aug 11, 2019
0f23e7d
Trigger CI
xziya Aug 20, 2019
bb0331a
Trigger CI
xziya Aug 21, 2019
169be0e
Use NDArray to manage temp memory
xziya Aug 21, 2019
3ac8cbc
Correct way to use NDArray
xziya Aug 21, 2019
ca5a96e
Merge
xziya Aug 21, 2019
e9f4423
Trigger CI
xziya Aug 22, 2019
4a3a2b3
Trigger CI
xziya Aug 22, 2019
3613866
Merge branch 'master' into MKLDNN-LBR-GRU
xziya Aug 22, 2019
dbb7dd9
trigger
xziya Aug 22, 2019
47a65c9
Merge clang fix
xziya Aug 22, 2019
7d6e938
Trigger CI with a large absolute tolerance 1e-4 -> 2e-4
xziya Aug 23, 2019
6fe260d
Merge branch 'master' into MKLDNN-LBR-GRU
xziya Aug 25, 2019
96e4a33
NDArray
xziya Aug 25, 2019
22d86f7
Indent, remove TempResource for CPU context
xziya Aug 26, 2019
6436643
Trigger CI
xziya Aug 27, 2019
2 changes: 1 addition & 1 deletion ci/docker/Dockerfile.build.ubuntu_gpu_cu101
@@ -67,7 +67,7 @@ RUN /work/ubuntu_docs.sh
COPY install/ubuntu_tutorials.sh /work/
RUN /work/ubuntu_tutorials.sh

-ENV CUDNN_VERSION=7.5.1.10
+ENV CUDNN_VERSION=7.6.0.64
COPY install/ubuntu_cudnn.sh /work/
RUN /work/ubuntu_cudnn.sh

60 changes: 32 additions & 28 deletions docs/tutorials/mkldnn/operator_list.md
@@ -21,34 +21,38 @@ MXNet MKL-DNN backend provides optimized implementations for various operators c

To help users better understand the MKL-DNN backend, the following table summarizes the supported operators, data types, and functionalities. A subset of operators supports faster training and inference through a lower-precision version. Refer to the table's `INT8 Inference` column to see which operators are supported.

| Operator | Function | FP32 Training (backward) | FP32 Inference | INT8 Inference |
| --- | --- | --- | --- | --- |
| **Convolution** | 1D Convolution | Y | Y | N |
| | 2D Convolution | Y | Y | Y |
| | 3D Convolution | Y | Y | N |
| **Deconvolution** | 2D Deconvolution | Y | Y | N |
| | 3D Deconvolution | Y | Y | N |
| **FullyConnected** | 1D-4D input, flatten=True | N | Y | Y |
| | 1D-4D input, flatten=False | N | Y | Y |
| **Pooling** | 2D max Pooling | Y | Y | Y |
| | 2D avg pooling | Y | Y | Y |
| **BatchNorm** | 2D BatchNorm | Y | Y | N |
| **LRN** | 2D LRN | Y | Y | N |
| **Activation** | ReLU | Y | Y | Y |
| | Tanh | Y | Y | N |
| | SoftReLU | Y | Y | N |
| | Sigmoid | Y | Y | N |
| **softmax** | 1D-4D input | Y | Y | N |
| **Softmax_output** | 1D-4D input | N | Y | N |
| **Transpose** | 1D-4D input | N | Y | N |
| **elemwise_add** | 1D-4D input | Y | Y | Y |
| **Concat** | 1D-4D input | Y | Y | Y |
| **slice** | 1D-4D input | N | Y | N |
| **Reshape** | 1D-4D input | N | Y | N |
| **Flatten** | 1D-4D input | N | Y | N |
| **Quantization** | 1D-4D input | N | N | Y |
| **Dequantization** | 1D-4D input | N | N | Y |
| **Requantization** | 1D-4D input | N | N | Y |
| **RNN** | Vanilla RNN, activation=Tanh | N | Y | N |
| | Vanilla RNN, activation=ReLU | N | Y | N |
| | LSTM, activation=Tanh | N | Y | N |
| | LBR-GRU | N | Y | N |

Besides direct operator optimizations, we also provide the graph fusion passes listed in the table below. Users can enable or disable these fusion patterns through environment variables.
