make gluon rnn layers hybrid blocks #11482
Conversation
```python
def hybrid_forward(self, F, inputs, states=None, **kwargs):
    if F is ndarray:
        # imperative mode: the concrete batch size can be read off the input
        batch_size = inputs.shape[self._layout.find('N')]
    if self._input_size == 0:
```
implement infer_shape instead?
this is not possible due to the inverse shape inference in concat.
or did you mean overriding the block's infer_shape? I'm taking a look.
```python
            states = self.begin_state(batch_size, ctx=inputs.context)
        else:
            # symbolic mode: defer the batch size via a zeros placeholder
            states = self.begin_state(0, func=symbol.zeros)
        if isinstance(states, (ndarray.NDArray, symbol.Symbol)):
```
`(ndarray.NDArray, symbol.Symbol)` -> `tensor_types`
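For context, a hedged sketch of what such a `tensor_types` alias presumably looks like (its exact definition and location in `mxnet.base` are assumptions here):

```python
from mxnet import ndarray, symbol

# Assumed alias matching the reviewer's suggestion: one tuple to test
# against instead of spelling out both tensor classes at each call site.
tensor_types = (symbol.Symbol, ndarray.NDArray)

# usage: isinstance(states, tensor_types)
```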
```python
        return outputs, new_states

    def _forward_kernel(self, inputs, states):
        ...

    def __call__(self, inputs, *states):
```
implement infer_shape instead?
@piiswrong turns out I cannot do that only in infer_shape. The reason is that sometimes the block is used as a child block of other blocks, in which case infer_shape is called from the parent, bypassing the code path in the RNN infer_shape.

I see. Then could you do it without overriding `__call__`?

I can override forward, but it would be pretty much equivalent. The reason I cannot do this in hybrid_forward is that, when given a partial shape, hybrid_forward would only be invoked with symbols as part of the infer_shape pass.
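To illustrate the constraint under discussion, here is a minimal, hypothetical sketch (not this PR's actual code) of why the concrete shape has to be captured in `__call__`: once hybridized, `hybrid_forward` may only ever see symbols, while `__call__` still receives NDArrays on the first call.

```python
import mxnet as mx
from mxnet.gluon import HybridBlock

class ShapeCapturingBlock(HybridBlock):
    """Hypothetical block recording the input feature size in __call__,
    before the symbolic graph is traced."""
    def __init__(self, **kwargs):
        super(ShapeCapturingBlock, self).__init__(**kwargs)
        self._input_size = 0

    def __call__(self, inputs, *states):
        # NDArray inputs carry concrete shapes; symbols may not.
        if isinstance(inputs, mx.nd.NDArray) and self._input_size == 0:
            self._input_size = inputs.shape[-1]
        return super(ShapeCapturingBlock, self).__call__(inputs, *states)

    def hybrid_forward(self, F, inputs):
        return inputs * 2  # placeholder computation
```

Calling `ShapeCapturingBlock()(mx.nd.ones((3, 2, 7)))` would record `_input_size == 7` even after `hybridize()`, which is exactly what a symbol-only `hybrid_forward` cannot do.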
```cpp
// before:
return (*out_attrs)[0].ndim();
// after:
if ((*out_attrs)[0].ndim()) {
  return ReverseReshapeInferShape(&(*in_attrs)[0], oshape);
}
```
Should this be

```cpp
if ((*out_attrs)[0].ndim()) {
  return ReverseReshapeInferShape(&(*in_attrs)[0], oshape);
}
return false;
```

instead?
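As a rough Python analogue of the suggested control flow (names are illustrative, not the actual C++ helpers): attempt forward inference first, fall back to reverse inference only when the output shape is fully known, and otherwise report failure so the pass can be retried later.

```python
def infer_reshape(in_shape, out_shape):
    """Illustrative sketch: returns True once shapes are fully known."""
    if all(d > 0 for d in in_shape):
        return True  # forward inference already succeeded
    if out_shape and all(d > 0 for d in out_shape):
        # reverse inference: the known output element count determines
        # the single unknown input dimension
        total, known = 1, 1
        for d in out_shape:
            total *= d
        for d in in_shape:
            known *= d if d > 0 else 1
        in_shape[in_shape.index(0)] = total // known
        return True
    return False  # neither side known yet; defer
```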
```python
from common import setup_module, with_seed, teardown, assert_raises_cudnn_disabled
from test_gluon import *
from test_loss import *
from test_gluon_rnn import *
```
test_gluon_rnn has assert_raises_cudnn_disabled for test_rnn, but test_rnn doesn't raise an error. I think it should be removed from test_gluon_rnn.py.
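For readers unfamiliar with the helper, here is a hedged sketch of what `assert_raises_cudnn_disabled` presumably does (the environment flag and exact error handling are assumptions, not the verbatim implementation in common.py), which is why it has no effect on a test that never exercises cuDNN:

```python
import functools
import os

from mxnet.base import MXNetError


def assert_raises_cudnn_disabled():
    """Assumed behavior: when the environment disables cuDNN, the wrapped
    test is expected to fail with MXNetError; otherwise it runs normally."""
    def test_helper(orig_test):
        @functools.wraps(orig_test)
        def test_new(*args, **kwargs):
            if os.getenv('CUDNN_OFF_TEST_ONLY') == 'true':  # assumed flag
                try:
                    orig_test(*args, **kwargs)
                except MXNetError:
                    return  # the expected failure occurred
                raise AssertionError('expected MXNetError with cuDNN off')
            orig_test(*args, **kwargs)
        return test_new
    return test_helper
```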
@szha today is the mentioned code freeze date for the MXNet 1.3 release. Could you please check the status of this PR? Thanks!
@Roshrini We've found the root cause of the bug; the temporary workaround is the one being built and tested now.
@Roshrini this PR uncovers a weird bug which should probably be addressed before releasing.
```python
@assert_raises_cudnn_disabled()
def test_layer_fill_shape():
    layer = gluon.rnn.LSTM(10)
    layer.hybridize()
```
Removing this hybridize call should be enough. Is there any reason we are removing all hybridize calls?
The same error on test_norm still occurs if I only remove this one call, so I was trying out all possible cases.
@szha thanks for the update.
```cpp
.set_attr<FCompute>("FCompute<cpu>", ConcatGradCompute<cpu>);

NNVM_REGISTER_OP(_rnn_param_concat)
```
Please write comments for this operator. How is it different from the normal concat?
This is a custom concat op with a specialized infer_shape, which handles the case where the first one or two inputs may have unknown shapes that can be inferred from the output shape.
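A small Python sketch of that specialized inference (illustrative pseudo-logic, not the C++ implementation): when exactly one input's length along the concat axis is unknown, it can be recovered from the output length minus the known inputs.

```python
def rnn_param_concat_infer(in_lens, out_len):
    """in_lens: per-input lengths along the concat axis, 0 = unknown.
    Returns the completed list, or None if inference is impossible."""
    unknown = [i for i, n in enumerate(in_lens) if n == 0]
    if not unknown:
        return list(in_lens)
    if len(unknown) == 1 and out_len > 0:
        filled = list(in_lens)
        filled[unknown[0]] = out_len - sum(n for n in in_lens if n > 0)
        return filled
    return None  # more than one unknown, or output unknown

# e.g. a parameter of unknown size: [0, 40, 40] with out_len=120 -> [40, 40, 40]
```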
```python
from common import assert_raises_cudnn_disabled

@assert_raises_cudnn_disabled()
```
It's not calling the cuDNN kernel.
```python
check_rnn_layer_forward(layer, mx.nd.ones((3, 2, 7)))
print(layer)
# before:
assert layer.i2h_weight[0].shape[1] == 7, layer.i2h_weight[0].shape[1]
# after:
assert layer.l0_i2h_weight.shape[1] == 7, layer.l0_i2h_weight.shape[1]
```
This has not been exposed as a documented attribute before and shouldn't be considered part of the API.
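For completeness, a hedged usage sketch of the documented way to inspect these parameters, via `collect_params()` rather than generated attributes (exact parameter names depend on the block's prefix):

```python
import mxnet as mx
from mxnet import gluon

layer = gluon.rnn.LSTM(10)
layer.initialize()
layer(mx.nd.ones((3, 2, 7)))  # first call fixes the input size at 7

# Names like 'lstm0_l0_i2h_weight' are prefix-dependent; iterate instead
# of relying on attribute access.
for name, param in layer.collect_params().items():
    print(name, param.shape)
```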
* make Gluon RNN layer hybrid block
* separate gluon gpu tests
* remove excess assert_raises_cudnn_disabled usage
* add comments and refactor
* add bidirectional test
* temporarily remove hybridize in test_gluon_rnn.test_layer_fill_shape
Description
make gluon rnn layers hybrid blocks
resolves #10873
Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes