Conversation

@sgrechanik-h (Contributor)

This PR moves the helper function that checks NNVM operations against numpy reference implementations from test_top_level# to nnvm.testing, and adds the ability to compare symbolic gradients to numerically computed gradients. This PR also fixes an incorrect gradient implementation for the division operation (found using this approach).

Currently, only the tests that used the old helper function have been rewritten to use the new one. There are still many tests that did not use the helper function; they should also be migrated where possible, but this is left for future work.
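For readers unfamiliar with the technique: the idea is to approximate each partial derivative with finite differences and compare it to the symbolically computed gradient. A minimal plain-numpy sketch of the principle (hypothetical f/grad_f, not the PR's actual helper, which uses a forward difference):

import numpy as np

def numerical_grad(f, x, delta=1e-3):
    # Approximate df/dx elementwise with central differences.
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        x_plus, x_minus = x.copy(), x.copy()
        x_plus[idx] += delta
        x_minus[idx] -= delta
        grad[idx] = (f(x_plus) - f(x_minus)) / (2 * delta)
    return grad

f = lambda x: (x * x).sum()   # scalar-valued example function
grad_f = lambda x: 2 * x      # its analytic gradient

x = np.random.uniform(1.0, 2.0, size=(2, 3))
np.testing.assert_allclose(grad_f(x), numerical_grad(f, x), atol=1e-2, rtol=1e-2)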

@sgrechanik-h (Contributor Author)

@kazum @kevinthesun @srkreddy1238 could you please review this PR?

@srkreddy1238 (Contributor) left a comment

@sgrechanik-h
That's a great start toward standardizing the helpers across the various test cases 👍

A few comments to allow non-numpy forward/backward callers.

Otherwise this PR looks great; I'm waiting to get it merged so that all other test cases can use it instead of local helpers.

None (default) means that all input variables will be used. May be a
list of pairs `(var_name, shape)`.

np_forward : Callable[..., List[numpy.ndarray]], optional
Contributor

Suggest giving np_forward a different name: it need not always be numpy; it could be an implementation in any of the frontends we have.

Also receive and pass a flexible parameter 'attr' (a dictionary with default value None) to this function, which can carry implementation-specific data like session info, etc.
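A sketch of what the suggested parameter could look like (hypothetical code; the PR eventually adopted this idea as additional_params, per the author's summary further down):

def run_reference(forward, inputs, attr=None):
    # Thread an opaque dict of implementation-specific data
    # (session info, etc.) through to the reference implementation.
    attr = attr if attr is not None else {}
    return forward(**inputs, **attr)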

logging.debug(debug_stage)

numpy_res = np_forward(**np_inputs_without_head_grads)
np.testing.assert_allclose(nnvm_res[0], numpy_res, atol=atol, rtol=rtol)
Contributor

In the context of the above comment on np_forward:
Please check whether the result from np_forward is a list, and if so, compare all the arrays in the list.
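A sketch of such list handling, reusing the names from the quoted snippet (the PR's final implementation may differ):

import numpy as np

# Normalize a single-array reference result to a list, then compare each
# reference output against the corresponding nnvm result.
if not isinstance(numpy_res, (list, tuple)):
    numpy_res = [numpy_res]
for i, expected in enumerate(numpy_res):
    np.testing.assert_allclose(nnvm_res[i], expected, atol=atol, rtol=rtol)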


dump_graph : bool
Dump the graph even on success.
"""
Contributor

Not sure whether an option for a custom comparison function makes sense here.
Just raising it as it crossed my mind.

only_targets : Set[str]
Test only for those targets from `ctx_list()` that are also in this set.

numerical_grads : bool or "if_possible"
Contributor

How about folding the "if_possible" case into numerical_grads=True?

then check gradients numerically only if this graph can be created (i.e. if there are some
operations with unimplemented gradients, it will just issue a warning).

delta : float
Contributor

Add 'optional' to the descriptions of the optional arguments.

shape = shape.copy() if shape else {}

grad_input_vars_real = []
for x in grad_input_vars:
Contributor

This logic looks complicated. How about just passing a list of symbols as grad_input_vars? Shape information can be defined in the shape argument (maybe input_shapes, a dict from str to tuple?) so that we don't have shapes in multiple places.

Contributor Author

Yes, not having shapes in grad_input_vars would be cleaner, I'll fix it.

dtypes = graph.json_attr('dtype')

if shapes is None or dtypes is None:
graph = graph.apply('InferShape').apply('InferType')
Contributor

Here we assume set_shape_inputs and set_dtype_inputs have been called. It might be better to directly pass input_shapes and input_types into this function to make it more general.
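For reference, the usual NNVM pattern the comment alludes to would look roughly like this (a sketch; input_shapes and input_types are the proposed arguments, both dicts keyed by variable name):

import nnvm.compiler.graph_attr as graph_attr

# Attach caller-supplied shapes/dtypes before running inference, instead
# of assuming set_shape_inputs/set_dtype_inputs were called earlier.
graph_attr.set_shape_inputs(graph, input_shapes)
graph_attr.set_dtype_inputs(graph, input_types)
graph = graph.apply('InferShape').apply('InferType')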


forward_graph = nnvm.graph.create(symbol)

if dtype is not None:
Contributor

We need to set_dtype_inputs. Just check dtype is not None.

Contributor Author

Sorry, I think I didn't understand this one.

Contributor

If dtype is None, graph_to_function will raise an exception? The same thing will happen for shape.
How about keeping dtype optional but with a non-None default value, such as dtype="float32"? For the shape argument we should force the user to pass a value. Then the code logic can be simpler.

Contributor Author

Setting dtype='float32' as the default would slightly simplify a lot of tests. However, if some variables have their dtypes specified by attributes, then either those attributes should be preferred or an error should be raised, so I'm not sure.

Contributor

OK. Maybe we can put the shape/dtype check right in this function, before shape/dtype inference, and check that either the input variables' shape/dtype attributes are set or the shape/dtype arguments are passed.

Contributor Author

Since shapes and dtypes of some inputs may be inferred from shapes and dtypes of other inputs, the proper solution would be to run inference passes, and only then check if some inputs are left without types. Then these inputs can be assigned the default type.
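A sketch of that inference-then-fallback approach (infer_dtypes is a hypothetical helper standing in for the inference passes; the PR's later commits use float32 as the fallback):

# Infer what we can first, then assign the default dtype only to the
# inputs that are still untyped afterwards.
inferred = infer_dtypes(graph, dtype)   # hypothetical: var name -> dtype str or None
for name, inferred_dtype in inferred.items():
    if inferred_dtype is None:
        inferred[name] = 'float32'      # fallback default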

# nnvm_res contains the output and gradients (if they are needed)
nnvm_res = main_function(**np_inputs)

if np_forward is not None:
Contributor

If np_forward is None, the forward comparison will be skipped? I think we should enforce that np_forward is set.

Contributor Author

Sometimes we might want to check gradients only (or even only numerical gradients). But I think it is a good idea to require passing numerical_grads=True explicitly if no reference function is specified.

# Since the result may be non-scalar, we have to put another operation on the top,
# so we just multiply by the randomly generated head_grads.
# This way we can reuse the gradient values which have already been computed.
def function(**kwargs):
Contributor

Change to a more meaningful name.
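For context, the trick in the quoted comment works because multiplying a non-scalar output by fixed random head_grads and summing yields a scalar whose gradient is exactly the vector-Jacobian product the symbolic backward pass computes. A plain-numpy illustration (hypothetical names):

import numpy as np

out_shape = (2, 3)                                        # example output shape
head_grads = np.random.uniform(-1.0, 1.0, size=out_shape)

def forward(x):                # hypothetical non-scalar reference function
    return x * x

def weighted_scalar(x):
    # Reduce the non-scalar output to a scalar; its gradient wrt x is
    # 2 * x * head_grads, matching the vector-Jacobian product the
    # symbolic backward pass produces for these head_grads.
    return np.sum(forward(x) * head_grads)

x = np.random.uniform(1.0, 2.0, size=out_shape)
print(weighted_scalar(x))      # scalar suitable for numerical differentiation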

----------
function
A function that takes inputs as keyword arguments (like `function(**input_values)`) and
returns a scalar result. Should accept and return numpy arrays.
Contributor

Description is not quite clear here. Scalar or ndarray?

Relative tolerance.
"""

if function_value is None:
Contributor

Can we just compute this value instead of taking it as an argument?

inputs = [('x', dshape, x)]
helper(y, inputs, dtype, forward, backward)
inputs = [(x, dshape)]
check_function(y, inputs, forward, backward, dtype=dtype)
Contributor

Here we are setting grad_input_vars=inputs, but inputs are actually input_shapes. It would be clearer if we refactored the interface of check_function.

def check_numerical_grads(function, grad_input_vars, input_values, grad_values, function_value=None,
delta=1e-3, max_error=1e+3, max_discarded_frac=0.1,
atol=1e-2, rtol=1e-2):
"""A helper function that checks that numerical gradients of a function are equal to
@kevinthesun (Contributor) Jul 31, 2018

Since this is new and will affect all backward operators, can you give more explanation of it?

If developers add new backward ops but the check fails, how can they quickly debug the issue? Since this is not simply comparing against a np computation result, maybe we can have a tutorial for it. This would also be good for the stability of CI while we introduce more and more backward ops.

if grad.shape != input_values[x_name].shape:
raise AssertionError(
"Gradient wrt '{}' has unexpected shape {}, expected {} "
.format(x.attr('name'), grad.shape, input_values[x_name].shape))
Contributor

x is not defined here. x.attr('name') should be x_name.



def check_numerical_grads(function, grad_input_vars, input_values, grad_values, function_value=None,
delta=1e-3, max_error=1e+3, max_discarded_frac=0.1,
Contributor

The default value of delta here differs from check_function's. Should it be 1e-2?

Contributor Author

Oh, it should be 1e-3 everywhere, thank you!

return (function(**modified_values) - function_value)/a_delta

for x_name in grad_input_vars:
grad = grad_values[x_name]
Contributor

We can replace these two lines with `for x_name, grad in grad_values.items():`. Then grad_input_vars is no longer necessary.

raise ValueError("numerical_grads must be a bool or 'if_possible', not {}"
.format(numerical_grads))

input_vars = [x for x in symbol.list_input_variables()]
Contributor

Can we simply replace this with `input_vars = symbol.list_input_variables()`?

@sgrechanik-h (Contributor Author)

  • Renamed np_forward to forward, same for backward, and now it is possible to pass additional
    parameters to these functions via additional_params.

  • Added support for multiple inputs.

  • Now additional parameters to the numgrad checking function are passed as a single dictionary.

  • I'm not sure yet about merging the behaviour of numerical_grads=True and
    numerical_grads='if_possible', so I left it as it is for now.

  • Require passing numerical_grads=True explicitly if no reference function is specified.

  • Got rid of explicit passing of input variables in most of the tests. Also shapes cannot be
    passed with grad_input_vars now.

Also added some tests for check_function itself.
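To illustrate the reworked interface, a hypothetical usage sketch reflecting the renames above (the authoritative signature is in nnvm.testing; the backward convention shown here, receiving head_grads alongside the inputs, is an assumption):

import nnvm.symbol as sym
from nnvm.testing import check_function

x = sym.Variable("x")
y = sym.Variable("y")
z = x + y

# forward/backward are plain reference implementations (no np_ prefix);
# gradients are additionally verified against numerical differentiation.
check_function(z,
               forward=lambda x, y: x + y,
               backward=lambda x, y, head_grads: [head_grads, head_grads],
               shape={'x': (1, 2), 'y': (1, 2)},
               numerical_grads=True)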

The main thing that is left to do is something like a tutorial on what to do if numerical gradient
testing fails. I'll return to this task in a few days.

@sgrechanik-h (Contributor Author)

Simplified numerical grad checking, improved docs. I wasn't sure where to put the text on what to do when the numgrad check fails, so I put it under Python API/NNVM API/nnvm.testing.

@kazum @kevinthesun @srkreddy1238 could you please review this again?

@sgrechanik-h changed the title from "[NNVM][TEST] Test against numerical grad; Fix div grad" to "[NNVM][TEST] Test against numerical grad" on Aug 9, 2018

shape : Dict[nnvm.Symbol or str, Tuple[int]] or Tuple[int], optional
A dict mapping input variable names to shapes, or just a single shape.
By default shapes will be inferred automatically.
Contributor

How can we infer input shapes? I think this argument shouldn't be optional.

Contributor Author

They are inferred from variables' attributes:

x = sym.Variable("x", shape=(1, 2))

@kevinthesun (Contributor) Aug 14, 2018

This interface doesn't place any constraint on the input symbol. So I can write code like:
data = sym.Variable("data")
net = sym.conv2d(data, ...)
check_function(net, shape=None, ...)

It will raise an exception later, since no shape information is provided? Currently this kind of input is treated as valid.

Also, for kwargs we can directly pass default values instead of just None. Then we don't need to write a lot of "if xxx is None:" statements at the beginning.

Contributor Author

Oh, the error message is not nice in this case. I think making it nicer and clarifying the documentation should be enough.



if shape is not None:
nnvm.compiler.graph_attr.set_shape_inputs(backward_graph, shape)

backward_graph = backward_graph.apply('InferShape').apply('InferType')
Contributor

https://github.com/dmlc/tvm/blob/master/nnvm/tests/python/compiler/test_top_level1.py#L32-L37
We just need to add the shape of head_grads to the input shape dictionary when compiling.
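The linked pattern, roughly (a sketch; 'head_grads' is the extra input introduced by the gradient pass, and out_shape stands for the forward output's shape):

import nnvm.compiler.graph_attr as graph_attr

# Extend the user's input shape dict with the head_grads shape before
# inferring shapes/types for the backward graph.
shape = dict(shape)
shape['head_grads'] = out_shape
graph_attr.set_shape_inputs(backward_graph, shape)
backward_graph = backward_graph.apply('InferShape').apply('InferType')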

Testing new operations
----------------------

When adding new operations, it it a good idea to test them. Testing
Contributor

“it it” => “it is”

all_shapes = graph.json_attr('shape')
all_dtypes = graph.json_attr('dtype')

all_dtypes = [None if t == -1 else TCODE_TO_DTYPE[t] for t in all_dtypes]
Contributor

TCODE_TO_DTYPE[-1] is defined as None in graph_attr.py. The if-else expression is not necessary.

if None in dtype.values():
raise ValueError("Input variables with no type: {}".format(dtype))

if any([not s for s in shape.values()]):
Contributor

`not all(shape.values())` looks simpler.

if (exclude_targets is not None and (target in exclude_targets or
str(target) in exclude_targets)) or\
(only_targets is not None and not (target in only_targets or
str(target) in only_targets)):
Contributor

The expression is a bit complicated. I'd suggest splitting it:

if exclude_targets is not None:
    if target in exclude_targets or str(target) in exclude_targets:
        logging.info(...)
        continue
if only_targets is not None:
    if target not in only_targets and str(target) not in only_targets:
        logging.info(...)
        continue

@yzhliu (Member) commented Aug 20, 2018

@srkreddy1238 @kevinthesun @kazum could you help review again given the new changes, and explicitly approve if it looks good to you.
@sgrechanik-h Please rebase.

@yzhliu self-assigned this Aug 20, 2018
@yzhliu merged commit 0edf87e into apache:master Aug 23, 2018
@yzhliu (Member) commented Aug 23, 2018

Thanks everyone. This is now merged.

ZhennanQin pushed a commit to ZhennanQin/tvm that referenced this pull request Sep 6, 2018
* [NNVM][TEST] Numerical gradient testing

* [NNVM][TEST] Make some tests a little faster

* Fix the failing test_top_level3

* Target exclusion for the check_function

* Try to ignore singularities

* grad_input_vars now can't contain shapes

* Don't pass unnecessary grad_input_vars to check_function

* Multiple outputs; fixes; testing of check_function

* Use numerical_grads_params to pass parameters to numgrad checker

* Fail when no action is requested explicitly

* Pass additional params to functions

* Silence the linter issue

* Simplified numgrad checking

* Improved docs for check_function

* Fixed the error message when no dtype is provided

* Several fixes

* Tests with shape/dtype inference for inputs

* Don't check dense's grads on cuda

* Raise an error if output dtypes haven't been inferred

* Moved shape/dtype inference into a separate function; use float32 as fallback

* Remove redundant dtype=float32

* Fix multiple outputs

* Use check_function in the rest of the test_top_level1
FrozenGene pushed a commit to FrozenGene/tvm that referenced this pull request Dec 27, 2018