This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Flaky test_gluon_model_zoo_gpu.test_training @ Python3: MKLDNN-GPU #9820

@marcoabreu

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-9809/4/pipeline
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/389/pipeline
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/391/pipeline
http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/master/387/pipeline

======================================================================
FAIL: test_gluon_model_zoo_gpu.test_training
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/workspace/tests/python/gpu/../unittest/common.py", line 155, in test_new
    orig_test(*args, **kwargs)
  File "/workspace/tests/python/gpu/test_gluon_model_zoo_gpu.py", line 159, in test_training
    assert_almost_equal(cpu_out.asnumpy(), gpu_out.asnumpy(), rtol=1e-2, atol=1e-2)
  File "/workspace/python/mxnet/test_utils.py", line 493, in assert_almost_equal
    raise AssertionError(msg)
AssertionError: 
Items are not equal:
Error 84.491165 exceeds tolerance rtol=0.010000, atol=0.010000.  Location of maximum error:(9, 877), a=0.816808, b=-0.181214
 a: array([[ 0.46908194,  0.25131682,  0.27432024, ..., -0.3243659 ,
        -0.8637756 ,  0.57461524],
       [ 0.18945426,  0.32339886,  0.18884647, ...,  0.00044107,...
 b: array([[ 0.34904164,  0.3221189 ,  0.05189171, ..., -0.14858764,
        -0.9381798 ,  0.39666444],
       [ 0.3438487 ,  0.4455883 ,  0.32494336, ...,  0.15454161,...
-------------------- >> begin captured logging << --------------------
urllib3.connectionpool: DEBUG: Starting new HTTP connection (1): data.mxnet.io
urllib3.connectionpool: DEBUG: http://data.mxnet.io:80 "GET /data/val-5k-256.rec HTTP/1.1" 200 150874780
root: INFO: downloaded http://data.mxnet.io/data/val-5k-256.rec into data/val-5k-256.rec successfully
common: INFO: Setting test np/mx/python random seeds, use MXNET_TEST_SEED=1760574333 to reproduce.
--------------------- >> end captured logging << ---------------------
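
For anyone reading the "Error 84.491165" figure: it appears to be a tolerance violation ratio rather than an absolute difference. The sketch below is an assumption inferred from the numbers in the message, not code quoted from `mxnet/test_utils.py`. The helper name `violation_ratio` is hypothetical. It divides the absolute difference by the allowed tolerance `atol + rtol * |b|`, using the `a` and `b` values reported at location (9, 877).

```python
def violation_ratio(a, b, rtol, atol):
    """Absolute difference divided by the tolerance allowed at b.

    Assumed formula (inferred from the failure message, not taken
    from the MXNet sources): |a - b| / (atol + rtol * |b|).
    A value above 1 means the pair is outside the tolerance.
    """
    return abs(a - b) / (atol + rtol * abs(b))


# Values copied from the failure message (printed there to six decimals).
a, b = 0.816808, -0.181214
ratio = violation_ratio(a, b, rtol=1e-2, atol=1e-2)
print(f"violation ratio: {ratio:.2f}")
```

With the rounded values from the log this gives roughly 84.49, which matches the reported error. On that reading, the CPU and GPU outputs differ by about 1.0 in absolute terms at that element, around 84 times the allowed tolerance, so this is not a borderline tolerance miss.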
