This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Address already in use during tutorial test #11120

@marcoabreu

Description

http://jenkins.mxnet-ci.amazon-ml.com/blue/organizations/jenkins/incubator-mxnet/detail/PR-10656/7/pipeline/

+ nosetests-3.4 test_tutorials.py --nologcapture

....[19:25:36] src/io/iter_image_recordio_2.cc:170: ImageRecordIOParser2: ./data/caltech.rec, use 4 threads for decoding..

.......[19:27:51] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)

.....[19:28:58] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)

..[19:32:06] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)

[19:37:22] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)

....[19:39:25] src/operator/nn/./cudnn/./cudnn_algoreg-inl.h:107: Running performance tests to find the best convolution algorithm, this can take a while... (setting env variable MXNET_CUDNN_AUTOTUNE_DEFAULT to 0 to disable)

.[19:39:51] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v0.8.0. Attempting to upgrade...

[19:39:51] src/nnvm/legacy_json_util.cc:217: Symbol successfully upgraded!

.....[19:40:24] src/operator/././../common/utils.h:416: Optimizer with lazy_update = True detected. Be aware that lazy update with row_sparse gradient is different from standard update, and may lead to different empirical results. See https://mxnet.incubator.apache.org/api/python/optimization/optimization.html for more details.

..[19:40:43] src/operator/././../common/utils.h:416: Optimizer with lazy_update = True detected. Be aware that lazy update with row_sparse gradient is different from standard update, and may lead to different empirical results. See https://mxnet.incubator.apache.org/api/python/optimization/optimization.html for more details.

..Traceback (most recent call last):

  File "/usr/lib/python3.5/runpy.py", line 184, in _run_module_as_main

    "__main__", mod_spec)

  File "/usr/lib/python3.5/runpy.py", line 85, in _run_code

    exec(code, run_globals)

  File "/usr/local/lib/python3.5/dist-packages/ipykernel_launcher.py", line 16, in <module>

    app.launch_new_instance()

  File "/usr/local/lib/python3.5/dist-packages/traitlets/config/application.py", line 657, in launch_instance

    app.initialize(argv)

  File "<decorator-gen-123>", line 2, in initialize

  File "/usr/local/lib/python3.5/dist-packages/traitlets/config/application.py", line 87, in catch_config_error

    return method(app, *args, **kwargs)

  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelapp.py", line 456, in initialize

    self.init_sockets()

  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelapp.py", line 248, in init_sockets

    self.control_port = self._bind_socket(self.control_socket, self.control_port)

  File "/usr/local/lib/python3.5/dist-packages/ipykernel/kernelapp.py", line 180, in _bind_socket

    s.bind("tcp://%s:%i" % (self.ip, port))

  File "zmq/backend/cython/socket.pyx", line 547, in zmq.backend.cython.socket.Socket.bind

  File "zmq/backend/cython/checkrc.pxd", line 25, in zmq.backend.cython.checkrc._check_rc

zmq.error.ZMQError: Address already in use

F..

======================================================================

FAIL: test_tutorials.test_unsupervised_learning_gan

----------------------------------------------------------------------

Traceback (most recent call last):

  File "/usr/local/lib/python3.5/dist-packages/nose/case.py", line 198, in runTest

    self.test(*self.arg)

  File "/work/mxnet/tests/tutorials/test_tutorials.py", line 203, in test_unsupervised_learning_gan

    assert _test_tutorial_nb('unsupervised_learning/gan')

AssertionError: 

-------------------- >> begin captured stdout << ---------------------

Kernel died before replying to kernel_info



--------------------- >> end captured stdout << ----------------------



----------------------------------------------------------------------

Ran 35 tests in 1024.216s



FAILED (failures=1)

build.py: 2018-05-31 19:41:38,780 Running of command in container failed (1): nvidia-docker run --rm -t --shm-size=3g -v /home/jenkins_slave/workspace/it-tutorials-py2:/work/mxnet -v /home/jenkins_slave/workspace/it-tutorials-py2/build:/work/build -u 1001:1001 mxnet/build.ubuntu_gpu /work/runtime_functions.sh tutorialtest_ubuntu_python2_gpu

build.py: 2018-05-31 19:41:38,780 You can try to get into the container by using the following command: nvidia-docker run --rm -t --shm-size=3g -v /home/jenkins_slave/workspace/it-tutorials-py2:/work/mxnet -v /home/jenkins_slave/workspace/it-tutorials-py2/build:/work/build -u 1001:1001 -ti --entrypoint /bin/bash mxnet/build.ubuntu_gpu /work/runtime_functions.sh tutorialtest_ubuntu_python2_gpu

into container: False

Traceback (most recent call last):

  File "ci/build.py", line 307, in <module>

    sys.exit(main())

  File "ci/build.py", line 243, in main

    container_run(platform, docker_binary, shared_memory_size, command)

  File "ci/build.py", line 154, in container_run

    raise subprocess.CalledProcessError(ret, cmd)

subprocess.CalledProcessError: Command 'nvidia-docker run --rm -t --shm-size=3g -v /home/jenkins_slave/workspace/it-tutorials-py2:/work/mxnet -v /home/jenkins_slave/workspace/it-tutorials-py2/build:/work/build -u 1001:1001 mxnet/build.ubuntu_gpu /work/runtime_functions.sh tutorialtest_ubuntu_python2_gpu' returned non-zero exit status 1

script returned exit code 1

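The notebook kernel for unsupervised_learning/gan died because its control socket could not bind (zmq.error.ZMQError: Address already in use), so test_unsupervised_learning_gan failed with "Kernel died before replying to kernel_info". Possibly related to hardcoded ports.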
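
To illustrate the suspected failure mode, here is a minimal sketch using only the Python standard library. It is not taken from the MXNet test harness or from ipykernel; the helper names find_free_port and bind_collides are hypothetical. It shows that binding a second socket to a port that is already taken fails with EADDRINUSE, and that binding to port 0 lets the OS pick an unused port instead of relying on a fixed one.

```python
import errno
import socket


def find_free_port(host="127.0.0.1"):
    """Ask the OS for a currently unused TCP port (hypothetical helper)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind((host, 0))  # port 0 means "let the OS choose a free port"
        return s.getsockname()[1]


def bind_collides(host="127.0.0.1"):
    """Return True if a second bind to an occupied port fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as first:
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as second:
            try:
                # Same host and port as `first`, which is still open.
                second.bind((host, port))
            except OSError as exc:
                return exc.errno == errno.EADDRINUSE
    return False


if __name__ == "__main__":
    print("collision reproduces EADDRINUSE:", bind_collides())
    print("OS-assigned free port:", find_free_port())
```

Note that a port returned by find_free_port can still be claimed by another process before it is reused, so this narrows the race rather than removing it. Whether the tutorial tests actually hardcode ports has not been confirmed here.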
