running issues #4

@z4z5

Hi @danecor,
I have installed everything, and all the unit tests pass under pytest. But when I try to run the example with this command:
CUDA_VISIBLE_DEVICES=0 python run.py doom tmaze --num_steps=500000 --burnin=10000 --epsilon_period=40000
an error appears after a while. The output is below:
Namespace(
AG_SEED=None, ENV_SEED=None, NP_SEED=None, TF_SEED=None, act_func=None, beta_prior=None, burnin=10000, concurrent_batches=None, debug=False, delete_old_episodes=False, discount=None, epsilon_period=40000, exp_eps_decay=False, experiment=['tmaze'], freeze_weights=False, gpu_frac=None, grad_norm_clip=None, hist_len=None, init_capacity=None, learning_rate=None, living_reward=None, map_path=None, max_replay_size=None, min_epsilon=None, minibatch_size=None,
module=['doom'], n_z=None, net_arch=None, num_reset_steps=None, num_steps=500000, path_ext=None, pri_cutoff=None, prior_tau=None, record=False, restore_path=None, restore_weights_path=None, seed=0, show_screen=False, straight_through=False, summary_step=None, tau_max=None, tau_min=None, tau_period=None, test=False, test_epoch_length=None, test_epsilon=None, track_repeats=False, train_epoch_length=None, train_reward=False, train_step=None, trigger=None, trigger_step=None)
2019-09-11 15:51:48,095 - MainThread - INFO: Starting Experiment
2019-09-11 15:51:50.338234: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2019-09-11 15:51:50.338271: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2019-09-11 15:51:50.338277: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2019-09-11 15:51:50.338282: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2019-09-11 15:51:50.338287: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2019-09-11 15:51:52.447393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:940] Found device 0 with properties:
name: Tesla P40
major: 6 minor: 1 memoryClockRate (GHz) 1.531
pciBusID 0000:04:00.0
Total memory: 22.38GiB
Free memory: 22.21GiB
2019-09-11 15:51:52.447439: I tensorflow/core/common_runtime/gpu/gpu_device.cc:961] DMA: 0
2019-09-11 15:51:52.447446: I tensorflow/core/common_runtime/gpu/gpu_device.cc:971] 0: Y
2019-09-11 15:51:52.447456: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P40, pci bus id: 0000:04:00.0)
2019-09-11 15:51:53,336 - MainThread - INFO: Starting sweeper table
2019-09-11 15:51:53.833710: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla P40, pci bus id: 0000:04:00.0)
2019-09-11 15:51:57,364 - MainThread - INFO: Step: 01000 Reward: 8 Epsilon: 1.000
2019-09-11 15:52:00,362 - MainThread - INFO: Step: 02000 Reward: 1 Epsilon: 1.000
2019-09-11 15:52:03,139 - MainThread - INFO: Step: 03000 Reward: -8 Epsilon: 1.000
2019-09-11 15:52:06,004 - MainThread - INFO: Step: 04000 Reward: -1 Epsilon: 1.000
2019-09-11 15:52:08,949 - MainThread - INFO: Step: 05000 Reward: 7 Epsilon: 1.000
2019-09-11 15:52:11,796 - MainThread - INFO: Step: 06000 Reward: 5 Epsilon: 1.000
2019-09-11 15:52:14,769 - MainThread - INFO: Step: 07000 Reward: 3 Epsilon: 1.000
2019-09-11 15:52:17,592 - MainThread - INFO: Step: 08000 Reward: 1 Epsilon: 1.000
2019-09-11 15:52:20,517 - MainThread - INFO: Step: 09000 Reward: -4 Epsilon: 1.000
2019-09-11 15:52:23,533 - MainThread - INFO: Step: 10000 Reward: 4 Epsilon: 1.000
2019-09-11 15:52:25.069607: W tensorflow/core/framework/op_kernel.cc:1158] Unimplemented: CopySliceToElement Unhandled data type: 17
Exception in thread TrainingThread:
Traceback (most recent call last):
File "/home/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
self.run()
File "/home/anaconda2/lib/python2.7/threading.py", line 754, in run
self.__target(*self.__args, **self.__kwargs)
File "/home/z4z5/VaST/models/vae.py", line 354, in train_thread
zs, _ = super(ConcurrentVAE, self).train(step, summary_writer)
File "/home/z4z5/VaST/models/vae.py", line 211, in train
results = self.sess.run(fetches, **kwargs)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 789, in run
run_metadata_ptr)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 997, in _run
feed_dict_string, options, run_metadata)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1132, in _do_run
target_list, options, run_metadata)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1152, in _do_call
raise type(e)(node_def, op, message)
UnimplementedError: CopySliceToElement Unhandled data type: 17
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[-1,60,80,6], [-1], [-1]], output_types=[DT_UINT8, DT_UINT16, DT_BOOL], _device="/job:localhost/replica:0/task:0/cpu:0"]]
[[Node: strided_slice_8/_21 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_136_strided_slice_8", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/gpu:0"]]

Caused by op u'IteratorGetNext', defined at:
File "run.py", line 93, in <module>
restore_weights_path)
File "/home/z4z5/VaST/io_utils.py", line 90, in init_experiment
restore_step=rstep, restore_path=restore_weights_path)
File "/home/z4z5/VaST/models/base.py", line 27, in __init__
self._create_graph(params, restore_step)
File "/home/z4z5/VaST/models/base.py", line 35, in _create_graph
self._create_network()
File "/home/z4z5/VaST/models/vae.py", line 290, in _create_network
super(ConcurrentVAE, self)._create_network()
File "/home/z4z5/VaST/models/vae.py", line 37, in _create_network
self._create_input()
File "/home/z4z5/VaST/models/vae.py", line 312, in _create_input
obs, self.acts, self.starts = self.iterator.get_next()
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/contrib/data/python/ops/dataset_ops.py", line 247, in get_next
name=name))
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/ops/gen_dataset_ops.py", line 254, in iterator_get_next
output_shapes=output_shapes, name=name)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
op_def=op_def)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2506, in create_op
original_op=self._default_original_op, op_def=op_def)
File "/home/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1269, in __init__
self._traceback = _extract_stack()

UnimplementedError (see above for traceback): CopySliceToElement Unhandled data type: 17
[[Node: IteratorGetNext = IteratorGetNextoutput_shapes=[[-1,60,80,6], [-1], [-1]], output_types=[DT_UINT8, DT_UINT16, DT_BOOL], _device="/job:localhost/replica:0/task:0/cpu:0"]]
[[Node: strided_slice_8/_21 = _Recvclient_terminated=false, recv_device="/job:localhost/replica:0/task:0/gpu:0", send_device="/job:localhost/replica:0/task:0/cpu:0", send_device_incarnation=1, tensor_name="edge_136_strided_slice_8", tensor_type=DT_BOOL, _device="/job:localhost/replica:0/task:0/gpu:0"]]

I couldn't find a solution by googling it. Can you help me?
Thanks,
z4z5
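
A note on reading the error above: the number in `Unhandled data type: 17` appears to be a value of TensorFlow's `DataType` enum, where 17 is `DT_UINT16`. That would match the `DT_UINT16` entry in the `output_types` of the failing `IteratorGetNext` node. The sketch below only decodes that number. The lookup table is a hand-copied subset of the enum, and `unhandled_dtype_name` is a hypothetical helper, not part of VaST or TensorFlow.

```python
import re

# Hand-copied subset of TensorFlow's DataType enum
# (tensorflow/core/framework/types.proto). Not exhaustive.
TF_DATATYPE_NAMES = {
    1: "DT_FLOAT",
    3: "DT_INT32",
    4: "DT_UINT8",
    9: "DT_INT64",
    10: "DT_BOOL",
    17: "DT_UINT16",
}


def unhandled_dtype_name(message):
    """Hypothetical helper: map 'Unhandled data type: N' to an enum name.

    Returns None if the message has no such number, and "unknown" if the
    number is not in the subset above.
    """
    match = re.search(r"Unhandled data type: (\d+)", message)
    if match is None:
        return None
    return TF_DATATYPE_NAMES.get(int(match.group(1)), "unknown")


# The error line copied from the log above.
error_line = "UnimplementedError: CopySliceToElement Unhandled data type: 17"
print(unhandled_dtype_name(error_line))
```

If that reading is right, the second dataset component (apparently the actions, going by `obs, self.acts, self.starts`) is the one this TensorFlow build cannot batch. Whether switching it to a wider type such as int32 avoids the error is an untested assumption.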
