While pushing data to our azure remote, errors occur in updating the progress bar for some files ( I've so far seen 1-3 file uploads fail out of just 7 being pushed). This causes dvc to think that the file upload failed and so it decides not to upload the .dir file. Actually the file upload did succeed, so on a retry of the dvc push command only the missing .dir file is uploaded.
All files in the directory structure are pushed to the remote cache, as well as the .dir file representing them.
2022-05-05 13:10:00,228 DEBUG: Preparing to transfer data from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache' to 'dvc/models_v3/some_new_folder'
2022-05-05 13:10:00,228 DEBUG: Preparing to collect status from 'dvc/models_v3/some_new_folder'
2022-05-05 13:10:00,228 DEBUG: Collecting status from 'dvc/models_v3/some_new_folder'
2022-05-05 13:10:00,229 DEBUG: Querying 3 hashes via object_exists
2022-05-05 13:10:01,080 DEBUG: Querying 1 hashes via object_exists
2022-05-05 13:10:03,613 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:03,614 DEBUG: Estimated remote size: 4096 files
2022-05-05 13:10:03,614 DEBUG: Querying '9' hashes via traverse
2022-05-05 13:10:06,704 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,707 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,714 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,724 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,726 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,727 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,734 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,736 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,755 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,755 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,757 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:06,773 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 13:10:49,294 DEBUG: Preparing to collect status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 13:10:49,294 DEBUG: Collecting status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 13:10:49,648 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/43/0e4a44fc39b482c1344b2d2ca286f1' to 'dvc/models_v3/some_new_folder/43/0e4a44fc39b482c1344b2d2ca286f1'
2022-05-05 13:10:50,236 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/f9/50bdf84b33643f71b7eefd93864d5d' to 'dvc/models_v3/some_new_folder/f9/50bdf84b33643f71b7eefd93864d5d'
2022-05-05 13:10:50,271 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/bb/f019f9da1f79aa4bcd611925756a60' to 'dvc/models_v3/some_new_folder/bb/f019f9da1f79aa4bcd611925756a60'
2022-05-05 13:10:50,279 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/0d/69161c5a8d2f5883675e21454c9f5b' to 'dvc/models_v3/some_new_folder/0d/69161c5a8d2f5883675e21454c9f5b'
2022-05-05 13:10:50,300 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/5e/c95974eacea034d009e940062eba14' to 'dvc/models_v3/some_new_folder/5e/c95974eacea034d009e940062eba14'
2022-05-05 13:10:50,307 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/72/58fd7add28137624d7214a146efac5' to 'dvc/models_v3/some_new_folder/72/58fd7add28137624d7214a146efac5'
2022-05-05 13:11:43,447 ERROR: failed to transfer 'md5: 430e4a44fc39b482c1344b2d2ca286f1' - unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 25, in wrapper
func(fs_path, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 162, in func
return dest.add(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 126, in add
self._add_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 98, in _add_file
return fs.utils.transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 96, in transfer
_try_links(links, from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 66, in _try_links
return _copy(from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 44, in _copy
return to_fs.upload(from_path, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 386, in upload
return self.put_file(from_info, to_info, callback=cb, size=size)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 329, in put_file
self.fs.put_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 85, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 65, in sync
raise return_result
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 1574, in _put_file
await bc.upload_blob(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/tracing/decorator_async.py", line 74, in wrapper_use_tracer
return await func(*args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_blob_client_async.py", line 388, in upload_blob
return await upload_block_blob(**options)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 129, in upload_block_blob
return await client.commit_block_list(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_generated/aio/operations/_block_blob_operations.py", line 1006, in commit_block_list
pipeline_response = await self._client._pipeline.run(request, stream=False, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 215, in run
return await first_node.send(pipeline_request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
[Previous line repeated 4 more times]
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/policies/_redirect_async.py", line 64, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 111, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 90, in send
response_callback(response)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 298, in wrapper
callback.absolute_update(current)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/_callback.py", line 122, in absolute_update
self.progress_bar.update_to(value)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/progress.py", line 121, in update_to
self.update(current - self.n)
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
2022-05-05 13:11:45,918 ERROR: failed to transfer 'md5: 7258fd7add28137624d7214a146efac5' - unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 25, in wrapper
func(fs_path, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 162, in func
return dest.add(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 126, in add
self._add_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/objects/db.py", line 98, in _add_file
return fs.utils.transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 96, in transfer
_try_links(links, from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 66, in _try_links
return _copy(from_fs, from_path, to_fs, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/utils.py", line 44, in _copy
return to_fs.upload(from_path, to_path)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 386, in upload
return self.put_file(from_info, to_info, callback=cb, size=size)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/base.py", line 329, in put_file
self.fs.put_file(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 85, in wrapper
return sync(self.loop, func, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 65, in sync
raise return_result
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/fsspec/asyn.py", line 25, in _runner
result[0] = await coro
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 1574, in _put_file
await bc.upload_blob(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/tracing/decorator_async.py", line 74, in wrapper_use_tracer
return await func(*args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_blob_client_async.py", line 388, in upload_blob
return await upload_block_blob(**options)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/aio/_upload_helpers.py", line 129, in upload_block_blob
return await client.commit_block_list(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_generated/aio/operations/_block_blob_operations.py", line 1006, in commit_block_list
pipeline_response = await self._client._pipeline.run(request, stream=False, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 215, in run
return await first_node.send(pipeline_request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
[Previous line repeated 4 more times]
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/policies/_redirect_async.py", line 64, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 111, in send
response = await self.next.send(request)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/core/pipeline/_base_async.py", line 83, in send
response = await self.next.send(request) # type: ignore
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/azure/storage/blob/_shared/policies_async.py", line 90, in send
response_callback(response)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/adlfs/spec.py", line 298, in wrapper
callback.absolute_update(current)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/fs/_callback.py", line 122, in absolute_update
self.progress_bar.update_to(value)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/progress.py", line 121, in update_to
self.update(current - self.n)
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
------------------------------------------------------------
2022-05-05 13:11:45,920 DEBUG: failed to upload full contents of 'None', aborting .dir file upload
2022-05-05 13:11:45,921 ERROR: failed to upload '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/76/df0b9081b1d33145e688a4b23b5cf3.dir' to 'dvc/models_v3/some_new_folder/76/df0b9081b1d33145e688a4b23b5cf3.dir'
2022-05-05 13:11:45,922 ERROR: failed to push data to the cloud - 3 files failed to upload
------------------------------------------------------------
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/repo/push.py", line 68, in push
pushed += self.cloud.push(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data_cloud.py", line 85, in push
return transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 176, in transfer
_do_transfer(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/data/transfer.py", line 116, in _do_transfer
raise FileTransferError(total_fails)
dvc.exceptions.FileTransferError: 3 files failed to transfer
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/commands/data_sync.py", line 58, in run
processed_files_count = self.repo.push(
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/repo/__init__.py", line 48, in wrapper
return f(repo, *args, **kwargs)
File "/home/qcify/qcify/myagent/_work/1/s/venv/lib/python3.9/site-packages/dvc/repo/push.py", line 72, in push
raise UploadError(exc.amount)
dvc.exceptions.UploadError: 3 files failed to upload
------------------------------------------------------------
2022-05-05 13:11:45,932 DEBUG: Analytics is enabled.
2022-05-05 13:11:46,047 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmpu8n72eqg']'
2022-05-05 13:11:46,050 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmpu8n72eqg']'
2022-05-05 18:04:51,465 DEBUG: Preparing to transfer data from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache' to 'dvc/models_v3/some_new_folder'
2022-05-05 18:04:51,465 DEBUG: Preparing to collect status from 'dvc/models_v3/some_new_folder'
2022-05-05 18:04:51,466 DEBUG: Collecting status from 'dvc/models_v3/some_new_folder'
2022-05-05 18:04:51,467 DEBUG: Querying 3 hashes via object_exists
2022-05-05 18:04:52,254 DEBUG: Querying 1 hashes via object_exists
2022-05-05 18:04:54,340 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:54,341 DEBUG: Estimated remote size: 4096 files
2022-05-05 18:04:54,342 DEBUG: Querying '9' hashes via traverse
2022-05-05 18:04:57,243 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,257 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,263 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,272 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,281 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,291 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,343 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,360 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,405 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,407 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,418 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:04:57,422 DEBUG: 'dvc/models_v3/some_new_folder/00' doesn't look like a cache file, skipping
2022-05-05 18:05:17,034 DEBUG: Preparing to collect status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 18:05:17,034 DEBUG: Collecting status from '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache'
2022-05-05 18:05:17,389 DEBUG: Uploading '/home/qcify/qcify/myagent/_work/1/s/models/.dvc/cache/76/df0b9081b1d33145e688a4b23b5cf3.dir' to 'dvc/models_v3/some_new_folder/76/df0b9081b1d33145e688a4b23b5cf3.dir'
2022-05-05 18:05:17,578 DEBUG: Indexing pushed dir 'md5: 76df0b9081b1d33145e688a4b23b5cf3.dir' with '8' nested files
1 file pushed
2022-05-05 18:05:17,599 DEBUG: Analytics is enabled.
2022-05-05 18:05:17,714 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', '/tmp/tmp_8_jinwu']'
2022-05-05 18:05:17,716 DEBUG: Spawned '['daemon', '-q', 'analytics', '/tmp/tmp_8_jinwu']'
I already went digging a bit myself and found a temporary fix. The problem seems to be that a response from azure can have None for the current progress, which causes an error when updating the progress bar. I fixed this for now by changing the update_to function where the error happens as follows:
I don't know for sure if this is intended behaviour on azure's side, but it does seem like it. By following the traceback from the bottom up I found this Azure/azure-sdk-for-python#11419 (comment) link in their code. The linked comment also checks if current is None, so perhaps the solution is as simple as that?
Bug Report
Description
While pushing data to our azure remote, errors occur in updating the progress bar for some files ( I've so far seen 1-3 file uploads fail out of just 7 being pushed). This causes dvc to think that the file upload failed and so it decides not to upload the .dir file. Actually the file upload did succeed, so on a retry of the dvc push command only the missing .dir file is uploaded.
Reproduce
Expected
All files in the directory structure are pushed to the remote cache, as well as the .dir file representing them.
Environment information
Output of
dvc doctor:Additional Information (if any):
First try with failed uploads:
Second try which just uploads the .dir file:
I already went digging a bit myself and found a temporary fix. The problem seems to be that a response from azure can have None for the current progress, which causes an error when updating the progress bar. I fixed this for now by changing the update_to function where the error happens as follows:
I don't know for sure if this is intended behaviour on azure's side, but it does seem like it. By following the traceback from the bottom up I found this Azure/azure-sdk-for-python#11419 (comment) link in their code. The linked comment also checks if current is None, so perhaps the solution is as simple as that?