Description
Dataset extractors fail on an empty dataset. This matters because we are building extractors that submit to HPC resources, where the extractor triggers the upload of files from the HPC resource. Users will likely create empty datasets to be filled this way.
Here is the stack trace produced when an extractor is submitted on an empty dataset:
    self._RealGetContents()
  File "/usr/local/lib/python3.9/zipfile.py", line 1324, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
2021-08-24 19:37:07,512 [Thread-16 ] INFO : pyclowder.connectors - [61254a46e4b0d8ae89cf36d6] : StatusMessage.retry: (#10) File is not a zip file
2021-08-24 19:37:08,244 [Thread-17 ] DEBUG : pyclowder.connectors - ['tcnichol@illinois.edu']
2021-08-24 19:37:08,371 [Thread-17 ] INFO : pyclowder.connectors - [61254a46e4b0d8ae89cf36d6] : StatusMessage.start: Started processing.
2021-08-24 19:37:08,372 [Thread-17 ] DEBUG : pyclowder.extractors - default check message : {'notifies': ['tcnichol@illinois.edu'], 'source': {'id': {'resourceType': "'dataset", 'id': '61254a46e4b0d8ae89cf36d6'}, 'extra': {}}, 'jobid': '61254a5ae4b0d8ae89cf36da', 'msgid': '61254a5ae4b0d8ae89cf36db', 'flags': '', 'intermediateId': '61254a46e4b0d8ae89cf36d6', 'host': 'https://pdg.clowderframework.org', 'datasetId': '61254a46e4b0d8ae89cf36d6', 'id': '61254a46e4b0d8ae89cf36d6', 'datasetname': 'EMPTY', 'fileSize': '0', 'target': '{}', 'secretKey': 'aea4c447-7a7e-4c7d-b717-bde0fb57eed0', 'activity': 'submitted', 'routing_key': 'extractors.ncsa.maple.bridges2.dataset', 'parameters': '{"directory":"/jet/home/ocean/MAPLE/data"}', 'action': 'manual-submission', 'retry_count': 10}
2021-08-24 19:37:08,457 [Thread-17 ] INFO : pyclowder.connectors - [61254a46e4b0d8ae89cf36d6] : StatusMessage.processing: Downloading dataset.
2021-08-24 19:37:08,509 [Thread-17 ] ERROR : pyclowder.connectors - [61254a46e4b0d8ae89cf36d6] File is not a zip file
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/pyclowder/connectors.py", line 443, in _process_message
    (file_paths, tmp_files, tmp_dirs) = self._prepare_dataset(host, secret_key, resource)
  File "/usr/local/lib/python3.9/site-packages/pyclowder/connectors.py", line 357, in _prepare_dataset
    file_paths = pyclowder.utils.extract_zip_contents(inputzip)
  File "/usr/local/lib/python3.9/site-packages/pyclowder/utils.py", line 125, in extract_zip_contents
    zipobj = zipfile.ZipFile(zipfilepath)
  File "/usr/local/lib/python3.9/zipfile.py", line 1257, in __init__
    self._RealGetContents()
  File "/usr/local/lib/python3.9/zipfile.py", line 1324, in _RealGetContents
    raise BadZipFile("File is not a zip file")
zipfile.BadZipFile: File is not a zip file
2021-08-24 19:37:08,509 [Thread-17 ] INFO : pyclowder.connectors - [61254a46e4b0d8ae89cf36d6] : StatusMessage.error: File is not a zip file
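
The trace shows the failure coming from `zipfile.ZipFile(zipfilepath)` inside `extract_zip_contents`. One possible workaround, sketched below, is to check the downloaded file before opening it. This is not pyclowder's code: `extract_zip_contents_safe` and its `dest_dir` parameter are hypothetical names, and I am assuming that the download for an empty dataset is either zero bytes or otherwise not a valid zip archive.

```python
import os
import tempfile
import zipfile


def extract_zip_contents_safe(zip_path, dest_dir=None):
    """Hypothetical helper: extract a dataset archive, returning [] for an empty dataset.

    Assumption: an empty dataset downloads as a zero-byte file or as
    something that zipfile does not recognise as an archive.
    """
    # Guard before opening, so zipfile.BadZipFile is never raised here.
    if os.path.getsize(zip_path) == 0 or not zipfile.is_zipfile(zip_path):
        return []

    if dest_dir is None:
        dest_dir = tempfile.mkdtemp()

    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
        # Skip directory entries; return only paths to extracted files.
        return [
            os.path.join(dest_dir, name)
            for name in archive.namelist()
            if not name.endswith("/")
        ]
```

Treating every non-zip download as an empty dataset would also hide a genuinely corrupted download, so a real fix might instead check the dataset's file count before downloading, or log a warning when the guard triggers.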