What is this
We need the ability to decompress a remote file (a tar file in a bucket) and write the contents to another path in the bucket.
Example:
/bucket/compressed/foo.tar
# is decompressed to
/bucket/somewhere-else/bar.txt
/bucket/somewhere-else/baz.json
What to do
dpytools/s3/decompressor.py
I'm envisioning (though by all means change this to something that makes more sense) something like:
# We're looking for something along the lines of
def extract_s3_object_to_s3_path(location_of_tar_file, path_like_location, pristine_target_path=True, format="tar"):
"""
    `location_of_tar_file` would be the full s3 location of the compressed file.
    `path_like_location` would be something like `/dataset-1/something-else/`, so each file decompressed from the tar goes into the new path.
    If pristine_target_path=True, then assert that no files exist on that path before you do anything else.
    If any format other than "tar" is passed in, for now raise a NotImplementedError please.
    Where we're saying it's a tar file with that kwarg, assert that it actually is a tar file.
    """
Acceptance Criteria
- Functionality implemented and unit tested
Please note: someone else is doing the initial s3 functions, which will include get_s3_object.
Given it's a very simple function, write your own for this if the other task isn't merged when you pick this up; we'll switch over to the generic one when it's done.
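To make the envisioned behaviour concrete, here is a rough sketch of how the function could look. Assumptions beyond the issue text: the s3 location is an `s3://bucket/key` string, the target path is a prefix in the same bucket, and a boto3-style `client` is injected as an extra kwarg so the logic can be exercised without real AWS credentials (`FakeS3` style stubs or moto in tests). This is a sketch under those assumptions, not the final implementation.

```python
import io
import tarfile


def extract_s3_object_to_s3_path(location_of_tar_file, path_like_location,
                                 pristine_target_path=True, format="tar",
                                 client=None):
    """Decompress a tar object and write each member under a new path
    in the same bucket.

    `client` is assumed to be a boto3-style S3 client exposing
    get_object, put_object and list_objects_v2; it is injected here so
    the sketch can be tested without hitting AWS.
    """
    if format != "tar":
        raise NotImplementedError(f"Format {format!r} is not supported yet")

    # Split "s3://bucket/some/key.tar" into bucket and key.
    bucket, _, tar_key = location_of_tar_file.removeprefix("s3://").partition("/")
    target_prefix = path_like_location.lstrip("/")
    if not target_prefix.endswith("/"):
        target_prefix += "/"

    if pristine_target_path:
        # Assert the target path is empty before doing anything else.
        existing = client.list_objects_v2(Bucket=bucket, Prefix=target_prefix)
        assert existing.get("KeyCount", 0) == 0, (
            f"Target path {target_prefix!r} already contains files")

    # Pull the archive into memory and confirm it really is a tar file.
    body = io.BytesIO(client.get_object(Bucket=bucket, Key=tar_key)["Body"].read())
    assert tarfile.is_tarfile(body), f"{tar_key!r} is not a tar file"
    body.seek(0)

    with tarfile.open(fileobj=body) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories and special entries
            data = archive.extractfile(member).read()
            client.put_object(Bucket=bucket,
                              Key=target_prefix + member.name,
                              Body=data)
```

Reading the whole archive into memory keeps the sketch simple; if the tars can be large, streaming via `tarfile.open(fileobj=..., mode="r|")` over the response body would be worth considering.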