Description
What is this
We need the ability to move one or more files internally within an AWS S3 bucket,
i.e. change a structure of
/files/foo.json
/files/bar.txt
to
/otherplace/foo.json
/otherplace/bar.txt
This includes partial moves; it is not as simple as copying and pasting everything. For example:
/files/foo.json
/files/bar.txt
to
/files/foo.json
/otherplace/bar.txt
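The partial move above comes down to a key mapping: keep the file name, swap the directory part. A minimal sketch of that mapping follows. The helper name `target_key` and the `moves` dictionary are hypothetical, and it assumes keys are stored without a leading slash.

```python
import posixpath


def target_key(source_key: str, target_path: str) -> str:
    """Return the key an object would have after moving it under target_path."""
    filename = posixpath.basename(source_key)
    prefix = target_path.strip("/")
    return f"{prefix}/{filename}" if prefix else filename


# Partial move: only bar.txt changes location, foo.json stays where it is.
keys = ["files/foo.json", "files/bar.txt"]
moves = {"files/bar.txt": "/otherplace"}  # source key -> target path
result = [target_key(k, moves[k]) if k in moves else k for k in keys]
print(result)  # ['files/foo.json', 'otherplace/bar.txt']
```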
Why not just copy rather than move?
A copy operation has to recreate the data, which can be very slow for large files. By "moving" the files, the intention is that (under the hood) we are really just changing a pointer, i.e. a bit of metadata on the object that says "this is where the file is".
TLDR - performance reasons / future proofing.
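One caveat worth checking before implementing: as far as I know, general purpose S3 buckets do not expose a rename call through boto3, so the closest thing to a "move" is a server-side copy followed by a delete. The data is not downloaded and re-uploaded, but it is not a pure pointer change either. A minimal sketch, assuming `client` is a `boto3.client("s3")` (or anything with the same `copy_object` / `delete_object` methods); the function name `move_key` is hypothetical:

```python
def move_key(client, bucket: str, source_key: str, target_key: str) -> None:
    """Move one object within a bucket as a server-side copy, then a delete."""
    if source_key == target_key:
        # Nothing to do, and deleting the source here would destroy the object.
        return
    client.copy_object(
        Bucket=bucket,
        Key=target_key,
        CopySource={"Bucket": bucket, "Key": source_key},
    )
    # Only reached if the copy did not raise.
    client.delete_object(Bucket=bucket, Key=source_key)
```

A single `copy_object` request is limited to 5 GB source objects; for larger files a multipart copy (for example boto3's managed `client.copy`) would be needed instead.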
What to do
The code should live in /pytools/s3/bucket.py. You'll be looking to create something like:
Note: initial sketches only, change the implementation if something else makes more sense.
`/dpytools/s3/basic.py`
def move_one_object_internally(s3_object, path, assert_target_path_is_empty=False):
    """
    Take one object and move it to another path within the bucket.

    If assert_target_path_is_empty is True, raise an exception if other
    files already exist on that path.

    You don't need the bucket name/path, as this moves files internally
    within the bucket.
    """
    ...

# If s3_object is at /myfiles/stuff/data.csv then
move_one_object_internally(s3_object, "/otherfiles")
# should result in the s3_object residing at:
# /otherfiles/data.csv

def move_many_objects_internally(list_of_s3_objects, path, all_objects_on_same_path=True, assert_target_path_is_empty=True):
    """
    Take a list of objects and move them all to the same path.

    Where all_objects_on_same_path is True, assert that all the objects being
    moved start on the same path (in the same sub directory).
    """
    ...
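One possible way to fill in the two sketches is shown here, mainly to make the two flags concrete. It is a sketch under assumptions, not a settled design: it assumes `s3_object` is a boto3 resource-style object exposing `bucket_name`, `key` and `meta.client`, that a "move" is a server-side copy followed by a delete, and that the helpers `_target_key` and `_path_has_objects` (both hypothetical) are acceptable. Returning the new key(s) is also an addition for illustration.

```python
import posixpath


def _target_key(source_key, path):
    """Keep the file name, replace the directory part with path."""
    prefix = path.strip("/")
    name = posixpath.basename(source_key)
    return f"{prefix}/{name}" if prefix else name


def _path_has_objects(client, bucket, path):
    """True if at least one object key starts with the given path."""
    prefix = path.strip("/")
    prefix = f"{prefix}/" if prefix else ""
    response = client.list_objects_v2(Bucket=bucket, Prefix=prefix, MaxKeys=1)
    return response.get("KeyCount", 0) > 0


def move_one_object_internally(s3_object, path, assert_target_path_is_empty=False):
    client = s3_object.meta.client  # assumption: boto3 resource-style object
    bucket, source_key = s3_object.bucket_name, s3_object.key
    if assert_target_path_is_empty and _path_has_objects(client, bucket, path):
        raise ValueError(f"Target path {path!r} already contains objects")
    new_key = _target_key(source_key, path)
    if new_key == source_key:
        return new_key  # already in place; deleting would lose the object
    client.copy_object(
        Bucket=bucket,
        Key=new_key,
        CopySource={"Bucket": bucket, "Key": source_key},
    )
    client.delete_object(Bucket=bucket, Key=source_key)
    return new_key


def move_many_objects_internally(list_of_s3_objects, path,
                                 all_objects_on_same_path=True,
                                 assert_target_path_is_empty=True):
    if not list_of_s3_objects:
        return []
    if all_objects_on_same_path:
        parents = {posixpath.dirname(obj.key) for obj in list_of_s3_objects}
        if len(parents) > 1:
            raise ValueError(f"Objects start on different paths: {sorted(parents)}")
    if assert_target_path_is_empty:
        first = list_of_s3_objects[0]
        if _path_has_objects(first.meta.client, first.bucket_name, path):
            raise ValueError(f"Target path {path!r} already contains objects")
    # Emptiness is checked once up front; checking per object would fail
    # as soon as the first object of this batch has landed on the target path.
    return [move_one_object_internally(obj, path) for obj in list_of_s3_objects]
```

Note that when `all_objects_on_same_path` is False, two objects with the same file name from different directories would overwrite each other on the target path; whether that should raise is a design decision left open here.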
You can use an S3 bucket in the bleed environment for developing this. If unsure how, ask Mike.
Acceptance Criteria
- Single files can be moved
- Lists of files can be moved
- Tests written
- Good docstrings that explain usage