s3 internal file mover #13

@mikeAdamss

What is this

We need the ability to move one or more files internally within an AWS S3 bucket.

i.e. change a structure of

/files/foo.json
/files/bar.txt

to

/otherplace/foo.json
/otherplace/bar.txt

This includes partial moves; it's not as simple as copying and pasting everything. For example:

/files/foo.json
/files/bar.txt

to

/files/foo.json
/otherplace/bar.txt
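
In S3 terms the "paths" above are object keys, so a partial move amounts to rewriting the key of only the selected objects. A minimal sketch of that mapping follows. `new_key` is a hypothetical helper name, not something that exists in the code base, and the keys are written without a leading slash as an assumption about how they are stored:

```python
import posixpath


def new_key(source_key: str, target_path: str) -> str:
    """Return the key an object would have after moving it under target_path."""
    filename = posixpath.basename(source_key)
    # Assumption: keys carry no leading slash, so strip slashes from the path.
    return posixpath.join(target_path.strip("/"), filename)


keys = ["files/foo.json", "files/bar.txt"]
to_move = {"files/bar.txt"}  # partial move: foo.json stays where it is

result = [new_key(k, "/otherplace") if k in to_move else k for k in keys]
print(result)  # ['files/foo.json', 'otherplace/bar.txt']
```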

Why not just copy rather than move?

If you use a copy operation you need to recreate the data, which can be very slow for large files. The intention of "moving" a file is that, under the hood, we're really just changing a pointer, i.e. a bit of metadata on the object that says "this is where the file is".

TLDR - performance reasons / future proofing.

What to do

The code should live in /pytools/s3/bucket.py. You'll be looking to create something like:

Note: initial sketches only, change the implementation if something else makes more sense.

`/dpytools/s3/basic.py`
def move_one_object_internally(s3_object, path, assert_target_path_is_empty=False):
    """
    Take one object and move it to another path within the bucket.

    If assert_target_path_is_empty is True, raise an exception if other
    files already exist on that path.

    You don't need the bucket name/path: this moves files internally
    within the bucket.
    """
    ...


# If s3_object is at /myfiles/stuff/data.csv then
move_one_object_internally(s3_object, "/otherfiles")
# should result in the s3_object residing at:
# /otherfiles/data.csv

def move_many_objects_internally(list_of_s3_objects, path, all_objects_on_same_path=True, assert_target_path_is_empty=True):
    """
    Take a list of objects and move them all to the same path.

    Where all_objects_on_same_path is True, assert that all the objects
    being moved start on the same path (in the same subdirectory).
    """
    ...

You can use an S3 bucket in the bleed environment for developing this. If unsure how, ask Mike.
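
One way the two functions might be filled in, offered as a rough sketch rather than the agreed design. It assumes `s3_object` is a boto3 `s3.Object` resource (exposing `bucket_name` and `key`) and adds a hypothetical `client` parameter holding a boto3 S3 client, which the original signatures do not have. boto3 has no rename call for standard buckets, so the move here is a server-side `copy_object` followed by `delete_object`: the bytes are never downloaded, but `copy_object` only handles objects up to 5 GB, so larger files would need a multipart copy instead.

```python
import posixpath


def move_one_object_internally(client, s3_object, path, assert_target_path_is_empty=False):
    """Move one object to `path` in the same bucket and return its new key."""
    bucket, source_key = s3_object.bucket_name, s3_object.key
    prefix = path.strip("/")
    target_key = posixpath.join(prefix, posixpath.basename(source_key))
    if target_key == source_key:
        return target_key  # already in place; nothing to copy or delete

    if assert_target_path_is_empty:
        # Trailing slash so "otherfiles" does not also match "otherfiles2/...".
        listing = client.list_objects_v2(
            Bucket=bucket, Prefix=f"{prefix}/" if prefix else "", MaxKeys=1
        )
        if listing.get("KeyCount", 0) > 0:
            raise ValueError(f"Target path {path!r} is not empty")

    # Server-side copy to the new key, then remove the original.
    client.copy_object(
        Bucket=bucket,
        Key=target_key,
        CopySource={"Bucket": bucket, "Key": source_key},
    )
    client.delete_object(Bucket=bucket, Key=source_key)
    return target_key


def move_many_objects_internally(client, list_of_s3_objects, path,
                                 all_objects_on_same_path=True,
                                 assert_target_path_is_empty=True):
    """Move every object in the list to `path` and return the new keys."""
    if all_objects_on_same_path:
        parents = {posixpath.dirname(obj.key) for obj in list_of_s3_objects}
        if len(parents) > 1:
            raise ValueError(f"Objects are on different paths: {sorted(parents)}")

    new_keys = []
    for index, obj in enumerate(list_of_s3_objects):
        # Check emptiness only once, before the first move populates the target.
        check = assert_target_path_is_empty and index == 0
        new_keys.append(move_one_object_internally(client, obj, path, check))
    return new_keys
```

With an object stored under the key `myfiles/stuff/data.csv`, `move_one_object_internally(client, s3_object, "/otherfiles")` would return `otherfiles/data.csv`. Taking the client as an argument also makes the functions easy to test against a stub that records the calls, without touching a real bucket.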

Acceptance Criteria

  • Single files can be moved
  • Lists of files can be moved
  • Tests written
  • Good docstrings that explain usage
