bagit.py and broken soft links #115

@ajnelson-nist

Hello,

Today I found that bagit.py fails on a broken soft link and halts bagging of an otherwise intact file system tree. I was running the script on a directory tree that included a soft link whose target was apparently meant to be created later (e.g. .../foo.cfg pointing to .../config/populated_by_user/foo.cfg, similar to what the Apache web server does with config files). The execution environment in this case is a POSIX file system.

The problem in bagit.py appears to stem from the function _can_read, on (today's) Line 1362. The broken soft link in this case pointed at a non-existent directory, raising an error on Line 206.

I suggest that a broken soft link should not prevent a directory tree from being bagged. It may be better for _can_read to only report actual directories and files that are unreadable, possibly with broken links as a new third output.
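To illustrate the suggestion, here is a minimal sketch of a readability check that reports broken links separately. This is not bagit.py's actual code: the function name find_unreadable and the three-list return value are hypothetical, and it assumes a POSIX file system where os.symlink is available.

```python
import os


def find_unreadable(root):
    """Hypothetical readability check; NOT bagit.py's actual _can_read.

    Returns three lists: unreadable directories, unreadable files,
    and broken soft links found under root.
    """
    unreadable_dirs = []
    unreadable_files = []
    broken_links = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full_path = os.path.join(dirpath, name)
            # A broken link is a symlink whose target does not exist.
            # os.path.exists follows links, so it returns False here.
            if os.path.islink(full_path) and not os.path.exists(full_path):
                broken_links.append(full_path)
            elif not os.access(full_path, os.R_OK):
                if os.path.isdir(full_path):
                    unreadable_dirs.append(full_path)
                else:
                    unreadable_files.append(full_path)
    return unreadable_dirs, unreadable_files, broken_links
```

With a third output like this, a caller could log the broken links as warnings and continue bagging, while still stopping on files and directories that are genuinely unreadable.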

If helpful, there is a script that converts a file system walk (via os.walk) to DFXML, and it has an if-ladder that goes through all file-system-level file types, not just directories and regular files. See the walk_to_dfxml.py function filepath_to_fileobject, and all assignment statements matching name_type = (starting on Line 36 today). You may want _can_read to skip operating on other file types as well.
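As a rough sketch of what such an if-ladder can look like (this is not the code from walk_to_dfxml.py; the function name classify_path and the returned labels are hypothetical), the standard library's stat module can distinguish the POSIX file types without following links:

```python
import os
import stat


def classify_path(path):
    """Hypothetical helper: return a label for the file type at path."""
    # lstat inspects the link itself, so broken soft links do not raise.
    mode = os.lstat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular"
    elif stat.S_ISDIR(mode):
        return "directory"
    elif stat.S_ISLNK(mode):
        return "symlink"
    elif stat.S_ISFIFO(mode):
        return "fifo"
    elif stat.S_ISSOCK(mode):
        return "socket"
    elif stat.S_ISCHR(mode):
        return "char_device"
    elif stat.S_ISBLK(mode):
        return "block_device"
    return "unknown"
```

A readability check could then apply os.access only to paths classified as "regular" or "directory" and skip or separately report everything else.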

--Alex
