Library cannot generate manifest entries for or validate bags with hidden (.) payload files #51

@ssciolla

Description

Hi, while validating a bag generated by another BagIt implementation (bagit-python) today, we discovered that hidden files (those whose names start with ., such as .keep) are not handled by this library. The bag included hidden files in both the manifest and the payload, but the library didn't find them in the payload, so the completeness check failed. I believe the following line is the source of the issue. Because it's also used by the manifest! method, hidden files would also be left out of the manifest when a bag is generated by this library.

bagit/lib/bagit/bag.rb

Lines 39 to 41 in 4a7fb6d

def bag_files
  Dir[File.join(data_dir, "**", "*")].select { |f| File.file? f }
end
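The behavior can be reproduced outside the library with a small sketch. The helper names `current_bag_files` and `demo_missing_hidden_files` and the temporary directory layout are hypothetical; only the glob expression is taken from `bag_files`.

```ruby
require "tmpdir"
require "fileutils"

# Same glob expression as bag_files, with data_dir passed in explicitly
# (hypothetical helper, not part of the library).
def current_bag_files(data_dir)
  Dir[File.join(data_dir, "**", "*")].select { |f| File.file? f }
end

# Builds a throwaway payload directory and returns the relative paths
# that the current pattern finds.
def demo_missing_hidden_files
  Dir.mktmpdir do |data_dir|
    FileUtils.mkdir_p(File.join(data_dir, "sub"))
    File.write(File.join(data_dir, "a.txt"), "a")
    File.write(File.join(data_dir, ".keep"), "")
    File.write(File.join(data_dir, "sub", ".hidden"), "h")

    current_bag_files(data_dir)
      .map { |f| f.delete_prefix("#{data_dir}/") }
      .sort
  end
end

p demo_missing_hidden_files
# => ["a.txt"]
```

Neither `.keep` nor `sub/.hidden` is returned, which matches the failed completeness check described above.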

A fix would likely involve using Dir.glob with a pattern that also matches these files; the Ruby docs suggest '{*,.*}', so for this library we might use '**/{*,.*}'. Dir[] should also work with the updated pattern. There is also a File::FNM_DOTMATCH flag, but it would cause . and .. to be included as well (which we don't want).
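As a rough sketch of the two options, the snippet below compares the brace pattern with File::FNM_DOTMATCH on a temporary directory. The helper names and file layout are hypothetical, and this is not a proposed patch. One assumption worth checking against the Ruby docs: as far as I can tell, `**/` does not descend into hidden directories unless File::FNM_DOTMATCH is set, so the brace pattern would still miss files inside a directory such as `.hid/`. Because both helpers keep the `File.file?` filter, any `.` or `..` entries that a pattern matches are dropped, since they are directories.

```ruby
require "tmpdir"
require "fileutils"

# Option 1 (hypothetical helper): brace pattern from the Ruby docs.
def files_with_brace_pattern(data_dir)
  Dir.glob(File.join(data_dir, "**", "{*,.*}")).select { |f| File.file? f }
end

# Option 2 (hypothetical helper): original pattern plus FNM_DOTMATCH.
def files_with_dotmatch(data_dir)
  Dir.glob(File.join(data_dir, "**", "*"), File::FNM_DOTMATCH)
     .select { |f| File.file? f }
end

# Runs both options on the same throwaway payload directory and returns
# the sorted relative paths each one finds.
def compare_hidden_file_patterns
  Dir.mktmpdir do |data_dir|
    FileUtils.mkdir_p(File.join(data_dir, "sub"))
    FileUtils.mkdir_p(File.join(data_dir, ".hid"))
    File.write(File.join(data_dir, "a.txt"), "a")
    File.write(File.join(data_dir, ".keep"), "")
    File.write(File.join(data_dir, "sub", ".hidden"), "h")
    File.write(File.join(data_dir, ".hid", "inner.txt"), "i")

    relative = ->(paths) { paths.map { |f| f.delete_prefix("#{data_dir}/") }.sort }
    {
      brace: relative.call(files_with_brace_pattern(data_dir)),
      dotmatch: relative.call(files_with_dotmatch(data_dir))
    }
  end
end

result = compare_hidden_file_patterns
p result[:brace]
p result[:dotmatch]
```

If the assumption about hidden directories holds, the two lists differ only by `.hid/inner.txt`, which would be a reason to prefer the flag combined with the existing `File.file?` filter.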

In reviewing the spec, I don't see any indication that these files should be ignored or not included. If people seem to agree this is an issue, I'm happy to open a PR with some added/updated tests. If I'm incorrect somewhere in this analysis, I'm also happy to be pointed in the right direction.

Resources

Update: Thinking more about this, it may be considered a breaking change for current users of the library, since bags with hidden payload files that they previously generated using the library would now be considered invalid by the "fixed" version. That would have to be handled somehow, assuming this change is a wanted improvement.
