Special character in filename/path not escaped in id, when using ROCrate(init=True) #224

@LauLauThom

I installed the rocrate package from the GitHub URL, so I have the version at the latest commit, which includes the fixes from PR #217.

Even so, I think special characters (spaces, etc.) in a filename or path are not escaped/quoted when a crate is created from an existing directory that contains files with such characters.

Here is an example reproducing the issue.

from rocrate.rocrate import ROCrate
import os

new_crate_root = "./test_crate"

# create the crate directory if it does not exist yet
os.makedirs(new_crate_root, exist_ok=True)

# create a file with spaces in the name
with open(os.path.join(new_crate_root, "file with space.txt"), "w") as f:
    f.write("empty")

# Initialize a crate from the directory and save it to create the json
crate = ROCrate(new_crate_root, init=True)
crate.write(new_crate_root)

# then reload it, this time parsing it
crate = ROCrate(new_crate_root)
file1 = crate.get_by_type("File")[0]

print(f"{file1.id=}")  # the id contains no "%": the spaces were not percent-encoded
assert "%" in file1.id  # fails, which demonstrates the issue

I believe the quoting happens in all other cases (see below).
But when init=True is passed to the ROCrate constructor, crate.source is truthy, so the identifier is not quoted.

if not crate.source:
    identifier = quote(identifier)
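For reference, here is a minimal sketch of the id I would expect for the file above. It assumes the `quote` in the snippet is `urllib.parse.quote` from the standard library (I have not verified this against the rocrate source), and `expected_file_id` is a hypothetical helper for illustration only, not part of rocrate.

```python
from urllib.parse import quote, unquote


def expected_file_id(rel_path: str) -> str:
    """Hypothetical helper: percent-encode a crate-relative path.

    Assumes urllib.parse.quote semantics, where "/" is kept as-is
    by default so nested paths stay readable.
    """
    return quote(rel_path)


file_id = expected_file_id("file with space.txt")
nested_id = expected_file_id("sub dir/file with space.txt")

print(file_id)    # file%20with%20space.txt
print(nested_id)  # sub%20dir/file%20with%20space.txt

# Decoding the id gives back the original file name on disk.
assert unquote(file_id) == "file with space.txt"
```

With this encoding, the assertion `"%" in file1.id` from the reproduction script would pass for the example file.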
