Tests for ls reinstated, underlying function fixed, support for ZARR-BIDS files added.
#1164
Merged
Changes from all commits (37 commits):

- d3ec5f1 Unravelling `ls` issues (TheChymera)
- dc34ab0 strange `Cannot creat weak reference to PosixPath object` (TheChymera)
- 8aef541 Fixed weakref issue (TheChymera)
- b3a4b9a Added readme listing and removed some diagnostic print calls (TheChymera)
- 3272c86 Removed more debugging print calls (TheChymera)
- ba99146 Removed more debugging print calls (TheChymera)
- 64677c1 Creating and passing digests (TheChymera)
- f8a0da8 Added validation bids schema version to result object metadata (TheChymera)
- 430fb06 Assigning bids schema version for validation to metadata dict (TheChymera)
- 279ecf9 Trying to insert bids_schema_version ... somewhere. (TheChymera)
- 60b2408 Added (get_validation)_bids_version to BIDS assets (TheChymera)
- 4a4530e Creating ZARR digest for ZARR files (TheChymera)
- 9df52ad Using correct ZARR digest class (still not working) (TheChymera)
- 0e0ff8e More debugging print calls trying to track down digest error (TheChymera)
- cc28fbb Removed unneeded import (TheChymera)
- 5d59ae8 Using zarr extensions variable from constants module (TheChymera)
- 5d62fcc Reinstated fake digest after introducing ZARR conditional (TheChymera)
- f0de103 Attempting to fix digest type conflict (still not working) (TheChymera)
- 86ae9c0 Use Zarr checksum to set metadata contentSize (jwodder)
- 9866d0c Removed debugging print calls (TheChymera)
- 785dec0 Not requiring bids_version for ValidatorOrigin (TheChymera)
- 2059dd7 Typing fixes and improved variable name (TheChymera)
- a2119ec Attemting type check fix for missing attribute (TheChymera)
- 722e804 Removed commented imports (TheChymera)
- 3649b5e Actually fix typing (jwodder)
- 32d0379 Debugging upload tests (TheChymera)
- ba7d60c Reinstating test (TheChymera)
- e8a64e1 Associate BIDS asset errors with the actual assets (jwodder)
- ccda9ec Fixed novel README validation error (TheChymera)
- 44a8ab4 Checking ls output subject identifier (TheChymera)
- efab43e Added ls command tests for zarrBIDS file (TheChymera)
- 1ed0f5f ZARR-appropriate fake checksum (TheChymera)
- 9900841 More compact code (TheChymera)
- 3384ef5 Dot in extension (TheChymera)
- d9ccf8b Safer assert satatement (TheChymera)
- 375947d Making sure the method which sets the attribute has been run (TheChymera)
- 11c18f5 Updated logic to better accommodate future BIDS NWB examples (TheChymera)
```diff
@@ -28,6 +28,7 @@
 import tenacity

 from . import __version__, get_logger
+from .consts import ZARR_EXTENSIONS, metadata_all_fields
 from .dandiset import Dandiset
 from .misctypes import Digest
 from .pynwb_utils import (
@@ -38,14 +39,22 @@
     metadata_cache,
     nwb_has_external_links,
 )
-from .utils import ensure_datetime, get_mime_type, get_utcnow_datetime
+from .utils import (
+    ensure_datetime,
+    find_parent_directory_containing,
+    get_mime_type,
+    get_utcnow_datetime,
+)

 lgr = get_logger()


 # Disable this for clean hacking
 @metadata_cache.memoize_path
-def get_metadata(path: Union[str, Path]) -> Optional[dict]:
+def get_metadata(
+    path: Union[str, Path],
+    digest: Optional[Digest] = None,
+) -> Optional[dict]:
     """
     Get "flatdata" from a .nwb file or a Dandiset directory
@@ -59,64 +68,90 @@ def get_metadata(path: Union[str, Path]) -> Optional[dict]:
     -------
     dict
     """
+    from .files import bids, dandi_file, find_bids_dataset_description
+
     # when we run in parallel, these annoying warnings appear
     ignore_benign_pynwb_warnings()
-    path = str(path)  # for Path
+    path = os.path.abspath(str(path))  # for Path
     meta = dict()

-    if op.isdir(path):
+    if op.isdir(path) and not path.endswith(tuple(ZARR_EXTENSIONS)):
         try:
             dandiset = Dandiset(path)
             return cast(dict, dandiset.metadata)
         except ValueError as exc:
             lgr.debug("Failed to get metadata for %s: %s", path, exc)
             return None

-    if nwb_has_external_links(path):
-        raise NotImplementedError(
-            f"NWB files with external links are not supported: {path}"
-        )
+    # Is the data BIDS (as defined by the presence of a BIDS dataset descriptor)
+    bids_dataset_description = find_bids_dataset_description(path)
+    if bids_dataset_description:
+        dandiset_path = find_parent_directory_containing("dandiset.yaml", path)
+        bids_dataset_description = find_bids_dataset_description(path)
+        df = dandi_file(
+            Path(path),
+            dandiset_path,
+            bids_dataset_description=bids_dataset_description,
+        )
+        path_metadata = df.get_metadata(digest=digest)
+        assert isinstance(df, bids.BIDSAsset)
+        meta["bids_version"] = df.get_validation_bids_version()
+        # there might be a more elegant way to do this:
+        for key in metadata_all_fields:
+            try:
+                value = getattr(path_metadata.wasAttributedTo[0], key)
+            except AttributeError:
+                pass
+            else:
+                meta[key] = value
+    elif path.endswith((".NWB", ".nwb")):
+        if nwb_has_external_links(path):
+            raise NotImplementedError(
+                f"NWB files with external links are not supported: {path}"
+            )

-    # First read out possibly available versions of specifications for NWB(:N)
-    meta["nwb_version"] = get_nwb_version(path)
+        # First read out possibly available versions of specifications for NWB(:N)
+        meta["nwb_version"] = get_nwb_version(path)

-    # PyNWB might fail to load because of missing extensions.
-    # There is a new initiative of establishing registry of such extensions.
-    # Not yet sure if PyNWB is going to provide "native" support for needed
-    # functionality: https://github.com/NeurodataWithoutBorders/pynwb/issues/1143
-    # So meanwhile, hard-coded workaround for data types we care about
-    ndtypes_registry = {
-        "AIBS_ecephys": "allensdk.brain_observatory.ecephys.nwb",
-        "ndx-labmetadata-abf": "ndx_dandi_icephys",
-    }
-    tried_imports = set()
-    while True:
-        try:
-            meta.update(_get_pynwb_metadata(path))
-            break
-        except KeyError as exc:  # ATM there is
-            lgr.debug("Failed to read %s: %s", path, exc)
-            res = re.match(r"^['\"\\]+(\S+). not a namespace", str(exc))
-            if not res:
-                raise
-            ndtype = res.groups()[0]
-            if ndtype not in ndtypes_registry:
-                raise ValueError(
-                    "We do not know which extension provides %s. "
-                    "Original exception was: %s. " % (ndtype, exc)
-                )
-            import_mod = ndtypes_registry[ndtype]
-            lgr.debug("Importing %r which should provide %r", import_mod, ndtype)
-            if import_mod in tried_imports:
-                raise RuntimeError(
-                    "We already tried importing %s to provide %s, but it seems it didn't help"
-                    % (import_mod, ndtype)
-                )
-            tried_imports.add(import_mod)
-            __import__(import_mod)
-
-    meta["nd_types"] = get_neurodata_types(path)
+        # PyNWB might fail to load because of missing extensions.
+        # There is a new initiative of establishing registry of such extensions.
+        # Not yet sure if PyNWB is going to provide "native" support for needed
+        # functionality: https://github.com/NeurodataWithoutBorders/pynwb/issues/1143
+        # So meanwhile, hard-coded workaround for data types we care about
+        ndtypes_registry = {
+            "AIBS_ecephys": "allensdk.brain_observatory.ecephys.nwb",
+            "ndx-labmetadata-abf": "ndx_dandi_icephys",
+        }
+        tried_imports = set()
+        while True:
+            try:
+                meta.update(_get_pynwb_metadata(path))
+                break
+            except KeyError as exc:  # ATM there is
+                lgr.debug("Failed to read %s: %s", path, exc)
+                res = re.match(r"^['\"\\]+(\S+). not a namespace", str(exc))
+                if not res:
+                    raise
+                ndtype = res.groups()[0]
+                if ndtype not in ndtypes_registry:
+                    raise ValueError(
+                        "We do not know which extension provides %s. "
+                        "Original exception was: %s. " % (ndtype, exc)
+                    )
+                import_mod = ndtypes_registry[ndtype]
+                lgr.debug("Importing %r which should provide %r", import_mod, ndtype)
+                if import_mod in tried_imports:
+                    raise RuntimeError(
+                        "We already tried importing %s to provide %s, but it seems it didn't help"
+                        % (import_mod, ndtype)
+                    )
+                tried_imports.add(import_mod)
+                __import__(import_mod)
+
+        meta["nd_types"] = get_neurodata_types(path)
+    else:
+        raise RuntimeError("Unable to get metadata from non-BIDS, non-NWB asset.")
     return meta
@@ -949,6 +984,12 @@ def add_common_metadata(
             "mtime %s of %s is in the future", metadata["blobDateModified"], path
         )
     metadata["contentSize"] = os.path.getsize(path)
+    if digest is not None and digest.algorithm is models.DigestType.dandi_zarr_checksum:
+        m = re.fullmatch(
+            r"(?P<hash>[0-9a-f]{32})-(?P<files>[0-9]+)--(?P<size>[0-9]+)", digest.value
+        )
+        if m:
+            metadata["contentSize"] = int(m["size"])
     metadata.setdefault("wasGeneratedBy", []).append(
         get_generator(start_time, end_time)
     )
```

Review comment (Member), attached to a suggested change on the `elif path.endswith((".NWB", ".nwb")):` line:

> In relation to prior comment #1164 (comment) -- the point is to validate .nwb regardless of whether they are part of BIDS or not. BIDS validation does not do any validation of .nwb files which could be within a BIDS dataset.
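The `contentSize` hunk above parses the dandi Zarr checksum format, which, going by the regex, is `<32-hex digest>-<file count>--<total size in bytes>`. A minimal standalone sketch of that parsing; the checksum value used here is made up for illustration:

```python
import re
from typing import Optional

# Format taken from the regex in the diff above:
# 32-hex digest, then file count, then total size in bytes.
ZARR_CHECKSUM_RE = re.compile(
    r"(?P<hash>[0-9a-f]{32})-(?P<files>[0-9]+)--(?P<size>[0-9]+)"
)


def zarr_checksum_size(value: str) -> Optional[int]:
    """Return the total byte size encoded in a dandi-zarr-checksum value, or None."""
    m = ZARR_CHECKSUM_RE.fullmatch(value)
    return int(m["size"]) if m else None


# Hypothetical checksum value, purely for illustration:
print(zarr_checksum_size("0" * 32 + "-42--123456"))  # 123456
print(zarr_checksum_size("not-a-zarr-checksum"))     # None
```

Using `fullmatch` rather than `match` mirrors the diff: a value with trailing junk after the size field is rejected outright instead of being partially parsed.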
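The BIDS branch in the diff copies known metadata fields off the first `wasAttributedTo` entry with a getattr/AttributeError loop (the spot the diff itself flags with "there might be a more elegant way to do this"). The same pattern in isolation, with hypothetical field names and a stand-in object rather than the real dandi models:

```python
from types import SimpleNamespace

# Hypothetical stand-ins for metadata_all_fields and a wasAttributedTo participant:
metadata_all_fields = ("age", "sex", "species", "genotype")
participant = SimpleNamespace(age="P30D", species="Mus musculus")

meta = {}
for key in metadata_all_fields:
    try:
        value = getattr(participant, key)
    except AttributeError:
        pass  # field absent on this participant; leave it out of meta
    else:
        meta[key] = value

print(meta)  # {'age': 'P30D', 'species': 'Mus musculus'}
```

A common alternative is a sentinel default, `getattr(participant, key, None)` plus a guard, which drops the try/except but conflates missing attributes with attributes that are genuinely `None`.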