Upload test to check for metadata by TheChymera · Pull Request #1180 · dandi/dandi-cli

TheChymera · 2022-12-20T13:08:32Z

The idea is to add a test which ensures that the upload code produces BIDS assets for which the correct data can always be retrieved. Currently we just check a few of the files being uploaded. I think it already works in any case, but would be a good add-on test.

@yarikoptic @jwodder any ideas how I can access the metadata of the remote asset? I tried to look at other tests which check metadata, but thus far I haven't found anything that does that.

codecov · 2022-12-20T13:12:50Z

Codecov Report

Base: 89.15% // Head: 89.07% // Decreases project coverage by -0.08% ⚠️

Coverage data is based on head (288715e) compared to base (8e67482).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1180      +/-   ##
==========================================
- Coverage   89.15%   89.07%   -0.09%     
==========================================
  Files          76       76              
  Lines        9510     9444      -66     
==========================================
- Hits         8479     8412      -67     
- Misses       1031     1032       +1

Flag	Coverage Δ
unittests	`89.07% <100.00%> (-0.09%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files	Coverage Δ
dandi/tests/test_upload.py	`100.00% <100.00%> (ø)`
dandi/support/threaded_walk.py	`93.10% <0.00%> (-1.73%)`	⬇️
dandi/dandiapi.py	`87.11% <0.00%> (-1.17%)`	⬇️
dandi/files/zarr.py	`88.17% <0.00%> (-0.20%)`	⬇️
dandi/download.py	`88.02% <0.00%> (-0.03%)`	⬇️
dandi/organize.py	`83.54% <0.00%> (-0.01%)`	⬇️
dandi/consts.py	`100.00% <0.00%> (ø)`
dandi/cli/cmd_organize.py	`100.00% <0.00%> (ø)`
dandi/tests/test_files.py	`100.00% <0.00%> (ø)`
... and 2 more

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

jwodder · 2022-12-20T13:46:54Z

@TheChymera dandiset.get_asset_by_path() returns a RemoteAsset object that has the methods get_raw_metadata() (returning a dict directly unserialized from JSON) and get_metadata() (returning a dandischema BaseAsset, if you're sure the metadata is valid)

TheChymera · 2022-12-20T15:47:17Z

@jwodder thanks :)

So I can check for metadata and it's the correct metadata via get_raw_metadata(), but not with get_metadata().

for ddfe305:

PATH                                   SIZE    ERRORS UPLOAD STATUS MESSAGE
CHANGES
dataset_description.json
README
dandiset.yaml
sub-Sub1/perf/sub-Sub1_m0scan.nii.gz
sub-Sub1/perf/sub-Sub1_aslcontext.tsv
sub-Sub1/perf/sub-Sub1_m0scan.json
sub-Sub1/perf/sub-Sub1_asl.json
sub-Sub1/perf/sub-Sub1_asl.nii.gz
sub-Sub1/perf/sub-Sub1_asllabeling.jpg
sub-Sub1/anat/sub-Sub1_T1w.json
sub-Sub1/anat/sub-Sub1_T1w.nii.gz
Summary:                               0 Bytes
{'@context': 'https://raw.githubusercontent.com/dandi/schema/master/releases/0.6.3/context.json',
 'access': [{'schemaKey': 'AccessRequirements', 'status': 'dandi:OpenAccess'}],
 'blobDateModified': '2022-12-20T10:43:59.299141-05:00',
 'contentSize': 1,
 'contentUrl': ['http://localhost:8000/api/assets/5ae1d41e-ad10-4060-95dc-1cd727442f13/download/',
                'http://localhost:9000/dandi-dandisets/blobs/fa6/d5d/fa6d5da9-e65a-49f5-bcee-d80afab32294'],
 'dateModified': '2022-12-20T10:44:09.520165-05:00',
 'digest': {'dandi:dandi-etag': '741e0f4bad5b093044dc54a74d911094-1',
            'dandi:sha2-256': '01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b'},
 'encodingFormat': 'application/gzip',
 'id': 'dandiasset:5ae1d41e-ad10-4060-95dc-1cd727442f13',
 'identifier': '5ae1d41e-ad10-4060-95dc-1cd727442f13',
 'path': 'sub-Sub1/anat/sub-Sub1_T1w.nii.gz',
 'schemaKey': 'Asset',
 'schemaVersion': '0.6.3',
 'wasAttributedTo': [{'identifier': 'Sub1', 'schemaKey': 'Participant'}],
 'wasGeneratedBy': [{'description': 'Metadata generated by DANDI cli',
                     'id': 'urn:uuid:af9cf642-ec7d-44f5-96ca-661a93103669',
                     'name': 'Metadata generation',
                     'schemaKey': 'Activity',
                     'wasAssociatedWith': [{'identifier': 'RRID:SCR_019009',
                                            'name': 'DANDI Command Line '
                                                    'Interface',
                                            'schemaKey': 'Software',
                                            'url': 'https://github.com/dandi/dandi-cli',
                                            'version': '0.46.6+217.gddfe305'}]}]}
AAAAAAAAAAAAAAAAAAAAAA
FAILED

=================================== FAILURES ===================================
__________________________ test_upload_bids_metadata ___________________________
test_upload.py:231: in test_upload_bids_metadata
    ).get_metadata()
../dandiapi.py:1308: in get_metadata
    return models.Asset.parse_obj(self.get_raw_metadata())
pydantic/main.py:526: in pydantic.main.BaseModel.parse_obj
    ???
pydantic/main.py:342: in pydantic.main.BaseModel.__init__
    ???
E   pydantic.error_wrappers.ValidationError: 2 validation errors for Asset
E   contentUrl -> 0
E     URL host invalid, top level domain required (type=value_error.url.host)
E   contentUrl -> 1
E     URL host invalid, top level domain required (type=value_error.url.host)

Does that mean the schema is somehow malformed? The actual error look like it has something to do with an invalid URL, though 🤔

jwodder · 2022-12-20T15:49:29Z

@TheChymera Where were you running the tests to see that output? The environment variable DANDI_ALLOW_LOCALHOST_URLS needs to be set in order for dandischema to accept localhost URLs; it's automatically set when using tox or the GitHub Actions CI.

TheChymera · 2022-12-20T15:52:55Z

oh, yes, that was indeed it, I was running pytest directly to point it at the test I was working on:

chymera@decohost ~/src/dandi-cli/dandi/tests $ DANDI_TESTS_PERSIST_DOCKER_COMPOSE=1 DANDI_TESTS_PULL_DOCKER_COMPOSE=0 pytest -vvs test_upload.py::test_upload_bids_metadata | less

yarikoptic · 2022-12-21T14:40:08Z

dandi/tests/test_upload.py

+    # Automatically check all files, heuristic should remain very BIDS-stable
+    for asset in dandiset.get_assets(order="path"):
+        apath = asset.path
+        if "Sub1" in apath:


Just test on some known paths since if we later rename from Sib1, this test would pass without testing anything

How about 288715e ? I like that we're checking all of them, at trivial additional time cost.

it is not runtime which concerned me but possible absent checking if that dataset changes so that eg there is no subjects -- this test would still be valid. ok let's consider it unlikely and proceed

yarikoptic · 2022-12-21T21:56:40Z

dandi/tests/test_upload.py

+def test_upload_bids_metadata(
+    mocker: MockerFixture, bids_dandiset: SampleDandiset
+) -> None:
+    bids_dandiset.upload(existing="force")


why upload needs to be forced?

oh, you always do that... not sure if that is a good/clean practice, not liking it.. but not unique to this test, so let's proceed.

yarikoptic · 2022-12-21T22:00:06Z

dandi/tests/test_upload.py

+        if "sub-" in apath:
+            metadata = dandiset.get_asset_by_path(apath).get_metadata()
+            # Hard-coded check for the subject identifier set in the fixture:
+            assert metadata.wasAttributedTo[0].identifier == "Sub1"


this is a very rudimentary bids dataset, not even with sessions or other interesting assets,... oh well, ok, at least we check that subject is extracted. We would need to improve later

github-actions · 2023-01-17T16:16:16Z

🚀 PR was released in 0.48.1 🚀

Upload test to check for metadata

1ad8c23

Retrieving metadata

ddfe305

TheChymera marked this pull request as draft December 20, 2022 16:51

Getting all relevant assets for testing

cc01263

yarikoptic reviewed Dec 21, 2022

View reviewed changes

Improving check to not pass if the identifier changes

288715e

TheChymera marked this pull request as ready for review December 21, 2022 19:34

TheChymera requested a review from yarikoptic December 21, 2022 19:34

yarikoptic reviewed Dec 21, 2022

View reviewed changes

yarikoptic merged commit 9aae3bf into master Dec 21, 2022

yarikoptic deleted the metadata branch December 21, 2022 22:01

github-actions bot added the released label Jan 17, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Upload test to check for metadata#1180

Upload test to check for metadata#1180
yarikoptic merged 4 commits intomasterfrom
metadata

TheChymera commented Dec 20, 2022

Uh oh!

codecov bot commented Dec 20, 2022 •

edited

Loading

Uh oh!

jwodder commented Dec 20, 2022 •

edited

Loading

Uh oh!

TheChymera commented Dec 20, 2022

Uh oh!

jwodder commented Dec 20, 2022

Uh oh!

TheChymera commented Dec 20, 2022 •

edited

Loading

Uh oh!

yarikoptic Dec 21, 2022

Uh oh!

TheChymera Dec 21, 2022

Uh oh!

yarikoptic Dec 21, 2022

Uh oh!

yarikoptic Dec 21, 2022

Uh oh!

yarikoptic Dec 21, 2022

Uh oh!

yarikoptic Dec 21, 2022

Uh oh!

github-actions bot commented Jan 17, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

TheChymera commented Dec 20, 2022

Uh oh!

codecov bot commented Dec 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

jwodder commented Dec 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

TheChymera commented Dec 20, 2022

Uh oh!

jwodder commented Dec 20, 2022

Uh oh!

TheChymera commented Dec 20, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

yarikoptic Dec 21, 2022

Choose a reason for hiding this comment

Uh oh!

TheChymera Dec 21, 2022

Choose a reason for hiding this comment

Uh oh!

yarikoptic Dec 21, 2022

Choose a reason for hiding this comment

Uh oh!

yarikoptic Dec 21, 2022

Choose a reason for hiding this comment

Uh oh!

yarikoptic Dec 21, 2022

Choose a reason for hiding this comment

Uh oh!

yarikoptic Dec 21, 2022

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Jan 17, 2023

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

codecov bot commented Dec 20, 2022 •

edited

Loading

jwodder commented Dec 20, 2022 •

edited

Loading

TheChymera commented Dec 20, 2022 •

edited

Loading