Updating BIDS validator and schema to contemporary upstream equivalent#1050
Updating BIDS validator and schema to contemporary upstream equivalent#1050yarikoptic merged 30 commits intomasterfrom
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #1050 +/- ##
==========================================
+ Coverage 88.41% 88.48% +0.07%
==========================================
Files 72 73 +1
Lines 9251 9291 +40
==========================================
+ Hits 8179 8221 +42
+ Misses 1072 1070 -2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
|
Additionally, we might want to start using the reference bids-examples repository for validation (rather than our own fork). In any case, using vanilla upstream would add another ~10s to the BIDS validator tests; as our fork had stripped a lot of the datasets. While we will prospectively no longer run the validator tests themselves (these would be run by the bids-schemacode package), we still need BIDS example data to construct our derivative BIDS datasets for higher-level function testing... Any ideas if we can somehow check out only some of the files to reduce download time? @yarikoptic |
45b386f to
036e41d
Compare
|
This pull request introduces 1 alert when merging 4257ef8 into 50ca7b0 - view on LGTM.com new alerts:
|
yarikoptic
left a comment
There was a problem hiding this comment.
only started, need to switch, please attend to --report option meanwhile
| help="Whether to write a report under a unique path in the current directory. " | ||
| "Only usable if `--report` is not already used.", | ||
| help="Whether to write a report under a unique path in the current directory. ", | ||
| ) |
There was a problem hiding this comment.
so we are breaking CLI ... why not just to keep --report to be an option to provide the path and if not specified, assume that no report writing was requested - print to the screen?
Note: I might not be even able to write to current directory (dataset might be owned by someone else).
There was a problem hiding this comment.
functionality wasn't broken, help text was just incorrect.
There was a problem hiding this comment.
You removed --report-flag and made --report from taking a string (path) into being is_flag -- so, it got broken that user no longer would be able to say --report myvalidation.log. And I think it is worth removing somewhat odd --report-flag so ok to break but we better "make it right" this time and avoid future breakage.
If you really want to be fancy and support also bool like behavior, I guess you would need to make it nargs='?' (if click supports it, @jwodder could help) and then if not report - assign that default path to the log, so user could do both --report and --report mylog.log.
Alternative -- not bother with "bool" like behavior in CLI, and just make it demand path string.
NB: please do not mark such comments Resolved - let original Author decide if they were resolved or not since it makes it harder for a reviewer to locate prior comments while re-reviewing and see if prior concerns were addressed.
yarikoptic
left a comment
There was a problem hiding this comment.
--report handling still needs to be fixed and tested better
| help="Whether to write a report under a unique path in the current directory. " | ||
| "Only usable if `--report` is not already used.", | ||
| help="Whether to write a report under a unique path in the current directory. ", | ||
| ) |
There was a problem hiding this comment.
You removed --report-flag and made --report from taking a string (path) into being is_flag -- so, it got broken that user no longer would be able to say --report myvalidation.log. And I think it is worth removing somewhat odd --report-flag so ok to break but we better "make it right" this time and avoid future breakage.
If you really want to be fancy and support also bool like behavior, I guess you would need to make it nargs='?' (if click supports it, @jwodder could help) and then if not report - assign that default path to the log, so user could do both --report and --report mylog.log.
Alternative -- not bother with "bool" like behavior in CLI, and just make it demand path string.
NB: please do not mark such comments Resolved - let original Author decide if they were resolved or not since it makes it harder for a reviewer to locate prior comments while re-reviewing and see if prior concerns were addressed.
It didn't get broken, the feature was simply removed. I added it initially because I thought it would be helpful, and I decided that in fact it is not. The |
| def write_report( | ||
| validation_result, | ||
| report_path="{logdir}/bids-validator-report_{datetime}-{pid}.log", | ||
| report_path="/var/tmp/bids-validator/report_{datetime}-{pid}.log", |
There was a problem hiding this comment.
I suggest still using a variable for the log base directory, but set it to $TMPDIR (falling back to /tmp) by default.
There was a problem hiding this comment.
This is for upstream usage, we wrap this command and pass the DANDI log directory via appdirs. /tmp/ is tricky as it's ephemeral on some systems, and users might not know how their system handles it. /var/tmp is the safer bet.
Compare:
https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch05s15.html
to
https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s18.html
Particularly:
Programs must not assume that any files or directories in /tmp are preserved between invocations of the program.
is pretty much a non-starter for logging which might need to be consulted over a longer period.
| def validate_bids( | ||
| bids_paths, | ||
| schema_reference_root="{module_path}/support/bids/schemadata/", | ||
| schema_reference_root="/usr/share/bids-schema/", |
There was a problem hiding this comment.
Under what conditions is this default used, and why would we expect anything to be in this directory?
There was a problem hiding this comment.
In the DANDI use case this is never used, but this is bundled code from upstream, which we will soon no longer track but import from a python dependency. Prospectively, the dependency would find the BIDS schema index in that directory, as that is where architecture-independent read-only user data belongs as per the FHS.
Co-authored-by: John T. Wodder II <jwodder@users.noreply.github.com>
Co-authored-by: John T. Wodder II <jwodder@users.noreply.github.com>
|
@jwodder thanks for your input :3 good to go now? |
|
ok, I think that main discussions/questions were addressed/resolved. Let's proceed and see where it takes us. Thank you @TheChymera ! |
Brought our code back in sync with upstream BIDS, as we finally have upstream
ome.zarrsupport and there is no longer any reason to maintain a fork of the specification.This also fixes: #1037
Ideally this will be the last update of bundled code as we are working on importing the built-in Python validator as a library.
Should the need for specification forking arise in the future, this should best be handled via the parameterized schema directory.
Draft for now as there might be inconsistencies I haven't spotted.