Analysis schema#906
Conversation
…b.com/HEPData/hepdata-validator/blob/91b182772eac3a6d01451b98e4e24a9e7a865887/hepdata_validator/schemas/1.1.1/additional_resources_schema.json\#L12-L21): limit number of characters in license, add 'description' field
GraemeWatt
left a comment
There was a problem hiding this comment.
Thanks for opening the new PR. Tests are passing, but test coverage decreased by 0.6%, mainly because the new code added to the update_analyses function is not being tested. I see two options (see also my previous comment):
|
Thanks for the review, @GraemeWatt! I've addressed all issues. Regarding the decreased test coverage, I opted for using SModelS as test case as it is currently the most mature option. All tests should come back as green as soon as the issue you raised in #878 (comment) is addressed. |
* Simplify test_general_pages to avoid calling flask.url_for twice. * Access URLs for submission_schema, analyses_schema, formats, ping.
* Test case of updating SModelS analyses with no analyses to add. * Test case of updating analyses for an endpoint with no endpoint_url.
There was a problem hiding this comment.
Pull Request Overview
This PR introduces a JSON schema framework for defining how analysis implementation tools communicate with HEPData. It adds support for both a new structured schema (version 1.0.0) and backward compatibility with the previous format (version 0.1.0), using the presence and value of the schema_version field to determine which processing approach to use.
- Adds JSON schema definitions and documentation for both v1.0.0 and v0.1.0 analysis formats
- Updates analysis processing logic to handle both old and new schema formats based on
schema_versionfield - Adds comprehensive test coverage for the new schema validation functionality
Reviewed Changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_data/analyses_example.json | Example file demonstrating the new v1.0.0 schema format |
| tests/records_test.py | Updated test expectations to match new schema processing behavior |
| tests/e2e/test_general.py | Added end-to-end tests for new schema endpoints and refactored existing tests |
| tests/analyses_schema_test.py | New test file specifically for validating the analysis schema |
| hepdata/version.py | Minor version bump to reflect changes |
| hepdata/templates/analyses_schema/ | Added schema definitions and documentation for both v0.1.0 and v1.0.0 |
| hepdata/modules/theme/views.py | New endpoint for serving analysis schema files |
| hepdata/modules/records/utils/analyses.py | Core logic updates to support dual schema processing |
| hepdata/config.py | Updated SModelS endpoint configuration |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
|
@codecov-ai-reviewer review |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
|
@codecov-ai-reviewer review |
This comment has been minimized.
This comment has been minimized.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Clarify some code comments and use spaces around comparison operators. * Catch ValidationError exception and add a test for coverage. * Also replace "is not" by "!=" twice in records_test.py.
* submission_schema, analyses_schema and ping in test_general_pages return either JSON or "OK" where favicon.ico is not present.
GraemeWatt
left a comment
There was a problem hiding this comment.
@mhabedan : I think this is ready to be merged now. Please make a final check. Thanks for all your work on this PR.
My only concern is that the analyses_schema.json and readme.md files are buried inside the templates folder (which is meant for HTML templates), so they are not easily accessible, but I can link them from the "Analyses" section of the HEPData submission documentation.
|
Hi @GraemeWatt! Had another read-through and fixed a few minor typos in the readme/ JSON schema comments. It's good to go now in my opinion. Regarding the findability: Yes, it would certainly be good if it was easier to find. Linking in the "Analysis" section sounds good to me. We can also link it from the OpenMAPP webpage. Not sure if we can do anything apart from that though. |
Following up on a discussion of the OpenMAPP project and #878, this PR adds a JSON schema to HEPData that defines the format of the input "analyses" JSONs.
This new PR is not based on a fork so all test should run successfully.
Furthermore, this PR now adds the last two items discussed in #878:
schema_versionfield.