Add URIs for some dataset targets #72
Closed
This PR quickly mocks up the addition of URIs to dataset target metadata. It partially addresses #8, but we need to decide how useful this is first.
The idea is that these URIs can at least i) resolve whether two definitions are the same across datasets and could potentially ii) be used to augment the dataset with canonical descriptions and semantic links, either during prep or by the model on-the-fly.
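As a rough illustration of point i), here is a minimal sketch of how shared URIs could be used to spot targets with the same definition across datasets. The metadata layout, dataset names, target names, and `example.org` URIs are all placeholders assumed for this example, not the structure or values used in this PR.

```python
from collections import defaultdict

# Hypothetical metadata: dataset name -> {target name -> URI or None}.
# Every name and URI here is a placeholder, not a value from this PR.
TARGET_URIS = {
    "dataset_a": {"logP": "http://example.org/property/logP", "tox": None},
    "dataset_b": {"log_p": "http://example.org/property/logP"},
}


def shared_definitions(target_uris):
    """Group (dataset, target) pairs that point at the same URI."""
    by_uri = defaultdict(list)
    for dataset, targets in target_uris.items():
        for target, uri in targets.items():
            if uri is not None:  # targets without a URI cannot be compared
                by_uri[uri].append((dataset, target))
    # Keep only URIs referenced by more than one dataset target.
    return {uri: refs for uri, refs in by_uri.items() if len(refs) > 1}
```

Calling `shared_definitions(TARGET_URIS)` would pair `logP` in `dataset_a` with `log_p` in `dataset_b`, even though the target names differ.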
This currently assumes an "exact match" style mapping between target and property -- we could build in additional semantic context in the schema here to enable things like related identities/subclasses/parthood and all that jazz. I struggled with the Butkiewicz sets as it is really outside my field and definitions are available for e.g., `cav3`, `t-type`, `calcium channel` and `activity` but not `activity_cav3_t_type_calcium_channels`.

As discussed, this is quite a niche task that may not be suitable to ask others to perform. Even in my own case, it is not clear exactly how good these particular definitions are -- I just went via BioPortal for fields that have good matches: https://bioportal.bioontology.org/
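To make the "additional semantic context" idea concrete, here is one possible sketch in which each URI carries a mapping type instead of being an implicit exact match. Borrowing the SKOS mapping property names is an assumption for illustration only; the PR does not commit to SKOS, and the `UriMapping` class, the `uris` field, and the `example.org` URIs are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class MappingType(Enum):
    # Names borrowed from the SKOS mapping properties (an assumption,
    # not something this PR's schema defines).
    EXACT = "skos:exactMatch"
    CLOSE = "skos:closeMatch"
    BROAD = "skos:broadMatch"
    NARROW = "skos:narrowMatch"
    RELATED = "skos:relatedMatch"


@dataclass(frozen=True)
class UriMapping:
    uri: str
    # Defaults to exact match, mirroring the current behaviour.
    mapping_type: MappingType = MappingType.EXACT


# A composite target with no single definition could link to its parts.
# The URIs are placeholders, not real ontology terms.
composite_target = {
    "id": "activity_cav3_t_type_calcium_channels",
    "uris": [
        UriMapping("http://example.org/term/activity", MappingType.RELATED),
        UriMapping("http://example.org/term/cav3", MappingType.RELATED),
    ],
}


def exact_uris(target):
    """Return only the URIs that claim an exact match for the target."""
    return [m.uri for m in target["uris"] if m.mapping_type is MappingType.EXACT]
```

With this shape, cross-dataset identity checks could rely on exact matches only, while looser links stay available for augmenting descriptions.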