Description
ROBOKOP needs to store properties alongside identifiers, and other users might find this useful as well. Since Babel already has to read through the input files, it might make sense to extract those properties in Babel and then expose them through NodeNorm.
Plan
- (Babel) Extract properties from resources (e.g. CHEBI) and store them alongside the IDs in the Babel outputs
- We will need to figure out how to handle provenance and collisions: what if two data sources disagree on the molecular weight of a molecule?
- Given how Babel is structured, this might be a separate Snakemake file that is run separately from the main Babel run
- Figure out units, checking property ranges, etc.
- We could focus on one dataset, work it through all the way to NodeNorm with an interface that works for everybody, and then
- (NodeNorm) Load them into the NodeNorm redis
- ROBOKOP team has a KGX file for this data on CHEBI; we could test the NodeNorm part of this by loading this into Redis
- (NodeNorm) Allow them to be queried via NodeNorm
- Might be a separate endpoint (ID -> properties) or a flag on the /get_normalized_nodes endpoint (?include_properties=true)
- Will need to figure out how this interacts with conflation, especially with the chemical-chemical conflation that will eventually be implemented
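To make the provenance and collision question concrete, here is a minimal Python sketch of one possible approach: keep every value together with the source that asserted it, then flag identifiers where sources disagree on units or differ by more than a tolerance. The `PropertyValue` record, the `find_collisions` helper, the 1% tolerance, and the `EX:` identifiers are all hypothetical illustrations, not existing Babel or NodeNorm code.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyValue:
    """One property assertion, tagged with the source it came from (hypothetical schema)."""
    curie: str    # identifier the property is attached to
    name: str     # property name, e.g. "molecular_weight"
    value: float
    unit: str
    source: str   # provenance: which resource asserted this value


def group_by_property(values):
    """Group assertions by (curie, property name), keeping every source."""
    grouped = defaultdict(list)
    for v in values:
        grouped[(v.curie, v.name)].append(v)
    return dict(grouped)


def find_collisions(grouped, rel_tolerance=0.01):
    """Return keys whose sources disagree on units or differ by more than rel_tolerance."""
    collisions = []
    for key, vals in grouped.items():
        units = {v.unit for v in vals}
        numbers = [v.value for v in vals]
        lo, hi = min(numbers), max(numbers)
        scale = max(abs(lo), abs(hi))
        spread_too_large = scale > 0 and (hi - lo) / scale > rel_tolerance
        if len(units) > 1 or spread_too_large:
            collisions.append(key)
    return collisions


# Made-up identifiers and values, purely for illustration.
values = [
    PropertyValue("EX:1", "molecular_weight", 18.015, "g/mol", "source_a"),
    PropertyValue("EX:1", "molecular_weight", 18.02, "g/mol", "source_b"),
    PropertyValue("EX:2", "molecular_weight", 180.16, "g/mol", "source_a"),
    PropertyValue("EX:2", "molecular_weight", 342.30, "g/mol", "source_b"),
]
grouped = group_by_property(values)
collisions = find_collisions(grouped)
```

Keeping all values rather than picking a winner at extraction time would leave the resolution policy (prefer one source, report all, report a range) open until we have looked at real disagreements.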
Next steps
- @cbizon and @EvanDietzMorris provide feedback on this proposal.
- We decide whether to implement it in NodeNorm first (Gaurav's preference -- it'll be simpler to test, could be based on a KGX input for now) or in Babel first.
- We implement it on one dataset, make sure we're happy with it, and then add additional datasets over time.
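If we start on the NodeNorm side with a KGX input, the first step could look roughly like the sketch below: read KGX-style JSON Lines node records and build an ID -> properties mapping. A plain dict stands in for Redis here, and the `CORE_FIELDS` split, the function names, and the sample records are assumptions for illustration only, not the actual ROBOKOP file layout or NodeNorm loader.

```python
import io
import json

# Fields treated as core node data rather than extra properties (assumed split).
CORE_FIELDS = {"id", "name", "category"}


def load_properties(lines):
    """Build an ID -> properties mapping from KGX-style JSON Lines node records."""
    store = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        node = json.loads(line)
        store[node["id"]] = {k: v for k, v in node.items() if k not in CORE_FIELDS}
    return store


def get_properties(store, curies):
    """Look up properties for each CURIE; unknown IDs map to None."""
    return {curie: store.get(curie) for curie in curies}


# Made-up records, purely for illustration.
sample = io.StringIO(
    '{"id": "EX:1", "name": "example one", "category": ["biolink:ChemicalEntity"], "example_property": 1.5}\n'
    '{"id": "EX:2", "name": "example two", "category": ["biolink:ChemicalEntity"]}\n'
)
store = load_properties(sample)
result = get_properties(store, ["EX:1", "EX:2", "EX:3"])
```

Returning `None` for unknown IDs and an empty dict for known IDs without properties keeps the two cases distinguishable, whichever endpoint shape we end up choosing.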