Background
As discussed previously (PathwayCommons/grounding-search#159), there is a desire to reduce grounding errors that arise from poor matches.
This can happen because the author's node label is:
1. an out-of-scope (aka out-of-dictionary or 'OOD') concept
2. a synonym that isn't listed in an entity database record
3. gibberish
An underlying issue is that the grounding search is configured to always return a list of search hits, from which a "match" (i.e. the top-ranked item) is assigned. Based on previous work (see PathwayCommons/grounding-search#162), we can configure Elasticsearch to reduce these errors by returning empty results for many of these cases.
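As a rough illustration of the idea, and not the actual change made in grounding-search#162, the sketch below shows one way a search can come back empty. Elasticsearch's search API accepts a `min_score` value in the request body, which excludes hits scoring below it. The field names, the threshold value and the helper names here are assumptions for illustration only.

```python
from typing import Dict, List, Optional

# Hypothetical threshold; a real value would need tuning against known-good groundings.
MIN_SCORE = 5.0


def build_query(label: str, min_score: float = MIN_SCORE) -> Dict:
    """Build a search body that drops hits scoring below min_score."""
    return {
        "min_score": min_score,  # hits below this score are not returned
        "query": {
            "multi_match": {
                "query": label,
                "fields": ["name", "synonyms"],  # assumed index fields
            }
        },
    }


def top_match(hits: List[Dict]) -> Optional[Dict]:
    """Return the top-ranked hit, or None when the search returned nothing."""
    if not hits:
        return None  # the node stays ungrounded instead of getting a poor match
    return max(hits, key=lambda hit: hit["_score"])
```

With a threshold in place, an out-of-scope or gibberish label would ideally produce an empty hit list, so `top_match` returns `None` rather than assigning a weak match.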
Problem
A (prior) assumption in Biofactoid is that users understand the scope of curation and would therefore only enter a name for an entity that is supported and for which there is a corresponding match.
a. Node grounding popup message: The existing warning message for no results (which really only occurred in the case of a network error) implies that users should "try again" (i.e. case 2 above: just find the right synonym). This is a problem if the entity doesn't exist (cases 1 and 3). This should be handled more generically.
b. Ungrounded nodes (that aren't complexes): When a node is never labelled or is not grounded (e.g. if no grounding search results are returned), the input label is not shown. The problem with the latter is that it could confuse the user: "what happened to the name I just typed for this node?" Also, at submit time, blank nodes trigger warnings but do not block submission; rather, they block the document from becoming "public". This case should be handled more generically.
c. Admin: The admin doesn't flag the existence of ungrounded (non-complex) nodes in Documents.
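Points (b) and (c) can be sketched as follows. This is a minimal illustration, not the Biofactoid data model: the node fields `type`, `name` and `association` and all function names are hypothetical.

```python
from typing import Dict, List


def is_ungrounded(node: Dict) -> bool:
    """A non-complex node with no associated database record is ungrounded."""
    return node.get("type") != "complex" and node.get("association") is None


def display_label(node: Dict) -> str:
    """Show whatever the author typed, whether or not the node was grounded."""
    return node.get("name") or ""  # blank only when the node was never labelled


def ungrounded_nodes(document: Dict) -> List[Dict]:
    """List the nodes an admin view could flag for follow-up."""
    return [node for node in document.get("nodes", []) if is_ungrounded(node)]
```

Keeping the label separate from the grounding means the typed name never disappears from the node, and the same `is_ungrounded` check can drive both submit-time warnings and admin flags.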
Implementation
Options for curation tool:
(a) (Blocking) Most users understand grounding: Remind them about scope and suggest they may want to use another synonym. Perhaps they will correct it on their own (best). The downside is that users may ignore this, or become confused or frustrated ("this tool doesn't work").
(b) ("Non-blocking") Most users have no knowledge of grounding and do not care: Always show the label on the node. Corrections are made manually post hoc by asking the author, by us without consultation, or by an expert curator in review. The downside is that it's a lot of work, and users who could help with this wouldn't realize there is an issue.