Description
As previously discussed, we want a hybrid scheme for annotations in NIF 3.0:
- a lean way to attach annotations directly to nif:Strings when there is only one annotation per aspect
- a scheme with an intermediate (blank) node for alternative annotations concerning the same aspect
Here is my idea for details, already as draft for documentation text:
==== QUOTE START====
NIF 2.1 annotation schemes
NIF 2.1 offers two schemes to attach annotations to a nif:String individual S:
direct attachment
assertions comprising information for a given annotation are attached
directly, S being the subject of the corresponding triples. Several
sub-properties of nif:annotation are provided for this approach,
e.g. itsrdf:taIdentRef, nif:oliaLink.
If one intends to specify confidence and provenance information, no more than
one annotation property assertion per aspect may be attached directly, to prevent ambiguity:
<http://example.org/document/1#offset_20_24>
a nif:String, nif:OffsetBasedString ;
nif:beginIndex "20"^^xsd:int ;
nif:endIndex "24"^^xsd:int ;
nif:anchorOf "Bill"^^xsd:string ;
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
itsrdf:taConfidence "0.9"^^xsd:decimal .
To allow use of the direct attachment scheme to annotate several aspects of S
simultaneously, corresponding specialisations of nif:confidence and nif:provenance
are provided for some of the specialisations of nif:annotation (e.g. nif:oliaConf
and nif:oliaProv for nif:oliaLink):
<http://example.org/document/1#offset_20_24>
a nif:String, nif:OffsetBasedString ;
nif:beginIndex "20"^^xsd:int ;
nif:endIndex "24"^^xsd:int ;
nif:anchorOf "Bill"^^xsd:string ;
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
itsrdf:taConfidence "0.9"^^xsd:decimal ;
nif:oliaLink <http://acoli.cs.uni-frankfurt.de/resources/olia/penn.owl#NNP> ;
nif:oliaConf "0.95"^^xsd:decimal .
related T-Box assertions / constraints
SubObjectPropertyOf(itsrdf:taIdentRef nif:annotation)
SubObjectPropertyOf(nif:oliaLink nif:annotation)
SubDataPropertyOf(itsrdf:taConfidence nif:confidence)
SubDataPropertyOf(nif:oliaConf nif:confidence)
FunctionalDataProperty(itsrdf:taConfidence) # this still has to be coordinated with the itsrdf maintainers
FunctionalDataProperty(nif:oliaConf)
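As an illustration of what the functional declarations are meant to rule out, the following query is a minimal sketch that lists strings carrying more than one directly attached itsrdf:taConfidence value. It is a closed-world check over the asserted triples, not a replacement for OWL reasoning, and the itsrdf prefix IRI is assumed to be the usual ITS RDF namespace:

```sparql
PREFIX nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>   # assumed namespace IRI

# Strings with more than one distinct, directly attached confidence value.
SELECT ?str (COUNT(DISTINCT ?conf) AS ?confCount)
WHERE {
  ?str a nif:String ;
       itsrdf:taConfidence ?conf .
}
GROUP BY ?str
HAVING (COUNT(DISTINCT ?conf) > 1)
```

An empty result suggests that the direct attachments in the queried data respect the constraint for itsrdf:taConfidence; the same pattern could be repeated for nif:oliaConf.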
individual per annotation
a new nif:AnnotationUnit individual is introduced for each annotation and
connected to S using nif:annotationUnit. Using this scheme, several
annotations for the same annotation aspect can be expressed for S
(for example, different links to Linked Data resources for the same
token obtained from several Named Entity Recognition and
Disambiguation systems):
<http://example.org/document/1#offset_20_24>
a nif:String, nif:OffsetBasedString ;
nif:beginIndex "20"^^xsd:int ;
nif:endIndex "24"^^xsd:int ;
nif:anchorOf "Bill"^^xsd:string ;
nif:annotationUnit [
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans> ;
itsrdf:taConfidence "0.8"^^xsd:decimal ;
] .
related T-Box assertions / constraints
ObjectPropertyDomain(nif:annotationUnit nif:String)
ObjectPropertyRange(nif:annotationUnit nif:AnnotationUnit)
Each nif:AnnotationUnit carries exactly one piece of annotation
information. This can be a property assertion using nif:annotation,
nif:classAnnotation or nif:literalAnnotation (or one of their
sub-properties), or the inherent annotation of a text span if the
annotation unit is also a nif:TextSpanAnnotation.
To prevent ambiguity, an annotation unit has no more than one
confidence and no more than one provenance assertion:
SubClassOf(nif:AnnotationUnit DataMaxCardinality( 1 nif:confidence))
SubClassOf(nif:AnnotationUnit ObjectMaxCardinality( 1 nif:provenance))
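The two cardinality restrictions can also be checked directly on the data. The query below is only a sketch: it assumes that nif:confidence and nif:provenance triples are present on the annotation units, either asserted or materialized from sub-properties such as itsrdf:taConfidence, and it reports units that exceed either limit:

```sparql
PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>

# Annotation units with more than one confidence or provenance assertion.
SELECT ?unit
       (COUNT(DISTINCT ?conf) AS ?confCount)
       (COUNT(DISTINCT ?prov) AS ?provCount)
WHERE {
  ?str nif:annotationUnit ?unit .
  OPTIONAL { ?unit nif:confidence ?conf }
  OPTIONAL { ?unit nif:provenance ?prov }
}
GROUP BY ?unit
HAVING (COUNT(DISTINCT ?conf) > 1 || COUNT(DISTINCT ?prov) > 1)
```

Without RDFS inference or materialization, a unit that only uses itsrdf:taConfidence would not be matched by the nif:confidence pattern, so the check would have to name the concrete sub-properties instead.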
unified use and querying of both schemes
Naturally, both annotation schemes can be combined for a nif:String individual:
<http://example.org/document/1#offset_20_24>
a nif:String, nif:OffsetBasedString ;
nif:beginIndex "20"^^xsd:int ;
nif:endIndex "24"^^xsd:int ;
nif:anchorOf "Bill"^^xsd:string ;
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
itsrdf:taConfidence "0.9"^^xsd:decimal ;
nif:annotationUnit [
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans_(saxophonist)> ;
itsrdf:taConfidence "0.4"^^xsd:decimal
] ;
nif:annotationUnit [
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans> ;
itsrdf:taConfidence "0.8"^^xsd:decimal ;
] ;
nif:annotationUnit [
itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Stewart_(musician)> ;
itsrdf:taConfidence "0.2"^^xsd:decimal
].
The property path feature in SPARQL offers a concise way to access
annotations expressed using both schemes simultaneously:
PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>
select ?str ?link ?conf {
?str nif:annotationUnit? [
itsrdf:taIdentRef ?link ;
itsrdf:taConfidence ?conf
] ;
a nif:String .
}
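Building on the zero-or-one path nif:annotationUnit?, a consumer might want the candidate links of each string ranked by confidence. The variant below is a sketch of that use, with the itsrdf prefix IRI again assumed to be the usual ITS RDF namespace:

```sparql
PREFIX nif:    <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>   # assumed namespace IRI

# All links for a string, whether attached directly or via an annotation
# unit, ordered so that the most confident candidate comes first.
SELECT ?str ?link ?conf
WHERE {
  ?str a nif:String ;
       nif:annotationUnit? ?holder .
  ?holder itsrdf:taIdentRef ?link ;
          itsrdf:taConfidence ?conf .
}
ORDER BY ?str DESC(?conf)
```

The variable ?holder is bound either to the string itself (zero-length path) or to one of its annotation units (one step), which is why both schemes are covered by a single pattern. For the combined example above, the directly attached Bill_Clinton link with confidence 0.9 would be listed first.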
If the SPARQL processor in use supports RDFS inference (or if the relevant inferences are
materialized), it is possible to query more generally for all (object) annotations in a
NIF document in a similar fashion:
PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
select ?str ?anno ?conf {
?str nif:annotationUnit? [
nif:annotation ?anno ;
nif:confidence ?conf
] ;
a nif:String .
}
==== QUOTE END====
Do you also deem a specific nif:annotationUnit property justified and useful?