An additional property for annotation assertions grouped by distinct individual? #11

@neradis


As previously discussed, we want a hybrid scheme for annotations in NIF 3.0:

  • a lean way to attach annotations directly to nif:Strings when there is only one annotation
    per aspect
  • a scheme with an intermediate (blank) node for alternative annotations concerning the
    same aspect

Here is my idea for details, already as draft for documentation text:

==== QUOTE START====

NIF 2.1 annotation schemes

NIF 2.1 offers two schemes to attach annotations to a nif:String individual S:

direct attachment

Assertions comprising the information for a given annotation are attached
directly, S being the subject of the corresponding triples. Several
sub-properties of nif:annotation are provided for this approach,
e.g. itsrdf:taIdentRef, nif:oliaLink.

If one intends to specify confidence and provenance information, no more than
one annotation property assertion per aspect may be attached directly, to prevent ambiguity:

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
    itsrdf:taConfidence "0.9"^^xsd:decimal .

To allow use of the direct attachment scheme to annotate several aspects of S
simultaneously, corresponding specialisations of nif:confidence and nif:provenance
are provided for some of the specialisations of nif:annotation (e.g. nif:oliaConf
and nif:oliaProv for nif:oliaLink):

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
    itsrdf:taConfidence "0.9"^^xsd:decimal ;
    nif:oliaLink <http://acoli.cs.uni-frankfurt.de/resources/olia/penn.owl#NNP> ;
    nif:oliaConf "0.95"^^xsd:decimal .

related T-Box assertions / constraints

SubObjectPropertyOf(itsrdf:taIdentRef nif:annotation)
SubObjectPropertyOf(nif:oliaLink nif:annotation)

SubDataPropertyOf(itsrdf:taConfidence nif:confidence)
SubDataPropertyOf(nif:oliaConf nif:confidence)
FunctionalDataProperty(itsrdf:taConfidence) # this still has to be coordinated with the itsrdf maintainers
FunctionalDataProperty(nif:oliaConf)

individual per annotation

A new nif:AnnotationUnit individual is introduced for each annotation and
connected to S using nif:annotationUnit. Using this scheme, several
annotations for the same annotation aspect can be expressed for S
(for example, different links to Linked Data resources for the same
token, obtained from several Named Entity Recognition and
Disambiguation systems):

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans> ;
        itsrdf:taConfidence "0.8"^^xsd:decimal ;
    ] .

related T-Box assertions / constraints

ObjectPropertyDomain(nif:annotationUnit nif:String)
ObjectPropertyRange(nif:annotationUnit nif:AnnotationUnit)

Each nif:AnnotationUnit carries exactly one piece of annotation
information. This can be a property assertion with nif:annotation,
nif:classAnnotation or nif:literalAnnotation, or the inherent
annotation of a text span if the annotation unit is also a
nif:TextSpanAnnotation.

To prevent ambiguity, each annotation unit carries at most one confidence
assertion and at most one provenance assertion:

SubClassOf(nif:AnnotationUnit DataMaxCardinality( 1 nif:confidence))
SubClassOf(nif:AnnotationUnit ObjectMaxCardinality( 1 nif:provenance))
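
The two cardinality axioms above can be approximated by a simple closed-world check over exported triples. The following is only a sketch: it is not OWL reasoning, the function name `cardinality_violations` and the string-based triple encoding are assumptions made for illustration, and it presumes that sub-properties such as itsrdf:taConfidence have already been mapped to nif:confidence.

```python
from collections import defaultdict

# Properties limited to one value per annotation unit (taken from the axioms above).
MAX_ONE = ("nif:confidence", "nif:provenance")

def cardinality_violations(triples):
    """Return (unit, property) pairs that carry more than one distinct value."""
    # Every object of nif:annotationUnit is treated as an annotation unit.
    units = {o for s, p, o in triples if p == "nif:annotationUnit"}
    values = defaultdict(set)
    for s, p, o in triples:
        if s in units and p in MAX_ONE:
            values[(s, p)].add(o)
    return sorted(key for key, vals in values.items() if len(vals) > 1)

# Hypothetical data: unit _:u1 is ambiguous, unit _:u2 is fine.
example = {
    ("ex:s", "nif:annotationUnit", "_:u1"),
    ("ex:s", "nif:annotationUnit", "_:u2"),
    ("_:u1", "nif:confidence", 0.8),
    ("_:u1", "nif:confidence", 0.4),
    ("_:u2", "nif:confidence", 0.2),
}
print(cardinality_violations(example))
```

A check like this could run in a validator before publishing a NIF document; whether such a validator belongs in the NIF tooling is left open here.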

unified use and querying of both schemes

Naturally, both annotation schemes can be combined for a nif:String individual:

<http://example.org/document/1#offset_20_24>
    a nif:String, nif:OffsetBasedString ;
    nif:beginIndex "20"^^xsd:int ;
    nif:endIndex "24"^^xsd:int ;
    nif:anchorOf "Bill"^^xsd:string ;
    itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Clinton> ;
    itsrdf:taConfidence "0.9"^^xsd:decimal ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans_(saxophonist)> ;
        itsrdf:taConfidence "0.4"^^xsd:decimal 
    ] ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Evans> ;
        itsrdf:taConfidence "0.8"^^xsd:decimal ;
    ] ;
    nif:annotationUnit [
        itsrdf:taIdentRef <http://dbpedia.org/resource/Bill_Stewart_(musician)> ;
        itsrdf:taConfidence "0.2"^^xsd:decimal
    ].

The property path feature in SPARQL offers a concise way to access
annotations expressed using either scheme in a single query:

PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>
PREFIX itsrdf: <http://www.w3.org/2005/11/its/rdf#>

select ?str ?link ?conf {
 ?str nif:annotationUnit? [
      itsrdf:taIdentRef ?link ;
      itsrdf:taConfidence ?conf
  ] ;
  a nif:String .
}
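
Why the zero-or-one path `nif:annotationUnit?` covers both schemes may be easier to see in a toy emulation. The sketch below is not a SPARQL engine; the tuple-based triple store, the abbreviated identifiers (`ex:`, `dbr:`) and the helper names are assumptions made for illustration. The path matches the string itself (zero hops, direct attachment) as well as every annotation unit one hop away.

```python
# Toy triple store mirroring part of the combined example above.
S = "ex:offset_20_24"
triples = {
    (S, "rdf:type", "nif:String"),
    (S, "itsrdf:taIdentRef", "dbr:Bill_Clinton"),
    (S, "itsrdf:taConfidence", 0.9),
    (S, "nif:annotationUnit", "_:u1"),
    ("_:u1", "itsrdf:taIdentRef", "dbr:Bill_Evans"),
    ("_:u1", "itsrdf:taConfidence", 0.8),
}

def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in the store."""
    return {o for s, p, o in triples if s == subject and p == predicate}

def zero_or_one(triples, subject, predicate):
    """Nodes reachable via `predicate?`: the subject itself plus one hop."""
    return {subject} | objects(triples, subject, predicate)

def annotations(triples):
    """Emulate the query: (string, link, confidence) rows from both schemes."""
    rows = set()
    strings = {s for s, p, o in triples if p == "rdf:type" and o == "nif:String"}
    for s in strings:
        for node in zero_or_one(triples, s, "nif:annotationUnit"):
            for link in objects(triples, node, "itsrdf:taIdentRef"):
                for conf in objects(triples, node, "itsrdf:taConfidence"):
                    rows.add((s, link, conf))
    return rows

rows = annotations(triples)
```

Here `rows` contains one row for the directly attached Bill_Clinton link and one for the Bill_Evans annotation unit, which is the behaviour the query relies on.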

If the SPARQL processor in use supports RDFS inference (or if the relevant inferences are
materialized), all (object) annotations in a NIF document can be queried more generally
in a similar fashion:

PREFIX nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#>

select ?str ?anno ?conf {
 ?str nif:annotationUnit? [
      nif:annotation ?anno ;
      nif:confidence ?conf
  ] ;
  a nif:String .
}
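
Where no RDFS-capable processor is available, the sub-property inferences this generic query depends on could be materialized beforehand. The following is a minimal sketch under stated assumptions: the mapping only covers the sub-property axioms listed in the draft above, the hierarchy is taken to be a single level deep, and `materialize` is a hypothetical helper, not part of any NIF tooling.

```python
# Sub-property axioms from the draft, as property -> super-property.
SUB_PROPERTY_OF = {
    "itsrdf:taIdentRef": "nif:annotation",
    "nif:oliaLink": "nif:annotation",
    "itsrdf:taConfidence": "nif:confidence",
    "nif:oliaConf": "nif:confidence",
}

def materialize(triples):
    """Add (s, super, o) for every (s, sub, o); single-level hierarchy only."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUB_PROPERTY_OF:
            inferred.add((s, SUB_PROPERTY_OF[p], o))
    return inferred

# Hypothetical annotation unit with a specific link and confidence.
unit = {
    ("_:u1", "itsrdf:taIdentRef", "dbr:Bill_Evans"),
    ("_:u1", "itsrdf:taConfidence", 0.8),
}
expanded = materialize(unit)
```

After materialization the unit also carries nif:annotation and nif:confidence assertions, so the generic query pattern matches it without inference support.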

==== QUOTE END====

Do you also deem a specific nif:annotationUnit property justified and useful?
