Wrong entities are removed when iterating through CAS #259

@giuliabaldini

Describe the bug
Hey there, thank you for the great package! I am currently using it for my annotations, and I have noticed that when I remove multiple entities while iterating over cas.select(...), the wrong entities are removed.

To Reproduce
Steps to reproduce the behavior:

import cassis

typesystem = cassis.TypeSystem()
ner_type = typesystem.create_type(
    name="NamedEntity", supertypeName="uima.tcas.Annotation"
)
typesystem.create_feature(
    domainType=ner_type, name="source", rangeType=cassis.typesystem.TYPE_NAME_STRING
)

cas = cassis.Cas(typesystem)
for i in range(100):
    if i % 2:
        cas.add(ner_type(source="spacy"))
    else:
        cas.add(ner_type(source="user"))

print("Possible values annotation.source", set(entity.source for entity in cas.select("NamedEntity")))
print(
    "Number of annotations where source is not user",
    sum(1 for entity in cas.select("NamedEntity") if entity.source != "user"),
)
found_entities = 0
for entity in cas.select("NamedEntity"):
    if entity.source != "user":
        found_entities += 1
        cas.remove(entity)
print("Found and removed", found_entities, "entities")
print("Possible values annotation.source", set(entity.source for entity in cas.select("NamedEntity")))
print(
    "Number of annotations where source is not user after removal",
    sum(1 for entity in cas.select("NamedEntity") if entity.source != "user"),
)

Output:

Possible values annotation.source {'spacy', 'user'}
Number of annotations where source is not user 50
Found and removed 50 entities
Possible values annotation.source {'spacy', 'user'}
Number of annotations where source is not user after removal 25

Expected behavior
I would expect the loop to remove every entity whose source is not user. Instead, 25 such entities remain. Since remove was called 50 times, it seems that 25 annotations that do have user as source were removed as well.
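
The reproduction above calls cas.remove inside the loop that iterates over cas.select. In plain Python, mutating a list while iterating over it makes the iterator skip elements. Whether cassis 0.7.2 behaves the same way internally is an assumption and has not been verified here. The following is only a minimal sketch of that general pitfall: it uses a hypothetical `Entity` stand-in and a plain list instead of a CAS, and contrasts removing inside the loop with removing from a snapshot taken beforehand.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, so list.remove() targets this exact object
class Entity:
    """Hypothetical stand-in for a NamedEntity annotation (not a cassis class)."""
    source: str


def remove_in_loop(entities):
    """Remove non-user entities while iterating over the same list."""
    for entity in entities:
        if entity.source != "user":
            entities.remove(entity)
    return entities


def remove_from_snapshot(entities):
    """Decide what to remove first, then mutate the list afterwards."""
    to_remove = [entity for entity in entities if entity.source != "user"]
    for entity in to_remove:
        entities.remove(entity)
    return entities


def make_entities():
    # Consecutive "spacy" entries make the skipped elements visible.
    sources = ["spacy", "spacy", "user", "spacy", "spacy", "user"]
    return [Entity(source) for source in sources]


in_loop = remove_in_loop(make_entities())
snapshot = remove_from_snapshot(make_entities())

print([entity.source for entity in in_loop])   # some "spacy" entries survive
print([entity.source for entity in snapshot])  # only "user" entries remain
```

With cassis, the analogous approach would be to materialize the selection first, for example `list(cas.select("NamedEntity"))`, and call cas.remove on the collected entities afterwards. Whether that avoids the behavior reported here in 0.7.2 is untested.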

Environment:

  • Version: 0.7.2
  • OS: OS X

Thank you very much in advance!
Best,
Giulia
