Remote model service option, and frontend fixes for overlapping, multiple annos per spans by tomolopolis · Pull Request #264 · CogStack/cogstack-nlp

tomolopolis · 2025-12-16T17:01:59Z

Note

Adds remote MedCAT service processing (with URL/config validation) and updates the UI to handle overlapping annotations via badges/popovers, alongside backend support and minor API adjustments.

Backend/API:
- Add use_model_service and model_service_url to ProjectAnnotateEntities and ProjectGroup with validation; bypass local model requirements when enabled.
- Implement remote processing via call_remote_model_service (requests) and lightweight RemoteSpacyDoc/RemoteEntity wrappers.
- Update annotation pipeline (prep_docs, prepare_documents, add_annotations) to support remote/local paths and CUI filters; allow overlapping manual annotations; remove CAT dependency from create_annotation.
- Disable interim training and concept addition when using remote service; adjust _submit_document, add_concept accordingly.
Admin/Migrations:
- Expose new fields in admin (_PROJECT_ANNO_ENTS_SETTINGS_FIELD_ORDER).
- Migration 0093_add_remote_model_service_fields adds the new fields.
Frontend (ClinicalText.vue):
- Improve rendering/sorting of overlapping annotations; add overlap badge with popover to select among stacked annos; manage popover open/close and outside clicks; exclude badge text from selection.
- New styles for badges/popovers; simplify underline/nesting CSS; remove deep nesting underline rules from _common.scss.
Dependencies:
- Add requests.

Written by Cursor Bugbot for commit 378634b.

cursor · 2025-12-16T17:08:42Z

medcat-trainer/webapp/api/api/utils.py

+        self.start_char_index = entity_data.get('start', 0)
+        self.end_char_index = entity_data.get('end', 0)
+        self.text = entity_data.get('detected_name') or entity_data.get('source_value', '')
+        self.context_similarity = entity_data.get('context_similarity', entity_data.get('acc', 0.0))


Bug: Missing similarity score marks all annotations as deleted

When using the remote model service, RemoteEntity.context_similarity defaults to 0.0 if the API response doesn't include context_similarity or acc fields. Since the hardcoded threshold is 0.3, any entity without these fields will have ent.context_similarity < MIN_ACC evaluate to true, causing the annotation to be marked as deleted=True and validated=True. This silently marks all such annotations as deleted and validated (appearing as human-reviewed) when they should be presented for manual review.
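The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the PR's code: `RemoteEntitySketch` copies only the four assignments quoted in the diff, `initial_flags` is a hypothetical stand-in for the threshold check the comment describes, and `MIN_ACC = 0.3` is the threshold value stated in the comment.

```python
MIN_ACC = 0.3  # threshold value quoted in the review comment


class RemoteEntitySketch:
    """Hypothetical stand-in mirroring the quoted RemoteEntity assignments."""

    def __init__(self, entity_data: dict):
        self.start_char_index = entity_data.get('start', 0)
        self.end_char_index = entity_data.get('end', 0)
        self.text = entity_data.get('detected_name') or entity_data.get('source_value', '')
        # Falls back to 'acc', then to 0.0 when neither key is present.
        self.context_similarity = entity_data.get(
            'context_similarity', entity_data.get('acc', 0.0))


def initial_flags(ent: RemoteEntitySketch) -> dict:
    """Assumed threshold logic: low-scoring entities start deleted and validated."""
    below = ent.context_similarity < MIN_ACC
    return {'deleted': below, 'validated': below}


# A response without any score field, and one that supplies 'acc'.
no_score = RemoteEntitySketch({'start': 0, 'end': 5, 'detected_name': 'fever'})
with_score = RemoteEntitySketch({'start': 0, 'end': 5, 'detected_name': 'fever', 'acc': 0.9})
```

Under these assumptions, `initial_flags(no_score)` yields both flags set, while `initial_flags(with_score)` leaves the annotation open for manual review.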

Additional Locations (1)

medcat-trainer/webapp/api/api/utils.py#L192-L195

intentional, remote model service models should set this

cursor · 2025-12-16T17:08:42Z

medcat-trainer/webapp/api/api/models.py

+                raise ValidationError('Must set a DeID ModelPack for De-ID Model Annotation, cannot only set a cdb / vocab pair as'
+                                      ' not be a DeId model')
+        elif self.use_model_service and not self.model_service_url:
+            raise ValidationError('When using model service, model_service_url must be set')


Bug: DeID flag silently ignored for remote model service

When use_model_service is True, the deid_model_annotation flag is completely ignored. The model validation skips the DeID check when using remote service, allowing users to enable both options. However, the document processing code only checks deid_model_annotation in the local model path, meaning DeID processing never occurs for remote service projects even when the flag is set. Users who configure a project expecting De-identification processing will silently get standard processing instead.
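To make the gap concrete, here is a minimal sketch in plain Python rather than Django. The `ProjectSketch` dataclass, the `validate` function, the `model_pack` attribute, the example URL and the exact branch structure are assumptions for illustration; only the field names `use_model_service`, `model_service_url` and `deid_model_annotation` and the second error message come from the PR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProjectSketch:
    use_model_service: bool = False
    model_service_url: Optional[str] = None
    deid_model_annotation: bool = False
    model_pack: Optional[str] = None  # hypothetical stand-in for the local model


def validate(project: ProjectSketch) -> None:
    """Assumed shape of the checks: local-model rules are skipped for remote projects."""
    if not project.use_model_service:
        if project.deid_model_annotation and project.model_pack is None:
            raise ValueError('Must set a DeID ModelPack for De-ID Model Annotation')
    elif not project.model_service_url:
        raise ValueError('When using model service, model_service_url must be set')


both_flags = ProjectSketch(use_model_service=True,
                           model_service_url='http://example.invalid/medcat',
                           deid_model_annotation=True)
validate(both_flags)  # passes silently even though DeID is requested
```

In this sketch a project with both flags set passes validation, which is the combination the comment warns about.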

Additional Locations (2)

medcat-trainer/webapp/api/api/views.py#L280-L294

medcat-trainer/webapp/api/api/utils.py#L345-L363

mart-r

Some minor comments and questions, perhaps a few nice to haves. But overall looks alright to me.

mart-r · 2025-12-17T09:07:38Z

medcat-trainer/webapp/api/api/models.py

                                             'if a model pack is used for the project')
    relations = models.ManyToManyField('Relation', blank=True, default=None,
                                       help_text='Relations that will be available for this project')
+    use_model_service = models.BooleanField(default=False,


Is there a need for both the boolean and the URL? Couldn't we just assume that if there's a URL, it needs to be used?

Agree, but in prep for when the admin project setup is redone (a checkbox plus some validation on a URL, or an auto-generated URL, etc.), it's easier to keep this explicit for now.

mart-r · 2025-12-17T09:14:02Z

medcat-trainer/webapp/api/api/utils.py

+    """
    spacy_doc.linked_ents.sort(key=lambda x: len(x.text), reverse=True)

    tkns_in = []


This is no longer used. I suppose that's because we now DO allow entities that cover the same tokens.

Not sure what you mean by it's not used? It's used in views.py and utils.py.

The variable defined on this line is not used anywhere else. It used to be used. But with the removal of code lower down (when iterating over the entities), this is no longer ever used.

mart-r · 2025-12-17T09:18:06Z

medcat-trainer/webapp/api/api/utils.py

+        logger.info('Using remote model service in bg process for project: %s', project.id)
+        filters = SimpleFilters(cuis=cuis)
+        for doc in docs:
+            logger.info(f'Running remote MedCAT service for project {project.id}:{project.name} over doc: {doc.id}')


The other logs above have lazy formatting, this one uses an f-string. Perhaps keep things lazy?
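For reference, a minimal sketch of the lazy form using the standard `logging` module. The logger name and the `log_remote_run` helper are hypothetical; the message text follows the quoted line. With `%s` placeholders the arguments are interpolated only if the record is actually formatted, whereas an f-string is always evaluated before the call.

```python
import logging

logger = logging.getLogger('remote_service_sketch')  # hypothetical logger name


def log_remote_run(project_id, project_name, doc_id) -> None:
    # Arguments are passed separately; logging merges them into the message
    # only when the record is actually formatted for output.
    logger.info('Running remote MedCAT service for project %s:%s over doc: %s',
                project_id, project_name, doc_id)
```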

mart-r · 2025-12-17T09:27:04Z

medcat-trainer/webapp/api/api/utils.py

+        except FileNotFoundError:
+            logger.warning('Missing CUI filter file for project %s', project.id)
+
+    if project.use_model_service:


The only 2 differences between these 2 if clauses seem to be:

- Getting the spacy_doc
- Passing different variables to add_annotations

Perhaps we can avoid some of the code duplication by:

- Using a functools.partial on add_annotations to pass required / changed kwargs
- Using a callable for getting the spacy_doc
- Change the logged message to be dynamic
- Running everything else just once, while using the partial and the callable

Just a nice to have, really.
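A rough sketch of what that refactor could look like. Everything here is a placeholder: `add_annotations_stub`, `run_over_docs`, the lambdas and the keyword names are hypothetical and do not reflect the real `add_annotations` signature.

```python
from functools import partial
from typing import Callable, Iterable, List


def add_annotations_stub(spacy_doc, doc, source: str, out: List[tuple]) -> None:
    """Hypothetical stand-in for add_annotations; records what it was given."""
    out.append((source, doc, spacy_doc))


def run_over_docs(docs: Iterable[str],
                  get_spacy_doc: Callable[[str], object],
                  add_annos: Callable[..., None],
                  label: str,
                  log: Callable[..., None] = lambda *a: None) -> None:
    """Shared loop: only the doc getter, the bound kwargs and the label vary."""
    for doc in docs:
        log('Running %s over doc: %s', label, doc)
        add_annos(spacy_doc=get_spacy_doc(doc), doc=doc)


results: List[tuple] = []
use_model_service = True  # would come from the project in real code

if use_model_service:
    get_doc = lambda text: {'remote': text}   # placeholder for the remote call
    add_annos = partial(add_annotations_stub, source='remote', out=results)
    label = 'remote MedCAT service'
else:
    get_doc = lambda text: {'local': text}    # placeholder for cat(text)
    add_annos = partial(add_annotations_stub, source='local', out=results)
    label = 'MedCAT model'

run_over_docs(['doc a', 'doc b'], get_doc, add_annos, label)
```

The branch only selects the three varying pieces; the loop body itself is written once.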

mart-r · 2025-12-17T09:27:36Z

medcat-trainer/webapp/api/api/utils.py

+        cat.config.components.linking.filters.cuis = cuis
+
+        for doc in docs:
+            logger.info(f'Running MedCAT model for project {project.id}:{project.name} over doc: {doc.id}')


Same comment here regarding f-strings vs lazy log record formatting.

mart-r · 2025-12-17T09:29:59Z

medcat-trainer/webapp/api/api/views.py

-
-                    if not project.deid_model_annotation:
-                        spacy_doc = cat(document.text)
+                    if project.use_model_service:


Same comment here regarding potential use of partials and callables. Though admittedly a lot less actual duplication here.

mart-r · 2025-12-17T09:32:37Z

medcat-trainer/webapp/api/api/views.py

 def _submit_document(project: ProjectAnnotateEntities, document: Document):
-    if project.train_model_on_submit:
+    if project.train_model_on_submit and not project.use_model_service:
+        # interim model training not supported for remote model service projects


Right now it's a silent failure. Is that expected?

logger warning and a TODO
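A minimal sketch of what such a warning could look like, assuming a module-level `logger`. The helper name `should_train_on_submit`, the logger name and the message wording are hypothetical; only the two flag names come from the diff.

```python
import logging

logger = logging.getLogger('submit_sketch')  # hypothetical logger name


def should_train_on_submit(train_model_on_submit: bool, use_model_service: bool) -> bool:
    """Return True only when interim training can actually run."""
    if train_model_on_submit and use_model_service:
        # TODO: interim training is not supported for remote model service projects
        logger.warning('train_model_on_submit is set but the project uses a remote '
                       'model service; skipping interim training')
        return False
    return train_model_on_submit
```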

tomolopolis added 2 commits December 10, 2025 21:02

CU-869bacykn: medcat-trainer(feat): fix overlapping annos styling and…

aa7195b

… for more than one annotation

feat(medcat-trainer): CU-869bgx7m2: remote service functionality for …

70f3fd9

…running inference on remote models for prepare_doc. Online learning not yet supported for this project setup type

tomolopolis changed the title ~~A Project option to run MedCAT models via a remote model service~~ Remote model service option, and frontend fixes for overlapping, multiple annos per spans Dec 16, 2025

cursor bot reviewed Dec 16, 2025

mart-r approved these changes Dec 17, 2025

review comments

378634b

tomolopolis merged commit ed63858 into main Dec 17, 2025
10 of 11 checks passed

tomolopolis deleted the lx-mc-comps branch December 17, 2025 11:49

mart-r mentioned this pull request Mar 9, 2026

fix(medcat-trainer): Fix remote model service errors on cache_project… #363

Merged
