fix: make pipeline step settings optional and fix TrOCR device mismatch #48

Open
andy778 wants to merge 1 commit into AI-Riksarkivet:main from andy778:runtime_error

andy778 commented Mar 22, 2026

Description

pipeline/steps.py

  • Made settings field in PipelineStepConfig optional (dict | None = None), so pipeline YAML configs no longer require an explicit settings: key for steps that use defaults.
  • Added an or {} fallback in init_step so that a None settings value is passed to from_config as an empty dict.
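
The two changes above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: a stdlib dataclass stands in for the real PipelineStepConfig model, and the step field, EchoStep class, and STEP_REGISTRY are hypothetical names introduced only for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PipelineStepConfig:
    """Simplified stand-in for the real config model (assumption)."""

    step: str  # hypothetical field naming the step to run
    # Equivalent to `dict | None = None`: the settings key may be omitted.
    settings: Optional[dict] = None


class EchoStep:
    """Hypothetical step that only records the settings it receives."""

    def __init__(self, settings: dict):
        self.settings = settings

    @classmethod
    def from_config(cls, settings: dict) -> "EchoStep":
        # Copy so the step does not share state with the config object.
        return cls(dict(settings))


# Hypothetical lookup table from step name to step class.
STEP_REGISTRY = {"echo": EchoStep}


def init_step(config: PipelineStepConfig) -> EchoStep:
    step_cls = STEP_REGISTRY[config.step]
    # `or {}` turns a missing (None) settings value into an empty dict.
    return step_cls.from_config(config.settings or {})


step_without_settings = init_step(PipelineStepConfig(step="echo"))
step_with_settings = init_step(
    PipelineStepConfig(step="echo", settings={"batch_size": 4})
)
```

With this shape, a step that relies on defaults can leave out settings entirely, while from_config still always receives a dict.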

models/huggingface/trocr.py

  • Added a workaround for a device mismatch bug in transformers' TrOCRSinusoidalPositionalEmbedding. During inference, position_ids are on CUDA while self.weights remains on CPU (or is left as a meta tensor after loading the checkpoint), which raises a runtime error.
  • Fix: after loading the model, recompute the sinusoidal positional embedding weights from scratch using the layer's own get_embedding() method and move them to the correct device.
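
The recompute-and-move step might look something like the sketch below. It is not the code from this PR: the helper name recompute_sinusoidal_weights is hypothetical, and it assumes each embedding layer exposes weights, embedding_dim, padding_idx, and a get_embedding(num_embeddings, embedding_dim, padding_idx) method. So that the sketch runs without a GPU or a checkpoint, the demo uses small stand-in classes instead of torch and transformers objects.

```python
def recompute_sinusoidal_weights(model, device):
    """Rebuild sinusoidal position tables and place them on `device`.

    Assumes each target layer exposes `weights`, `embedding_dim`,
    `padding_idx` and `get_embedding(n, dim, padding_idx)`.
    Returns the number of layers that were fixed.
    """
    fixed = 0
    for module in model.modules():
        if type(module).__name__ != "TrOCRSinusoidalPositionalEmbedding":
            continue
        # The table size is readable even when the data is not materialized.
        num_positions = module.weights.size(0)
        fresh = module.get_embedding(
            num_positions, module.embedding_dim, module.padding_idx
        )
        module.weights = fresh.to(device)
        fixed += 1
    return fixed


# Stand-ins below are illustrative only; they mimic just enough of the
# tensor and module interfaces for the helper to run.
class FakeTensor:
    def __init__(self, rows, device):
        self.rows, self.device = rows, device

    def size(self, dim):
        return self.rows  # only dim 0 is used in this sketch

    def to(self, device):
        return FakeTensor(self.rows, device)


class TrOCRSinusoidalPositionalEmbedding:
    embedding_dim = 8
    padding_idx = 1

    def __init__(self):
        # Simulates weights left on the meta device after loading.
        self.weights = FakeTensor(rows=16, device="meta")

    @staticmethod
    def get_embedding(num_embeddings, embedding_dim, padding_idx=None):
        # A freshly computed table starts out on CPU.
        return FakeTensor(num_embeddings, device="cpu")


class FakeModel:
    def __init__(self):
        self.layer = TrOCRSinusoidalPositionalEmbedding()

    def modules(self):
        return [self, self.layer]


model = FakeModel()
fixed_count = recompute_sinusoidal_weights(model, "cuda:0")
```

On a real model the same helper would be called once after loading, with the device the model was moved to. Matching on the class name avoids importing the layer class directly, at the cost of missing it if the class were renamed upstream.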

Why
These two fixes address runtime errors encountered when running TrOCR-based pipelines:

  1. A ValidationError when a pipeline step config omits the settings key.
  2. A device mismatch error (Expected all tensors to be on the same device) during TrOCR inference on CUDA.

Issue ticket number and link

#47

Checklist

  • I have performed a self-review of my code locally
  • [x] My code follows the style guidelines of this project (e.g. passes ruff lint + mypy type hinting)
  • If it is a core feature, I have added thorough tests.
  • [x] Documentation has been updated or the changes are too minor to be documented