Processor: allow empty output file grp #1321

Merged
kba merged 7 commits into OCR-D:master from bertsky:processor-allow-empty-output-file-grp
Apr 16, 2025

Collaborator

@bertsky bertsky commented Apr 8, 2025

Spec actually says this is legitimate. We have processors like ocrd-docstruct, or might have evaluators etc. that simply do not produce output in the form of a file group.

@bertsky bertsky requested a review from kba April 8, 2025 17:37
Member

@kba kba left a comment

I ran into this same issue with OCR-D/ocrd_anybaseocr#112.

Comment thread src/ocrd/decorators/__init__.py Outdated
      set_json_key_value_overrides(kwargs['parameter'], *kwargs.pop('parameter_override'))
      # Assert -I / -O
    - if not kwargs['input_file_grp']:
    + if 'input_file_grp' not in kwargs:
Member

Suggested change
    - if 'input_file_grp' not in kwargs:
    + if not kwargs.get('input_file_grp', None):

as discussed
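
To make the difference between the two checks concrete: the membership test only catches a missing key, while the `.get()` form also catches a key that is present but `None` or empty. A minimal sketch, with plain dicts standing in for `kwargs`; the helper names `missing_by_key` and `missing_by_value` are hypothetical and not part of core:

```python
def missing_by_key(kwargs):
    # True only when the key is absent from the dict entirely
    return 'input_file_grp' not in kwargs

def missing_by_value(kwargs):
    # True when the key is absent, or present but None / empty string
    return not kwargs.get('input_file_grp', None)

# Hypothetical kwargs variants, as a CLI wrapper might pass them
cases = {
    'absent': {},
    'none': {'input_file_grp': None},
    'empty': {'input_file_grp': ''},
    'set': {'input_file_grp': 'OCR-D-IMG'},
}

# Maps case name -> (missing_by_key, missing_by_value)
results = {name: (missing_by_key(kw), missing_by_value(kw))
           for name, kw in cases.items()}
```

Only the `.get()` form treats the `none` and `empty` cases as missing, and unlike plain `kwargs['input_file_grp']` it does not raise `KeyError` when the key is absent.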

Member

@kba kba Apr 11, 2025

ocrd-olahd-client requires neither input nor output. But since that is a fairly esoteric processor that no one seems to be using, probably not reason enough to change that behavior in spec/core.

@MehmedGIT @j-panzer @joschrew Is the OLA-HD client still in use?

Contributor

@kba, no, it is used neither by Operandi nor by the integration script.

Collaborator Author

Perhaps we could treat this as a special case: do require an input fileGrp, but only for informational purposes or to represent workflow dependency (final fileGrp), not for actually doing something specific with that fileGrp.

So instead of input_file_grp_cardinality: 0 this would become 1.

It would require users to specify an input fileGrp, but that should normally not be a problem – either pass the "maximal" fileGrp of the workflow, or just use OCR-D-IMG.

(Just to make sure we don't bend the spec because of this.)
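
For illustration, the effect of switching the cardinality from 0 to 1 can be sketched as a simple count check. This is only a sketch under assumptions, not core's actual validation: `check_cardinality` is a hypothetical helper, and the cardinality is assumed to be a plain integer giving the exact number of comma-separated fileGrps.

```python
def check_cardinality(file_grps, cardinality):
    """Return True if the number of fileGrps equals the cardinality.

    Hypothetical helper. Assumes `file_grps` is a comma-separated string
    (or None) and `cardinality` is an exact integer count.
    """
    # Split on commas and drop empty entries, so '' and None count as zero
    grps = [g for g in file_grps.split(',') if g] if file_grps else []
    return len(grps) == cardinality

# Cardinality 1: an input fileGrp such as OCR-D-IMG must be given
ok_with_input = check_cardinality('OCR-D-IMG', 1)
ok_without_input = check_cardinality('', 1)

# Cardinality 0: an empty (or missing) fileGrp is legitimate
ok_empty_output = check_cardinality(None, 0)
```

Under these assumptions, a processor declaring an output cardinality of 0 passes with no `-O` at all, while an input cardinality of 1 rejects an empty `-I`.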

@kba kba merged commit e6f53fe into OCR-D:master Apr 16, 2025
16 checks passed

3 participants