Skip to content

update modules (missing dockerhub key)#461

Closed
bertsky wants to merge 12 commits intoOCR-D:masterfrom
bertsky:gen-dockerhub-list
Closed

update modules (missing dockerhub key)#461
bertsky wants to merge 12 commits intoOCR-D:masterfrom
bertsky:gen-dockerhub-list

Conversation

@bertsky
Copy link
Copy Markdown
Collaborator

@bertsky bertsky commented Apr 15, 2025

@bertsky bertsky requested a review from kba April 15, 2025 14:47
Copy link
Copy Markdown

@MehmedGIT MehmedGIT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Comment thread ocrd-all-dockerhub.py Outdated
@kba kba mentioned this pull request Apr 16, 2025
@mikegerber
Copy link
Copy Markdown
Contributor

mikegerber commented Apr 17, 2025

Please don't include the linked PR for dinglehopper and wait for a release. The PR currently contains a change of CER calculation that needs discussion. Otherwise we'd end up with a dinglehopper in ocrd_all that gives different results ...

I also wish if ocrd_all would start using the versioned releases - I specifically started doing those releases for OCR-D, but they don't seem to be used.

@stweil
Copy link
Copy Markdown
Collaborator

stweil commented Apr 17, 2025

I also wish if ocrd_all would start using the versioned releases - I specifically started doing those releases for OCR-D, but they don't seem to be used.

Yes, please, and not only for dinglehopper, but for all submodules.

@mikegerber
Copy link
Copy Markdown
Contributor

Please don't include the linked PR for dinglehopper and wait for a release. The PR currently contains a change of CER calculation that needs discussion. Otherwise we'd end up with a dinglehopper in ocrd_all that gives different results ...

I've released dinglehopper v0.10.0 with the V3 update (but not the proposed change to using normalized CER. (For the latter there is a separate PR now.)

@bertsky
Copy link
Copy Markdown
Collaborator Author

bertsky commented Apr 17, 2025

@mikegerber The dinglehopper update was originally (the last remaining submodule) scheduled for this PR as our last release of the ocrd/all fat container images. For that, it must still support Python 3.8, which all our base stages are currently built on (because some module still rely on Tensorflow 1.x, which is only available prebuilt via nvidia-tensorflow for Py38). That can only be overcome by the slim containers (where every module can have its own base version of Python etc). So can we please get a version that does not rely on uniseg>=0.9? See here for how this fails our ocrd/all builds. It all hinges on this module now.

I cannot speak to whether or not my normalized-cer branch should have been included. We had a frenzy of PRs and updates on our hands for v3 and related updates. So I guess combining this was to save time (as we also depend on usable results without Infinity or averages over CER results above 100%).

@bertsky
Copy link
Copy Markdown
Collaborator Author

bertsky commented Apr 17, 2025

superseded by #462

@bertsky bertsky closed this Apr 17, 2025
@mikegerber
Copy link
Copy Markdown
Contributor

@mikegerber The dinglehopper update was originally (the last remaining submodule) scheduled for this PR as our last release of the ocrd/all fat container images. For that, it must still support Python 3.8, which all our base stages are currently built on (because some module still rely on Tensorflow 1.x, which is only available prebuilt via nvidia-tensorflow for Py38). That can only be overcome by the slim containers (where every module can have its own base version of Python etc). So can we please get a version that does not rely on uniseg>=0.9? See here for how this fails our ocrd/all builds. It all hinges on this module now.

Ah, that's good context to know, I wasn't aware of this (and only checked EOLness of Python 3.8 after it broke). I will release 0.10.1 shortly with the patch!

@mikegerber
Copy link
Copy Markdown
Contributor

I cannot speak to whether or not my normalized-cer branch should have been included. We had a frenzy of PRs and updates on our hands for v3 and related updates. So I guess combining this was to save time (as we also depend on usable results without Infinity or averages over CER results above 100%).

I can say it definitely didn't save ME time. It's a rather intrusive change because it redefines CER and this needs much more thought w.r.t. proper integration. It gives different results for users! To be me it's clear that it needs more love than sneaking it in with a PR that was just supposed to update to OCR-D's V3 API.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants