
[Model] Add PP-OCRv5_server_rec and PP-OCRv5_mobile_rec models Support #44808

Merged
vasqu merged 19 commits into huggingface:main from zhang-prog:feat/pp_ocrv5_rec_models
Mar 18, 2026

Conversation

@zhang-prog
Contributor

No description provided.

Contributor

@vasqu vasqu left a comment

I think we have the core down now, now it's about the last details! Great work overall 🤗

logging,
requires_backends,
)
from ...utils.constants import ( # noqa: F401
Contributor

Suggested change
from ...utils.constants import ( # noqa: F401
from ...utils.constants import (

Really unsure, but I don't think we need the noqa.

Contributor Author

Done

pad_size = {"height": 48, "width": 320}
do_resize = True
do_rescale = True
do_convert_rgb = True
Contributor

Can we retroactively change that for previous models (e.g. server/mobile det) re RGB?

Probably in a different PR

Contributor Author

Yeah, I will make a new PR to solve this problem.



@auto_docstring
@requires(backends=("torch",))
Contributor

Suggested change
@requires(backends=("torch",))

Not 100% sure, but it seems that the base processing does not use anything torch-specific? If so, we don't need the requires-backends decorator within the post-processing function.

Contributor Author

Done

Comment on lines +441 to +446
logits = self.head(outputs.last_hidden_state, **kwargs)

return BaseModelOutputWithNoAttention(
last_hidden_state=logits,
hidden_states=outputs.hidden_states,
)
Contributor

Suggested change
logits = self.head(outputs.last_hidden_state, **kwargs)
return BaseModelOutputWithNoAttention(
last_hidden_state=logits,
hidden_states=outputs.hidden_states,
)
head_outputs = self.head(outputs.last_hidden_state, **kwargs)
return YourNewOutputClass(
last_hidden_state=head_outputs.last_hidden_state,
hidden_states=outputs.hidden_states,
head_hidden_states=head_outputs.hidden_states,
)

Just as a rough idea what I had in mind --> allow hidden states of both since the head model is quite sophisticated, I think it makes sense to have them as well
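To make the rough idea above concrete, here is a minimal sketch of such an output class, using a plain dataclass as a stand-in for transformers' ModelOutput; the class and field names are hypothetical, following the suggestion:

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

# Hypothetical sketch: an output carrying hidden states from both the backbone
# and the head, mimicking the ModelOutput pattern with a plain dataclass.
@dataclass
class TextRecognitionOutputSketch:
    last_hidden_state: Any = None
    hidden_states: Optional[Tuple] = None       # backbone hidden states
    head_hidden_states: Optional[Tuple] = None  # head hidden states

out = TextRecognitionOutputSketch(
    last_hidden_state="logits",
    hidden_states=("backbone_h0", "backbone_h1"),
    head_hidden_states=("head_h0",),
)
print(out.head_hidden_states)  # → ('head_h0',)
```

This way callers can inspect both sets of hidden states without the head's being silently dropped.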

Contributor Author

Well, I see.

Comment thread src/transformers/models/hgnet_v2/modular_hgnet_v2.py
batch_size=3,
image_size=[48, 320],
num_channels=3,
is_training=False,
Contributor

Just as always, for my interest: any support for training planned? 👀

Comment on lines +151 to +161
@unittest.skip("PPOCRV5ServerRec has no attribute `hf_device_map`")
def test_cpu_offload(self):
pass

@unittest.skip("PPOCRV5ServerRec has no attribute `hf_device_map`")
def test_disk_offload_bin(self):
pass

@unittest.skip("PPOCRV5ServerRec has no attribute `hf_device_map`")
def test_disk_offload_safetensors(self):
pass
Contributor

We should change model_split_percents (an attribute within the mixin); it likely needs higher splits, e.g. model_split_percents = [0.5, 0.7, 0.8] # [0.5, 0.8]

Contributor Author

[0.5, 0.7, 0.8] doesn’t work, but [0.5, 0.8] works.

However, the test_model_parallelism test is still failing.
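The attribute override discussed above can be sketched as follows; the mixin class and its default value here are stand-ins, not the real transformers ModelTesterMixin:

```python
# Hypothetical sketch: model test classes override a class attribute inherited
# from a shared test mixin to tune where device_map splits the model.
class ModelTesterMixinSketch:
    model_split_percents = [0.5, 0.7, 0.9]  # placeholder default, not the real value

class PPOCRV5ServerRecModelTestSketch(ModelTesterMixinSketch):
    # Per the discussion above: [0.5, 0.7, 0.8] failed, [0.5, 0.8] works.
    model_split_percents = [0.5, 0.8]

print(PPOCRV5ServerRecModelTestSketch.model_split_percents)  # → [0.5, 0.8]
```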

Contributor

Not super important imo, can be skipped for now

@vasqu
Contributor

vasqu commented Mar 18, 2026

Btw main should be stable again, just need to merge/rebase with main

@zhang-prog zhang-prog requested a review from vasqu March 18, 2026 14:04
Contributor

@vasqu vasqu left a comment

Mostly good with the current state, see my last comments. And sorry but gotta be strict about adding tests :/

Careful approval

def forward(
self,
hidden_states: torch.Tensor,
attention_mask: torch.Tensor | None = None,
Contributor

Suggested change
attention_mask: torch.Tensor | None = None,
attention_mask: torch.Tensor | None = None, # Not used but kept for signature matching in downstream modules
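The pattern behind this suggestion, keeping an unused parameter purely for signature compatibility, can be sketched in isolation (function name and body are hypothetical):

```python
# Hypothetical sketch: `attention_mask` is accepted but never consumed, so this
# forward stays call-compatible with sibling modules that do use a mask.
def forward_sketch(hidden_states, attention_mask=None, **kwargs):
    # Only hidden_states is actually used.
    return [h * 2 for h in hidden_states]

print(forward_sketch([1, 2, 3], attention_mask=None))  # → [2, 4, 6]
```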

Comment on lines +278 to +280
# NOTE:
# Prevents TypeError from duplicate attention_mask arguments (passed both directly and in **kwargs).
# This parameter is a placeholder for compatibility and is not actually consumed by the function.
Contributor

Suggested change
# NOTE:
# Prevents TypeError from duplicate attention_mask arguments (passed both directly and in **kwargs).
# This parameter is a placeholder for compatibility and is not actually consumed by the function.

Just a nit: I don't think we need to be too verbose.

main_input_name = "pixel_values"
input_modalities = ("image",)
_can_record_outputs = {
"hidden_states": [PPOCRV5ServerRecConvLayer, PPOCRV5ServerRecBlock],
Contributor

Suggested change
"hidden_states": [PPOCRV5ServerRecConvLayer, PPOCRV5ServerRecBlock],
"hidden_states": PPOCRV5ServerRecBlock,

Imo, I think we only want these because they are the ones described via config.depth.

head_outputs = self.head(outputs.last_hidden_state, **kwargs)

return PPOCRV5ServerRecForTextRecognitionOutput(
last_hidden_state=head_outputs,
Contributor

Suggested change
last_hidden_state=head_outputs,
last_hidden_state=head_outputs.last_hidden_state,

config,
):
super().__init__(config)
# Use noqa to bypass the `unused in modular` check.
Contributor

Suggested change
# Use noqa to bypass the `unused in modular` check.

No need for the comment, don't worry.

Contributor

Ok, I hate to be stern, but I would definitely add modeling tests with integration tests - for the simple reason that the backbone is different and we have a slightly different model, albeit by very little.

Contributor

Reopening to add tests please

with torch.no_grad():
_ = model(**self._prepare_for_class(inputs_dict, model_class))

def test_hidden_states_output(self):
Contributor

Can we add the head hidden states to check as well?

@vasqu
Contributor

vasqu commented Mar 18, 2026

Hmm, maybe I was wrong on the gradient checkpointing:
FAILED tests/models/pp_ocrv5_server_rec/test_modeling_pp_ocrv5_server_rec.py::PPOCRV5ServerRecModelTest::test_gradient_checkpointing_backward_compatibility - ValueError: PPOCRV5ServerRecModel is not compatible with gradient checkpointing. Make sure all the architecture support it by setting a boolean attribute gradient_checkpointing to modules of the model that uses checkpointing.

@zhang-prog
Contributor Author

Yes, setting self.gradient_checkpointing = False is necessary to fix it, at least for now. :)
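The fix the author describes can be sketched as follows; per the error message above, the compatibility check expects a boolean `gradient_checkpointing` attribute on the relevant modules, and the class name here is a stand-in:

```python
# Hypothetical sketch: declaring the module as not using gradient checkpointing
# by setting the boolean attribute the compatibility check looks for.
class PPOCRV5ServerRecModelSketch:
    def __init__(self):
        # Opt out of gradient checkpointing explicitly, at least for now.
        self.gradient_checkpointing = False

model = PPOCRV5ServerRecModelSketch()
print(model.gradient_checkpointing)  # → False
```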

Contributor

@vasqu vasqu left a comment

Don't have much to add except for a few small nits + let's add tests for the mobile version as well please

Other than that, good to go!

Contributor

Reopening to add tests please

@vasqu
Contributor

vasqu commented Mar 18, 2026

run-slow: hgnet_v2, pp_ocrv5_mobile_rec, pp_ocrv5_server_det, pp_ocrv5_server_rec

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/hgnet_v2", "models/pp_ocrv5_mobile_rec", "models/pp_ocrv5_server_det", "models/pp_ocrv5_server_rec"]
quantizations: []

@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context  Commit    Description
RUN      4cae9ac9  workflow commit (merge commit)
PR       6e5aaeff  branch commit (from PR)
main     4ec84a02  base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@vasqu vasqu enabled auto-merge March 18, 2026 16:43
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@vasqu
Contributor

vasqu commented Mar 18, 2026

@zhang-prog Will merge tomorrow probably, CI is struggling at the moment - nothing to do on your side 🤗

@vasqu vasqu disabled auto-merge March 18, 2026 17:30
@vasqu vasqu enabled auto-merge March 18, 2026 17:45
@vasqu vasqu added this pull request to the merge queue Mar 18, 2026
@vasqu vasqu removed this pull request from the merge queue due to a manual request Mar 18, 2026
@vasqu
Contributor

vasqu commented Mar 18, 2026

run-slow: hgnet_v2, pp_ocrv5_mobile_rec, pp_ocrv5_server_det, pp_ocrv5_server_rec

@github-actions
Contributor

Workflow Run ⚙️

This comment contains run-slow, running the specified jobs:

models: ["models/hgnet_v2", "models/pp_ocrv5_mobile_rec", "models/pp_ocrv5_server_det", "models/pp_ocrv5_server_rec"]
quantizations: []

@github-actions
Contributor

[For maintainers] Suggested jobs to run (before merge)

run-slow: auto, hgnet_v2, pp_ocrv5_mobile_rec, pp_ocrv5_server_det, pp_ocrv5_server_rec

@vasqu vasqu enabled auto-merge March 18, 2026 20:02
@github-actions
Contributor

CI Results

Workflow Run ⚙️

Commit Info

Context  Commit    Description
RUN      ee1f6921  workflow commit (merge commit)
PR       d0e841d6  branch commit (from PR)
main     21950930  base commit (on main)

✅ No failing test specific to this PR 🎉 👏 !

@vasqu vasqu added this pull request to the merge queue Mar 18, 2026
Merged via the queue into huggingface:main with commit c55f650 Mar 18, 2026
29 checks passed