[Model] Add PP-OCRv5_server_rec and PP-OCRv5_mobile_rec models support #44808
vasqu merged 19 commits into huggingface:main
Conversation
vasqu
left a comment
I think we have the core down now; it's just about the last details! Great work overall 🤗
```python
    logging,
    requires_backends,
)
from ...utils.constants import (  # noqa: F401
```
```diff
-from ...utils.constants import (  # noqa: F401
+from ...utils.constants import (
```
Really unsure, but I don't think we need the `noqa` here.
```python
pad_size = {"height": 48, "width": 320}
do_resize = True
do_rescale = True
do_convert_rgb = True
```
Can we retroactively change that for previous models (e.g. server/mobile det) re RGB?
Probably in a different PR.
Yeah, I will make a new PR to solve this problem.
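For context on what `do_convert_rgb=True` is meant to guarantee, here is a minimal sketch (my own illustration with numpy, not the actual image-processor internals) of promoting a grayscale input to three channels:

```python
import numpy as np

def convert_to_rgb(image: np.ndarray) -> np.ndarray:
    """Promote an (H, W) grayscale array to (H, W, 3) by repeating the channel."""
    if image.ndim == 2:
        image = np.repeat(image[..., None], 3, axis=-1)
    return image

gray = np.zeros((48, 320), dtype=np.uint8)  # matches the rec model's 48x320 input
rgb = convert_to_rgb(gray)
print(rgb.shape)  # (48, 320, 3)
```

Already-RGB inputs pass through unchanged, which is why the flag is safe to enable by default.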
```python
@auto_docstring
@requires(backends=("torch",))
```
```diff
-@requires(backends=("torch",))
```
Not 100% sure, but it seems the base processing doesn't use anything torch-specific? If so, then we don't need the requires-backend check within the post-processing function.
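For anyone unfamiliar, the point of a requires-backend guard is to fail loudly at call time when an optional dependency is missing; if the function is pure Python, the guard never has anything to catch. A toy version (my own sketch, not the transformers implementation) looks roughly like:

```python
import importlib.util
from functools import wraps

def requires_backends_sketch(*backends):
    """Toy backend guard: raise ImportError at call time if any backend is missing."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            missing = [b for b in backends if importlib.util.find_spec(b) is None]
            if missing:
                raise ImportError(f"{fn.__name__} requires: {', '.join(missing)}")
            return fn(*args, **kwargs)
        return wrapper
    return decorator

@requires_backends_sketch("json")  # stdlib module, always importable
def post_process(logits):
    return logits

@requires_backends_sketch("definitely_not_installed_backend")
def blocked():
    return "never reached"
```

Calling `post_process` works everywhere, while `blocked()` raises `ImportError`; dropping the guard from torch-free code just removes a check that could never trigger usefully.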
```python
logits = self.head(outputs.last_hidden_state, **kwargs)

return BaseModelOutputWithNoAttention(
    last_hidden_state=logits,
    hidden_states=outputs.hidden_states,
)
```
```diff
-logits = self.head(outputs.last_hidden_state, **kwargs)
-return BaseModelOutputWithNoAttention(
-    last_hidden_state=logits,
-    hidden_states=outputs.hidden_states,
-)
+head_outputs = self.head(outputs.last_hidden_state, **kwargs)
+return YourNewOutputClass(
+    last_hidden_state=head_outputs.last_hidden_state,
+    hidden_states=outputs.hidden_states,
+    head_hidden_states=head_outputs.hidden_states,
+)
```
Just a rough idea of what I had in mind: allow hidden states of both. Since the head model is quite sophisticated, I think it makes sense to have them as well.
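If it helps, a dependency-free sketch of such an output class (names are hypothetical; the real one would subclass transformers' `ModelOutput` and hold tensors):

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple

@dataclass
class TextRecognitionOutputSketch:
    """Hypothetical output carrying head logits plus both sets of hidden states."""
    last_hidden_state: Any = None
    hidden_states: Optional[Tuple[Any, ...]] = None       # backbone hidden states
    head_hidden_states: Optional[Tuple[Any, ...]] = None  # head hidden states

out = TextRecognitionOutputSketch(
    last_hidden_state="logits",
    hidden_states=("backbone_0", "backbone_1"),
    head_hidden_states=("head_0",),
)
```

Keeping the backbone and head states in separate fields means downstream users can inspect either stage without guessing where one ends and the other begins.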
```python
batch_size=3,
image_size=[48, 320],
num_channels=3,
is_training=False,
```
Just as always, for my own interest: any support for training planned? 👀
```python
@unittest.skip("PPOCRV5ServerRec has no attribute `hf_device_map`")
def test_cpu_offload(self):
    pass

@unittest.skip("PPOCRV5ServerRec has no attribute `hf_device_map`")
def test_disk_offload_bin(self):
    pass

@unittest.skip("PPOCRV5ServerRec has no attribute `hf_device_map`")
def test_disk_offload_safetensors(self):
    pass
```
We should change `model_split_percents` (an attribute within the mixin); it likely needs higher splits, e.g. `model_split_percents = [0.5, 0.7, 0.8]  # [0.5, 0.8]`.
`[0.5, 0.7, 0.8]` doesn't work, but `[0.5, 0.8]` works.
However, the `test_model_parallelism` test is still failing.
Not super important imo, can be skipped for now
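For reference, these fractions are used by the offload tests as cut points relative to the model's total size, roughly like the hypothetical helper below (my own illustration, not the mixin's actual code):

```python
def split_budgets(total_size: int, percents: list[float]) -> list[int]:
    """Hypothetical sketch: turn split fractions into per-run size budgets,
    as the offload tests do when deciding how much of the model fits on-device."""
    return [int(total_size * p) for p in percents]

print(split_budgets(1000, [0.5, 0.8]))       # [500, 800]
print(split_budgets(1000, [0.5, 0.7, 0.8]))  # [500, 700, 800]
```

So raising the fractions means the tests try keeping a larger share of the model resident before spilling the rest, which is why tiny models often need larger values.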
Btw main should be stable again, just need to merge/rebase with main.
vasqu
left a comment
Mostly good with the current state, see my last comments. And sorry, but I gotta be strict about adding tests :/
Careful approval
```python
def forward(
    self,
    hidden_states: torch.Tensor,
    attention_mask: torch.Tensor | None = None,
```
```diff
-    attention_mask: torch.Tensor | None = None,
+    attention_mask: torch.Tensor | None = None,  # Not used but kept for signature matching in downstream modules
```
```python
# NOTE:
# Prevents TypeError from duplicate attention_mask arguments (passed both directly and in **kwargs).
# This parameter is a placeholder for compatibility and is not actually consumed by the function.
```
```diff
-# NOTE:
-# Prevents TypeError from duplicate attention_mask arguments (passed both directly and in **kwargs).
-# This parameter is a placeholder for compatibility and is not actually consumed by the function.
```
Just a nit: I don't think we need to be this verbose.
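To make the point concrete, here is one way a missing placeholder parameter blows up (my reconstruction of the kind of failure, not the PR's exact call site): when callers follow a shared convention that passes the mask positionally, a signature without the parameter can't absorb it.

```python
def layer_without_placeholder(hidden_states, **kwargs):
    # No attention_mask parameter: a positional mask has nowhere to bind.
    return hidden_states

def layer_with_placeholder(hidden_states, attention_mask=None, **kwargs):
    # attention_mask is unused; it only absorbs the argument from the shared
    # calling convention so the call doesn't raise.
    return hidden_states

mask = object()
try:
    layer_without_placeholder("h", mask)  # extra positional arg
except TypeError as e:
    print(f"TypeError without the placeholder: {e}")

print(layer_with_placeholder("h", mask))  # "h"
```

A one-line trailing comment on the parameter (as suggested above in the thread) documents this without the three-line NOTE block.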
```python
main_input_name = "pixel_values"
input_modalities = ("image",)
_can_record_outputs = {
    "hidden_states": [PPOCRV5ServerRecConvLayer, PPOCRV5ServerRecBlock],
```
```diff
-    "hidden_states": [PPOCRV5ServerRecConvLayer, PPOCRV5ServerRecBlock],
+    "hidden_states": PPOCRV5ServerRecBlock,
```
Imo, we only want these because they are the ones described via `config.depth`.
```python
head_outputs = self.head(outputs.last_hidden_state, **kwargs)

return PPOCRV5ServerRecForTextRecognitionOutput(
    last_hidden_state=head_outputs,
```
```diff
-    last_hidden_state=head_outputs,
+    last_hidden_state=head_outputs.last_hidden_state,
```
```python
    config,
):
    super().__init__(config)
    # Use noqa to bypass the `unused in modular` check.
```
```diff
-    # Use noqa to bypass the `unused in modular` check.
```
No need for the comment, don't worry.
Ok, I hate to be stern, but I would definitely have modeling tests with integration tests, for the simple reason that the backbone is different and we have a slightly different model, albeit by very little.
Reopening to add tests please
```python
with torch.no_grad():
    _ = model(**self._prepare_for_class(inputs_dict, model_class))

def test_hidden_states_output(self):
```
Can we add the head hidden states to the check as well?
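A rough sketch of what such a check could assert (hypothetical helper with numpy stand-ins for tensors; the real test would compare against `config.depth` and exact shapes):

```python
import numpy as np

def check_hidden_states(hidden_states, expected_count, batch_size):
    """Hypothetical check: right number of recorded states, right batch dim."""
    assert len(hidden_states) == expected_count, (len(hidden_states), expected_count)
    for hs in hidden_states:
        assert hs.shape[0] == batch_size

depth = 4  # stand-in for config.depth
backbone_states = tuple(np.zeros((3, 64, 6, 40)) for _ in range(depth))
head_states = tuple(np.zeros((3, 80, 120)) for _ in range(2))

check_hidden_states(backbone_states, depth, batch_size=3)
check_hidden_states(head_states, 2, batch_size=3)
```

The same helper runs over both tuples, so adding the head states to the existing test is mostly a second call plus the expected count.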
Hmm, maybe I was wrong on the gradient checkpointing:

Yes, setting
vasqu
left a comment
Don't have much to add except for a few small nits; let's add tests for the mobile version as well, please.
Other than that, good to go!
Reopening to add tests please
run-slow: hgnet_v2, pp_ocrv5_mobile_rec, pp_ocrv5_server_det, pp_ocrv5_server_rec

This comment contains models: ["models/hgnet_v2", "models/pp_ocrv5_mobile_rec", "models/pp_ocrv5_server_det", "models/pp_ocrv5_server_rec"]
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@zhang-prog Will merge tomorrow probably, CI is struggling at the moment - nothing to do on your side 🤗
[For maintainers] Suggested jobs to run (before merge): run-slow: auto, hgnet_v2, pp_ocrv5_mobile_rec, pp_ocrv5_server_det, pp_ocrv5_server_rec
No description provided.