Add rtdetr-v2 version of code (#33244)
Conversation
CI error seems unrelated (jax, albumentations not-installed error)
Thanks for adding @SangbumChoi! As v2 is released with a new paper, it should be added as its own, separate model in the repo.
@amyeroberts There is no problem making v2 an independent model, but should I make it possible to import a v1 configuration in v2?
@SangbumChoi As there appear to be v2-specific checkpoints, I'd say no.
Files to be transferred: https://huggingface.co/danelcsb
```python
>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> image = Image.open(requests.get(url, stream=True).raw)
>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
```
Need to be changed after approval
```python
>>> image = Image.open(requests.get(url, stream=True).raw)
>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
>>> model = RTDetrV2ForObjectDetection.from_pretrained("danelcsb/rtdetr_v2_r50vd")
```
Need to be changed after approval
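For context on what happens after the `model(...)` call in snippets like the one above, here is a minimal, dependency-light sketch of the score-thresholding idea behind detection post-processing. The function name `filter_detections` and the tensor shapes are illustrative, not the transformers API:

```python
import torch

def filter_detections(logits, boxes, threshold=0.5):
    """Illustrative sketch: keep queries whose best class score passes the threshold.

    logits: (num_queries, num_classes) raw class logits from the detection head
    boxes:  (num_queries, 4) predicted boxes for the same queries
    """
    probs = logits.sigmoid()            # RT-DETR heads are trained with sigmoid, not softmax
    scores, labels = probs.max(dim=-1)  # best class and its score per query
    keep = scores > threshold
    return scores[keep], labels[keep], boxes[keep]
```

In practice you would call the image processor's `post_process_object_detection` method instead, which additionally rescales boxes to the original image size; the sketch only shows the thresholding step.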
```python
_CONFIG_FOR_DOC = "RTDetrV2Config"
# TODO: Replace all occurrences of the checkpoint with the final one
_CHECKPOINT_FOR_DOC = ""
```
Need to be changed after approval
```python
from PIL import Image

CHECKPOINT = "danelcsb/rtdetr_v2_r50vd"  # TODO: replace
```
Need to be changed after approval
@amyeroberts RTDetrV2 is ready 👍🏼
amyeroberts left a comment:
Looks great - thanks for adding!
Just a few small comments. The final step after addressing these is running the slow tests for the model before merge. Could you push an empty commit with the message `[run_slow] rt_detr_v2` (e.g. `git commit --allow-empty -m "[run_slow] rt_detr_v2"`)?
```python
@require_torch
class RTDetrV2ModelTest(ModelTesterMixin, PipelineTesterMixin, unittest.TestCase):
```
We should use `# Copied from` for the tests too
@amyeroberts Well, actually, the configurations of RTDetr and RTDetrV2 are different, so it can't be used everywhere. However, I will add it for the parts where I can, e.g. `RTDetrV2ResNetModelTester`.
Wait, how can we use `# Copied from` when it starts from `transformers`?

```python
# Copied from transformers.models.rt_detr.modeling_rt_detr.RTDetrPreTrainedModel with RTDetr->RTDetrV2,rt_detr->rt_detr_v2
```
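For context, the `with A->B,C->D` suffix in such a comment is a textual rename mapping applied by the repository's copy-consistency check. The following toy function illustrates the substitution idea only; it is not the actual `utils/check_copies.py` implementation:

```python
def apply_copy_mapping(source: str, mapping: str = "RTDetr->RTDetrV2,rt_detr->rt_detr_v2") -> str:
    """Toy sketch of how a 'with A->B,C->D' pattern rewrites copied code."""
    for pair in mapping.split(","):
        old, new = pair.split("->")
        # plain string replacement, applied left to right
        source = source.replace(old, new)
    return source
```

So a class copied from `rt_detr` can be checked against its `rt_detr_v2` counterpart after the renames are applied.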
```python
num_backbone_outs = len(config.decoder_in_channels)
decoder_input_proj_list = []
for _ in range(num_backbone_outs):
    in_channels = config.decoder_in_channels[_]
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
for _ in range(config.num_feature_levels - num_backbone_outs):
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    in_channels = config.d_model
```
As above - this makes it more explicit which dimensions are being used wrt the scope
Suggested change:

```diff
-num_backbone_outs = len(config.decoder_in_channels)
-decoder_input_proj_list = []
-for _ in range(num_backbone_outs):
-    in_channels = config.decoder_in_channels[_]
-    decoder_input_proj_list.append(
-        nn.Sequential(
-            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
-            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
-        )
-    )
-for _ in range(config.num_feature_levels - num_backbone_outs):
-    decoder_input_proj_list.append(
-        nn.Sequential(
-            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
-            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
-        )
-    )
-    in_channels = config.d_model
+decoder_input_proj_list = []
+for in_channels in config.decoder_in_channels:
+    decoder_input_proj_list.append(
+        nn.Sequential(
+            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
+            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
+        )
+    )
+decoder_input_proj_list.append(
+    nn.Sequential(
+        nn.Conv2d(config.decoder_in_channels[-1], config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
+        nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
+    )
+)
+for _ in range(config.num_feature_levels - num_backbone_outs - 1):
+    decoder_input_proj_list.append(
+        nn.Sequential(
+            nn.Conv2d(config.d_model, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
+            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
+        )
+    )
```
@amyeroberts Unlike the case above, I think this is not always true when `config.num_feature_levels == num_backbone_outs`. Let me think it over and try to fix it.
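To make that edge case concrete, here is a dependency-free sketch of the channel plan the projection loop is meant to produce. The function name and the `(in_channels, out_channels, kernel_size, stride)` tuple layout are illustrative, not code from the PR:

```python
def decoder_input_proj_plan(decoder_in_channels, d_model, num_feature_levels):
    """Return an (in_channels, out_channels, kernel_size, stride) tuple per projection layer."""
    plan = []
    # one 1x1 conv per backbone output level
    for in_ch in decoder_in_channels:
        plan.append((in_ch, d_model, 1, 1))
    # extra feature levels (if any): 3x3 stride-2 convs; the first consumes the
    # last backbone level's channels, the rest consume d_model
    in_ch = decoder_in_channels[-1]
    for _ in range(num_feature_levels - len(decoder_in_channels)):
        plan.append((in_ch, d_model, 3, 2))
        in_ch = d_model
    return plan
```

When `num_feature_levels == len(decoder_in_channels)`, the second loop runs zero times, so no unconditional extra layer should be appended, which is exactly the case raised above.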
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Hi @SangbumChoi! Excited to see RT-DETR-V2 in Transformers, thanks for working on this! As most of the work is already done in this PR, happy to help if you have any questions!
Closing since there is another PR using the modular approach.
What does this PR do?
This PR adds code compatible with rtdetr-v2: https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetrv2_pytorch/configs/rtdetrv2/rtdetrv2_r18vd_120e_coco.yml
At the moment I have only uploaded rtdetrv2_r18vd for testing, but while the code is under review I will upload the other model weights as well. https://huggingface.co/danelcsb/rtdetr_v2_r18vd/tree/main
@qubvel @amyeroberts

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.