Add rtdetr-v2 version of code (#33244)
Conversation
CI error seems unrelated (jax, albumentations not-installed error)
Thanks for adding @SangbumChoi! As v2 is released with a new paper, it should be added as its own, separate model in the repo.
@amyeroberts There is no problem making v2 an independent model, but should I make it possible to import a v1 configuration in v2?
@SangbumChoi As there appear to be v2-specific checkpoints, I'd say no.
Files to be transferred: https://huggingface.co/danelcsb
```python
>>> url = 'http://images.cocodataset.org/val2017/000000039769.jpg'
>>> image = Image.open(requests.get(url, stream=True).raw)
>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
```
Need to be changed after approval
```python
>>> image = Image.open(requests.get(url, stream=True).raw)
>>> image_processor = RTDetrImageProcessor.from_pretrained("danelcsb/rtdetr_v2_r50vd")
>>> model = RTDetrV2ForObjectDetection.from_pretrained("danelcsb/rtdetr_v2_r50vd")
```
Need to be changed after approval
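For context on what happens after the `model(...)` call in snippets like the one above, here is a minimal, dependency-light sketch of the score-thresholding idea behind detection post-processing. The function name `filter_detections` and the tensor shapes are illustrative, not the transformers API:

```python
import torch

def filter_detections(logits, boxes, threshold=0.5):
    """Illustrative sketch: keep queries whose best class score passes the threshold.

    logits: (num_queries, num_classes) raw class logits from the detection head
    boxes:  (num_queries, 4) predicted boxes for the same queries
    """
    probs = logits.sigmoid()            # RT-DETR heads are trained with sigmoid, not softmax
    scores, labels = probs.max(dim=-1)  # best class and its score per query
    keep = scores > threshold
    return scores[keep], labels[keep], boxes[keep]
```

In practice you would call the image processor's `post_process_object_detection` method instead, which additionally rescales boxes to the original image size; the sketch only shows the thresholding step.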
```python
_CONFIG_FOR_DOC = "RTDetrV2Config"
# TODO: Replace all occurrences of the checkpoint with the final one
_CHECKPOINT_FOR_DOC = ""
```
Need to be changed after approval
```python
from PIL import Image

CHECKPOINT = "danelcsb/rtdetr_v2_r50vd"  # TODO: replace
```
Need to be changed after approval
@amyeroberts RTDetrV2 is ready 👍🏼
amyeroberts left a comment:
Looks great - thanks for adding!
Just a few small comments. The final step after addressing these is running the slow tests for the model before merge. Could you push an empty commit with the message `[run_slow] rt_detr_v2` (e.g. `git commit --allow-empty -m "[run_slow] rt_detr_v2"`)?
```python
@require_torch
class RTDetrV2ModelTest(ModelTesterMixin, PipelineTesterMixin, unittest.TestCase):
```
We should use `# Copied from` for the tests too
@amyeroberts Well, actually, the configurations of RTDetr and RTDetrV2 are different, so it can't be used everywhere. However, I will add it for the parts where I can, e.g. `RTDetrV2ResNetModelTester`.
Wait, how can we use `# Copied from` when it starts from `transformers`?

```python
# Copied from transformers.models.rt_detr.modeling_rt_detr.RTDetrPreTrainedModel with RTDetr->RTDetrV2,rt_detr->rt_detr_v2
```
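For context, the `with A->B,C->D` suffix in such a comment is a textual rename mapping applied by the repository's copy-consistency check. The following toy function illustrates the substitution idea only; it is not the actual `utils/check_copies.py` implementation:

```python
def apply_copy_mapping(source: str, mapping: str = "RTDetr->RTDetrV2,rt_detr->rt_detr_v2") -> str:
    """Toy sketch of how a 'with A->B,C->D' pattern rewrites copied code."""
    for pair in mapping.split(","):
        old, new = pair.split("->")
        # plain string replacement, applied left to right
        source = source.replace(old, new)
    return source
```

So a class copied from `rt_detr` can be checked against its `rt_detr_v2` counterpart after the renames are applied.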
```python
num_backbone_outs = len(config.decoder_in_channels)
decoder_input_proj_list = []
for _ in range(num_backbone_outs):
    in_channels = config.decoder_in_channels[_]
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
for _ in range(config.num_feature_levels - num_backbone_outs):
    decoder_input_proj_list.append(
        nn.Sequential(
            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
        )
    )
    in_channels = config.d_model
```
As above - this makes it more explicit which dimensions are being used wrt the scope
Suggested change:

```diff
-num_backbone_outs = len(config.decoder_in_channels)
-decoder_input_proj_list = []
-for _ in range(num_backbone_outs):
-    in_channels = config.decoder_in_channels[_]
-    decoder_input_proj_list.append(
-        nn.Sequential(
-            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
-            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
-        )
-    )
-for _ in range(config.num_feature_levels - num_backbone_outs):
-    decoder_input_proj_list.append(
-        nn.Sequential(
-            nn.Conv2d(in_channels, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
-            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
-        )
-    )
-    in_channels = config.d_model
+decoder_input_proj_list = []
+for in_channels in config.decoder_in_channels:
+    decoder_input_proj_list.append(
+        nn.Sequential(
+            nn.Conv2d(in_channels, config.d_model, kernel_size=1, bias=False),
+            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
+        )
+    )
+decoder_input_proj_list.append(
+    nn.Sequential(
+        nn.Conv2d(config.decoder_in_channels[-1], config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
+        nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
+    )
+)
+for _ in range(config.num_feature_levels - num_backbone_outs - 1):
+    decoder_input_proj_list.append(
+        nn.Sequential(
+            nn.Conv2d(config.d_model, config.d_model, kernel_size=3, stride=2, padding=1, bias=False),
+            nn.BatchNorm2d(config.d_model, config.batch_norm_eps),
+        )
+    )
```
@amyeroberts Unlike the case above, I think this is not always true when `config.num_feature_levels == num_backbone_outs`. Let me think it over and try to fix it.
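To make that edge case concrete, here is a dependency-free sketch of the channel plan the projection loop is meant to produce. The function name and the `(in_channels, out_channels, kernel_size, stride)` tuple layout are illustrative, not code from the PR:

```python
def decoder_input_proj_plan(decoder_in_channels, d_model, num_feature_levels):
    """Return an (in_channels, out_channels, kernel_size, stride) tuple per projection layer."""
    plan = []
    # one 1x1 conv per backbone output level
    for in_ch in decoder_in_channels:
        plan.append((in_ch, d_model, 1, 1))
    # extra feature levels (if any): 3x3 stride-2 convs; the first consumes the
    # last backbone level's channels, the rest consume d_model
    in_ch = decoder_in_channels[-1]
    for _ in range(num_feature_levels - len(decoder_in_channels)):
        plan.append((in_ch, d_model, 3, 2))
        in_ch = d_model
    return plan
```

When `num_feature_levels == len(decoder_in_channels)`, the second loop runs zero times, so no unconditional extra layer should be appended, which is exactly the case raised above.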
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Hi @SangbumChoi! Excited to see RT-DETR-V2 in Transformers, thanks for working on this! As most of the work is already done in this PR, happy to help if you have any questions!
Closing since there is another PR using the modular approach.
What does this PR do?
This PR adds code compatible with rtdetr-v2: https://github.com/lyuwenyu/RT-DETR/blob/main/rtdetrv2_pytorch/configs/rtdetrv2/rtdetrv2_r18vd_120e_coco.yml
At the moment I have only uploaded rtdetrv2_r18vd for testing, but while the code is under review I will upload the other model weights as well. https://huggingface.co/danelcsb/rtdetr_v2_r18vd/tree/main
@qubvel @amyeroberts

Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.