System Info
- transformers version: 5.7.0.dev0
- Platform: Linux-6.6.87.2-microsoft-standard-WSL2-x86_64-with-glibc2.39
- Python version: 3.12.3
- Huggingface_hub version: 1.7.1
- Safetensors version: 0.7.0
- Accelerate version: 1.13.0
- Accelerate config: not found
- DeepSpeed version: not installed
- PyTorch version (accelerator?): 2.8.0+cu128 (CUDA)
- GPU type: NVIDIA RTX 500 Ada Generation Laptop GPU
Who can help?
@VladOS95-cyber
@yonigozlan
Information
Tasks
Reproduction
Turn off denoising when loading the model, e.g.:
from transformers import DFineForObjectDetection
model = DFineForObjectDetection.from_pretrained("ustc-community/dfine-medium-obj365", num_denoising=0)
Expected behavior
Only the denoising losses should be turned off. However, DFineForObjectDetectionLoss checks only whether denoising_meta_values is set, and uses that single check to enable either all auxiliary losses or none of them. See here: https://github.com/huggingface/transformers/blob/main/src/transformers/loss/loss_d_fine.py#L339
RT-DETR, for example, handles this differently and keeps the other auxiliary losses even when denoising is off:
https://github.com/huggingface/transformers/blob/main/src/transformers/loss/loss_rt_detr.py#L454
I found that denoising can degrade validation performance on complex datasets, and that turning it off helps. But without the other auxiliary losses, D-FINE does not train properly.
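To make the difference concrete, here is a minimal sketch of the two gating patterns as I understand them. It is not the library code: the function names and the "main" / "auxiliary" / "denoising" labels are hypothetical and only model the control flow described above, with denoising_meta_values being None when denoising is off.

```python
def loss_groups_all_or_nothing(auxiliary_loss, denoising_meta_values):
    """Hypothetical model of the current D-FINE gating: auxiliary and
    denoising losses are enabled together or not at all."""
    groups = ["main"]
    if auxiliary_loss and denoising_meta_values is not None:
        groups += ["auxiliary", "denoising"]
    return groups


def loss_groups_independent(auxiliary_loss, denoising_meta_values):
    """Hypothetical model of the RT-DETR-style gating: auxiliary losses
    depend only on the config flag, denoising losses are added on top."""
    groups = ["main"]
    if auxiliary_loss:
        groups.append("auxiliary")
        if denoising_meta_values is not None:
            groups.append("denoising")
    return groups


# With denoising turned off (denoising_meta_values is None):
print(loss_groups_all_or_nothing(True, None))  # auxiliary losses are lost too
print(loss_groups_independent(True, None))     # auxiliary losses are kept
```

With denoising enabled, both variants return the same groups; they only diverge when denoising_meta_values is None, which is the case reported here.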