We are the CDiscover team of the NTIRE 2026 Cross-Domain Few-Shot Object Detection (CD-FSOD) Challenge at the CVPR Workshop.
- ๐ Track:
open-source track - ๐๏ธ Award: 2nd Place
- ๐งฐ Method: GiPL-Grounding: Generative augmented iterative Pseudo-Labeling for Grounding
๐ NTIRE 2026 Official Website
๐ NTIRE 2026 Challenge Website
๐ CD-FSOD Challenge Repository
- [2026.3] ๐ Win the 2nd Place in the NTIRE 2026 CD-FSOD Challenge, CVPR2026.
This repository contains our solution for the open-source track of the NTIRE 2026 CD-FSOD Challenge.
We propose a method that integrates Qwen-based generative augmentation for GroundingDINO and GLIP-driven iterative pseudo-labeling, which achieves strong performance on the challenge.
Our method addresses the CDFSOD challenge through a domain-adaptive hybrid strategy.
For Dataset 2 (a multi-object natural domain with vehicles), we identify that only a single instance is labeled per image despite the presence of multiple targets. To prevent the model from treating unlabeled vehicles as background, we employ a GLIP-based pseudo-labeling pipeline to recover missing annotations, followed by iterative self-training to achieve dense and robust object localization. Refer to GLIP-Repository.
For Dataset 1&3, we leverage the Qwen to synthesize generative data, enriching the feature space for GroundingDINO. Implementation details are given as follows.
The experimental environment is based on mmdetection, the installation environment reference mmdetection's installation guide.
conda create --name ets python=3.8 -y
conda activate ets
cd ./mmdetection
pip3 install torch==1.10.0+cu113 torchvision==0.11.1+cu113 torchaudio==0.10.0+cu113 -f https://download.pytorch.org/whl/cu113/torch_stable.html
pip install -U openmim
mim install mmengine
mim install "mmcv>=2.0.0"
# Develop and run directly mmdet
pip install -v -e .
pip install -r requirements/multimodal.txt
pip install emoji ddd-dataset
pip install git+https://github.com/lvis-dataset/lvis-api.gitThen download the BERT weights bert-base-uncased into the weights directory,
cd CDiscover/
huggingface-cli download --resume-download google-bert/bert-base-uncased --local-dir weights/bert-base-uncasedPlease follow the instructions in the official CD-FSOD repo to download and prepare the dataset.
.
โโโ configs
โโโ data
โโโ LICENSE
โโโ mmdetection
โโโ pkl2coco.py
โโโ pkls
โโโ README.md
โโโ submit
โโโ submit_codalab
โโโ weightsMix Image Augmentation Config
train_pipeline = [
dict(type='LoadImageFromFile', backend_args=backend_args),
dict(type='LoadAnnotations', with_bbox=True),
dict(type='CachedMosaic', img_scale=(640, 640), pad_val=114.0, prob=0.6),
# dict(type='CopyPaste', max_num_pasted=5, paste_by_box=True), # ๆทปๅ CopyPaste ๆฐๆฎๅขๅผบ
dict(type='YOLOXHSVRandomAug'),
dict(type='RandomFlip', prob=0.5),
dict(
type='CachedMixUp',
img_scale=(640, 640),
ratio_range=(1.0, 1.0),
max_cached_images=10,
pad_val=(114, 114, 114),
prob = 0.3),
dict(
type='RandomChoice',
transforms=[
[
dict(
type='RandomChoiceResize',
scales=[(480, 1333), (512, 1333), (544, 1333), (576, 1333),
(608, 1333), (640, 1333), (672, 1333), (704, 1333),
(736, 1333), (768, 1333), (800, 1333)],
keep_ratio=True)
],
[
dict(
type='RandomChoiceResize',
# The radio of all image in train dataset < 7
# follow the original implement
scales=[(400, 4200), (500, 4200), (600, 4200)],
keep_ratio=True),
dict(
type='RandomCrop',
crop_type='absolute_range',
crop_size=(384, 600),
allow_negative_crop=True),
dict(
type='RandomChoiceResize',
scales=[(480, 1333), (512, 1333), (544, 1333), (576, 1333),
(608, 1333), (640, 1333), (672, 1333), (704, 1333),
(736, 1333), (768, 1333), (800, 1333)],
keep_ratio=True)
]
]),
# dict(type='RandomErasing', n_patches=(0,2), ratio=0.3, img_border_value=128, bbox_erased_thr=0.9),
dict(
type='PackDetInputs',
meta_keys=('img_id', 'img_path', 'ori_shape', 'img_shape',
'scale_factor', 'flip', 'flip_direction', 'text',
'custom_entities'))
]To train the model:
Experiments were carried out on on a single NVIDIA A6000 GPU for Datasets 1 and 3.
cd ./mmdetection
./tools/dist_train_muti.sh configs/grounding_dino/CDFSOD/GroudingDINO-few-shot-SwinB.py "0" 1use sampling4val.py for sampling test set for validation set.
use sata_logs for search to get best model parameter from train logs.
pretrained model:
Download the checkpoint files to dir ./weights.
Baidu Disk: [link] or ้่ฟ็ฝ็ๅไบซ็ๆไปถ๏ผweights ้พๆฅ: https://pan.baidu.com/s/1r5H8TsV1qgwuWPuvc1EJZQ?pwd=hg8e ๆๅ็ : hg8e --ๆฅ่ช็พๅบฆ็ฝ็่ถ ็บงไผๅv5็ๅไบซ
Run evaluation:
cd ./mmdetection
bash tools/dist_test.sh configs/grounding_dino/CDFSOD/GroudingDINO-few-shot-SwinB.py /path/to/model/ 4Run inference:
Save to *.pkl file and convert to submit .json format.
cd ./mmdetection
## 1-shot-dataset1
bash tools/dist_test_out.sh ../configs/1-shot-dataset1.py ../weights/1-shot-dataset1-db4c5ebf.pth 1 ../pkls/dataset1_1shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset1/annotations/test.json --pkl_file ../pkls/dataset1_1shot.pkl --output_json ../pkls/dataset1_1shot_coco.json --annotations_json ../submit/dataset1_1shot.json
## 1-shot-dataset2
bash tools/dist_test_out.sh ../configs/1-shot-dataset2.py ../weights/1-shot-dataset2-0bd5d280.pth 1 ../pkls/dataset2_1shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset2/annotations/test.json --pkl_file ../pkls/dataset2_1shot.pkl --output_json ../pkls/dataset2_1shot_coco.json --annotations_json ../submit/dataset2_1shot.json
## 1-shot-dataset3
bash tools/dist_test_out.sh ../configs/1-shot-dataset3.py ../weights/1-shot-dataset3-433149f8.pth 1 ../pkls/dataset3_1shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset3/annotations/test.json --pkl_file ../pkls/dataset3_1shot.pkl --output_json ../pkls/dataset3_1shot_coco.json --annotations_json ../submit/dataset3_1shot.json
## 5-shot-dataset1
bash tools/dist_test_out.sh ../configs/5-shot-dataset1.py ../weights/5-shot-dataset1-ad2ac5f0.pth 1 ../pkls/dataset1_5shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset1/annotations/test.json --pkl_file ../pkls/dataset1_5shot.pkl --output_json ../pkls/dataset1_5shot_coco.json --annotations_json ../submit/dataset1_5shot.json
## 5-shot-dataset2
bash tools/dist_test_out.sh ../configs/5-shot-dataset2.py ../weights/5-shot-dataset2-0bfccba8.pth 1 ../pkls/dataset2_5shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset2/annotations/test.json --pkl_file ../pkls/dataset2_5shot.pkl --output_json ../pkls/dataset2_5shot_coco.json --annotations_json ../submit/dataset2_5shot.json
## 5-shot-dataset3
bash tools/dist_test_out.sh ../configs/5-shot-dataset3.py ../weights/5-shot-dataset3-0011f4b1.pth 1 ../pkls/dataset3_5shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset3/annotations/test.json --pkl_file ../pkls/dataset3_5shot.pkl --output_json ../pkls/dataset3_5shot_coco.json --annotations_json ../submit/dataset3_5shot.json
## 10-shot-dataset1
bash tools/dist_test_out.sh ../configs/10-shot-dataset1.py ../weights/10-shot-dataset1-33caf03b.pth 1 ../pkls/dataset1_10shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset1/annotations/test.json --pkl_file ../pkls/dataset1_10shot.pkl --output_json ../pkls/dataset1_10shot_coco.json --annotations_json ../submit/dataset1_10shot.json
## 10-shot-dataset2
bash tools/dist_test_out.sh ../configs/10-shot-dataset2.py ../weights/10-shot-dataset2-46b5584c.pth 1 ../pkls/dataset2_10shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset2/annotations/test.json --pkl_file ../pkls/dataset2_10shot.pkl --output_json ../pkls/dataset2_10shot_coco.json --annotations_json ../submit/dataset2_10shot.json
## 10-shot-dataset3
bash tools/dist_test_out.sh ../configs/10-shot-dataset3.py ../weights/10-shot-dataset3-7325994e.pth 1 ../pkls/dataset3_10shot.pkl
python ../pkl2coco.py --coco_file ../data/dataset3/annotations/test.json --pkl_file ../pkls/dataset3_10shot.pkl --output_json ../pkls/dataset3_10shot_coco.json --annotations_json ../submit/dataset3_10shot.jsonProvided codes were adapted from:

