Code for our GRSL 2025 paper "Multimodal-Aware Fusion Network for Referring Remote Sensing Image Segmentation".
Contributed by Leideng Shi, Juan Zhang*.
Install the dependencies.
The code was tested on Ubuntu 20.04.6, with Python 3.7 and PyTorch v1.12.1.
Clone this repository.
```
git clone https://github.com/Roaxy/MAFN.git
```
Create a new Conda environment with Python 3.7 then activate it:
```
conda create -n MAFN python==3.7
conda activate MAFN
```
Install PyTorch v1.12.1 (CUDA 10.2 is used in this example).
```
conda install pytorch==1.12.1 torchvision==0.13.1 torchaudio==0.12.1 cudatoolkit=10.2 -c pytorch
```
Install the requirements.
```
pip install -r requirements.txt
```
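Before moving on, you can confirm the interpreter meets the tested Python version with a quick check (a hypothetical helper, not part of this repo):

```python
import sys

# Hypothetical sanity check: the code was tested with Python 3.7.
def meets_min_version(actual, required):
    """True if the version tuple `actual` is at least `required`."""
    return tuple(actual) >= tuple(required)

meets_min_version(sys.version_info[:3], (3, 7, 0))
```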
The RRSIS-D dataset can be downloaded from Google Drive or Baidu Netdisk; then follow the RMSIN dataset instructions. Organize the files into the following directory structure:
```
$DATA_PATH
├── rrsisd
│   ├── refs(unc).p
│   └── instances.json
└── images
    └── rrsisd
        ├── JPEGImages
        └── ann_split
```
After downloading, unpack the archives into the data directory following the structure above.
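As a quick sanity check, a small helper (hypothetical, not part of the repo) can verify that the layout above is in place before training:

```python
import os

# Relative paths required by the RRSIS-D layout shown above.
REQUIRED = [
    "rrsisd/refs(unc).p",
    "rrsisd/instances.json",
    "images/rrsisd/JPEGImages",
    "images/rrsisd/ann_split",
]

def missing_entries(data_path):
    """Return the required entries that do not exist under data_path."""
    return [rel for rel in REQUIRED
            if not os.path.exists(os.path.join(data_path, rel))]
```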
Download the pre-trained classification weights of the Swin Transformer to initialize the model, and put the .pth file in ./pretrained_weights.
Download the pre-trained bert-base-uncased weights of BERT to initialize the model, and put the files in ./MAFN/bert-base-uncased.
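A minimal sketch to check the BERT directory, assuming the standard Hugging Face file layout for bert-base-uncased (the exact file set can vary by download source):

```python
import os

# Assumed standard Hugging Face files for bert-base-uncased.
BERT_FILES = ("config.json", "vocab.txt", "pytorch_model.bin")

def bert_dir_ready(bert_dir):
    """True if all expected BERT files are present in bert_dir."""
    return all(os.path.isfile(os.path.join(bert_dir, name))
               for name in BERT_FILES)
```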
After the preparation, you can start training with the following command. We use PyTorch's DistributedDataParallel for training. To run on 2 GPUs (with IDs 0 and 1) on a single node:
```
sh ./train.sh
```
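For reference, each worker process in a DistributedDataParallel run can read its identity from the standard environment variables set by PyTorch's launcher (torchrun / torch.distributed.launch); the helper name below is illustrative, not from the repo:

```python
import os

# RANK and WORLD_SIZE are the standard PyTorch distributed env vars.
def dist_info(env=None):
    """Return (rank, world_size) as set by the launcher, with single-process defaults."""
    env = os.environ if env is None else env
    return int(env.get("RANK", 0)), int(env.get("WORLD_SIZE", 1))
```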
For testing, run:
```
# default setting
sh ./test.sh
```
You may modify the code at test.sh:11 to use the val split instead of test. By default, the split is set to test.
| Method | P@0.5 | P@0.6 | P@0.7 | P@0.8 | P@0.9 | oIoU | mIoU | Download |
|---|---|---|---|---|---|---|---|---|
| MAFN | 76.32 | 69.31 | 58.33 | 44.54 | 24.71 | 78.33 | 66.03 | model |
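The table's metrics can be read as follows: P@X is the fraction of test samples whose predicted mask reaches IoU ≥ X with the ground truth, mIoU averages per-sample IoU, and oIoU divides the total intersection by the total union over the whole set. Illustrative definitions (not the repo's evaluation code):

```python
def precision_at(ious, thr):
    """P@thr: fraction of samples with IoU at or above the threshold."""
    return sum(iou >= thr for iou in ious) / len(ious)

def miou(ious):
    """Mean IoU over all samples."""
    return sum(ious) / len(ious)

def oiou(intersections, unions):
    """Overall IoU: total intersection area over total union area."""
    return sum(intersections) / sum(unions)
```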
Code in this repository is built on RMSIN. We'd like to thank the authors for open-sourcing their project.
