Prompt then Refine: Prompt-Free SAM-Enhanced Collaborative Learning Network for Detecting Salient Objects in Underwater Images

Data

The USOD10K training set can be downloaded from the publishers of USOD10K: "USOD10K: A New Benchmark Dataset for Underwater Salient Object Detection", https://github.com/LinHong-HIT/USOD10K.

SAM

The SAM fine-tuning framework is available on the release site of MDSAM, "Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection". Because of our limited hardware, SAM_B could only be trained at 256 x 256 and SAM_L at 224 x 224. If you have more capable hardware, you may be able to reproduce this code and obtain better weights, for example by training at a higher resolution or by starting from more powerful original SAM weights.
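As a rough illustration of the input sizes mentioned above, the sketch below maps each SAM variant to its training resolution and resizes an image or depth map to that square size. The names `TRAIN_SIZE` and `resize_nearest` are hypothetical and do not come from this repository, and nearest-neighbour sampling in NumPy is an assumption made to keep the example dependency-light; the actual training code may resize differently.

```python
import numpy as np

# Training resolutions reported in this README (square inputs).
TRAIN_SIZE = {"SAM_B": 256, "SAM_L": 224}


def resize_nearest(image: np.ndarray, size: int) -> np.ndarray:
    """Resize an H x W (x C) array to size x size with nearest-neighbour sampling.

    Hypothetical helper for illustration only, not part of the repository.
    """
    h, w = image.shape[:2]
    rows = np.arange(size) * h // size  # source row index for each output row
    cols = np.arange(size) * w // size  # source column index for each output column
    return image[rows[:, None], cols]


if __name__ == "__main__":
    rgb = np.zeros((480, 640, 3), dtype=np.uint8)
    print(resize_nearest(rgb, TRAIN_SIZE["SAM_B"]).shape)
```

If more memory is available, a larger value can be passed as `size`, which corresponds to the "higher resolution" option mentioned above.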

Essay

Abstract: RGB–depth underwater salient object detection (USOD) poses considerable challenges, such as uneven lighting, visual interference, and image blur, which limit the effectiveness of traditional approaches. The segment anything model (SAM), known for its robust segmentation capabilities, offers a promising alternative. However, SAM depends on prompt labels (e.g., points, boxes, masks) to perform effectively, and such resources are typically unavailable in USOD datasets. To address this, we propose SAM-CLNet, a prompt-free, SAM-enhanced collaborative learning network comprising three main components: (1) SAM, (2) a mask prompt generator (MPG), and (3) a region-aware attention collaborative learning loss (RCL). In our framework, pseudo-mask prompts generated by MPG were used as input prompts for SAM, helping to offset performance degradation due to the absence of manual labels. Simultaneously, RCL leveraged high-quality SAM predictions to refine MPG, enhancing its feature extraction while minimizing the impact of low-quality pseudo-prompts on SAM. This cyclic feedback mechanism facilitated mutual improvement in detection accuracy. In addition, we introduced a U-Adapter module to adapt SAM for underwater imagery and incorporated a frequency cross-attention fusion module in MPG to integrate RGB and depth information. The region-aware attention in RCL further targeted challenging regions by comparing SAM's predictions with MPG's pseudo-mask. Experiments on the USOD10K and USOD datasets demonstrated that SAM-CLNet outperformed existing methods and generalized effectively across five public salient object detection benchmarks.
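To make the region-aware idea in the abstract more concrete, here is a minimal sketch of a loss that weights pixels by how strongly the SAM prediction and the MPG pseudo-mask disagree. This is not the RCL implementation from this repository: the function name `region_aware_bce`, the weight formula `1 + |SAM - MPG|`, and the use of binary cross-entropy in NumPy are all assumptions chosen for illustration.

```python
import numpy as np


def region_aware_bce(mpg_prob, sam_prob, eps=1e-7):
    """Weighted binary cross-entropy pulling the MPG map towards the SAM map.

    Hypothetical sketch: pixels where the two maps disagree get a larger
    weight, so hard regions contribute more to the loss.
    """
    mpg_prob = np.clip(np.asarray(mpg_prob, dtype=np.float64), eps, 1.0 - eps)
    sam_prob = np.asarray(sam_prob, dtype=np.float64)

    # Assumed weighting: 1 where the maps agree, approaching 2 where they fully disagree.
    weight = 1.0 + np.abs(sam_prob - mpg_prob)

    # Per-pixel cross-entropy with the SAM prediction treated as the target.
    bce = -(sam_prob * np.log(mpg_prob) + (1.0 - sam_prob) * np.log(1.0 - mpg_prob))

    # Normalise by the total weight so the value stays comparable across images.
    return float((weight * bce).sum() / weight.sum())


if __name__ == "__main__":
    sam = [[1.0, 0.0]]
    print(region_aware_bce([[0.9, 0.1]], sam))  # maps mostly agree
    print(region_aware_bce([[0.1, 0.9]], sam))  # maps disagree
```

In a training loop the SAM prediction would normally be detached from the gradient graph so that only MPG is updated by this term; that detail is also an assumption, as the abstract does not specify it.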

The diagram of our model is as follows:

The results of the comparison with other methods are as follows:

Weights

The weights will be uploaded once the paper is accepted.

Environment

See requirements.txt for the environment configuration.

Files

RAAL.py
README.md
SAM_SOD.py
SAM_SOD_L.py
SAM_SOD_SAM.py
SAM_SOD_SOD.py
SAM_SOD_addMask.py
image_encoder_U_Adapter.py
mask_decoder_.py
prompt_encoder_.py
