ReME

Xiwei Xuan, Ziquan Deng, and Kwan-Liu Ma

[ICCV'25] ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Introduction

Training-free open-vocabulary semantic segmentation (OVS) aims to segment images given a set of arbitrary textual categories without costly model fine-tuning. Existing solutions often explore attention mechanisms of pre-trained models, such as CLIP, or generate synthetic data and design complex retrieval processes to perform OVS. However, their performance is limited by the capabilities of the models they rely on or by the suboptimal quality of their reference sets. In this work, we investigate the largely overlooked problem of data quality for this challenging dense scene understanding task, and identify that a high-quality reference set can significantly benefit training-free OVS. With this observation, we introduce a data-quality-oriented framework, comprising a data pipeline that constructs a reference set with well-paired segment-text embeddings and a simple similarity-based retrieval that unveils the essential effect of data. Remarkably, extensive evaluations on ten benchmark datasets demonstrate that our method outperforms all existing training-free OVS approaches, highlighting the importance of data-centric design for advancing OVS without training.
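To illustrate the idea of similarity-based retrieval over a reference set of paired segment-text embeddings, here is a toy nearest-neighbor sketch. This is a minimal illustration of the retrieval concept, not the paper's implementation; the embeddings, labels, and helper names are made up for the example.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve_label(segment_emb, reference_set):
    """Assign the label of the most similar reference embedding
    to a query segment embedding (nearest-neighbor retrieval)."""
    return max(reference_set, key=lambda item: cosine(segment_emb, item[0]))[1]

# Toy reference set: (embedding, label) pairs standing in for
# well-paired segment-text embeddings.
reference = [
    ([1.0, 0.0, 0.1], "cat"),
    ([0.0, 1.0, 0.1], "grass"),
    ([0.1, 0.1, 1.0], "sky"),
]

print(retrieve_label([0.9, 0.1, 0.0], reference))  # closest to "cat"
```

In the actual framework, each query segment would be embedded by a pre-trained encoder and matched against the curated reference set; the toy vectors above merely stand in for those embeddings.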

Installation

Requirements

  • Linux with Python ≥ 3.10
  • PyTorch ≥ 2.5.1 and a torchvision version that matches the PyTorch installation. Install them together by following the instructions at pytorch.org to ensure compatibility. An example installation is shown below:
git clone https://github.com/xiweix/ReME.git
cd ReME
conda create -n reme python=3.10 -y
conda activate reme
conda install pip
bash install.sh

Data Preparation

This document explains how to download and organize datasets.

We refer to detectron2.data for data preparation. Please follow the Detectron2 installation instructions to install Detectron2. Note that, with Detectron2, a dataset can be used by accessing DatasetCatalog for its data, or MetadataCatalog for its metadata (class names, etc.). The Use Custom Datasets tutorial gives a deeper dive on how to use DatasetCatalog and MetadataCatalog, and how to add new datasets to them.

The datasets are assumed to exist in a directory specified by the environment variable DETECTRON2_DATASETS. You can set the location with export DETECTRON2_DATASETS=/path/to/datasets. If left unset, the default is ./datasets relative to your current working directory. If Detectron2 is not used, modify dataset_dir in scripts/datasets/prepare*.py to point to the dataset path. We expect datasets in the structure described below.
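The root-resolution behavior described above can be sketched as follows. This is a minimal illustration of the env-var-with-fallback convention, not Detectron2's actual code; dataset_root is a hypothetical helper.

```python
import os

def dataset_root():
    """Resolve the dataset root as described above:
    use DETECTRON2_DATASETS if set, else fall back to ./datasets."""
    return os.environ.get("DETECTRON2_DATASETS", "./datasets")

os.environ["DETECTRON2_DATASETS"] = "/path/to/datasets"
print(dataset_root())  # /path/to/datasets
os.environ.pop("DETECTRON2_DATASETS")
print(dataset_root())  # ./datasets
```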

data/                     # Specify this location by DETECTRON2_DATASETS or dataset_dir
  coco_stuff164k/         # COCO-Stuff
    coco_object/          # COCO-Object
  ADEChallengeData2016/   # ADE20K-150
  ADE20K_2021_17_01/      # ADE20K-847
  VOCdevkit/ 
    VOC2010/              # PASCAL Context
    VOC2012/              # PASCAL VOC
  cityscapes/             # Cityscapes
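If you want to lay out this skeleton up front before downloading, a small pathlib sketch can create it under the configured root. The directory list is transcribed from the tree above; make_skeleton is an illustrative helper, not part of the repo.

```python
import os
from pathlib import Path

# Top-level dataset directories, copied from the tree above.
EXPECTED_DIRS = [
    "coco_stuff164k/coco_object",
    "ADEChallengeData2016",
    "ADE20K_2021_17_01",
    "VOCdevkit/VOC2010",
    "VOCdevkit/VOC2012",
    "cityscapes",
]

def make_skeleton(root=None):
    """Create the expected dataset skeleton under the configured root."""
    root = Path(root or os.environ.get("DETECTRON2_DATASETS", "./datasets"))
    for d in EXPECTED_DIRS:
        (root / d).mkdir(parents=True, exist_ok=True)
    return root

root = make_skeleton("/tmp/reme_datasets")
print(sorted(p.name for p in root.iterdir()))
```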

Prepare data for COCO-Stuff:

Expected data structure

coco_stuff164k/
  annotations/
    val2017/
  images/
    train2017/   #### For train split, only images are needed for curating the reference set
    val2017/
  # below are generated by prepare_coco_stuff.py
  annotations_detectron2/
    val2017/ 

Download the COCO (2017) images from https://cocodataset.org/

wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip

Download the COCO-Stuff annotation from https://github.com/nightrome/cocostuff.

wget http://calvin.inf.ed.ac.uk/wp-content/uploads/data/cocostuffdataset/stuffthingmaps_trainval2017.zip

Unzip val2017.zip and stuffthingmaps_trainval2017.zip. Then put them in the correct locations listed above and generate the labels for testing:

python datasets/prepare_coco_stuff.py
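The unzip-and-place step can also be scripted; below is a sketch using Python's standard zipfile module. extract_to is an illustrative helper, not part of the repo, and the commented-out zip names and target paths are assumptions that follow the wget commands and directory tree above.

```python
import os
import zipfile

def extract_to(zip_path, dest_dir):
    """Extract a zip archive into dest_dir, creating it if needed."""
    os.makedirs(dest_dir, exist_ok=True)
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)

# Assumed usage, matching the expected COCO-Stuff structure above:
# extract_to("val2017.zip", "coco_stuff164k/images")
# extract_to("stuffthingmaps_trainval2017.zip", "coco_stuff164k/annotations")
```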

Prepare data for ADE20K-150:

Expected data structure

ADEChallengeData2016/
  annotations/
    validation/
  images/
    validation/
  # below are generated by prepare_ade20k_150.py
  annotations_detectron2/
    validation/

Download the data of ADE20K-150 from http://sceneparsing.csail.mit.edu.

wget http://data.csail.mit.edu/places/ADEchallenge/ADEChallengeData2016.zip

Unzip ADEChallengeData2016.zip and generate the labels for testing.

python datasets/prepare_ade20k_150.py

Prepare data for ADE20K-847:

Expected data structure

ADE20K_2021_17_01/
  images/
    ADE/
      validation/
  index_ade20k.mat
  index_ade20k.pkl
  # below are generated by prepare_ade20k_847.py
  annotations_detectron2/
    validation/

Download the ADE20K-Full data from https://groups.csail.mit.edu/vision/datasets/ADE20K/request_data/. Unzip the dataset and generate the labels for testing.

python datasets/prepare_ade20k_847.py

Prepare data for PASCAL VOC 2012:

Expected data structure

VOCdevkit/
  VOC2012/
    Annotations/
    ImageSets/
    JPEGImages/
    SegmentationClass/
    SegmentationClassAug/ 
    SegmentationObject/
    # below are generated by prepare_voc.py
    annotations_detectron2
    annotations_detectron2_bg

Download the data of PASCAL VOC from http://host.robots.ox.ac.uk/pascal/VOC/voc2012/#devkit.

We use the SBD augmented training data as SegmentationClassAug, following DeepLab.

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
wget https://www.dropbox.com/s/oeu149j8qtbs1x0/SegmentationClassAug.zip

Unzip VOCtrainval_11-May-2012.tar and SegmentationClassAug.zip. Then put them in the correct locations listed above and generate the labels for testing:

python datasets/prepare_voc.py

Prepare data for PASCAL Context:

Expected data structure

VOCdevkit/
  VOC2010/
    Annotations/
    ImageSets/
    JPEGImages/
    SegmentationClass/
    SegmentationObject/
    trainval/
    labels.txt
    pascalcontext_val.txt
    trainval_merged.json
    # below are generated by prepare_pascal_context_59.py and prepare_pascal_context_459.py
    annotations_detectron2/
      pc459_val
      pc59_val

Download the data of PASCAL VOC 2010 from https://www.cs.stanford.edu/~roozbeh/pascal-context/.

wget http://host.robots.ox.ac.uk/pascal/VOC/voc2010/VOCtrainval_03-May-2010.tar

Download the annotations for the 59-class and 459-class settings.

wget https://codalabuser.blob.core.windows.net/public/trainval_merged.json
wget https://roozbehm.info/pascal-context/trainval.tar.gz

Unzip VOCtrainval_03-May-2010.tar and trainval.tar.gz. Then put them in the correct locations listed above and generate the labels for testing:

python datasets/prepare_pascal_context_59.py
python datasets/prepare_pascal_context_459.py

Prepare data for Cityscapes:

Expected data structure

cityscapes/
  leftImg8bit/
    val/
  gtFine/
    val/

Follow https://www.cityscapes-dataset.com/downloads/ to download the data and annotations; registration is required. Download leftImg8bit_trainvaltest.zip and gtFine_trainvaltest.zip, then unzip them and place the validation splits in the locations listed above.
