This repository contains the implementation for Place Anywhere: Learning Spatial Reasoning for Occlusion-aware Image Composition. The project focuses on automatic object composition using datasets such as BSDS-A and leverages methods like pix2gestalt for foreground generation.
The code has been developed and tested on the following high-performance computing environment:
- OS: Ubuntu 22.04
- CPU: AMD EPYC 7763 @ 2.45GHz (256 cores)
- RAM: 512GB
- GPU: 3x NVIDIA Tesla A100
- IP: 192.168.20.10 (Internal)
- Python: 3.8.8
- PyTorch: 1.12.1+cu113
- Torchaudio: 0.12.1+cu113
- Torchvision: 0.13.1+cu113
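Before training, it can help to confirm that the installed packages match the versions listed above. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and a mismatch does not necessarily mean the code will fail, only that the setup differs from the tested one.

```python
import sys
from importlib import metadata

# Versions copied from the tested environment listed above.
EXPECTED = {
    "torch": "1.12.1+cu113",
    "torchaudio": "0.12.1+cu113",
    "torchvision": "0.13.1+cu113",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs to an (expected, found) pair."""
    mismatches = {}
    for package, wanted in expected.items():
        found = lookup(package)
        if found != wanted:
            mismatches[package] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    print("Python", sys.version.split()[0])
    for package, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{package}: expected {wanted}, found {found or 'not installed'}")
```

Run it inside the activated environment; no output after the Python version line means all three packages match.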
To install the necessary dependencies, please run:
conda create -n autocomp python=3.8.8
conda activate autocomp
pip install -r requirements.txt
- Download Source Data: First, download the BSDS-A dataset.
- Generate Masks and Foregrounds: This project prepares the data in two steps.
Step 1: Extract basic masks and foregrounds
python getForeBackMask_from_BSDSA.py
This script extracts the partially complete foregrounds and two types of masks.
Step 2: Generate full foregrounds using pix2gestalt
Note: Activate the pix2gestalt environment before running this step.
conda activate pix2gestalt
python getFullFore_by_pix2gestalt.py
This script uses the pix2gestalt method to obtain the full foregrounds.
- Directory Structure: After preparation, your directory structure should look like this:
autocomp_master
└── dataset/
    ├── BSDS/
    ├── BSDSAandCOCOA/
    └── forComp_dataset/
        ├── foreground/
        ├── objmask_amodal/
        ├── objmask_visiable/
        ├── amodalmask.txt
        ├── foreground_from_generate.txt
        ├── foreground.txt
        └── vismask.txt
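To check that data preparation produced the layout shown above, a small script can report any missing entries. This is a sketch only: the function name `missing_entries` is hypothetical, the paths are copied from the tree above, and it tests only for existence, not for the contents of the files.

```python
from pathlib import Path

# Entries taken from the directory tree above, relative to the project root.
EXPECTED_ENTRIES = [
    "dataset/BSDS",
    "dataset/BSDSAandCOCOA",
    "dataset/forComp_dataset/foreground",
    "dataset/forComp_dataset/objmask_amodal",
    "dataset/forComp_dataset/objmask_visiable",
    "dataset/forComp_dataset/amodalmask.txt",
    "dataset/forComp_dataset/foreground_from_generate.txt",
    "dataset/forComp_dataset/foreground.txt",
    "dataset/forComp_dataset/vismask.txt",
]


def missing_entries(root):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    # Assumes the script is run from the directory containing autocomp_master.
    missing = missing_entries("autocomp_master")
    if missing:
        print("Missing entries:")
        for entry in missing:
            print("  " + entry)
    else:
        print("Directory structure looks complete.")
```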
- Dataset Splits: Please ensure you select the correct dataset configuration in the code/args.
To train the model, navigate to the project root and run the training script:
cd autoComposition/autocomp_master
python train.py
To evaluate the model on the test dataset:
python test.py
We provide a web-based interface for easy inference. Start the inference server:
python inference.py
Open the URL displayed in your terminal (e.g., http://127.0.0.1:7860) in your web browser.
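If the inference server is started from a script or on a remote machine, it may be useful to wait until it responds before opening the browser. The following sketch polls a URL using only the standard library; `wait_for_server` is a hypothetical helper, and the address is the example URL from above, so substitute whatever your terminal actually prints.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_server(url, timeout=30.0, interval=1.0):
    """Poll `url` until it answers over HTTP or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status code.
            return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            # Not reachable yet (connection refused, timeout, and so on).
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

For example, `wait_for_server("http://127.0.0.1:7860")` returns `True` once the interface is reachable and `False` if nothing answers within 30 seconds.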