This fork replaces the original CLI workflow with a full-stack web application: a FastAPI backend exposing a job-based API, and a React + TypeScript frontend with interactive 3D viewers.
| Backend component | Technology |
|---|---|
| API server | FastAPI + Uvicorn |
| Job system | Python threading – daemon threads, in-memory job store |
| ML inference | segvigen package – lazy-loading segmenter classes with per-instance locking and automatic VRAM management |
| 3D rendering | Blender (headless, via bpy) – renders GLB conditioning views |
| Guidance pipeline | Gemini API – VLM describe step + image generation (flat-color segmentation map) |
| Background removal | rembg (isnet-general-use) – optional, injected into segmenters via a `remove_bg_fn` callback |
| 3D I/O | trimesh, o_voxel |
| Data validation | Pydantic v2 |
| Frontend component | Technology |
|---|---|
| Framework | React 18 + TypeScript |
| Build tool | Vite |
| Styling | Tailwind CSS v3 (custom dark theme) |
| 3D viewer | `<model-viewer>` (Google web component) |
| Icons | Lucide React |
| HTTP | Native fetch |
```
Browser (Vite dev :5173 / or FastAPI static)
        │  /api/*
        ▼
FastAPI :7860 (server.py)
├── POST /api/upload      → save to temp file, return path
├── GET  /api/files       → serve any file by path
├── POST /api/jobs/*      → spawn thread → return job_id
├── GET  /api/jobs/{id}   → poll status / result
│
├── segvigen/  (core ML package)
│   ├── FullSegmenter        → full.py
│   ├── FullGuidedSegmenter  → full_guided.py
│   ├── InteractiveSegmenter → interactive.py
│   ├── _shared.py   (model loading, voxel I/O, GLB export)
│   └── _samplers.py (flow-matching samplers, DiT wrappers)
│
└── util.py
    ├── remove_bg()  (rembg background removal)
    ├── split_glb_by_texture_palette_rgb()
    └── generate_guidance_map()  (Blender render → Gemini → PNG)
```
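The `POST /api/jobs/* → spawn thread → return job_id` path above can be sketched with nothing but the standard library. This is a minimal illustration of the daemon-thread, in-memory-store pattern, not the actual code in `server.py`; the `status`/`result` field names are assumptions:

```python
import threading
import uuid

jobs: dict[str, dict] = {}       # in-memory job store (process-local)
_jobs_lock = threading.Lock()    # guards concurrent access to the store

def submit(fn, *args) -> str:
    """Run `fn` in a daemon thread and return a job id the client can poll."""
    job_id = uuid.uuid4().hex
    with _jobs_lock:
        jobs[job_id] = {"status": "running"}

    def worker():
        try:
            result = fn(*args)
            with _jobs_lock:
                jobs[job_id] = {"status": "done", "result": result}
        except Exception as exc:  # surface failures to the poller
            with _jobs_lock:
                jobs[job_id] = {"status": "error", "error": str(exc)}

    threading.Thread(target=worker, daemon=True).start()
    return job_id

def poll(job_id: str) -> dict:
    """What GET /api/jobs/{id} would return in this sketch."""
    with _jobs_lock:
        return jobs.get(job_id, {"status": "unknown"})
```

Daemon threads die with the server process, which is consistent with an in-memory store: neither in-flight jobs nor their results survive a restart.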
Each segmenter class follows the same lifecycle:
- Lazy loading – models are downloaded and loaded on the first `run()` call
- Thread safety – a per-instance `threading.Lock` serializes concurrent requests
- VRAM management – sub-models are moved to CUDA one at a time during inference, then offloaded to CPU; `clear_vram()` runs in a `finally` block
- Temp file cleanup – intermediate `.vxz` and `.png` files are deleted after each run; the output `.glb` is left for the caller
- Weight caching – heavy model weights are cached globally in `_shared.py`, so multiple segmenter instances sharing the same checkpoint reuse the same weights
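The lifecycle above boils down to a small pattern. This stub replaces model loading and inference with placeholders; it shows only the lazy-load / per-instance-lock / `finally`-cleanup shape, not the real segmenter code:

```python
import threading

class SegmenterSketch:
    """Illustrative skeleton of the shared segmenter lifecycle."""

    def __init__(self, ckpt_path: str):
        self.ckpt_path = ckpt_path
        self._model = None               # lazy: nothing loaded yet
        self._lock = threading.Lock()    # per-instance, serializes run()
        self.vram_clears = 0             # instrumentation for this sketch

    def _load_model(self):
        # Real code downloads and loads checkpoint weights here; stubbed out.
        return ("loaded", self.ckpt_path)

    def clear_vram(self):
        # Real code offloads sub-models back to CPU; counted here instead.
        self.vram_clears += 1

    def run(self, image_path: str):
        with self._lock:                 # thread safety
            if self._model is None:      # lazy loading on first run()
                self._model = self._load_model()
            try:
                return f"parts({image_path})"  # inference placeholder
            finally:
                self.clear_vram()        # always runs, even on error
```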
- System: Linux
- GPU: NVIDIA GPU with at least 12 GB VRAM
- Python: 3.11 (managed by the install script – do not use 3.12+)
- CUDA: 12.x
- Conda: miniconda or anaconda
- Node.js: 18+ (for frontend development only)
The script handles everything: creates a segvigen conda env (Python 3.11), builds all TRELLIS.2 CUDA extensions, installs all dependencies, and downloads missing checkpoints automatically.
```
bash install.sh
```

What the script does, step by step:

- Clones TRELLIS.2 (skips if already present)
- Creates conda env `segvigen` with Python 3.11, installs PyTorch cu128
- Builds TRELLIS.2 CUDA extensions (`o_voxel`, `cumesh`, `flex_gemm`, `nvdiffrast`, `nvdiffrec`) – takes 30–60 min
- Installs SegviGen Python deps (including patched `mathutils`, `bpy 4.0.0`, `gradio 6.0.1`, `Pillow 10.x`)
- Installs system libs (`libsm6`, `libopenexr-dev`, etc.)
- Downloads any missing checkpoints from HuggingFace into `ckpt/`
Note on `mathutils`: the install script automatically patches the `mathutils 5.1.0` source to compile on Python 3.11 (fixes `PyLong_AsInt` and `_PyArg_CheckPositional` compatibility issues).
Note on Pillow: `gradio 6.0.1` requires `Pillow < 11` (`HAVE_WEBPANIM` was removed in Pillow 11). The script pins `Pillow>=10,<11` and removes `pillow-simd` if it was installed by TRELLIS.2.
If you already have the TRELLIS.2 CUDA prerequisites (`trellis2`, `o_voxel`, `torch`) installed in your environment, you can install just the `segvigen` Python package:
```
pip install git+https://github.com/maepopi/SegviGen-app
```

Note: this installs only the `segvigen` package and its PyPI dependencies. It does not build the CUDA extensions (`trellis2`, `o_voxel`, etc.) – those must already be present in your environment. If they are missing, `import segvigen` will still work for lightweight usage (e.g. accessing presets), but the segmenter classes will raise a clear `ImportError` when accessed.
Checkpoints are downloaded automatically by install.sh. If you need to download them manually:
```
conda activate segvigen
pip install huggingface_hub
python -c "
from huggingface_hub import hf_hub_download
import shutil, os
ckpt_dir = 'ckpt'
os.makedirs(ckpt_dir, exist_ok=True)
for f in ['interactive_seg.ckpt', 'full_seg.ckpt', 'full_seg_w_2d_map.ckpt']:
    shutil.copy(hf_hub_download('fenghora/SegviGen', f), os.path.join(ckpt_dir, f))
"
```

Expected layout:
```
ckpt/
├── interactive_seg.ckpt     – Interactive Part Segmentation
├── full_seg.ckpt            – Full Segmentation
└── full_seg_w_2d_map.ckpt   – Full Segmentation + 2D Guidance
```
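A quick way to verify the layout above after a manual download (the `missing_checkpoints` helper is illustrative, not part of the repo):

```python
import os

# Filenames from the expected ckpt/ layout above.
EXPECTED_CKPTS = [
    "interactive_seg.ckpt",     # Interactive Part Segmentation
    "full_seg.ckpt",            # Full Segmentation
    "full_seg_w_2d_map.ckpt",   # Full Segmentation + 2D Guidance
]

def missing_checkpoints(ckpt_dir: str = "ckpt") -> list[str]:
    """Return the expected checkpoint files not present in `ckpt_dir`."""
    return [
        name for name in EXPECTED_CKPTS
        if not os.path.isfile(os.path.join(ckpt_dir, name))
    ]
```

Run it from the repo root; an empty list means all three checkpoints are in place.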
The `static/` directory already contains a pre-built frontend. To rebuild from source:
```
cd frontend
npm install
npm run build   # outputs to ../static/
```

Run the server:

```
conda activate segvigen   # Python 3.11 env created by install.sh
uvicorn server:app --host 0.0.0.0 --port 7860
# → Open http://localhost:7860
```

Run both processes in separate terminals:
```
# Terminal 1 – backend
conda activate segvigen
uvicorn server:app --host 0.0.0.0 --port 7860 --reload

# Terminal 2 – frontend dev server
cd frontend
npm run dev
# → Open http://localhost:5173
```

Notable fixes in this fork:

- `ColorVisuals → TextureVisuals` conversion before voxelization – GLBs with vertex/flat colors now work correctly
- `SimpleMaterial → PBRMaterial` conversion before voxelization – no more `AssertionError` in o_voxel
- `pipeline.json` resolved via `huggingface_hub` instead of a hardcoded relative path
- Multiview guidance fix – the grid is used only for the VLM describe step; the segmentation image is generated from the single `transforms.json` main view to avoid out-of-distribution interference patterns
- Proper RGBA handling in `preprocess_image` – images with pre-applied alpha (e.g. user-supplied background-removed images) are used directly instead of being re-processed through background removal
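The RGBA fix can be illustrated with a small heuristic: if an uploaded image already carries a non-trivial alpha channel, treat it as background-removed and skip rembg. The function names and the threshold below are illustrative; the fork's actual `preprocess_image` may decide differently:

```python
import numpy as np
from PIL import Image

def has_usable_alpha(img: Image.Image) -> bool:
    """True if the image already has transparency worth preserving."""
    if img.mode != "RGBA":
        return False
    alpha = np.asarray(img)[..., 3]
    return bool(alpha.min() < 255)   # at least one transparent pixel

def choose_foreground(img: Image.Image, remove_bg_fn) -> Image.Image:
    """Use pre-applied alpha directly; otherwise fall back to removal."""
    return img if has_usable_alpha(img) else remove_bg_fn(img)
```

The point of the check is to avoid running rembg twice: re-segmenting an already-cutout image can eat into the foreground.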
- SegviGen – original research and codebase by Lin Li, Haoran Feng, Zehuan Huang, Haohua Chen, Wenbo Nie, Shaohua Hou, Keqing Fan, Pan Hu, Sheng Wang, Buyu Li, and Lu Sheng
- Dickoah – my dear friend and the main contributor to the 2D segmentation pipeline (guidance map generation)
Project Page | Paper | Online Demo
SegviGen is a framework for 3D part segmentation that leverages the rich 3D structural and textural knowledge encoded in large-scale 3D generative models. It learns to predict part-indicative colors while reconstructing geometry, and unifies three settings in one architecture: interactive part segmentation, full segmentation, and 2D segmentation map–guided full segmentation with arbitrary granularity.
```bibtex
@article{li2026segvigen,
  title   = {SegviGen: Repurposing 3D Generative Model for Part Segmentation},
  author  = {Lin Li and Haoran Feng and Zehuan Huang and Haohua Chen and Wenbo Nie and Shaohua Hou and Keqing Fan and Pan Hu and Sheng Wang and Buyu Li and Lu Sheng},
  journal = {arXiv preprint arXiv:2603.16869},
  year    = {2026}
}
```
