ArtifactWorld: Scaling 3D Gaussian Splatting Artifact Restoration via Video Generation Models
Sparse-view 3D Gaussian Splatting (3DGS) often suffers from geometric and photometric degradation. ArtifactWorld scales artifact restoration with systematic data expansion and a homogeneous dual-model paradigm on a shared video diffusion backbone: a shared-structure predictor first estimates an artifact heatmap; at restoration time, Artifact-Aware Triplet Fusion (AATF) and Decoupled Boundary Anchoring (DBA) then work together inside the native self-attention for intensity-guided spatiotemporal repair and improved 3D reconstruction. See the paper: arXiv:2604.12251.
This repository is the official ArtifactWorld codebase. It currently ships runnable two-stage inference (stages/), benchmark preprocessing, and release configs. Training data and training code will follow in a later release (see News).
- Released: Standardized benchmark data ArtifactWorld-Benchmark (`gt.tar` / `artifact.tar`).
- Released: Two-stage LoRA weights and AATF auxiliary latents consistent with the paper (Hugging Face weights, `weights.tar`), plus runnable inference code under `stages/`.
- Planned: Large-scale training data (the 107.5K generative-flywheel scale from the paper) and training code.
License: This repository’s code is licensed under the Apache License 2.0 (see LICENSE in the repo root). Third-party components under stages/ retain their own license files (e.g. LTX-Video–related notices).
| Paper concept | Default pipeline in this repo |
|---|---|
| Homogeneous dual model, shared latent space | Stage-1 and Stage-2 build on the same LTX-Video-Trainer–style stack and base version |
| Shared-structure predictor → artifact heatmap | Stage-1: predicts the artifact heatmap / noisemap from the reference video (`stage1_pred_headmap_lora.safetensors`) |
| AATF & DBA (jointly at restoration) | Stage-2: both operate in the same restoration forward pass; staged fusion on heatmaps and auxiliary latents implements AATF (e.g. `noisemap_blend_weights` in `configs/stage2_infer.yaml`); reference conditioning and related design reflect DBA (see paper Fig. 3); a toy sketch of the fusion follows this table |
| Inference preprocessing | tools/process_videos.py: merges benchmark gt / artifact pairs into reference mp4 inputs for Stage-1 and Stage-2 |
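As a mental model for the AATF row above, here is a toy Python sketch of intensity-gated fusion of the two auxiliary latents. It is illustrative only: the function name, shapes, and the simple convex blend are assumptions for exposition; the shipped logic lives in `stages/stage2` and is configured via `noisemap_blend_weights` in `configs/stage2_infer.yaml`.

```python
# Toy sketch of intensity-gated fusion in the spirit of AATF (illustrative
# only; the real logic lives in stages/stage2 and is configured through
# noisemap_blend_weights in configs/stage2_infer.yaml).
import torch

def aatf_blend(z_full: torch.Tensor,
               z_null: torch.Tensor,
               heatmap: torch.Tensor,
               weight: float) -> torch.Tensor:
    """Convex blend of auxiliary latents: lean on z_full where artifacts are strong."""
    alpha = weight * heatmap.clamp(0.0, 1.0)  # per-location intensity gate in [0, weight]
    return alpha * z_full + (1.0 - alpha) * z_null

# Dummy (B, C, T, H, W) latent grid; the 1-channel heatmap broadcasts over C.
z_full = torch.randn(1, 16, 8, 32, 32)
z_null = torch.randn(1, 16, 8, 32, 32)
heat = torch.rand(1, 1, 8, 32, 32)
print(aatf_blend(z_full, z_null, heat, weight=0.7).shape)  # torch.Size([1, 16, 8, 32, 32])
```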
- Benchmark (released): buaadwxl/ArtifactWorld-Benchmark — the curated 1.28K subset described on the project page and in the paper.
| Archive | Description |
|---|---|
| `gt.tar` | Ground-truth videos for the benchmark |
| `artifact.tar` | Artifact/degraded videos for the benchmark |
```bash
export HF_ENDPOINT=https://hf-mirror.com   # optional mirror
huggingface-cli download buaadwxl/ArtifactWorld-Benchmark --repo-type dataset --local-dir ./ArtifactWorld-Benchmark
# Extract gt.tar and artifact.tar, e.g. to ./data/gt and ./data/artifact
```

- Generative flywheel training data (107.5K in the paper): not shipped in this repo yet; status will be updated in News.
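Equivalently, the released benchmark archives can be fetched and unpacked programmatically with `huggingface_hub`; the target paths below are just examples:

```python
# Download the benchmark dataset and unpack both archives (paths are examples).
import tarfile
from huggingface_hub import snapshot_download

root = snapshot_download(
    "buaadwxl/ArtifactWorld-Benchmark",
    repo_type="dataset",
    local_dir="./ArtifactWorld-Benchmark",
)
for archive, dest in [("gt.tar", "./data/gt"), ("artifact.tar", "./data/artifact")]:
    with tarfile.open(f"{root}/{archive}") as tf:
        tf.extractall(dest)  # creates dest directories as needed
```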
- Hugging Face: buaadwxl/ArtifactWorld
| File | Description |
|---|---|
| `weights.tar` | Stage-1 / Stage-2 LoRA checkpoints and auxiliary latents |
Download and extract:
```bash
export HF_ENDPOINT=https://hf-mirror.com   # optional mirror
huggingface-cli download buaadwxl/ArtifactWorld --local-dir ./ArtifactWorld-weights
cd ./ArtifactWorld-weights && tar -xf weights.tar
```

By default, scripts read from the in-repo `weights/` directory. If you unpack elsewhere, set:

```bash
export ARTIFACTWORLD_WEIGHTS_ROOT=/path/to/your/weights
```

The weights root should contain the following (or adjust `ARTIFACTWORLD_WEIGHTS_ROOT` accordingly):
| File | Description |
|---|---|
| `stage1_pred_headmap_lora.safetensors` | Stage-1: artifact heatmap / noisemap prediction (LoRA1) |
| `stage2_noisemap_blend_lora.safetensors` | Stage-2: restoration with AATF and DBA applied jointly in the model (LoRA2) |
| `auxiliary_latents/z_full.pt` | Stage-2: AATF auxiliary latent (one fusion branch) |
| `auxiliary_latents/z_null.pt` | Stage-2: AATF auxiliary latent (the other fusion branch) |
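A quick way to confirm the layout matches the table above (hypothetical snippet, not part of the repo):

```python
# Verify the expected weights-root layout before running inference.
import os

root = os.environ.get("ARTIFACTWORLD_WEIGHTS_ROOT", "weights")
expected = [
    "stage1_pred_headmap_lora.safetensors",
    "stage2_noisemap_blend_lora.safetensors",
    "auxiliary_latents/z_full.pt",
    "auxiliary_latents/z_null.pt",
]
missing = [p for p in expected if not os.path.exists(os.path.join(root, p))]
print("weights root OK" if not missing else f"missing under {root}: {missing}")
```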
`stage1_infer.yaml` / `stage2_infer.yaml` expose explicit LTX base paths: `scheduler_source`, `tokenizer_source`, `text_encoder_source`, `vae_source`, `transformer_source`. The loader appends the standard subfolders (`tokenizer/`, `text_encoder/`, `vae/`, `scheduler/`, `transformer/`), so each field must point at a repository root or Hugging Face snapshot root, not at `.../tokenizer` or `.../transformer`; otherwise you will get doubled subfolders or missing sharded weights.
Convention:
| Fields | Root directory to use | Environment variable |
|---|---|---|
| `tokenizer_source`, `text_encoder_source`, `vae_source` | Root of Lightricks/LTX-Video (or a local snapshot with `tokenizer/`, `text_encoder/`, `vae/` directly underneath) | `ARTIFACTWORLD_LTX_MAIN_ROOT` |
| `scheduler_source`, `transformer_source` | Root of Lightricks/LTX-Video-0.9.7-dev (with `scheduler/`, `transformer/` underneath) | `ARTIFACTWORLD_LTX_097_DEV_ROOT` |
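To see why pointing a `*_source` field at a subfolder breaks loading, consider this illustration (assumed behavior, mirroring how diffusers-style loaders resolve `subfolder=`):

```python
# Each *_source must be a snapshot ROOT: the loader appends the subfolder itself.
import os

correct = "/path/to/LTX-Video-0.9.7-dev"            # repo/snapshot root
print(os.path.join(correct, "transformer"))         # .../transformer  (OK)

wrong = "/path/to/LTX-Video-0.9.7-dev/transformer"  # already a subfolder
print(os.path.join(wrong, "transformer"))           # .../transformer/transformer (doubled)
```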
```bash
# LTX-Video (tokenizer / text_encoder / vae)
export ARTIFACTWORLD_LTX_MAIN_ROOT=/path/to/LTX-Video
# LTX-Video-0.9.7-dev (scheduler + sharded transformer)
export ARTIFACTWORLD_LTX_097_DEV_ROOT=/path/to/LTX-Video-0.9.7-dev
```

Defaults if unset:

- `ArtifactWorld-main/weights/LTX-Video`
- `ArtifactWorld-main/weights/LTX-Video-0.9.7-dev` (clone the full repo; do not point only at `.../transformer` or you will stack `subfolder="transformer"` twice)
Environment: Use the same Python environment that has all dependencies installed (e.g. `conda activate ltx_video`, or that env's `python`). Otherwise you may see missing packages (e.g. `typer`).
Download LTX base weights before inference (a programmatic sketch follows this list):
- VAE + tokenizer + text encoder: Lightricks/LTX-Video
- Scheduler + LTX-13B transformer (sharded): Lightricks/LTX-Video-0.9.7-dev (clone the whole repo; no need to hand-split path levels)
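A programmatic equivalent of those two downloads with `huggingface_hub`, placing each snapshot at the default root the configs expect (local paths are examples):

```python
# Fetch both LTX base repos into the default roots used by the configs.
from huggingface_hub import snapshot_download

snapshot_download("Lightricks/LTX-Video",
                  local_dir="weights/LTX-Video")            # tokenizer / text_encoder / vae
snapshot_download("Lightricks/LTX-Video-0.9.7-dev",
                  local_dir="weights/LTX-Video-0.9.7-dev")  # scheduler + sharded transformer
```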
Create a Conda (or venv) environment with a Python version matching LTX-Video-Trainer (commonly 3.10). Example env name ltx_video:
```bash
conda create -n ltx_video python=3.10 -y
conda activate ltx_video
```

Then follow the upstream `docs/` for CUDA PyTorch and inference dependencies, or install from this repo's frozen snapshot `requirements.txt`. For Hugging Face downloads you may set `HF_ENDPOINT` (mirror) and `HF_HOME` (cache directory).
The root `requirements.txt` is a full dependency snapshot from the release `ltx_video` environment (editable install lines removed). `tools/process_videos.py` additionally requires `ffmpeg` on your system PATH.
```bash
cd /path/to/ArtifactWorld-main
pip install -r requirements.txt
```

```
ArtifactWorld-main/
├── LICENSE
├── README.md
├── requirements.txt
├── configs/
│ ├── stage1_infer.yaml # __AW_ROOT__ / __WEIGHTS_ROOT__ / __LTX_*__ expanded by scripts
│ └── stage2_infer.yaml
├── scripts/
│ ├── _common.sh
│ ├── run_stage1_infer.sh
│ ├── run_stage2_infer.sh
│ └── run_pipeline_example.sh
├── tools/
│ └── process_videos.py
├── stages/
│ ├── stage1/
│ │ ├── scripts/validate.py
│ │ └── src/ltxv_trainer/
│ └── stage2/
│ ├── scripts/validate.py
│ └── src/ltxv_trainer/
├── weights/ # default layout for LoRA + auxiliary_latents
│ ├── stage1_pred_headmap_lora.safetensors
│ ├── stage2_noisemap_blend_lora.safetensors
│ └── auxiliary_latents/
│ ├── z_full.pt
│ └── z_null.pt
├── workspace/ # optional intermediates
└── outputs/             # default run outputs (override with --output-folder)
```
Merge same-named gt and artifact mp4 files from the benchmark into Stage-1/2 reference inputs (GT first/last frames + artifact middle + tail alignment).
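Before the actual command, here is a toy frame-level sketch of the layout this tool produces (a simplification under assumed single-frame boundaries and naive tail alignment; the real `tools/process_videos.py` handles ffmpeg encoding and the details):

```python
# Toy view of the merged reference input:
# GT first frame + artifact middle + GT last frame.
def compose_reference(gt_frames: list, artifact_frames: list) -> list:
    n = min(len(gt_frames), len(artifact_frames))  # naive tail alignment
    return [gt_frames[0]] + artifact_frames[1:n - 1] + [gt_frames[n - 1]]

# Example with integer stand-ins for frames:
print(compose_reference(list(range(10)), list(range(100, 110))))
# [0, 101, 102, 103, 104, 105, 106, 107, 108, 9]
```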
```bash
python tools/process_videos.py \
  --gt-dir /path/to/gt \
  --artifact-dir /path/to/artifact \
  --output-dir /path/to/processed
```

Runs vendored code under `stages/stage1` by default (`LTX_STAGE1_ROOT` overrides):
```bash
conda activate ltx_video   # or your env name
CUDA_VISIBLE_DEVICES=0 ./scripts/run_stage1_infer.sh \
  --input-folder /path/to/processed \
  --output-folder "$(pwd)/outputs/stage1_run"
```

Videos are written to `<output-folder>/samples/` with the same basenames as the inputs.
Runs vendored code under `stages/stage2` by default (`LTX_STAGE2_ROOT` overrides):
```bash
conda activate ltx_video   # or your env name
CUDA_VISIBLE_DEVICES=0 ./scripts/run_stage2_infer.sh \
  --input-folder /path/to/processed \
  --noisemap-videos-folder "$(pwd)/outputs/stage1_run/samples" \
  --output-folder "$(pwd)/outputs/stage2_run"
```

Restored videos: `<stage2-output>/samples/`.
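Since Stage-2 pairs inputs with Stage-1 outputs by basename, a small pre-flight check can catch gaps (hypothetical helper; paths match the commands above):

```python
# Every processed input should have a same-named Stage-1 noisemap video.
from pathlib import Path

processed = {p.name for p in Path("/path/to/processed").glob("*.mp4")}
noisemaps = {p.name for p in Path("outputs/stage1_run/samples").glob("*.mp4")}
missing = sorted(processed - noisemaps)
print("missing noisemaps:", missing or "none")
```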
```bash
conda activate ltx_video   # or your env name
# export ARTIFACTWORLD_WEIGHTS_ROOT=/path/to/weights   # non-default layout
# export LTX_STAGE1_ROOT=/path/to/external-stage1      # optional
# export LTX_STAGE2_ROOT=/path/to/external-stage2      # optional
export GT_DIR=.../gt
export ARTIFACT_DIR=.../artifact
CUDA_VISIBLE_DEVICES=0 ./scripts/run_pipeline_example.sh
```

| Variable | Meaning |
|---|---|
| `ARTIFACTWORLD_WEIGHTS_ROOT` | Optional. Directory containing `stage*.safetensors` and `auxiliary_latents/`; default `ArtifactWorld-main/weights/` |
| `ARTIFACTWORLD_LTX_MAIN_ROOT` | Optional. Root of LTX-Video (tokenizer / text_encoder / vae); default `ArtifactWorld-main/weights/LTX-Video` |
| `ARTIFACTWORLD_LTX_097_DEV_ROOT` | Optional. Root of LTX-Video-0.9.7-dev (scheduler + transformer); default `ArtifactWorld-main/weights/LTX-Video-0.9.7-dev` |
| `LTX_STAGE1_ROOT` | Optional. Override the Stage-1 code root; default `ArtifactWorld-main/stages/stage1` |
| `LTX_STAGE2_ROOT` | Optional. Override the Stage-2 code root; default `ArtifactWorld-main/stages/stage2` |
| `CUDA_VISIBLE_DEVICES` | Optional GPU selection |
Launch scripts expand `__AW_ROOT__`, `__WEIGHTS_ROOT__`, `__LTX_MAIN_ROOT__`, and `__LTX_097_DEV_ROOT__` in `configs/*.yaml` to absolute paths.
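The substitution itself is plain string replacement; a Python equivalent of what the launch scripts do (illustrative only; the actual expansion lives in `scripts/_common.sh`, and the expanded output path here is hypothetical):

```python
# Expand config placeholders to absolute paths, as the launch scripts do.
import os
from pathlib import Path

repo = Path.cwd()  # ArtifactWorld-main checkout
subs = {
    "__AW_ROOT__": str(repo),
    "__WEIGHTS_ROOT__": os.environ.get("ARTIFACTWORLD_WEIGHTS_ROOT", str(repo / "weights")),
    "__LTX_MAIN_ROOT__": os.environ.get("ARTIFACTWORLD_LTX_MAIN_ROOT", str(repo / "weights/LTX-Video")),
    "__LTX_097_DEV_ROOT__": os.environ.get("ARTIFACTWORLD_LTX_097_DEV_ROOT", str(repo / "weights/LTX-Video-0.9.7-dev")),
}
text = (repo / "configs/stage1_infer.yaml").read_text()
for key, val in subs.items():
    text = text.replace(key, val)
(repo / "workspace/stage1_infer.expanded.yaml").write_text(text)  # hypothetical output
```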
```bibtex
@article{wang2026artifactworld,
  title   = {ArtifactWorld: Scaling 3D Gaussian Splatting Artifact Restoration via Video Generation Models},
  author  = {Wang, Xinliang and Shi, Yifeng and Wu, Zhenyu},
  journal = {arXiv preprint arXiv:2604.12251},
  year    = {2026},
}
```

This codebase builds upon LTX-Video-Trainer. We thank the authors for their open-source release.

