[ Paper | Data | Model ]
In this work, we introduce:
- HydraFake Dataset: a deepfake detection dataset with rigorous training and evaluation protocols.
- Veritas Model: a reasoning model that achieves remarkable generalization on OOD forgeries and provides a transparent, human-aligned decision process.
- 🔥 2026.2.10 We release our new work VideoVeritas for AI-generated video detection 🔥🔥🔥! Dataset and model are released. [ Paper | Data | Model ]
- 🔥 2026.2.6 The training data is released. Veritas and Veritas-Cold-Start are both released. We recommend using Veritas-Cold-Start to customize your own detector (see below for more details).
- 🔥 2026.2.6 Veritas is selected as an ICLR 2026 Oral.
- 🔥 2026.1.26 Veritas has been accepted to ICLR 2026.
- 🔥 2025.9.17 We release the inference code for MLLMs and small vision models.
- 🔥 2025.9.17 We release the HydraFake dataset (train/val/test).
```shell
conda create -n veritas python=3.10
conda activate veritas
# Install the dependencies
pip install -e .
```

Download the training data here, including `sft_36k.json`, `mipo_3k.json`, and `pgrpo_8k.json`.
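As a quick sanity check on the downloaded files, a sketch like the following counts the records in each one. It assumes only that each file is a JSON list; the `datasets/` directory is a guess, so adjust the path to wherever you saved the files.

```python
import json
from pathlib import Path

def count_records(path):
    """Return the number of records in a JSON-list training file."""
    with open(path) as f:
        return len(json.load(f))

# Adjust the directory to wherever you saved the files.
for name in ("sft_36k.json", "mipo_3k.json", "pgrpo_8k.json"):
    path = Path("datasets") / name
    if path.exists():
        print(f"{name}: {count_records(path)} records")
```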
```shell
sh self_scripts/train/train_sft.sh sft sft_36k
sh self_scripts/train/train_mipo.sh mipo mipo_3k
```

1. Deploy the reward model. Download the reward model here, and replace the path in `swift/plugin/prm.py` and `self_scripts/deploy/deploy_reward_model.sh`.
Note: the choice of reward model is flexible. More powerful models may lead to better performance, e.g., UnifiedReward-qwen-7B, Qwen3-VL-8B or UnifiedReward-2.0-qwen3vl-8B.
```shell
sh self_scripts/deploy/deploy_reward_model.sh
```

2. P-GRPO training:
```shell
sh self_scripts/train/train_pgrpo.sh pgrpo pgrpo_8k
```

We recommend using Veritas-Cold-Start + P-GRPO for further customization:
- Veritas is fine-tuned on in-domain datasets, in which the latest generative model is SD-XL. As mentioned in our paper, for practical usage you can further fine-tune Veritas-Cold-Start on your own collected data.
- If you adopt our P-GRPO, the only thing you need to prepare is the binary labels of your data. Arrange them similarly to our `pgrpo_8k.json`.
- We also encourage the development of (1) novel GRPO-style algorithms, e.g., involving grounded reward design to deliver more precise cross-modal signals, and (2) collaborative frameworks with small vision models, which could be exciting future works.
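As a sketch of this preparation step, a script like the following collects binary labels from a `0_real`/`1_fake` folder layout. The `image` and `label` field names here are assumptions; check the actual schema of `pgrpo_8k.json` and match it before training.

```python
import json
from pathlib import Path

def build_label_entries(root):
    """Collect (image path, binary label) pairs from 0_real/1_fake subfolders.

    Field names are illustrative; match them to the schema used by
    pgrpo_8k.json before training.
    """
    entries = []
    for label_dir, label in (("0_real", "real"), ("1_fake", "fake")):
        for img in sorted(Path(root, label_dir).glob("*.png")):
            entries.append({"image": str(img), "label": label})
    return entries

# Usage:
# entries = build_label_entries("my_dataset")
# Path("my_pgrpo_data.json").write_text(json.dumps(entries, indent=2))
```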
We recommend using vLLM for model deployment:
```shell
sh self_scripts/deploy/deploy_model.sh /path/to/your/model
```

Inference on a single image:

```shell
python self_scripts/infer/infer_vllm_single.py \
    --image_path /path/to/your/image
```

Download the HydraFake dataset and the json files. Put the json files under `./datasets`. The data structure should be like:
```
hydrafake
├── test                 # testing images
│   ├── AdobeFirefly
│   │   ├── 0_real
│   │   │   └── *.png
│   │   └── 1_fake
│   │       └── *.png
│   └── ...
├── val                  # validation images
│   ├── real
│   │   └── *.png
│   └── fake
│       └── *.png
├── train                # training images
│   └── fake
│       ├── FS
│       │   ├── blendface
│       │   │   └── *.png
│       │   └── ...
│       ├── FR
│       │   ├── Aniportrait
│       │   │   └── *.png
│       │   └── ...
│       └── EFG
│           ├── Dall-E1
│           │   └── *.png
│           └── ...
└── jsons
    ├── test
    │   ├── id
    │   │   └── *.json
    │   ├── cm
    │   │   └── *.json
    │   ├── cf
    │   │   └── *.json
    │   └── cd
    │       └── *.json
    ├── val
    │   └── *.json
    └── train
        ├── fake
        │   ├── FS
        │   │   └── *.json
        │   ├── FR
        │   │   └── *.json
        │   └── EFG
        │       └── *.json
        └── real
            └── *.json
```
You can also put the dataset elsewhere; in that case, change the json file path in `./swift/llm/dataset/dataset/data_utils.py` and the image paths inside the json files.
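Rewriting the image paths inside the json files can be scripted. The sketch below assumes each annotation file is a JSON list whose records store the image path under an `image` key; the actual field name may differ, so check the files and adjust `key` accordingly.

```python
import json
from pathlib import Path

def rewrite_image_paths(json_file, old_prefix, new_prefix, key="image"):
    """Replace the leading path prefix of every image entry in one json file.

    `key` is an assumption about the field name; check the actual files.
    """
    records = json.loads(Path(json_file).read_text())
    for rec in records:
        if key in rec and rec[key].startswith(old_prefix):
            rec[key] = new_prefix + rec[key][len(old_prefix):]
    Path(json_file).write_text(json.dumps(records, indent=2))

# Example:
# rewrite_image_paths("datasets/val/val.json", "hydrafake/", "/data/hydrafake/")
```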
Run inference on HydraFake:
```shell
sh self_scripts/infer/infer_hydrafake.sh /path/to/your/model
```

Inference on a specific subset:
```shell
swift infer \
    --val_dataset cd_gpt4o \
    --model /path/to/your/model \
    --infer_backend pt \
    --max_model_len 8192 \
    --max_new_tokens 2048 \
    --dataset_num_proc 16 \
    --max_batch_size 8 \
    --metric self_acc_tags
```

Step 1: Deploy your model:
```shell
sh self_scripts/deploy/deploy_model.sh /path/to/your/model  # e.g., models/Qwen2.5-VL-7B-Instruct
```

Step 2: Run inference (put your model path in `self_scripts/infer/infer_vllm.py`):
```shell
sh self_scripts/infer/infer_hydrafake_vllm.sh
```

We provide a script based on DeepfakeBench.
```shell
# Effort for example
python DeepfakeBench/training/test.py \
    --detector_cfg DeepfakeBench/training/config/detector/effort.yaml \
    --dataset_cfg DeepfakeBench/training/config/dataset/hydrafake.yaml \
    --weights_path /path/to/your/model
```

Overview:
(a) We carefully collect and reimplement advanced deepfake techniques to construct our HydraFake dataset. Real images are collected from 8 datasets. Fake images come from classic datasets, high-quality public datasets, and our self-constructed deepfake data. (b) We introduce a rigorous and hierarchical evaluation protocol. Training data contains abundant samples but limited forgery types. Evaluations are split into four distinct levels. (c) Illustration of the subsets in different evaluation splits. (d) The performance of prevailing detectors on our HydraFake dataset. Most detectors show strong generalization in the Cross-Model setting but poor ability in Cross-Forgery and Cross-Domain scenarios.
Statistics:
HydraFake contains 52K images in total for evaluation, with 14K in-domain, 11K cross-model, 12K cross-forgery, and 15K cross-domain testing samples.
We introduce a pattern-aware reasoning framework, including three basic thinking patterns (fast judgement, reasoning, conclusion) and two advanced patterns (planning and self-reflection).
Two-stage training pipeline:
(1) Pattern-guided Cold-Start (SFT + MiPO): internalize thinking patterns and align the reasoning process.
(2) Pattern-aware Exploration (P-GRPO): scale up effective patterns and improve reflection quality.
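The `self_acc_tags` metric used during inference suggests that the final verdict is embedded in a tagged model response. As an illustration only (the actual tag names and response format used by Veritas are assumptions here), a minimal verdict parser and accuracy computation might look like:

```python
import re

def extract_answer(text):
    """Pull the final real/fake verdict out of a model response.

    Assumes the verdict appears inside <answer>...</answer> tags; the
    actual tag names used by Veritas may differ.
    """
    m = re.search(r"<answer>\s*(real|fake)\s*</answer>", text, re.IGNORECASE)
    return m.group(1).lower() if m else None

def tag_accuracy(responses, labels):
    """Fraction of responses whose extracted verdict matches the binary label."""
    hits = sum(extract_answer(r) == y for r, y in zip(responses, labels))
    return hits / len(labels)
```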
If you find our work useful, please cite our paper:
@inproceedings{tan2025veritas,
title={Veritas: Generalizable Deepfake Detection via Pattern-Aware Reasoning},
author={Tan, Hao and Lan, Jun and Tan, Zichang and Liu, Ajian and Song, Chuanbiao and Shi, Senyuan and Zhu, Huijia and Wang, Weiqiang and Wan, Jun and Lei, Zhen},
booktitle={International Conference on Learning Representations},
year={2026}
}
This repo is released under the Apache 2.0 License.
This repo benefits from ms-swift and DeepfakeBench. Thanks for their great work!


