The first work to trace LVLM hallucinations back to the visual encoder: a training-free framework that fixes statistical bias, inherent bias, and vulnerability without any fine-tuning.
SHIELD consistently outperforms existing methods across 3 LVLM families and 6 benchmarks, all without training.
Headline results (figures): CHAIR on LLaVA-1.5 7B (↓ lower is better), POPE Avg on LLaVA-1.5 7B (↑ higher is better), MME Hallucination (↑ higher is better), and AMBER Score on LLaVA-1.5 7B (↑ higher is better).
SHIELD also achieves 1810.8 on MME Full (vs. Vanilla 1632.1, OPERA 1717.2), confirming that hallucination suppression does not sacrifice general capability.
SHIELD works as a non-invasive wrapper: no modification of the LLaVA source code is needed.
# Run POPE evaluation in one command
bash experiments/scripts/llava1.5_pope_coco.bash
# Run CHAIR evaluation
bash experiments/scripts/llava1.5_chair.bash
# Run MME evaluation
bash experiments/scripts/llava1.5_MME_full.bash
bash experiments/scripts/llava1.5_MME_hal.bash

import shield
from llava.model.builder import load_pretrained_model

tokenizer, model, image_processor, _ = load_pretrained_model(
    "liuhaotian/llava-v1.5-7b", None, "llava-v1.5-7b"
)

# One-line setup: wrap the model with SHIELD
shield.wrap(model, tokenizer,
    caption_file="experiments/first_cap/llava15_coco_pope_first_caption.jsonl",
    cd_alpha=2.0, cd_beta=0.35,
)

image_tensor = image_processor.preprocess(image, return_tensors="pt")["pixel_values"][0]
shield_kw = model.shield_prepare(image, image_tensor, "image.jpg", use_cd=True)
output_ids = model.generate(input_ids, **shield_kw, do_sample=True, max_new_tokens=1024)

conda create -n shield python=3.10
conda activate shield
pip install torch==2.0.1 torchvision==0.15.2 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt

The repository includes all text-based metadata. You only need to download the images and the COCO annotations.
SHIELD uses LLaVA-1.5 as the base model. Model weights are automatically downloaded from Hugging Face when first used (liuhaotian/llava-v1.5-7b).
COCO val2014 images are shared across POPE (COCO), CHAIR, and LLaVA-Bench evaluations.
- Download COCO val2014 images and extract to experiments/data/coco/val2014/.
- Download COCO 2014 annotations and extract to experiments/data/coco/annotations/.
cd experiments/data/coco
wget http://images.cocodataset.org/zips/val2014.zip
unzip val2014.zip
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
unzip annotations_trainval2014.zip

A pre-built cache (experiments/eval/chair.pkl) is included, so you can skip the COCO annotation download if you only want to run the CHAIR metric.
POPE question files for COCO, A-OKVQA, and GQA are already included under experiments/data/POPE/. No extra download needed.
For GQA images (only needed for POPE-GQA evaluation), download from the GQA dataset and extract to experiments/data/gqa/images/.
CHAIR questions are included at experiments/data/CHAIR/questions.jsonl. Images come from COCO val2014.
- Download the MME Benchmark images and extract to experiments/data/MME/MME_Benchmark_release_version/.
- MME question lists and evaluation tools are already included.
LLaVA-Bench data (images + questions) is fully included in experiments/data/llava-bench/. No extra download needed.
Pre-generated first-round captions for all benchmarks are provided under experiments/first_cap/.
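These first-round caption files are JSONL (one JSON object per line, as the .jsonl extension indicates); a minimal loader sketch, with field names that are illustrative rather than guaranteed to match the repo's schema:

```python
import json

def load_captions(path):
    """Load a JSONL caption file into a dict keyed by image name."""
    caps = {}
    with open(path) as f:
        for line in f:
            rec = json.loads(line)
            # "image" / "caption" are illustrative keys; inspect the actual
            # files under experiments/first_cap/ for the real schema.
            caps[rec["image"]] = rec["caption"]
    return caps
```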
experiments/data/
├── POPE/                               # (included) question files
│   ├── coco/
│   ├── aokvqa/
│   └── gqa/
├── CHAIR/
│   └── questions.jsonl                 # (included)
├── MME/
│   ├── full.json                       # (included) question list
│   ├── hal.json                        # (included) question list
│   └── MME_Benchmark_release_version/  # (download)
├── llava-bench/                        # (fully included)
├── coco/
│   ├── val2014/                        # (download) COCO val2014 images
│   └── annotations/                    # (download) COCO 2014 annotations
└── gqa/
    └── images/                         # (download) GQA images
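As a quick sanity check before running the evaluation scripts, a small helper (not part of the repo; the item names below are just labels) can report which optional downloads from the tree above are in place:

```python
from pathlib import Path

def check_layout(root="experiments/data"):
    """Return {item: present?} for the optional downloads in the tree above."""
    root = Path(root)
    required = {
        "COCO val2014 images": "coco/val2014",
        "COCO annotations": "coco/annotations",
        "MME benchmark images": "MME/MME_Benchmark_release_version",
        "GQA images": "gqa/images",
    }
    return {name: (root / rel).is_dir() for name, rel in required.items()}

for name, ok in check_layout().items():
    print(f"{name:22s} {'ok' if ok else 'missing'}")
```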
bash experiments/scripts/llava1.5_pope_coco.bash
python experiments/eval/eval_pope.py \
    --gt_file experiments/data/POPE/coco/coco_pope_random.json \
    --gen_file output/llava15_coco_pope_random_answers_*.jsonl

Other POPE splits (popular, adversarial) and datasets (A-OKVQA, GQA) can be run by passing arguments to the script.
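POPE is a binary yes/no probing task, so its metrics reduce to standard accuracy and F1 over "yes"/"no" answers. A toy sketch of that computation (the repo's eval_pope.py additionally parses the JSON files shown above):

```python
def pope_metrics(gt, pred):
    """Accuracy and F1 for binary POPE answers ('yes' = object present)."""
    tp = sum(g == "yes" and p == "yes" for g, p in zip(gt, pred))
    tn = sum(g == "no" and p == "no" for g, p in zip(gt, pred))
    fp = sum(g == "no" and p == "yes" for g, p in zip(gt, pred))
    fn = sum(g == "yes" and p == "no" for g, p in zip(gt, pred))
    acc = (tp + tn) / len(gt)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return acc, f1

# One true positive, one false negative, one true negative, one false positive:
acc, f1 = pope_metrics(["yes", "yes", "no", "no"], ["yes", "no", "no", "yes"])  # 0.5, 0.5
```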
bash experiments/scripts/llava1.5_chair.bash

The CHAIR evaluation script computes CHAIRs, CHAIRi, and Recall metrics. It requires the pattern library:
pip install git+https://github.com/clips/pattern.git

bash experiments/scripts/llava1.5_MME_full.bash
bash experiments/scripts/llava1.5_MME_hal.bash

POPE COCO (all splits, 3 LVLMs)
| LVLM | Method | Rand. Acc | Rand. F1 | Pop. Acc | Pop. F1 | Adv. Acc | Adv. F1 |
|---|---|---|---|---|---|---|---|
| LLaVA-1.5 | Vanilla | 83.2 | 81.3 | 81.8 | 80.0 | 78.9 | 77.5 |
| | VCD | 87.7 | 87.1 | 85.3 | 85.0 | 80.8 | 81.3 |
| | OPERA | 89.1 | 89.0 | 86.0 | 86.3 | 79.1 | 80.9 |
| | SHIELD | 91.3 | 91.1 | 87.4 | 87.6 | 82.5 | 83.6 |
| InstructBLIP | Vanilla | 80.7 | 80.4 | 78.2 | 78.3 | 75.8 | 76.5 |
| | VCD | 84.5 | 83.6 | 81.4 | 81.0 | 79.5 | 79.5 |
| | OPERA | 89.8 | 89.6 | 83.4 | 84.0 | 80.7 | 81.8 |
| | SHIELD | 88.2 | 87.6 | 84.6 | 84.3 | 82.2 | 82.4 |
| Qwen-VL | Vanilla | 84.7 | 82.6 | 84.1 | 82.0 | 82.2 | 80.3 |
| | VCD | 88.6 | 87.8 | 87.1 | 86.4 | 84.2 | 83.9 |
| | OPERA | 86.1 | 84.2 | 85.7 | 83.8 | 83.9 | 82.1 |
| | SHIELD | 89.2 | 88.6 | 87.6 | 87.1 | 84.3 | 84.2 |
CHAIR (3 LVLMs)
| Method | LLaVA C_S↓ | LLaVA C_I↓ | IBLIP C_S↓ | IBLIP C_I↓ | Qwen C_S↓ | Qwen C_I↓ |
|---|---|---|---|---|---|---|
| Vanilla | 48.8 | 14.2 | 54.6 | 24.8 | 49.2 | 13.1 |
| VCD | 46.8 | 13.2 | 44.0 | 13.6 | 46.4 | 11.9 |
| OPERA | 44.6 | 12.8 | 46.4 | 14.2 | 34.6 | 9.5 |
| SHIELD | 36.6 | 10.3 | 40.4 | 10.9 | 28.9 | 9.2 |
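For context on the table above: CHAIR_S is the fraction of captions containing at least one hallucinated object, and CHAIR_I is the fraction of all mentioned object instances that are hallucinated. A toy sketch of the two metrics (not the repo's implementation, which also does synonym matching via the pattern library):

```python
def chair(caption_objects, gt_objects):
    """Compute (CHAIR_S, CHAIR_I).

    caption_objects: list of sets of objects mentioned in each caption.
    gt_objects: list of sets of ground-truth objects for each image.
    """
    halluc_caps = 0      # captions with at least one hallucinated object
    halluc_mentions = 0  # hallucinated object mentions
    total_mentions = 0   # all object mentions
    for mentioned, truth in zip(caption_objects, gt_objects):
        halluc = mentioned - truth
        halluc_caps += bool(halluc)
        halluc_mentions += len(halluc)
        total_mentions += len(mentioned)
    return halluc_caps / len(caption_objects), halluc_mentions / total_mentions

# One caption hallucinates "frisbee"; 1 of 3 mentions is hallucinated:
c_s, c_i = chair([{"dog", "frisbee"}, {"cat"}], [{"dog"}, {"cat"}])  # 0.5, 0.333...
```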
MME Hallucination Subset (3 LVLMs)
| LVLM | Method | Existence | Count | Position | Color | Total |
|---|---|---|---|---|---|---|
| LLaVA-1.5 | Vanilla | 175.6 | 124.6 | 114.0 | 151.0 | 565.3 |
| | VCD | 184.6 | 138.3 | 128.6 | 153.0 | 604.6 |
| | OPERA | 180.6 | 133.3 | 123.3 | 155.0 | 592.3 |
| | SHIELD | 195.0 | 141.6 | 148.3 | 183.3 | 668.3 |
| InstructBLIP | Vanilla | 141.0 | 75.3 | 66.6 | 97.3 | 380.3 |
| | VCD | 168.3 | 92.3 | 64.0 | 123.0 | 447.6 |
| | OPERA | 156.0 | 78.3 | 55.0 | 95.0 | 384.3 |
| | SHIELD | 170.0 | 75.0 | 88.3 | 128.3 | 461.6 |
| Qwen-VL | Vanilla | 155.0 | 127.6 | 131.6 | 173.0 | 587.3 |
| | VCD | 156.0 | 131.0 | 128.0 | 181.6 | 596.6 |
| | OPERA | 165.0 | 145.0 | 133.3 | 180.0 | 623.3 |
| | SHIELD | 180.0 | 170.0 | 128.3 | 190.0 | 668.3 |
GPT-4o Aided Evaluation
| Method | LLaVA C↑ | LLaVA D↑ | IBLIP C↑ | IBLIP D↑ | Qwen C↑ | Qwen D↑ |
|---|---|---|---|---|---|---|
| Vanilla | 4.9 | 5.0 | 4.2 | 4.2 | 6.2 | 4.6 |
| VCD | 5.5 | 5.5 | 5.1 | 5.5 | 6.5 | 5.7 |
| OPERA | 5.6 | 6.0 | 5.3 | 5.2 | 6.5 | 5.6 |
| SHIELD | 6.2 | 6.1 | 5.6 | 5.3 | 6.9 | 5.8 |
MME Full & Efficiency
MME Full (LLaVA-1.5 7B)
| Method | Perception↑ | Cognition↑ | Total↑ |
|---|---|---|---|
| Vanilla | 1279.2 | 352.9 | 1632.1 |
| VCD | 1363.9 | 353.2 | 1717.1 |
| OPERA | 1413.0 | 304.2 | 1717.2 |
| SHIELD | 1473.0 | 337.8 | 1810.8 |
Efficiency (LLaVA-1.5 7B, CHAIR)
| Method | C_S↓ | Time (s/sample)↓ | Memory↓ |
|---|---|---|---|
| Vanilla | 48.8 | 2.59 | 15.69 GB |
| VCD | 46.8 | 4.89 | 16.52 GB |
| OPERA | 44.6 | 24.01 | 34.88 GB |
| SHIELD | 36.6 | 7.34 | 18.17 GB |
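SHIELD's moderate overhead versus vanilla decoding comes largely from scoring a second, distorted visual input at each step, as in VCD-style contrastive decoding (the cd_alpha / cd_beta knobs from the Quick Start). A minimal pure-Python sketch of that logit combination, illustrative only and not the repo's sampler:

```python
import math

def contrastive_logits(logits, logits_distorted, alpha=2.0, beta=0.35):
    """VCD-style contrastive decoding for one step (illustrative sketch).

    logits: next-token logits from the clean image.
    logits_distorted: logits from a distorted/noised copy of the image.
    alpha: contrast strength; beta: adaptive plausibility cutoff.
    """
    # Amplify what the clean image supports relative to the distorted one.
    contrasted = [(1 + alpha) * c - alpha * d
                  for c, d in zip(logits, logits_distorted)]
    # Adaptive plausibility: only tokens whose clean-image probability is
    # at least beta * (max probability) stay candidates; mask out the rest.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    cutoff = beta * max(probs)
    return [c if p >= cutoff else float("-inf")
            for c, p in zip(contrasted, probs)]
```

The defaults alpha=2.0 and beta=0.35 mirror the values passed to shield.wrap() in the Quick Start snippet.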
SHIELD/
├── shield/                  # Core SHIELD library
│   ├── __init__.py          # Public API
│   ├── wrapper.py           # shield.wrap(): non-invasive model patching
│   ├── attack.py            # CW and PGD adversarial attacks in CLIP space
│   ├── caption.py           # Caption loading and preprocessing
│   ├── clip_utils.py        # CLIP model loading and text features
│   ├── feature.py           # Feature weighting, bias computation
│   ├── noise.py             # Diffusion noise injection
│   └── sampling.py          # Custom contrastive decoding sampler
├── experiments/
│   ├── eval/                # Evaluation scripts
│   ├── scripts/             # Bash scripts for running experiments
│   ├── data/                # Evaluation datasets
│   ├── first_cap/           # Pre-generated first-round captions
│   └── llava/               # LLaVA model code (vendored)
├── logs/                    # SOTA results for LLaVA, InstructBLIP, Qwen-VL
├── figs/                    # Paper figures
├── requirements.txt
├── CITATION.bib
└── LICENSE
We extend our gratitude to the following projects:
- LLaVA: Large Language and Vision Assistant
- VCD: Visual Contrastive Decoding
- OPERA: Alleviating Hallucination in Multi-Modal LLMs
- CHAIR: Object Hallucination evaluation metric
- Qwen-VL: Qwen Vision-Language model
If you find this work useful, please cite our paper:
@inproceedings{
huang2026shield,
title={{SHIELD}: Suppressing Hallucinations In {LVLM} Encoders via Bias and Vulnerability Defense},
author={Yiyang Huang and Liang Shi and Yitian Zhang and Yi Xu and Yun Fu},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=yk7FFLoNcP}
}

arXiv version:
@article{huang2025shield,
title={SHIELD: Suppressing Hallucinations In LVLM Encoders via Bias and Vulnerability Defense},
author={Huang, Yiyang and Shi, Liang and Zhang, Yitian and Xu, Yi and Fu, Yun},
journal={arXiv preprint arXiv:2510.16596},
year={2025}
}

This project is released under the Apache 2.0 License.

