Delta Energy: Optimizing Energy Change During Vision-Language Alignment Improves both OOD Detection and OOD Generalization
Official implementation of Delta Energy: Optimizing Energy Change During Vision-Language Alignment Improves both OOD Detection and OOD Generalization. The paper has been accepted to NeurIPS 2025.
If you are interested in concurrent optimization for OOD generalization and OOD detection, check out our related work:
T-PAMI 2025 work: InfoBound: A Provable Information-Bounds Inspired Framework for Both OoD Generalization and OoD Detection
ICML 2024 work: CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection (openreview.net)
Recent approaches for vision-language models (VLMs) have shown remarkable success in achieving fast downstream adaptation.
When applied to real-world downstream tasks, VLMs inevitably encounter both in-distribution (ID) data and out-of-distribution (OOD) data. The OOD datasets often include both covariate shifts (e.g., known classes with changes in image styles) and semantic shifts (e.g., test-time unseen classes). This highlights the importance of improving VLMs' generalization ability to covariate-shifted OOD data, while effectively detecting open-set semantic-shifted OOD classes. In this paper, inspired by the substantial energy change observed in closed-set data when re-aligning vision-language modalities (specifically, by directly reducing the maximum cosine similarity to a low value), we introduce a novel OOD score, named Delta Energy.
Overview of the proposed method. Based on the prompt-tuning approach, we freeze both the image encoder and the text encoder, making only the context vectors learnable.
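As a rough illustration of the energy-change idea described above, the following sketch computes a standard energy score over image-text cosine similarities, then lowers the largest similarity to a small value and measures how much the energy changes. It is a minimal sketch under stated assumptions, not the paper's exact definition of the score: the function names, the temperature of 0.01, the replacement value `low_value=0.0`, and the toy similarity vectors are all hypothetical.

```python
import math
from typing import Sequence


def energy(similarities: Sequence[float], temperature: float = 0.01) -> float:
    """Standard energy score: -T * log(sum_i exp(s_i / T))."""
    scaled = [s / temperature for s in similarities]
    peak = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(v - peak) for v in scaled))
    return -temperature * log_sum_exp


def energy_change(
    similarities: Sequence[float],
    temperature: float = 0.01,
    low_value: float = 0.0,  # hypothetical "low" similarity, assumed below the max
) -> float:
    """Energy after lowering the maximum similarity, minus the energy before."""
    sims = list(similarities)
    top = max(range(len(sims)), key=sims.__getitem__)  # index of the best-matching class
    realigned = sims.copy()
    realigned[top] = low_value
    return energy(realigned, temperature) - energy(sims, temperature)


# Toy cosine similarities between one image and four class prompts (made up).
closed_set_like = [0.30, 0.18, 0.17, 0.16]  # one class clearly matches
open_set_like = [0.20, 0.19, 0.19, 0.18]    # no class stands out

delta_closed = energy_change(closed_set_like)
delta_open = energy_change(open_set_like)
print(f"closed-set-like energy change: {delta_closed:.4f}")
print(f"open-set-like energy change:   {delta_open:.4f}")
```

In this toy setting the confidently matched sample shows a larger energy change than the ambiguous one, which mirrors the observation about closed-set data in the abstract. How the score is actually defined, thresholded, and optimized is specified in the paper and the repository code, not in this sketch.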
This code is developed based on CRoFT. For environment setup and datasets used to evaluate both OOD generalization and OOD detection, please refer to the original CRoFT repository.
We provide the running scripts in CoOp/scripts. We take Delta Energy as an example; other methods can be evaluated similarly. Make sure to change the DATA path in the shell files under CoOp/scripts/DeltaEnergy, then run the following commands under CoOp/scripts/DeltaEnergy.
# For evaluating Delta Energy on SETUP-I:
python test_setup1.py
# For evaluating Delta Energy on SETUP-II:
python test_setup2.py
This workflow is the same for the other baselines (e.g., MCM, MaxLogits, MSP, Energy Score, CLIPN, ReAct, ODIN). For example, to evaluate MCM, navigate to its directory, CoOp/scripts/MCM, and execute the provided shell scripts:
bash test_openood_setup1.sh gpu_id
bash test_openood_setup1_osr.sh gpu_id
We provide the running scripts in CoOp/scripts. We take EBM as an example; other methods can be evaluated similarly. Make sure to change the DATA path in the shell files under CoOp/scripts/EBM, then run the following commands under CoOp/scripts/EBM.
# For training EBM on the in-distribution ImageNet46 datasets:
python run_setup1.py
# For evaluating EBM on the closed-set OOD datasets and open-set OOD datasets:
python test_setup1.py
# For training EBM on the in-distribution PACS or VLCS datasets:
python run_setup2.py
# For evaluating EBM on the closed-set OOD datasets and open-set OOD datasets:
python test_setup2.py
To collect the EBM results across different experimental setups, please use the scripts provided in the original CRoFT repository. For example:
- To collect OOD generalization results in SETUP-I, run:
# run the commands under CoOp/
python collect_result_set1_oodg.py
This repo benefits from CLIP, CoOp/CoCoOp (KaiyangZhou/CoOp), MCM, etc.
We thank the authors for their wonderful work.
If you use this code in your research, please kindly cite the following papers:
@inproceedings{zhu2025DeltaEnergy,
title={Delta Energy: Optimizing Energy Change During Vision-Language Alignment Improves both OOD Detection and OOD Generalization},
author={Zhu, Lin and Yang, Yifeng and Wang, Xinbing and Gu, Qinying and Ye, Nanyang},
booktitle={Advances in Neural Information Processing Systems},
year={2025}
}
@article{zhu2024croft,
title={CRoFT: Robust Fine-Tuning with Concurrent Optimization for OOD Generalization and Open-Set OOD Detection},
author={Zhu, Lin and Yang, Yifeng and Gu, Qinying and Wang, Xinbing and Zhou, Chenghu and Ye, Nanyang},
journal={arXiv preprint arXiv:2405.16417},
year={2024}
}
If you have any questions about this project, please feel free to contact zhulin_sjtu@sjtu.edu.cn.

