Research code for the paper "GEM: Gaussian Embedding Modeling for Out-of-Distribution Detection in GUI Agents".
Paper link: https://arxiv.org/abs/2505.12842
git clone https://github.com/Wuzheng02/GEM-OODforGUIagents
cd GEM-OODforGUIagents

To run GEM with the AITZ train set as the in-distribution (ID) data, testing on the AITZ test set (ID) and the OmniAct-Desktop test set (OOD):
- Extract input scores (for both ID and OOD datasets):
  python run.py
- Fit the GMM and perform OOD detection:
  python GEM.py
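The idea behind the GMM step can be illustrated with a minimal, self-contained sketch. This is not the code in GEM.py: the one-dimensional scores are synthetic, the two-component mixture and the 5% threshold are assumptions chosen for illustration, and the helper names (`fit_gmm_1d`, `log_likelihood`, `is_ood`) are hypothetical. In practice a library such as scikit-learn's `GaussianMixture` would likely replace the hand-written EM loop.

```python
import math
import random


def normal_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(xs, k=2, iters=100, seed=0):
    """Fit a k-component 1-D Gaussian mixture to xs with plain EM."""
    rng = random.Random(seed)
    n = len(xs)
    means = rng.sample(xs, k)  # initialise means from k distinct data points
    grand_mean = sum(xs) / n
    grand_var = sum((x - grand_mean) ** 2 for x in xs) / n
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            total = sum(p) or 1e-300
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means, and variances
        for j in range(k):
            nj = sum(r[j] for r in resp) + 1e-12
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            variances[j] = (
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj + 1e-6
            )
    return weights, means, variances


def log_likelihood(x, gmm):
    """Log density of x under the fitted mixture (floored to avoid log(0))."""
    weights, means, variances = gmm
    density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
    return math.log(max(density, 1e-300))


# Synthetic stand-ins for the extracted scores (assumption: 1-D, two ID modes).
rng = random.Random(0)
id_train = [rng.gauss(0, 1) for _ in range(300)] + [rng.gauss(5, 1) for _ in range(300)]
id_test = [rng.gauss(0, 1) for _ in range(100)] + [rng.gauss(5, 1) for _ in range(100)]
ood_test = [rng.gauss(15, 1) for _ in range(200)]

gmm = fit_gmm_1d(id_train)

# Assumed decision rule: flag anything below the 5th percentile of ID train log-likelihood.
train_ll = sorted(log_likelihood(x, gmm) for x in id_train)
threshold = train_ll[int(0.05 * len(train_ll))]


def is_ood(x):
    return log_likelihood(x, gmm) < threshold


ood_rate = sum(is_ood(x) for x in ood_test) / len(ood_test)
id_false_alarm_rate = sum(is_ood(x) for x in id_test) / len(id_test)
print(f"OOD flagged: {ood_rate:.2%}, ID flagged: {id_false_alarm_rate:.2%}")
```

The threshold is taken from the ID training data only, so no OOD samples are needed to calibrate the detector; the percentile trades missed OOD inputs against false alarms on ID inputs.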
Note: Baseline methods (e.g., MSP, Energy, Mahalanobis) are also available in run.py (see commented sections).
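For orientation, two of the baselines named above have short textbook definitions over a model's output logits. The sketch below shows those standard formulas only; it does not reproduce how run.py computes them, and the logit vectors and function names are made up for illustration.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def msp_score(logits):
    """Maximum softmax probability: higher values suggest an ID input."""
    return max(softmax(logits))


def energy_score(logits, temperature=1.0):
    """Energy E(x) = -T * logsumexp(logits / T): lower values suggest an ID input."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    return -temperature * (m + math.log(sum(math.exp(z - m) for z in scaled)))


# Hypothetical logits: one confident prediction, one nearly uniform.
confident = [8.0, 0.5, -1.0]
uncertain = [0.3, 0.2, 0.1]

print("MSP:   ", msp_score(confident), msp_score(uncertain))
print("Energy:", energy_score(confident), energy_score(uncertain))
```

Both scores are thresholded in the same way as any other OOD score; they differ only in the direction of the comparison (high MSP versus low energy indicates ID).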
@article{wu2025gem,
title={GEM: Gaussian Embedding Modeling for Out-of-Distribution Detection in GUI Agents},
author={Wu, Zheng and Cheng, Pengzhou and Wu, Zongru and Dong, Lingzhong and Zhang, Zhuosheng},
journal={arXiv preprint arXiv:2505.12842},
year={2025}
}