EnsemHalDet: Robust VLM Hallucination Detection via Ensemble of Internal State Detectors

Mar 1, 2026
Ryuhei Miyazato, Shunsuke Kitada, Kei Harada
Abstract
Vision-Language Models (VLMs) excel at multimodal tasks, but they remain prone to hallucinations: outputs that are factually incorrect or ungrounded in the input image. Recent work suggests that detecting hallucinations from internal representations is more efficient and accurate than approaches that rely solely on model outputs. However, existing internal-representation-based methods typically rely on a single representation or a single detector, which limits their ability to capture diverse hallucination signals. In this paper, we propose EnsemHalDet, an ensemble-based hallucination detection framework that leverages multiple internal representations of VLMs, including attention outputs and hidden states. EnsemHalDet trains an independent detector for each representation and combines them through ensemble learning. Experimental results across multiple VQA datasets and VLMs show that EnsemHalDet consistently outperforms prior methods and single-detector models in terms of AUC. These results indicate that ensembling diverse internal signals improves the robustness of multimodal hallucination detection.
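The abstract does not specify the detector architecture or the ensemble rule, so the following is only a minimal sketch of the general idea (one detector per internal representation, combined into an ensemble score and evaluated by AUC). It assumes logistic-regression probes, simple averaging of detector probabilities, and synthetic stand-ins for the attention-output and hidden-state features; the function names and all numeric settings are hypothetical and are not taken from the paper.

```python
import numpy as np


def train_logreg(X, y, lr=0.1, steps=500):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad = p - y  # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


def predict_proba(model, X):
    """Return the probe's estimated probability of the positive class."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / (len(pos) * len(neg)))


def make_synthetic_features(n=400, dim=16, seed=0):
    """Synthetic stand-ins for two internal representations (assumption, not real VLM states)."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)  # 1 = hallucinated answer, 0 = grounded answer
    feats = {}
    for name in ("attention_output", "hidden_state"):
        direction = rng.normal(size=dim)
        direction /= np.linalg.norm(direction)
        # Each representation carries its own weak, noisy hallucination signal.
        feats[name] = rng.normal(size=(n, dim)) + 0.8 * y[:, None] * direction
    return feats, y


def run_ensemble(seed=0):
    """Train one probe per representation, average their scores, and report AUCs."""
    feats, y = make_synthetic_features(seed=seed)
    split = len(y) // 2  # first half for training, second half for evaluation
    scores = {}
    for name, X in feats.items():
        model = train_logreg(X[:split], y[:split])
        scores[name] = predict_proba(model, X[split:])
    # Ensemble by averaging per-detector probabilities (one of many possible rules).
    scores["ensemble"] = np.mean([scores[name] for name in feats], axis=0)
    return {name: auc(y[split:], s) for name, s in scores.items()}


if __name__ == "__main__":
    for name, value in run_ensemble().items():
        print(f"{name}: AUC = {value:.3f}")
```

Averaging is used here only because it is the simplest combination rule; weighted voting or a stacked meta-classifier would slot into the same place. Results on this synthetic data say nothing about the performance reported in the paper.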
Type
Publication
ACL Student Research Workshop (SRW) 2026

Ryuhei Miyazato
Authors
Researcher at AISI/UEC
I am currently a researcher at the Japan AI Safety Institute under the supervision of Prof. Satoshi Sekine, and a research associate at the University of Electro-Communications under Satoshi Hara. I received my Master’s degree from the same university under the supervision of Kei Harada. My research focuses on hallucination detection in vision language models and narrative understanding, especially temporal discourse understanding. I am currently seeking Ph.D. positions.