This repository includes the following research papers on audio hallucination:
1. Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning 🚀
- Conference: ICASSP 2025
- Keywords: Object Existence, Temporal Order, Object Attribute, Multi-turn And Thoughtful Chain of Hearings (MATCH)
- GitHub Page | arXiv
2. Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models
- Conference: Interspeech 2024
- Keywords: Object Hallucination, LALMs
- GitHub Page | arXiv
The first paper provides a broader analysis of audio hallucination, covering object existence, temporal order, and object attributes. It also proposes simple, effective methods to improve the performance of large audio-language models on these tasks.
The second paper is the first to systematically analyze object hallucination in large audio-language models.
If you find our work useful, please cite the following papers:
@inproceedings{kuan2024can,
title={Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning},
author={Kuan, Chun-Yi and Lee, Hung-yi},
booktitle={ICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2025},
arxiv = {2410.16130}
}
@inproceedings{kuan2024understanding,
title={Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models},
author={Kuan, Chun-Yi and Huang, Wei-Ping and Lee, Hung-yi},
booktitle={2024 Conference of the International Speech Communication Association (INTERSPEECH)},
year={2024},
arxiv = {2406.08402},
}