Audio Explainable Artificial Intelligence

Authors: Akman, Alican

Abstract

This thesis explores the potential of audio Explainable Artificial Intelligence (XAI) to improve the interpretability of deep audio processing models. While most existing XAI methods focus on visual and textual explanations, audio explanations offer a more intuitive approach for audio-based tasks. They provide a unique level of expressiveness, particularly where visual explanations require specialised knowledge. By aligning explanations with the audio domain, this research aims to bridge the interpretability gap and enhance understanding of complex audio models.

As a case study, this thesis examines COVID-19 detection from cough and speech audio, considering both classifier performance and explainability. To enhance interpretability, CoughLIME, a modified version of LIME tailored for cough data, is introduced. CoughLIME generates faithful and listenable explanations, addressing a key challenge in trust for audio-based COVID-19 classifiers.

With transformer models excelling in audio processing, the need for interpretability of their complex decision-making has grown. This thesis proposes a technique to explain audio-processing transformers by integrating their attention mechanisms with non-negative matrix factorisation (NMF). NMF decomposes audio into spectral patterns, while attention weights identify the most relevant time activations. By reconstructing key audio components, the method generates high-fidelity, listenable explanations, validated through audio classification tasks.

Additionally, a novel explanation method is introduced, leveraging the meaningful representation space and generative capacity of audio foundation models. By integrating feature attribution techniques, significant features in their embedding space are identified, enabling the generation of meaningful audio explanations. Extensive evaluations explaining audio classification models confirm the effectiveness of this approach.

Finally, this thesis proposes a novel framework that extends beyond traditional feature attribution, which emphasises only the most relevant features and overlooks the broader representational space, including less important features. Rather than simply removing features, the framework uses generative audio language models to replace them with contextually appropriate alternatives, offering a more comprehensive understanding of model behaviour.
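
To make the NMF-plus-attention idea in the abstract more concrete, the following is a minimal illustrative sketch, not the thesis implementation. It assumes a magnitude spectrogram `V` (frequency by time) and a per-frame relevance vector standing in for attention weights; the function names `nmf` and `explain`, the scoring rule, and the toy data are all hypothetical choices made for this example. NMF is implemented here with standard multiplicative updates, and the step that turns the explanation spectrogram back into audio is omitted.

```python
import numpy as np


def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Factorise a non-negative matrix V (freq x time) as W @ H using
    multiplicative updates for the squared-error objective."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = V.shape
    W = rng.random((n_freq, n_components)) + eps  # spectral patterns
    H = rng.random((n_components, n_time)) + eps  # time activations
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def explain(V, relevance, n_components=2, top_k=1):
    """Keep the top_k NMF components whose time activations overlap most
    with `relevance` (a hypothetical stand-in for attention weights)."""
    W, H = nmf(V, n_components)
    scores = H @ relevance                   # one score per component
    keep = np.argsort(scores)[::-1][:top_k]  # indices of the best components
    V_expl = W[:, keep] @ H[keep, :]         # explanation spectrogram
    return V_expl, keep, scores


# Toy spectrogram: a low-band source active in frames 0-9 and a
# high-band source active in frames 10-19 (8 frequency bins, 20 frames).
V = np.zeros((8, 20))
V[:4, :10] = 1.0
V[4:, 10:] = 1.0

# Assumed relevance: the model "attends" only to frames 10-19.
relevance = np.zeros(20)
relevance[10:] = 1.0

V_expl, keep, scores = explain(V, relevance, n_components=2, top_k=1)
print("low-band energy: ", V_expl[:4].sum())
print("high-band energy:", V_expl[4:].sum())
```

In this toy setting the retained component should carry mostly high-band energy, since that source is the one active in the frames marked relevant. A listenable explanation would additionally require inverting `V_expl` to a waveform, for example by reusing the phase of the original signal.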
