An Empirical Study on Low- and High-Level Explanations of Deep Learning Misbehaviours

Publication: Article, Report, Conference object; 26 Oct 2023; Italy; Publisher: IEEE; Journal: 2023 ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM); Funded by: EC | PRECRIME

Authors: Tahereh Zohdinasab; Vincenzo Riccio; Paolo Tonella

doi: 10.1109/esem56168.2023.10304866, 10.5281/zenodo.12653972, 10.5281/zenodo.12653973

handle: 11390/1276864

Abstract

Background: Most quality assessment approaches for Deep Learning (DL) focus on finding misbehaviour-inducing inputs. However, it is difficult to clearly understand the causes of misbehaviours because DL software is opaque. Recent research has proposed different techniques to explain DL misbehaviours, producing input explanations either at a “low level” (raw input elements) or at a “high level” (input features). Aims: We aim to compare the similarity between different explanations and assess to what extent they are understandable. Method: We conducted an empirical study involving 3 state-of-the-art techniques for DL explanation in 13 configurations, applied to 2 different DL tasks. We also collected answers from 48 questionnaires submitted to SE experts. Results: Low- and high-level techniques provide dissimilar explanations for the same inputs. However, experts deemed none of the explanations useful in 28% of the cases. Conclusion: Despite the complementarity of existing explanations, further research is needed to produce better explanations.

Country

Italy

Related Organizations

Università della Svizzera italiana
Switzerland
University of Udine
Italy

Impact by BIP!

selected citations: 2 (derived from selected sources; an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
popularity: Average (the "current" impact/attention, or "hype", of an article in the research community at large, based on the underlying citation network)
influence: Average (the overall/total impact of an article in the research community at large, based on the underlying citation network, diachronically)
impulse: Average (the initial momentum of an article directly after its publication, based on the underlying citation network)

Green

Funded by

EC | PRECRIME