Publication · Presentation · Conference object · Report · Other literature type · Article · Preprint · 2021

Fail-Safe Execution of Deep Learning based Systems through Uncertainty Monitoring

Weiss, Michael; Tonella, Paolo
Open Access
Modern software systems rely on Deep Neural Networks (DNN) when processing complex, unstructured inputs, such as images, videos, natural language texts or audio signals. Given the intractably large size of such input spaces, the intrinsic limitations of learning algorithms, and the ambiguity about the expected predictions for some of the inputs, not only is there no guarantee that DNN predictions are always correct; developers must instead safely assume a low, though not negligible, error probability. A fail-safe Deep Learning based System (DLS) is one equipped to handle DNN faults by means of a supervisor, capable of recognizing predictions that should not be trusted and that should instead activate a healing procedure bringing the DLS to a safe state. In this paper, we propose an approach that uses DNN uncertainty estimators to implement such a supervisor. We first discuss the advantages and disadvantages of existing approaches to measuring uncertainty for DNNs and propose novel metrics, relying on such approaches, for the empirical assessment of the supervisor. We then describe our publicly available tool UNCERTAINTY-WIZARD, which allows transparent estimation of uncertainty for regular tf.keras DNNs. Lastly, we discuss a large-scale study conducted on four different subjects to empirically validate the approach, reporting the lessons learned as guidance for software engineers who intend to monitor uncertainty for fail-safe execution of a DLS.
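The abstract's supervisor idea can be illustrated independently of the paper's tool: given several stochastic forward passes of a DNN (e.g. via Monte-Carlo dropout, following Gal and Ghahramani), a supervisor computes the predictive entropy of the averaged softmax outputs and refuses to trust predictions whose entropy exceeds a threshold. The sketch below is a framework-agnostic NumPy illustration; the function names, array shapes, and the 0.5 threshold are illustrative assumptions, not the API of UNCERTAINTY-WIZARD.

```python
import numpy as np

def predictive_entropy(samples: np.ndarray) -> np.ndarray:
    """Entropy of the mean softmax over T stochastic forward passes.
    samples: shape (T, N, C) -- T Monte-Carlo passes, N inputs, C classes."""
    mean_probs = samples.mean(axis=0)                        # (N, C)
    return -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=1)

def supervise(samples: np.ndarray, threshold: float):
    """Return (predictions, trusted): trusted[i] is False when uncertainty
    exceeds the threshold, i.e. a healing procedure should take over."""
    mean_probs = samples.mean(axis=0)
    predictions = mean_probs.argmax(axis=1)
    trusted = predictive_entropy(samples) <= threshold
    return predictions, trusted

# Toy demonstration: one confident input, one ambiguous one (3 classes).
confident = np.tile([[0.97, 0.02, 0.01]], (20, 1, 1))        # passes agree
ambiguous = np.tile([[0.34, 0.33, 0.33]], (20, 1, 1))        # near-uniform
samples = np.concatenate([confident, ambiguous], axis=1)     # (20, 2, 3)

preds, trusted = supervise(samples, threshold=0.5)
print(preds, trusted)   # the ambiguous input is flagged as untrusted
```

In practice the threshold would be tuned on a validation set so that the false-alarm rate stays acceptable; quantifiers other than entropy (e.g. variation ratio or softmax margin) fit the same supervisor interface.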
Backup recording of the talk @ ICST 2021
Subjects by Vocabulary

Microsoft Academic Graph classification: Artificial intelligence · Artificial neural network · Software system · Computer science · Machine learning · Supervisor · Audio signal · Deep learning · Natural language · Ambiguity · Software

ACM Computing Classification System: General Literature – Miscellaneous


Computer Science - Software Engineering, Computer Science - Machine Learning, Software Engineering (cs.SE), Machine Learning (cs.LG), FOS: Computer and information sciences


Funded by
Self-assessment Oracles for Anticipatory Testing
  • Funder: European Commission (EC)
  • Project Code: 787703
  • Funding stream: H2020 | ERC | ERC-ADG
Validated by funder