
Explaining Predictive Models with Mixed Features Using Shapley Values and Conditional Inference Trees
It is becoming increasingly important to explain complex, black-box machine learning models. Although the literature on this topic is expanding, Shapley values stand out as a sound method for explaining predictions from any type of machine learning model. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. This methodology was later extended to explain dependent features with an underlying continuous distribution. In this paper, we propose a method to explain dependent mixed (i.e., continuous, discrete, ordinal, and categorical) features by modeling their dependence structure using conditional inference trees. We evaluate our proposed method against the current industry standards in various simulation studies and find that it often outperforms the other approaches. Finally, we apply our method to a real financial data set used in the 2018 FICO Explainable Machine Learning Challenge and show how our explanations compare to those of the FICO challenge Recognition Award-winning team.
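For readers skimming this record, the following sketches restate the two ingredients the abstract names. The Shapley value attributed to feature j for a single prediction f(x*) is the standard weighted sum over feature coalitions, with the value function taken as a conditional expectation; this formulation is standard in the prediction-explanation literature the abstract builds on (the notation below is the common convention, not necessarily the paper's):

```latex
\phi_j \;=\; \sum_{S \,\subseteq\, \mathcal{M}\setminus\{j\}}
  \frac{|S|!\,\bigl(M-|S|-1\bigr)!}{M!}
  \Bigl( v\bigl(S \cup \{j\}\bigr) - v(S) \Bigr),
\qquad
v(S) \;=\; \mathbb{E}\bigl[\, f(x) \mid x_S = x_S^{*} \,\bigr]
```

where \(\mathcal{M}\) is the set of \(M\) features, \(S\) a coalition of features, and \(x_S^*\) the observed values of the features in \(S\). When features are dependent, estimating \(v(S)\) requires a model of \(p(x_{\bar{S}} \mid x_S)\); the paper's idea is to fit that conditional with a conditional inference tree and draw samples consistent with \(x_S^*\).

Below is a minimal, hypothetical sketch of that estimation loop in Python. It is illustrative only: scikit-learn has no conditional inference trees (the authors' method is implemented in the R package shapr, building on partykit's ctree), so an ordinary CART tree stands in for the ctree, features are assumed to be numerically encoded, and every name here (sample_conditional, v, shapley_values) is invented for the sketch.

```python
import itertools
from math import factorial

import numpy as np
from sklearn.tree import DecisionTreeRegressor


def sample_conditional(X_train, S, x_star, n_samples, rng):
    """Approximate draws from p(x_Sbar | x_S = x_S*).

    Stand-in for the paper's conditional inference trees: fit a CART tree
    mapping x_S -> x_Sbar, locate the leaf that x_S* falls into, and
    resample training rows from that leaf. (ctree chooses splits via
    permutation tests instead; CART is used only because scikit-learn
    ships it.)
    """
    d = X_train.shape[1]
    Sbar = [j for j in range(d) if j not in S]
    if not S:  # empty coalition: draw whole rows from the training data
        idx = rng.integers(0, len(X_train), n_samples)
        return X_train[np.ix_(idx, Sbar)], Sbar
    tree = DecisionTreeRegressor(min_samples_leaf=20, random_state=0)
    tree.fit(X_train[:, S], X_train[:, Sbar])  # refit per call; cache in real code
    leaf_of = tree.apply(X_train[:, S])
    leaf_star = tree.apply(np.asarray(x_star, dtype=float)[S].reshape(1, -1))[0]
    pool = np.where(leaf_of == leaf_star)[0]
    idx = rng.choice(pool, size=n_samples, replace=True)
    return X_train[np.ix_(idx, Sbar)], Sbar


def v(f, X_train, S, x_star, n_samples=200, rng=None):
    """Monte Carlo estimate of v(S) = E[f(x) | x_S = x_S*]."""
    rng = rng if rng is not None else np.random.default_rng(0)
    d = X_train.shape[1]
    if len(S) == d:  # everything conditioned on: v is just the prediction
        return float(f(np.asarray(x_star, dtype=float).reshape(1, -1))[0])
    draws, Sbar = sample_conditional(X_train, S, x_star, n_samples, rng)
    X = np.tile(np.asarray(x_star, dtype=float), (n_samples, 1))
    X[:, Sbar] = draws  # keep x_S* fixed, fill in sampled x_Sbar
    return float(f(X).mean())


def shapley_values(f, X_train, x_star):
    """Exact Shapley sum over all coalitions (feasible only for small d)."""
    d = X_train.shape[1]
    phi = np.zeros(d)
    for j in range(d):
        rest = [k for k in range(d) if k != j]
        for r in range(d):
            for S in itertools.combinations(rest, r):
                w = factorial(r) * factorial(d - r - 1) / factorial(d)
                phi[j] += w * (v(f, X_train, list(S) + [j], x_star)
                               - v(f, X_train, list(S), x_star))
    return phi
```

Usage would look like `phi = shapley_values(model.predict, X_train, x_star)` for a fitted model with a vectorized `predict`. Note that the exact coalition sum costs 2^M evaluations of v, so practical implementations subsample coalitions once M grows beyond a handful of features.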
- Norwegian Computing Center, Norway
- Université Paris Diderot, France
Keywords: Explainable AI, Shapley values, Conditional inference trees, Feature dependence, Prediction explanation, [INFO] Computer Science [cs], [SHS.INFO] Humanities and Social Sciences/Library and information sciences, Machine Learning (stat.ML), Machine Learning (cs.LG), Statistics - Machine Learning, Computer Science - Machine Learning, FOS: Computer and information sciences
