DALEX: Explainers for Complex Predictive Models in R

Publication: Article, Preprint; 01 Jan 2018; Embargo end date: 01 Jan 2018; Publisher: Zenodo; Journal: CoRR, volume abs/1806.08915; Funded by: EC | RENOIR

Authors: Biecek, Przemysław

doi: 10.5281/zenodo.3670940, 10.5281/zenodo.3670939, 10.48550/arxiv.1806.08915

arXiv: 1806.08915

Abstract

Predictive modeling is increasingly dominated by flexible yet complex methods such as neural networks and ensembles (model stacking, boosting, or bagging). Such methods are usually described by a large number of parameters or hyperparameters, a price one pays for flexibility. The sheer number of parameters makes these models hard to understand. This paper describes a consistent collection of explainers for predictive models, a.k.a. black boxes. Each explainer is a technique for exploring a black-box model. The presented approaches are model-agnostic, which means that they extract useful information from any predictive method regardless of its internal structure. Each explainer is linked to a specific aspect of a model: some are useful for decomposing predictions, some serve better for understanding performance, and others are useful for understanding the importance and conditional responses of a particular variable. Every explainer presented in this paper works for a single model or for a collection of models. In the latter case, models can be compared against each other. Such a comparison helps to find the strengths and weaknesses of different approaches and provides additional possibilities for model validation. The presented explainers are implemented in the DALEX package for R. They are based on a uniform, standardized grammar of model exploration that can be easily extended. The current implementation supports the most popular frameworks for classification and regression.
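To make the term "model-agnostic" concrete, here is a minimal sketch of one kind of explainer mentioned in the abstract, a variable-importance measure that only ever calls the model's prediction function. It is written in plain Python for illustration and is not the DALEX API. The names `rmse`, `permutation_importance`, and `black_box`, and the toy data, are all hypothetical. The swapped-in technique is permutation importance: shuffle one variable at a time and record how much the loss increases.

```python
import random


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def permutation_importance(predict, rows, y, loss=rmse, n_repeats=10, seed=0):
    """Mean loss increase after shuffling each variable (larger = more important).

    `predict` is any callable mapping one row (a dict) to a number, so the
    procedure never inspects the internal structure of the model.
    """
    rng = random.Random(seed)
    baseline = loss(y, [predict(r) for r in rows])
    importance = {}
    for name in rows[0]:
        increases = []
        for _ in range(n_repeats):
            # Break the link between this variable and the target by shuffling it.
            shuffled = [r[name] for r in rows]
            rng.shuffle(shuffled)
            permuted = [{**r, name: v} for r, v in zip(rows, shuffled)]
            increases.append(loss(y, [predict(r) for r in permuted]) - baseline)
        importance[name] = sum(increases) / n_repeats
    return importance


def black_box(row):
    """Hypothetical stand-in for a fitted model; it ignores the `noise` variable."""
    return 3.0 * row["x1"] + 0.0 * row["noise"]


# Toy data (an assumption for illustration only).
data_rng = random.Random(42)
rows = [{"x1": data_rng.random(), "noise": data_rng.random()} for _ in range(200)]
y = [black_box(r) for r in rows]

scores = permutation_importance(black_box, rows, y)
print(scores)
```

Because the function takes only a prediction callable, the same call can be repeated for several models on the same data and the resulting scores compared side by side, which is the kind of cross-model comparison the abstract describes.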

12 pages

Related Organizations

Samsung (Poland), Poland
Warsaw University of Technology, Poland

Keywords

FOS: Computer and information sciences, Computer Science - Machine Learning, Computer Science - Artificial Intelligence, machine learning, R, visualization, model interpretability, modeling, Machine Learning (stat.ML), Statistics - Applications, Machine Learning (cs.LG), Artificial Intelligence (cs.AI), Statistics - Machine Learning, Applications (stat.AP)

Impact by BIP!

- selected citations: 14. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Top 10%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.