
arXiv: 1906.10529
Abstract
Random forest (RF) is one of the algorithms of choice in many supervised learning applications, be it classification or regression. The appeal of such tree-ensemble methods comes from a combination of several characteristics: a remarkable accuracy in a variety of tasks, a small number of parameters to tune, robustness with respect to feature scaling, a reasonable computational cost for training and prediction, and their suitability in high-dimensional settings. The most commonly used RF variants, however, are 'offline' algorithms, which require the availability of the whole dataset at once. In this paper, we introduce AMF, an online RF algorithm based on Mondrian Forests. Using a variant of the context tree weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this yields a truly online, parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function. Numerical experiments show that AMF is competitive with respect to several strong baselines on a large number of datasets for multi-class classification.
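The key computational idea summarized above is that an exponential-weights mixture over all prunings of a tree can be computed exactly with a single recursion, in the spirit of context tree weighting: each node mixes the weight of its own local predictor ("prune here") with the product of its subtrees' aggregated weights ("keep splitting"). The sketch below is an illustrative assumption, not the authors' code; the `Node` class, the uniform 1/2 mixing prior, and the log-space arithmetic are choices made for the example.

```python
# Hypothetical sketch (not the paper's implementation): CTW-style exact
# aggregation over all prunings of a binary tree, in log space.
import math

class Node:
    def __init__(self, left=None, right=None, log_w=0.0):
        self.left = left          # left child (None for a leaf)
        self.right = right        # right child (None for a leaf)
        self.log_w = log_w        # log-weight of this node's local predictor
        self.log_w_agg = 0.0      # log of the aggregated weight over prunings

def aggregate(node):
    """Bottom-up pass computing
    w_agg(node) = 1/2 * w(node) + 1/2 * w_agg(left) * w_agg(right),
    so that w_agg(root) is a sum over all prunings of the tree."""
    if node.left is None:         # leaf: the only pruning is the leaf itself
        node.log_w_agg = node.log_w
        return node.log_w_agg
    l = aggregate(node.left)
    r = aggregate(node.right)
    # Mix "prune here" with "keep both subtrees", stably in log space.
    a = math.log(0.5) + node.log_w
    b = math.log(0.5) + l + r
    m = max(a, b)
    node.log_w_agg = m + math.log(math.exp(a - m) + math.exp(b - m))
    return node.log_w_agg
```

With all node weights equal to 1 (log-weight 0), the aggregated weight at the root is again 1, since the mixing coefficients sum to one over the two options at every node; losses incurred online would tilt `log_w` and hence the mixture toward the better prunings. The cost of the pass is linear in the number of nodes, even though the number of prunings is exponential.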
Keywords: FOS: Computer and information sciences; FOS: Mathematics; Machine Learning (cs.LG); Machine Learning (stat.ML); Statistics Theory (math.ST); Online learning; Online regression trees; Nonparametric methods; Adaptive regression
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | Citations derived from selected sources (an alternative to the "influence" indicator) | 24 |
| Popularity | "Current" impact/attention (the "hype") of the article in the research community at large, based on the underlying citation network | Top 10% |
| Influence | Overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically) | Top 10% |
| Impulse | Initial momentum of the article directly after its publication, based on the underlying citation network | Top 10% |
