
arXiv: 0804.2937
A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. In this paper, we tackle the problem of adaptivity to this condition in the context of model selection, in a general learning framework. In fact, we consider a weaker version of this condition that takes into account that learning within a small model can be much easier than within a large one. Requiring this "strong margin adaptivity" makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including those based on local Rademacher complexities) exhibit this adaptivity when the models are nested. In contrast with previous results, this holds with penalties that depend only on the data. Our second main result is that strong margin adaptivity is not always achievable when the models are not nested: for every model selection procedure (even a randomized one), there exists a problem for which it fails to be strongly margin adaptive.
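For concreteness, here is one standard formulation of the margin condition and of a penalized model selection rule of the kind the abstract refers to; the paper's exact notation and assumptions may differ, so this should be read as an illustrative sketch rather than the paper's own statement. In binary classification with regression function $\eta(x) = \mathbb{P}(Y = 1 \mid X = x)$, Tsybakov's margin (low-noise) condition with exponent $\alpha \ge 0$ and constant $C > 0$ reads
\[
\mathbb{P}\bigl( 0 < |\eta(X) - \tfrac{1}{2}| \le t \bigr) \le C\, t^{\alpha} \qquad \text{for all } t > 0,
\]
and a penalization procedure selects, among models $(\mathcal{F}_m)_{m \in \mathcal{M}}$,
\[
\widehat{m} \in \operatorname*{arg\,min}_{m \in \mathcal{M}} \Bigl\{ \min_{f \in \mathcal{F}_m} \frac{1}{n} \sum_{i=1}^{n} \gamma\bigl(f; (X_i, Y_i)\bigr) + \operatorname{pen}(m) \Bigr\},
\]
where $\gamma$ is the loss (e.g., the 0-1 loss) and $\operatorname{pen}(m)$ depends only on the data, for instance through a local Rademacher complexity of $\mathcal{F}_m$. Larger $\alpha$ means the label noise is less concentrated near the decision boundary, which is what makes fast rates possible.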
Published in Bernoulli (http://isi.cbs.nl/bernoulli/) by the International Statistical Institute/Bernoulli Society (http://isi.cbs.nl/BS/bshome.htm), at http://dx.doi.org/10.3150/10-BEJ288
FOS: Computer and information sciences, model selection, Ridge regression; shrinkage estimators (Lasso), empirical minimization, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), adaptivity, 519, Statistics - Machine Learning, FOS: Mathematics, oracle inequalities, [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST], local Rademacher complexity, Classification and discrimination; cluster analysis (statistical aspects), [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH], empirical risk minimization, [STAT.OT] Statistics [stat]/Other Statistics [stat.ML], [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], statistical learning, classification, margin condition
| Indicator | Meaning (based on the underlying citation network) | Value |
| --- | --- | --- |
| selected citations | Citations derived from selected sources; an alternative to the "influence" indicator, which reflects the article's overall/total impact (diachronically) | 4 |
| popularity | The article's "current" impact/attention (the "hype") in the research community at large | Average |
| influence | The article's overall/total impact in the research community at large (diachronically) | Average |
| impulse | The article's initial momentum directly after its publication | Average |
