Name: Cross-validation
Creator: Arlot, Sylvain
Keywords: FOS: Computer and information sciences, model selection, bias-corrected cross-validation, estimator selection, leave-one-out, sélection d'estimateurs, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), cross-validation

Publication: Article, Preprint, Part of book or chapter of book. Date: 01 Jan 2017. Embargo end date: 01 Jan 2017. Publisher: arXiv

Authors: Arlot, Sylvain

doi: 10.48550/arxiv.1703.03167

arXiv: http://arxiv.org/abs/1703.03167

Abstract

This text is a survey of cross-validation. We define all classical cross-validation procedures and study their properties for two different goals: estimating the risk of a given estimator, and selecting the best estimator among a given family. For the risk estimation problem, we compute the bias (which can also be corrected) and the variance of cross-validation methods. For estimator selection, we first provide a first-order analysis (based on expectations). Then, we explain how to take second-order terms into account (from variance computations, and by considering the usefulness of overpenalization). In the end, this makes it possible to provide some guidelines for choosing the best cross-validation method for a given learning problem.
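As an illustration of the risk estimation goal mentioned in the abstract, here is a minimal sketch of V-fold cross-validation. It is not taken from the survey: the estimator (the sample mean), the loss (squared error), and the helper names `v_fold_splits` and `cv_risk` are all assumptions chosen to keep the example short. Setting `v` equal to the sample size gives leave-one-out.

```python
import random
from statistics import mean


def v_fold_splits(n, v, seed=0):
    """Partition the indices 0..n-1 into v folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[k::v] for k in range(v)]


def cv_risk(y, v, seed=0):
    """V-fold CV estimate of the squared-error risk of the sample mean.

    Hypothetical helper: the estimator and the loss are illustrative choices.
    """
    fold_risks = []
    for fold in v_fold_splits(len(y), v, seed):
        held_out = set(fold)
        # Train on everything outside the current fold.
        train = [y[i] for i in range(len(y)) if i not in held_out]
        prediction = mean(train)
        # Evaluate the trained estimator on the held-out fold only.
        fold_risks.append(mean((y[i] - prediction) ** 2 for i in fold))
    # Average the held-out risks over the v folds.
    return mean(fold_risks)


if __name__ == "__main__":
    rng = random.Random(1)
    y = [rng.gauss(0.0, 1.0) for _ in range(200)]
    print("5-fold estimate:       ", cv_risk(y, v=5))
    print("leave-one-out estimate:", cv_risk(y, v=len(y)))
```

Each fold trains on fewer than n observations, so the estimate targets the risk of an estimator built from a smaller sample. This is the source of the bias that the abstract says can be corrected.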

in French

Related Organizations

University of Paris-Saclay, France
Département de Mathématiques, France
University of Paris-Sud, France
French Institute for Research in Computer Science and Automation, France
French National Centre for Scientific Research, France

Keywords

FOS: Computer and information sciences, model selection, bias-corrected cross-validation, estimator selection, leave-one-out, sélection d'estimateurs, Mathematics - Statistics Theory, Machine Learning (stat.ML), Statistics Theory (math.ST), cross-validation, V-fold cross-validation, Statistics - Machine Learning, overpenalization, FOS: Mathematics, risk estimation, sélection de modèles, [MATH.MATH-ST] Mathematics [math]/Statistics [math.ST], V-fold penalization, [STAT.TH] Statistics [stat]/Statistics Theory [stat.TH], estimation du risque, pénalisation V-fold, [STAT.ML] Statistics [stat]/Machine Learning [stat.ML], leave-p-out, validation croisée V-fold, validation croisée corrigée, surpénalisation, validation croisée

Impact by BIP!

	citations: 0. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
	influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
	impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.


SDGs (Beta): 4. Education

Related to Research communities

INRIA

Knowmad Institut