
arXiv: 0804.0551
handle: 11858/00-001M-0000-0013-F3BD-4
The support vector machine (SVM) algorithm is well known to the machine learning community for its very good practical results. The goal of the present paper is to study this algorithm from a statistical perspective, using tools of concentration theory and empirical processes. Our main result builds on the observation, made by other authors, that the SVM can be viewed as a statistical regularization procedure. From this point of view, it can also be interpreted as a model selection principle based on a penalized criterion. It is then possible to adapt general methods related to model selection in this framework to study two important questions: (1) what is the minimum penalty, and how does it compare to the penalty actually used in the SVM algorithm? (2) is it possible to obtain "oracle inequalities" in this setting, for the specific loss function used in the SVM algorithm? We show that the answer to the latter question is positive and that it provides relevant insight into the former. Our result shows that it is possible to obtain fast rates of convergence for SVMs.
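The regularization viewpoint described above can be made concrete with a minimal sketch (not code from the paper): the SVM minimizes an averaged hinge loss plus a squared-norm penalty. The toy data, the subgradient-descent solver, and the penalty weight `lam` below are all illustrative assumptions, shown here for a linear kernel.

```python
import numpy as np

# Penalized empirical-risk criterion behind the SVM (linear kernel):
#   min_w  (1/n) * sum_i max(0, 1 - y_i <w, x_i>)  +  lam * ||w||^2
# minimized by projected-free subgradient descent on toy separable data.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(size=(n, 2))
y = np.where(X[:, 0] + X[:, 1] > 0, 1.0, -1.0)  # linearly separable labels

lam = 0.01   # regularization weight: the "penalty" in the criterion
w = np.zeros(2)
for t in range(1, 2001):
    margins = y * (X @ w)
    active = margins < 1  # points where the hinge loss is nonzero
    # subgradient of the averaged hinge loss, plus gradient of the penalty
    grad = -(y[active, None] * X[active]).sum(axis=0) / n + 2 * lam * w
    w -= grad / t        # decreasing step size 1/t

accuracy = np.mean(np.sign(X @ w) == y)
```

The penalty weight `lam` plays the role of the penalty whose minimal size the paper investigates; the oracle-inequality results compare the risk of the penalized minimizer to the best achievable over the model.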
Published at http://dx.doi.org/10.1214/009053607000000839 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
model selection, classification, oracle inequality, support vector machine, reproducing kernel Hilbert space, nonparametric estimation, Statistics Theory (math.ST), FOS: Mathematics, 62G05, 62G20 (Primary), Classification and discrimination; cluster analysis (statistical aspects), Learning and adaptive systems in artificial intelligence, Neural nets and related approaches to inference from stochastic processes, Asymptotic properties of nonparametric inference, Applications of operator theory in probability theory and statistics
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | Citations derived from selected sources; an alternative to the "influence" indicator, which reflects the overall/total impact of an article based on the underlying citation network (diachronically). | 85 |
| popularity | The "current" impact/attention (the "hype") of the article in the research community at large, based on the underlying citation network. | Top 10% |
| influence | The overall/total impact of the article in the research community at large, based on the underlying citation network (diachronically). | Top 1% |
| impulse | The initial momentum of the article directly after its publication, based on the underlying citation network. | Top 10% |
