
arXiv: 1503.05077
This paper presents an adaptive version of the Hill estimator based on Lepski's model selection method. This simple data-driven index selection method is shown to satisfy an oracle inequality and is verified to achieve the lower bound recently derived by Carpentier and Kim. In order to establish the oracle inequality, we derive non-asymptotic variance bounds and concentration inequalities for Hill estimators. These concentration inequalities are derived from Talagrand's concentration inequality for smooth functions of independent exponentially distributed random variables, combined with three tools of Extreme Value Theory: the quantile transform, Karamata's representation of slowly varying functions, and Rényi's characterisation of the order statistics of exponential samples. The performance of this computationally and conceptually simple method is illustrated using Monte-Carlo simulations.
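To make the idea of data-driven index selection concrete, here is a minimal Python sketch, not the paper's exact procedure. The function `hill_estimator` computes the standard Hill estimate from the `k` largest order statistics. The function `lepski_select` is a hypothetical Lepski-type rule: it keeps increasing `k` as long as the estimate at `k` stays within a noise band of every estimate at a smaller index. The band shape `r * hills[i] / sqrt(i)`, the `sqrt(log log n)` scaling, and the constant `c` are assumptions made for illustration; the abstract does not state the thresholds used in the paper.

```python
import math
import random


def hill_estimator(sample, k):
    """Hill estimate of the extreme value index from the k largest observations.

    Assumes all observations are strictly positive.
    """
    n = len(sample)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    xs = sorted(sample, reverse=True)  # xs[0] is the largest observation
    threshold = xs[k]                  # the (k+1)-th largest observation
    return sum(math.log(xs[i] / threshold) for i in range(k)) / k


def lepski_select(sample, k_min, k_max, c=2.0):
    """Hypothetical Lepski-type choice of k (illustrative thresholds only).

    Returns (k, estimate), where k is the largest index in [k_min, k_max]
    reached before some smaller index i disagrees with the current estimate
    by more than r * hill(i) / sqrt(i).
    """
    n = len(sample)
    if n < 3 or not 1 <= k_min <= k_max < n:
        raise ValueError("need n >= 3 and 1 <= k_min <= k_max < n")
    logs = [math.log(x) for x in sorted(sample, reverse=True)]

    # Hill estimates for every k up to k_max, computed with a running sum.
    hills = {}
    running = 0.0
    for k in range(1, k_max + 1):
        running += logs[k - 1]
        hills[k] = running / k - logs[k]

    r = c * math.sqrt(math.log(math.log(n)))  # assumed noise-level scaling
    selected = k_min
    for k in range(k_min + 1, k_max + 1):
        within_band = all(
            abs(hills[i] - hills[k]) <= r * hills[i] / math.sqrt(i)
            for i in range(k_min, k)
        )
        if not within_band:
            break
        selected = k
    return selected, hills[selected]


if __name__ == "__main__":
    rng = random.Random(0)
    # Pareto sample with tail exponent 2, so the extreme value index is 1/2.
    data = [rng.paretovariate(2.0) for _ in range(5000)]
    k_hat, gamma_hat = lepski_select(data, k_min=20, k_max=2000)
    print(k_hat, gamma_hat)
```

For an exact Pareto sample there is no bias, so a rule of this kind tends to accept large values of `k`; for distributions whose tail is only approximately Pareto, the bias of the Hill estimate grows with `k` and the comparison step stops earlier.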
62G30, [MATH.MATH-PR] Mathematics [math]/Probability [math.PR], 60G70, Statistics of extreme values; tail inference, Lepski's method, Mathematics - Statistics Theory, Hill estimator, Statistics Theory (math.ST), adaptivity, Extreme value theory; extremal stochastic processes, order statistics, concentration inequalities, FOS: Mathematics, Inequalities; stochastic orderings, Order statistics; empirical distribution functions, 60E15, 62G32
