
arXiv: 1804.08741
New estimates of the conditional Shannon entropy are introduced in the framework of a model describing a discrete response variable depending on a vector of \(d\) factors having a density w.r.t. the Lebesgue measure in \(\mathbb{R}^d\). Namely, the mixed-pair model \((X,Y)\) is considered, where \(X\) and \(Y\) take values in \(\mathbb{R}^d\) and an arbitrary finite set, respectively. Such models include, for instance, the famous logistic regression. In contrast to the well-known Kozachenko–Leonenko estimates of unconditional entropy, the proposed estimates are constructed by means of certain spatial order statistics (or \(k\)-nearest neighbor statistics, where \(k=k_n\) depends on the number of observations \(n\)) and a random number of i.i.d. observations contained in balls of specified random radii. The asymptotic unbiasedness and \(L^2\)-consistency of the new estimates are established under simple conditions. The obtained results can be applied to the feature selection problem, which is important, e.g., for medical and biological investigations.
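The construction described in the abstract (a \(k\)-nearest neighbor radius combined with a random count of observations inside the resulting ball) can be illustrated with a minimal Python sketch. This is not the estimator defined in the paper, whose exact form and normalization are not given here; it is a simple plug-in variant built on stated assumptions. For each observation \(i\), the radius is taken as the distance to the \(k\)-th nearest neighbor among points sharing the label \(Y_i\), \(m_i\) counts all other observations inside that ball, and \(k/m_i\) serves as a local estimate of \(P(Y=Y_i \mid X=X_i)\). The function name `conditional_entropy_knn` and the fixed choice of `k` are hypothetical.

```python
import numpy as np


def conditional_entropy_knn(x, y, k=5):
    """Illustrative k-NN plug-in estimate of H(Y | X) in nats.

    Assumption: this is a simplified sketch, not the paper's estimator.
    x: array of shape (n,) or (n, d) with continuous factors.
    y: array of shape (n,) with labels from a finite set.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    y = np.asarray(y)
    n = x.shape[0]

    # Pairwise Euclidean distances; O(n^2) memory, fine for small samples.
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude each point from its own ball

    log_ratios = np.empty(n)
    for i in range(n):
        # Distances from point i to the other points with the same label.
        same = dist[i, y == y[i]]
        same = same[np.isfinite(same)]
        if same.size < k:
            raise ValueError("each label needs at least k + 1 observations")
        # Random radius: distance to the k-th nearest same-label neighbor.
        radius = np.partition(same, k - 1)[k - 1]
        # Random count: all other observations inside the closed ball.
        m = np.count_nonzero(dist[i] <= radius)
        # k / m approximates P(Y = y_i | X = x_i), so -log(k / m) = log(m / k).
        log_ratios[i] = np.log(m / k)

    return log_ratios.mean()
```

When \(Y\) is a deterministic function of \(X\), almost every ball contains only same-label points, so \(m_i = k\) and the estimate is close to zero; when \(Y\) is a fair coin independent of \(X\), the estimate should be near \(\log 2\). For a fixed small `k` the sketch is biased, which is one reason the paper lets \(k = k_n\) depend on \(n\).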
Measures of information, entropy, Estimation in multivariate analysis, logistic regression, Shannon entropy, asymptotic unbiasedness, Gaussian model, Mathematics - Statistics Theory, Statistics Theory (math.ST), \(L^p\)-limit theorems, conditional entropy estimates, Asymptotic properties of nonparametric inference, FOS: Mathematics, 60F25, 62G20, 62H12, \(L^2\)-consistency
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 10 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
