
LASSO is known to suffer from excessive shrinkage when the representation is sparse. To analyze this problem in detail, we consider a positive scaling of soft-thresholding estimators, which are the LASSO estimators in an orthogonal regression problem. In particular, we consider a non-parametric orthogonal regression problem that includes wavelet denoising. We first derive the risk (generalization error) of LARS-based (least angle regression) soft-thresholding with a single scaling parameter. We then show that, under a sparseness condition, the optimal scaling value that minimizes the risk is 1 + O(log n / n), where n is the number of samples. The important point is that the optimal scaling value is larger than one. This implies that an expanded soft-thresholding estimator generalizes better than naive soft-thresholding. It also implies that the risk of LARS-based soft-thresholding with the optimal scaling is smaller than the risk without scaling; we show that the difference is O(log n / n), which demonstrates the effectiveness of introducing scaling. In simple numerical experiments, we find that LARS-based soft-thresholding with scaling can improve both sparsity and generalization performance compared with naive soft-thresholding.
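
As a rough illustration of the scaled estimator discussed above, the following minimal sketch applies soft-thresholding to a handful of coefficients and then multiplies the result by a positive scaling factor. The function names, the sample coefficients, the threshold `lam`, and the scaling value `alpha = 1.1` are all assumptions chosen for illustration; they are not taken from the paper, and the sketch does not reproduce the LARS-based choice of threshold or the optimal scaling value.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; coefficients with |z| <= lam become 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def scaled_soft_threshold(z, lam, alpha):
    """Soft-threshold z, then multiply by a positive scaling factor alpha.

    alpha > 1 expands the shrunken estimate; alpha = 1 recovers
    plain soft-thresholding.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return alpha * soft_threshold(z, lam)


# Hypothetical coefficients in an orthogonal basis (illustrative values only).
coeffs = [3.0, -0.5, 1.2, -2.0]
lam = 1.0     # threshold level (assumed)
alpha = 1.1   # scaling factor slightly larger than one (assumed)

plain = [soft_threshold(z, lam) for z in coeffs]
expanded = [scaled_soft_threshold(z, lam, alpha) for z in coeffs]

print(plain)
print(expanded)
```

Scaling leaves the set of zeroed coefficients unchanged, so in this sketch it only counteracts the shrinkage of the surviving coefficients and does not affect which ones are kept.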
