
Dimensionality reduction is one of the main tasks in machine learning and data mining. Linear Discriminant Analysis (LDA) is a widely used supervised dimensionality reduction algorithm that has attracted considerable research interest. Classical LDA finds a subspace that minimizes within-class distance and maximizes between-class distance, where the between-class distance is computed as the arithmetic mean of all between-class distances. However, the arithmetic mean has two limitations. First, it gives equal weight to every between-class distance, so a few large between-class distances can dominate the result. Second, it does not consider pairwise between-class distances, so some classes may overlap with each other in the subspace. In this paper, we propose two formulations of harmonic mean based Linear Discriminant Analysis, HLDA and HLDAp, to demonstrate the benefit of the harmonic mean between-class distance and to overcome these limitations of classical LDA. We compare our algorithms with 11 existing single-label algorithms on seven datasets and with five existing multi-label algorithms on two datasets. On some single-label datasets, the absolute gain in classification accuracy reaches 39 percentage points over existing state-of-the-art algorithms; on multi-label data, our algorithms achieve significant improvement on five evaluation metrics compared to existing algorithms.
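The contrast between the two means can be seen in a small numeric sketch. The code below is only a toy illustration and not the HLDA or HLDAp formulation from the paper: the projected class means, the use of squared Euclidean distance, and the names `pairwise_sq_distances` and `scores` are all assumptions made up for this example. It compares two hypothetical one-dimensional projections of three classes, one in which two classes nearly coincide and one in which all three are evenly spread.

```python
from itertools import combinations
from statistics import fmean, harmonic_mean


def pairwise_sq_distances(means):
    """Squared Euclidean distance between every pair of class means."""
    return [
        sum((a - b) ** 2 for a, b in zip(u, v))
        for u, v in combinations(means, 2)
    ]


def scores(means):
    """Return (arithmetic mean, harmonic mean) of the pairwise distances."""
    d = pairwise_sq_distances(means)
    return fmean(d), harmonic_mean(d)


# Hypothetical projected class means (illustrative values only).
# Projection A: classes 1 and 2 almost overlap, class 3 is far away.
projection_a = [(0.0,), (0.1,), (10.0,)]
# Projection B: all three classes are evenly separated.
projection_b = [(0.0,), (4.0,), (8.0,)]

arith_a, harm_a = scores(projection_a)
arith_b, harm_b = scores(projection_b)

print(f"A: arithmetic={arith_a:.3f}, harmonic={harm_a:.3f}")
print(f"B: arithmetic={arith_b:.3f}, harmonic={harm_b:.3f}")
```

Under the arithmetic mean, projection A scores higher because its two large distances outweigh the near-zero one, even though two classes overlap. Under the harmonic mean, the smallest pairwise distance dominates, so projection B scores higher, which matches the motivation given above for preferring a harmonic mean between-class distance.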
