
Many big data applications require accurate classification of objects into one of possibly thousands or millions of categories. Such classification tasks are challenging due to issues such as class imbalance, high testing cost, and model interpretability problems. To overcome these challenges, we propose a novel hierarchical learning method known as MF-Tree to efficiently classify data sets with large number of classes while simultaneously inducing a taxonomy structure that captures relationships among the classes. Unlike many other existing hierarchical learning methods, our approach is designed to optimize a global objective function. We demonstrate the equivalence between our proposed regularized loss function and the Hilbert-Schmidt Independence Criterion (HSIC). The latter has a nice additive property, which allows us to decompose the multi-class learning problem into hierarchical binary classification tasks. To improve its training efficiency, an approximate algorithm for inducing MF-Tree is also proposed. We performed extensive experiments to compare MF-Tree against several state-of-the-art algorithms and showed both its effectiveness and efficiency when applied to real-world data sets.
| selected citations These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 4 | |
| popularity This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average | |
| influence This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average | |
| impulse This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
