Many information tasks involve objects that are explicitly or implicitly connected in a network (or graph), such as webpages connected by hyperlinks or people linked by “friendships” in a social network. Research on link-based classification (LBC) has shown how to leverage these connections to improve classification accuracy. Unfortunately, acquiring a sufficient number of labeled examples to enable accurate learning for LBC can often be expensive or impractical. In response, some recent work has proposed the use of active learning, where the LBC method can intelligently select a limited set of additional labels to acquire, so as to reduce the overall cost of learning a model with sufficient accuracy. This work, however, has produced conflicting results and has not considered recent progress for LBC inference and semi-supervised learning. In this paper, we evaluate multiple prior methods for active learning and demonstrate that none consistently improve upon random guessing. We then introduce two new methods that both seek to improve active learning by leveraging the link structure to identify nodes to acquire that are more representative of the underlying data. We show that both approaches have some merit, but that one method, by proactively acquiring nodes so as to produce a more representative distribution of known labels, often leads to significant accuracy increases with minimal computational cost.
