
doi: 10.1007/s11192-025-05418-8, 10.48550/arxiv.2407.13329, 10.5281/zenodo.11841798, 10.5281/zenodo.15011985
arXiv: 2407.13329
handle: 11585/1027454
Abstract: Understanding the motivations underlying scholarly citations is essential to evaluate research impact and promote transparent scholarly communication. This study introduces CiteFusion, an ensemble framework designed to address the multi-class Citation Intent Classification task on two benchmark datasets: SciCite and ACL-ARC. The framework decomposes the multi-class task into class-specific binary subtasks using a one-vs-all scheme, and assigns to each citation intent a complementary pair of independently tuned SciBERT and XLNet models. The outputs of these base models are aggregated by a feedforward neural network meta-classifier, which reconstructs the original multi-class task. To enhance interpretability, SHAP (SHapley Additive exPlanations) is used to analyze token-level contributions and the interactions among base models, providing transparency into the classification dynamics of CiteFusion and insight into the kinds of misclassification the ensemble makes. In addition, this work investigates the semantic role of structural context by incorporating section titles into the input sentences as framing devices, and assesses their positive effect on classification accuracy. CiteFusion demonstrates robust performance in imbalanced and data-scarce scenarios: experimental results show that it achieves state-of-the-art performance, with Macro-F1 scores of 89.60% on SciCite and 76.24% on ACL-ARC. Furthermore, to ensure interoperability and reusability, the citation intents from both datasets' schemas are mapped to Citation Typing Ontology (CiTO) object properties, highlighting some overlaps. Finally, we describe and release a web-based application that classifies citation intents using the CiteFusion models developed on SciCite.
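The one-vs-all decomposition, the stacking of base-model outputs, and the Macro-F1 metric mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the label names, the function names, and the `base_probs` dictionary layout are assumptions made for illustration, and the feedforward meta-classifier itself is left out (the stacked vector is simply what such a classifier would take as input).

```python
from typing import Dict, List, Sequence, Tuple

# Illustrative label set (assumed names, modelled on SciCite's three intents).
INTENTS = ["background", "method", "result"]
# Two base models per intent, as described in the abstract.
MODELS = ["scibert", "xlnet"]


def one_vs_all_targets(labels: Sequence[str], intents: Sequence[str]) -> Dict[str, List[int]]:
    """Turn multi-class labels into one binary target list per intent."""
    return {intent: [1 if y == intent else 0 for y in labels] for intent in intents}


def stack_features(
    base_probs: Dict[Tuple[str, str], float],
    intents: Sequence[str],
    models: Sequence[str],
) -> List[float]:
    """Concatenate the positive-class probability of every (intent, model)
    binary classifier into one feature vector for a meta-classifier."""
    return [base_probs[(intent, model)] for intent in intents for model in models]


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str], intents: Sequence[str]) -> float:
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for intent in intents:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == intent and p == intent)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != intent and p == intent)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == intent and p != intent)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

Because Macro-F1 averages per-class scores without weighting by class frequency, a classifier that ignores a rare intent is penalized heavily, which is why the metric suits imbalanced datasets such as those named in the abstract.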
FOS: Computer and information sciences, Computation and Language (cs.CL), Citation Intent Classification, Ensemble Strategies, Explainable AI, Language Models
| Indicator | Description | Value |
| --- | --- | --- |
| Selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| Popularity | Reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| Influence | Reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| Impulse | Reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
