https://doi.org/10.3233/faia25...
Part of book or chapter of book . 2025 . Peer-reviewed
License: CC BY NC
Data sources: Crossref
mEDRA
Part of book or chapter of book . 2025
Data sources: mEDRA

Automatic Labelling of Topics in Topic Modelling

Authors: Angerri, Xavier; Sacco, Dario; Bosque, Edgar; Gibert, Karina;

Abstract

Data surrounds us, data-driven decisions are becoming increasingly common, and applications of the KDD (Knowledge Discovery from Data) process proposed by [7] are becoming crucial in the construction of the new digital society. The main step of the KDD process is Data Mining, where the data-driven models are actually trained and built. Many data-driven methodologies can be used there to extract knowledge from data [8] [4], among them advanced multivariate techniques such as Principal Component Analysis (PCA). In recent years it has become clear that a very relevant step in the KDD process is the interpretation of data mining results, especially when KDD is applied to real situations where decision-making support is intended, and explainable AI has become a new central research field. This paper proposes a methodology for the automatic elicitation of the latent topic represented in a principal component. The proposal automates the interpretation process that multivariate experts follow to interpret the factorial components. It is based on the introduction of a machine-readable metadata model that describes the data, through which semantic elements can be transferred to the machine and used for the interpretation. The proposal relies on [2], where regular expressions are used to generate automatic verbal descriptions of the interpretation of a PCA axis, based on the relevant contributions of the numerical variables and modalities projected in the factorial map. Nevertheless, one of the main goals of PCA is to elicit latent variables and determine the main topic represented in each dimension [1], and even though the proposal of [2] successfully provides a verbal description of the principal components, more research is required to abstract the meaning of a principal component into a term or tag that synthesizes the verbal description.
To make this further contribution to the interpretation process, the machine-readable meta-information model developed in [3] is used to raise the level of conceptualization provided by the system. This meta-information model contains the information needed to address our goals. The proposal is applied to education, on a dataset containing scores in basic competencies of children in primary and secondary education. A key conclusion is that academic performance and learning progression emerge as the dominant themes in the first factorial plane, and that they can be automatically identified using the developed methodology.
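
The general idea of labelling a principal component from variable contributions and a metadata model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the metadata model of [3] or the regular-expression rules of [2], so the `METADATA` mapping, the variable names, the above-average contribution threshold, and the functions `label_components` and `make_demo_data` are all assumptions made for illustration, and the data is synthetic.

```python
import numpy as np
from collections import defaultdict

# Hypothetical metadata model: each variable is mapped to a broader concept.
# The real model in [3] is richer; this mapping is only an assumption.
METADATA = {
    "math_score": "academic performance",
    "reading_score": "academic performance",
    "science_score": "academic performance",
    "grade_level": "learning progression",
    "age": "learning progression",
}


def label_components(X, names, metadata, n_components=2):
    """Return one concept label per principal component.

    A variable counts as relevant for an axis when its contribution
    (squared loading) is at least the uniform share 1/p. This threshold
    is a common rule of thumb, assumed here for the sketch.
    """
    # Standardise the columns so the PCA works on the correlation structure.
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    threshold = 1.0 / len(names)
    labels = []
    for axis in Vt[:n_components]:
        contrib = axis ** 2  # contributions of the variables, summing to 1
        relevant = [i for i, c in enumerate(contrib) if c >= threshold]
        if not relevant:  # numerical edge case: keep the top variable
            relevant = [int(np.argmax(contrib))]
        # Accumulate contributions per concept and keep the heaviest one.
        weight = defaultdict(float)
        for i in relevant:
            weight[metadata[names[i]]] += contrib[i]
        labels.append(max(weight, key=weight.get))
    return labels


def make_demo_data(n=500, seed=0):
    """Synthetic data: three noisy scores and two noisy progression variables."""
    rng = np.random.default_rng(seed)
    ability = rng.normal(size=n)
    progress = rng.normal(size=n)

    def noise():
        return 0.1 * rng.normal(size=n)

    X = np.column_stack([
        ability + noise(), ability + noise(), ability + noise(),
        progress + noise(), progress + noise(),
    ])
    return X, list(METADATA)


if __name__ == "__main__":
    X, names = make_demo_data()
    print(label_components(X, names, METADATA))
```

Aggregating contributions by concept, rather than naming the single heaviest variable, is what moves the output from a description of the axis towards a tag for it, which is the step the abstract identifies as missing in [2].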

  • Impact by BIP!
    selected citations: 0
    These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    popularity: Average
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    influence: Average
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    impulse: Average
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.