Powered by OpenAIRE graph
Advances in Engineer...
mEDRA
Article . 2025
Data sources: mEDRA

EPLSC: A New Semi-Supervised Ensemble Spectral Clustering Algorithm Based on The Graph P-Laplacian for Genetic Data

Authors: Garcia, Valeria; Sanchez, Agustina

Abstract

As data volumes grow and analyses become more detailed, clustering, which reveals hidden patterns in data, remains an important problem. Genetic data, however, are often high-dimensional, and traditional clustering methods face many limitations on them. This work introduces EPLSC, a new semi-supervised ensemble spectral clustering algorithm based on the graph p-Laplacian for genetic data. The proposed approach first propagates the pairwise must-link and cannot-link constraints to all data. The feature space is then randomly split into several subspaces of unequal size. Using the updated pairwise constraints, semi-supervised spectral clustering is performed independently in each subspace, and the resulting partitions are combined into an adjacency matrix through ensemble learning. Next, several search operators, applied in environments composed of different subspaces, are used to obtain the best set of subspaces. Experimental validation on 15 high-dimensional genetic datasets shows that EPLSC outperforms existing methods, with improvements of up to 18% in Normalized Mutual Information (NMI) and 15% in Adjusted Rand Index (ARI) over traditional semi-supervised techniques. These results indicate that EPLSC both improves clustering quality and addresses the challenges specific to genetic data.
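Two of the steps named in the abstract, the random split of the feature space and the ensemble adjacency matrix, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `split_features` and `co_association` are hypothetical, the split uses simple random cut points, and the adjacency matrix is assumed to be a plain co-association matrix (the fraction of base clusterings that place two samples in the same cluster). Constraint propagation, the p-Laplacian spectral step, and the subspace search are not shown.

```python
import random


def split_features(n_features, n_subspaces, seed=0):
    """Randomly partition feature indices into non-empty groups.

    Assumption: random cut points stand in for the paper's splitting rule;
    the resulting group sizes are generally (not provably) unequal.
    Requires 1 <= n_subspaces <= n_features.
    """
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)
    # n_subspaces - 1 distinct cut points yield n_subspaces non-empty slices.
    cuts = sorted(rng.sample(range(1, n_features), n_subspaces - 1))
    bounds = [0] + cuts + [n_features]
    return [sorted(indices[a:b]) for a, b in zip(bounds, bounds[1:])]


def co_association(partitions):
    """Build an n x n matrix of pairwise agreement across base clusterings.

    Each partition is a list of cluster labels, one per sample. Entry (i, j)
    is the fraction of partitions in which samples i and j share a label.
    """
    n = len(partitions[0])
    m = len(partitions)
    return [
        [sum(p[i] == p[j] for p in partitions) / m for j in range(n)]
        for i in range(n)
    ]


# Hypothetical labels from three base clusterings of four samples.
base_partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [1, 1, 0, 0]]
adjacency = co_association(base_partitions)
subspaces = split_features(n_features=10, n_subspaces=3, seed=1)
```

In this toy input, samples 0 and 1 share a cluster in all three partitions, so their entry is 1.0, while samples 0 and 2 agree in only one of three. A matrix like this can then serve as the affinity input to a final clustering step.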

Keywords

QA76.75-76.765, high-dimensional data, Mining engineering. Metallurgy, TN1-997, ensemble learning, random subspace, Computer software, semi-supervised, pairwise constraints, clustering

  • BIP! impact indicators
    selected citations: 0
    popularity: Average
    influence: Average
    impulse: Average
gold