Powered by OpenAIRE graph
Found an issue? Give us feedback
ZENODO
Software
Data sources: ZENODO

HOPPred – Prediction of peptide hormones using an ensemble of machine learning and similarity‑based methods

Authors: Kaur, Dashleen; Arora, Akanksha; Raghava, Gajendra


Abstract

Title: HOPPred Dataset – Experimentally validated peptide hormones and non‑hormonal peptides

Description:

Project: HOPPred – Prediction of peptide hormones using an ensemble of machine learning and similarity‑based methods

Publication: Kaur, D., Arora, A., Vigneshwar, P., & Raghava, G.P.S. (2024). Prediction of peptide hormones using an ensemble of machine learning and similarity‑based methods. Proteomics, 24, e2400004. https://doi.org/10.1002/pmic.202400004

Overview: This dataset accompanies HOPPred, the first computational tool for predicting peptide hormones. Peptide hormones are genome‑encoded signal transduction molecules essential for regulating growth, development, and homeostasis; their dysregulation leads to endocrine disorders (e.g., diabetes, neoplasia). The dataset is curated from Hmrbase2 and other sources, is balanced (1,174 hormonal and 1,174 non‑hormonal peptides), and is redundancy‑reduced with CD‑HIT at 90% sequence identity.

Content:

Dataset                     Peptides
Hormonal (positive)         1,174
Non‑hormonal (negative)     1,174
Total                       2,348

Key Findings:

Compositional analysis: Hormonal peptides are enriched in Cysteine (C), Aspartic acid (D), Phenylalanine (F), Glycine (G), Arginine (R), Serine (S), Asparagine (N), Proline (P), and Tyrosine (Y); the enrichment is statistically significant (Mann‑Whitney U, p < 0.05). Non‑hormonal peptides are enriched in Glutamic acid (E), Isoleucine (I), Leucine (L), Methionine (M), Glutamine (Q), Lysine (K), Threonine (T), and Valine (V).

Exclusive motifs in hormonal peptides (MERCI): FGPR, WFGP, WFGPR, FGPRL, GPRL, MWFGPRL, LCGS (LCGS is a known motif in Insulin chain B).

Best Model Performance (validation set, 20% held out):

Model                               AUC    MCC    Accuracy   Sensitivity   Specificity
Ensemble (LR + Motif + BLAST)       0.96   0.80   89.8%      90.1%         89.5%
LR (ML alone, top 50 features)      0.93   0.72   86.0%      85.3%         86.6%
TextCNN (DL)                        0.90   0.67   83.0%      87.0%         79.0%
RF (ML, top 50 features)            0.90   0.64   82.1%      80.2%         84.0%
TabNet (DL)                         0.75   0.57   74.0%      73.0%         75.0%

Top features align with motifs: DPC1_CF (Cys‑Phe), TPC_FRP (Phe‑Arg‑Pro), TPC_GNF (Gly‑Asn‑Phe), TPC_LMG (Leu‑Met‑Gly), and TPC_RGL (Arg‑Gly‑Leu) overlap with motifs such as FGPR and WFGPRL, which supports their biological relevance.

Data Curation & Quality Control:

Source: Hmrbase2 (hormone database) + PeptideAtlas + UniProt/Swiss‑Prot
Redundancy reduction: CD‑HIT at 90% sequence identity
Negative set: Randomly selected from Swiss‑Prot, excluding known hormones
Train/validation split: 80/20 (5‑fold CV on training)
Feature selection: RFE (Recursive Feature Elimination) with Logistic Regression as estimator

Usage: Predicting peptide hormones from sequence; designing novel hormone peptides (Design module); scanning protein sequences for hormone regions (Protein Scan module); identifying hormone‑associated motifs; developing peptide‑based therapeutics and endocrine disorder treatments.

Related Resources:

Web server: https://webs.iiitd.edu.in/raghava/hoppred/
GitHub: https://github.com/raghavagps/HOPPRED

Contact: raghava@iiitd.ac.in (Gajendra P. S. Raghava)
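
The composition features and motifs named in the description can be illustrated with a small sketch. It is not taken from the HOPPred code base: the function names `kmer_composition` and `motif_hits` and the example peptide are hypothetical, and it assumes that feature names such as DPC1_CF and TPC_FRP denote the percentage of adjacent Cys‑Phe pairs and Phe‑Arg‑Pro triplets in a sequence. Motif matching here is plain substring search, which may differ from how MERCI motifs are applied in the actual tool.

```python
from collections import Counter

# Motifs listed in the record as exclusive to hormonal peptides.
HORMONE_MOTIFS = ["FGPR", "WFGP", "WFGPR", "FGPRL", "GPRL", "MWFGPRL", "LCGS"]


def kmer_composition(seq, k):
    """Return the percentage of each overlapping k-mer among all k-mers in seq.

    k=1 gives amino acid composition, k=2 dipeptide composition,
    k=3 tripeptide composition (assumed meaning of the DPC/TPC features).
    """
    seq = seq.upper()
    total = len(seq) - k + 1
    if total <= 0:
        return {}  # sequence shorter than k: no k-mers to count
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: 100.0 * n / total for kmer, n in counts.items()}


def motif_hits(seq, motifs=HORMONE_MOTIFS):
    """Return the motifs that occur as exact substrings of seq."""
    seq = seq.upper()
    return [m for m in motifs if m in seq]


# Hypothetical peptide, made up for illustration (not from the dataset).
peptide = "MWFGPRLCF"

dipeptides = kmer_composition(peptide, 2)
print(dipeptides.get("CF", 0.0))   # 12.5 (1 of 8 adjacent pairs is Cys-Phe)
print(motif_hits(peptide))         # every listed motif except LCGS
```

A real pipeline would compute such values for all 20 residues, 400 dipeptides, and 8,000 tripeptides per peptide and pass them to feature selection; the sketch only shows how a single feature value and the motif hits arise from one sequence.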
