
Title: Pfeature – A comprehensive tool for computing protein/peptide features and building prediction models
Description:
Project: Pfeature – A tool for computing a wide range of protein features and building prediction models
Publication: Pande, A., Patiyal, S., Lathwal, A., Arora, C., Kaur, D., Dhall, A., Mishra, G., Kaur, H., Sharma, N., Jain, S., Usmani, S.S., Agrawal, P., Kumar, R., Kumar, V., & Raghava, G.P.S. (2023). Pfeature: A Tool for Computing Wide Range of Protein Features and Building Prediction Models. Journal of Computational Biology, 30(2), 204–222. https://doi.org/10.1089/cmb.2022.0241

Overview: Pfeature is a software platform for computing a wide range of protein/peptide features (>200,000 descriptors) and building machine learning prediction models. It addresses limitations of existing tools in two ways: it integrates novel features (Shannon entropy, residue repeats, distance distribution, atom/bond composition), and it supports chemically modified peptides through structural descriptors. The tool is available as a web server, a Python library, and a standalone package.
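
To make one of the features named in the overview concrete, the following is a minimal sketch of protein-level Shannon entropy. It assumes the standard definition H = -Σ p_i log2(p_i) over observed residue frequencies; the function name `shannon_entropy` is hypothetical and is not part of the Pfeature API, whose exact implementation may differ.

```python
import math
from collections import Counter


def shannon_entropy(sequence: str) -> float:
    """Return the Shannon entropy (in bits) of a protein sequence.

    Illustrative only: assumes the textbook definition over residue
    frequencies, not necessarily the formula used inside Pfeature.
    """
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    counts = Counter(seq)          # residue -> number of occurrences
    n = len(seq)
    # Sum -p * log2(p) over residues that actually occur in the sequence.
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


if __name__ == "__main__":
    # A homopolymer carries no uncertainty; four distinct residues give 2 bits.
    print(shannon_entropy("AAAA"))
    print(shannon_entropy("ACDE"))
```

Low values indicate low-complexity sequences dominated by a few residues; higher values indicate a more even residue distribution.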
Key Modules (module: description):
  Composition: AAC, DPC, TPC, atom/bond composition, AAIndex, autocorrelation, entropy, repeats, PSSM‑400
  Binary Profiles: Amino acid, dipeptide, property, AAIndex, atom/bond profiles (residue‑level annotation)
  Evolutionary Info: PSSM profile generation (raw + 4 normalization methods)
  Structural Descriptors: Fingerprints (14,532), SMILES, surface accessibility, secondary structure (for chemically modified peptides)
  Patterns: Overlapping windows, terminal regions (N‑term, C‑term, split, SAAP)
  Model Building: Feature merging, selection (mRMR, etc.), normalization, classification (RF, ET, XGB, SVC, etc.), 5‑fold CV

Feature Comparison – Unique to Pfeature:
  Shannon entropy (protein + residue level)
  Distance distribution of residues
  Residue repeats (homo‑ and hetero‑repeats)
  Physicochemical property repeats
  Atom and bond composition
  Dipeptide binary profiles
  AAIndex binary profiles
  Structural descriptors for chemically modified peptides

Total descriptors (whole protein, λ=5): 11,879 (protein level) + terminal/split regions → 95,137 total

Usage: Computing protein/peptide features for classification/regression, residue‑level annotation (secondary structure, binding sites), chemically modified peptide analysis (FDA‑approved therapeutics), model building without programming expertise.

Case Studies Citing Pfeature:
  IL6pred, IL13pred, AlgPred2.0, HLAncPred, ABCRpred
  SARS‑CoV‑2 ACE2 receptor analysis (Hassan et al., 2020)
  Amyloid protein prediction (Sofi & ArifWani, 2021)
  Metagenomic biocatalyst discovery (Shahraki et al., 2022)

Related Resources:
  Web server: https://webs.iiitd.edu.in/raghava/pfeature/
  GitHub: https://github.com/raghavagps/Pfeature

Contact: raghava@iiitd.ac.in (Gajendra P. S. Raghava)
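
As an illustration of the Composition module listed under Key Modules, here is a minimal sketch of amino acid composition (AAC) and dipeptide composition (DPC). It assumes the 20 standard residues and reports values as percentages; the function names `aac` and `dpc` are hypothetical and do not reproduce the Pfeature library's own interface.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def aac(sequence: str) -> dict:
    """Amino acid composition: percentage of each residue (20 values)."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    counts = Counter(seq)
    return {aa: 100.0 * counts[aa] / len(seq) for aa in AMINO_ACIDS}


def dpc(sequence: str) -> dict:
    """Dipeptide composition: percentage of each adjacent pair (400 values)."""
    seq = sequence.strip().upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping adjacent pairs: positions (0,1), (1,2), ...
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = len(seq) - 1
    return {a + b: 100.0 * pairs[a + b] / total
            for a, b in product(AMINO_ACIDS, repeat=2)}


if __name__ == "__main__":
    example = "AACD"  # toy peptide, not real data
    print(aac(example)["A"])
    print(dpc(example)["AA"])
```

Both functions return fixed-length vectors (20 and 400 entries), which is what makes such descriptors usable as inputs to the classifiers named under Model Building.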
