Powered by OpenAIRE graph
Found an issue? Give us feedback
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/ ZENODOarrow_drop_down
image/svg+xml art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos Open Access logo, converted into svg, designed by PLoS. This version with transparent background. http://commons.wikimedia.org/wiki/File:Open_Access_logo_PLoS_white.svg art designer at PLoS, modified by Wikipedia users Nina, Beao, JakobVoss, and AnonMoos http://www.plos.org/
ZENODO
Dataset . 2025
License: CC BY
Data sources: ZENODO
ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite
ZENODO
Dataset . 2025
License: CC BY
Data sources: Datacite
versions View all 2 versions
addClaim

MassIVE-KB v1 30 million PSMs training/validation/test splits

Authors: Bittremieux, Wout;

MassIVE-KB v1 30 million PSMs training/validation/test splits

Abstract

The MassIVE-KB data are derived from PSMs used to compile the MassIVE-KB v1 spectral library and consists of approximately 30 million PSMs. The PSMs were obtained by collecting up to the top 100 PSMs for each of the 2,154,269 precursors (as defined by a peptidoform and charge) included in the MassIVE-KB v1 spectral library. The data are split into peptide-disjoint training, validation, and test sets, consisting of: Training: 28,508,636 PSMs for 1,496,701 unique peptidoforms. Validation: 1,000,234 PSMs for 52,379 unique peptidoforms. Test: 996,027 PSMs for 52,399 unique peptidoforms. The dataset was originally compiled through the following steps: On the MassIVE website, go to MassIVE Knowledge Base > Human HCD Spectral Library > All Candidate library spectra > Download. This will give you a zipped TSV file with the metadata and peptide identifications for all 30 million PSMs. Using the filename (column "filename") you can then retrieve the corresponding peak files from the MassIVE FTP server (done using a wget script) and extract the desired spectra using their scan number (column "scan").

Related Organizations
  • BIP!
    Impact byBIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Powered by OpenAIRE graph
Found an issue? Give us feedback
selected citations
These citations are derived from selected sources.
This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Citations provided by BIP!
popularity
This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
BIP!Popularity provided by BIP!
influence
This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
BIP!Influence provided by BIP!
impulse
This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
BIP!Impulse provided by BIP!
0
Average
Average
Average