
The MassIVE-KB data are derived from the PSMs used to compile the MassIVE-KB v1 spectral library and consist of approximately 30 million PSMs. The PSMs were obtained by collecting up to the top 100 PSMs for each of the 2,154,269 precursors (each defined by a peptidoform and charge) included in the MassIVE-KB v1 spectral library.

The data are split into peptide-disjoint training, validation, and test sets:

Training: 28,508,636 PSMs for 1,496,701 unique peptidoforms.
Validation: 1,000,234 PSMs for 52,379 unique peptidoforms.
Test: 996,027 PSMs for 52,399 unique peptidoforms.

The dataset was originally compiled through the following steps:

On the MassIVE website, go to MassIVE Knowledge Base > Human HCD Spectral Library > All Candidate library spectra > Download. This gives a zipped TSV file with the metadata and peptide identifications for all 30 million PSMs.
Using the file name (column "filename"), retrieve the corresponding peak files from the MassIVE FTP server (this was done using a wget script) and extract the desired spectra by their scan number (column "scan").
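The retrieval step can be sketched in Python as follows. This is a minimal illustration and not the original wget script: it groups the scan numbers by peak file so that each file needs to be downloaded only once. The column names "filename" and "scan" come from the description above. The "annotation" column, the example rows, the helper names, and the base URL are hypothetical placeholders; the actual FTP path must be taken from the MassIVE server.

```python
import csv
import io
from collections import defaultdict


def scans_by_file(tsv_text, file_col="filename", scan_col="scan"):
    """Map each peak file name to the sorted scan numbers to extract from it."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    wanted = defaultdict(set)
    for row in reader:
        wanted[row[file_col]].add(int(row[scan_col]))
    return {name: sorted(scans) for name, scans in wanted.items()}


def download_urls(file_names, base_url):
    """Build one URL per peak file.

    base_url is a placeholder supplied by the caller, not a verified
    MassIVE path.
    """
    base = base_url.rstrip("/")
    return [base + "/" + name.lstrip("/") for name in sorted(file_names)]


# Made-up rows standing in for the downloaded TSV; the "annotation"
# column is hypothetical and only shows that extra columns are ignored.
example = (
    "filename\tscan\tannotation\n"
    "run_a.mzXML\t120\tPEPTIDEK\n"
    "run_b.mzXML\t7\tSAMPLER\n"
    "run_a.mzXML\t15\tPEPTIDEK\n"
)

plan = scans_by_file(example)
urls = download_urls(plan, "ftp://example.org/placeholder/")
```

The resulting list of URLs could be written to a text file and passed to wget, and the per-file scan lists then drive the spectrum extraction with whichever peak-file reader is in use.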
