
Bottom-up mass spectrometry (MS) supports disease research through the detection of protein isoforms. Recently, personalized proteomes have enabled more sensitive MS searches. In proteogenomics approaches, long-read RNA-seq (lrRNA-seq) data can be leveraged to create a sample-specific proteome that integrates genomic variation and alternative splicing events. We benchmark several algorithms for variant phasing on PacBio lrRNA-seq data and show that incorporating lrRNA-seq-based phased variants can increase peptide and protein isoform detection in MS-based searches. For this purpose, we develop a pipeline that constructs haplotype-resolved, sample-specific proteomes, followed by MS search and annotation. Our workflow can be applied to samples with matched MS and lrRNA-seq data. We apply it to a WTC11 sample and a ten-day osteoblast differentiation, highlighting its applicability to both single samples and more complex experimental designs. We show that searching against sample-specific, haplotype-resolved proteomes enables better detection and characterization of protein isoforms and supports the detection of linked variants. Consistent with previous work, genetic variation was a much greater contributor to proteomic complexity than alternative splicing in the WTC11 sample we considered. Our open-source Snakemake pipeline aims to support research and applications of haplotype-resolved MS searching based on lrRNA-seq data.
