N-Terminal-oriented proteogenomics of the marine bacterium Roseobacter Denitrificans Och114 using N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP) labeling and diagonal chromatography.
Other literature type, Article
Hartmann, E. M.
Christie-Oleza, Joseph Alexander
- Publisher: American Society for Biochemistry and Molecular Biology
Technological Innovation and Resources | QH301
Given the ease of whole genome sequencing with next-generation sequencers, structural and functional gene annotation is now purely based on automated prediction. However, errors in gene structure are frequent, the correct determination of start codons being one of the main concerns. Here, we combine protein N termini derivatization using (N-Succinimidyloxycarbonylmethyl)tris(2,4,6-trimethoxyphenyl)phosphonium bromide (TMPP Ac-OSu) as a labeling reagent with the COmbined FRActional DIagonal Chromatography (COFRADIC) sorting method to enrich labeled N-terminal peptides for mass spectrometry detection. Protein digestion was performed in parallel with three proteases to obtain a reliable automatic validation of protein N termini. The analysis of these N-terminal enriched fractions by high-resolution tandem mass spectrometry allowed the annotation refinement of 534 proteins of the model marine bacterium Roseobacter denitrificans OCh114. This study is especially efficient regarding mass spectrometry analytical time. From the 534 validated N termini, 480 confirmed existing gene annotations, 41 highlighted erroneous start codon annotations, five revealed totally new mis-annotated genes; the mass spectrometry data also suggested the existence of multiple start sites for eight different genes, a result that challenges the current view of protein translation initiation. Finally, we identified several proteins for which classical genome homology-driven annotation was inconsistent, questioning the validity of automatic annotation pipelines and emphasizing the need for complementary proteomic data. All data have been deposited to the ProteomeXchange with identifier PXD000337.