Publication: Article, Other literature type. 11 Jun 2016. English. Publisher: Oxford University Press (OUP). Journal: Bioinformatics, volume 32, pages i28-i36 (ISSN: 1367-4803, eISSN: 1367-4811)

Authors: Shen, Huibin; Dührkop, Kai; d'Alché-Buc, Florence; Böcker, Sebastian; Rousu, Juho; Brouard, Celine

doi: 10.1093/bioinformatics/btw246

pmid: 27307628

pmc: PMC4908330

Fast metabolite identification with Input Output Kernel Regression


Abstract

Motivation: An important problem in metabolomics is to identify metabolites using tandem mass spectrometry data. Machine learning methods have been proposed recently to solve this problem by predicting molecular fingerprint vectors and matching these fingerprints against existing molecular structure databases. In this work we propose to address the metabolite identification problem using a structured output prediction approach. This type of approach is not limited to vector output spaces and can handle structured output spaces such as the molecule space.

Results: We use the Input Output Kernel Regression method to learn the mapping between tandem mass spectra and molecular structures. The principle of this method is to encode the similarities in the input (spectra) space and the similarities in the output (molecule) space using two kernel functions. This method approximates the spectra-molecule mapping in two phases. The first phase corresponds to a regression problem from the input space to the feature space associated with the output kernel. The second phase is a preimage problem, which consists of mapping the predicted output feature vectors back to the molecule space. We show that our approach achieves state-of-the-art accuracy in metabolite identification. Moreover, our method has the advantage of decreasing the running times for the training step and the test step by several orders of magnitude over the preceding methods.

Availability and implementation:

Contact: celine.brouard@aalto.fi

Supplementary information: Supplementary data are available at Bioinformatics online.
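The two phases described in the abstract can be illustrated with a minimal sketch. It assumes the simplest setting: kernel ridge regression for the first phase and a preimage search restricted to a finite candidate list for the second. The function names (`linear_kernel`, `iokr_weights`, `preimage_scores`), the regularization value, and the one-hot toy data are hypothetical choices made for illustration; they are not taken from the paper or its software.

```python
import numpy as np

def linear_kernel(A, B):
    """Gram matrix of dot products between the rows of A and the rows of B."""
    return A @ B.T

def iokr_weights(K_x_train, k_x_test, reg):
    """Phase 1 (regression into the output feature space).

    Solves (K_x + reg * I) w = k_x(test), so that the predicted output
    feature vector is the weighted sum  sum_i w[i] * psi(y_i)  over the
    training molecules y_i. `reg` is an assumed ridge parameter.
    """
    n = K_x_train.shape[0]
    return np.linalg.solve(K_x_train + reg * np.eye(n), k_x_test)

def preimage_scores(weights, K_y_cand_train, k_y_cand_diag):
    """Phase 2 (preimage over a candidate set).

    Minimizing ||psi(y) - h(x)||^2 over candidates y is equivalent to
    maximizing <psi(y), h(x)> - 0.5 * k_y(y, y), which needs only
    output-kernel values and never an explicit feature map.
    """
    return K_y_cand_train @ weights - 0.5 * k_y_cand_diag

# Toy data (hypothetical): 3 training spectra and 3 molecular fingerprints.
X_train = np.eye(3)            # one-hot "spectra"
Y_train = np.eye(3)            # one-hot "fingerprints" of the training molecules
x_test = np.array([[1.0, 0.0, 0.0]])
candidates = np.array([[0.0, 1.0, 0.0],
                       [1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0]])

K_x = linear_kernel(X_train, X_train)               # input kernel on training spectra
k_x = linear_kernel(X_train, x_test)[:, 0]          # input kernel: train vs. test
K_y_cand = linear_kernel(candidates, Y_train)       # output kernel: candidates vs. train
k_y_diag = np.sum(candidates * candidates, axis=1)  # k_y(y, y) for each candidate

w = iokr_weights(K_x, k_x, reg=0.1)
scores = preimage_scores(w, K_y_cand, k_y_diag)
best = int(np.argmax(scores))  # index of the top-ranked candidate
```

In this sketch the only linear system solved has the size of the training set, and scoring a candidate costs one row of output-kernel values, which gives some intuition for why a kernel-based formulation of this kind can be fast at both training and test time.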


Keywords

ta113; Molecular Structure; [SDV] Life Sciences [q-bio]; Computational Biology; 004; Machine Learning; Tandem Mass Spectrometry; Metabolomics; ISMB 2016 Proceedings July 8 to July 12, 2016, Orlando, Florida; Algorithms; Databases, Chemical

Related research

Data Used In "Fast Metabolite Identification With Input Output Kernel Regression" (2017, relation: IsSupplementedBy)

Impact by BIP!

Selected citations: 69. These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Popularity: Top 1%. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
Influence: Top 10%. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
Impulse: Top 10%. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.