
High-throughput metabolomics investigations, when conducted in large human cohorts, represent a potentially powerful tool for elucidating the biochemical diversity underlying human health and disease. Large-scale metabolomics data sources, generated using either targeted or nontargeted platforms, are becoming more common. Appropriate statistical analysis of these complex high-dimensional data will be critical for extracting meaningful results from such large-scale human metabolomics studies. Therefore, we consider the statistical analytical approaches that have been employed in prior human metabolomics studies. Based on the lessons learned and collective experience to date in the field, we offer a step-by-step framework for pursuing statistical analyses of cohort-based human metabolomics data, with a focus on feature selection. We discuss the range of options and approaches that may be employed at each stage of data management, analysis, and interpretation and offer guidance on the analytical decisions that need to be considered over the course of implementing a data analysis workflow. Certain pervasive analytical challenges facing the field warrant ongoing focused research. Addressing these challenges, particularly those related to analyzing human metabolomics data, will allow for greater standardization of, as well as advances in, how research in the field is practiced. In turn, such major analytical advances will lead to substantial improvements in the overall contributions of human metabolomics investigations.
Biomedical and Clinical Sciences, Clinical Sciences, ta1182, 610, Review, Medical Biochemistry and Metabolomics, Microbiology, QR1-502, Analytical Chemistry, high-dimensional data, large-scale metabolomics, 2.5 Research design and methodologies (aetiology), Biochemistry and Cell Biology, Chemical Sciences, statistical methods, Generic health relevance, Aetiology
