
doi: 10.1002/cem.3232
handle: 2078.1/230905, 2078.1/219772
Abstract
Nowadays, life science experiments, especially in the “omics” fields, often yield a high volume of information from high-throughput technologies, gathered in the form of a wide and short multivariate response. These data are intrinsically correlated and are generally produced by another multivariate set of factors or continuous variables, collected in what is defined as the design matrix. Such design factors usually involve the presence of a treatment, but other sources of biological or technical variability in the data are often measured as well. The ASCA framework, based on ANOVA and PCA, leads to promising results. By combining dimension-reduction projection methods with classical statistical modelling, it makes it possible to decipher the main sources of variability in the response and offers attractive graphical representations of the factors' effects. However, this approach has not yet been extended to more advanced designs involving random factors, which typically arise in longitudinal, hierarchical, or repeatability/reproducibility studies. This paper has its roots in the GLM version of ASCA, called ASCA+, which leads to unbiased estimators of the factors' effects for unbalanced data. It is extended here by replacing the GLM with an LMM and adapting the methodology accordingly. Taking the error structure of the data into account leads to more accurate data modelling and more generalisable results. The suggested methodology is applied to two experimental case studies that highlight the benefits of this approach: it leads to a refined data analysis with interesting inferential properties while keeping the powerful visualisation outputs produced by ASCA.
PCA, ASCA, random effects, chemometrics, linear mixed models
