mixMC is a multivariate framework implemented in
mixOmics
for microbiome data analysis. The framework takes
into account the inherent characteristics of microbiome data, namely
sparsity (a large number of zeros in the data) and compositionality
(which occurs naturally in ecosystems and also results from sequencing
artefacts). The mixMC framework aims to identify key
microbial communities associated with their habitat or environment.
mixMC addresses limitations of existing multivariate methods for microbiome studies and offers analytical capabilities tailored to these data: it handles compositional and sparse data, repeated-measures experiments and multiclass problems. It also highlights important discriminative features and provides interpretable graphical outputs to better understand the microbial communities’ contribution to each habitat. The framework from our paper is summarised below:
mixMC is a pipeline set up for microbial
communities, using some of the standard methods in
mixOmics
with some adaptations. The sPLS-DA method has
been extended with a CLR (centered log-ratio) transformation and includes
a multilevel decomposition for the repeated-measures designs that are
commonly encountered in microbiome studies. The multilevel approach we
developed in [4] enables the detection of subtle differences when high
inter-subject variability is present due to microbial sampling performed
repeatedly on the same subjects but in multiple habitats. To account for
subject variability, the data variance is decomposed into within-subject
variation (due to habitat) and between-subject variation
[5], similar to a within-subjects ANOVA in univariate analyses.
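
The two preprocessing ideas described above, the CLR transformation and the within-subject decomposition, can be illustrated with a minimal Python sketch. This is not the mixOmics implementation: the function names `clr` and `within_variation`, the offset of 1 added to handle zeros, and the toy count table are all assumptions made for illustration only.

```python
import math


def clr(row, offset=1.0):
    """Centered log-ratio transform of one sample (a list of counts).

    The offset is an assumption of this sketch: it is added to every
    count so that zeros, which are common in microbiome data, have a
    defined logarithm.
    """
    logs = [math.log(x + offset) for x in row]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def within_variation(samples, subjects):
    """Subtract each subject's mean profile from that subject's samples.

    What remains is the within-subject part of the variation, i.e. the
    part that differences between habitats can explain.
    """
    n_features = len(samples[0])
    totals, n_samples = {}, {}
    for row, subj in zip(samples, subjects):
        acc = totals.setdefault(subj, [0.0] * n_features)
        for j, v in enumerate(row):
            acc[j] += v
        n_samples[subj] = n_samples.get(subj, 0) + 1
    means = {s: [t / n_samples[s] for t in totals[s]] for s in totals}
    return [
        [v - m for v, m in zip(row, means[subj])]
        for row, subj in zip(samples, subjects)
    ]


# Hypothetical toy data: 2 subjects, each sampled in 2 habitats, 3 taxa.
counts = [[10, 0, 5], [2, 8, 5], [100, 0, 50], [20, 80, 50]]
subjects = ["s1", "s1", "s2", "s2"]

clr_data = [clr(row) for row in counts]         # remove the compositional constraint
within = within_variation(clr_data, subjects)   # remove between-subject differences
```

In this sketch the two subjects differ strongly in overall abundance, yet after both steps only the habitat-related deviations around each subject's own mean remain. A discriminant method such as sPLS-DA would then be applied to the `within` matrix.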