The mixOmics package

mixOmics is an R package designed for the analysis and visualization of high-throughput data. It includes statistical methods, plotting functions, and example datasets for various biological questions. Here’s a quick guide to what you’ll find inside:

💡 Tip: Refer to the glossary for unfamiliar terms.

1. Statistical methods to analyse high-throughput data

mixOmics implements a range of multivariate statistical methods for exploring and integrating your data. The key methods are:

(s)PCA: (sparse) Principal Component Analysis – Shen and Huang 2008
(s)IPCA: (sparse) Independent Principal Component Analysis – Yao et al. 2012
(r)CCA: (regularized) Canonical Correlation Analysis – González et al. 2008
(s)PLS: (sparse) Partial Least Squares – see the articles for the regression and canonical deflation modes
(s)PLS-DA: (sparse) Partial Least Squares Discriminant Analysis – Lê Cao et al. 2011
Multilevel decomposition: for repeated measurements – Liquet et al. 2012
mixMC: for 16S multivariate analysis – Lê Cao et al. 2016
MINT: for P-integration – Rohart et al. 2017
DIABLO: for multiblock N-integration – Singh et al. 2019

Learn More: Check out our mixOmics article for an overview of integrative and supervised methods. To identify which method is most suitable for your analysis, use our Select your Method guide.
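
As a rough illustration of how one of the methods listed above is called, the sketch below fits an sPLS-DA model to the SRBCT example data. It assumes the package is installed and that the data load as `srbct` with `gene` and `class` components; the values of `ncomp` and `keepX` are arbitrary and untuned, chosen only to keep the example short.

```r
library(mixOmics)

# SRBCT example data shipped with the package
data(srbct)
X <- srbct$gene    # expression matrix: samples in rows, genes in columns
Y <- srbct$class   # factor giving the tumour class of each sample

# Sparse PLS-DA; ncomp and keepX are illustrative values, not tuned
fit <- splsda(X, Y, ncomp = 2, keepX = c(20, 20))

# Names of the variables selected on the first component
selected <- selectVar(fit, comp = 1)$name
head(selected)
```

In a real analysis the number of components and of variables to keep would normally be chosen by cross-validation rather than fixed by hand.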

2. Plotting functions to display and interpret the results

2D and 3D sample plots: to visualise sample relationships
Arrow plots: to visualise paired coordinates
Relevance Network Graphs: to visualise associations between variables
Clustered Image Maps: heatmaps of expression values or correlation between variables
Correlation circle plots: correlation of variables to latent components
Circos plots: for DIABLO analyses, to visualise how modalities relate to each other – Singh et al. 2019
Loading plots: to visualise variable importance – Lê Cao et al. 2016

Learn More: Check out González et al. 2012 for an overview of network graphs, clustered image maps and correlation circle plots.
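
To show how two of these plotting functions are typically chained after a model fit, here is a minimal sketch using sparse PCA on the liver.toxicity example data. It assumes the data expose a `gene` matrix and a `treatment` data frame with a `Dose.Group` column; the `keepX` values are arbitrary.

```r
library(mixOmics)

data(liver.toxicity)
X <- liver.toxicity$gene                              # samples in rows, genes in columns
dose <- factor(liver.toxicity$treatment$Dose.Group)   # dose group of each sample

# Sparse PCA keeping 25 genes per component (illustrative, untuned values)
spca.res <- spca(X, ncomp = 2, keepX = c(25, 25))

# Sample plot: samples projected onto the first two components, coloured by dose
plotIndiv(spca.res, group = dose, legend = TRUE)

# Correlation circle plot: correlation of the selected genes with the components
plotVar(spca.res)
```

The same pattern (fit a model, then pass the fitted object to a plotting function) should carry over to the other methods, although the plots available depend on the method used.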

3. Example data sets

To help you get started, we provide a variety of datasets illustrating different biological questions; some of these are used in the case studies.

Data	Integration type	Omics	Samples	Groups	Case study	Publication
Multidrug	Single omics	Transporter expression, drug activity	60	7 cell lines	sPCA, sPCA case study	Szakács et al., 2004
SRBCT	Single omics	mRNA	63	4 tumour classes	Performance assessment and parameter tuning, sPLS-DA, sPLS-DA case study	Khan et al., 2001
vac18	Single omics	mRNA	42	4 stimulation groups, multilevel	Multilevel, Multilevel case study	Salmon-Céron et al. 2010
liver.toxicity	N-integration-two omics	mRNA, clinical data	64	4 treatment doses, 4 treatment times	Missing Values, Parameter tuning, sIPCA, sIPCA case study, sPLS, sPLS case study	Bushel et al., 2007
nutrimouse	N-integration-two omics	mRNA, lipid data	40	5 diet groups, 2 genotypes	rCCA, rCCA case study	Martin et al., 2007
breast.TCGA	N-integration-multiomics	miRNA, mRNA, protein	150 training, 70 test	3 cancer subtypes	DIABLO, DIABLO case study	The Cancer Genome Atlas Network, 2012
stemcells	P-integration	mRNA	125	3 cell lines, 4 studies	MINT, MINT case study
diverse.16S	Single omics	microbiome	162	3 body sites	mixMC case study	Human Microbiome Project 16S dataset
koren.16S	Single omics	microbiome	43	3 body sites	mixMC pre-processing, mixMC case study	Koren et al. 2013
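
The example datasets are loaded with `data()` once the package is attached. The sketch below inspects nutrimouse as one example; the component names (`gene`, `lipid`, `diet`, `genotype`) are assumed from the bundled data and may differ for other datasets.

```r
library(mixOmics)

# Load one of the example datasets listed in the table
data(nutrimouse)

names(nutrimouse)         # components bundled in the dataset
dim(nutrimouse$gene)      # samples x genes
dim(nutrimouse$lipid)     # samples x lipids

# Cross-tabulate the two grouping factors to see the study design
table(nutrimouse$diet, nutrimouse$genotype)
```

Checking the dimensions and grouping factors in this way is a quick means of confirming that the samples match across the data blocks before running an integrative method.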