Select your Method

mixOmics includes a range of statistical methods (also called models) which allow you to integrate datasets, identify important variables and make predictions on test data. This guide helps you select which mixOmics method to use for your data and research question.

💡 Tip: Refer to the glossary for unfamiliar terms.

1. Identify what kind of data you have

mixOmics is appropriate for any omics data (e.g. transcriptomics, metabolomics, proteomics, microbiome/metagenomics …) and also for non-omics data (e.g. spectral imaging, clinical data). Before choosing your method, answer the following questions to identify some key characteristics of your data:

Do you have one modality or more?

If you have one modality (e.g. transcriptomics), you are working with Single ‘omics. If you have multiple modalities (e.g. transcriptomics and metabolomics), you need N-integration methods. Some N-integration methods can only integrate two modalities, whilst others can integrate more (multiomic).

Do you have one study or more?

If you have more than one study (e.g. you collected transcriptomics data for 10 samples and then collected transcriptomics data again for another 20 samples), you need to perform P-integration.

Do you have an output variable?

If you have an output of interest (e.g. you want to see if transcriptomics data can predict disease outcome), you require a supervised method. If your analysis is exploratory without an outcome variable, you need an unsupervised model.

Is your output variable continuous or categorical?

If your output variable is categorical (e.g. classification of samples into groups A, B, C) you are facing a classification problem. If your output variable is continuous (e.g. protein level) you have a regression problem.

After answering these questions, you should now know whether you need a method which is:

Single ‘omics, N-integration (two modalities or more) or P-integration
Supervised (classification or regression) or unsupervised
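
The four questions above can be sketched as a small decision function. This is an illustration only, written in Python for brevity: mixOmics itself is an R package, and the function name `describe_analysis` and its arguments are hypothetical, not part of mixOmics. The sketch assumes that having more than one study takes precedence (P-integration) even when several modalities are present, which is how the MINT block models are labelled in the overview table.

```python
from typing import Optional, Tuple


def describe_analysis(
    n_modalities: int, n_studies: int, outcome: Optional[str]
) -> Tuple[str, str]:
    """Return (integration type, supervision type) from the guide's questions.

    outcome is None (no output variable), "categorical" or "continuous".
    Hypothetical helper: not a mixOmics function.
    """
    if n_modalities < 1 or n_studies < 1:
        raise ValueError("need at least one modality and one study")

    # Assumption: multiple studies imply P-integration, regardless of
    # how many modalities there are.
    if n_studies > 1:
        integration = "P-integration"
    elif n_modalities > 1:
        integration = "N-integration"
    else:
        integration = "Single 'omics"

    if outcome is None:
        supervision = "Unsupervised"
    elif outcome == "categorical":
        supervision = "Supervised: Classification"
    elif outcome == "continuous":
        supervision = "Supervised: Regression"
    else:
        raise ValueError("outcome must be None, 'categorical' or 'continuous'")

    return integration, supervision


if __name__ == "__main__":
    # Transcriptomics and metabolomics from one study, continuous outcome.
    print(describe_analysis(2, 1, "continuous"))
```

For example, one modality measured in two studies with a categorical outcome gives P-integration with supervised classification.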

2. Choose your method

This table provides an overview of the models you can build using mixOmics:

Model	# of omics	# of studies	Integration?	Supervised?
(s)PCA	1	1	No integration	Unsupervised
(s)IPCA	1	1	No integration	Unsupervised
(s)PLS-DA	1 + categorical outcome	1	No integration	Supervised: Classification
(s)PLS	2	1	N-integration	Unsupervised and Supervised: Regression
(r)CCA	2	1	N-integration	Unsupervised
Block (s)PLS	>2	1	N-integration	Supervised: Regression
Block (s)PLS-DA (DIABLO)	>2 + categorical outcome	1	N-integration	Supervised: Classification
(sparse) Generalised CCA	>2	1	N-integration	Unsupervised
MINT PCA	1	>1	P-integration	Unsupervised
MINT (s)PLS-DA	1 + categorical outcome	>1	P-integration	Supervised: Classification
MINT (s)PLS	2	>1	P-integration	Supervised: Regression
MINT block (s)PLS	>2	>1	P-integration	Supervised: Regression
MINT block (s)PLS-DA	>2 + categorical outcome	>1	P-integration	Supervised: Classification
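
As a rough aid, the table can be transcribed into a lookup that lists the candidate models for a given combination. The sketch below is in Python for illustration only (mixOmics is an R package); the names `TABLE` and `candidate_models` are hypothetical. It copies the rows literally, so an empty result only means the table above lists no row for that combination.

```python
from typing import List, NamedTuple, Tuple


class Row(NamedTuple):
    model: str
    n_omics: str       # "1", "2" or ">2", as in the "# of omics" column
    n_studies: str     # "1" or ">1", as in the "# of studies" column
    tasks: Tuple[str, ...]


# Literal transcription of the overview table (hypothetical structure).
TABLE = [
    Row("(s)PCA", "1", "1", ("unsupervised",)),
    Row("(s)IPCA", "1", "1", ("unsupervised",)),
    Row("(s)PLS-DA", "1", "1", ("classification",)),
    Row("(s)PLS", "2", "1", ("unsupervised", "regression")),
    Row("(r)CCA", "2", "1", ("unsupervised",)),
    Row("Block (s)PLS", ">2", "1", ("regression",)),
    Row("Block (s)PLS-DA (DIABLO)", ">2", "1", ("classification",)),
    Row("(sparse) Generalised CCA", ">2", "1", ("unsupervised",)),
    Row("MINT PCA", "1", ">1", ("unsupervised",)),
    Row("MINT (s)PLS-DA", "1", ">1", ("classification",)),
    Row("MINT (s)PLS", "2", ">1", ("regression",)),
    Row("MINT block (s)PLS", ">2", ">1", ("regression",)),
    Row("MINT block (s)PLS-DA", ">2", ">1", ("classification",)),
]

TASKS = ("unsupervised", "classification", "regression")


def candidate_models(n_omics: int, n_studies: int, task: str) -> List[str]:
    """List the table rows matching the given counts and task."""
    if n_omics < 1 or n_studies < 1:
        raise ValueError("need at least one omics dataset and one study")
    if task not in TASKS:
        raise ValueError(f"task must be one of {TASKS}")

    # Map the counts onto the labels used in the table columns.
    omics_key = "1" if n_omics == 1 else "2" if n_omics == 2 else ">2"
    studies_key = "1" if n_studies == 1 else ">1"

    return [
        row.model
        for row in TABLE
        if row.n_omics == omics_key
        and row.n_studies == studies_key
        and task in row.tasks
    ]


if __name__ == "__main__":
    print(candidate_models(3, 1, "classification"))
```

Here `n_omics` follows the "# of omics" column, so a categorical outcome is passed through `task="classification"` rather than counted as an extra dataset.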

⚙️ Note: Most models have a sparse variant available in mixOmics, e.g. sPCA is the sparse variant of PCA. Sparse models allow you to identify and use only the most important variables.

This decision tree can also be used to aid your model selection:

🦠 Microbiome data: If you are handling microbiome data, you can take advantage of the mixMC framework within the mixOmics package.

3. Learn more about your chosen method

Each mixOmics method has been documented on this website, with examples demonstrating the whole mixOmics workflow for each method in the Case Studies and the Vignette. Use the Resources Overview to help navigate more of our resources.