Methods
The mixOmics package provides several methodologies that can answer a variety of biological questions. Below are listed some typical analysis frameworks.
The term variable refers to the expression or abundance of a measured entity. The terms sample, instance and observation refer to the unit (individual, patient or cell) on which the experiment was performed.
In mixOmics, the data should be formatted with samples in rows and variables in columns.
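As a minimal sketch of this layout, the Python snippet below transposes a small table stored with variables in rows (a common layout for omics files) into one row per sample. The gene names, sample identifiers and the helper `to_samples_in_rows` are hypothetical and only illustrate the expected orientation; they are not part of mixOmics.

```python
# Hypothetical expression table stored with variables (genes) in rows
# and samples in columns, i.e. the opposite of the expected layout.
genes_by_samples = {
    "geneA": [5.1, 4.8, 6.0],
    "geneB": [0.2, 0.0, 0.4],
}
sample_ids = ["s1", "s2", "s3"]


def to_samples_in_rows(table, sample_ids):
    """Transpose a {variable: [value per sample]} table into one row per sample."""
    variables = list(table)
    rows = []
    for i in range(len(sample_ids)):
        # Row i collects the value of every variable for sample i.
        rows.append([table[v][i] for v in variables])
    return variables, rows


variables, X = to_samples_in_rows(genes_by_samples, sample_ids)
# X has len(sample_ids) rows (samples) and len(variables) columns (variables).
```

After the call, each row of `X` corresponds to one sample and each column to one variable, which is the orientation described above.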
The mixOmics Philosophy
The multivariate statistical methods implemented in mixOmics aim at summarising the main characteristics of the data while capturing its largest sources of variation. Multivariate methods are mostly considered to be 'exploratory' methods as they do not enable statistical inference. However, the biological question matters in order to apply the most suitable multivariate method. Go to Selecting the Method for assistance in choosing which methodology provided by mixOmics is suitable for your problem.
Methods in mixOmics
In mixOmics we propose a whole range of multivariate methods that were developed and validated on many biological studies. Selecting which of these methods to use depends on the type and quantity of data being analysed.
Exploring a single data set
If one is looking to get a better understanding of the structure of a single omics dataset (e.g. transcriptomics data), these methods are applicable:
- Principal Component Analysis (PCA)
- sparse Principal Component Analysis (sPCA)
- Independent Principal Component Analysis (IPCA)
- sparse Independent Principal Component Analysis (sIPCA)
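To make the idea of capturing the largest source of variation concrete, here is a minimal Python sketch of the first principal component, computed by power iteration on the covariance matrix. It is an illustration under simplifying assumptions (complete data, a fixed starting vector, a single component), not the algorithm mixOmics uses, and the function names and toy data are hypothetical.

```python
import math


def center_columns(X):
    """Subtract each column (variable) mean; X is a list of sample rows."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]


def first_principal_component(X, n_iter=200):
    """Return (unit loading vector, proportion of variance explained) for PC1."""
    Xc = center_columns(X)
    n, p = len(Xc), len(Xc[0])
    # Sample covariance matrix of the variables (p x p).
    cov = [[sum(r[a] * r[b] for r in Xc) / (n - 1) for b in range(p)]
           for a in range(p)]
    # Assumption: this starting vector is not orthogonal to the leading eigenvector.
    v = [float(j + 1) for j in range(p)]
    for _ in range(n_iter):
        w = [sum(cov[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            raise ValueError("power iteration collapsed; try another starting vector")
        v = [x / norm for x in w]
    # Rayleigh quotient: the variance captured along the loading vector v.
    eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(p)) for a in range(p))
    total_variance = sum(cov[a][a] for a in range(p))
    return v, eigenvalue / total_variance


# Toy data: 4 samples (rows) x 3 variables (columns); the first two variables co-vary.
X = [[2.0, 1.9, 0.5], [4.0, 4.1, 0.4], [6.0, 6.2, 0.6], [8.0, 7.8, 0.5]]
loading, explained = first_principal_component(X)
```

In this toy example the first two variables carry almost all of the variance, so they receive the largest loadings and the first component explains most of the total variance.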
Classification
If one is looking to classify different groups of novel samples according to a discrete outcome, these methods will generate appropriate models:
Integration of two datasets
If one is looking to observe how two datasets (e.g. transcriptomics and proteomics data) relate to one another, as well as how they can be used to predict one another, the following methods will be useful:
- Canonical Correlation Analysis (CCA)
- regularized Canonical Correlation Analysis (rCCA)
- Partial Least Squares (PLS)
- sparse Partial Least Squares (sPLS)
N-Integration
If one is looking to integrate more than two datasets measured across the same \(N\) samples, mixOmics contains both supervised and unsupervised methods.
- Multiblock PLS
- Multiblock sPLS
- Multiblock (s)PLS-DA (DIABLO)
Properties of mixOmics
Missing values
Most multivariate methods in mixOmics can be performed on data containing missing values, which are handled with the NIPALS algorithm. Refer to each method's page to check whether it cannot handle missing values, and to Missing Values for further information.
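The sketch below illustrates, in plain Python, how a NIPALS-style iteration can skip missing entries: each loading and each score is obtained by a regression restricted to the observed values. It extracts a single component and assumes every row and column has at least one observed value. The function name, the `NA` marker and the toy data are hypothetical, and this is not the mixOmics implementation.

```python
import math

NA = None  # marker used in this sketch for a missing value


def nipals_first_component(X, max_iter=500, tol=1e-9):
    """Return (scores, unit loadings) of the first component, skipping missing cells."""
    n, p = len(X), len(X[0])
    # Centre each column using only its observed values.
    means = []
    for j in range(p):
        observed = [row[j] for row in X if row[j] is not NA]
        means.append(sum(observed) / len(observed))
    Xc = [[NA if row[j] is NA else row[j] - means[j] for j in range(p)] for row in X]

    # Start the scores from the first column, treating missing entries as 0.
    t = [0.0 if row[0] is NA else row[0] for row in Xc]
    load = [0.0] * p
    for _ in range(max_iter):
        # Regress each column on the scores, over rows where the column is observed.
        load = []
        for j in range(p):
            num = sum(Xc[i][j] * t[i] for i in range(n) if Xc[i][j] is not NA)
            den = sum(t[i] ** 2 for i in range(n) if Xc[i][j] is not NA)
            load.append(num / den)
        norm = math.sqrt(sum(a * a for a in load))
        load = [a / norm for a in load]
        # Regress each row on the loadings, over columns observed in that row.
        t_new = []
        for i in range(n):
            num = sum(Xc[i][j] * load[j] for j in range(p) if Xc[i][j] is not NA)
            den = sum(load[j] ** 2 for j in range(p) if Xc[i][j] is not NA)
            t_new.append(num / den)
        change = sum((a - b) ** 2 for a, b in zip(t_new, t))
        t = t_new
        if change < tol:
            break
    return t, load


# Toy data: 4 samples x 3 variables, with one missing measurement.
X = [[2.0, 1.9, 0.5], [4.0, NA, 0.4], [6.0, 6.2, 0.6], [8.0, 7.8, 0.5]]
scores, loadings = nipals_first_component(X)
```

The sample with the missing measurement still receives a score, because its regression simply uses the two variables that were observed for it.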
Analysing repeated measurement or a cross-over design
For studies using a repeated measurement methodology, mixOmics includes multilevel functionality. Most functions within the package contain a multilevel parameter which handles multiple samples from the same individual. This can also be achieved through the withinVariation() function. Refer to Multilevel for further information.
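As a rough illustration of the idea behind a multilevel decomposition, the Python sketch below removes between-individual differences by subtracting each individual's mean profile from that individual's samples. It assumes a single within-subject factor and is not the mixOmics implementation of withinVariation(); the function name and the toy data are hypothetical.

```python
def within_subject_deviation(X, subject):
    """Subtract from every sample the mean profile of the individual it comes from."""
    rows_by_subject = {}
    for row, s in zip(X, subject):
        rows_by_subject.setdefault(s, []).append(row)
    # Mean of each variable (column) within each individual.
    subject_mean = {
        s: [sum(col) / len(rows) for col in zip(*rows)]
        for s, rows in rows_by_subject.items()
    }
    return [[x - m for x, m in zip(row, subject_mean[s])]
            for row, s in zip(X, subject)]


# Hypothetical cross-over data: two individuals, each measured before and after treatment.
X = [[10.0, 1.0], [12.0, 1.5], [20.0, 3.0], [23.0, 3.5]]
subject = ["id1", "id1", "id2", "id2"]
Xw = within_subject_deviation(X, subject)
```

In this toy example the two individuals have very different baseline levels, yet their within-subject deviations are comparable in size, which makes the effect of the repeated condition easier to see.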