mixOmics

mixOmics is a collaborative project developed by the mixOmics team and several key collaborators. The project started at the Institut de Mathématiques de Toulouse, Université Paul Sabatier, Toulouse, France, and was then extended at the University of Queensland, Brisbane, Australia.

Why multivariate methods?

It is now widely accepted that a single ‘omics analysis does not provide enough information to give a deep understanding of a biological system, but that a more holistic view can be obtained by combining multiple ‘omics analyses. Our mixOmics R package offers a whole range of multivariate methods that we have developed and validated on many biological studies to gain more insight into ‘omics data.

mixOmics offers a wide range of multivariate methods for the exploration and integration of biological data sets, with a particular focus on variable selection.

Multivariate methods are well suited to large ‘omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing property of reducing the dimension of the data by using instrumental variables (‘components’), which are defined as combinations of all variables. Those components are then used to produce graphical outputs that enable a better understanding of the relationships and correlation structure between the different data sets that are integrated. We have developed several sparse multivariate models to identify the key variables that are highly correlated and/or explain the biological outcome of interest. The identified variables are then more amenable to statistical inference and to the generation of novel biological hypotheses.
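To make the idea of a ‘component’ concrete, here is a minimal sketch in plain Python. It is an illustration only: it does not use the mixOmics R package, and the function names (`first_component`, `soft_threshold`) and the toy data are made up for this example. It builds one component as a weighted sum of all variables using power iteration, then shows how a soft-thresholding step (an assumed, generic way to obtain sparsity, not necessarily the algorithm used in mixOmics) sets the weights of uninformative variables to exactly zero.

```python
import math

def center(X):
    """Subtract each column's mean so every variable has mean zero."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def soft_threshold(v, lam):
    """Shrink each entry toward zero by lam; entries within [-lam, lam] become 0."""
    return [x - lam if x > lam else x + lam if x < -lam else 0.0 for x in v]

def normalize(v):
    """Scale v to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v

def first_component(X, lam=0.0, n_iter=200):
    """Return (loadings, scores) for a single component.

    loadings: one weight per variable
    scores:   one value per sample, the weighted sum of its centred variables
    lam > 0 soft-thresholds the loadings, giving a sparse component.
    """
    Xc = center(X)
    n, p = len(Xc), len(Xc[0])
    a = normalize([1.0] * p)  # arbitrary starting weights
    for _ in range(n_iter):
        # scores: project every sample onto the current weights
        t = [sum(row[j] * a[j] for j in range(p)) for row in Xc]
        # update the weights from the scores (one power-iteration step)
        a_new = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(p)]
        a = normalize(soft_threshold(normalize(a_new), lam))
    scores = [sum(row[j] * a[j] for j in range(p)) for row in Xc]
    return a, scores

# Toy data: 4 samples x 3 variables. The first two variables vary together;
# the third is small noise.
X = [
    [1.0, 2.0, 0.1],
    [2.0, 4.1, -0.1],
    [3.0, 5.9, 0.1],
    [4.0, 8.0, -0.1],
]

dense_loadings, dense_scores = first_component(X)             # uses all variables
sparse_loadings, sparse_scores = first_component(X, lam=0.1)  # drops the noise variable

print("dense loadings: ", [round(w, 3) for w in dense_loadings])
print("sparse loadings:", [round(w, 3) for w in sparse_loadings])
```

In the dense version every variable receives a non-zero weight, whereas in the sparse version the weight of the third (noise) variable is exactly zero, so only the first two variables are selected.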

Which type of data?

The data analysed with mixOmics may come from high-throughput sequencing technologies, such as ‘omics data (transcriptomics, metabolomics, proteomics, metagenomics …), but also from beyond the realm of ‘omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data.

About this website

This website gives a full tutorial introduction to the main mixOmics features and illustrates full multivariate analyses on some case studies. Click on the different tabs to see all options available.

Any questions or feedback? Contact us here.

mixOmics is under active development, as we focus on novel multivariate methods to address pressing needs for ‘omics data integration. Subscribe to our mailing list to keep up with our latest version, or have a look at the NEWS posts.

Workshops

We also run regular 2- and 3-day workshops in Australia and in Europe. Have a look at our upcoming workshops and don’t hesitate to ask for more.

The mixOmics framework today

Figure: Summary of the current methods implemented in mixOmics version 6.0.0 (June 2016)

The R package and key references

The mixOmics R package is organised into three main parts:

  1. Statistical methodologies to analyse high-throughput data
  2. Graphical outputs to display the results and improve interpretation
    • 2D and 3D sample plots, with confidence ellipses
    • Relevance Networks (see article)
    • Clustered Image Maps (heatmaps, see article)
    • Correlation circle plots (see article)
    • NEW Arrow plots
    • NEW Circos plots for DIABLO analyses (see reference here)
    • NEW Loading plots (first used here)
  3. Example data sets
    • breast.tumor (gene expression data, with missing data)
    • linnerud (very small data set)
    • liver.toxicity (gene expression and clinical data)
    • multidrug (ABC transporters and compounds)
    • nutrimouse (gene expression and fatty acids data)
    • srbct (gene expression data)
    • yeast (metabolites data)
    • vac18 and vac18.simulated for multilevel analyses
    • NEW diverse.16S and Koren.16S for mixMC 16S analyses (similar to that paper)
    • NEW breast.TCGA for DIABLO horizontal multiple integration analyses
    • NEW stemcells for MINT vertical multiple integration analyses