mixOmics

mixOmics is a collaborative project developed by the mixOmics team and several key collaborators. The project started at the Institut de Mathématiques de Toulouse, Université Paul Sabatier, Toulouse, France and was then further extended at the University of Queensland, Brisbane, Australia.

Why multivariate methods?

It is now generally accepted that a single ‘omics analysis does not provide enough information to give a deep understanding of a biological system. We can obtain a more precise picture of a system by combining multiple ‘omics analyses. In the mixOmics R package we propose a whole range of multivariate methods that we have developed and validated on many biological studies.

mixOmics offers a wide range of multivariate methods for the exploration and integration of biological data sets, with a particular focus on variable selection.

Multivariate methods are well suited to large ‘omics data sets where the number of variables (e.g. genes, proteins, metabolites) is much larger than the number of samples (patients, cells, mice). They have the appealing property of reducing the dimension of the data by using instrumental variables (‘components’), which are defined as combinations of all variables. Those components are then used to produce useful graphical outputs that enable a better understanding of the relationships and correlation structure between the different data sets that are integrated. We have further developed sparse multivariate models to identify the key variables that are highly correlated and/or explain the biological outcome of interest. The identified variables are then more amenable to statistical inference and to the generation of novel biological hypotheses.
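To make the idea of a ‘component’ concrete, here is a minimal sketch in plain Python. It is not the mixOmics implementation. It computes the first component of a small, made-up data matrix as a weighted combination of all variables, then repeats the computation with a soft-thresholding step that pushes small weights to exactly zero, which is one common way to obtain a sparse model. The function names, the toy data and the threshold `lam` are all assumptions introduced for illustration.

```python
import math

def center(X):
    """Subtract the column mean from every variable."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in X]

def soft_threshold(v, lam):
    """Shrink each weight towards zero by lam; weights below lam become 0."""
    return [0.0 if abs(x) <= lam else math.copysign(abs(x) - lam, x) for x in v]

def normalise(v):
    """Scale a vector to unit length (an all-zero vector is returned as is)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else v

def first_component(X, lam=0.0, n_iter=200):
    """Power iteration for the first component of X.

    Returns (weights, scores). The scores are the component: one value per
    sample, equal to the weighted combination of that sample's variables.
    With lam > 0, small weights are set to zero (illustrative sparse variant).
    """
    Xc = center(X)
    n, p = len(Xc), len(Xc[0])
    w = normalise([1.0] * p)
    for _ in range(n_iter):
        t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
        w = [sum(Xc[i][j] * t[i] for i in range(n)) for j in range(p)]
        w = normalise(soft_threshold(normalise(w), lam))
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, t

# Toy data (hypothetical): 5 samples, 3 variables.
# Variables 1 and 2 vary together; variable 3 is low-variance noise.
X = [
    [2.0, 1.9, 0.1],
    [4.0, 4.1, -0.1],
    [6.0, 6.2, 0.0],
    [8.0, 7.8, 0.1],
    [10.0, 10.0, -0.1],
]

w_dense, t_dense = first_component(X)             # every variable gets a weight
w_sparse, t_sparse = first_component(X, lam=0.1)  # noise variable is dropped
print("dense weights: ", w_dense)
print("sparse weights:", w_sparse)
```

In the dense case all three variables contribute to the component, although the noise variable only receives a tiny weight. In the sparse case its weight is exactly zero, so the component is built from the two informative variables only. This is the sense in which a sparse model ‘selects’ variables.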

Which type of data?

The data we analyse may come from high-throughput sequencing technologies, such as ‘omics data (transcriptomics, metabolomics, proteomics, metagenomics, …), but may also lie beyond the realm of ‘omics (e.g. spectral imaging). The methods implemented in mixOmics can also handle missing values without having to delete entire rows with missing data.
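One well-known way to compute components without discarding incomplete rows is a NIPALS-type iteration, in which every sum simply skips the missing entries. The sketch below, in plain Python, illustrates that general idea only. That it mirrors what any particular mixOmics function does internally is an assumption, and the function name, the toy data and the use of `None` for a missing value are all introduced here for illustration.

```python
import math

def nipals_first_component(X, n_iter=100):
    """First component of X, where missing values are encoded as None.

    Assumes every row and every column holds at least one observed value.
    Returns (loadings, scores) with one score per sample, including the
    samples that have missing entries.
    """
    n, p = len(X), len(X[0])
    # Centre each column on the mean of its observed values only.
    means = []
    for j in range(p):
        observed = [row[j] for row in X if row[j] is not None]
        means.append(sum(observed) / len(observed))
    Xc = [[None if row[j] is None else row[j] - means[j] for j in range(p)]
          for row in X]
    # Start from the first column, treating missing entries as 0.
    t = [0.0 if row[0] is None else row[0] for row in Xc]
    for _ in range(n_iter):
        # Loadings: regress each column on the scores, observed rows only.
        load = []
        for j in range(p):
            rows = [i for i in range(n) if Xc[i][j] is not None]
            num = sum(Xc[i][j] * t[i] for i in rows)
            den = sum(t[i] ** 2 for i in rows)
            load.append(num / den)
        norm = math.sqrt(sum(v * v for v in load))
        load = [v / norm for v in load]
        # Scores: regress each row on the loadings, observed columns only.
        t = []
        for i in range(n):
            cols = [j for j in range(p) if Xc[i][j] is not None]
            num = sum(Xc[i][j] * load[j] for j in cols)
            den = sum(load[j] ** 2 for j in cols)
            t.append(num / den)
    return load, t

# Toy data (hypothetical): the third sample lacks its second variable.
X_missing = [
    [2.0, 1.9, 0.1],
    [4.0, 4.1, -0.1],
    [6.0, None, 0.0],
    [8.0, 7.8, 0.1],
    [10.0, 10.0, -0.1],
]

loadings, scores = nipals_first_component(X_missing)
print("loadings:", loadings)
print("scores:  ", scores)  # five scores: no sample was removed
```

The incomplete sample still receives a score, computed from the variables that were actually measured for it, so no row has to be deleted before the analysis.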

About this website

This website gives a full tutorial introduction to the main mixOmics features and illustrates full multivariate analyses on some case studies. Click on the different tabs to see all the options available.

Any questions or feedback? Contact us here.

mixOmics is under active development as we implement more methods. Subscribe to our mailing list to stay up to date with our latest version.

Workshops

We also run regular 2- and 3-day workshops in Australia and in Europe. Have a look at our upcoming workshops and don’t hesitate to ask for more.

The mixOmics framework today

Summary of the current methods implemented in mixOmics version 6.0.0 (June 2016)

The mixOmics R package is organised into three main parts:

  1. Statistical methodologies to analyse high-throughput data
  2. Graphical outputs to display the results and improve interpretation
    • 2D and 3D sample plots, with confidence ellipses
    • Relevance Networks (see article)
    • Clustered Image Maps (heatmaps, see article)
    • Correlation circle plots (see article)
    • NEW Arrow plots
    • NEW Circos plots for DIABLO analyses
    • NEW Loading plots
  3. Example data sets
    • breast.tumor (gene expression data, with missing data)
    • linnerud: very small data set
    • liver.toxicity (gene expression and clinical data)
    • multidrug (ABC transporters and compounds)
    • nutrimouse (gene expression and fatty acids data)
    • srbct (gene expression data)
    • yeast (metabolites data)
    • vac18 and vac18.simulated for multilevel analyses
    • NEW diverse.16S and Koren.16S for mixMC 16S analyses
    • NEW breast.TCGA for DIABLO horizontal multiple integration analyses
    • NEW stemcells for MINT vertical multiple integration analyses