News – Page 3 – mixOmics

Feb 3-5 2020, Perth, AUS (beginner, omics and microbiome)

We had a great group of participants at UWA and were hosted in the beautiful Forrest Hall facility from the Forrest Research Foundation.

Some feedback from the participants:

‘I liked that simple concepts were explained, to facilitate understanding of the complex ones. It is always nice to recap on the basics’

‘[It] Provided me with novel insights for my research and how to approach and interpret the data from an integrated statistical/computational and biological point of view.’

‘I liked you took some time to explain the concept behind the statistical tool, useful examples to clarify the concepts. After this workshop, it is going to be much easier to make better decisions about future experimental designs (or recommendations for other experiments) and the application of specific statistical tool as MixOmics. I will definitely apply mixMC to the data I am preparing for a publication. I appreciate you granted me the bursary. This workshop has uplifted my microbiome-applied statistical skills.’

‘This was a great workshop and it exceed my expectations’

‘I could complete all the exercises based on the material provided (which was great, with very clear instructions!). Most importantly, I acquired a better understanding of how I should approach my data in the future when my new project is in a more advanced stage. In other words, I now have a better idea of the packages and tools available and what to focus on in terms of skills development. I am also feeling a lot more confident to interpret complex plots generated in these multivariate analyses and make further decisions based on them and my research question.’

Context. Advances in high-throughput technologies have transformed the way we examine molecular information, including microbial communities. However, analytical tool development is critically trailing behind data generation, which hinders the analysis, understanding or integration of microbiome data with other types of molecular data. Data integration adopts a holistic, data-driven and hypothesis-free approach. This new approach is necessary to understand the role of biological systems and posit new hypotheses.

The workshop will introduce concepts of multivariate dimension reduction methods developed in mixOmics for statistical analysis. Our methods make no distributional assumptions and are highly flexible for unsupervised (exploratory), supervised (classification) and integration analyses. Various analytical frameworks will be presented, ranging from data exploration and selection of markers to integration with other omics datasets and an introduction to time-course analysis.

Each methodology will be illustrated on real biological studies. The third day is ‘BYO data’ day where you can reinforce your learnings on your own study! The workshop is not limited to microbiome data only, as we will cover general omics data integration concepts with appropriate case studies if the need arises.

Instructor: Dr Kim-Anh Lê Cao; Tutor: TBA

Organized and hosted by: West Australian Health Translational Network (WAHTN) and WA Human Microbiome Collaborating Centre (WAHMCC), Curtin University.

Fees for 3 days are AUD450+GST for RHD students, AUD750+GST for research non-profit organisations (Universities and CSIRO) and AUD1200+GST for industry. The West Australian Health Translational Network generously sponsors registration bursaries ($225 to support 50% of the registration costs) for 4 RHD students. Apply at the EOI survey link below.

Registration fees include coffee breaks, lunch, lecture notes and electronic material (slides, R code, data).

Registration: Express your interest at this survey link. As places are limited to 30 participants, priority will be given to postgraduate students and early career researchers. EOI for bursaries closes on November 4 2019, 5pm AEST, but there are still spots for non-bursary applicants. Results will be announced to the participants with details for registration.

Location: Forrest Hall, 35 Stirling Highway, Crawley WA 6009, Australia. Google map.

Accommodation: short stay can be booked at Forrest Hall ($120/night)

Contact: mixomics[ at] math.univ-toulouse.fr (for pre-requisite or content)

Prerequisites and requirements: Trainees need a good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) to fully benefit from the workshop. Participants are requested to bring their own laptop with the software RStudio (http://www.rstudio.com/) and the R package mixOmics installed (instructions will be provided prior to the training).

Outline

Day 1 & 2: methods and hands-on. The following broad topics will be covered.

A. Key methodologies in mixOmics and their variants:

Basic processing of count data (scaling, how to handle compositional data)
Exploration of one data set and how to estimate missing values
Identification of a microbial signature to discriminate different treatment groups
Integration of two data sets and identification of microbial markers
Introduction to repeated measurements or longitudinal studies analysis
How to deal with batch effects
Integration of more than two data sets to identify multi omics signatures (if sufficient interest)
Integration of independent but related studies (optional)
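As a rough illustration of the first topic in the list above (handling compositional count data), the sketch below applies a centred log-ratio (CLR) transform in base R. The count table is invented, and the helper name clr_transform and the offset of 1 (added so that zero counts have a defined log) are assumptions made for this example, not workshop material. mixOmics provides its own log-ratio helper (logratio.transfo), which is deliberately not used here so that the arithmetic stays visible.

```r
# Toy count table: 3 samples (rows) x 4 taxa (columns); numbers are invented.
counts <- matrix(
  c(10,  0,  5, 85,
    20, 30,  0, 50,
     1,  4, 15, 80),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("sample", 1:3), paste0("taxon", 1:4))
)

# Centred log-ratio transform: log each value (after adding an offset so that
# zeros are defined), then subtract the mean log value of the same sample.
clr_transform <- function(x, offset = 1) {
  log_x <- log(x + offset)
  sweep(log_x, 1, rowMeans(log_x), FUN = "-")
}

clr_counts <- clr_transform(counts)

# After the transform, the values of each sample (row) sum to zero,
# up to rounding error.
print(round(rowSums(clr_counts), 10))
```

The transform works within each sample, so it removes the dependence on sequencing depth while keeping the relative ordering of taxa inside a sample.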

B. Review on the graphical outputs implemented in mixOmics

Sample plot representation
Variable plot representation for data integration
Other useful graphical outputs
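To give a feel for the sample and variable plots listed above, here is a minimal sketch that fits a sparse PLS model on the nutrimouse example data bundled with mixOmics and draws both plot types. The choice of data set, the number of components and the keepX / keepY values are arbitrary assumptions for illustration and are not taken from the workshop material.

```r
library(mixOmics)

# Example data shipped with mixOmics: 40 mice with gene expression and
# lipid concentrations measured on the same animals.
data(nutrimouse)
X <- nutrimouse$gene    # samples x genes
Y <- nutrimouse$lipid   # samples x lipids

# Sparse PLS integrating the two data sets; the numbers of variables kept
# per component (keepX, keepY) are arbitrary values for this sketch.
fit <- spls(X, Y, ncomp = 2, keepX = c(10, 10), keepY = c(5, 5))

# Sample plot: one point per mouse, coloured by diet.
plotIndiv(fit, group = nutrimouse$diet, legend = TRUE)

# Variable plot (correlation circle) showing the selected genes and lipids.
plotVar(fit)
```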

C. Case studies and applications

Several microbiome and omics studies will be analysed using the methods presented above.

Day 3: bring your own data. Participants will be given the opportunity to analyse their own data under the guidance and the advice of the three instructors. Participants can also work in a team. Some data sets will also be provided for those unable to bring their own data.

The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of markers, penalised regressions. Each methodology will be illustrated on a case study (theory and application will alternate).

Target group: The course is intended for microbiologists working in the fields of bioinformatics, computational biology and applied statistics with some statistical knowledge and a good working knowledge of R. It will be particularly useful to those interested in:

Exploring data sets.
Selecting molecular / microbial features with methods implementing LASSO-based penalisations.
Using graphical techniques to better visualise data.
Understanding and/or applying multivariate projection methodologies to large data sets.
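The feature selection mentioned in the list above can be sketched with sparse PLS-DA, one of the mixOmics methods that uses a LASSO-type penalty to keep only a subset of variables on each component. The example below uses the srbct example data bundled with the package; the keepX values are arbitrary assumptions for this sketch, and in practice they would usually be chosen by cross-validation.

```r
library(mixOmics)

# Example data shipped with mixOmics: gene expression for 63 tumour samples
# belonging to 4 classes.
data(srbct)
X <- srbct$gene    # samples x genes
Y <- srbct$class   # factor of tumour classes

# Sparse PLS-DA: keepX sets how many variables are kept on each component.
# The values 20 and 20 are arbitrary choices for this sketch.
fit <- splsda(X, Y, ncomp = 2, keepX = c(20, 20))

# Names of the variables selected on the first component.
selected <- selectVar(fit, comp = 1)$name
head(selected)
```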

Anticipated learning outcomes: After completion of this workshop, participants will be able to

Understand fundamental principles of multivariate projection-based dimension reduction techniques.
Perform statistical integration and feature selection using recently developed multivariate methodologies.
Apply those methods to high throughput microbiome studies, including their own studies.

Multi-omics data integration: method and showcase applications

The Lê Cao team and collaborators from University of British Columbia (Vancouver, Canada) have published their first method to integrate multiple omics data from the same set of biospecimens or individuals (e.g. transcriptomics, proteomics). Their method adopts a systems biology holistic approach by statistically integrating data from multiple biological compartments. Such an approach provides improved biological insights compared with traditional single omics analyses, as it accounts for interactions between omics layers and extracts multi-omics molecular networks.

DIABLO is a multivariate dimension reduction method and is hypothesis-free. The method constructs combinations of variables (e.g. cytokines, transcripts, proteins, metabolites) that are maximally correlated across data types to identify a minimal subset of markers – a multi-omics signature. This signature can highlight novel findings but is also the starting point to network modelling.
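A minimal sketch of what such an analysis can look like in R is given below, using the block.splsda function and the breast.TCGA example data bundled with mixOmics. The number of components, the keepX values and the design weight of 0.1 between data types are arbitrary assumptions made for illustration; they are not the settings used in the publications discussed here.

```r
library(mixOmics)

# Example data shipped with mixOmics: three omics data types measured on
# the same breast cancer samples, plus the tumour subtype.
data(breast.TCGA)
X <- list(
  mirna   = breast.TCGA$data.train$mirna,
  mrna    = breast.TCGA$data.train$mrna,
  protein = breast.TCGA$data.train$protein
)
Y <- breast.TCGA$data.train$subtype

# Design matrix: how strongly each pair of data types is linked
# (0.1 is an arbitrary weight for this sketch; the diagonal is 0).
design <- matrix(0.1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0

# Number of variables to keep per data type and per component (arbitrary).
keepX <- list(mirna = c(10, 10), mrna = c(10, 10), protein = c(10, 10))

fit <- block.splsda(X, Y, ncomp = 2, keepX = keepX, design = design)

# Variables from the mRNA data selected on the first component:
# one part of the multi-omics signature.
selected_mrna <- selectVar(fit, block = "mrna", comp = 1)$mrna$name
head(selected_mrna)
```

Raising the off-diagonal design weights puts more emphasis on correlation between data types, while lowering them favours discrimination of the outcome.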

More information about DIABLO, implemented in the mixOmics R package: Amrit Singh, Casey P Shannon, Benoît Gautier, Florian Rohart, Michaël Vacher, Scott J Tebbutt and Kim-Anh Lê Cao (2019) DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays, Bioinformatics. You can also find some technical information in the mixOmics paper (particularly in the Supp!) and also in our tutorials here.

While the computational researchers were busy developing their method, they also analysed the data from the #SmallBig study (small sample, big data) with the EPIC (Expanded Program on Immunization) Consortium. EPIC comprises researchers from the Boston Children’s Hospital, University of British Columbia, Medical Research Council Unit The Gambia, Université libre de Bruxelles, Telethon Kids Institute and University of Western Australia, and the Papua New Guinea Institute for Medical Research, who set out to answer the question: What can less than 1mL of blood tell us about a newborn’s health?

Sample processing of the #SmallBig study (adapted from Lee et al. 2019)

In this study, recently published in Nature Communications, the team developed a technique to collect extremely small blood samples (< 1mL) to comprehensively characterise how biological molecules evolve in newborns. Using cutting-edge computational and statistical methods including DIABLO, they show that, in contrast to adult biology, which is in a relatively steady state, the first week of human life is highly dynamic and undergoes dramatic changes. Their results were consistently observed in vastly different areas of the world, West Africa (The Gambia) and Australasia (Papua New Guinea), and suggest a purposeful rather than random developmental path.

More information about the SmallBig study: Amy H. Lee, Casey P. Shannon, […] Tobias R. Kollmann (2019). Dynamic molecular changes during the first week of human life follow a robust developmental trajectory. Nature Communications volume 10, Article number: 1092.

If you are interested in the potential of DIABLO to integrate microbiome and omics from the host, here is another study we published. We integrated the microbiome, proteome and meta-proteomics in T1D individuals.

Design of the multi-omics microbiome study

Identification of multi-omics signature from Gavin et al 2018.

More details about the study: Gavin PG, […], and Hamilton-Williams EE (2018). Intestinal metaproteomics reveals host-microbiota interactions in subjects at risk for type 1 diabetes. Diabetes Care 41: 10. We used DIABLO to integrate microbiome, proteomics and meta-proteomics.

Nordic Precision Medicine Forum, 18-19 March 2019

The mixOmics team will present at the Nordic Precision Medicine Forum in Stockholm, 18-19 March 2019. Sébastien Déjean will give a talk titled ‘Data integration: Examining Statistical Methods for the Exploration and Integration of Heterogeneous Biological Data Sets’.

Nordic Precision Medicine Forum brings together those at the very forefront of precision medicine from biologists, physicians and technology developers to data scientists, patient groups, governments and more.

April 15 – 17 2019, Melbourne AUS (beginner, microbiome)

Our participants for our first microbiome-dedicated workshop

Feedback from the workshop: This time we included several new case studies specifically focused on microbiome applications. We presented new material, including the problem of compositional data, how to detect and assess existing methods for batch effects (Ms Yiwen (Eva) Wang, PhD student) and our first timeOmics pipeline (Dr Olivier Chapleur).

‘I was especially pleased with the pace of the workshop. There was time to ask questions during lectures and practice. The pracs were designed to be relevant to our actual research questions.’

The event was sponsored by AFRAN and we obtained 50% bursaries from EMRI UoM for 5 PhD students.

‘Good contextualisation of methods before application of them, lots of depth on the background to methods which was important even when concepts were very complex.’

‘I think the case studies were really helpful. The R code is written in such a clear and digestible way that it was easy to apply to my own data’

‘The pace and depth was good. All topics covered were highly relevant, and techniques were directly applicable. The ‘mood’ of the workshop was very friendly.’

Complex microbial networks have a central role in the provision and regulation of ecosystems. Multiple microbial biotechnology applications are contributing to global efforts to achieve sustainability – through purification of wastewater, waste valorisation, bioenergy production, or to understand the role of microbiome in human disease and healthy states.

Statistical analysis of microbiome data is challenging due to the inherent characteristics of the data, such as high sparsity and compositional structure. Our workshop will introduce major concepts including multivariate dimension reduction methods developed in mixOmics. Our methods make no distributional assumptions and are highly flexible for unsupervised (exploratory), supervised (classification) and integration analyses.

This hands-on course will cover basic processing and inherent characteristics of microbiome data (compositionality, batch effects), various analytical frameworks ranging from data exploration, selection of microbial markers, integration with other omics datasets and introduction to time-course analysis. Each methodology introduced in the workshop will be illustrated on real biological studies. The third day is ‘BYO data’ day where you can reinforce your learnings on your own study!

Instructors: Dr Kim-Anh Lê Cao and Dr Olivier Chapleur; Tutor: Ms Laetitia Cardonna. The travel of Olivier and Laetitia is proudly sponsored by AFRAN, the Australian-French Association for Research and Innovation.

Organized by: Melbourne Integrative Genomics, University of Melbourne

Fees for 3 days are AUD500 for RHD students, AUD900 for research non-profit organisations and AUD1500 for industry / government. The Environmental Microbiology Research Initiative EMRI (University of Melbourne) proudly sponsors registration bursaries ($225 to support some of the registration costs) to 5 RHD students enrolled at UoM. Apply at the EOI survey link below.

Registration fees include coffee breaks, lunch, lecture notes and electronic material (slides, R code, data).

Location: Theatre 4 Alan Gilbert Building, University of Melbourne

Contacts mixomics[ at] math.univ-toulouse.fr (for pre-requisite or content)

Outline

Day 1 & 2: methods and hands-on. The following broad topics will be covered.

A. Key methodologies in mixOmics and their variants:

Basic processing of count data (scaling, how to handle compositional data)
Exploration of one data set and how to estimate missing values
Identification of a microbial signature to discriminate different treatment groups
Integration of two data sets and identification of microbial markers
Introduction to repeated measurements or longitudinal studies analysis
How to deal with batch effects
Integration of more than two data sets to identify multi omics signatures (if applicable)
Integration of independent but related studies (if applicable)

B. Review on the graphical outputs implemented in mixOmics

Sample plot representation
Variable plot representation for data integration
Other useful graphical outputs

C. Case studies and applications

Several microbiome studies will be analysed using the methods presented above.

The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of microbial markers, penalised regressions. Each methodology will be illustrated on a case study (theory and application will alternate).

Exploring microbiome data sets.
Selecting microbial features with methods implementing LASSO-based penalisations.
Using graphical techniques to better visualise data.
Understanding and/or applying multivariate projection methodologies to large data sets.

Anticipated learning outcomes: After completion of this workshop, participants will be able to

Understand fundamental principles of multivariate projection-based dimension reduction techniques.
Perform statistical integration and feature selection using recently developed multivariate methodologies.
Apply those methods to high throughput microbiome studies, including their own studies.

June 4-6 2019, Toulouse, FR (beginner, 3 days)

We will be running a three-day workshop in June 2019 in Toulouse at the introductory level.

Instructors: Dr Sébastien Déjean

Organized by: GenoToul Biostatistics

Dates: 4-6 June 2019

Registration and registration fees: register before 17 May 2019 using this form. Fees are 500€ (academic) and 1000€ (private); see the link provided below for more details.

Language: French or English depending on the attendees

Contact mixomics[ at] math.univ-toulouse.fr

More details can be found here (in French): https://perso.math.univ-toulouse.fr/biostat/2018/12/07/formation-mixomics/

We are moving …. to bioC!

Dear all,

After 9 years hosted on CRAN, we are migrating to Bioconductor! It’s been a great first journey and we are grateful to CRAN for hosting our package. We are now ready for the next adventure.

Why are we moving?

It is our aspiration to empower computational and molecular biologists, which aligns with the bioC vision.
We will be able to link with new experimentClass S4 objects and the existing bioC data packages that use them, ranging from multi-omics and microbiome to single cell.
We will be able to provide better vignettes and examples that will complement our website.

What has changed? What should I do? Should I panic?

So far we have allowed as few disruptions as possible, so the function calls and objects are the same. Gradually we will be adding more capabilities, which will greatly improve usability (see above for the S4 class).

We are almost on bioC, but full acceptance is pending the removal of mixOmics from CRAN. We fixed a few bugs; if you would like to install this new version:

The development version is now accessible on gitHub (feel free to fork / help* / comment on gitHub):

R> install.packages("devtools")
R> devtools::install_github("mixOmicsTeam/mixOmics")

Or alternatively, once we are on Bioconductor:

R> if (!requireNamespace("BiocManager", quietly = TRUE))  install.packages("BiocManager")
R> BiocManager::install("mixOmics", version = "3.8")

Then, business as usual!

* We would like to formally acknowledge the help of Lluís Revilla (Centre Esther Koplowitz, Barcelona) for helping us with setting up some testthat checks for our bioC version.

As we enter this new journey, we also thank you for this.
And also for this!

PS: a one-day microbiome workshop is scheduled in chilly Vancouver on November 6.

Nov 6 2018, Vancouver (microbiome)

Note: this workshop is primarily restricted to Microbiome Research Network students at UBC until Oct 22 when the registration will be open outside MRN if space is available.

About the Workshop

The objective of this workshop is to introduce fundamental concepts of multivariate dimension reduction methodologies for biological data analysis. Each methodology presented during the course will be applied to case studies available in the R package mixOmics.

Methods for multivariate data analysis, data visualisation and microbial signature identification will be covered, as well as an introduction for multi-omics data integration.

You will learn how to:

Understand fundamental principles of multivariate projection-based dimension reduction techniques.
Perform statistical integration and feature selection using recently developed multivariate methodologies.
Apply those methods to high throughput biological studies, including your own studies.

This is a special workshop offered as part of the Microbiome Research Network’s Exploring the Microcosmos Symposium. Tickets are available to MRN members only until Oct 22. After that date, additional tickets will be made publicly available if space allows.

Pre-requisites

“Introduction to R”, EDUCE modules in MICB 301/405/425, or equivalent knowledge of R.

MRN students

Microbiome Research Network (MRN) students should contact info.ecoscope@ubc.ca with 1) their name, 2) their PI’s name, and 3) the name of workshop to receive a registration code.

Instructor

Dr Kim-Anh Lê Cao (University of Melbourne, Australia) was awarded her PhD in 2008 at Université de Toulouse, France. She then moved to Australia as a postdoctoral fellow at the University of Queensland, Brisbane. Since the beginning of her PhD Kim-Anh has initiated a wide range of valuable collaborative and research opportunities in both statistics and molecular biology. Her main research focus is on variable selection for biological data (‘omics’ data) coming from different functional levels by means of multivariate dimension reduction approaches. Since 2009, her team has been developing statistical software dedicated to the integrative analysis of ‘omics’ data, to help researchers make sense of biological big data. Kim-Anh is a senior lecturer at the University of Melbourne (Melbourne Integrative Genomics, School of Mathematics and Statistics), and regularly runs statistical training workshops and short seminar series as well as mixOmics multi-day workshops.

Software requirements for 2020 mixOmics workshops

We list below some installation requirements to ensure the mixOmics workshop will run smoothly for everyone. Please update / install prior to the workshop to avoid overloading the Wi-Fi.

Software installation and updates.

0 – Mac OS users only: install X Quartz first https://www.xquartz.org/

1 – Install the mixOmics package from Bioconductor. You may need to install the latest R version and the latest BiocManager package by following these instructions (if you use R version <= 3.5.0, refer to the instructions at the end of the link). Install mixOmics using the following code:

## install BiocManager if not installed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
## install mixOmics
BiocManager::install('mixOmics')

The mixOmics package should directly import the following packages: igraph, rgl, ellipse, corpcor, RColorBrewer, plyr, parallel, dplyr, tidyr, reshape2, methods, matrixStats, rARPACK, gridExtra.

1 alternative – To obtain the latest update of mixOmics (Bioconductor only updates our package every 6 months), you will need to pull from our gitHub page using the install_github function from the devtools package. Install ‘devtools’ in R, then load it and install mixOmics from gitHub:

install.packages("devtools")
# then load
library(devtools)
install_github("mixOmicsTeam/mixOmics")

2 – Check after install that the following does not throw any error (see step 0) and that the welcome message confirms you have installed version > 6.10. If this is not the case, try step 1 alternative (installation from gitHub):

>library(mixOmics) 
Loaded mixOmics 6.10.x
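The version check in step 2 can also be done programmatically. The sketch below is one possible way to do it; the helper name meets_minimum is hypothetical, and the minimum of "6.10.0" is an assumption based on the welcome message shown above.

```r
# Hypothetical helper: TRUE when 'installed' is at least 'minimum'.
# package_version() compares versions component by component, so "6.10.1"
# ranks above "6.9.5" (a plain string comparison would get this wrong).
meets_minimum <- function(installed, minimum = "6.10.0") {
  package_version(installed) >= package_version(minimum)
}

# Two made-up version strings to show the behaviour.
meets_minimum("6.10.1")
meets_minimum("6.9.5")

# Check the installed copy of mixOmics, if there is one.
if (requireNamespace("mixOmics", quietly = TRUE)) {
  meets_minimum(as.character(packageVersion("mixOmics")))
}
```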

We also advise using the software RStudio

If R complains, you may have to install the latest R version from here: https://cran.r-project.org/

Wifi will be available on site, but it is preferable that you make those installations before the workshop to avoid delays for the analyses.

Any question regarding the requirements and software installation: email us at mixomics[at]math.univ-toulouse.fr

July 8 2018, Barcelona, SP (Introductory)

The registrations are now closed as we have reached above and beyond capacity! (53 registrations! Many thanks for your interest!)

We will be running a one-day workshop as part of the XXIX International Biometric Conference (IBC 2018).

Instructors: Dr Kim-Anh Lê Cao and Dr Sébastien Déjean

Organized and sponsored by IBC 2018

Dates 8 July 9am – 5pm

Practical information: Register at the IBC website; early bird registration closes on 16th April!

Contact mixomics[ at] math.univ-toulouse.fr

More details can be found here:

http://2018.biometricconference.org/course-2/

News 2018, workshops 2018 and DIABLO

Dear all,

The first few months of the year have been busy for us. Thanks to your support, we were ranked second in the Bioinformatics Peer Prize (57 votes, so close behind the winner with 59 votes!). Our entry is listed at this link if you would like to watch a basic introduction to the package.

For those who are new to mixOmics, I also cooked some prezi slides to introduce the broad context of where mixOmics sits, which was presented at the University of Melbourne ResBaz event in February.

We have now scheduled our 2018 workshops:

An advanced workshop focusing on omics data integration 7-8 June in the Parisian region. The registration will be in two stages: Expression of Interest due on April 29, followed by registration. The workshop will accommodate 30 participants. More details here.
A 3-day beginner workshop 23-25 July at the University of Melbourne. More details will be populated very soon.

We have pushed the second version of our DIABLO manuscript to bioRxiv. The code is currently on gitHub but it will also be rendered on our website soon.

For some little news, you can also follow us on Twitter @mixOmics_team.

Kim-Anh for the mixOmics team