Feedback from a previous iteration can be found here.
Key summary
The new course is open and will run for 7 weeks. This course is online, but at your own pace, meaning that you need to dedicate enough time (5-8h per week) to fully benefit from the program.
There are 4 weeks of asynchronous learning (you work at your own pace to cover the material each week).
There are 4 live webinars organised on the first 4 Thursdays at 5pm AEST (convert your time here) to summarise some key concepts and to let you ask your questions (the webinars will be recorded, as daylight saving changes occur during this period).
You will have the opportunity to chat on Slack and ask your questions during the whole course.
You can analyse your own data for the assessment (due in week 6) or use the data provided. You will reinforce your learning by marking the assignments of 2-3 other learners.
Teaching Period Dates, asynchronised:
Teaching commences: Monday, 24 Feb 2025, 9:00 am AEST
Prerequisites. A good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) is essential to fully benefit from the course*. The course is divided into theory (50%) and hands-on practice, with the opportunity to analyse your own data. The exercises and assignments are in R. Participants are encouraged to use RStudio and Rmarkdown (template and R code provided).
For those looking for an R refresher well ahead of the course:
Unfortunately we had to cancel the workshop as we did not receive a sufficient number of participants to justify running the workshop at this time. These workshops involve peer review and a cohort feel to provide the best experience to our learners.
Register your EOI here and we will let you know when the registration page is up. Our next intake is scheduled for February 2025.
Feedback from a previous iteration can be found here.
Key summary
The new course is open and will run for 7 weeks. This course is online, but at your own pace, meaning that you need to dedicate enough time (5-8h per week) to fully benefit from the program.
There are 4 weeks of asynchronous learning (you work at your own pace to cover the material each week).
There are 4 live webinars organised on the first 4 Thursdays at 5pm AEST (convert your time here) to summarise some key concepts and to let you ask your questions (the webinars will be recorded, as daylight saving changes occur during this period).
You will have the opportunity to chat on Slack and ask your questions during the whole course.
You can analyse your own data for the assessment (due in week 6) or use the data provided. You will reinforce your learning by marking the assignments of 2-3 other learners.
Teaching Period Dates, asynchronised:
Teaching commences: Monday, 21 Oct 2024, 9:00 am AEST
Prerequisites. A good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) is essential to fully benefit from the course*. The course is divided into theory (50%) and hands-on practice, with the opportunity to analyse your own data. The exercises and assignments are in R. Participants are encouraged to use RStudio and Rmarkdown (template and R code provided).
For those looking for an R refresher well ahead of the course:
We will be running a 2-day workshop at Frazer Institute, University of Queensland. The workshop will cover 1.5 days of lectures and hands-on practice, and an additional 0.5 day for discussions and opportunities to analyse your own data (assuming the data are already processed and normalised).
Fill in the survey to register your interest and needs for this workshop. We can only allow a limited number of participants, so lock in those dates in your calendar before we confirm your participation! Priority will be given to postgraduate students and early career researchers. Results will be announced to the participants with details for registration on 17th February.
Context. Advances in high-throughput technologies have transformed the way we examine molecular information, including microbial communities. However, analytical tool development is critically trailing behind data generation, which hinders the analysis, understanding or integration of omics data. Data integration adopts a holistic, data-driven and hypothesis-free approach. This new approach is necessary to understand the role of biological systems and posit new hypotheses.
The workshop will introduce concepts of multivariate dimension reduction methods developed in mixOmics for statistical analysis. Our methods make no distributional assumptions and are highly flexible for unsupervised (exploratory), supervised (classification) and integration analyses. Various analytical frameworks will be presented, ranging from data exploration, selection of markers and integration with other omics datasets to an introduction to time-course analysis. There will also be an opportunity to analyse your own data.
Each method will be illustrated on real biological studies. The last afternoon is ‘BYO data’ where you can reinforce your learnings on your own study!
Instructor: A/Prof Kim-Anh Lê Cao; Tutor: Nick Matigian (QCIF)
Organized and hosted by: Frazer Institute, University of Queensland
There are no registration fees for this workshop. We do expect your attendance as the number of places is limited. The workshop is fully catered. Slides, R code and data will be provided.
Registration. Fill in the survey and lock the dates in your calendar! As we have a limited number of participants (30), priority will be given to postgraduate students and early career researchers. Results will be announced to the participants with details for registration after the survey’s deadline. Online attendance is also available for a limited number of participants (but with reduced opportunities for interactions).
Prerequisite and requirements. We require trainees to have a good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) to fully benefit from the workshop. Participants are requested to bring their own laptop, with the software RStudio (http://www.rstudio.com/) and the R package mixOmics installed (instructions will be provided prior to the training).
Outline
The following broad topics will be covered during these two days:
A. Key methodologies in mixOmics and their variants:
Exploration of one data set with Principal Component Analysis (the basics!)
Identification of a molecular signature to discriminate different treatment groups with PLS-Discriminant Analysis
Integration of two data sets and identification of markers with PLS
Integration of more than two data sets to identify multi omics signatures (if sufficient interest) with PLS-DIABLO
B. Graphical outputs implemented in mixOmics
Sample plot representation
Variable plot representation for data integration
Other useful graphical outputs
C. Case studies and applications
Several omics studies (and microbiome if there is some interest) will be analysed using the methods presented above.
Day 2: bring your own data. Participants will be given the opportunity to analyse their own data under the guidance and the advice of the instructors. Participants can also work in a team. Your data need to be processed and normalised beforehand.
The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of markers, penalised regressions. Each methodology will be illustrated on a case study (theory and application will alternate).
Target group. The course is intended for molecular biologists working in the fields of bioinformatics, computational biology and applied statistics with some statistical knowledge and a good working knowledge of R. It will be particularly useful to those interested in:
Exploring data sets.
Selecting molecular signatures with methods implementing LASSO-based penalisations.
Using graphical techniques to better visualise data.
Understanding and/or applying multivariate projection methodologies to large data sets.
Anticipated learning outcomes. After completion of this workshop, participants will be able to:
Understand fundamental principles of multivariate projection-based dimension reduction techniques.
Perform statistical integration and feature selection using recently developed multivariate methodologies.
Apply those methods to high throughput microbiome studies, including their own studies.
The next iteration of the course will be in September 2023 for a likely duration of 6-8 weeks (it will be advertised 3 months before opening the course). This course is online, but at your own pace, meaning that you need to dedicate enough time (5-8h per week) to fully benefit from the program.
Feedback from the 2022 iteration:
You can do it at your own time since the resources provided (Webinars and reading material) are very helpful. Due to working hours I had to watch/read on demand (at my own time)
Kim-Anh has done a very good job in the webinars and was generally approachable and helpful. Thank you! The online course material was very good and explained the basics of the program quite well. The integration with the mixOmics online material and sample cases is very helpful.
It had the option to attend live webinars (two offered times) or watch recordings. – The possibility to ask questions was available for both live webinars and Slack. – The assignments are designed to enhance further learning, allowing the use of either own data or provided data at different challenge levels.
Course organisers were very responsive to our questions in Slack. Modules flowed nicely and were well organised. Webinars were useful.
This is the second round of our online course ‘mixOmics R Essentials for Biological Data Integration’, which includes 4 weeks of asynchronous learning (with one live summary + Q&A per week), numerous chats on Slack and an additional 3 weeks to complete the assignment. Some feedback from our last round can be found here. Our last survey suggests most learners spent between 5-8h per week on the program.
Teaching Period Dates, asynchronised:
Start – Monday, 31st October 2022
End – Sunday, 27th November 2022
(non marked) Assessment due Sunday, 9th December 2022
Peer-review of assessment due Sunday, 16th December 2022
Prerequisites. A good working knowledge in R programming (e.g. handling data frame, perform simple calculations and display simple graphical outputs) is essential to fully benefit from the course*. The course is divided into theory (50%) and hands-on practice, with the opportunity to analyse your own data. The exercises and assignments are in R. Participants are encouraged to use RStudio and Rmarkdown (template and R code provided).
For those looking for an R refresher well ahead of the course:
Feedback from the workshop: This time we included several new case studies specifically focused on microbiome applications. We presented new material, including the problem of compositional data, how to detect and assess existing methods for batch effects (Ms Yiwen (Eva) Wang, PhD student) and our first timeOmics pipeline (Dr Olivier Chapleur).
‘I was especially pleased with the pace of the workshop. There was time to ask questions during lectures and practice. The pracs were designed to be relevant to our actual research questions.’
The event was sponsored by AFRAN and we obtained 50% bursaries from EMRI UoM for 5 PhD students.
‘Good contextualisation of methods before application of them, lots of depth on the background to methods which was important even when concepts were very complex.’
‘I think the case studies were really helpful. The R code is written in such a clear and digestible way that it was easy to apply to my own data.’
‘The pace and depth was good. All topics covered were highly relevant, and techniques were directly applicable. The ‘mood’ of the workshop was very friendly.’
Complex microbial networks have a central role in the provision and regulation of ecosystems. Multiple microbial biotechnology applications are contributing to global efforts to achieve sustainability – through purification of wastewater, waste valorisation, bioenergy production, or to understand the role of microbiome in human disease and healthy states.
Statistical analysis of microbiome data is challenging due to the inherent characteristics of the data, such as high sparsity and compositional structure. Our workshop will introduce major concepts including multivariate dimension reduction methods developed in mixOmics. Our methods make no distributional assumptions and are highly flexible for unsupervised (exploratory), supervised (classification) and integration analyses.
This hands-on course will cover basic processing and inherent characteristics of microbiome data (compositionality, batch effects), various analytical frameworks ranging from data exploration, selection of microbial markers, integration with other omics datasets and introduction to time-course analysis. Each methodology introduced in the workshop will be illustrated on real biological studies. The third day is ‘BYO data’ day where you can reinforce your learnings on your own study!
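To make the compositionality issue mentioned above concrete, here is a minimal base R sketch of a centred log-ratio (CLR) transformation, one common way to handle compositional count data. It is an illustration only and not taken from the workshop material: the function name `clr_transform`, the pseudocount of 1 and the toy counts are all assumptions.

```r
## Minimal sketch (assumption: samples in rows, taxa in columns).
## `clr_transform` is a hypothetical helper written for this illustration.
clr_transform <- function(counts, pseudocount = 1) {
  stopifnot(is.matrix(counts), is.numeric(counts), all(counts >= 0))
  ## the pseudocount avoids log(0) for the many zeros found in sparse count data
  log_counts <- log(counts + pseudocount)
  ## subtract each sample's mean log value (the log of its geometric mean)
  sweep(log_counts, 1, rowMeans(log_counts), "-")
}

## Toy count table, made up for the example
toy_counts <- matrix(c(10, 0, 5, 85,
                        3, 7, 0, 40),
                     nrow = 2, byrow = TRUE,
                     dimnames = list(c("sample1", "sample2"),
                                     paste0("taxon", 1:4)))

clr_counts <- clr_transform(toy_counts)
## each row of the transformed table sums to (numerically) zero
print(round(rowSums(clr_counts), 10))
```

After the transformation each sample is expressed relative to its own geometric mean, so samples sequenced at different depths become more comparable. The choice of pseudocount is a judgment call and can influence results on very sparse data.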
Instructors: Dr Kim-Anh Lê Cao and Dr Olivier Chapleur; Tutor: Ms Laetitia Cardonna. The travel of Olivier and Laetitia is proudly sponsored by AFRAN, the Australian-French Association for Research and Innovation.
Fees for 3 days are AUD500 for RHD students, AUD900 for research non-profit organisations and AUD1500 for industry / government. The Environmental Microbiology Research Initiative EMRI (University of Melbourne) proudly sponsors registration bursaries ($225 to support some of the registration costs) to 5 RHD students enrolled at UoM. Apply at the EOI survey link below.
Registration fees include coffee breaks, lunch, lecture notes and electronic material (slides, R code, data).
Registration. Express your interest at this survey link. As we have a limited number of participants (30), priority will be given to postgraduate students and early career researchers. EOI closes on March 11.
Prerequisite and requirements. We require trainees to have a good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) to fully benefit from the workshop. Participants are requested to bring their own laptop, with the software RStudio (http://www.rstudio.com/) and the R package mixOmics installed (instructions will be provided prior to the training).
Outline
Day 1 & 2: methods and hands-on. The following broad topics will be covered.
A. Key methodologies in mixOmics and their variants:
Basic processing of count data (scaling, how to handle compositional data)
Exploration of one data set and how to estimate missing values
Identification of a microbial signature to discriminate different treatment groups
Integration of two data sets and identification of microbial markers
Introduction to repeated measurements or longitudinal studies analysis
How to deal with batch effects
Integration of more than two data sets to identify multi omics signatures (if applicable)
Integration of independent but related studies (if applicable)
B. Review on the graphical outputs implemented in mixOmics
Sample plot representation
Variable plot representation for data integration
Other useful graphical outputs
C. Case studies and applications
Several microbiome studies will be analysed using the methods presented above.
Day 3: bring your own data. Participants will be given the opportunity to analyse their own data under the guidance and the advice of the three instructors. Participants can also work in a team. Some data sets will also be provided for those unable to bring their own data.
The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of microbial markers, penalised regressions. Each methodology will be illustrated on a case study (theory and application will alternate).
Target group. The course is intended for microbiologists working in the fields of bioinformatics, computational biology and applied statistics with some statistical knowledge and a good working knowledge of R. It will be particularly useful to those interested in:
Exploring microbiome data sets.
Selecting microbial features with methods implementing LASSO-based penalisations.
Using graphical techniques to better visualise data.
Understanding and/or applying multivariate projection methodologies to large data sets.
Anticipated learning outcomes. After completion of this workshop, participants will be able to:
Understand fundamental principles of multivariate projection-based dimension reduction techniques.
Perform statistical integration and feature selection using recently developed multivariate methodologies.
Apply those methods to high throughput microbiome studies, including their own studies.
Registration and registration fees: before 17 May 2019, using this form. Fees are: academic (500€), private (1000€); see more details in the link provided below.
Language: French or English depending on the attendees
Prerequisite and requirements. We require trainees to have a good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) to fully benefit from the workshop. Participants are requested to bring their own laptop, with the software RStudio (http://www.rstudio.com/) and the R package mixOmics installed (instructions will be provided prior to the training).
Note: this workshop is primarily restricted to Microbiome Research Network students at UBC until Oct 22 when the registration will be open outside MRN if space is available.
About the Workshop
The objective of this workshop is to introduce fundamental concepts of multivariate dimension reduction methodologies for biological data analysis. Each methodology presented during the course will be applied to case studies available in the R package mixOmics.
Methods for multivariate data analysis, data visualisation and microbial signature identification will be covered, as well as an introduction for multi-omics data integration.
You will learn how to:
Understand fundamental principles of multivariate projection-based dimension reduction techniques.
Perform statistical integration and feature selection using recently developed multivariate methodologies.
Apply those methods to high throughput biological studies, including your own studies.
This is a special workshop offered as part of the Microbiome Research Network’s Exploring the Microcosmos Symposium. Tickets are available to MRN members only until Oct 22. After that date, additional tickets will be made publicly available if space is available.
Pre-requisites
“Introduction to R”, EDUCE modules in MICB 301/405/425, or equivalent knowledge of R.
MRN students
Microbiome Research Network (MRN) students should contact info.ecoscope@ubc.ca with 1) their name, 2) their PI’s name, and 3) the name of workshop to receive a registration code.
Instructor
Dr Kim-Anh Lê Cao (University of Melbourne, Australia) was awarded her PhD in 2008 at Université de Toulouse, France. She then moved to Australia as a postdoctoral fellow at the University of Queensland, Brisbane. Since the beginning of her PhD Kim-Anh has initiated a wide range of valuable collaborative and research opportunities in both statistics and molecular biology. Her main research focus is on variable selection for biological data (‘omics’ data) coming from different functional levels by means of multivariate dimension reduction approaches. Since 2009, her team has been developing statistical software dedicated to the integrative analysis of ‘omics’ data, to help researchers make sense of biological big data. Kim-Anh is a senior lecturer at the University of Melbourne (Melbourne Integrative Genomics, School of Mathematics and Statistics), and regularly runs statistical training workshops and short seminar series as well as mixOmics multi-day workshops.
We list below some installation requirements to ensure the mixOmics workshop will run smoothly for everyone. Please update / install prior to the workshop to avoid overloading the Wi-Fi.
1 – Install the mixOmics package from Bioconductor. You may need to install the latest R version and the latest BiocManager package following these instructions (if you use R versions <= 3.5.0, refer to the instructions at the end of the link). Install mixOmics using the following code:
## install BiocManager if not installed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
## install mixOmics
BiocManager::install('mixOmics')
The mixOmics package should directly import the following packages: igraph, rgl, ellipse, corpcor, RColorBrewer, plyr, parallel, dplyr, tidyr, reshape2, methods, matrixStats, rARPACK, gridExtra.
1 alternative – To obtain the latest update of mixOmics (Bioconductor only updates our package every 6 months), you will need to pull from our GitHub page using the install_github function from the devtools package. Install the ‘devtools’ package in R, then load it and install mixOmics from GitHub:
install.packages("devtools")
# then load
library(devtools)
install_github("mixOmicsTeam/mixOmics")
2 – Check after install that the following does not throw any error (see step 0) and that the welcome message confirms you have installed version > 6.10. If this is not the case, try step 1 alternative (installation from gitHub):
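As a minimal sketch of such a check, the base R snippet below tests whether an installed package meets a minimum version before loading it. The helper `meets_min_version` is hypothetical (it is not part of mixOmics), and the threshold "6.10.0" is only our reading of the version requirement stated above.

```r
## Hypothetical helper: TRUE if `pkg` is installed and at least `min_version`
meets_min_version <- function(pkg, min_version) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    return(FALSE)  # package not installed at all
  }
  utils::packageVersion(pkg) >= package_version(min_version)
}

if (meets_min_version("mixOmics", "6.10.0")) {
  library(mixOmics)  # should load without error
} else {
  message("mixOmics is missing or older than 6.10: ",
          "try the installation from GitHub (step 1 alternative)")
}
```

The comparison relies on `packageVersion()`, which returns a version object, so "6.9.0" is correctly treated as older than "6.10.0" (a plain string comparison would get this wrong).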
Our new update 6.2.0 is now available on CRAN as part of our new version of our manuscript.
manuscript & package update:
The mixOmics manuscript introducing the supervised and integrative frameworks (PLS-DA, DIABLO block.plsda and MINT) has been updated, along with all the R / Sweave case studies. The manuscript and code are available at this link. The case studies are also published on our website (sPLSDA:SRBCT, Case study: TCGA and Case study: MINT).
The manuscript describes in more detail the different prediction distances (see also the supplemental material) and the interpretation of the AUROC for our supervised methods.
The constraint argument was removed from all our methods, due to a risk of overfitting.
New features:
– The constraint argument (version 6.1.0 – 6.1.3) was removed in the functions perf and tune for all supervised objects because of a risk of overfitting
Enhancements:
– AUROC added for MINT objects mint.plsda and mint.splsda where the study name needs to be specified, e.g. auroc(..., roc.study = "study4"). See ?auroc
– choice.ncomp output added on all perf and tune functions for all supervised methods.
– mat.c output for pls and plsda objects (matrix of coefficients from the regression of X / residual matrices X on the X-variates).
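To illustrate the AUROC enhancement for MINT objects listed above, here is a hedged sketch using the stemcells example data shipped with mixOmics. The number of components, the keepX values and the choice of study are arbitrary illustration choices (not tuned), and the code only runs if mixOmics is installed.

```r
## Sketch only: skipped entirely when mixOmics is not installed
mint_ready <- requireNamespace("mixOmics", quietly = TRUE)

if (mint_ready) {
  library(mixOmics)
  data(stemcells)  # multi-study example data from the package

  ## keepX values are arbitrary here; in practice they would be tuned
  mint_fit <- mint.splsda(X = stemcells$gene,
                          Y = stemcells$celltype,
                          study = stemcells$study,
                          ncomp = 2,
                          keepX = c(10, 5))

  ## the study name passed to roc.study must match one of the study labels
  study_name <- as.character(unique(stemcells$study))[1]
  auroc(mint_fit, roc.comp = 2, roc.study = study_name)
}
```

Evaluating one study at a time in this way gives a sense of how well the signature generalises across studies. See ?auroc for the full list of arguments in your installed version.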
Bug fixes (thank you to the users who notified us on bitbucket):
– fixed bug when using predict, perf or tune with the error message: "Error in predict.spls(spls.res, X.test[, nzv]) : 'newdata' must include all the variables of 'object$X'"
Workshops:
We advertised two workshops at this link. The advanced workshop 23-24 Oct 2017 is fully subscribed. This is our first MAW (mixOmics advanced workshop), but there will be more planned in 2018. We still have a few spots left for the classic workshop on the 9-10 Nov 2017 in Toulouse, contact us for more information (priority will be given to students and early career researchers).
Two senior postdoc positions (2 year and 3 year) still open!
The Australian mixOmics team, now based at the University of Melbourne, is recruiting two senior postdocs in the fields of computational biology or statistics: one full-time 2-year position to work with the Stemformatics team on exciting omics integration problems (‘omics and single cell omics) to improve stem cell classification, and one full-time 3-year position for innovative multivariate methods development for ‘omics time course, microbiome and P-integration. Contact us for more information.
Website update:
With the invaluable help of the bioinformatics masters students Danielle Davenport and Zoe Welham, we are currently revamping the website to ensure all code runs correctly. Thank you to those who sent us feedback!