We will be running a 2-day workshop at the Frazer Institute, University of Queensland. The workshop will comprise 1.5 days of lectures and hands-on practice, plus an additional 0.5 day for discussion and the opportunity to analyse your own data (assuming the data are already processed and normalised).
Fill in the survey to register your interest and needs for this workshop. We can only accept a limited number of participants, so lock in those dates in your calendar before we confirm your participation! Priority will be given to postgraduate students and early career researchers. Results, with details for registration, will be announced to participants on 17th February.
Context. Advances in high-throughput technologies have transformed the way we examine molecular information, including microbial communities. However, analytical tool development is critically trailing behind data generation, which hinders the analysis, understanding and integration of omics data. Data integration adopts a holistic, data-driven and hypothesis-free approach. This new approach is necessary to understand the role of biological systems and posit new hypotheses.
The workshop will introduce concepts of the multivariate dimension reduction methods developed in mixOmics for statistical analysis. Our methods make no distributional assumptions and are highly flexible for unsupervised (exploratory), supervised (classification) and integration analyses. Various analytical frameworks will be presented, ranging from data exploration and selection of markers to integration with other omics datasets and an introduction to time-course analysis. There will also be an opportunity to analyse your own data.
Each method will be illustrated on real biological studies. The last afternoon is ‘BYO data’, where you can reinforce what you have learnt on your own study!
Instructor: A/Prof Kim-Anh Lê Cao; Tutor: Nick Matigian (QCIF)
Organised and hosted by: Frazer Institute, University of Queensland
There are no registration fees for this workshop. We do expect your attendance as the number of places is limited. The workshop is fully catered. Slides, R code and data will be provided.
Registration. Fill in the survey and lock the dates in your calendar! As we have a limited number of places (30), priority will be given to postgraduate students and early career researchers. Results, with details for registration, will be announced to participants after the survey’s deadline. Online attendance is also available for a limited number of participants (but with reduced opportunities for interaction).
Location: TBA, Translational Research Institute
Contact: kimanh.lecao[ at] unimelb.edu.au (for questions about prerequisites or content)
Prerequisites and requirements. To benefit fully from the workshop, trainees need a good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs). Participants are requested to bring their own laptop with RStudio (http://www.rstudio.com/) and the R package mixOmics installed (instructions will be provided prior to the training).
Outline
The following broad topics will be covered during these two days:
A. Key methodologies in mixOmics and their variants:
- Exploration of one data set with Principal Component Analysis (the basics!)
- Identification of a molecular signature to discriminate different treatment groups with PLS-Discriminant Analysis
- Integration of two data sets and identification of markers with PLS
- Integration of more than two data sets to identify multi-omics signatures with PLS-DIABLO (if sufficient interest)
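As a taste of the first topic above, here is a minimal, language-agnostic sketch of what Principal Component Analysis computes. The workshop itself uses R and mixOmics; this plain Python version is only an illustration and is not the mixOmics implementation. The data values and the function names (`center_columns`, `covariance_matrix`, `first_loading_vector`) are hypothetical, and power iteration is used here simply because it is short to write.

```python
from math import sqrt


def center_columns(rows):
    """Subtract each column's mean so every variable has mean zero."""
    n, p = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in rows]


def covariance_matrix(centered):
    """Sample covariance matrix (p x p) of already-centred data."""
    n, p = len(centered), len(centered[0])
    return [
        [sum(row[i] * row[j] for row in centered) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_loading_vector(cov, iterations=200):
    """Power iteration: repeatedly multiply by cov and renormalise.

    The vector converges to the direction of largest variance,
    i.e. the loading vector of the first principal component.
    """
    p = len(cov)
    v = [1.0 / sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Hypothetical data: 5 samples (rows) measured on 3 variables (columns).
samples = [
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.9],
    [2.2, 2.9, 0.8],
    [1.9, 2.2, 1.1],
    [3.1, 3.0, 0.3],
]

centered = center_columns(samples)
cov = covariance_matrix(centered)
loadings = first_loading_vector(cov)

# Each sample's score is its projection onto the first component.
scores = [sum(x * l for x, l in zip(row, loadings)) for row in centered]

print("loadings:", loadings)
print("scores:", scores)
```

The loadings indicate how much each variable contributes to the component, and the scores are the coordinates that a sample plot would display.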
B. Graphical outputs implemented in mixOmics
- Sample plot representation
- Variable plot representation for data integration
- Other useful graphical outputs
C. Case studies and applications
Several omics studies (and microbiome studies, if there is interest) will be analysed using the methods presented above.
Day 2: bring your own data. Participants will be given the opportunity to analyse their own data with guidance and advice from the instructors. Participants can also work in teams. Your data need to be processed and normalised beforehand.
The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of markers, penalised regressions. Each methodology will be illustrated on a case study (theory and application will alternate).
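For readers who want a refresher on the first of these concepts before the workshop, the sketch below computes a sample covariance and a Pearson correlation by hand. It is written in plain Python purely for illustration (the workshop itself uses R), and the two "gene" vectors are made-up values, not data from any study.

```python
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance of two equal-length vectors (divides by n - 1)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation(x, y):
    """Pearson correlation: the covariance rescaled by both standard deviations."""
    return covariance(x, y) / sqrt(covariance(x, x) * covariance(y, y))


# Hypothetical expression values for two genes measured in five samples.
gene_a = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_b = [2.0, 4.0, 6.0, 8.0, 10.0]

cov_ab = covariance(gene_a, gene_b)
cor_ab = correlation(gene_a, gene_b)

print("covariance:", cov_ab)
print("correlation:", cor_ab)
```

Covariance depends on the measurement scale of each variable, whereas correlation is unit-free and always lies between -1 and 1, which is why the two are introduced together.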
Target group. The course is intended for molecular biologists working in the fields of bioinformatics, computational biology and applied statistics who have some statistical knowledge and a good working knowledge of R. It will be particularly useful to those interested in:
- Exploring data sets.
- Selecting molecular signatures with methods implementing LASSO-based penalisations.
- Using graphical techniques to better visualise data.
- Understanding and/or applying multivariate projection methodologies to large data sets.
Anticipated learning outcomes. After completing this workshop, participants will be able to:
- Understand the fundamental principles of multivariate projection-based dimension reduction techniques.
- Perform statistical integration and feature selection using recently developed multivariate methodologies.
- Apply those methods to high-throughput microbiome studies, including their own studies.