
[open] 3-day mixOmics workshop, 22 – 24 Sept 2025, Lund University

We have a few spots left for an in-person mixOmics workshop, which we would like to open to our wider community!

Modern high-throughput technologies generate complex biological data that require powerful yet accessible tools for analysis. This beginner-level workshop introduces participants to data integration and multivariate analysis using the R package mixOmics.

Through a series of hands-on sessions, we will explore how multivariate methods can uncover biological patterns, identify key molecular features (or ‘markers’), and integrate multiple omics datasets. The approach is hypothesis-free, flexible, and does not rely on strict statistical assumptions.

By the end of the workshop, participants will be familiar with the core mixOmics workflows for exploratory and supervised analysis. There will also be an opportunity to apply the methods to your own dataset, with expert guidance throughout.

Pre-requisite: Basic proficiency in R is essential (e.g. working with data frames, basic calculations, simple plots). Participants without R experience have reported difficulty keeping up and gaining value from the course.

Instructor: Prof Kim-Anh Lê Cao, the University of Melbourne

WHEN AND WHERE: Mon 22 to Wed 24 Sept 2025, 9am – 5pm, Lund University (Room: Maskrosen (E121), Ekologihuset, Lunds Universitet, Kontaktvägen 10, Lund 22362, Sweden; Google Maps pin)

REGISTRATION AND FEES: at this link. Registrations close on 12th September 2025.

Workshop schedule

Monday 22nd Sept and Tuesday 23rd Sept: methods and hands-on sessions.

The following broad topics will be covered.

A. Key methodologies in mixOmics and their variants

  • Basic processing of count data
  • Exploration of one data set and how to estimate missing values
  • Identification of a molecular signature to discriminate different treatment groups
  • Integration of two data sets and identification of biomarkers
  • Introduction to repeated measurements or longitudinal studies analysis
  • Integration of more than two data sets to identify multi-omics signatures
  • Integration of independent but related studies (optional)

B. Review of the graphical outputs implemented in mixOmics

  • Sample plot representation
  • Variable plot representation for data integration
  • Other useful graphical outputs

C. Case studies and applications

Several microbiome and omics studies will be analysed using the methods presented above.

Wednesday 24th Sept: bring your own data. Participants will have the opportunity to analyse their own data with guidance and advice from the instructor. Participants can also work in teams. Some data sets will be provided for those unable to bring their own data.

Statistical concepts

The following statistical concepts will be introduced: covariance and correlation, multiple linear regression, classification and prediction, cross-validation, selection of markers, penalised regressions. Each methodology will be illustrated on a case study (theory and application will alternate).

Target group

The course is intended for computational biologists and biologists with some statistical knowledge and a good working knowledge of R. It will be particularly useful to those interested in:

  • Understanding and/or applying multivariate projection methodologies to large data sets
  • Exploring data sets
  • Selecting molecular / microbial features with methods implementing LASSO-based penalisations
  • Using graphical techniques to better visualise data

Anticipated outcomes

After completion of this workshop, participants will be able to:

  • Apply multivariate methods to high-throughput microbiome studies, including their own studies
  • Understand the fundamental principles of multivariate projection-based dimension reduction techniques
  • Perform statistical integration and feature selection using recently developed multivariate methodologies

Workshop registration cancellation policy

To confirm your place in this workshop, the registration fee is payable at the time of booking. This commitment helps us plan and deliver the workshop effectively for all participants.

Cancellations and Refunds: Refunds are only available if the workshop is cancelled or postponed by the organiser. In that case, a full refund (including any service fees) will be issued automatically.

No-Show Policy: If you do not attend the workshop, your registration fee will be non-refundable.

Illness or Exceptional Circumstances: We understand that unexpected situations can arise. If you are unable to attend due to illness or other exceptional circumstances, please contact us as soon as possible. While refunds cannot be issued, we will review your situation with care and may consider alternative options at the organiser’s discretion.

This policy is designed to ensure fairness to all participants and to support the smooth delivery of our workshops.

🚀 mixOmics v6.32.0 released on Bioconductor 3.21

mixOmics v6.32.0 is now available on Bioconductor 3.21, compatible with R 4.5.0. This update brings new features, bug fixes, and improvements based on your feedback.

What’s new since the last Bioconductor release:

🔬 New features and enhancements

  • plotLoadings() now supports ggplot2-style plots with fully customisable aesthetics
  • tune() has been enhanced to support tuning of components or variables
  • New function perf.assess() evaluates final model performance

⚙️ Improved performance and reproducibility

  • tune() and perf() now support parallel processing using the BPPARAM argument and accept a seed argument to improve reproducibility
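As a rough illustration of the reproducibility point, the sketch below runs perf() twice with the same seed and checks that the cross-validated error rates agree. The BPPARAM and seed argument names are taken from the release notes above. The srbct example data, the sPLS-DA model, and the fold and repeat counts are our own assumptions for illustration, and exact behaviour may differ between package versions.

```r
library(mixOmics)
library(BiocParallel)

data(srbct)                      # example data shipped with mixOmics
X <- srbct$gene
Y <- srbct$class

# Illustrative model only: ncomp and keepX are arbitrary choices here
model <- splsda(X, Y, ncomp = 2, keepX = c(20, 20))

# SerialParam() runs on a single core; a parallel back-end such as
# SnowParam(workers = 2) could be passed to BPPARAM instead
cv1 <- perf(model, validation = "Mfold", folds = 5, nrepeat = 3,
            BPPARAM = SerialParam(), seed = 42)
cv2 <- perf(model, validation = "Mfold", folds = 5, nrepeat = 3,
            BPPARAM = SerialParam(), seed = 42)

# With the same seed, the two runs are expected to give the same error rates
same_ber <- isTRUE(all.equal(cv1$error.rate$BER, cv2$error.rate$BER))
same_ber
```

The same two arguments can be passed to tune() in the same way, according to the notes above.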

🧹 Bug fixes and usability improvements

  • plotIndiv() now correctly handles pch ordering and ellipse colours
  • Better error message in perf() when a class has only one sample
  • Streamlined multiblock functions by removing unused arguments


📦 Install this version:

if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager") 
BiocManager::install("mixOmics")

🔍 For a full list of changes, visit the README on our GitHub repo.

New Performance Assessment & Parameter Tuning – Beta test now!

We’ve streamlined performance assessment and parameter tuning functions, available for beta testing before the next Bioconductor release in April!

What’s New?

  • New perf.assess(): Assesses only the final model’s performance, returning key metrics (no plots). PR #344
  • Enhanced tune(): Now supports tuning components separately or alongside variables across multiple model types. PR #348
  • New documentation pages: Explore new webpages explaining key concepts and usage of these functions.

Get Involved

  • Install the latest development version using
    devtools::install_github("mixOmicsTeam/mixOmics", ref = "6.31.4")
  • Test the new functions on your models
  • Share feedback on the User Forum, or report bugs on GitHub Issues

Try it out and help us refine these features before official release!

Webinar: Φ-Space ST: a platform-agnostic method to identify cell states in spatial transcriptomics studies

We have a sequel to Φ-Space: Φ-Space ST, developed by Dr Jiadong Mao for spatial transcriptomics studies! We are very excited about these new developments and the potential of Φ-Space for single cell annotation!

Φ-Space ST is:

  • A novel and fast approach for cell type composition analysis.
  • Platform-agnostic and scalable, as it works across multiple spatial transcriptomics (ST) platforms, including CosMx, Visium, and Stereo-seq.
  • Accurate and integrative, as it identifies cell states by leveraging multiple scRNA-seq references.
  • Segmentation-free and niche-driven, as it annotates cell states at subcellular resolution, uncovering niche-specific cell types and tumour-distinguishing patterns.

Φ-Space ST: a platform-agnostic method to identify cell states in spatial transcriptomics studies. Jiadong Mao, Jarny Choi, Kim-Anh Lê Cao. bioRxiv 2025.

Watch Jiadong’s latest seminar, presented at Melbourne Integrative Genomics on Friday 14th February 2025:

Abstract

We introduce Φ-Space ST, a platform-agnostic method to identify continuous cell states in spatial transcriptomics (ST) data using multiple scRNA-seq references. For ST with supercellular resolution, Φ-Space ST achieves interpretable cell type deconvolution with significantly faster computation. For subcellular resolution, Φ-Space ST annotates cell states without cell segmentation, leading to highly insightful spatial niche identification. Φ-Space ST harmonises annotations derived from multiple scRNA-seq references, and provides interpretable characterisations of disease cell states by leveraging healthy references. We validate Φ-Space ST in three case studies involving CosMx, Visium and Stereo-seq platforms for various cancer tissues. Our method revealed niche-specific enriched cell types and distinct cell type co-presence patterns that distinguish tumour from non-tumour tissue regions. These findings highlight the potential of Φ-Space ST as a robust and scalable tool for ST data analysis for understanding complex tissues and pathologies.

mixOmics website update

We’re pleased to share that the mixOmics website has undergone a redesign to enhance your browsing experience and make it easier to access our resources.

What’s New?

  • Refreshed Design: A cleaner, more modern layout
  • 📚 Expanded Getting Started Pages: Guides to get you up and running with mixOmics
  • 🧭 Reorganized Navigation: A more intuitive menu to quickly find key resources
  • 🔗 Updated Social Links: Stay connected with the mixOmics community
  • 💬 Direct Links to the User Forum: If you haven’t already, join our mixOmics user forum to connect with over 500 other users and experts
  • 🧑‍💻 Updated About Pages: Learn more about the project and our team
  • 📅 Streamlined Workshops, Webinars, and News Sections: Easier access to events and updates
  • 🖥️ Embedded R Markdown Pages: Improved code presentation with syntax highlighting in our Methods, Plots, and Case Studies pages

We are continuing to make small improvements, so if you encounter any issues or have feedback, please feel free to contact us.

Thank you for your continued support of mixOmics.

The mixOmics Team

Missing values

All methodologies implemented in mixOmics can handle missing values. In particular, (s)PLS, (s)PLS-DA and (s)PCA utilise the NIPALS (Non-linear Iterative Partial Least Squares) algorithm as part of their dimension reduction procedures. This algorithm is built to handle NAs [1].

This is implemented through the nipals() function within mixOmics. This function is called internally by the above methods but can also be used manually, as can be seen below.

Usage in mixOmics

library(mixOmics)
data(liver.toxicity)
X <- liver.toxicity$gene[, 1:100] # a reduced size data set

## pretend there are 20 NA values in our data
na.row <- sample(1:nrow(X), 20, replace = TRUE)
na.col <- sample(1:ncol(X), 20, replace = TRUE)
X.na <- as.matrix(X)

## fill these NA values in X
X.na[cbind(na.row, na.col)] <- NA
sum(is.na(X.na)) # number of cells with NA
## [1] 20
# this might take some time depending on the size of the data set
nipals.tune = nipals(X.na, ncomp = 10)$eig
barplot(nipals.tune, xlab = 'Principal component', ylab = 'Explained variance')

FIGURE 1: Column graph of the explained variance of each Principal Component.

If missing values need to be imputed, the package provides impute.nipals() for this scenario. NIPALS is used to decompose the dataset. The resulting components, singular values and feature loadings can then be used to reconstitute the original dataset, now with estimated values in place of the missing ones. To allow for the best estimation of missing values, a large number of components is used (ncomp = 10).

X.impute <- impute.nipals(X = X.na, ncomp = 10)
sum(is.na(X.impute)) # number of cells with NA
## [1] 0
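The reconstruction idea described above can be sketched in base R. This is a simplified illustration under our own assumptions, not the code used by nipals() or impute.nipals(): the helper names nipals_one() and impute_sketch() are hypothetical, the singular value is folded into the scores rather than stored separately, and no centring or scaling is applied. Each regression step simply skips the missing cells, and the fitted rank-one terms are then used to fill them in.

```r
# One NIPALS component on a matrix with NAs: each regression step
# sums over the observed cells only (simplified sketch)
nipals_one <- function(X, max.iter = 500, tol = 1e-9) {
  W <- (!is.na(X)) * 1                 # 1 = observed, 0 = missing
  X0 <- X
  X0[is.na(X)] <- 0                    # zeros drop out of the cross-products
  t.vec <- X0[, which.max(colSums(X0^2))]   # starting scores
  for (i in seq_len(max.iter)) {
    # loadings: regress each column on the scores, observed rows only
    p <- crossprod(X0, t.vec) / crossprod(W, t.vec^2)
    p <- p / sqrt(sum(p^2))
    # scores: regress each row on the loadings, observed columns only
    t.new <- (X0 %*% p) / (W %*% p^2)
    converged <- sum((t.new - t.vec)^2) < tol
    t.vec <- t.new
    if (converged) break
  }
  list(t = drop(t.vec), p = drop(p))
}

# Extract ncomp components by deflation, then fill the missing cells
# with the sum of the rank-one reconstructions
impute_sketch <- function(X, ncomp = 2) {
  R <- X                               # residual matrix, NAs stay NA
  fit <- matrix(0, nrow(X), ncol(X))
  for (h in seq_len(ncomp)) {
    comp <- nipals_one(R)
    rank1 <- tcrossprod(comp$t, comp$p)   # scores times loadings
    fit <- fit + rank1
    R <- R - rank1                        # deflation
  }
  X.hat <- X
  X.hat[is.na(X)] <- fit[is.na(X)]     # observed cells are left untouched
  X.hat
}

# Toy data: a noise-free rank-2 matrix with four cells removed
set.seed(1)
scores <- matrix(rnorm(30 * 2), 30, 2)
loadings <- matrix(rnorm(8 * 2), 8, 2)
X.full <- scores %*% t(loadings)
X.miss <- X.full
holes <- cbind(c(3, 11, 20, 27), c(1, 4, 6, 8))   # (row, column) pairs
X.miss[holes] <- NA

X.fill <- impute_sketch(X.miss, ncomp = 2)
cbind(true = X.full[holes], estimated = X.fill[holes])
```

Because the components are fitted one at a time around the missing cells, the estimates are approximations even on noise-free data.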

The difference between the imputed and real values can be checked.
Here are the original values:

id.na = is.na(X.na) # determine position of NAs in the matrix

X[id.na] # show original values
##  [1]  0.09041 -0.04070  0.03497 -0.01712  0.01309  0.00233 -0.04142  0.11104
##  [9] -0.01519 -0.17034 -0.01641  0.15964  0.00557 -0.06217  0.04131  0.02157
## [17]  0.01226 -0.00753  0.03038 -0.00783

The values which were estimated via the NIPALS algorithm:

X.impute[id.na] # show imputed values
##  [1]  0.0837747419 -0.0190061068  0.0004024897 -0.0180879247 -0.0094185656
##  [6] -0.0312362158 -0.0706920015  0.1400817774  0.0083359545 -0.1158255139
## [11]  0.0164817649  0.1007897385  0.0236184385  0.0191934144  0.0214240977
## [16]  0.0686280312 -0.0039198425  0.0085870558  0.0450234407  0.0013964758
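One way to summarise the gap between the two sets of values is with a few simple error measures. The sketch below copies the twenty original and imputed values printed above so that it runs on its own. In a live session, X[id.na] and X.impute[id.na] would be used instead. The choice of RMSE, mean absolute error and correlation as summaries is our own assumption, not something prescribed by mixOmics.

```r
# Values copied from the printed output above
original <- c( 0.09041, -0.04070,  0.03497, -0.01712,  0.01309,
               0.00233, -0.04142,  0.11104, -0.01519, -0.17034,
              -0.01641,  0.15964,  0.00557, -0.06217,  0.04131,
               0.02157,  0.01226, -0.00753,  0.03038, -0.00783)
imputed  <- c( 0.0837747419, -0.0190061068,  0.0004024897, -0.0180879247,
              -0.0094185656, -0.0312362158, -0.0706920015,  0.1400817774,
               0.0083359545, -0.1158255139,  0.0164817649,  0.1007897385,
               0.0236184385,  0.0191934144,  0.0214240977,  0.0686280312,
              -0.0039198425,  0.0085870558,  0.0450234407,  0.0013964758)

err  <- imputed - original
rmse <- sqrt(mean(err^2))       # root mean squared error
mae  <- mean(abs(err))          # mean absolute error
r    <- cor(original, imputed)  # agreement in ranking and direction

round(c(RMSE = rmse, MAE = mae, correlation = r), 3)
```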



mixOmics 6.30.0 on Bioconductor

At the end of October 2024, Bioconductor was updated to version 3.20, which includes the latest version of mixOmics, 6.30.0. You can install the latest version of mixOmics on Bioconductor here. This release runs on R version 4.4 and includes some minor bug fixes as well as updated code and unit tests. See our GitHub page for more details on these updates.

Webinar: PLS methods

This webinar was presented as a seminar to a group of quantitative researchers (mostly statisticians) at the University of Melbourne. The abstract is below.

Topics covered: context of data integration, PCA solved with NIPALS algorithm and SVD, sparse PCA, correlation circle plot interpretation, PLS algorithms and deflation modes, sparse PLS.

Technological improvements have allowed for the collection of data from different types of molecules (e.g. genes, proteins, metabolites, microorganisms), resulting in multiple ‘omics data (e.g. transcriptomics, proteomics, metabolomics, microbiome) measured from the same set of N biospecimens or individuals. In this talk I will introduce the statistical integration of these multi-omics data to shed more light on a biological system.

Integrating data involves numerous challenges: the data sets are complex and large, each with few samples (N < 50) and many molecules (P > 10,000), and are generated using different technologies. I will present PLS (Partial Least Squares / Projection to Latent Structures, developed by Wold in the 1980s) as an algorithm of choice for data integration of small N, large P problems. PLS and its variants form the basis of our comprehensive mixOmics R package for feature selection, dimension reduction and integration of omics data sets. This talk is targeted at a general audience with background knowledge in statistics and an interest in large data.

The webinar was re-recorded for the PLS section.

Webinar: Time-course multi-omics integration

I presented this talk for a group of statisticians at the Australian National University in Canberra. The abstract is below.

Topics covered: linear mixed model splines, multi-omics integration (PLS multiblock), correlation circle plot interpretation, timeOmics.

Longitudinal experiments are becoming increasingly popular in omics studies to monitor molecular changes following treatment or during disease progression. Integrating these data sets can give us some mechanistic insights into the different types of omics layers.

However, longitudinal omics data present numerous challenges including a small number of time points that may be unevenly spaced and unmatched between different data types, a small number of individuals, and a high individual variability. While current approaches have focused on differential expression across time or time profile clustering, the modelling of omics time profiles in a multivariate manner is critically lacking to understand longitudinal biological interactions.

I will present a statistical framework, timeOmics, to identify correlated profiles over time and between omics (transcriptomics, metabolomics, microbiome) to give insights into the molecular dynamics of biological systems and discuss future avenues of research in this expanding area.

Some key references

The timeOmics package

timeOmics is currently not directly available from the mixOmics package; instead, it is a separate R package hosted on Bioconductor. See the Bioconductor page for installation instructions.