🚀 mixOmics v6.32.0 released on Bioconductor 3.21

mixOmics v6.32.0 is now available on Bioconductor 3.21, compatible with R 4.5.0. This update brings new features, bug fixes, and improvements based on your feedback.

What’s new since the last Bioconductor release:

🔬 New features and enhancements

plotLoadings() now supports ggplot2-style plots with fully customisable aesthetics
tune() has been enhanced to support tuning of components or variables
New function perf.assess() evaluates final model performance

⚙️ Improved performance and reproducibility

tune() and perf() now support parallel processing using the BPPARAM argument and accept a seed argument to improve reproducibility
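As a minimal sketch of the two arguments named above, the call below runs perf() on a small sPLS-DA model with a BiocParallel backend and a fixed seed. The srbct example data, the model settings (ncomp, keepX, folds, nrepeat) and the seed value are illustrative assumptions, not recommendations. SerialParam() is used so the sketch runs anywhere; swapping in a parallel backend such as SnowParam(workers = 2) is the intended way to use several cores.

```r
# Guarded so the script still runs where the packages are not installed
if (requireNamespace("mixOmics", quietly = TRUE) &&
    requireNamespace("BiocParallel", quietly = TRUE)) {
  library(mixOmics)
  library(BiocParallel)

  data(srbct)                      # example data shipped with mixOmics
  X <- srbct$gene
  Y <- srbct$class

  # Small sPLS-DA model; ncomp and keepX are arbitrary illustrative choices
  model <- splsda(X, Y, ncomp = 2, keepX = c(10, 10))

  # BPPARAM selects the BiocParallel backend, seed fixes the fold assignment
  perf.res <- perf(model, validation = "Mfold", folds = 5, nrepeat = 3,
                   BPPARAM = SerialParam(), seed = 42)
  print(perf.res$error.rate)
}
```

Re-running the same call with the same seed is expected to give the same cross-validation results, which is the reproducibility gain this release describes.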

🧹 Bug fixes and usability improvements

plotIndiv() now correctly handles pch ordering and ellipse colours
Better error message in perf() when a class has only one sample
Streamlined multiblock functions by removing unused arguments

📦 Install this version:

if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager") 
BiocManager::install("mixOmics")

🔍 For a full list of changes, visit the README on our GitHub repo.

New Performance Assessment & Parameter Tuning – Beta test now!

We’ve streamlined performance assessment and parameter tuning functions, available for beta testing before the next Bioconductor release in April!

What’s New?

New perf.assess(): Assesses only the final model’s performance, returning key metrics (no plots). PR #344
Enhanced tune(): Now supports tuning components separately or alongside variables across multiple model types. PR #348
New documentation pages: Explore new webpages explaining key concepts and usage of these functions.

Get Involved

Install the latest development version using
devtools::install_github("mixOmicsTeam/mixOmics", ref = "6.31.4")
Test the new functions on your models
Share feedback on the User Forum or report bugs on GitHub Issues

Try it out and help us refine these features before official release!

Webinar: Φ-Space ST: a platform-agnostic method to identify cell states in spatial transcriptomics studies

Φ-Space now has a sequel: Φ-Space ST, developed by Dr Jiadong Mao for spatial transcriptomics studies! We are very excited about these new developments and the potential of Φ-Space for single cell annotation!

Φ-Space ST is:

A novel and fast approach for cell type composition analysis.
Platform-Agnostic and Scalable as it works across multiple spatial transcriptomics (ST) platforms, including CosMx, Visium, and Stereo-seq.
Accurate and integrative as it identifies cell states by leveraging multiple scRNA-seq references.
Segmentation-Free & Niche-Driven as it annotates cell states at subcellular resolution, uncovering niche-specific cell types and tumor-distinguishing patterns.

Φ-Space ST: a platform-agnostic method to identify cell states in spatial transcriptomics studies. Jiadong Mao, Jarny Choi, Kim-Anh Lê Cao. bioRxiv 2025.

Check out Jiadong’s latest seminar, presented at Melbourne Integrative Genomics on Friday 14 February 2025:

Abstract

We introduce Φ-Space ST, a platform-agnostic method to identify continuous cell states in spatial transcriptomics (ST) data using multiple scRNA-seq references. For ST with supercellular resolution, Φ-Space ST achieves interpretable cell type deconvolution with significantly faster computation. For subcellular resolution, Φ-Space ST annotates cell states without cell segmentation, leading to highly insightful spatial niche identification. Φ-Space ST harmonises annotations derived from multiple scRNA-seq references, and provides interpretable characterisations of disease cell states by leveraging healthy references. We validate Φ-Space ST in three case studies involving CosMx, Visium and Stereo-seq platforms for various cancer tissues. Our method revealed niche-specific enriched cell types and distinct cell type co-presence patterns that distinguish tumour from non-tumour tissue regions. These findings highlight the potential of Φ-Space ST as a robust and scalable tool for ST data analysis for understanding complex tissues and pathologies.

mixOmics website update

We’re pleased to share that the mixOmics website has undergone a redesign to enhance your browsing experience and make it easier to access our resources.

What’s New?

✨ Refreshed Design: A cleaner, more modern layout
📚 Expanded Getting Started Pages: Guides to get you up and running with mixOmics
🧭 Reorganized Navigation: A more intuitive menu to quickly find key resources
🔗 Updated Social Links: Stay connected with the mixOmics community
💬 Direct Links to the User Forum: If you haven’t already, join our mixOmics user forum to connect with over 500 other users and experts
🧑‍💻 Updated About Pages: Learn more about the project and our team
📅 Streamlined Workshops, Webinars, and News Sections: Easier access to events and updates
🖥️ Embedded R Markdown Pages: Improved code presentation with syntax highlighting in our Methods, Plots, and Case Studies pages

We are continuing to make small improvements, so if you encounter any issues or have feedback, please feel free to contact us.

Thank you for your continued support of mixOmics.

The mixOmics Team

Missing values

All methodologies implemented in mixOmics can handle missing values. In particular, (s)PLS, (s)PLS-DA and (s)PCA use the NIPALS (Non-linear Iterative Partial Least Squares) algorithm as part of their dimension reduction procedures. This algorithm is built to handle NAs [1].

This is implemented through the nipals() function within
mixOmics. This function is called internally by the above methods but
can also be used manually, as can be seen below.

Usage in mixOmics

library(mixOmics)
data(liver.toxicity)
X <- liver.toxicity$gene[, 1:100] # a reduced size data set

## pretend there are 20 NA values in our data
na.row <- sample(1:nrow(X), 20, replace = TRUE)
na.col <- sample(1:ncol(X), 20, replace = TRUE)
X.na <- as.matrix(X)

## fill these NA values in X
X.na[cbind(na.row, na.col)] <- NA
sum(is.na(X.na)) # number of cells with NA

## [1] 20

# this might take some time depending on the size of the data set
nipals.tune = nipals(X.na, ncomp = 10)$eig
barplot(nipals.tune, xlab = 'Principal component', ylab = 'Explained variance')

FIGURE 1: Column graph of the explained variance of each Principal
Component.

If missing values need to be imputed, the package provides impute.nipals(). NIPALS is used to decompose the dataset. The resulting components, singular values and feature loadings can then be used to reconstitute the original dataset, now with estimated values where the missing values were. To allow for the best estimation of missing values, a large number of components is used (ncomp = 10).
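The decompose-then-reconstitute idea described above can be sketched in base R. This is an illustrative toy version under stated assumptions, not the code behind nipals() or impute.nipals(): the helper name nipals_na, the simulated data, the NA positions and the convergence settings are all hypothetical. Missing cells are skipped in every sum, and the scores and loadings are then multiplied back together to fill them in.

```r
# Hypothetical helper: NIPALS-style PCA that skips missing cells
nipals_na <- function(X, ncomp = 2, max.iter = 500, tol = 1e-9) {
  X <- as.matrix(X)
  obs <- (!is.na(X)) * 1               # 1 where observed, 0 where missing
  R <- X                               # residual matrix, deflated per component
  scores <- matrix(0, nrow(X), ncomp)
  loadings <- matrix(0, ncol(X), ncomp)
  for (h in seq_len(ncomp)) {
    R0 <- R
    R0[obs == 0] <- 0                  # zeros remove missing cells from the sums
    t.h <- R0[, which.max(colSums(R0^2))]  # start from the column with most signal
    for (iter in seq_len(max.iter)) {
      # loading j: regress column j on the scores, observed rows only
      p.h <- crossprod(R0, t.h) / crossprod(obs, t.h^2)
      p.h <- p.h / sqrt(sum(p.h^2))
      # score i: regress row i on the loadings, observed columns only
      t.new <- (R0 %*% p.h) / (obs %*% p.h^2)
      converged <- sum((t.new - t.h)^2) < tol
      t.h <- t.new
      if (converged) break
    }
    scores[, h] <- t.h
    loadings[, h] <- p.h
    R <- R - tcrossprod(t.h, p.h)      # deflation; missing cells stay NA
  }
  list(scores = scores, loadings = loadings)
}

# Toy data: a noisy rank-2 matrix with 10 cells set to NA
set.seed(1)
n <- 30; p <- 8
X.full <- matrix(rnorm(n * 2), n, 2) %*% matrix(rnorm(2 * p), 2, p) +
  matrix(rnorm(n * p, sd = 0.05), n, p)
X.na <- X.full
na.idx <- cbind(1:10, rep(1:5, 2))     # one missing cell in each of rows 1 to 10
X.na[na.idx] <- NA

fit <- nipals_na(X.na, ncomp = 2)
X.hat <- tcrossprod(fit$scores, fit$loadings)   # reconstituted data
X.filled <- X.na
X.filled[is.na(X.na)] <- X.hat[is.na(X.na)]     # only the missing cells are replaced

sum(is.na(X.filled))                            # number of cells still NA
cor(X.filled[na.idx], X.full[na.idx])           # agreement with the hidden values
```

Observed cells are left untouched; only the NA positions receive reconstructed values, which mirrors what the imputation step described above is meant to do.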

X.impute <- impute.nipals(X = X.na, ncomp = 10)
sum(is.na(X.impute)) # number of cells with NA

## [1] 0

The difference between the imputed and real values can be checked.
Here are the original values:

id.na = is.na(X.na) # logical matrix marking the positions of the NAs

X[id.na] # show original values

##  [1]  0.09041 -0.04070  0.03497 -0.01712  0.01309  0.00233 -0.04142  0.11104
##  [9] -0.01519 -0.17034 -0.01641  0.15964  0.00557 -0.06217  0.04131  0.02157
## [17]  0.01226 -0.00753  0.03038 -0.00783

The values estimated via the NIPALS algorithm:

X.impute[id.na] # show imputed values

##  [1]  0.0837747419 -0.0190061068  0.0004024897 -0.0180879247 -0.0094185656
##  [6] -0.0312362158 -0.0706920015  0.1400817774  0.0083359545 -0.1158255139
## [11]  0.0164817649  0.1007897385  0.0236184385  0.0191934144  0.0214240977
## [16]  0.0686280312 -0.0039198425  0.0085870558  0.0450234407  0.0013964758
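To put a number on the difference between the two printouts, a few summary statistics can be computed. The sketch below simply copies the 20 original and 20 imputed values shown above into vectors; the choice of RMSE, mean absolute error and correlation as summaries is an assumption made here for illustration, not something the package prescribes.

```r
# Values copied from the two outputs above
original <- c(0.09041, -0.04070, 0.03497, -0.01712, 0.01309, 0.00233,
              -0.04142, 0.11104, -0.01519, -0.17034, -0.01641, 0.15964,
              0.00557, -0.06217, 0.04131, 0.02157, 0.01226, -0.00753,
              0.03038, -0.00783)
imputed <- c(0.0837747419, -0.0190061068, 0.0004024897, -0.0180879247,
             -0.0094185656, -0.0312362158, -0.0706920015, 0.1400817774,
             0.0083359545, -0.1158255139, 0.0164817649, 0.1007897385,
             0.0236184385, 0.0191934144, 0.0214240977, 0.0686280312,
             -0.0039198425, 0.0085870558, 0.0450234407, 0.0013964758)

err <- imputed - original
rmse <- sqrt(mean(err^2))      # root mean squared error
mae <- mean(abs(err))          # mean absolute error
r <- cor(original, imputed)    # linear agreement between the two sets

round(c(RMSE = rmse, MAE = mae, correlation = r), 4)
```

With the objects from the example above, the same summaries can be obtained directly from X[id.na] and X.impute[id.na]. Because the NA positions are sampled without a fixed seed, the values will differ between runs.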

References

[1] Wold, H. (1973). Nonlinear Iterative Partial Least Squares (NIPALS) Modelling: Some Current Developments. Multivariate Analysis–III, 383-407. https://doi.org/10.1016/b978-0-12-426653-7.50032-6

mixOmics 6.30.0 on Bioconductor

At the end of October 2024, Bioconductor was updated to version 3.20, which includes the latest version of mixOmics, 6.30.0. You can install the latest version of mixOmics on Bioconductor here. This release runs on R version 4.4 and includes some minor bug fixes along with updated code and unit tests. See our GitHub page for more details on these updates.

Webinar: PLS methods

This webinar was presented as a seminar to a group of quantitative researchers (mostly statisticians) at the University of Melbourne. The abstract is below.

Topics covered: context of data integration, PCA solved with NIPALS algorithm and SVD, sparse PCA, correlation circle plot interpretation, PLS algorithms and deflation modes, sparse PLS.

Technological improvements have allowed for the collection of data from different types of molecules (e.g. genes, proteins, metabolites, microorganisms) resulting in multiple ‘omics data (e.g. transcriptomics, proteomics, metabolomics, microbiome) measured from the same set of N biospecimens or individuals. In this talk I will introduce the statistical integration of these multi-omics data to shed more light on a biological system.

Integrating these data poses numerous challenges: the data sets are complex and large, each with few samples (N < 50) and many molecules (P > 10,000), and they are generated using different technologies. I will present PLS (Partial Least Squares / Projection to Latent Structures, developed by Wold in the 1980s) as an algorithm of choice for data integration of small N, large P problems. PLS and its variants form the basis of our comprehensive mixOmics R package for feature selection, dimension reduction and integration of omics data sets. This talk is targeted at a general audience with background knowledge in statistics and an interest in large data.

The webinar was re-recorded for the PLS section.

Webinar: Time-course multi-omics integration

I presented this talk for a group of statisticians at the Australian National University in Canberra. The abstract is below.

Topics covered: linear mixed model splines, multi-omics integration (PLS multiblock), correlation circle plot interpretation, timeOmics.

Longitudinal experiments are becoming increasingly popular in omics studies to monitor molecular changes following treatment or during disease progression. Integrating these data sets can give us some mechanistic insights into the different types of omics layers.

However, longitudinal omics data present numerous challenges including a small number of time points that may be unevenly spaced and unmatched between different data types, a small number of individuals, and a high individual variability. While current approaches have focused on differential expression across time or time profile clustering, the modelling of omics time profiles in a multivariate manner is critically lacking to understand longitudinal biological interactions.

I will present a statistical framework, timeOmics, to identify correlated profiles over time and between omics (transcriptomics, metabolomics, microbiome) to give insights into the molecular dynamics of biological systems and discuss future avenues of research in this expanding area.

Some key references

Straube J, Gorse AD, PROOF Centre of Excellence Team, Huang BE and Lê Cao K-A (2015). A linear mixed model spline framework for analysing time course ‘omics’ data. PLoS ONE 10(8): e0134540
A Bodein, O Chapleur, A Droit, K-A Lê Cao (2019). A Generic Multivariate Framework for the Integration of Microbiome Longitudinal Studies With Other Data Types, Frontiers in Genetics, 10
A Bodein, M-P Scott-Boyer, O Perin, K-A Lê Cao, A Droit (2022). timeOmics: an R package for longitudinal multi-omics data integration, Bioinformatics, 38(2)

The timeOmics package

timeOmics is not part of the mixOmics package; it is a separate R package hosted on Bioconductor. See the Bioconductor page for installation instructions.

[closed] Self-paced online course Feb 24 – April 11, 2025

Single and multi-omics analysis and integration with mixOmics

Our registrations are now closed! If you missed out, fill in this Expression of Interest form so that we can notify you of new workshops.

This course is designed for:

Beginners looking for an introduction to mixOmics methods for single- and multi-omics analyses.
Current mixOmics users who want to deepen their understanding of the mixOmics methods.
Users who would like more guidance on analyzing their own data (we also provide exemplar datasets).

The workshop is self-paced and spans 7 weeks. There are 4 live Q&A sessions, and many opportunities to interact with the cohort and your instructor Prof Kim-Anh Lê Cao via Slack. BYO data is encouraged: we provide advice so that you can analyse your own data with mixOmics tools as part of your learning process. A good working knowledge of R programming (e.g. handling data frames, performing simple calculations and displaying simple graphical outputs) is essential to fully benefit from the course*.

According to our past participants, a time commitment of 5-8h/week was sufficient to feel that they were progressing. Here is some feedback from a previous course.

We provide a certificate of attendance or completion.

Register here, places are limited!

Fees

Research Higher Degree students enrolled at a University: $495 AUD (incl. GST) [discount code: MIXO_RHD]

Staff and members from Universities & Not-for-profit organisations: $825 AUD (incl. GST) [discount code: MIXO_NFP_STAFF]

Other industries: $1320 AUD (incl. GST)

Discounts of 5% are available for groups of 3-9 learners and 10% for groups of 10+ learners; however, this requires a single invoice per group.

These funds go towards the support of a software developer to maintain the package. If you need an invoice, contact Student Support at continuing-education[at]unimelb.edu.au

Teaching Period Dates

Teaching commences: Monday, 24 Feb 2025, 9:00 am AEST
- Q&A live webinars are scheduled on Thursdays 6pm AEST / 8am CET during the first 4 weeks (27th Feb, 6th, 13th and 20th March).
- An additional session might be added on Fridays 9am AEST ( = Thursdays 2pm PST / 5pm EST / 9pm CET)

Teaching concludes: Sunday, 23 March 2025, 11:59 pm AEST (after 4 weeks)

(non marked) Assessment due: Friday 4 April 2025 (2 weeks prep)

Peer-review of assessment due: Friday 11 April 2025 (1 week prep)

The course is divided into theory (50%) and hands-on practice, with the opportunity to analyse your own data. The exercises and assignments are in R. Participants are encouraged to use RStudio and Rmarkdown (template and R code provided).

*Need an R refresher?

Learners who are not proficient in R do not get the full benefit of the course (based on their own honest feedback!). For those looking for an R refresher well ahead of the course:

The R cheatsheets for reference: https://iqss.github.io/dss-workshops/R/Rintro/base-r-cheat-sheet.pdf

https://monashdatafluency.github.io/r-intro-2/index.html