DIABLO (Data Integration Analysis for Biomarker Discovery using Latent Variable Approaches for Omics Studies) is a supervised, N-integration method for integrating multiple datasets in relation to a categorical outcome variable. It employs multiblock (s)PLS-DA to identify correlations between datasets, and uses a design matrix to control the relationships between them. DIABLO constructs latent components by maximising the covariances between datasets, while balancing model discrimination and integration. It can also perform predictions for novel samples based on these components.
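The workflow described above can be sketched in R roughly as follows. This is a minimal illustration, not a tuned analysis. It assumes the mixOmics package is installed and uses the `breast.TCGA` data and the functions listed on this page. The design value of 0.1, the two components, and the 10 variables kept per block and component are arbitrary assumptions chosen for illustration; in practice they would be selected by tuning.

```r
# Minimal DIABLO sketch; assumes the mixOmics package is installed.
library(mixOmics)

data(breast.TCGA)

# Training blocks: samples in rows, matched across all blocks.
X <- list(
  mirna   = breast.TCGA$data.train$mirna,
  mrna    = breast.TCGA$data.train$mrna,
  protein = breast.TCGA$data.train$protein
)
Y <- breast.TCGA$data.train$subtype  # categorical outcome

# Design matrix: off-diagonal values between 0 and 1 set how strongly each
# pair of blocks is linked. 0.1 is an illustrative choice, not a recommendation.
design <- matrix(0.1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0

# Illustrative (untuned) settings: 2 components, 10 variables kept
# per block on each component.
ncomp <- 2
keepX <- list(mirna = c(10, 10), mrna = c(10, 10), protein = c(10, 10))

diablo <- block.splsda(X, Y, ncomp = ncomp, keepX = keepX, design = design)

# Sample plot, correlation circle plot, and loadings of selected variables.
plotIndiv(diablo, ind.names = FALSE, legend = TRUE)
plotVar(diablo)
plotLoadings(diablo, comp = 1, contrib = "max")

# Predict the subtype of new samples. The test set has no protein block,
# so prediction uses the two available blocks only.
X.test <- list(
  mirna = breast.TCGA$data.test$mirna,
  mrna  = breast.TCGA$data.test$mrna
)
pred <- predict(diablo, newdata = X.test)
```

Replacing `block.splsda()` with `block.plsda()` and dropping `keepX` gives the non-sparse variant, which keeps all variables in every block.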
🎥 Watch: Webinar on DIABLO
Data used on this page:
breast.TCGA
Key functions used on this page:
block.plsda()
block.splsda()
plotLoadings()
plotIndiv()
plotVar()
Related case studies:
DIABLO TCGA Case Study
More DIABLO Examples
References:
1. Singh, A., Shannon, C. P., Gautier, B., Rohart, F., Vacher, M., Tebbutt, S. J., and Lê Cao, K.-A. (2019). DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays. Bioinformatics, 35(17):3055–3062.
Publications where DIABLO has been applied:
1. Lee, A. H., Shannon, C. P., Amenyogbe, N., et al. (2019). Dynamic molecular changes during the first week of human life follow a robust developmental trajectory. Nature Communications, 10:1092.
2. Gavin, P. G., Mullaney, J. A., Loo, D., Cao, K. L., Gottlieb, P. A., Hill, M. M., Zipris, D., and Hamilton-Williams, E. E. (2018). Intestinal metaproteomics reveals host-microbiota interactions in subjects at risk for type 1 diabetes. Diabetes Care, 41(10):2178–2186.
Additional notes:
DIABLO is a collaborative work between the core team (Dr Florian Rohart, Dr Kim-Anh Lê Cao) and key contributors (Dr Amrit Singh, Benoît Gautier), arising from a long-term collaboration with the University of British Columbia.
We are also investigating a kernel-based N-integration method (see our mixKernel example), which has so far been developed for unsupervised analysis only.