SRBCT sPLS-DA Case Study (Permuted Labels)

This case study follows the same approach as the original sPLS-DA SRBCT case study, but with the class labels (srbct$class) randomly shuffled. The number of samples in each class stays the same, but the labels are reassigned to different samples, which breaks the link between the expression profiles and the classes. This simulates a situation where class differences are unclear or the data are very noisy. Using the same pre-tuned parameters, sPLS-DA models were run on both the original and the permuted data. With permuted labels the model performed much worse, showing poor class separation, low stability of the selected features, and higher error rates. This highlights that sPLS-DA works well only when the data contain real signal, and breaks down when meaningful class structure is lost.
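The comparison described above can be sketched roughly as follows. This is a minimal illustration, not the script behind this page: the values of `ncomp` and `keepX` are placeholders (the tuned values from the original case study should be substituted), and the `perf()` settings (5 folds, 10 repeats, `max.dist`) are assumptions.

```r
library(mixOmics)

data(srbct)
X <- srbct$gene    # gene expression matrix (samples x genes)
Y <- srbct$class   # original class labels (factor)

# Shuffle the labels across samples; class sizes are unchanged
Y.perm <- sample(Y)
stopifnot(all(table(Y) == table(Y.perm)))

# Placeholder tuning values (assumption): replace with the pre-tuned ones
ncomp <- 3
keepX <- c(10, 10, 10)

# Same model settings, original vs permuted labels
fit.orig <- splsda(X, Y,      ncomp = ncomp, keepX = keepX)
fit.perm <- splsda(X, Y.perm, ncomp = ncomp, keepX = keepX)

# Sample plots: class separation on the first two components
plotIndiv(fit.orig, comp = c(1, 2), ind.names = FALSE,
          legend = TRUE, ellipse = TRUE, title = "Original labels")
plotIndiv(fit.perm, comp = c(1, 2), ind.names = FALSE,
          legend = TRUE, ellipse = TRUE, title = "Permuted labels")

# Repeated cross-validation: error rates and feature stability
perf.orig <- perf(fit.orig, validation = "Mfold", folds = 5, nrepeat = 10,
                  dist = "max.dist", progressBar = FALSE)
perf.perm <- perf(fit.perm, validation = "Mfold", folds = 5, nrepeat = 10,
                  dist = "max.dist", progressBar = FALSE)

perf.orig$error.rate$BER   # balanced error rate per component
perf.perm$error.rate$BER

# Selection frequency of component 1 features across folds
head(perf.orig$features$stable[[1]])
head(perf.perm$features$stable[[1]])
```

Because `sample()` and the cross-validation folds are random, the exact numbers change between runs.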


Data used on this page:
srbct

Key functions used on this page:
splsda()
plotIndiv()
plotVar()

Note:
A seed is not set in this script, so re-running the code will give slightly different outputs (i.e. values and plots) from those shown here.
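For readers who want a repeatable run, one option is to call `set.seed()` before any random step. The sketch below uses a stand-in label vector built only for illustration (in the case study the labels would be `srbct$class`), and the seed value 123 is arbitrary.

```r
# Stand-in labels for illustration only (assumption); the case study
# would use srbct$class here
labels <- factor(rep(c("EWS", "BL", "NB", "RMS"), times = c(23, 8, 12, 20)))

set.seed(123)              # arbitrary seed value
perm.a <- sample(labels)

set.seed(123)              # same seed again
perm.b <- sample(labels)

# Same seed gives the same permutation
same.permutation <- identical(perm.a, perm.b)

# Shuffling never changes how many samples each class has
same.class.sizes <- all(table(perm.a) == table(labels))
```

The same call would also need to precede any cross-validation step, since fold assignment is random as well.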

References:
1. Ruiz-Perez, D., Guan, H., Madhivanan, P. et al. So you think you can PLS-DA?. BMC Bioinformatics 21, 2 (2020). https://doi.org/10.1186/s12859-019-3310-7