1 Introduction

Cardinal 3 provides statistical methods for both supervised and unsupervised analysis of mass spectrometry (MS) imaging experiments. Class comparison can also be performed, given an appropriate experimental design and a sufficient sample size.

Before statistical analysis, it is important to identify the statistical goal of the experiment: unsupervised analysis, supervised analysis, or class comparison.

CardinalWorkflows provides real experimental data and more detailed discussion of the statistical methods than will be covered in this brief overview.

2 Exploratory analysis

Suppose we are exploring an unlabeled dataset, and wish to understand the structure of the data.

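library(Cardinal)

# Simulate a small centroided dataset; the seed makes the simulation reproducible
set.seed(2020, kind="L'Ecuyer-CMRG")
mse <- simulateImage(preset=2, dim=c(32,32), sdnoise=0.5,
    peakheight=c(2,4), centroided=TRUE)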

mse$design <- makeFactor(circle=mse$circle,
    square=mse$square, bg=!(mse$circle | mse$square))

image(mse, "design")

image(mse, i=c(5, 13, 21), layout=c(1,3))

2.1 Principal components analysis (PCA)

Principal components analysis is an unsupervised dimension reduction technique. It reduces the data to some number of “principal components”, each of which is a linear combination of the original mass features. Each component is orthogonal to all of the preceding components and explains as much of the remaining variance in the data as possible.
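The two properties described above (orthogonal components, ordered by variance explained) can be checked on a toy example. The following is a minimal sketch that uses base R's prcomp() rather than Cardinal; the data matrix X, its six columns, and the latent factors z1 and z2 are made up for illustration and are not part of the simulated imaging dataset.

```r
# Toy data (an assumption for illustration): 200 observations of six
# correlated features driven by two latent factors plus noise
set.seed(1)
n <- 200
z1 <- rnorm(n)
z2 <- rnorm(n)
E <- matrix(rnorm(n * 6, sd = 0.3), nrow = n, ncol = 6)
X <- cbind(3 * z1, 2 * z1, z1, 2 * z2, z2, 0) + E
colnames(X) <- paste0("mz", 1:6)

# Base R PCA on the centered (unscaled) features
fit <- prcomp(X, center = TRUE, scale. = FALSE)

# The loading vectors are orthonormal, so this cross-product is the identity
ortho <- crossprod(fit$rotation)
print(round(ortho, 10))

# Standard deviations are returned in decreasing order
print(fit$sdev)

# Scores are linear combinations of the centered features
scores <- scale(X, center = fit$center, scale = FALSE) %*% fit$rotation
print(all.equal(unname(scores), unname(fit$x)))
```

Because the loading matrix is orthonormal, the scores are simply the centered data rotated into the new coordinate system.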

Use PCA() to perform PCA on an MSImagingExperiment.

pca <- PCA(mse, ncomp=3)
pca
## SpatialPCA on 30 variables and 1024 observations
## names(5): sdev, rotation, center, scale, x
## coord(2): x = 1...32, y = 1...32
## runNames(1): run0
## modelData(): Principal components (k=3)
## 
## Standard deviations (1, .., k=3):
##       PC1      PC2      PC3
##  7.031542 3.516199 1.092932
## 
## Rotation (n x k) = (30 x 3):
##              PC1         PC2         PC3
## [1,] -0.03141217  0.21197865  0.03941824
## [2,] -0.02743754  0.19152844  0.16421233
## [3,] -0.02974002  0.19314984  0.11896429
## [4,] -0.05048566  0.32818833 -0.04828145
## [5,] -0.05499438  0.34063726 -0.22523541
## [6,] -0.06129265  0.39304819 -0.18998119
## ...          ...         ...         ...

The standard deviations show that the first two principal components account for most of the variation captured by the three retained components; the third is much smaller.

image(pca, type="x", superpose=FALSE, layout=c(1,3), scale=TRUE)
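To put numbers on the relative importance of the components, the standard deviations printed in the PCA summary can be squared and normalized. This is a small sketch in base R; the values are copied by hand from the output above, and the resulting shares are relative to the three retained components only, not to the total variance across all 30 features.

```r
# Standard deviations copied from the printed PCA summary
sdev <- c(PC1 = 7.031542, PC2 = 3.516199, PC3 = 1.092932)

# The variance of each component is its squared standard deviation
variance <- sdev^2

# Share of the variance captured by the three retained components
share <- variance / sum(variance)
print(round(share, 3))

# Cumulative share across components
print(round(cumsum(share), 3))
```

Under this normalization, PC1 and PC2 account for roughly 78% and 20% of the retained variance, respectively.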

The loadings of the components show how each feature contributes to each component.

plot(pca, type="rotation", superpose=FALSE, layout=c(1,3), linewidth=2)
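One way to read the loadings numerically is to rank features by the absolute value of their loading on each component. The sketch below does this in base R for the six rows of the rotation matrix that appear in the printed summary; the remaining 24 rows are elided there and are not reproduced here, so the ranking covers only those six features.

```r
# First six rows of the rotation matrix, copied from the printed summary
rotation <- rbind(
    c(-0.03141217, 0.21197865,  0.03941824),
    c(-0.02743754, 0.19152844,  0.16421233),
    c(-0.02974002, 0.19314984,  0.11896429),
    c(-0.05048566, 0.32818833, -0.04828145),
    c(-0.05499438, 0.34063726, -0.22523541),
    c(-0.06129265, 0.39304819, -0.18998119)
)
colnames(rotation) <- c("PC1", "PC2", "PC3")

# For each component, order the feature indices by decreasing
# absolute loading (largest contribution first)
ranking <- apply(abs(rotation), 2, order, decreasing = TRUE)
print(ranking)
```

Among these six features, feature 6 has the largest absolute loading on both PC1 and PC2, while feature 5 has the largest on PC3. The sign of a loading only indicates direction along the component, which is why the ranking uses absolute values.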

Plotting the principal component scores against each other is a useful way of visualizing the separation between data classes.

plot(pca, type="x", groups=mse$design, linewidth=2)