Introduction
Understanding the spatial distribution and interplay of cell states in tissue is critical for elucidating tissue formation and function. Single-cell and spatial omics present a promising approach to addressing this need. Traditional methods typically include the identification of highly variable genes, dimensionality reduction, clustering, and the annotation of cells or functions based on gene over-expression. Nevertheless, these qualitative approaches are inadequate for accurately mapping the distributions of spatial features. To address this, integrating biomedical knowledge such as Gene Ontology, KEGG, Reactome, transcription factors, and cell-type marker genes directly allows for the evaluation of cell states from gene expression data, creating quantitative functional pathway profiles at the single captured location.
After quantifying cell functions, analyzing their spatial distribution and co-distribution with other features can provide deeper insights into related biological questions. We focus on three aspects: the spatial variability of cell functions, regions where these functions cluster, and their co-distribution patterns with other features. Although existing tools such as SPARK-X
(Zhu, Sun, and Zhou 2021), nnSVG
(Weber et al. 2023), SpatialDE
(Svensson, Teichmann, and Stegle 2018), SpaGFT
(Chang et al. 2024), Seurat
(Hao et al. 2023), and Squidpy
(Palla et al. 2022) facilitate the exploration of spatially variable genes, they are primarily designed for gene-level analysis and lack the capability to investigate the spatial co-distribution of features. Additionally, many of these tools, including SpatialDE
(Svensson, Teichmann, and Stegle 2018), SPARK
(Sun, Zhu, and Zhou 2020), MERINGUE
(Miller et al. 2021), and nnSVG
(Weber et al. 2023), face challenges in handling large-scale spatial transcriptome data due to high memory consumption and low computational efficiency.
To fill the gaps, we developed SVP
to accurately predict cell states, explore their spatial distribution, and assess their spatial relationship with other features. SVP is developed based on SingleCellExperiment
class, which can be interoperable with the existing computing ecosystem of Bioconductor
.
Overview of SVP
The evaluation of functional status at individual locations captured by SVP is achieved using Multiple Correspondence Analysis (MCA
) for dimensionality reduction. This process employs a standardized gene expression matrix to project both cells and genes into a unified MCA
space. It has been established that this method allows for the calculation of distances not only between genes and cells but also between cells and genes, thereby facilitating the assessment of their associations(Cortal et al. 2021). Proximity in this space indicates a stronger relationship. These calculated distances are crucial for constructing a weighted k-nearest neighbors (KNN) network, linking each cell or gene to its most relevant counterparts. To discern features with varying levels of proximity, distances are first normalized to a 0-1 scale, with closer distances approaching 1 and farther distances approaching 0. This normalization is followed by division by the total distance among the nearest features, thereby assigning greater connection weights to closer features. Subsequently, databases of known biological knowledge, such as transcription factor target gene sets, Reactome functional gene sets, and cell-type marker gene sets, serve as initial seeds. Random walks on the constructed weighted KNN network yield preliminary functional state activity scores for each location. To mitigate potential biases introduced by dimensionality reduction, a hypergeometric distribution test enhances the enrichment analysis of top-ranking genes extracted directly from the expression matrix at each location. These analyses provide weights for functional activities, culminating in the derivation of functional activity scores at the single captured location (Fig. 1A).
To identify spatially variable cell functions, we first established a cell neighbor weight matrix based on spot locations using the Delaunay triangulation (default) or KNN algorithm. This weight matrix, alongside global autocorrelation analyses such as Moran’s I (default), Geary’s C, or Getis-Ord’s G, facilitated the identification of spatially variable cell functions or gene characteristics. Additionally, we utilized the same cell neighbor weight matrix and local spatial autocorrelation algorithms (Local Getis-Ord or Local Moran) to delineate the local spatial distribution of these variable features (Fig. 1B). To examine spatial co-distribution among cell functions, we designed a bivariate spatial global and local autocorrelation algorithm employing the Lee index. This approach enables the assessment of correlation between different cell characteristics in their spatial distribution (Fig. 1C).
Overview of SVP.
Install
You can use the following commands to install it
#Once Bioconductor 3.21 is released, you can install it as follows.
if (!requireNamespace("BiocManager", quietly=TRUE))
install.packages("BiocManager")
BiocManager::install("SVP")
#Or you can install it via github
if (!requireNamespace("remotes", quietly=TRUE))
install.packages("remotes")
remotes::install_github("YuLab-SMU/SVP")
Quantification of cell states using SVP
SVP can integrate gene or protein lists from existing biomedical databases like cell-type markers (PanglaoDB, CellMarkers), gene ontology (GO), molecular features (MSigDB), and pathways (KEGG, Reactome, Wikipathways), transcription factors, and disease ontology, to assess cell states. Of course, the cell-type markers also can be extracted from reference single cell data. In this vignette, we use the spatial transcriptome and single cell transcriptome of mouse olfactory bulb from the articles(Ståhl et al. 2016; Tepe et al. 2018) to demonstrate. Here, we first perform the normalization by logNormCounts
of scuttle
and the MCA
dimensionality reduction by runMCA
of SVP
.
# load the packages required in the vignette
library(SpatialExperiment)
library(SingleCellExperiment)
library(scuttle)
library(scran)
library(SVP)
library(ggsc)
library(ggplot2)
library(magrittr)
library(STexampleData)
# load the spatial transcriptome, it is a SpatialExperiment object
# to build the object, you can refer to the SpatialExperiment package
mob_spe <- ST_mouseOB()
#> see ?STexampleData and browseVignettes('STexampleData') for documentation
#> loading from cache
# the genes that did not express in any captured location were removed.
# we use logNormCounts of scuttle to normalize the spatial transcriptome
mob_spe <- logNormCounts(mob_spe)
# Now the normalized counts was added to the assays of mob_spe as `logcounts`
# it can be extracted via logcounts(mob_spe) or assay(mob_spe, 'logcounts')
# Next, we use `runMCA` of `SVP` to preform the MCA dimensionality reduction
mob_spe <- runMCA(mob_spe, assay.type = 'logcounts')
#> Computing Fuzzy Matrix
#> Computing SVD
#> Computing Coordinates
# The MCA result was added to the reducedDims, it can be extracted by reducedDim(mob_spe, 'MCA')
# more information can refer to SingleCellExperiment package
mob_spe
#> class: SpatialExperiment
#> dim: 15928 262
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(15928): 0610007N19Rik 0610007P14Rik ... Zzef1 Zzz3
#> rowData names(1): gene_name
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#> TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(4): barcode_id sample_id layer sizeFactor
#> reducedDimNames(1): MCA
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(1): sample_id
Quantification of BIOCARTA pathway or other function
SVP accepts SingleCellExperiment
or SpatialExperiment
object as input. The gene set should be a named list object, and the elements of the list must be included in the row names of the SingleCellExperiment
or SpatialExperiment
object. To quantifying the Biocarta
pathway or other function for each captured location, the command runSGSA
is used.
#Note: the gset.idx.list supports the named gene list, gson object or
# locate gmt file or html url of gmt.
mob_spe <- runSGSA(mob_spe,
gset.idx.list = "https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2024.1.Mm/m2.cp.biocarta.v2024.1.Mm.symbols.gmt", # The target gene set gmt file
assay.type = 'logcounts', # The name of assays of gene profiler
gsvaExp.name = 'Biocarta' # The name of result to save to gsvaExps slot
)
#> Building the nearest neighbor graph with the distance between features and
#> cells ...
#> elapsed time is 9.219000 seconds
#> Building the seed matrix using the gene set and the nearest neighbor graph for
#> random walk with restart ...
#> elapsed time is 0.085000 seconds
#> Calculating the affinity score using random walk with restart ...
#> elapsed time is 0.806000 seconds
#> Tidying the result of random walk with restart ...
#> elapsed time is 0.008000 seconds
#> ORA analysis ...
#> elapsed time is 0.023000 seconds
# Then the result was added to gsvaExps to return a SVPExperiment object, the result
# can be extracted with gsvaExp, you can view more information via help(gsvaExp).
mob_spe
#> class: SVPExperiment
#> dim: 15928 262
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(15928): 0610007N19Rik 0610007P14Rik ... Zzef1 Zzz3
#> rowData names(1): gene_name
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#> TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(4): barcode_id sample_id layer sizeFactor
#> reducedDimNames(1): MCA
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):
#> gsvaExps names(1) : Biocarta
gsvaExpNames(mob_spe)
#> [1] "Biocarta"
# The result is also a SingleCellExperiment or SpatialExperiment.
gsvaExp(mob_spe, 'Biocarta')
#> class: SpatialExperiment
#> dim: 252 262
#> metadata(0):
#> assays(1): affi.score
#> rownames(252): BIOCARTA_41BB_PATHWAY BIOCARTA_ACE2_PATHWAY ...
#> BIOCARTA_WNT_LRP6_PATHWAY BIOCARTA_WNT_PATHWAY
#> rowData names(4): exp.gene.num gset.gene.num gene.occurrence.rate
#> geneSets
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#> TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(4): barcode_id layer sizeFactor sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):
# identify the spatially variable Biocarta pathway.
Biocarta.svdf <- gsvaExp(mob_spe, "Biocarta") |>
runDetectSVG(
assay.type = 1,
action = 'only'
)
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 0.044000 seconds
Biocarta.svdf |> head()
#> obs expect.moransi sd.moransi Z.moransi
#> BIOCARTA_41BB_PATHWAY 0.05302297 -0.003831418 0.03016130 1.8850112
#> BIOCARTA_ACE2_PATHWAY 0.09933677 -0.003831418 0.03247755 3.1766002
#> BIOCARTA_ACH_PATHWAY 0.04625366 -0.003831418 0.03142826 1.5936320
#> BIOCARTA_ACTINY_PATHWAY 0.12301548 -0.003831418 0.03276267 3.8716900
#> BIOCARTA_AGPCR_PATHWAY 0.01386184 -0.003831418 0.03520808 0.5025341
#> BIOCARTA_AGR_PATHWAY 0.04678783 -0.003831418 0.03553799 1.4243700
#> pvalue padj rank
#> BIOCARTA_41BB_PATHWAY 2.971417e-02 0.061290029 115
#> BIOCARTA_ACE2_PATHWAY 7.450615e-04 0.002884109 62
#> BIOCARTA_ACH_PATHWAY 5.550924e-02 0.100925887 132
#> BIOCARTA_ACTINY_PATHWAY 5.404167e-05 0.000324250 40
#> BIOCARTA_AGPCR_PATHWAY 3.076459e-01 0.426792051 173
#> BIOCARTA_AGR_PATHWAY 7.716969e-02 0.133242632 139
# We can obtain significant spatially variable pathways,
# which are sorted according to the Moran index.
# More details see also the Spatial statistical analysis session
Biocarta.svdf |> dplyr::filter(padj<=0.01) |> dplyr::arrange(rank) |> head()
#> obs expect.moransi sd.moransi Z.moransi
#> BIOCARTA_CACAM_PATHWAY 0.3623497 -0.003831418 0.03565189 10.271014
#> BIOCARTA_CELLCYCLE_PATHWAY 0.2831358 -0.003831418 0.03437045 8.349240
#> BIOCARTA_P53_PATHWAY 0.2480688 -0.003831418 0.03270444 7.702325
#> BIOCARTA_GSK3_PATHWAY 0.2476114 -0.003831418 0.03568856 7.045475
#> BIOCARTA_CALCINEURIN_PATHWAY 0.2443118 -0.003831418 0.03577427 6.936362
#> BIOCARTA_PTDINS_PATHWAY 0.2299768 -0.003831418 0.03574460 6.541078
#> pvalue padj rank
#> BIOCARTA_CACAM_PATHWAY 4.759931e-25 1.142383e-22 1
#> BIOCARTA_CELLCYCLE_PATHWAY 3.435170e-17 4.122204e-15 2
#> BIOCARTA_P53_PATHWAY 6.680625e-15 5.344500e-13 3
#> BIOCARTA_GSK3_PATHWAY 9.241475e-13 5.544885e-11 4
#> BIOCARTA_CALCINEURIN_PATHWAY 2.011633e-12 9.655839e-11 5
#> BIOCARTA_PTDINS_PATHWAY 3.053843e-11 1.221537e-09 6
ids.sv.biocarta <- Biocarta.svdf |> dplyr::filter(padj<=0.01) |> dplyr::arrange(rank) |> rownames()
# We use sc_spatial of ggsc to visualize the distribution of some spatially variable functions
gsvaExp(mob_spe, 'Biocarta') %>%
sc_spatial(
features = ids.sv.biocarta[seq(12)],
mapping = aes(x = x, y = y),
geom = geom_bgpoint,
pointsize = 3.5,
ncol = 4,
common.legend = FALSE # The output is a patchwork
) &
scale_color_viridis_c(option='B', guide='none')
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
The heatmap of spatially variable biocarta pathway
Quantification of cell-type
To quantifying the cell-type for each captured location, the gene set should be the marker gene sets of cell-type. It can be the pre-established marker gene sets, such as the markers from CellMarkers
or PanglaoDB
. High-quality cell marker genes from past reports or your own compilations are acceptable. In addition, It can also be extracted from the reference single cell data. The following sections will detail the specific operations of each approach.
Quantification of cell-type using the pre-established marker gene sets
We can use the pre-established marker gene sets to quantifying the cell-type for each captured location. Here, we used the markers obtained the differential expression genes from the single cell transcriptome of the article (Tepe et al. 2018).
# The marker gene sets
# We can view the marker gene number of each cell-type.
data(mob_marker_genes)
lapply(mob_marker_genes, length) |> unlist()
#> GC PGC M/TC OSNs EPL-IN
#> 14 13 13 34 12
mob_spe <- runSGSA(
mob_spe,
gset.idx.list = mob_marker_genes,
gsvaExp.name = 'CellTypeFree',
assay.type = 'logcounts'
)
#> Building the nearest neighbor graph with the distance between features and
#> cells ...
#> elapsed time is 9.067000 seconds
#> Building the seed matrix using the gene set and the nearest neighbor graph for
#> random walk with restart ...
#> elapsed time is 0.011000 seconds
#> Calculating the affinity score using random walk with restart ...
#> elapsed time is 0.147000 seconds
#> Tidying the result of random walk with restart ...
#> elapsed time is 0.009000 seconds
#> ORA analysis ...
#> elapsed time is 0.009000 seconds
# Then we can visualize the cell-type activity via sc_spatial of SVP with pie plot
gsvaExp(mob_spe, "CellTypeFree") |>
sc_spatial(
features = sort(names(mob_marker_genes)),
mapping = aes(x = x, y = y),
plot.pie = TRUE,
color = NA,
pie.radius.scale = 1.1
) +
scale_fill_brewer(palette='Set2')
#> Scale for fill is already present.
#> Adding another scale for fill, which will replace the existing scale.
The pie plot of cell-type activity
We cal also display each cell-type activity with heat-map according to the spatial coordinate.
# calculated the of cell-type
assay(gsvaExp(mob_spe, 2)) |> as.matrix() |>
prop.table(2) |>
Matrix::Matrix(sparse=TRUE) ->
assay(gsvaExp(mob_spe,2), "prop")
mob_spe |> gsvaExp(2) |>
sc_spatial(
features = sort(names(mob_marker_genes)),
mapping = aes(x = x, y = y),
geom = geom_bgpoint,
pointsize = 2,
ncol = 5,
slot = 2
) +
scale_color_viridis_c()
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
the heatmap of each cell-type activity
Quantification of cell-type using the reference single-cell data
This approach requires obtaining the cell-type marker genes from the reference single-cell data. First, the reference single cell data should be normalized. Here, the command of logNormCounts
of scuttle
is used. Next, the MCA
dimensionality reduction is performed by runMCA
of SVP
. Finally, the marker gene sets of cell-type are extracted via runDetectMarker
of SVP
.
# load the reference single cell transcriptome
# it is a SingleCellExperiment
data(mob_sce)
mob_sce <- logNormCounts(mob_sce)
# You can also first obtain the high variable genes, then
# perform the MCA reduction.
# set.seed(123)
# stats <- modelGeneVar(mob_sce)
# hvgs <- getTopHVGs(stats)
# perform the MCA reduction.
mob_sce <- mob_sce |> runMCA(assay.type = 'logcounts')
#> Computing Fuzzy Matrix
#> Computing SVD
#> Computing Coordinates
# obtain the marker gene sets from reference single-cell transcriptome based on MCA space
refmakergene <- runDetectMarker(mob_sce, group.by='cellType',
ntop=200, present.prop.in.group=.2, present.prop.in.sample=.5
)
refmakergene |> lapply(length) |> unlist()
#> GC PGC M/TC OSNs EPL-IN
#> 15 14 8 36 11
Then, the quantification of the cell-type gene signatures can also be performed via runSGSA
with the marker gene sets.
# use the maker gene from the reference single-cell transcriptome
mob_spe <- runSGSA(
mob_spe,
gset.idx.list = refmakergene,
gsvaExp.name = 'CellTypeRef'
)
#> Building the nearest neighbor graph with the distance between features and
#> cells ...
#> elapsed time is 8.996000 seconds
#> Building the seed matrix using the gene set and the nearest neighbor graph for
#> random walk with restart ...
#> elapsed time is 0.011000 seconds
#> Calculating the affinity score using random walk with restart ...
#> elapsed time is 0.154000 seconds
#> Tidying the result of random walk with restart ...
#> elapsed time is 0.008000 seconds
#> ORA analysis ...
#> elapsed time is 0.009000 seconds
mob_spe |>
gsvaExp('CellTypeRef') |>
sc_spatial(
features = sort(names(refmakergene)),
mapping = aes(x = x, y = y),
plot.pie = TRUE,
color = NA,
pie.radius.scale = 1.1
) +
scale_fill_brewer(palette = 'Set2')
#> Scale for fill is already present.
#> Adding another scale for fill, which will replace the existing scale.
The pie plot of cell-type activity with the marker gene sets from reference single-cell transcriptome
We can found that the result based on marker gene sets from reference single-cell transcriptome is similar with the result based on marker gene sets from previous reports.
We can also obtain the major cell-type for each captured location using pred.cell.signature
.
# the result will be added to the colData of gsvaExp(mob_spe, "CellTypeRef")
mob_spe <- mob_spe |> pred.cell.signature(gsvaexp='CellTypeFree', pred.col.name='pred')
#> The `gsvaexp` = CellTypeFree was specified, the specified gsva Experiment will
#> be used to predict the cell signature.
# Then we can use sc_spatial to visualize it.
mob_spe |>
gsvaExp("CellTypeFree") |>
sc_spatial(
mapping = aes(x = x, y = y, color = pred),
geom = geom_bgpoint,
pointsize = 4
) +
scale_color_brewer(palette = 'Set2')
The major cell type for each spot
Spatial statistical analysis
SVP implements several spatial statistical algorithms to explore the spatial distribution of cell function or other features from three aspects: whether cell function states have spatial variability, where spatially variable cell functions cluster, and the co-distribution patterns of spatially variable cell functions with other features. The following sections provide a detailed description of each aspect.
Univariate spatial statistical analysis
This section addresses whether the feature is spatially variable and identifies its high-cluster areas. In detail, to identify the spatial variable features, SVP
implements three spatial autocorrelation algorithms using Rcpp
and RcppParallel
: global Moran’s I, Geary’s C and Getis-Ord’s G. To identify the spatial high-cluster areas, SVP
implements two local univariate spatial statistical algorithms using Rcpp
and RcppParallel
: local Getis-Ord’s G and local Moran’s I.
Identification of spatial variable features
Here, we identify the spatial variable cell-types using runDetectSVG
, which implements the global Moran’s I, Geary’s C and Getis-Ord’s G algorithms.
# First approach:
# we can first obtain the target gsvaExp with gsvaExp names via
# gsvaExp() function, for example, CellTypeFree
mob_spe |> gsvaExp("CellTypeFree")
#> class: SpatialExperiment
#> dim: 5 262
#> metadata(0):
#> assays(2): affi.score prop
#> rownames(5): GC PGC M/TC OSNs EPL-IN
#> rowData names(4): exp.gene.num gset.gene.num gene.occurrence.rate
#> geneSets
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#> TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(5): barcode_id layer sizeFactor sample_id pred
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):
# Then we use runDetectSVG with method = 'moran' to identify the spatial variable
# cell-types with global Moran's I.
celltype_free.I <- mob_spe |> gsvaExp("CellTypeFree") |>
runDetectSVG(
assay.type = "prop", # because default is 'logcounts', it should be adjused
method = 'moransi',
action = 'only'
)
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 0.095000 seconds
celltype_free.I |> dplyr::arrange(rank) |> head()
#> obs expect.moransi sd.moransi Z.moransi pvalue padj
#> GC 0.5590105 -0.003831418 0.03599358 15.637287 2.028236e-55 1.014118e-54
#> OSNs 0.3822118 -0.003831418 0.03591675 10.748281 3.018830e-27 7.547076e-27
#> M/TC 0.2944360 -0.003831418 0.03596283 8.293769 5.485780e-17 9.142967e-17
#> PGC 0.2809947 -0.003831418 0.03578380 7.959638 8.627161e-16 1.078395e-15
#> EPL-IN 0.0574522 -0.003831418 0.03554512 1.724108 4.234422e-02 4.234422e-02
#> rank
#> GC 1
#> OSNs 2
#> M/TC 3
#> PGC 4
#> EPL-IN 5
# Second approach:
# the result was added to the original object by specific the
# gsvaexp argument in the runDetectSVG
mob_spe <- mob_spe |>
runDetectSVG(
gsvaexp = 'CellTypeFree',
method = 'moran',
gsvaexp.assay.type = 'prop'
)
#> The `gsvaexp` was specified, the specified `gsvaExp` will be used to detect
#> 'svg'.
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 0.030000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.moransi' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.
gsvaExp(mob_spe, 'CellTypeFree') |> svDf("sv.moransi") |> dplyr::arrange(rank) |> dplyr::filter(padj<=0.01)
#> obs expect.moransi sd.moransi Z.moransi pvalue padj
#> GC 0.5590105 -0.003831418 0.03599358 15.637287 2.028236e-55 1.014118e-54
#> OSNs 0.3822118 -0.003831418 0.03591675 10.748281 3.018830e-27 7.547076e-27
#> M/TC 0.2944360 -0.003831418 0.03596283 8.293769 5.485780e-17 9.142967e-17
#> PGC 0.2809947 -0.003831418 0.03578380 7.959638 8.627161e-16 1.078395e-15
#> rank
#> GC 1
#> OSNs 2
#> M/TC 3
#> PGC 4
The obs
is the Moran’s I, expect.moransi
is the expect Moran’s I (E(I)
). Moran’s I quantifies the correlation between a single feature at a specific spatial point or region and its neighboring points or regions. Global Moran’s I usually ranges from -1 to 1, Global Moran’s I significantly above E(I)
indicate a distinct spatial pattern or spatial clustering, which occurs when neighboring captured locations tend to have similar feature value. Global Moran’s I around E(I)
suggest random spatial expression, that is, absence of spatial pattern. Global Moran’s I significantly below E(I)
imply a chessboard-like pattern or dispersion.
# using Geary's C
mob_spe <- mob_spe |>
runDetectSVG(
gsvaexp = 'CellTypeFree',
gsvaexp.assay.type = 'prop',
method = 'geary'
)
#> The `gsvaexp` was specified, the specified `gsvaExp` will be used to detect
#> 'svg'.
#> Identifying the spatially variable gene sets (pathway) based on gearysc
#> elapsed time is 0.029000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.gearysc' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.
gsvaExp(mob_spe, "CellTypeFree") |> svDf('sv.gearysc') |> dplyr::arrange(rank) |> dplyr::filter(padj<=0.01)
#> obs expect.gearysc sd.gearysc Z.gearysc pvalue padj
#> GC 0.4405675 1 0.03640685 15.366133 1.380832e-53 6.904162e-53
#> OSNs 0.6132465 1 0.03690356 10.480112 5.330845e-26 1.332711e-25
#> M/TC 0.7059801 1 0.03660655 8.031894 4.798975e-16 7.998292e-16
#> PGC 0.7112366 1 0.03774514 7.650346 1.002197e-14 1.252746e-14
#> rank
#> GC 1
#> OSNs 2
#> M/TC 3
#> PGC 4
The obs
is the Geary’s C. Global Geary’s C is another popular index of global spatial autocorrelation, but focuses more on differences between neighboring values. Global Geary’s C usually range from 0 to 2. Global Geary’s C significantly below E[C]=1
indicate a distinct spatial pattern or spatial clustering, the value significantly above E[C]=1
suggest a chessboard-like pattern or dispersion, the value around E[C]=1
imply absence of spatial pattern.
# using Getis-ord's G
mob_spe <- mob_spe |>
runDetectSVG(
gsvaexp = 'CellTypeFree',
gsvaexp.assay.type = 'prop',
method = 'getis'
)
#> The `gsvaexp` was specified, the specified `gsvaExp` will be used to detect
#> 'svg'.
#> Identifying the spatially variable gene sets (pathway) based on getisord
#> elapsed time is 0.029000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.getisord' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.
gsvaExp(mob_spe, "CellTypeFree") |> svDf('sv.getisord') |> dplyr::arrange(rank) |> dplyr::filter(padj<=0.01)
#> obs expect.G sd.G Z.G pvalue padj
#> GC 0.006508037 0.003831418 0.0001767082 15.147114 3.958388e-52 1.979194e-51
#> OSNs 0.004943604 0.003831418 0.0001123195 9.901992 2.040334e-23 5.100835e-23
#> M/TC 0.005215980 0.003831418 0.0001745559 7.931912 1.078986e-15 1.798310e-15
#> PGC 0.005363887 0.003831418 0.0002049821 7.476113 3.827644e-14 4.784556e-14
#> rank
#> GC 1
#> OSNs 2
#> M/TC 3
#> PGC 4
The obs
is the Getis-Ord’s G of each cell-type. Global Getis-Ord’s G measures global spatial autocorrelation by evaluating the clustering of high/low value, specifically identifying hot spots
or cold spots
. Global Getis-Ord’s G values have no fixed range, as they depend on the specific dataset and spatial weight matrix. Global Getis-Ord’s G significantly above E(G)
indicate a distinct spatial pattern or spatial clustering, which occurs when neighboring captured locations tend to have similar feature value. Global Getis-Ord’s G around E(G)
suggest random spatial expression, that is, absence of spatial pattern. Global Getis-Ord’s G significantly below E(G)
imply a chessboard-like pattern or dispersion.
So the GC
, OSNs
, M/TC
, and PGC
were significantly concentrated in the spatial distribution compare to EPL-IN
.
Identification of local spatial aggregation areas
Next, we can identify the spatial high-cluster areas for the spatial variable features via runLISA
, which provides two local spatial statistics algorithms: Local Getis-Ord’s G and Local Moran’s I
mob_spe_cellfree <- mob_spe |> gsvaExp("CellTypeFree")
mob_spe_cellfree <- mob_spe_cellfree |>
runPCA(assay.type='prop') |> runTSNE(dimred='PCA')
#> Warning in check_numbers(k = k, nu = nu, nv = nv, limit = min(dim(x)) - : more
#> singular values/vectors requested than available
#> Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth =
#> TRUE, : You're computing too large a percentage of total singular values, use a
#> standard svd instead.
# Here, we use the `TSNE` space to identify the location clustering regions
celltypefree_lisares <- mob_spe_cellfree |>
runLISA(
features = c("GC", "OSNs", "M/TC", "PGC"),
assay.type = 'prop',
reduction.used = 'TSNE',
weight.method = 'knn',
k = 8
)
celltypefree_lisares$GC |> head()
#> Gi E.Gi Var.Gi Z.Gi
#> ACAACTATGGGTTGGCGG 0.0004568125 0.003831418 2.163394e-06 -2.294326
#> ACACAGATCCTGTTCTGA 0.0015580172 0.003831418 2.169044e-06 -1.543625
#> ACATCACCTGCGCGCTCT 0.0101642046 0.003831418 2.193616e-06 4.275773
#> ACATTTAAGGCGCATGAT 0.0103808033 0.003831418 2.193136e-06 4.422499
#> ACCACTGTAATCTCCCAT 0.0004616075 0.003831418 2.167719e-06 -2.288780
#> ACCAGAGCCGTTGAGCAA 0.0016442001 0.003831418 2.172145e-06 -1.484047
#> Pr (z != E(Gi)) cluster.no.test cluster.test
#> ACAACTATGGGTTGGCGG 2.177176e-02 Low Low
#> ACACAGATCCTGTTCTGA 1.226792e-01 Low NoSign
#> ACATCACCTGCGCGCTCT 1.904754e-05 High High
#> ACATTTAAGGCGCATGAT 9.756574e-06 High High
#> ACCACTGTAATCTCCCAT 2.209216e-02 Low Low
#> ACCAGAGCCGTTGAGCAA 1.377966e-01 Low NoSign
Local Getis-Ord’s G (Gi) measures the aggregation of a feature value in a specific spot and its neighbors, determining if locations are significantly clustered with high (hot spots) or low values (cold spots). Gi usually have no fixed range, a larger Gi indicates a stronger clustering of high expression for the feature in captured location and its neighboring spots, while a smaller Gi indicates a stronger clustering of low expression.
gsvaExp(mob_spe, 'CellTypeFree') |>
plot_lisa_feature(
lisa.res = celltypefree_lisares,
assay.type = 2,
pointsize = 2,
gap_line_width = .18,
hlpointsize= 1.8
)
The result of local Getis-Ord for cell-type
The highlighted area signifies that the feature is significantly clustered in this region.
Bivariate spatial statistical analysis
Next, to explore the bivariate spatial co-distribution, SVP
implements global and local bivariate spatial statistics algorithms. Global bivariate spatial statistics algorithm can be used to identify whether the bivariate is co-cluster in the specific space. Local bivariate spatial algorithm is to find the areas of spatial co-clustering of bivariate.
Global bivariate spatial analysis
Here, we use runGLOBALBV
of SVP
to explore the spatial co-distribution between the cell-types.
gbv.res <- mob_spe |>
gsvaExp('CellTypeFree') |>
runGLOBALBV(
features1 = c("GC", "OSNs", "M/TC", "PGC"),
assay.type = 2,
permutation = NULL, # if permutation is NULL, mantel test will be used to calculated the pvalue
add.pvalue = TRUE,
alternative = 'greater' # one-side test
)
# gbv.res is a list contained Lee and pvalue matrix.
gbv.res |> as_tbl_df(diag=FALSE, flag.clust = TRUE) |>
dplyr::arrange(desc(Lee))
#> # A tibble: 6 × 4
#> x y Lee pvalue
#> <fct> <fct> <dbl> <dbl>
#> 1 OSNs PGC 0.177 1.40e-15
#> 2 M/TC PGC 0.0304 1.57e- 3
#> 3 M/TC OSNs -0.107 9.76e- 1
#> 4 GC M/TC -0.202 1.00e+ 0
#> 5 GC OSNs -0.246 1 e+ 0
#> 6 GC PGC -0.306 1 e+ 0
The Lee
value typically ranges from -1 to 1. When the index is closer to 1, it indicates a strong positive spatial association between the two variables, meaning the high-value areas of one variable tend to overlap with the high-value areas of the other, and low-value areas similarly overlap, showing strong spatial consistency. Conversely, if the index is near -1, it suggests a strong negative spatial association, where the high-value areas of one variable tend to overlap with the low-value areas of the other.
We developed a function plot_heatmap_globalbv
to visualize the result of global bivariate spatial analysis
# We can also display the result of global univariate spatial analysis
moran.res <- gsvaExp(mob_spe, 'CellTypeFree') |> svDf("sv.moransi")
# The result of local univariate spatial analysis also can be added
lisa.f1.res <- gsvaExp(mob_spe, "CellTypeFree", withColData = TRUE) |>
cal_lisa_f1(lisa.res = celltypefree_lisares, group.by = 'layer', rm.group.nm=NA)
plot_heatmap_globalbv(gbv.res, moran.t=moran.res, lisa.t=lisa.f1.res)
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_col()`).
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_text()`).
The results of clustering analysis of spatial distribution between cell-type
We can explore the spatial co-cluster between the different type of gsvaExp
. For example, we explore the spatial co-distribution between the CellTypeFree
and Biocarta
of mob_spe
object.
# If the number of features is excessive, we recommend selecting
# those with spatial variability.
celltypeid <- gsvaExp(mob_spe, "CellTypeFree") |> svDf("sv.moransi") |>
dplyr::filter(padj<=0.01) |> rownames()
biocartaid <- gsvaExp(mob_spe, "Biocarta") |>
runDetectSVG(assay.type = 1, verbose=FALSE) |>
svDf() |>
dplyr::filter(padj <= 0.01) |>
rownames()
#> elapsed time is 0.025000 seconds
runGLOBALBV(
mob_spe,
gsvaexp = c("CellTypeFree", "Biocarta"),
gsvaexp.features = c(celltypeid, biocartaid[seq(20)]),
gsvaexp.assay.type = c(2, 1),
permutation = NULL,
add.pvalue = TRUE
) -> res.gbv.celltype.biocarta
#> The `gsvaexp` is specified, the specified `gsvaExp` will be
#> used to perform the analysis.
plot_heatmap_globalbv(res.gbv.celltype.biocarta)
The heatmap of spatial correlation between cell tyep and Biocarta states
Or we can also explore the co-location between the spatially variable genes and cell-types.
svgid <- mob_spe |> runDetectSVG(assay.type = 'logcounts') |> svDf() |>
dplyr::filter(padj <= 0.01) |> dplyr::arrange(rank) |> rownames()
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 1.655000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.moransi' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.
res.gbv.genes.celltype <- runGLOBALBV(mob_spe,
features1 = svgid[seq(40)],
gsvaexp = 'CellTypeFree',
gsvaexp.features = celltypeid,
gsvaexp.assay.type = 2,
permutation = NULL,
add.pvalue = TRUE
)
#> The `gsvaexp` is specified, the specified `gsvaExp` will be
#> used to perform the analysis.
plot_heatmap_globalbv(res.gbv.genes.celltype)
The heatmap of spatial correlation between cell tyep and spatially variable genes
Local bivariate spatial analysis
After global bivariate spatial analysis, we found some genes (such as Psd
, Nrgn
, Kcnh3
, Gpsm1
, Pcp4
) and GC
show significant spatial co-aggregation. Next, we use runLOCALBV
of SVP
to identify the co-aggregation areas.
lbv.res <- mob_spe |>
runLOCALBV(features1=c("Psd", "Nrgn", "Kcnh3", "Gpsm1", "Pcp4"),
assay.type = 'logcounts',
gsvaexp = 'CellTypeFree',
gsvaexp.features=c('GC'),
gsvaexp.assay.type = 'prop',
weight.method='knn',
k = 8
)
#> The `gsvaexp` is specified, the specified `gsvaExp` will be
#> used to perform the analysis with `features` displayed from `rownames(data)`.
# The result can be added to gsvaExp, then visualized by ggsc.
LISAsce(mob_spe, lisa.res=lbv.res, gsvaexp.name='LBV.res') |>
gsvaExp("LBV.res") %>%
plot_lisa_feature(lisa.res = lbv.res, assay.type = 1)
The heatmap of Local bivariate spatial analysis
The highlighted area denotes the region where the two variables significantly co-aggregate in space.
Session information
Here is the output of sessionInfo()
on the system on which this document was compiled:
#> R Under development (unstable) (2025-03-13 r87965)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.21-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats4 stats graphics grDevices utils datasets methods
#> [8] base
#>
#> other attached packages:
#> [1] scater_1.35.4 STexampleData_1.15.2
#> [3] ExperimentHub_2.15.0 AnnotationHub_3.15.0
#> [5] BiocFileCache_2.15.1 dbplyr_2.5.0
#> [7] scatterpie_0.2.4 magrittr_2.0.3
#> [9] ggplot2_3.5.1 SVP_0.99.4
#> [11] scran_1.35.0 scuttle_1.17.0
#> [13] ggsc_1.5.0 SpatialExperiment_1.17.0
#> [15] SingleCellExperiment_1.29.2 SummarizedExperiment_1.37.0
#> [17] Biobase_2.67.0 GenomicRanges_1.59.1
#> [19] GenomeInfoDb_1.43.4 IRanges_2.41.3
#> [21] S4Vectors_0.45.4 BiocGenerics_0.53.6
#> [23] generics_0.1.3 MatrixGenerics_1.19.1
#> [25] matrixStats_1.5.0
#>
#> loaded via a namespace (and not attached):
#> [1] fs_1.6.5 spatstat.sparse_3.1-0
#> [3] httr_1.4.7 RColorBrewer_1.1-3
#> [5] tools_4.6.0 sctransform_0.4.1
#> [7] utf8_1.2.4 R6_2.6.1
#> [9] lazyeval_0.2.2 uwot_0.2.3
#> [11] withr_3.0.2 sp_2.2-0
#> [13] gridExtra_2.3 progressr_0.15.1
#> [15] cli_3.6.4 spatstat.explore_3.3-4
#> [17] fastDummies_1.7.5 labeling_0.4.3
#> [19] sass_0.4.9 Seurat_5.2.1
#> [21] spatstat.data_3.1-6 ggridges_0.5.6
#> [23] pbapply_1.7-2 yulab.utils_0.2.0
#> [25] parallelly_1.42.0 limma_3.63.9
#> [27] RSQLite_2.3.9 gridGraphics_0.5-1
#> [29] ica_1.0-3 spatstat.random_3.3-2
#> [31] dplyr_1.1.4 Matrix_1.7-3
#> [33] ggbeeswarm_0.7.2 abind_1.4-8
#> [35] lifecycle_1.0.4 yaml_2.3.10
#> [37] edgeR_4.5.9 SparseArray_1.7.6
#> [39] Rtsne_0.17 grid_4.6.0
#> [41] blob_1.2.4 promises_1.3.2
#> [43] dqrng_0.4.1 crayon_1.5.3
#> [45] miniUI_0.1.1.1 lattice_0.22-6
#> [47] beachmat_2.23.7 cowplot_1.1.3
#> [49] KEGGREST_1.47.0 magick_2.8.5
#> [51] pillar_1.10.1 knitr_1.50
#> [53] metapod_1.15.0 rjson_0.2.23
#> [55] future.apply_1.11.3 codetools_0.2-20
#> [57] fastmatch_1.1-6 glue_1.8.0
#> [59] ggfun_0.1.8 spatstat.univar_3.1-2
#> [61] data.table_1.17.0 vctrs_0.6.5
#> [63] png_0.1-8 treeio_1.31.0
#> [65] spam_2.11-1 gtable_0.3.6
#> [67] cachem_1.1.0 xfun_0.51
#> [69] S4Arrays_1.7.3 mime_0.12
#> [71] pracma_2.4.4 survival_3.8-3
#> [73] statmod_1.5.0 bluster_1.17.0
#> [75] fitdistrplus_1.2-2 ROCR_1.0-11
#> [77] nlme_3.1-167 ggtree_3.15.0
#> [79] bit64_4.6.0-1 filelock_1.0.3
#> [81] RcppAnnoy_0.0.22 bslib_0.9.0
#> [83] irlba_2.3.5.1 vipor_0.4.7
#> [85] KernSmooth_2.23-26 colorspace_2.1-1
#> [87] DBI_1.2.3 tidyselect_1.2.1
#> [89] bit_4.6.0 compiler_4.6.0
#> [91] curl_6.2.1 BiocNeighbors_2.1.3
#> [93] DelayedArray_0.33.6 plotly_4.10.4
#> [95] scales_1.3.0 lmtest_0.9-40
#> [97] rappdirs_0.3.3 stringr_1.5.1
#> [99] digest_0.6.37 goftest_1.2-3
#> [101] spatstat.utils_3.1-3 rmarkdown_2.29
#> [103] XVector_0.47.2 htmltools_0.5.8.1
#> [105] pkgconfig_2.0.3 sparseMatrixStats_1.19.0
#> [107] fastmap_1.2.0 rlang_1.1.5
#> [109] htmlwidgets_1.6.4 UCSC.utils_1.3.1
#> [111] shiny_1.10.0 prettydoc_0.4.1
#> [113] DelayedMatrixStats_1.29.1 farver_2.1.2
#> [115] jquerylib_0.1.4 zoo_1.8-13
#> [117] jsonlite_1.9.1 BiocParallel_1.41.2
#> [119] BiocSingular_1.23.0 GenomeInfoDbData_1.2.14
#> [121] ggplotify_0.1.2 dotCall64_1.2
#> [123] patchwork_1.3.0 munsell_0.5.1
#> [125] Rcpp_1.0.14 viridis_0.6.5
#> [127] ape_5.8-1 reticulate_1.41.0.1
#> [129] stringi_1.8.4 ggstar_1.0.4
#> [131] MASS_7.3-65 plyr_1.8.9
#> [133] parallel_4.6.0 listenv_0.9.1
#> [135] ggrepel_0.9.6 deldir_2.0-4
#> [137] Biostrings_2.75.4 splines_4.6.0
#> [139] tensor_1.5 locfit_1.5-9.12
#> [141] igraph_2.1.4 spatstat.geom_3.3-5
#> [143] RcppHNSW_0.6.0 reshape2_1.4.4
#> [145] ScaledMatrix_1.15.0 BiocVersion_3.21.1
#> [147] evaluate_1.0.3 SeuratObject_5.0.2
#> [149] RcppParallel_5.1.10 BiocManager_1.30.25
#> [151] tweenr_2.0.3 httpuv_1.6.15
#> [153] RANN_2.6.2 tidyr_1.3.1
#> [155] purrr_1.0.4 polyclip_1.10-7
#> [157] future_1.34.0 scattermore_1.2
#> [159] ggforce_0.4.2 rsvd_1.0.5
#> [161] xtable_1.8-4 RSpectra_0.16-2
#> [163] tidytree_0.4.6 tidydr_0.0.5
#> [165] later_1.4.1 viridisLite_0.4.2
#> [167] tibble_3.2.1 aplot_0.2.5
#> [169] beeswarm_0.4.0 memoise_2.0.1
#> [171] AnnotationDbi_1.69.0 cluster_2.1.8.1
#> [173] globals_0.16.3
References
Chang, Yuzhou, Jixin Liu, Yi Jiang, Anjun Ma, Yao Yu Yeo, Qi Guo, Megan McNutt, et al. 2024. “Graph Fourier Transform for Spatial Omics Representation and Analyses of Complex Organs.” Nature Communications 15 (1): 7467.
Cortal, Akira, Loredana Martignetti, Emmanuelle Six, and Antonio Rausell. 2021. “Gene Signature Extraction and Cell Identity Recognition at the Single-Cell Level with Cell-Id.” Nature Biotechnology 39 (9): 1095–1102.
Hao, Yuhan, Tim Stuart, Madeline H Kowalski, Saket Choudhary, Paul Hoffman, Austin Hartman, Avi Srivastava, et al. 2023. “Dictionary Learning for Integrative, Multimodal and Scalable Single-Cell Analysis.” Nature Biotechnology. https://doi.org/10.1038/s41587-023-01767-y.
Miller, Brendan F, Dhananjay Bambah-Mukku, Catherine Dulac, Xiaowei Zhuang, and Jean Fan. 2021. “Characterizing Spatial Gene Expression Heterogeneity in Spatially Resolved Single-Cell Transcriptomic Data with Nonuniform Cellular Densities.” Genome Research 31 (10): 1843–55.
Palla, Giovanni, Hannah Spitzer, Michal Klein, David Fischer, Anna Christina Schaar, Louis Benedikt Kuemmerle, Sergei Rybakov, et al. 2022. “Squidpy: A Scalable Framework for Spatial Omics Analysis.” Nature Methods 19 (2): 171–78. https://doi.org/10.1038/s41592-021-01358-2.
Ståhl, Patrik L., Fredrik Salmén, Sanja Vickovic, Anna Lundmark, José Fernández Navarro, Jens Magnusson, Stefania Giacomello, et al. 2016. “Visualization and Analysis of Gene Expression in Tissue Sections by Spatial Transcriptomics.” Science 353 (6294): 78–82. https://doi.org/10.1126/science.aaf2403.
Sun, Shiquan, Jiaqiang Zhu, and Xiang Zhou. 2020. “Statistical Analysis of Spatial Expression Patterns for Spatially Resolved Transcriptomic Studies.” Nature Methods 17 (2): 193–200.
Svensson, Valentine, Sarah A Teichmann, and Oliver Stegle. 2018. “SpatialDE: Identification of Spatially Variable Genes.” Nature Methods 15 (5): 343–46.
Tepe, Burak, Matthew C. Hill, Brandon T. Pekarek, Patrick J. Hunt, Thomas J. Martin, James F. Martin, and Benjamin R. Arenkiel. 2018. “Single-Cell Rna-Seq of Mouse Olfactory Bulb Reveals Cellular Heterogeneity and Activity-Dependent Molecular Census of Adult-Born Neurons.” Cell Reports 25 (10): 2689–2703.e3. https://doi.org/10.1016/j.celrep.2018.11.034.
Weber, Lukas M., Arkajyoti Saha, Abhirup Datta, Kasper D. Hansen, and Stephanie C. Hicks. 2023. “NnSVG for the Scalable Identification of Spatially Variable Genes Using Nearest-Neighbor Gaussian Processes.” Nature Communications 14 (4059). https://doi.org/https://doi.org/10.1038/s41467-023-39748-z.
Zhu, Jiaqiang, Shiquan Sun, and Xiang Zhou. 2021. “SPARK-X: Non-Parametric Modeling Enables Scalable and Robust Detection of Spatial Expression Patterns for Large Spatial Transcriptomic Studies.” Genome Biology 22 (1): 184. https://doi.org/10.1186/s13059-021-02404-0.