Introduction

Understanding the spatial distribution and interplay of cell states in tissue is critical for elucidating tissue formation and function. Single-cell and spatial omics present a promising approach to addressing this need. Traditional methods typically include the identification of highly variable genes, dimensionality reduction, clustering, and the annotation of cells or functions based on gene over-expression. Nevertheless, these qualitative approaches are inadequate for accurately mapping the distributions of spatial features. To address this, integrating biomedical knowledge such as Gene Ontology, KEGG, Reactome, transcription factors, and cell-type marker genes directly allows for the evaluation of cell states from gene expression data, creating quantitative functional pathway profiles at the single captured location.

After quantifying cell functions, analyzing their spatial distribution and co-distribution with other features can provide deeper insights into related biological questions. We focus on three aspects: the spatial variability of cell functions, regions where these functions cluster, and their co-distribution patterns with other features. Although existing tools such as SPARK-X(Zhu, Sun, and Zhou 2021), nnSVG(Weber et al. 2023), SpatialDE(Svensson, Teichmann, and Stegle 2018), SpaGFT(Chang et al. 2024), Seurat(Hao et al. 2023), and Squidpy(Palla et al. 2022) facilitate the exploration of spatially variable genes, they are primarily designed for gene-level analysis and lack the capability to investigate the spatial co-distribution of features. Additionally, many of these tools, including SpatialDE(Svensson, Teichmann, and Stegle 2018), SPARK(Sun, Zhu, and Zhou 2020), MERINGUE(Miller et al. 2021), and nnSVG(Weber et al. 2023), face challenges in handling large-scale spatial transcriptome data due to high memory consumption and low computational efficiency.

To fill the gaps, we developed SVP to accurately predict cell states, explore their spatial distribution, and assess their spatial relationship with other features. SVP is developed based on SingleCellExperiment class, which can be interoperable with the existing computing ecosystem of Bioconductor.

Overview of SVP

The evaluation of functional status at individual locations captured by SVP is achieved using Multiple Correspondence Analysis (MCA) for dimensionality reduction. This process employs a standardized gene expression matrix to project both cells and genes into a unified MCA space. It has been established that this method allows for the calculation of distances not only between genes and cells but also between cells and genes, thereby facilitating the assessment of their associations(Cortal et al. 2021). Proximity in this space indicates a stronger relationship. These calculated distances are crucial for constructing a weighted k-nearest neighbors (KNN) network, linking each cell or gene to its most relevant counterparts. To discern features with varying levels of proximity, distances are first normalized to a 0-1 scale, with closer distances approaching 1 and farther distances approaching 0. This normalization is followed by division by the total distance among the nearest features, thereby assigning greater connection weights to closer features. Subsequently, databases of known biological knowledge, such as transcription factor target gene sets, Reactome functional gene sets, and cell-type marker gene sets, serve as initial seeds. Random walks on the constructed weighted KNN network yield preliminary functional state activity scores for each location. To mitigate potential biases introduced by dimensionality reduction, a hypergeometric distribution test enhances the enrichment analysis of top-ranking genes extracted directly from the expression matrix at each location. These analyses provide weights for functional activities, culminating in the derivation of functional activity scores at the single captured location (Fig. 1A).

To identify spatially variable cell functions, we first established a cell neighbor weight matrix based on spot locations using the Delaunay triangulation (default) or KNN algorithm. This weight matrix, alongside global autocorrelation analyses such as Moran’s I (default), Geary’s C, or Getis-Ord’s G, facilitated the identification of spatially variable cell functions or gene characteristics. Additionally, we utilized the same cell neighbor weight matrix and local spatial autocorrelation algorithms (Local Getis-Ord or Local Moran) to delineate the local spatial distribution of these variable features (Fig. 1B). To examine spatial co-distribution among cell functions, we designed a bivariate spatial global and local autocorrelation algorithm employing the Lee index. This approach enables the assessment of correlation between different cell characteristics in their spatial distribution (Fig. 1C).

Overview of SVP.

Install

You can use the following commands to install it

#Once Bioconductor 3.21 is released, you can install it as follows.
if (!requireNamespace("BiocManager", quietly=TRUE))
    install.packages("BiocManager")

BiocManager::install("SVP")

#Or you can install it via github
if (!requireNamespace("remotes", quietly=TRUE))
    install.packages("remotes")
remotes::install_github("YuLab-SMU/SVP")

Quantification of cell states using SVP

SVP can integrate gene or protein lists from existing biomedical databases like cell-type markers (PanglaoDB, CellMarkers), gene ontology (GO), molecular features (MSigDB), and pathways (KEGG, Reactome, Wikipathways), transcription factors, and disease ontology, to assess cell states. Of course, the cell-type markers also can be extracted from reference single cell data. In this vignette, we use the spatial transcriptome and single cell transcriptome of mouse olfactory bulb from the articles(Ståhl et al. 2016; Tepe et al. 2018) to demonstrate. Here, we first perform the normalization by logNormCounts of scuttle and the MCA dimensionality reduction by runMCA of SVP.

# load the packages required in the vignette
library(SpatialExperiment)
library(SingleCellExperiment)
library(scuttle)
library(scran)
library(SVP)
library(ggsc)
library(ggplot2)
library(magrittr)
library(STexampleData)

# load the spatial transcriptome, it is a SpatialExperiment object
# to build the object, you can refer to the SpatialExperiment package
mob_spe <- ST_mouseOB()  
#> see ?STexampleData and browseVignettes('STexampleData') for documentation
#> loading from cache
# the genes that did not express in any captured location were removed.
# we use logNormCounts of scuttle to normalize the spatial transcriptome
mob_spe <- logNormCounts(mob_spe)
# Now the normalized counts was added to the assays of mob_spe as `logcounts`
# it can be extracted via logcounts(mob_spe) or assay(mob_spe, 'logcounts')

# Next, we use `runMCA` of `SVP` to preform the MCA dimensionality reduction
mob_spe <- runMCA(mob_spe, assay.type = 'logcounts')
#> Computing Fuzzy Matrix
#> Computing SVD
#> Computing Coordinates
# The MCA result was added to the reducedDims, it can be extracted by reducedDim(mob_spe, 'MCA')
# more information can refer to SingleCellExperiment package
mob_spe
#> class: SpatialExperiment 
#> dim: 15928 262 
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(15928): 0610007N19Rik 0610007P14Rik ... Zzef1 Zzz3
#> rowData names(1): gene_name
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#>   TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(4): barcode_id sample_id layer sizeFactor
#> reducedDimNames(1): MCA
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(1): sample_id

Quantification of BIOCARTA pathway or other function

SVP accepts SingleCellExperiment or SpatialExperiment object as input. The gene set should be a named list object, and the elements of the list must be included in the row names of the SingleCellExperiment or SpatialExperiment object. To quantifying the Biocarta pathway or other function for each captured location, the command runSGSA is used.

#Note: the gset.idx.list supports the named gene list, gson object or
# locate gmt file or html url of gmt.
mob_spe <- runSGSA(mob_spe, 
             gset.idx.list = "https://data.broadinstitute.org/gsea-msigdb/msigdb/release/2024.1.Mm/m2.cp.biocarta.v2024.1.Mm.symbols.gmt", # The target gene set gmt file
             assay.type = 'logcounts', # The name of assays of gene profiler
             gsvaExp.name = 'Biocarta' # The name of result to save to gsvaExps slot
           )
#> Building the nearest neighbor graph with the distance between features and
#> cells ...
#> elapsed time is 9.161000 seconds
#> Building the seed matrix using the gene set and the nearest neighbor graph for
#> random walk with restart ...
#> elapsed time is 0.083000 seconds
#> Calculating the affinity score using random walk with restart ...
#> elapsed time is 0.925000 seconds
#> Tidying the result of random walk with restart ...
#> elapsed time is 0.009000 seconds
#> ORA analysis ...
#> elapsed time is 0.024000 seconds

# Then the result was added to gsvaExps to return a SVPExperiment object, the result
# can be extracted with gsvaExp, you can view more information via help(gsvaExp).
mob_spe
#> class: SVPExperiment 
#> dim: 15928 262 
#> metadata(0):
#> assays(2): counts logcounts
#> rownames(15928): 0610007N19Rik 0610007P14Rik ... Zzef1 Zzz3
#> rowData names(1): gene_name
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#>   TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(4): barcode_id sample_id layer sizeFactor
#> reducedDimNames(1): MCA
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):
#> gsvaExps names(1) : Biocarta

gsvaExpNames(mob_spe)
#> [1] "Biocarta"

# The result is also a SingleCellExperiment or SpatialExperiment.
gsvaExp(mob_spe, 'Biocarta')
#> class: SpatialExperiment 
#> dim: 252 262 
#> metadata(0):
#> assays(1): affi.score
#> rownames(252): BIOCARTA_41BB_PATHWAY BIOCARTA_ACE2_PATHWAY ...
#>   BIOCARTA_WNT_LRP6_PATHWAY BIOCARTA_WNT_PATHWAY
#> rowData names(4): exp.gene.num gset.gene.num gene.occurrence.rate
#>   geneSets
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#>   TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(4): barcode_id layer sizeFactor sample_id
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):

# identify the spatially variable Biocarta pathway.
Biocarta.svdf <- gsvaExp(mob_spe, "Biocarta") |>
 runDetectSVG(
   assay.type = 1,
   action = 'only'
 )
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 0.039000 seconds

Biocarta.svdf |> head()
#>                                obs expect.moransi sd.moransi Z.moransi
#> BIOCARTA_41BB_PATHWAY   0.05302297   -0.003831418 0.03016130 1.8850112
#> BIOCARTA_ACE2_PATHWAY   0.09933677   -0.003831418 0.03247755 3.1766002
#> BIOCARTA_ACH_PATHWAY    0.04625366   -0.003831418 0.03142826 1.5936320
#> BIOCARTA_ACTINY_PATHWAY 0.12301548   -0.003831418 0.03276267 3.8716900
#> BIOCARTA_AGPCR_PATHWAY  0.01386184   -0.003831418 0.03520808 0.5025341
#> BIOCARTA_AGR_PATHWAY    0.04678783   -0.003831418 0.03553799 1.4243700
#>                               pvalue        padj rank
#> BIOCARTA_41BB_PATHWAY   2.971417e-02 0.061290029  115
#> BIOCARTA_ACE2_PATHWAY   7.450615e-04 0.002884109   62
#> BIOCARTA_ACH_PATHWAY    5.550924e-02 0.100925887  132
#> BIOCARTA_ACTINY_PATHWAY 5.404167e-05 0.000324250   40
#> BIOCARTA_AGPCR_PATHWAY  3.076459e-01 0.426792051  173
#> BIOCARTA_AGR_PATHWAY    7.716969e-02 0.133242632  139

# We can obtain significant spatially variable pathways, 
# which are sorted according to the Moran index.
# More details see also the Spatial statistical analysis session
Biocarta.svdf |> dplyr::filter(padj<=0.01) |> dplyr::arrange(rank) |> head()
#>                                    obs expect.moransi sd.moransi Z.moransi
#> BIOCARTA_CACAM_PATHWAY       0.3623497   -0.003831418 0.03565189 10.271014
#> BIOCARTA_CELLCYCLE_PATHWAY   0.2831358   -0.003831418 0.03437045  8.349240
#> BIOCARTA_P53_PATHWAY         0.2480688   -0.003831418 0.03270444  7.702325
#> BIOCARTA_GSK3_PATHWAY        0.2476114   -0.003831418 0.03568856  7.045475
#> BIOCARTA_CALCINEURIN_PATHWAY 0.2443118   -0.003831418 0.03577427  6.936362
#> BIOCARTA_PTDINS_PATHWAY      0.2299768   -0.003831418 0.03574460  6.541078
#>                                    pvalue         padj rank
#> BIOCARTA_CACAM_PATHWAY       4.759931e-25 1.142383e-22    1
#> BIOCARTA_CELLCYCLE_PATHWAY   3.435170e-17 4.122204e-15    2
#> BIOCARTA_P53_PATHWAY         6.680625e-15 5.344500e-13    3
#> BIOCARTA_GSK3_PATHWAY        9.241475e-13 5.544885e-11    4
#> BIOCARTA_CALCINEURIN_PATHWAY 2.011633e-12 9.655839e-11    5
#> BIOCARTA_PTDINS_PATHWAY      3.053843e-11 1.221537e-09    6

ids.sv.biocarta <- Biocarta.svdf |> dplyr::filter(padj<=0.01) |> dplyr::arrange(rank) |> rownames()

# We use sc_spatial of ggsc to visualize the distribution of some spatially variable functions
gsvaExp(mob_spe, 'Biocarta') %>% 
  sc_spatial(
    features = ids.sv.biocarta[seq(12)],
    mapping = aes(x = x, y = y),
    geom = geom_bgpoint,
    pointsize = 3.5,
    ncol = 4,
    common.legend = FALSE # The output is a patchwork 
  ) & 
  scale_color_viridis_c(option='B', guide='none')
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.

$The heatmap of spatially variable biocarta pathway$

The heatmap of spatially variable biocarta pathway

Quantification of cell-type

To quantifying the cell-type for each captured location, the gene set should be the marker gene sets of cell-type. It can be the pre-established marker gene sets, such as the markers from CellMarkers or PanglaoDB. High-quality cell marker genes from past reports or your own compilations are acceptable. In addition, It can also be extracted from the reference single cell data. The following sections will detail the specific operations of each approach.

Quantification of cell-type using the pre-established marker gene sets

We can use the pre-established marker gene sets to quantifying the cell-type for each captured location. Here, we used the markers obtained the differential expression genes from the single cell transcriptome of the article (Tepe et al. 2018).

# The marker gene sets
# We can view the marker gene number of each cell-type.
data(mob_marker_genes)
lapply(mob_marker_genes, length) |> unlist()
#>     GC    PGC   M/TC   OSNs EPL-IN 
#>     14     13     13     34     12

mob_spe <- runSGSA(
                mob_spe, 
                gset.idx.list = mob_marker_genes,
                gsvaExp.name = 'CellTypeFree', 
                assay.type = 'logcounts'
              )
#> Building the nearest neighbor graph with the distance between features and
#> cells ...
#> elapsed time is 9.084000 seconds
#> Building the seed matrix using the gene set and the nearest neighbor graph for
#> random walk with restart ...
#> elapsed time is 0.011000 seconds
#> Calculating the affinity score using random walk with restart ...
#> elapsed time is 0.168000 seconds
#> Tidying the result of random walk with restart ...
#> elapsed time is 0.008000 seconds
#> ORA analysis ...
#> elapsed time is 0.009000 seconds

# Then we can visualize the cell-type activity via sc_spatial of SVP with pie plot
gsvaExp(mob_spe, "CellTypeFree") |> 
   sc_spatial(
     features = sort(names(mob_marker_genes)),
     mapping = aes(x = x, y = y),
     plot.pie = TRUE,
     color = NA,
     pie.radius.scale = 1.1
   ) +
   scale_fill_brewer(palette='Set2')
#> Scale for fill is already present.
#> Adding another scale for fill, which will replace the existing scale.

The pie plot of cell-type activity

We cal also display each cell-type activity with heat-map according to the spatial coordinate.

# calculated the of cell-type
assay(gsvaExp(mob_spe, 2)) |> as.matrix() |> 
  prop.table(2) |> 
  Matrix::Matrix(sparse=TRUE) -> 
  assay(gsvaExp(mob_spe,2), "prop")


mob_spe |> gsvaExp(2) |> 
   sc_spatial(
     features = sort(names(mob_marker_genes)),
     mapping = aes(x = x,  y = y),
     geom = geom_bgpoint,
     pointsize = 2,
     ncol = 5,
     slot = 2
   ) + 
   scale_color_viridis_c()
#> Scale for colour is already present.
#> Adding another scale for colour, which will replace the existing scale.

the heatmap of each cell-type activity

Quantification of cell-type using the reference single-cell data

This approach requires obtaining the cell-type marker genes from the reference single-cell data. First, the reference single cell data should be normalized. Here, the command of logNormCounts of scuttle is used. Next, the MCA dimensionality reduction is performed by runMCA of SVP. Finally, the marker gene sets of cell-type are extracted via runDetectMarker of SVP.

# load the reference single cell transcriptome
# it is a SingleCellExperiment
data(mob_sce) 

mob_sce <- logNormCounts(mob_sce)

# You can also first obtain the high variable genes, then
# perform the MCA reduction.
# set.seed(123)
# stats <- modelGeneVar(mob_sce)
# hvgs <- getTopHVGs(stats)

# perform the MCA reduction.
mob_sce <- mob_sce |> runMCA(assay.type = 'logcounts')
#> Computing Fuzzy Matrix
#> Computing SVD
#> Computing Coordinates

# obtain the marker gene sets from reference single-cell transcriptome based on MCA space
refmakergene <- runDetectMarker(mob_sce, group.by='cellType', 
  ntop=200, present.prop.in.group=.2, present.prop.in.sample=.5
  )
refmakergene |> lapply(length) |> unlist()
#>     GC    PGC   M/TC   OSNs EPL-IN 
#>     15     14      8     36     11

Then, the quantification of the cell-type gene signatures can also be performed via runSGSA with the marker gene sets.

# use the maker gene from the reference single-cell transcriptome
mob_spe <- runSGSA(
  mob_spe,
  gset.idx.list = refmakergene,
  gsvaExp.name = 'CellTypeRef'
)
#> Building the nearest neighbor graph with the distance between features and
#> cells ...
#> elapsed time is 9.047000 seconds
#> Building the seed matrix using the gene set and the nearest neighbor graph for
#> random walk with restart ...
#> elapsed time is 0.010000 seconds
#> Calculating the affinity score using random walk with restart ...
#> elapsed time is 0.133000 seconds
#> Tidying the result of random walk with restart ...
#> elapsed time is 0.009000 seconds
#> ORA analysis ...
#> elapsed time is 0.009000 seconds

mob_spe |> 
  gsvaExp('CellTypeRef') |> 
  sc_spatial(
    features = sort(names(refmakergene)), 
    mapping = aes(x = x, y = y),
    plot.pie = TRUE, 
    color = NA, 
    pie.radius.scale = 1.1
  ) +
  scale_fill_brewer(palette = 'Set2')
#> Scale for fill is already present.
#> Adding another scale for fill, which will replace the existing scale.

The pie plot of cell-type activity with the marker gene sets from reference single-cell transcriptome

We can found that the result based on marker gene sets from reference single-cell transcriptome is similar with the result based on marker gene sets from previous reports.

We can also obtain the major cell-type for each captured location using pred.cell.signature.

# the result will be added to the colData of gsvaExp(mob_spe, "CellTypeRef")
mob_spe <- mob_spe |> pred.cell.signature(gsvaexp='CellTypeFree', pred.col.name='pred')
#> The `gsvaexp` = CellTypeFree was specified, the specified gsva Experiment will
#> be used to predict the cell signature.

# Then we can use sc_spatial to visualize it.
mob_spe |>
  gsvaExp("CellTypeFree") |>
  sc_spatial(
    mapping = aes(x =  x, y = y, color = pred),
    geom = geom_bgpoint,
    pointsize = 4
  ) +
  scale_color_brewer(palette = 'Set2')

The major cell type for each spot

Spatial statistical analysis

SVP implements several spatial statistical algorithms to explore the spatial distribution of cell function or other features from three aspects: whether cell function states have spatial variability, where spatially variable cell functions cluster, and the co-distribution patterns of spatially variable cell functions with other features. The following sections provide a detailed description of each aspect.

Univariate spatial statistical analysis

This section addresses whether the feature is spatially variable and identifies its high-cluster areas. In detail, to identify the spatial variable features, SVP implements three spatial autocorrelation algorithms using Rcpp and RcppParallel: global Moran’s I, Geary’s C and Getis-Ord’s G. To identify the spatial high-cluster areas, SVP implements two local univariate spatial statistical algorithms using Rcpp and RcppParallel: local Getis-Ord’s G and local Moran’s I.

Identification of spatial variable features

Here, we identify the spatial variable cell-types using runDetectSVG, which implements the global Moran’s I, Geary’s C and Getis-Ord’s G algorithms.


# First approach:
# we can first obtain the target gsvaExp with gsvaExp names via 
# gsvaExp() function, for example, CellTypeFree
mob_spe |> gsvaExp("CellTypeFree") 
#> class: SpatialExperiment 
#> dim: 5 262 
#> metadata(0):
#> assays(2): affi.score prop
#> rownames(5): GC PGC M/TC OSNs EPL-IN
#> rowData names(4): exp.gene.num gset.gene.num gene.occurrence.rate
#>   geneSets
#> colnames(262): ACAACTATGGGTTGGCGG ACACAGATCCTGTTCTGA ...
#>   TTTCAACCCGAGGAAGTC TTTCTAACTCATAAGGAT
#> colData names(5): barcode_id layer sizeFactor sample_id pred
#> reducedDimNames(0):
#> mainExpName: NULL
#> altExpNames(0):
#> spatialCoords names(2) : x y
#> imgData names(0):

# Then we use runDetectSVG with method = 'moran' to identify the spatial variable 
# cell-types with global Moran's I.

celltype_free.I <- mob_spe |> gsvaExp("CellTypeFree") |>
  runDetectSVG(
    assay.type = "prop", # because default is 'logcounts', it should be adjused
    method = 'moransi',
    action = 'only'
  ) 
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 0.073000 seconds
  
celltype_free.I |> dplyr::arrange(rank) |> head()
#>              obs expect.moransi sd.moransi Z.moransi       pvalue         padj
#> GC     0.5590105   -0.003831418 0.03599358 15.637287 2.028236e-55 1.014118e-54
#> OSNs   0.3822118   -0.003831418 0.03591675 10.748281 3.018830e-27 7.547076e-27
#> M/TC   0.2944360   -0.003831418 0.03596283  8.293769 5.485780e-17 9.142967e-17
#> PGC    0.2809947   -0.003831418 0.03578380  7.959638 8.627161e-16 1.078395e-15
#> EPL-IN 0.0574522   -0.003831418 0.03554512  1.724108 4.234422e-02 4.234422e-02
#>        rank
#> GC        1
#> OSNs      2
#> M/TC      3
#> PGC       4
#> EPL-IN    5

# Second approach:
# the result was added to the original object by specific the 
# gsvaexp argument in the runDetectSVG
mob_spe <- mob_spe |> 
  runDetectSVG(
    gsvaexp = 'CellTypeFree',  
    method = 'moran',
    gsvaexp.assay.type = 'prop'
    )
#> The `gsvaexp` was specified, the specified `gsvaExp` will be used to detect
#> 'svg'.
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 0.028000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.moransi' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.
gsvaExp(mob_spe, 'CellTypeFree') |> svDf("sv.moransi") |> dplyr::arrange(rank) |> dplyr::filter(padj<=0.01)
#>            obs expect.moransi sd.moransi Z.moransi       pvalue         padj
#> GC   0.5590105   -0.003831418 0.03599358 15.637287 2.028236e-55 1.014118e-54
#> OSNs 0.3822118   -0.003831418 0.03591675 10.748281 3.018830e-27 7.547076e-27
#> M/TC 0.2944360   -0.003831418 0.03596283  8.293769 5.485780e-17 9.142967e-17
#> PGC  0.2809947   -0.003831418 0.03578380  7.959638 8.627161e-16 1.078395e-15
#>      rank
#> GC      1
#> OSNs    2
#> M/TC    3
#> PGC     4

The obs is the Moran’s I, expect.moransi is the expect Moran’s I (E(I)). Moran’s I quantifies the correlation between a single feature at a specific spatial point or region and its neighboring points or regions. Global Moran’s I usually ranges from -1 to 1, Global Moran’s I significantly above E(I) indicate a distinct spatial pattern or spatial clustering, which occurs when neighboring captured locations tend to have similar feature value. Global Moran’s I around E(I) suggest random spatial expression, that is, absence of spatial pattern. Global Moran’s I significantly below E(I) imply a chessboard-like pattern or dispersion.

# using Geary's C 
mob_spe <- mob_spe |>
  runDetectSVG(
    gsvaexp = 'CellTypeFree',
    gsvaexp.assay.type = 'prop',
    method = 'geary'
  )
#> The `gsvaexp` was specified, the specified `gsvaExp` will be used to detect
#> 'svg'.
#> Identifying the spatially variable gene sets (pathway) based on gearysc
#> elapsed time is 0.027000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.gearysc' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.

gsvaExp(mob_spe, "CellTypeFree") |> svDf('sv.gearysc') |> dplyr::arrange(rank) |> dplyr::filter(padj<=0.01)
#>            obs expect.gearysc sd.gearysc Z.gearysc       pvalue         padj
#> GC   0.4405675              1 0.03640685 15.366133 1.380832e-53 6.904162e-53
#> OSNs 0.6132465              1 0.03690356 10.480112 5.330845e-26 1.332711e-25
#> M/TC 0.7059801              1 0.03660655  8.031894 4.798975e-16 7.998292e-16
#> PGC  0.7112366              1 0.03774514  7.650346 1.002197e-14 1.252746e-14
#>      rank
#> GC      1
#> OSNs    2
#> M/TC    3
#> PGC     4

The obs is the Geary’s C. Global Geary’s C is another popular index of global spatial autocorrelation, but focuses more on differences between neighboring values. Global Geary’s C usually range from 0 to 2. Global Geary’s C significantly below E[C]=1 indicate a distinct spatial pattern or spatial clustering, the value significantly above E[C]=1 suggest a chessboard-like pattern or dispersion, the value around E[C]=1 imply absence of spatial pattern.

# using Getis-ord's G
mob_spe <- mob_spe |>
  runDetectSVG(
    gsvaexp = 'CellTypeFree',
    gsvaexp.assay.type = 'prop',
    method = 'getis'
  )
#> The `gsvaexp` was specified, the specified `gsvaExp` will be used to detect
#> 'svg'.
#> Identifying the spatially variable gene sets (pathway) based on getisord
#> elapsed time is 0.026000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.getisord' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.

gsvaExp(mob_spe, "CellTypeFree") |> svDf('sv.getisord') |> dplyr::arrange(rank) |> dplyr::filter(padj<=0.01) 
#>              obs    expect.G         sd.G       Z.G       pvalue         padj
#> GC   0.006508037 0.003831418 0.0001767082 15.147114 3.958388e-52 1.979194e-51
#> OSNs 0.004943604 0.003831418 0.0001123195  9.901992 2.040334e-23 5.100835e-23
#> M/TC 0.005215980 0.003831418 0.0001745559  7.931912 1.078986e-15 1.798310e-15
#> PGC  0.005363887 0.003831418 0.0002049821  7.476113 3.827644e-14 4.784556e-14
#>      rank
#> GC      1
#> OSNs    2
#> M/TC    3
#> PGC     4

The obs is the Getis-Ord’s G of each cell-type. Global Getis-Ord’s G measures global spatial autocorrelation by evaluating the clustering of high/low value, specifically identifying hot spots or cold spots. Global Getis-Ord’s G values have no fixed range, as they depend on the specific dataset and spatial weight matrix. Global Getis-Ord’s G significantly above E(G) indicate a distinct spatial pattern or spatial clustering, which occurs when neighboring captured locations tend to have similar feature value. Global Getis-Ord’s G around E(G) suggest random spatial expression, that is, absence of spatial pattern. Global Getis-Ord’s G significantly below E(G) imply a chessboard-like pattern or dispersion.

So the GC, OSNs, M/TC, and PGC were significantly concentrated in the spatial distribution compare to EPL-IN.

Identification of local spatial aggregation areas

Next, we can identify the spatial high-cluster areas for the spatial variable features via runLISA, which provides two local spatial statistics algorithms: Local Getis-Ord’s G and Local Moran’s I

mob_spe_cellfree <- mob_spe |> gsvaExp("CellTypeFree")

mob_spe_cellfree <- mob_spe_cellfree |> 
    runPCA(assay.type='prop') |> runTSNE(dimred='PCA')
#> Warning in check_numbers(k = k, nu = nu, nv = nv, limit = min(dim(x)) - : more
#> singular values/vectors requested than available
#> Warning in (function (A, nv = 5, nu = nv, maxit = 1000, work = nv + 7, reorth =
#> TRUE, : You're computing too large a percentage of total singular values, use a
#> standard svd instead.

# Here, we use the `TSNE` space to identify the location clustering regions
celltypefree_lisares <- mob_spe_cellfree |> 
   runLISA(
     features = c("GC", "OSNs", "M/TC", "PGC"), 
     assay.type = 'prop', 
     reduction.used = 'TSNE', 
     weight.method = 'knn', 
     k = 8
   ) 

celltypefree_lisares$GC |> head()
#>                              Gi        E.Gi       Var.Gi      Z.Gi
#> ACAACTATGGGTTGGCGG 0.0004568125 0.003831418 2.163394e-06 -2.294326
#> ACACAGATCCTGTTCTGA 0.0015130076 0.003831418 2.169044e-06 -1.574186
#> ACATCACCTGCGCGCTCT 0.0101004706 0.003831418 2.193616e-06  4.232741
#> ACATTTAAGGCGCATGAT 0.0103808033 0.003831418 2.193136e-06  4.422499
#> ACCACTGTAATCTCCCAT 0.0005428855 0.003831418 2.167719e-06 -2.233576
#> ACCAGAGCCGTTGAGCAA 0.0016442001 0.003831418 2.172145e-06 -1.484047
#>                    Pr (z != E(Gi)) cluster.no.test cluster.test
#> ACAACTATGGGTTGGCGG    2.177176e-02             Low          Low
#> ACACAGATCCTGTTCTGA    1.154444e-01             Low       NoSign
#> ACATCACCTGCGCGCTCT    2.308604e-05            High         High
#> ACATTTAAGGCGCATGAT    9.756574e-06            High         High
#> ACCACTGTAATCTCCCAT    2.551102e-02             Low          Low
#> ACCAGAGCCGTTGAGCAA    1.377966e-01             Low       NoSign

Local Getis-Ord’s G (G_i) measures the aggregation of a feature value in a specific spot and its neighbors, determining if locations are significantly clustered with high (hot spots) or low values (cold spots). G_i usually have no fixed range, a larger G_i indicates a stronger clustering of high expression for the feature in captured location and its neighboring spots, while a smaller G_i indicates a stronger clustering of low expression.

gsvaExp(mob_spe, 'CellTypeFree') |> 
  plot_lisa_feature(
    lisa.res = celltypefree_lisares, 
    assay.type = 2, 
    pointsize = 2,
    gap_line_width = .18,
    hlpointsize= 1.8
  )

The result of local Getis-Ord for cell-type

The highlighted area signifies that the feature is significantly clustered in this region.

Bivariate spatial statistical analysis

Next, to explore the bivariate spatial co-distribution, SVP implements global and local bivariate spatial statistics algorithms. Global bivariate spatial statistics algorithm can be used to identify whether the bivariate is co-cluster in the specific space. Local bivariate spatial algorithm is to find the areas of spatial co-clustering of bivariate.

Global bivariate spatial analysis

Here, we use runGLOBALBV of SVP to explore the spatial co-distribution between the cell-types.

gbv.res <- mob_spe |> 
  gsvaExp('CellTypeFree') |> 
  runGLOBALBV(
    features1 = c("GC", "OSNs", "M/TC", "PGC"),
    assay.type = 2, 
    permutation = NULL, # if permutation is NULL, mantel test will be used to calculated the pvalue
    add.pvalue = TRUE,
    alternative = 'greater' # one-side test 
  )
  
# gbv.res is a list contained Lee and pvalue matrix.
gbv.res |> as_tbl_df(diag=FALSE, flag.clust = TRUE) |> 
  dplyr::arrange(desc(Lee))
#> # A tibble: 6 × 4
#>   x     y         Lee   pvalue
#>   <fct> <fct>   <dbl>    <dbl>
#> 1 OSNs  PGC    0.177  1.40e-15
#> 2 M/TC  PGC    0.0304 1.57e- 3
#> 3 M/TC  OSNs  -0.107  9.76e- 1
#> 4 GC    M/TC  -0.202  1.00e+ 0
#> 5 GC    OSNs  -0.246  1   e+ 0
#> 6 GC    PGC   -0.306  1   e+ 0

The Lee value typically ranges from -1 to 1. When the index is closer to 1, it indicates a strong positive spatial association between the two variables, meaning the high-value areas of one variable tend to overlap with the high-value areas of the other, and low-value areas similarly overlap, showing strong spatial consistency. Conversely, if the index is near -1, it suggests a strong negative spatial association, where the high-value areas of one variable tend to overlap with the low-value areas of the other.

We developed a function plot_heatmap_globalbv to visualize the result of global bivariate spatial analysis

# We can also display the result of global univariate spatial analysis 
moran.res <- gsvaExp(mob_spe, 'CellTypeFree') |> svDf("sv.moransi")

# The result of local univariate spatial analysis also can be added
lisa.f1.res <- gsvaExp(mob_spe, "CellTypeFree", withColData = TRUE) |> 
    cal_lisa_f1(lisa.res = celltypefree_lisares, group.by = 'layer', rm.group.nm=NA)

plot_heatmap_globalbv(gbv.res, moran.t=moran.res, lisa.t=lisa.f1.res)
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_col()`).
#> Warning: Removed 1 row containing missing values or values outside the scale range
#> (`geom_text()`).

The results of clustering analysis of spatial distribution between cell-type

We can explore the spatial co-cluster between the different type of gsvaExp. For example, we explore the spatial co-distribution between the CellTypeFree and Biocarta of mob_spe object.

# If the number of features is excessive, we recommend selecting 
# those with spatial variability.
 celltypeid <- gsvaExp(mob_spe, "CellTypeFree") |> svDf("sv.moransi") |> 
   dplyr::filter(padj<=0.01) |> rownames()

biocartaid <- gsvaExp(mob_spe, "Biocarta") |> 
   runDetectSVG(assay.type = 1, verbose=FALSE) |> 
   svDf() |> 
   dplyr::filter(padj <= 0.01) |>
   rownames()
#> elapsed time is 0.039000 seconds

runGLOBALBV(
  mob_spe, 
  gsvaexp = c("CellTypeFree", "Biocarta"), 
  gsvaexp.features = c(celltypeid, biocartaid[seq(20)]), 
  gsvaexp.assay.type = c(2, 1),
  permutation = NULL, 
  add.pvalue = TRUE
) -> res.gbv.celltype.biocarta
#> The `gsvaexp` is specified, the specified `gsvaExp` will be
#> used to perform the analysis.

plot_heatmap_globalbv(res.gbv.celltype.biocarta)

The heatmap of spatial correlation between cell tyep and Biocarta states

Or we can also explore the co-location between the spatially variable genes and cell-types.

svgid <- mob_spe |> runDetectSVG(assay.type = 'logcounts') |> svDf() |>
         dplyr::filter(padj <= 0.01) |> dplyr::arrange(rank) |> rownames()
#> Identifying the spatially variable gene sets (pathway) based on moransi
#> elapsed time is 1.648000 seconds
#> The result is added to the input object, which can be extracted using
#> `svDf()` with type='sv.moransi' for `SingleCellExperiment` or
#> `SpatialExperiment`.
#> If input object is `SVPExperiment`, and `gsvaexp` is specified
#> the result can be extracted by `gsvaExp()` (return a `SingleCellExperiment`
#> or `SpatialExperiment`),then also using `svDf()` to extract.

res.gbv.genes.celltype <- runGLOBALBV(mob_spe,
  features1 = svgid[seq(40)],
  gsvaexp = 'CellTypeFree',
  gsvaexp.features = celltypeid,
  gsvaexp.assay.type = 2,
  permutation = NULL,
  add.pvalue = TRUE
)
#> The `gsvaexp` is specified, the specified `gsvaExp` will be
#> used to perform the analysis.
plot_heatmap_globalbv(res.gbv.genes.celltype)

The heatmap of spatial correlation between cell tyep and spatially variable genes

Local bivariate spatial analysis

After global bivariate spatial analysis, we found some genes (such as Psd, Nrgn, Kcnh3, Gpsm1, Pcp4) and GC show significant spatial co-aggregation. Next, we use runLOCALBV of SVP to identify the co-aggregation areas.

lbv.res <- mob_spe |> 
  runLOCALBV(features1=c("Psd", "Nrgn", "Kcnh3", "Gpsm1", "Pcp4"),
    assay.type = 'logcounts',
    gsvaexp = 'CellTypeFree',
    gsvaexp.features=c('GC'),
    gsvaexp.assay.type = 'prop',
    weight.method='knn',
    k = 8
  ) 
#> The `gsvaexp` is specified, the specified `gsvaExp` will be
#> used to perform the analysis with `features` displayed from `rownames(data)`.

# The result can be added to gsvaExp, then visualized by ggsc.
LISAsce(mob_spe, lisa.res=lbv.res, gsvaexp.name='LBV.res') |> 
  gsvaExp("LBV.res") %>% 
  plot_lisa_feature(lisa.res = lbv.res, assay.type = 1)

The heatmap of Local bivariate spatial analysis

The highlighted area denotes the region where the two variables significantly co-aggregate in space.

Session information

Here is the output of sessionInfo() on the system on which this document was compiled:

#> R version 4.5.0 beta (2025-04-02 r88102)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 24.04.2 LTS
#> 
#> Matrix products: default
#> BLAS:   /home/biocbuild/bbs-3.22-bioc/R/lib/libRblas.so 
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
#> 
#> locale:
#>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
#>  [3] LC_TIME=en_GB              LC_COLLATE=C              
#>  [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
#>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
#>  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       
#> 
#> time zone: America/New_York
#> tzcode source: system (glibc)
#> 
#> attached base packages:
#> [1] stats4    stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#>  [1] scater_1.37.0               STexampleData_1.15.3       
#>  [3] ExperimentHub_2.17.0        AnnotationHub_3.17.0       
#>  [5] BiocFileCache_2.17.0        dbplyr_2.5.0               
#>  [7] scatterpie_0.2.4            magrittr_2.0.3             
#>  [9] ggplot2_3.5.2               SVP_1.1.0                  
#> [11] scran_1.37.0                scuttle_1.19.0             
#> [13] ggsc_1.7.0                  SpatialExperiment_1.19.0   
#> [15] SingleCellExperiment_1.31.0 SummarizedExperiment_1.39.0
#> [17] Biobase_2.69.0              GenomicRanges_1.61.0       
#> [19] GenomeInfoDb_1.45.0         IRanges_2.43.0             
#> [21] S4Vectors_0.47.0            BiocGenerics_0.55.0        
#> [23] generics_0.1.3              MatrixGenerics_1.21.0      
#> [25] matrixStats_1.5.0          
#> 
#> loaded via a namespace (and not attached):
#>   [1] fs_1.6.6                  spatstat.sparse_3.1-0    
#>   [3] httr_1.4.7                RColorBrewer_1.1-3       
#>   [5] tools_4.5.0               sctransform_0.4.1        
#>   [7] utf8_1.2.4                R6_2.6.1                 
#>   [9] lazyeval_0.2.2            uwot_0.2.3               
#>  [11] withr_3.0.2               sp_2.2-0                 
#>  [13] gridExtra_2.3             progressr_0.15.1         
#>  [15] cli_3.6.4                 spatstat.explore_3.4-2   
#>  [17] fastDummies_1.7.5         labeling_0.4.3           
#>  [19] sass_0.4.10               Seurat_5.2.1             
#>  [21] spatstat.data_3.1-6       ggridges_0.5.6           
#>  [23] pbapply_1.7-2             yulab.utils_0.2.0        
#>  [25] parallelly_1.43.0         limma_3.65.0             
#>  [27] RSQLite_2.3.9             gridGraphics_0.5-1       
#>  [29] ica_1.0-3                 spatstat.random_3.3-3    
#>  [31] dplyr_1.1.4               Matrix_1.7-3             
#>  [33] ggbeeswarm_0.7.2          abind_1.4-8              
#>  [35] lifecycle_1.0.4           yaml_2.3.10              
#>  [37] edgeR_4.7.0               SparseArray_1.9.0        
#>  [39] Rtsne_0.17                grid_4.5.0               
#>  [41] blob_1.2.4                promises_1.3.2           
#>  [43] dqrng_0.4.1               crayon_1.5.3             
#>  [45] miniUI_0.1.1.1            lattice_0.22-7           
#>  [47] beachmat_2.25.0           cowplot_1.1.3            
#>  [49] KEGGREST_1.49.0           magick_2.8.6             
#>  [51] pillar_1.10.2             knitr_1.50               
#>  [53] metapod_1.17.0            rjson_0.2.23             
#>  [55] future.apply_1.11.3       codetools_0.2-20         
#>  [57] fastmatch_1.1-6           glue_1.8.0               
#>  [59] ggfun_0.1.8               spatstat.univar_3.1-2    
#>  [61] data.table_1.17.0         vctrs_0.6.5              
#>  [63] png_0.1-8                 treeio_1.33.0            
#>  [65] spam_2.11-1               gtable_0.3.6             
#>  [67] cachem_1.1.0              xfun_0.52                
#>  [69] S4Arrays_1.9.0            mime_0.13                
#>  [71] pracma_2.4.4              survival_3.8-3           
#>  [73] statmod_1.5.0             bluster_1.19.0           
#>  [75] fitdistrplus_1.2-2        ROCR_1.0-11              
#>  [77] nlme_3.1-168              ggtree_3.17.0            
#>  [79] bit64_4.6.0-1             filelock_1.0.3           
#>  [81] RcppAnnoy_0.0.22          bslib_0.9.0              
#>  [83] irlba_2.3.5.1             vipor_0.4.7              
#>  [85] KernSmooth_2.23-26        colorspace_2.1-1         
#>  [87] DBI_1.2.3                 tidyselect_1.2.1         
#>  [89] bit_4.6.0                 compiler_4.5.0           
#>  [91] curl_6.2.2                BiocNeighbors_2.3.0      
#>  [93] DelayedArray_0.35.0       plotly_4.10.4            
#>  [95] scales_1.3.0              lmtest_0.9-40            
#>  [97] rappdirs_0.3.3            stringr_1.5.1            
#>  [99] digest_0.6.37             goftest_1.2-3            
#> [101] spatstat.utils_3.1-3      rmarkdown_2.29           
#> [103] XVector_0.49.0            htmltools_0.5.8.1        
#> [105] pkgconfig_2.0.3           sparseMatrixStats_1.21.0 
#> [107] fastmap_1.2.0             rlang_1.1.6              
#> [109] htmlwidgets_1.6.4         UCSC.utils_1.5.0         
#> [111] shiny_1.10.0              prettydoc_0.4.1          
#> [113] DelayedMatrixStats_1.31.0 farver_2.1.2             
#> [115] jquerylib_0.1.4           zoo_1.8-14               
#> [117] jsonlite_2.0.0            BiocParallel_1.43.0      
#> [119] BiocSingular_1.25.0       GenomeInfoDbData_1.2.14  
#> [121] ggplotify_0.1.2           dotCall64_1.2            
#> [123] patchwork_1.3.0           munsell_0.5.1            
#> [125] Rcpp_1.0.14               viridis_0.6.5            
#> [127] ape_5.8-1                 reticulate_1.42.0        
#> [129] stringi_1.8.7             ggstar_1.0.4             
#> [131] MASS_7.3-65               plyr_1.8.9               
#> [133] parallel_4.5.0            listenv_0.9.1            
#> [135] ggrepel_0.9.6             deldir_2.0-4             
#> [137] Biostrings_2.77.0         splines_4.5.0            
#> [139] tensor_1.5                locfit_1.5-9.12          
#> [141] igraph_2.1.4              spatstat.geom_3.3-6      
#> [143] RcppHNSW_0.6.0            reshape2_1.4.4           
#> [145] ScaledMatrix_1.17.0       BiocVersion_3.22.0       
#> [147] evaluate_1.0.3            SeuratObject_5.0.2       
#> [149] RcppParallel_5.1.10       BiocManager_1.30.25      
#> [151] tweenr_2.0.3              httpuv_1.6.15            
#> [153] RANN_2.6.2                tidyr_1.3.1              
#> [155] purrr_1.0.4               polyclip_1.10-7          
#> [157] future_1.40.0             scattermore_1.2          
#> [159] ggforce_0.4.2             rsvd_1.0.5               
#> [161] xtable_1.8-4              RSpectra_0.16-2          
#> [163] tidytree_0.4.6            tidydr_0.0.5             
#> [165] later_1.4.2               viridisLite_0.4.2        
#> [167] tibble_3.2.1              aplot_0.2.5              
#> [169] beeswarm_0.4.0            memoise_2.0.1            
#> [171] AnnotationDbi_1.71.0      cluster_2.1.8.1          
#> [173] globals_0.16.3

References

Chang, Yuzhou, Jixin Liu, Yi Jiang, Anjun Ma, Yao Yu Yeo, Qi Guo, Megan McNutt, et al. 2024. “Graph Fourier Transform for Spatial Omics Representation and Analyses of Complex Organs.” Nature Communications 15 (1): 7467.

Cortal, Akira, Loredana Martignetti, Emmanuelle Six, and Antonio Rausell. 2021. “Gene Signature Extraction and Cell Identity Recognition at the Single-Cell Level with Cell-Id.” Nature Biotechnology 39 (9): 1095–1102.

Hao, Yuhan, Tim Stuart, Madeline H Kowalski, Saket Choudhary, Paul Hoffman, Austin Hartman, Avi Srivastava, et al. 2023. “Dictionary Learning for Integrative, Multimodal and Scalable Single-Cell Analysis.” Nature Biotechnology. https://doi.org/10.1038/s41587-023-01767-y.

Miller, Brendan F, Dhananjay Bambah-Mukku, Catherine Dulac, Xiaowei Zhuang, and Jean Fan. 2021. “Characterizing Spatial Gene Expression Heterogeneity in Spatially Resolved Single-Cell Transcriptomic Data with Nonuniform Cellular Densities.” Genome Research 31 (10): 1843–55.

Palla, Giovanni, Hannah Spitzer, Michal Klein, David Fischer, Anna Christina Schaar, Louis Benedikt Kuemmerle, Sergei Rybakov, et al. 2022. “Squidpy: A Scalable Framework for Spatial Omics Analysis.” Nature Methods 19 (2): 171–78. https://doi.org/10.1038/s41592-021-01358-2.

Ståhl, Patrik L., Fredrik Salmén, Sanja Vickovic, Anna Lundmark, José Fernández Navarro, Jens Magnusson, Stefania Giacomello, et al. 2016. “Visualization and Analysis of Gene Expression in Tissue Sections by Spatial Transcriptomics.” Science 353 (6294): 78–82. https://doi.org/10.1126/science.aaf2403.

Sun, Shiquan, Jiaqiang Zhu, and Xiang Zhou. 2020. “Statistical Analysis of Spatial Expression Patterns for Spatially Resolved Transcriptomic Studies.” Nature Methods 17 (2): 193–200.

Svensson, Valentine, Sarah A Teichmann, and Oliver Stegle. 2018. “SpatialDE: Identification of Spatially Variable Genes.” Nature Methods 15 (5): 343–46.

Tepe, Burak, Matthew C. Hill, Brandon T. Pekarek, Patrick J. Hunt, Thomas J. Martin, James F. Martin, and Benjamin R. Arenkiel. 2018. “Single-Cell Rna-Seq of Mouse Olfactory Bulb Reveals Cellular Heterogeneity and Activity-Dependent Molecular Census of Adult-Born Neurons.” Cell Reports 25 (10): 2689–2703.e3. https://doi.org/10.1016/j.celrep.2018.11.034.

Weber, Lukas M., Arkajyoti Saha, Abhirup Datta, Kasper D. Hansen, and Stephanie C. Hicks. 2023. “NnSVG for the Scalable Identification of Spatially Variable Genes Using Nearest-Neighbor Gaussian Processes.” Nature Communications 14 (4059). https://doi.org/https://doi.org/10.1038/s41467-023-39748-z.

Zhu, Jiaqiang, Shiquan Sun, and Xiang Zhou. 2021. “SPARK-X: Non-Parametric Modeling Enables Scalable and Robust Detection of Spatial Expression Patterns for Large Spatial Transcriptomic Studies.” Genome Biology 22 (1): 184. https://doi.org/10.1186/s13059-021-02404-0.

SVP Vignette

Shuangbin Xu and GuangChuang Yu
School of Basic Medical Sciences, Southern Medical University

2025-04-15

Introduction

Overview of SVP

Install

Quantification of cell states using SVP

Quantification of BIOCARTA pathway or other function

Quantification of cell-type

Quantification of cell-type using the pre-established marker gene sets

Quantification of cell-type using the reference single-cell data

Spatial statistical analysis

Univariate spatial statistical analysis

Identification of spatial variable features

Identification of local spatial aggregation areas

Bivariate spatial statistical analysis

Global bivariate spatial analysis

Local bivariate spatial analysis

Session information

References

SVP Vignette

Shuangbin Xu and GuangChuang Yu School of Basic Medical Sciences, Southern Medical University

2025-04-15

Introduction

Overview of SVP

Install

Quantification of cell states using SVP

Quantification of BIOCARTA pathway or other function

Quantification of cell-type

Quantification of cell-type using the pre-established marker gene sets

Quantification of cell-type using the reference single-cell data

Spatial statistical analysis

Univariate spatial statistical analysis

Identification of spatial variable features

Identification of local spatial aggregation areas

Bivariate spatial statistical analysis

Global bivariate spatial analysis

Local bivariate spatial analysis

Session information

References

Shuangbin Xu and GuangChuang Yu
School of Basic Medical Sciences, Southern Medical University