library(stJoincount)
library(pheatmap)
library(ggplot2)
v1.1.1
stJoincount: Quantification tool for spatial correlation between clusters in spatial transcriptomics preprocessed data using join count statistic approach.
Spatial dependency is the relationship between location and attribute similarity. The measure reflects whether an attribute of a variable observed at one location is independent of values observed at neighboring locations. Positive spatial dependency exists when neighboring attributes are more similar than what could be explained by chance. Likewise, a negative spatial dependency is reflected by a dissimilarity of neighboring attributes. Join count analysis allows for quantification of the spatial dependencies of nominal data in an arrangement of spatially adjacent polygons.
This tool requires data produced with the 10X Genomics Visium Spatial Gene Expression platform with customized clusters. The purpose of this R package is to perform join count analysis for spatial correlation between clusters.
Users can install stJoincount with:
if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
}
BiocManager::install("stJoincount")
Examples of how to run this tool are below:
In this vignette, we use a human breast cancer spatial transcriptomics sample.
fpath <- system.file("extdata", "dataframe.rda", package="stJoincount")
load(fpath)
head(humanBC)
#> imagecol imagerow Cluster
#> AATTGCAGCAATCGAC-1 431.2129 476.8069 4
#> ACCAGGAGTGTGATCT-1 273.0446 117.8218 9
#> ACCTCCGCCCTCGCTG-1 448.2178 423.9109 7
#> AGGTGTATCGCCATGA-1 144.2822 317.5000 1
#> ATAGTTCCACCCACTC-1 431.5099 323.9109 7
#> CCGTATTAGCGCAGTT-1 462.1535 200.4950 2
Within ‘extdata’, users can find the file “dataframe.rda”, which loads an object named humanBC. This example data is a data.frame derived from a Seurat object of a human breast cancer sample. It contains the information essential to this algorithm: the barcode (as the index), the cluster (categorical or numerical labels), and the central pixel location (imagerow and imagecol). The data.frame is a simplified combination of the metadata and the spatial coordinates.
The index contains the barcodes. In addition, at least three columns are required, and their names must match the following exactly:
imagerow: The row pixel coordinate of the center of the spot
imagecol: The column pixel coordinate of the center of the spot
Cluster: The label corresponding to this barcode
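Before running the analysis on your own data, it can help to confirm that the data.frame follows the layout described above. The sketch below is a minimal check in base R. The helper name checkJoincountInput and the toy barcodes and coordinates are hypothetical and are not part of stJoincount.

```r
# Hypothetical helper (not part of stJoincount): verify that a data.frame
# has the three required columns and numeric pixel coordinates.
checkJoincountInput <- function(df) {
  required <- c("imagerow", "imagecol", "Cluster")
  missingCols <- setdiff(required, colnames(df))
  if (length(missingCols) > 0) {
    stop("Missing required column(s): ", paste(missingCols, collapse = ", "))
  }
  if (!is.numeric(df$imagerow) || !is.numeric(df$imagecol)) {
    stop("imagerow and imagecol must be numeric pixel coordinates")
  }
  if (anyNA(df$Cluster)) {
    stop("Cluster must not contain missing labels")
  }
  invisible(TRUE)
}

# Toy input with made-up barcodes and coordinates
toy <- data.frame(
  imagecol = c(431.2, 273.0, 448.2),
  imagerow = c(476.8, 117.8, 423.9),
  Cluster = c(4, 9, 7),
  row.names = c("BARCODE-A", "BARCODE-B", "BARCODE-C")
)
checkJoincountInput(toy)
```

The check stops with an informative error if a column is missing or mistyped, which is easier to diagnose than a failure later in the pipeline.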
The following code demonstrates how to generate the described data.frame from Seurat or SpatialExperiment objects.
An example data preparation from Seurat:
fpath <- system.file("extdata", "SeuratBC.rda", package="stJoincount")
load(fpath)
df <- dataPrepFromSeurat(seuratBC, "label")
An example data preparation from SpatialExperiment object:
fpath <- system.file("extdata", "SpeBC.rda", package="stJoincount")
load(fpath)
df2 <- dataPrepFromSpE(SpeObjBC, "label")
This tool first converts a labeled spatial tissue map into a raster object, in which each spatial feature is represented by a pixel coded by label assignment. This process includes automatic calculation of optimal raster resolution and extent for the sample.
resolutionList <- resolutionCalc(humanBC)
resolutionList
#> [1] 152.89604 64.20792
mosaicIntegration <- rasterizeEachCluster(humanBC)
#> No optimal number found, using n = 110 instead.
#> In this case, there may be minor deviations in the subsequent calculation process.
#> The results are for reference only
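To make the idea of rasterization concrete, the following base R sketch places labeled spots onto a regular grid so that each spot becomes one labeled cell. It is an illustration only and does not reproduce how rasterizeEachCluster chooses resolution or extent. The toy coordinates and the grid spacing cellSize are assumptions.

```r
# Toy spots: pixel coordinates and cluster labels (made-up values)
spots <- data.frame(
  imagecol = c(10, 20, 30, 10, 20, 30),
  imagerow = c(10, 10, 10, 20, 20, 20),
  Cluster = c("A", "A", "B", "A", "B", "B"),
  stringsAsFactors = FALSE
)
cellSize <- 10  # assumed grid spacing in pixels

# Convert pixel coordinates to 1-based integer grid indices
colIdx <- round((spots$imagecol - min(spots$imagecol)) / cellSize) + 1
rowIdx <- round((spots$imagerow - min(spots$imagerow)) / cellSize) + 1

# Fill a label matrix; cells without a spot would stay NA
grid <- matrix(NA_character_, nrow = max(rowIdx), ncol = max(colIdx))
grid[cbind(rowIdx, colIdx)] <- spots$Cluster
grid
```

In this toy case the spacing is known in advance. On real Visium data the spots are not on a simple square lattice, which is why the package estimates a suitable resolution automatically.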
After the labeled spatial sample has been converted, the raster map can be visualized with:
mosaicIntPlot(humanBC, mosaicIntegration)
A neighbors list is then created from the rasterized sample, in which adjacent and diagonal neighbors for each pixel are identified. After adding binary spatial weights to the neighbors list, a multi-categorical join count analysis is performed to tabulate “joins” between all possible combinations of label pairs. The function returns the observed join counts, the expected count under conditions of spatial randomness, and the variance calculated under non-free sampling.
joincount.result <- joincountAnalysis(mosaicIntegration)
#> Warning in subset.nb(nbList, !(seq_len(length(nbList)) %in% emptyPos)):
#> subsetting caused increase in subgraph count
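The tabulation of joins can be illustrated on a tiny label grid. The sketch below counts each undirected pair of adjacent or diagonal neighbors once and tallies it by label pair. It is a simplified stand-in written in base R, not the implementation used by joincountAnalysis, and it does not compute expected counts or variances. The function name countJoins is hypothetical.

```r
# Toy 2 x 3 label grid
labels <- matrix(c("A", "A", "B",
                   "A", "B", "B"), nrow = 2, byrow = TRUE)

# Hypothetical helper: count joins between label pairs using
# adjacent and diagonal (queen) neighbors.
countJoins <- function(m) {
  lv <- sort(unique(as.vector(m[!is.na(m)])))
  counts <- matrix(0, length(lv), length(lv), dimnames = list(lv, lv))
  # Offsets that visit every undirected neighbor pair exactly once:
  # right, down, down-right, down-left
  offsets <- list(c(0, 1), c(1, 0), c(1, 1), c(1, -1))
  for (i in seq_len(nrow(m))) {
    for (j in seq_len(ncol(m))) {
      for (off in offsets) {
        ni <- i + off[1]
        nj <- j + off[2]
        if (ni < 1 || ni > nrow(m) || nj < 1 || nj > ncol(m)) next
        a <- m[i, j]
        b <- m[ni, nj]
        if (is.na(a) || is.na(b)) next
        # Store each pair in the upper triangle, e.g. "A","B" not "B","A"
        pair <- sort(c(a, b))
        counts[pair[1], pair[2]] <- counts[pair[1], pair[2]] + 1
      }
    }
  }
  counts
}

joins <- countJoins(labels)
joins
```

For this grid there are 11 joins in total: 3 A-A joins, 5 A-B joins, and 3 B-B joins.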
The z-score is then calculated as the difference between observed and expected counts, divided by the square root of the variance. A heatmap of z-scores represents the result from the join count analysis for all possible label pairs.
matrix <- zscoreMatrix(humanBC, joincount.result)
zscorePlot(matrix)
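The z-score calculation described above can be written out directly. The numbers below are made up for illustration and are not results from the humanBC sample.

```r
# Made-up observed counts, expected counts, and variances for three label pairs
observed <- c("A:A" = 30, "A:B" = 12, "B:B" = 25)
expected <- c("A:A" = 20, "A:B" = 24, "B:B" = 25)
variance <- c("A:A" = 25, "A:B" = 16, "B:B" = 9)

# z = (observed - expected) / sqrt(variance)
z <- (observed - expected) / sqrt(variance)
z
```

Here the A:A pair has z = 2 (more joins than expected under spatial randomness), the A:B pair has z = -3 (fewer joins than expected), and the B:B pair has z = 0 (no departure from the expectation).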
sessionInfo()
#> R version 4.4.0 RC (2024-04-16 r86468)
#> Platform: x86_64-pc-linux-gnu
#> Running under: Ubuntu 22.04.4 LTS
#>
#> Matrix products: default
#> BLAS: /home/biocbuild/bbs-3.20-bioc/R/lib/libRblas.so
#> LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.10.0
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_GB LC_COLLATE=C
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> time zone: America/New_York
#> tzcode source: system (glibc)
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggplot2_3.5.1 pheatmap_1.0.12 stJoincount_1.7.0 BiocStyle_2.33.0
#>
#> loaded via a namespace (and not attached):
#> [1] RcppAnnoy_0.0.22 splines_4.4.0
#> [3] later_1.3.2 tibble_3.2.1
#> [5] polyclip_1.10-6 fastDummies_1.7.3
#> [7] lifecycle_1.0.4 sf_1.0-16
#> [9] globals_0.16.3 lattice_0.22-6
#> [11] MASS_7.3-60.2 magrittr_2.0.3
#> [13] plotly_4.10.4 sass_0.4.9
#> [15] rmarkdown_2.26 jquerylib_0.1.4
#> [17] yaml_2.3.8 httpuv_1.6.15
#> [19] Seurat_5.0.3 sctransform_0.4.1
#> [21] spam_2.10-0 sp_2.1-4
#> [23] spatstat.sparse_3.0-3 reticulate_1.36.1
#> [25] cowplot_1.1.3 pbapply_1.7-2
#> [27] DBI_1.2.2 RColorBrewer_1.1-3
#> [29] abind_1.4-5 zlibbioc_1.51.0
#> [31] Rtsne_0.17 GenomicRanges_1.57.0
#> [33] purrr_1.0.2 BiocGenerics_0.51.0
#> [35] GenomeInfoDbData_1.2.12 IRanges_2.39.0
#> [37] S4Vectors_0.43.0 ggrepel_0.9.5
#> [39] irlba_2.3.5.1 listenv_0.9.1
#> [41] spatstat.utils_3.0-4 terra_1.7-71
#> [43] units_0.8-5 goftest_1.2-3
#> [45] RSpectra_0.16-1 spatstat.random_3.2-3
#> [47] fitdistrplus_1.1-11 parallelly_1.37.1
#> [49] leiden_0.4.3.1 codetools_0.2-20
#> [51] DelayedArray_0.31.0 tidyselect_1.2.1
#> [53] raster_3.6-26 farver_2.1.1
#> [55] UCSC.utils_1.1.0 matrixStats_1.3.0
#> [57] stats4_4.4.0 spatstat.explore_3.2-7
#> [59] jsonlite_1.8.8 e1071_1.7-14
#> [61] progressr_0.14.0 ggridges_0.5.6
#> [63] survival_3.6-4 tools_4.4.0
#> [65] ica_1.0-3 Rcpp_1.0.12
#> [67] glue_1.7.0 gridExtra_2.3
#> [69] SparseArray_1.5.0 xfun_0.43
#> [71] MatrixGenerics_1.17.0 GenomeInfoDb_1.41.0
#> [73] dplyr_1.1.4 withr_3.0.0
#> [75] BiocManager_1.30.22 fastmap_1.1.1
#> [77] boot_1.3-30 fansi_1.0.6
#> [79] spData_2.3.0 digest_0.6.35
#> [81] R6_2.5.1 mime_0.12
#> [83] wk_0.9.1 colorspace_2.1-0
#> [85] scattermore_1.2 tensor_1.5
#> [87] spatstat.data_3.0-4 utf8_1.2.4
#> [89] tidyr_1.3.1 generics_0.1.3
#> [91] data.table_1.15.4 class_7.3-22
#> [93] httr_1.4.7 htmlwidgets_1.6.4
#> [95] S4Arrays_1.5.0 spdep_1.3-3
#> [97] uwot_0.2.2 pkgconfig_2.0.3
#> [99] gtable_0.3.5 lmtest_0.9-40
#> [101] SingleCellExperiment_1.27.0 XVector_0.45.0
#> [103] htmltools_0.5.8.1 dotCall64_1.1-1
#> [105] bookdown_0.39 SeuratObject_5.0.1
#> [107] scales_1.3.0 Biobase_2.65.0
#> [109] png_0.1-8 SpatialExperiment_1.15.0
#> [111] knitr_1.46 reshape2_1.4.4
#> [113] rjson_0.2.21 nlme_3.1-164
#> [115] proxy_0.4-27 cachem_1.0.8
#> [117] zoo_1.8-12 stringr_1.5.1
#> [119] KernSmooth_2.23-22 parallel_4.4.0
#> [121] miniUI_0.1.1.1 s2_1.1.6
#> [123] pillar_1.9.0 grid_4.4.0
#> [125] vctrs_0.6.5 RANN_2.6.1
#> [127] promises_1.3.0 xtable_1.8-4
#> [129] cluster_2.1.6 evaluate_0.23
#> [131] tinytex_0.50 magick_2.8.3
#> [133] cli_3.6.2 compiler_4.4.0
#> [135] rlang_1.1.3 crayon_1.5.2
#> [137] future.apply_1.11.2 labeling_0.4.3
#> [139] classInt_0.4-10 plyr_1.8.9
#> [141] stringi_1.8.3 viridisLite_0.4.2
#> [143] deldir_2.0-4 munsell_0.5.1
#> [145] lazyeval_0.2.2 spatstat.geom_3.2-9
#> [147] Matrix_1.7-0 RcppHNSW_0.6.0
#> [149] patchwork_1.2.0 future_1.33.2
#> [151] shiny_1.8.1.1 highr_0.10
#> [153] SummarizedExperiment_1.35.0 ROCR_1.0-11
#> [155] igraph_2.0.3 bslib_0.7.0