dittoSeq 1.16.0
dittoSeq is a tool built to enable analysis and visualization of single-cell and bulk RNA-sequencing data by novice, experienced, and color-blind coders. Thus, it provides many useful visualizations, which all utilize red-green color-blindness optimized colors by default, and which allow sufficient customization, via discrete inputs, for out-of-the-box creation of publication-ready figures.
For single-cell data, dittoSeq works directly with data pre-processed in other popular packages (Seurat, scater, scran, …). For bulk RNAseq data, dittoSeq’s import functions will convert bulk RNAseq data of various different structures into a set structure that dittoSeq helper and visualization functions can work with. So ultimately, dittoSeq includes universal plotting and helper functions for working with (sc)RNAseq data processed and stored in these formats:
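Single-Cell: Seurat and SingleCellExperiment objects
Bulk: SummarizedExperiment objects (including data analyzed with DESeq2)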
For bulk data, or if your data is currently not analyzed, or simply not in one of these structures, you can still pull it into the SingleCellExperiment structure that dittoSeq works with by using the importDittoBulk function.
The default colors of this package are red-green color-blindness friendly. To make it so, I used the suggested colors from (Wong 2011) and adapted them slightly by appending darker and lighter versions to create a 24 color vector. All plotting functions use these colors, stored in dittoColors(), by default.
Additionally, the Simulate function allows a cone-typical individual to see what their dittoSeq plots might look like to a colorblind individual.
Code used here for dataset processing and normalization should not be seen as a suggestion of the proper methods for performing such steps. dittoSeq is a visualization tool, and my focus while developing this vignette has been simply creating values required for providing “pretty-enough” visualization examples.
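As a minimal sketch of the color defaults and the Simulate function described above, the chunk below prints the first few default colors and then draws a plot as it might appear to a deuteranope. The `mock` object, its `group` metadata, and the random "pca" embedding are hypothetical stand-ins built only so that the chunk runs on its own; they are not part of the datasets used in this vignette.

```r
library(dittoSeq)

# The first eight default colors used by all dittoSeq plotting functions
head(dittoColors(), 8)

# Hypothetical mock data: 100 genes x 20 samples, 4 arbitrary groups
set.seed(1)
counts <- matrix(rpois(2000, 5), ncol = 20,
    dimnames = list(paste0("gene", 1:100), paste0("sample", 1:20)))
mock <- importDittoBulk(
    x = list(counts = counts, logcounts = log2(counts + 1)),
    metadata = data.frame(group = rep(c("A", "B", "C", "D"), 5)),
    # Random numbers standing in for a real dimensionality reduction
    reductions = list(pca = matrix(rnorm(40), nrow = 20)))

# Render a dittoDimPlot with colors converted to simulate deuteranopia.
# Inputs after 'plot.function' are passed along to that plotting function.
p <- Simulate(type = "deutan", plot.function = dittoDimPlot,
    object = mock, var = "group", size = 3)
p
```

Swapping `type` to "protan" or "tritan" should simulate the other common forms of color blindness in the same way.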
dittoSeq is available through Bioconductor.
# Install BiocManager if needed
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
# Install dittoSeq
BiocManager::install("dittoSeq")
As of May 25th, 2021, Seurat-v4.0.2 & dittoSeq v1.4.1
Many users will already be familiar with Seurat, so this may be 90% of what you need!
Seurat Viz Function(s) | dittoSeq Equivalent(s) |
---|---|
DimPlot / (I)FeaturePlot / UMAPPlot / etc. | dittoDimPlot / multi_dittoDimPlot |
VlnPlot / RidgePlot | dittoPlot / multi_dittoPlot |
DotPlot | dittoDotPlot |
FeatureScatter / GenePlot | dittoScatterPlot |
DoHeatmap | dittoHeatmap* |
[No Seurat Equivalent] | dittoBarPlot / dittoFreqPlot |
[No Seurat Equivalent] | dittoDimHex / dittoScatterHex |
[No Seurat Equivalent] | dittoPlotVarsAcrossGroups |
SpatialDimPlot, SpatialFeaturePlot, etc. | dittoSpatial (coming soon!) |
*Not all dittoSeq features exist in Seurat counterparts, and occasionally the same is true in the reverse.
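To make a few rows of the table above concrete, here is a hedged sketch that pairs Seurat calls (shown only as comments, not run) with their dittoSeq counterparts. The `mock` object and the `group` metadata are hypothetical and exist only to keep the chunk self-contained; with real data you would pass your own Seurat or SCE object instead.

```r
library(dittoSeq)

# Hypothetical mock data: 100 genes x 20 samples, 4 arbitrary groups
set.seed(1)
counts <- matrix(rpois(2000, 5), ncol = 20,
    dimnames = list(paste0("gene", 1:100), paste0("sample", 1:20)))
mock <- importDittoBulk(
    x = list(counts = counts, logcounts = log2(counts + 1)),
    metadata = data.frame(group = rep(c("A", "B", "C", "D"), 5)))

# Seurat: VlnPlot(obj, features = "gene1", group.by = "group")
p_vln <- dittoPlot(mock, "gene1", group.by = "group")

# Seurat: FeatureScatter(obj, "gene1", "gene2", group.by = "group")
p_scatter <- dittoScatterPlot(mock, x.var = "gene1", y.var = "gene2",
    color.var = "group")

# Seurat: DotPlot(obj, features = c("gene1", "gene2", "gene3"), group.by = "group")
p_dot <- dittoDotPlot(mock, vars = c("gene1", "gene2", "gene3"),
    group.by = "group")

p_vln
```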
See the reference below for the equivalent names of major inputs.
Seurat's input names have been inconsistent from version to version. dittoSeq drew some of its parameter names from earlier Seurat equivalents to ease cross-conversion, but continuing to blindly copy Seurat's parameter standards would break people’s existing code. Instead, dittoSeq input names are guaranteed to remain consistent across versions, unless a change is required for useful feature additions.
Seurat Viz Input(s) | dittoSeq Equivalent(s) |
---|---|
object | SAME |
features | var / vars (generally the 2nd input, so name not needed!) OR genes & metas for dittoHeatmap() |
cells (cell subsetting is not always available) | cells.use (consistently available) |
reduction & dims | reduction.use & dim.1, dim.2 |
pt.size | size (or jitter.size) |
group.by | SAME |
split.by | SAME |
shape.by | SAME and also available in dittoPlot() |
fill.by | color.by (can be used to subset group.by further!) |
assay / slot | SAME |
order = logical | order but = “unordered” (default), “increasing”, or “decreasing” |
cols | color.panel for discrete OR min.color, max.color for continuous |
label & label.size & repel | do.label & labels.size & labels.repel |
interactive | do.hover = via plotly conversion |
[Not in Seurat] | data.out, do.raster, do.letter, do.ellipse, add.trajectory.lineages and others! |
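The input mapping above can be sketched with a single translated call. The Seurat call appears only as a comment for comparison; the `mock` object, its `group` metadata, and the random "pca" embedding are hypothetical placeholders, not data from this vignette.

```r
library(dittoSeq)

# Hypothetical mock data with a stand-in "pca" reduction
set.seed(1)
counts <- matrix(rpois(2000, 5), ncol = 20,
    dimnames = list(paste0("gene", 1:100), paste0("sample", 1:20)))
mock <- importDittoBulk(
    x = list(counts = counts, logcounts = log2(counts + 1)),
    metadata = data.frame(group = rep(c("A", "B", "C", "D"), 5)),
    reductions = list(pca = matrix(rnorm(40), nrow = 20)))

# Seurat (for comparison only, not run):
# DimPlot(obj, reduction = "pca", dims = c(1, 2), group.by = "group",
#         pt.size = 2, label = TRUE, label.size = 4, repel = TRUE)
p <- dittoDimPlot(mock, "group",
    reduction.use = "pca", dim.1 = 1, dim.2 = 2,   # reduction & dims
    size = 2,                                      # pt.size
    do.label = TRUE, labels.size = 4,              # label & label.size
    labels.repel = TRUE,                           # repel
    cells.use = mock$group != "D")                 # cells (logical subsetting)
p
```

Here `cells.use` is given a logical vector, which keeps every sample except those in group "D".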
For examples, we will use a pancreatic dataset from Baron et al. (2016), which is neither normalized nor dimensionality reduced when first loaded.
## Download Data
library(scRNAseq)
sce <- BaronPancreasData()
# Trim to only 5 of the cell types for simplicity of vignette
sce <- sce[,sce$label %in% c(
    "acinar", "beta", "gamma", "delta", "ductal")]
Now that we have a single-cell dataset loaded, we are ready to go. All functions work for either Seurat or SCE encapsulated single-cell data.
But to make full use of dittoSeq, we should really have this data log-normalized, and we should run dimensionality reduction and clustering.
## Some Quick Pre-processing
# Normalization.
library(scater)
sce <- logNormCounts(sce)
# Feature selection.
library(scran)
dec <- modelGeneVar(sce)
hvg <- getTopHVGs(dec, prop=0.1)
# PCA & UMAP
library(scater)
set.seed(1234)
sce <- runPCA(sce, ncomponents=25, subset_row=hvg)
sce <- runUMAP(sce, pca = 10)
# Clustering.
library(bluster)
sce$cluster <- clusterCells(sce, use.dimred='PCA',
    BLUSPARAM=NNGraphParam(cluster.fun="louvain"))
# Add some metadata common to Seurat objects
sce$nCount_RNA <- colSums(counts(sce))
sce$nFeature_RNA <- colSums(counts(sce)>0)
sce$percent.mito <- colSums(counts(sce)[grep("^MT-", rownames(sce)),])/sce$nCount_RNA
sce
## class: SingleCellExperiment
## dim: 20125 5416
## metadata(0):
## assays(2): counts logcounts
## rownames(20125): A1BG A1CF ... ZZZ3 pk
## rowData names(0):
## colnames(5416): human1_lib1.final_cell_0001 human1_lib1.final_cell_0002
## ... human4_lib3.final_cell_0700 human4_lib3.final_cell_0701
## colData names(7): donor label ... nFeature_RNA percent.mito
## reducedDimNames(2): PCA UMAP
## mainExpName: NULL
## altExpNames(0):
Now we have a single-cell dataset loaded and analyzed as an SCE, but note: All functions will work the same for single-cell data stored as either Seurat or SCE.
dittoSeq works natively with Seurat and SingleCellExperiment objects. Nothing special is needed. Just load in your data if it isn’t already loaded, then go!
library(dittoSeq)
dittoDimPlot(sce, "donor")
dittoPlot(sce, "ENO1", group.by = "label")
dittoBarPlot(sce, "label", group.by = "donor")
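Before plotting, it can help to check which names an object offers as inputs. The sketch below uses a few of dittoSeq's helper functions for that purpose. It builds a small hypothetical `mock_sce` object (random counts, a made-up `label` metadata column, and random numbers standing in for a PCA) so that it runs independently of the dataset above; the same calls should apply to the `sce` object or to a Seurat object.

```r
library(dittoSeq)
library(SingleCellExperiment)

# Hypothetical mock SCE: 100 genes x 20 cells
set.seed(1)
counts <- matrix(rpois(2000, 5), ncol = 20,
    dimnames = list(paste0("gene", 1:100), paste0("cell", 1:20)))
mock_sce <- SingleCellExperiment(
    assays = list(counts = counts, logcounts = log2(counts + 1)),
    colData = DataFrame(label = rep(c("alpha", "beta"), 10)),
    reducedDims = list(PCA = matrix(rnorm(40), nrow = 20)))

# Names that can be given to plotting functions
getMetas(mock_sce)        # metadata names
getReductions(mock_sce)   # dimensionality reduction names
head(getGenes(mock_sce))  # gene names

# Test whether a given name is a metadata or a gene of the object
isMeta("label", mock_sce)
isGene("gene1", mock_sce)
```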
# First, we'll just make some mock expression and conditions data
exp <- matrix(rpois(20000, 5), ncol=20)
colnames(exp) <- paste0("donor", seq_len(ncol(exp)))
rownames(exp) <- paste0("gene", seq_len(nrow(exp)))
logexp <- log2(exp + 1)
pca <- matrix(rnorm(20000), nrow=20)
conditions <- factor(rep(1:4, 5))
sex <- c(rep("M", 9), rep("F", 11))
dittoSeq works natively with bulk RNAseq data stored as a SummarizedExperiment object, and this includes data analyzed with DESeq2.
library(SummarizedExperiment)
bulkSE <- SummarizedExperiment(
    assays = list(counts = exp,
                  logcounts = logexp),
    colData = data.frame(conditions = conditions,
                         sex = sex)
    )
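As a hedged illustration of that native support, the sketch below passes a SummarizedExperiment straight to two plotters that do not need a dimensionality reduction. It rebuilds a small object of the same shape so the chunk runs on its own; the object name `se` and the mock values are assumptions made for this example only.

```r
library(dittoSeq)
library(SummarizedExperiment)

# Hypothetical mock bulk data: 100 genes x 20 samples
set.seed(1)
counts <- matrix(rpois(2000, 5), ncol = 20,
    dimnames = list(paste0("gene", 1:100), paste0("donor", 1:20)))
se <- SummarizedExperiment(
    assays = list(counts = counts,
                  logcounts = log2(counts + 1)),
    colData = data.frame(conditions = factor(rep(1:4, 5)),
                         sex = c(rep("M", 9), rep("F", 11))))

# Composition of each condition by sex
p_bar <- dittoBarPlot(se, "sex", group.by = "conditions")

# Expression of one gene, grouped by sex
p_box <- dittoPlot(se, "gene1", group.by = "sex",
    plots = c("boxplot", "jitter"))

p_bar
```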
Alternatively, or for bulk data stored in other forms, such as a DGEList or as raw matrices, one can use the importDittoBulk() function to convert it into the SingleCellExperiment structure.
Some brief details on this structure: The SingleCellExperiment class is very similar to the base SummarizedExperiment class, but with room added for storing pre-calculated dimensionality reductions.
# Import into the SingleCellExperiment structure with importDittoBulk()
bulkSCE <- importDittoBulk(
    # x can be a DGEList, a DESeqDataSet, a SummarizedExperiment,
    #   or a list of data matrices
    x = list(counts = exp,
             logcounts = logexp),
    # Optional inputs:
    #   For adding metadata
    metadata = data.frame(conditions = conditions,
                          sex = sex),
    #   For adding dimensionality reductions
    reductions = list(pca = pca)
    )
Metadata and dimensionality reductions can be added either directly within the importDittoBulk() function via the metadata and reductions inputs, as above, or separately afterwards:
# Add metadata (metadata can alternatively be added in this way)
bulkSCE$conditions <- conditions
bulkSCE$sex <- sex
# Add dimensionality reductions (can alternatively be added this way)
bulkSCE <- addDimReduction(
    object = bulkSCE,
    # (We aren't actually calculating PCA here.)
    embeddings = pca,
    name = "pca",
    key = "PC")
Making plots for bulk data then works much as it does for single-cell data, except for one slight caveat for SummarizedExperiment objects
library(dittoSeq)
dittoDimPlot(bulkSCE, "sex", size = 3, do.ellipse = TRUE)