lute 1.3.0
This guide describes lute’s generics, methods, and classes for algorithms,
including deconvolution and marker selection algorithms. This software and the
method to rescale on cell type-specific sizes are detailed in the manuscript
Maden et al. (2024). The guide may be useful to algorithm developers and to
researchers interested in conducting systematic algorithm benchmarks.
The class structure used by lute is based on the bluster R/Bioconductor
package. It expands on that class structure by defining a hierarchy.
Many algorithms are maintained and versioned on GitHub or Zenodo rather than in a routinely versioned repository such as Bioconductor or CRAN. This can be an obstacle when tracing package development and attempting comprehensive benchmarks: software that is not actively maintained can become deprecated over time, and not all software uses compatible dependency versions (Maden et al. 2023).
lute classes can help to (1) encourage use of common Bioconductor object
classes (e.g. SummarizedExperiment, SingleCellExperiment, DelayedArray) and
(2) use more standard inputs and outputs to encourage code reuse, discourage
duplicated efforts, and enable more rapid and exhaustive benchmarks.
In a general sense, the class hierarchy is a wrapper that gives access to many algorithms through a single function and shared methods. Because the classes can also share data reformatting and preprocessing tasks, the hierarchy in practice behaves more like a workflow.
The typemarkersParam class is the topmost parameter class for cell type gene markers. It is used to manage the marker IDs.
The deconvolutionParam class is the parent class for all deconvolution
algorithm param objects. It is minimal and defines only two slots:
bulkExpression, a matrix of bulk expression data, and returnInfo, a logical
value indicating whether the default algorithm output is stored and returned
with the standard output from running the deconvolution() method on a valid
algorithm param object.
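The parent class and single entry point described above can be sketched with base R S4 tools. This is a simplified illustration, not lute's source code: the names sketchDeconvolutionParam, sketchMeanParam, and sketchDeconvolution are hypothetical, and the "algorithm" is a placeholder (per-sample mean expression) chosen only so that the returnInfo behavior can be shown.

```r
library(methods)

# Hypothetical minimal parent class with the two slots described in the text.
setClass("sketchDeconvolutionParam",
         slots = c(bulkExpression = "matrix", returnInfo = "logical"))

# Hypothetical child class for one algorithm; it inherits both slots.
setClass("sketchMeanParam", contains = "sketchDeconvolutionParam")

# One generic acts as the single entry point for every algorithm class.
setGeneric("sketchDeconvolution",
           function(object) standardGeneric("sketchDeconvolution"))

# Placeholder method: per-sample mean expression stands in for real
# predictions (an assumption of this sketch, not a deconvolution algorithm).
setMethod("sketchDeconvolution", "sketchMeanParam", function(object) {
  raw <- colMeans(object@bulkExpression)
  result <- list(predictions = raw)
  if (object@returnInfo) {
    # Keep the algorithm's own output alongside the standard output.
    result$info <- list(rawOutput = raw)
  }
  result
})

bulk <- matrix(c(1, 3, 2, 4), nrow = 2,
               dimnames = list(c("gene1", "gene2"), c("sample1", "sample2")))
param <- new("sketchMeanParam", bulkExpression = bulk, returnInfo = TRUE)
result <- sketchDeconvolution(param)
names(result)
```

With returnInfo set to FALSE, the same call would return only the predictions element; new algorithms are added by defining another child class and method without changing the caller.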
As shown in the class hierarchy diagram (above), referencebasedParam is a
subclass that inherits attributes from deconvolutionParam and is itself a
parent class. It is meant to contain and manage all tasks shared by
reference-based deconvolution algorithms, that is, algorithms that use a cell
type summary dataset. These are distinguished from reference-free algorithms.
This param class adds slots for referenceExpression, the cell type reference
data, and cellScaleFactors, an optional vector of cell type size factors used
to transform the reference.
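As a rough illustration of these two slots, the sketch below extends a hypothetical parent class and applies the size factors to the reference. The class names and the helper scaledReference() are made up for this example, and the transformation shown (multiplying each cell type column of the reference by its size factor) is an assumption of the sketch rather than a statement of lute's implementation.

```r
library(methods)

# Hypothetical parent class (redefined here so the block is self-contained).
setClass("sketchDeconvolutionParam",
         slots = c(bulkExpression = "matrix", returnInfo = "logical"))

# Hypothetical reference-based subclass adding the two slots from the text.
setClass("sketchReferencebasedParam",
         contains = "sketchDeconvolutionParam",
         slots = c(referenceExpression = "matrix",
                   cellScaleFactors = "numeric"))

# Hypothetical shared helper: rescale the reference by cell type size.
scaledReference <- function(object) {
  reference <- object@referenceExpression
  factors <- object@cellScaleFactors
  if (length(factors) == 0) {
    return(reference)  # size factors are optional
  }
  # Align factors to the reference columns by cell type name.
  factors <- factors[colnames(reference)]
  # Assumed transformation: multiply each column by its size factor.
  sweep(reference, 2, factors, "*")
}

reference <- matrix(c(1, 2, 3, 4), nrow = 2,
                    dimnames = list(c("gene1", "gene2"), c("typeA", "typeB")))
bulk <- matrix(c(5, 6), nrow = 2,
               dimnames = list(c("gene1", "gene2"), "sample1"))
param <- new("sketchReferencebasedParam",
             bulkExpression = bulk, returnInfo = FALSE,
             referenceExpression = reference,
             cellScaleFactors = c(typeB = 10, typeA = 2))
scaled <- scaledReference(param)
scaled
```

Placing the rescaling step in a helper tied to the parent class means every reference-based algorithm subclass can reuse it instead of reimplementing it.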
The independentbulkParam class covers the subset of referencebasedParam
algorithms that use explicitly separate sets of samples, such as for discrete
training and test stages.
This param class adds a slot called bulkExpressionIndependent, which is for a
dataset of bulk samples independent from the samples specified in the
bulkExpression slot.
lute provides a number of helper functions used to make the algorithm classes
work. These include the constructors for the parent classes and subclasses,
and several functions to convert between object classes. These helper
functions may be useful to developers. The following table lists the functions
with a short summary of what each does.
function_name | description |
---|---|
referenceFromSingleCellExperiment() | Makes the Z cell atlas reference from a SingleCellExperiment. |
eset_to_sce() | Convert ExpressionSet to SingleCellExperiment. |
sce_to_eset() | Convert SingleCellExperiment to ExpressionSet. |
se_to_eset() | Convert SummarizedExperiment to ExpressionSet. |
get_eset_from_matrix() | Makes an ExpressionSet from a matrix. |
parseDeconvolutionPredictionsResults() | Gets formatted predicted cell type proportions table from deconvolution results list. |
show() | Method to inspect and summarize param object contents. |
deconvolution() | Method to perform deconvolution with a param object. |
typemarkers() | Method to get cell type markers with a param object. |
deconvolutionParam() | Defines the principal parent class for all deconvolution method parameters. |
referencebasedParam() | Class and methods for managing reference-based deconvolution methods. |
independentbulkParam() | Class and methods for managing methods requiring independent bulk samples. |
typemarkersParam() | Main constructor for class to manage mappings to the typemarkers() generic. |
The param class findmarkersParam is defined for the function findMarkers()
from scran (see ?findmarkersParam). This function identifies cell type marker
genes from a single-cell or single-nucleus expression dataset.
The findmarkersParam class is organized under its parent class as
typemarkersParam->findmarkersParam. It includes the typemarkers() method for
the identification of marker genes, and the show() method for inspecting the
param contents.
The following images annotate the constructor function and the typemarkers()
generic defined for the findmarkersParam class.
The param class nnlsParam is defined for the function nnls() from the nnls
R/CRAN package (see ?nnlsParam). Non-negative least squares (NNLS) is commonly
used for deconvolution.
The nnlsParam class is organized under its parent classes as
deconvolutionParam->referencebasedParam->nnlsParam. It includes the
deconvolution() method for cell type deconvolution, and the show() method
for inspecting the param contents.
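To make the underlying computation concrete, the sketch below calls nnls::nnls() directly on a toy reference and a simulated bulk sample, without going through lute. It assumes the nnls package is installed; the data are invented for illustration, and dividing the coefficients by their sum to obtain proportions is an assumption of this sketch rather than a description of what the nnlsParam method does.

```r
library(nnls)

# Toy reference: rows are marker genes, columns are cell types.
reference <- matrix(c(10, 1, 5,
                      1, 10, 5),
                    nrow = 3,
                    dimnames = list(c("gene1", "gene2", "gene3"),
                                    c("typeA", "typeB")))

# Simulate one bulk sample as a known mixture of the two cell types.
trueProportions <- c(typeA = 0.25, typeB = 0.75)
bulk <- as.vector(reference %*% trueProportions)

# Solve min ||reference %*% x - bulk|| subject to x >= 0.
fit <- nnls(reference, bulk)

# Assumed post-processing: rescale coefficients so they sum to one.
proportions <- setNames(fit$x / sum(fit$x), colnames(reference))
proportions
```

Because the simulated bulk sample is an exact, noise-free mixture, the estimated proportions should match trueProportions up to numerical tolerance; real data would not fit this cleanly.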
The following images annotate the constructor function and the deconvolution()
generic defined for the nnlsParam class.