The common data files (FCS files for the practical workshop) are located in /data/OpenCyto. The following will move the gating template csv file to your home directory under ~/data.

Please do not touch or modify the shared gating template. Work rather on your own copy.

# Common files
FCSPATH="/data/OpenCyto/data/FCS/"
WORKSPACE="/data/OpenCyto/data/workspace/080 batch 0882.xml"
OUTPATH="~"
MANUAL<-file.path(OUTPATH,"data","manual")
AUTO<-file.path(OUTPATH,"data","auto")
TEMPLATE="/data/OpenCyto/data/template/gt_080.csv"
ANNOTATIONS="/data/OpenCyto/data/workspace/pd_submit.csv"

# Move the template into your home directory
if(!file.exists("~/data")){
  dir.create("~/data")
}
file.copy(TEMPLATE,file.path("~/data",basename(TEMPLATE)),overwrite = TRUE)
## [1] TRUE
file.copy(ANNOTATIONS,file.path("~/data",basename(ANNOTATIONS)),overwrite = TRUE)
## [1] TRUE
TEMPLATE<-"~/data/gt_080.csv"
ANNOTATIONS<-"~/data/pd_submit.csv"

Load libraries

require(openCyto)
## Loading required package: openCyto
## Loading required package: flowWorkspace
## Loading required package: flowCore
## Loading required package: flowViz
## Loading required package: lattice
## KernSmooth 2.23 loaded
## Copyright M. P. Wand 1997-2009
## Loading required package: ncdfFlow
## Loading required package: gridExtra
## Loading required package: grid
require(ggplot2)
## Loading required package: ggplot2
require(plyr)
## Loading required package: plyr
require(reshape2)
## Loading required package: reshape2
require(data.table)
## Loading required package: data.table

Open the workspace

The first thing to do is open the workspace file.

ws<-openWorkspace(WORKSPACE)
show(ws)
## FlowJo Workspace Version  2.0 
## File location:  /data/OpenCyto/data/workspace 
## File name:  080 batch 0882.xml 
## Workspace is open. 
## 
## Groups in Workspace
##           Name Num.Samples
## 1  All Samples         158
## 2   0882-L-080         157
## 3        Comps          13
## 4 0882 Samples          76

We see different sample groups in the workspace. The data we’re concerned with is the 0882 Samples group. The data are already compensated and transformed.

Parse the workspace data

The following code snippet parses the contents of the workspace, loads the FCS data, compensates and transforms it according to the instructions in the FlowJo workspace file, and stores the results in a gating_set.
This can take some time depending on the bandwidth of the attached storage device (local vs network disk).
Some arguments to parseWorkspace
The name argument specifies which group to import.
The path argument tells flowWorkspace where to look for the FCS files.
The subset argument allows you to parse a subset of samples. It is a numeric index. The isNcdf argument allows you specify whether or not to use HDF5 / NetCDF for disk-based data access (TRUE), or whether to store the data in RAM (FALSE). Large data sets can’t practically be analyzed in RAM, set this to TRUE most of the time.

# Parse the worspace only if we haven't already done so, and save it.
if(!file.exists(MANUAL)){
  G<-parseWorkspace(ws,isNcdf=TRUE,name="0882 Samples",path=FCSPATH)
  save_gs(G,MANUAL)
}else{
  G<-load_gs(MANUAL)
}

G is now a GatingSet object. This is a set of GatingHierarchy objects, each of which represents an FCS file in the workspace / group that you imported.

Explore the GatingSet object

show(G)
## A GatingSet with 76 samples

We see the object G has 76 samples. The usual R operators can be used to subset the data. - Subset by index (not preferred) - Extract sample names - Subset by sample names (preferred)

G[1:10] # Subset by numeric index.. not the preferred way of doing things.
## A GatingSet with 10 samples
sampleNames(G)  # Get the sample names
##  [1] "769097.fcs" "769098.fcs" "769099.fcs" "769101.fcs" "769103.fcs"
##  [6] "769105.fcs" "769107.fcs" "769109.fcs" "769111.fcs" "769119.fcs"
## [11] "769121.fcs" "769122.fcs" "769124.fcs" "769125.fcs" "769127.fcs"
## [16] "769128.fcs" "769130.fcs" "769131.fcs" "769133.fcs" "769134.fcs"
## [21] "769136.fcs" "769137.fcs" "769145.fcs" "769147.fcs" "769149.fcs"
## [26] "769151.fcs" "769153.fcs" "769155.fcs" "769161.fcs" "769163.fcs"
## [31] "769165.fcs" "769167.fcs" "769169.fcs" "769171.fcs" "769177.fcs"
## [36] "769179.fcs" "769181.fcs" "769183.fcs" "769185.fcs" "769187.fcs"
## [41] "769193.fcs" "769195.fcs" "769197.fcs" "769199.fcs" "769201.fcs"
## [46] "769203.fcs" "769209.fcs" "769211.fcs" "769213.fcs" "769215.fcs"
## [51] "769217.fcs" "769219.fcs" "769225.fcs" "769227.fcs" "769229.fcs"
## [56] "769231.fcs" "769233.fcs" "769235.fcs" "769241.fcs" "769243.fcs"
## [61] "769245.fcs" "769247.fcs" "769249.fcs" "769251.fcs" "769257.fcs"
## [66] "769259.fcs" "769261.fcs" "769263.fcs" "769265.fcs" "769267.fcs"
## [71] "769273.fcs" "769275.fcs" "769277.fcs" "769279.fcs" "769281.fcs"
## [76] "769283.fcs"
G[sampleNames(G)[1:10]] # Subset by sample name.. preferred.
## A GatingSet with 10 samples

Visualize transformations

There is one data transformation per transformed channel. In this case there are no differences between the transformations on the different channels. The FlowJo transformation is a biexponential but it maps input data into “channel space” from 0 to 4095. The output data are effectively binned. flowWorkspace interpolate the transformation function to give continuous values between 0 and 4095.

raw<-seq(1,2.5e5,by=1000) #Input data range on the raw scale
transformed<-lapply(getTransformations(G[[1]]),function(x)matrix(x(raw))) # transform with each transformation
transformed<-do.call(cbind,transformed) # group output
colnames(transformed)<-gsub("^ ","",names(getTransformations(G[[1]])))
transformed<-melt(transformed)
transformed<-(cbind(transformed,raw))
setnames(transformed,c("index","parameter","transformed","raw"))
ggplot(data.frame(raw,transformed))+geom_line(aes(x=raw,y=transformed))+facet_wrap(~parameter)+theme_minimal()

plot of chunk unnamed-chunk-7

Compensation Matrices

We can extract the compensation matrices from the GatingSet as well, and visualize them.

getCompensationMatrices(G[[1]]) #Compensation for the first GatingHierarchy
## Compensation object 'defaultCompensation':
##                       FITC-A PerCP Cy55 Blue-A Pacific Blue-A  Am Cyan-A
## FITC-A             1.000e+00          0.117360      1.375e-03  9.008e-03
## PerCP Cy55 Blue-A  5.425e-06          1.000000      1.140e-04  3.689e-05
## Pacific Blue-A     5.084e-04          0.004963      1.000e+00  2.455e-02
## Am Cyan-A          4.314e-02          0.064777      4.833e-01  1.000e+00
## APC-A              2.848e-04          0.035541      2.395e-04  1.022e-04
## Alexa 680-A        3.616e-04          0.049498      2.752e-04  1.155e-04
## APC Cy7-A         -7.741e-04         -0.002257     -5.087e-04 -2.697e-04
## PE Green laser-A   2.025e-04          0.144410      8.921e-06  9.473e-06
## PE Tx RD-A         5.456e-04          0.915790      1.920e-04  4.865e-05
## PE Cy7-A           4.590e-04          0.048309      5.165e-05  2.110e-05
##                       APC-A Alexa 680-A APC Cy7-A PE Green laser-A
## FITC-A            0.0012730    0.001554 4.906e-04        0.0289250
## PerCP Cy55 Blue-A 0.0092128    0.390170 2.259e-02        0.0004728
## Pacific Blue-A    0.0012361    0.006692 1.086e-03        0.0014149
## Am Cyan-A         0.0197610    0.032842 2.797e-03        0.0350400
## APC-A             1.0000000    3.122700 1.409e-01        0.0010204
## Alexa 680-A       0.0020035    1.000000 7.197e-02        0.0007192
## APC Cy7-A         0.0150370    0.047991 1.000e+00       -0.0015137
## PE Green laser-A  0.0006007    0.001064 4.363e-05        1.0000000
## PE Tx RD-A        0.0039865    0.008877 4.263e-04        0.2067800
## PE Cy7-A          0.0003989    0.003547 3.853e-02        0.0505200
##                   PE Tx RD-A  PE Cy7-A
## FITC-A             0.0089330 0.0005243
## PerCP Cy55 Blue-A  0.0003445 0.0510200
## Pacific Blue-A     0.0011251 0.0011551
## Am Cyan-A          0.0274790 0.0058747
## APC-A              0.0008323 0.0153530
## Alexa 680-A        0.0008401 0.0152670
## APC Cy7-A         -0.0017165 0.1681300
## PE Green laser-A   0.2480800 0.0059821
## PE Tx RD-A         1.0000000 0.0522140
## PE Cy7-A           0.0141130 1.0000000
ggplot(melt(getCompensationMatrices(G[[1]])@spillover,value.name = "Coefficient"))+geom_tile(aes(x=Var1,y=Var2,fill=Coefficient))+scale_fill_continuous(guide="colourbar")+theme(axis.text.x=element_text(angle=45,hjust=1))

plot of chunk unnamed-chunk-8

Get the names of all manually derived pouplations

The getNodes function will extract the node names (population names) of all populations defined in a GatingSet or a GatingHierarchy.

getNodes(G)
##   [1] "root"                               
##   [2] "/S"                                 
##   [3] "/S/Lv"                              
##   [4] "/S/Lv/L"                            
##   [5] "/S/Lv/L/3+"                         
##   [6] "/S/Lv/L/3+/4+"                      
##   [7] "/S/Lv/L/3+/4+/57+"                  
##   [8] "/S/Lv/L/3+/4+/57-"                  
##   [9] "/S/Lv/L/3+/4+/Perforin+"            
##  [10] "/S/Lv/L/3+/4+/Granzyme B+"          
##  [11] "/S/Lv/L/3+/4+/Granzyme B-"          
##  [12] "/S/Lv/L/3+/4+/GZB+IFN+IL2+TNF+"     
##  [13] "/S/Lv/L/3+/4+/GZB+IFN+IL2+TNF-"     
##  [14] "/S/Lv/L/3+/4+/GZB+IFN+IL2-TNF+"     
##  [15] "/S/Lv/L/3+/4+/GZB+IFN+IL2-TNF-"     
##  [16] "/S/Lv/L/3+/4+/GZB+IFN-IL2+TNF+"     
##  [17] "/S/Lv/L/3+/4+/GZB+IFN-IL2+TNF-"     
##  [18] "/S/Lv/L/3+/4+/GZB+IFN-IL2-TNF+"     
##  [19] "/S/Lv/L/3+/4+/GZB+IFN-IL2-TNF-"     
##  [20] "/S/Lv/L/3+/4+/GZB-IFN+IL2+TNF+"     
##  [21] "/S/Lv/L/3+/4+/GZB-IFN+IL2+TNF-"     
##  [22] "/S/Lv/L/3+/4+/GZB-IFN+IL2-TNF+"     
##  [23] "/S/Lv/L/3+/4+/GZB-IFN+IL2-TNF-"     
##  [24] "/S/Lv/L/3+/4+/GZB-IFN-IL2+TNF+"     
##  [25] "/S/Lv/L/3+/4+/GZB-IFN-IL2+TNF-"     
##  [26] "/S/Lv/L/3+/4+/GZB-IFN-IL2-TNF+"     
##  [27] "/S/Lv/L/3+/4+/GZB-IFN-IL2-TNF-"     
##  [28] "/S/Lv/L/3+/4+/IFN+IL2+"             
##  [29] "/S/Lv/L/3+/4+/IFN+IL2+TNF+"         
##  [30] "/S/Lv/L/3+/4+/IFN+IL2+TNF-"         
##  [31] "/S/Lv/L/3+/4+/IFN+IL2-"             
##  [32] "/S/Lv/L/3+/4+/IFN+IL2-TNF+"         
##  [33] "/S/Lv/L/3+/4+/IFN+IL2-TNF-"         
##  [34] "/S/Lv/L/3+/4+/IFN-IL2+"             
##  [35] "/S/Lv/L/3+/4+/IFN-IL2+TNF+"         
##  [36] "/S/Lv/L/3+/4+/IFN-IL2+TNF-"         
##  [37] "/S/Lv/L/3+/4+/IFN-IL2-"             
##  [38] "/S/Lv/L/3+/4+/IFN-IL2-TNF+"         
##  [39] "/S/Lv/L/3+/4+/IFN-IL2-TNF-"         
##  [40] "/S/Lv/L/3+/4+/IFNg+"                
##  [41] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa+GzB+57+"
##  [42] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa+GzB+57-"
##  [43] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa+GzB-57+"
##  [44] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa+GzB-57-"
##  [45] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa-GzB+57+"
##  [46] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa-GzB+57-"
##  [47] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa-GzB-57+"
##  [48] "/S/Lv/L/3+/4+/IFNg+IL2+TNFa-GzB-57-"
##  [49] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa+GzB+57+"
##  [50] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa+GzB+57-"
##  [51] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa+GzB-57+"
##  [52] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa+GzB-57-"
##  [53] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa-GzB+57+"
##  [54] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa-GzB+57-"
##  [55] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa-GzB-57+"
##  [56] "/S/Lv/L/3+/4+/IFNg+IL2-TNFa-GzB-57-"
##  [57] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa+GzB+57+"
##  [58] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa+GzB+57-"
##  [59] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa+GzB-57+"
##  [60] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa+GzB-57-"
##  [61] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa-GzB+57+"
##  [62] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa-GzB+57-"
##  [63] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa-GzB-57+"
##  [64] "/S/Lv/L/3+/4+/IFNg-IL2+TNFa-GzB-57-"
##  [65] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa+GzB+57+"
##  [66] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa+GzB+57-"
##  [67] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa+GzB-57+"
##  [68] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa+GzB-57-"
##  [69] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa-GzB+57+"
##  [70] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa-GzB+57-"
##  [71] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa-GzB-57+"
##  [72] "/S/Lv/L/3+/4+/IFNg-IL2-TNFa-GzB-57-"
##  [73] "/S/Lv/L/3+/4+/IFN\\IL2"             
##  [74] "/S/Lv/L/3+/4+/IL2+"                 
##  [75] "/S/Lv/L/3+/4+/TNFa+"                
##  [76] "/S/Lv/L/3+/8+"                      
##  [77] "/S/Lv/L/3+/8+/57+"                  
##  [78] "/S/Lv/L/3+/8+/57-"                  
##  [79] "/S/Lv/L/3+/8+/Granzyme B+"          
##  [80] "/S/Lv/L/3+/8+/Granzyme B-"          
##  [81] "/S/Lv/L/3+/8+/GZB+IFN+IL2+TNF+"     
##  [82] "/S/Lv/L/3+/8+/GZB+IFN+IL2+TNF-"     
##  [83] "/S/Lv/L/3+/8+/GZB+IFN+IL2-TNF+"     
##  [84] "/S/Lv/L/3+/8+/GZB+IFN+IL2-TNF-"     
##  [85] "/S/Lv/L/3+/8+/GZB+IFN-IL2+TNF+"     
##  [86] "/S/Lv/L/3+/8+/GZB+IFN-IL2+TNF-"     
##  [87] "/S/Lv/L/3+/8+/GZB+IFN-IL2-TNF+"     
##  [88] "/S/Lv/L/3+/8+/GZB+IFN-IL2-TNF-"     
##  [89] "/S/Lv/L/3+/8+/GZB-IFN+IL2+TNF+"     
##  [90] "/S/Lv/L/3+/8+/GZB-IFN+IL2+TNF-"     
##  [91] "/S/Lv/L/3+/8+/GZB-IFN+IL2-TNF+"     
##  [92] "/S/Lv/L/3+/8+/GZB-IFN+IL2-TNF-"     
##  [93] "/S/Lv/L/3+/8+/GZB-IFN-IL2+TNF+"     
##  [94] "/S/Lv/L/3+/8+/GZB-IFN-IL2+TNF-"     
##  [95] "/S/Lv/L/3+/8+/GZB-IFN-IL2-TNF+"     
##  [96] "/S/Lv/L/3+/8+/GZB-IFN-IL2-TNF-"     
##  [97] "/S/Lv/L/3+/8+/IFN+IL2+"             
##  [98] "/S/Lv/L/3+/8+/IFN+IL2+TNF+"         
##  [99] "/S/Lv/L/3+/8+/IFN+IL2+TNF-"         
## [100] "/S/Lv/L/3+/8+/IFN+IL2-"             
## [101] "/S/Lv/L/3+/8+/IFN+IL2-TNF+"         
## [102] "/S/Lv/L/3+/8+/IFN+IL2-TNF-"         
## [103] "/S/Lv/L/3+/8+/IFN-IL2+"             
## [104] "/S/Lv/L/3+/8+/IFN-IL2+TNF+"         
## [105] "/S/Lv/L/3+/8+/IFN-IL2+TNF-"         
## [106] "/S/Lv/L/3+/8+/IFN-IL2-"             
## [107] "/S/Lv/L/3+/8+/IFN-IL2-TNF+"         
## [108] "/S/Lv/L/3+/8+/IFN-IL2-TNF-"         
## [109] "/S/Lv/L/3+/8+/IFNg+"                
## [110] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa+GzB+57+"
## [111] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa+GzB+57-"
## [112] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa+GzB-57+"
## [113] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa+GzB-57-"
## [114] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa-GzB+57+"
## [115] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa-GzB+57-"
## [116] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa-GzB-57+"
## [117] "/S/Lv/L/3+/8+/IFNg+IL2+TNFa-GzB-57-"
## [118] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa+GzB+57+"
## [119] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa+GzB+57-"
## [120] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa+GzB-57+"
## [121] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa+GzB-57-"
## [122] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa-GzB+57+"
## [123] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa-GzB+57-"
## [124] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa-GzB-57+"
## [125] "/S/Lv/L/3+/8+/IFNg+IL2-TNFa-GzB-57-"
## [126] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa+GzB+57+"
## [127] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa+GzB+57-"
## [128] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa+GzB-57+"
## [129] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa+GzB-57-"
## [130] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa-GzB+57+"
## [131] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa-GzB+57-"
## [132] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa-GzB-57+"
## [133] "/S/Lv/L/3+/8+/IFNg-IL2+TNFa-GzB-57-"
## [134] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa+GzB+57+"
## [135] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa+GzB+57-"
## [136] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa+GzB-57+"
## [137] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa+GzB-57-"
## [138] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa-GzB+57+"
## [139] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa-GzB+57-"
## [140] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa-GzB-57+"
## [141] "/S/Lv/L/3+/8+/IFNg-IL2-TNFa-GzB-57-"
## [142] "/S/Lv/L/3+/8+/IFN\\IL2"             
## [143] "/S/Lv/L/3+/8+/IL2+"                 
## [144] "/S/Lv/L/3+/8+/Perforin+"            
## [145] "/S/Lv/L/3+/8+/TNFa+"                
## [146] "/S/Lv/L/Not 4+"                     
## [147] "/S/Lv/L/Not 4+/IFNg+"

This manually gated data has 147 distinct populations.

Additional arguments to getNodes and getPopStats

The population names can be unwieldy and long. One particularly useful argument is the path= argument. You can set path="auto", which will extract the node name as the shortest unique path name for each cell population.

getNodes(G,path="auto")
##   [1] "root"                     "S"                       
##   [3] "Lv"                       "L"                       
##   [5] "3+"                       "4+"                      
##   [7] "4+/57+"                   "4+/57-"                  
##   [9] "4+/Perforin+"             "4+/Granzyme B+"          
##  [11] "4+/Granzyme B-"           "4+/GZB+IFN+IL2+TNF+"     
##  [13] "4+/GZB+IFN+IL2+TNF-"      "4+/GZB+IFN+IL2-TNF+"     
##  [15] "4+/GZB+IFN+IL2-TNF-"      "4+/GZB+IFN-IL2+TNF+"     
##  [17] "4+/GZB+IFN-IL2+TNF-"      "4+/GZB+IFN-IL2-TNF+"     
##  [19] "4+/GZB+IFN-IL2-TNF-"      "4+/GZB-IFN+IL2+TNF+"     
##  [21] "4+/GZB-IFN+IL2+TNF-"      "4+/GZB-IFN+IL2-TNF+"     
##  [23] "4+/GZB-IFN+IL2-TNF-"      "4+/GZB-IFN-IL2+TNF+"     
##  [25] "4+/GZB-IFN-IL2+TNF-"      "4+/GZB-IFN-IL2-TNF+"     
##  [27] "4+/GZB-IFN-IL2-TNF-"      "4+/IFN+IL2+"             
##  [29] "4+/IFN+IL2+TNF+"          "4+/IFN+IL2+TNF-"         
##  [31] "4+/IFN+IL2-"              "4+/IFN+IL2-TNF+"         
##  [33] "4+/IFN+IL2-TNF-"          "4+/IFN-IL2+"             
##  [35] "4+/IFN-IL2+TNF+"          "4+/IFN-IL2+TNF-"         
##  [37] "4+/IFN-IL2-"              "4+/IFN-IL2-TNF+"         
##  [39] "4+/IFN-IL2-TNF-"          "4+/IFNg+"                
##  [41] "4+/IFNg+IL2+TNFa+GzB+57+" "4+/IFNg+IL2+TNFa+GzB+57-"
##  [43] "4+/IFNg+IL2+TNFa+GzB-57+" "4+/IFNg+IL2+TNFa+GzB-57-"
##  [45] "4+/IFNg+IL2+TNFa-GzB+57+" "4+/IFNg+IL2+TNFa-GzB+57-"
##  [47] "4+/IFNg+IL2+TNFa-GzB-57+" "4+/IFNg+IL2+TNFa-GzB-57-"
##  [49] "4+/IFNg+IL2-TNFa+GzB+57+" "4+/IFNg+IL2-TNFa+GzB+57-"
##  [51] "4+/IFNg+IL2-TNFa+GzB-57+" "4+/IFNg+IL2-TNFa+GzB-57-"
##  [53] "4+/IFNg+IL2-TNFa-GzB+57+" "4+/IFNg+IL2-TNFa-GzB+57-"
##  [55] "4+/IFNg+IL2-TNFa-GzB-57+" "4+/IFNg+IL2-TNFa-GzB-57-"
##  [57] "4+/IFNg-IL2+TNFa+GzB+57+" "4+/IFNg-IL2+TNFa+GzB+57-"
##  [59] "4+/IFNg-IL2+TNFa+GzB-57+" "4+/IFNg-IL2+TNFa+GzB-57-"
##  [61] "4+/IFNg-IL2+TNFa-GzB+57+" "4+/IFNg-IL2+TNFa-GzB+57-"
##  [63] "4+/IFNg-IL2+TNFa-GzB-57+" "4+/IFNg-IL2+TNFa-GzB-57-"
##  [65] "4+/IFNg-IL2-TNFa+GzB+57+" "4+/IFNg-IL2-TNFa+GzB+57-"
##  [67] "4+/IFNg-IL2-TNFa+GzB-57+" "4+/IFNg-IL2-TNFa+GzB-57-"
##  [69] "4+/IFNg-IL2-TNFa-GzB+57+" "4+/IFNg-IL2-TNFa-GzB+57-"
##  [71] "4+/IFNg-IL2-TNFa-GzB-57+" "4+/IFNg-IL2-TNFa-GzB-57-"
##  [73] "4+/IFN\\IL2"              "4+/IL2+"                 
##  [75] "4+/TNFa+"                 "8+"                      
##  [77] "8+/57+"                   "8+/57-"                  
##  [79] "8+/Granzyme B+"           "8+/Granzyme B-"          
##  [81] "8+/GZB+IFN+IL2+TNF+"      "8+/GZB+IFN+IL2+TNF-"     
##  [83] "8+/GZB+IFN+IL2-TNF+"      "8+/GZB+IFN+IL2-TNF-"     
##  [85] "8+/GZB+IFN-IL2+TNF+"      "8+/GZB+IFN-IL2+TNF-"     
##  [87] "8+/GZB+IFN-IL2-TNF+"      "8+/GZB+IFN-IL2-TNF-"     
##  [89] "8+/GZB-IFN+IL2+TNF+"      "8+/GZB-IFN+IL2+TNF-"     
##  [91] "8+/GZB-IFN+IL2-TNF+"      "8+/GZB-IFN+IL2-TNF-"     
##  [93] "8+/GZB-IFN-IL2+TNF+"      "8+/GZB-IFN-IL2+TNF-"     
##  [95] "8+/GZB-IFN-IL2-TNF+"      "8+/GZB-IFN-IL2-TNF-"     
##  [97] "8+/IFN+IL2+"              "8+/IFN+IL2+TNF+"         
##  [99] "8+/IFN+IL2+TNF-"          "8+/IFN+IL2-"             
## [101] "8+/IFN+IL2-TNF+"          "8+/IFN+IL2-TNF-"         
## [103] "8+/IFN-IL2+"              "8+/IFN-IL2+TNF+"         
## [105] "8+/IFN-IL2+TNF-"          "8+/IFN-IL2-"             
## [107] "8+/IFN-IL2-TNF+"          "8+/IFN-IL2-TNF-"         
## [109] "8+/IFNg+"                 "8+/IFNg+IL2+TNFa+GzB+57+"
## [111] "8+/IFNg+IL2+TNFa+GzB+57-" "8+/IFNg+IL2+TNFa+GzB-57+"
## [113] "8+/IFNg+IL2+TNFa+GzB-57-" "8+/IFNg+IL2+TNFa-GzB+57+"
## [115] "8+/IFNg+IL2+TNFa-GzB+57-" "8+/IFNg+IL2+TNFa-GzB-57+"
## [117] "8+/IFNg+IL2+TNFa-GzB-57-" "8+/IFNg+IL2-TNFa+GzB+57+"
## [119] "8+/IFNg+IL2-TNFa+GzB+57-" "8+/IFNg+IL2-TNFa+GzB-57+"
## [121] "8+/IFNg+IL2-TNFa+GzB-57-" "8+/IFNg+IL2-TNFa-GzB+57+"
## [123] "8+/IFNg+IL2-TNFa-GzB+57-" "8+/IFNg+IL2-TNFa-GzB-57+"
## [125] "8+/IFNg+IL2-TNFa-GzB-57-" "8+/IFNg-IL2+TNFa+GzB+57+"
## [127] "8+/IFNg-IL2+TNFa+GzB+57-" "8+/IFNg-IL2+TNFa+GzB-57+"
## [129] "8+/IFNg-IL2+TNFa+GzB-57-" "8+/IFNg-IL2+TNFa-GzB+57+"
## [131] "8+/IFNg-IL2+TNFa-GzB+57-" "8+/IFNg-IL2+TNFa-GzB-57+"
## [133] "8+/IFNg-IL2+TNFa-GzB-57-" "8+/IFNg-IL2-TNFa+GzB+57+"
## [135] "8+/IFNg-IL2-TNFa+GzB+57-" "8+/IFNg-IL2-TNFa+GzB-57+"
## [137] "8+/IFNg-IL2-TNFa+GzB-57-" "8+/IFNg-IL2-TNFa-GzB+57+"
## [139] "8+/IFNg-IL2-TNFa-GzB+57-" "8+/IFNg-IL2-TNFa-GzB-57+"
## [141] "8+/IFNg-IL2-TNFa-GzB-57-" "8+/IFN\\IL2"             
## [143] "8+/IL2+"                  "8+/Perforin+"            
## [145] "8+/TNFa+"                 "Not 4+"                  
## [147] "Not 4+/IFNg+"

Operations that utilize node names can match a unique path to a node, so this gives an easy way to extract the shortest unique name for a cell population.

Extracting cell population statistics

getPopStats will extract cell population statistics, for a GatingSet or a GatingHierarchy. You can specify whether you want the cell counts or the cell frequencies.

freq_gs<-getPopStats(G[1:5],statistic="freq")
head(freq_gs)
##                     769097.fcs 769098.fcs 769099.fcs 769101.fcs 769103.fcs
## 3+                     0.79054    0.79106  0.8159666  7.163e-01  7.793e-01
## 4+                     0.63934    0.63238  0.6581814  6.150e-01  6.393e-01
## 4+/57+                 0.05689    0.05853  0.0489381  2.319e-02  1.078e-02
## 4+/57-                 0.94311    0.94147  0.9510619  9.768e-01  9.892e-01
## 4+/GZB+IFN+IL2+TNF+    0.00000    0.00000  0.0004208  1.506e-03  3.705e-04
## 4+/GZB+IFN+IL2+TNF-    0.00000    0.00000  0.0000000  8.425e-05  4.005e-05
counts_gh<-getPopStats(G[1:5],statistic="count")
head(counts_gh)
##                     769097.fcs 769098.fcs 769099.fcs 769101.fcs 769103.fcs
## 3+                      116492     111962     129990     154397     156216
## 4+                       74478      70802      85557      94950      99869
## 4+/57+                    4237       4144       4187       2202       1077
## 4+/57-                   70241      66658      81370      92748      98792
## 4+/GZB+IFN+IL2+TNF+          0          0         36        143         37
## 4+/GZB+IFN+IL2+TNF-          0          0          0          8          4

The output of getPopStats is a matrix wtih each entry representing the cell counts or cell frequencies (relative to the parent population) for a cell population and a sample.

Plotting

We can plot the gating tree for a sample or a set of samples using plot. To plot a gate use plotGate.

# The gating tree for this gating set
plot(G[[1]],boolean=FALSE)

plot of chunk plot

There is a lone IFNg+ gate floating, unattached to other nodes. If we look more closely at the node names, we see that its parent is “Not 4+”, which is hidden. We can set attributes of a node using setNode.

plot of chunk hiding_nodes

Plotting gates is done using plotGate. The infrastructre will extract the event-level data and plot it directly.

#plot one specific gate from one sample
p1<-plotGate(G[[2]],"4+/TNFa+",arrange=FALSE)
#plot one gate from two samples - automatic faceting
p2<-plotGate(G[2:3],"4+/TNFa+",arrange=FALSE)
#oops! population names must be unique
try(plotGate(G[1:2],"TNFa+"))

#binning using xbin (or not)
p4<-plotGate(G[[2]], "4+/TNFa+",xbin=128,arrange=FALSE)  #smaller bins (default is 32), more detail
# use gridExtra for layout
require(gridExtra)
grid.arrange(p1[[1]],p2,p4[[1]]) # Single plot is a list with element 1 being the plot object. Multiple plots are trellis objects directly.

plot of chunk plotGate

#plot an entire layout (boolean gates are skipped automatically)
p3<-plotGate(G[[2]]) # a little slower since we read all the event-level data. Arrange is TRUE by default
## skipping boolean gates!

plot of chunk plotGate

The plotting functions use package::lattice, see the help on ?plotGate for further options.

Extracting Event-level data: the getData API

Another way to extract event-level data from GatingSets and GatingHierarchy objects is via the getData API.
This API extracts the selected events (e.g. events belonging to a certain population) into an ncdfFlowSet, ncdfFlowFrame, flowSet, or flowFrame. These are the standard flowCore objects for representing collections of events from samples, with each sample representing one or a subset of FCS files.

cd3 <- getData(G[[1]],"3+")
cd3
## flowFrame object '9e2f2b56-b28e-4f1a-8ff2-cbd45407257e'
## with 116492 cells and 16 observables:
##                     name       desc  range minRange maxRange
## $P1                 Time       <NA>   2987    0.000     2987
## $P2                FSC-A       <NA> 262143    0.000   262143
## $P3                FSC-H       <NA> 262143    0.000   262143
## $P4                SSC-A       <NA> 262143    0.000   262143
## $P5             <FITC-A>       TNFa   4098   -1.437     4097
## $P6  <PerCP Cy55 Blue-A>        CD8   4098   -1.437     4097
## $P7     <Pacific Blue-A>       CD57   4098   -1.437     4097
## $P8          <Am Cyan-A>       AViD   4098   -1.437     4097
## $P9              <APC-A>   Perforin   4098   -1.437     4097
## $P10       <Alexa 680-A> Granzyme B   4098   -1.437     4097
## $P11         <APC Cy7-A>        CD4   4098   -1.437     4097
## $P12  <PE Green laser-A>        IL2   4098   -1.437     4097
## $P13        <PE Tx RD-A>        CD3   4098   -1.437     4097
## $P14            PE Cy5-A       <NA>   4098   -1.437     4097
## $P15           PE Cy55-A       <NA>   4098   -1.437     4097
## $P16          <PE Cy7-A>       IFNg   4098   -1.437     4097
## 241 keywords are stored in the 'description' slot

The above extracts the events that are CD3+ from the first sample of the gating set. The returned object is a flowFrame. See ?flowFrame for more information.

To grab multiple samples, pass a GatingSet

cd3_set<-getData(G[1:10],"3+")
cd3_set
## An ncdfFlowSet with 10 samples.
## flowSetId :  
## NCDF file : /home/greg/data/manual/file7e0047d2a0b0.nc 
## An object of class 'AnnotatedDataFrame'
##   rowNames: 769097.fcs 769098.fcs ... 769119.fcs (10 total)
##   varLabels: name
##   varMetadata: labelDescription
## 
##   column names:
##     Time, FSC-A, FSC-H, SSC-A, <FITC-A>, <PerCP Cy55 Blue-A>, <Pacific Blue-A>, <Am Cyan-A>, <APC-A>, <Alexa 680-A>, <APC Cy7-A>, <PE Green laser-A>, <PE Tx RD-A>, PE Cy5-A, PE Cy55-A, <PE Cy7-A>

You’ll note the above is a different object, an ncdfFlowSet in this case, which is a set of flowFrames backed by a NetCDF file.

The samples contained in a netCDF flowSet are tracked by a set of pointers or sample indices, so the same NetCDF file can be associated with multiple R objects, and each can contain subsets of the whole data set.

For this reason, you should be careful about modifying the data on disk. It can have unintended consequences.

cd3_set@file #where the NetCDF file lives
## [1] "/home/greg/data/manual/file7e0047d2a0b0.nc"
flowData(G) #Return the entire ncdfFlowSet associated with a GatingSet
## An ncdfFlowSet with 76 samples.
## flowSetId :  
## NCDF file : /home/greg/data/manual/file7e0047d2a0b0.nc 
## An object of class 'AnnotatedDataFrame'
##   rowNames: 769097.fcs 769098.fcs ... 769283.fcs (76 total)
##   varLabels: name
##   varMetadata: labelDescription
## 
##   column names:
##     Time, FSC-A, FSC-H, SSC-A, <FITC-A>, <PerCP Cy55 Blue-A>, <Pacific Blue-A>, <Am Cyan-A>, <APC-A>, <Alexa 680-A>, <APC Cy7-A>, <PE Green laser-A>, <PE Tx RD-A>, PE Cy5-A, PE Cy55-A, <PE Cy7-A>
flowData(G[1:5]) #Subsettig works, and still points to the same file. 
## An ncdfFlowSet with 5 samples.
## flowSetId :  
## NCDF file : /home/greg/data/manual/file7e0047d2a0b0.nc 
## An object of class 'AnnotatedDataFrame'
##   rowNames: 769097.fcs 769098.fcs ... 769103.fcs (5 total)
##   varLabels: name
##   varMetadata: labelDescription
## 
##   column names:
##     Time, FSC-A, FSC-H, SSC-A, <FITC-A>, <PerCP Cy55 Blue-A>, <Pacific Blue-A>, <Am Cyan-A>, <APC-A>, <Alexa 680-A>, <APC Cy7-A>, <PE Green laser-A>, <PE Tx RD-A>, PE Cy5-A, PE Cy55-A, <PE Cy7-A>

getSingleCellExpression API

We may want to grab the single-cell data for CD8+ T-cells that express IL2 or IFNg.
This is relatively common, so there’s an API for that:

# Get cells that express IL2 or IFNg and are CD8+. 
# map tells us how the node names map to markers on channels.
sce<-getSingleCellExpression(G[1:5],c("8+/IL2+","8+/IFNg+"),map=list("8+/IL2+"="IL2","8+/IFNg+"="IFNg"))
## 769097.fcs
## 769098.fcs
## 769099.fcs
## 769101.fcs
## 769103.fcs
str(sce) # list of 5  n x 2 matrices.
## List of 5
##  $ 769097.fcs: num[0 , 1:2] 
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:2] "IL2" "IFNg"
##  $ 769098.fcs: num [1:2, 1:2] 1899 1878 0 0
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:2] "IL2" "IFNg"
##  $ 769099.fcs: num [1:246, 1:2] 0 0 0 2788 0 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:2] "IL2" "IFNg"
##  $ 769101.fcs: num [1:6605, 1:2] 0 0 0 1883 2368 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:2] "IL2" "IFNg"
##  $ 769103.fcs: num [1:7051, 1:2] 2981 3097 1924 2547 2833 ...
##   ..- attr(*, "dimnames")=List of 2
##   .. ..$ : NULL
##   .. ..$ : chr [1:2] "IL2" "IFNg"

Annotating the samples

Much of the annotation information can be found in the keywords of the FCS files and the FlowJo workspace. We can extract these and annotate a GatingSet. It contains a phenoData slot that can be accessed by pData much like other BioConductor core objects.

Importantly, these annotation can be used to group files for gating, for example by subject, or stimulation and control, batch and so forth.

Other annotations may come from sites like flowrepository.org, and can also be imported from text files.

require(flowIncubator) # the `getKeywords` function is in the `flowIncubator` package at http://www.github.com/RGLab/flowIncubator
## Loading required package: flowIncubator
## Loading required package: QUALIFIER
## Loading required package: reshape
## 
## Attaching package: 'reshape'
## 
## The following objects are masked from 'package:reshape2':
## 
##     colsplit, melt, recast
## 
## The following objects are masked from 'package:plyr':
## 
##     rename, round_any
## 
## 
## Attaching package: 'flowIncubator'
## 
## The following object is masked from 'package:flowWorkspace':
## 
##     getKeywords
names(keyword(G[[1]])) # valid keywords
##   [1] "FCSversion"                   "$BEGINANALYSIS"              
##   [3] "$ENDANALYSIS"                 "$BEGINSTEXT"                 
##   [5] "$ENDSTEXT"                    "$BEGINDATA"                  
##   [7] "$ENDDATA"                     "$FIL"                        
##   [9] "$SYS"                         "$TOT"                        
##  [11] "$PAR"                         "$MODE"                       
##  [13] "$BYTEORD"                     "$DATATYPE"                   
##  [15] "$NEXTDATA"                    "CREATOR"                     
##  [17] "TUBE NAME"                    "$SRC"                        
##  [19] "EXPERIMENT NAME"              "GUID"                        
##  [21] "$DATE"                        "$BTIM"                       
##  [23] "$ETIM"                        "WINDOW EXTENSION"            
##  [25] "EXPORT USER NAME"             "EXPORT TIME"                 
##  [27] "FSC ASF"                      "AUTOBS"                      
##  [29] "$INST"                        "PLATE NAME"                  
##  [31] "WELL ID"                      "PLATE ID"                    
##  [33] "$TIMESTEP"                    "SPILL"                       
##  [35] "APPLY COMPENSATION"           "THRESHOLD"                   
##  [37] "$P1N"                         "$P1R"                        
##  [39] "$P1B"                         "$P1E"                        
##  [41] "$P1G"                         "P1BS"                        
##  [43] "P1MS"                         "$P2N"                        
##  [45] "$P2R"                         "$P2B"                        
##  [47] "$P2E"                         "$P2V"                        
##  [49] "$P2G"                         "P2DISPLAY"                   
##  [51] "P2BS"                         "P2MS"                        
##  [53] "$P3N"                         "$P3R"                        
##  [55] "$P3B"                         "$P3E"                        
##  [57] "$P3V"                         "$P3G"                        
##  [59] "P3DISPLAY"                    "P3BS"                        
##  [61] "P3MS"                         "$P4N"                        
##  [63] "$P4R"                         "$P4B"                        
##  [65] "$P4E"                         "$P4V"                        
##  [67] "$P4G"                         "P4DISPLAY"                   
##  [69] "P4BS"                         "P4MS"                        
##  [71] "$P5N"                         "$P5S"                        
##  [73] "$P5R"                         "$P5B"                        
##  [75] "$P5E"                         "$P5V"                        
##  [77] "$P5G"                         "P5DISPLAY"                   
##  [79] "P5BS"                         "P5MS"                        
##  [81] "$P6N"                         "$P6S"                        
##  [83] "$P6R"                         "$P6B"                        
##  [85] "$P6E"                         "$P6V"                        
##  [87] "$P6G"                         "P6DISPLAY"                   
##  [89] "P6BS"                         "P6MS"                        
##  [91] "$P7N"                         "$P7S"                        
##  [93] "$P7R"                         "$P7B"                        
##  [95] "$P7E"                         "$P7V"                        
##  [97] "$P7G"                         "P7DISPLAY"                   
##  [99] "P7BS"                         "P7MS"                        
## [101] "$P8N"                         "$P8S"                        
## [103] "$P8R"                         "$P8B"                        
## [105] "$P8E"                         "$P8V"                        
## [107] "$P8G"                         "P8DISPLAY"                   
## [109] "P8BS"                         "P8MS"                        
## [111] "$P9N"                         "$P9S"                        
## [113] "$P9R"                         "$P9B"                        
## [115] "$P9E"                         "$P9V"                        
## [117] "$P9G"                         "P9DISPLAY"                   
## [119] "P9BS"                         "P9MS"                        
## [121] "$P10N"                        "$P10S"                       
## [123] "$P10R"                        "$P10B"                       
## [125] "$P10E"                        "$P10V"                       
## [127] "$P10G"                        "P10DISPLAY"                  
## [129] "P10BS"                        "P10MS"                       
## [131] "$P11N"                        "$P11S"                       
## [133] "$P11R"                        "$P11B"                       
## [135] "$P11E"                        "$P11V"                       
## [137] "$P11G"                        "P11DISPLAY"                  
## [139] "P11BS"                        "P11MS"                       
## [141] "$P12N"                        "$P12S"                       
## [143] "$P12R"                        "$P12B"                       
## [145] "$P12E"                        "$P12V"                       
## [147] "$P12G"                        "P12DISPLAY"                  
## [149] "P12BS"                        "P12MS"                       
## [151] "$P13N"                        "$P13S"                       
## [153] "$P13R"                        "$P13B"                       
## [155] "$P13E"                        "$P13V"                       
## [157] "$P13G"                        "P13DISPLAY"                  
## [159] "P13BS"                        "P13MS"                       
## [161] "$P14N"                        "$P14R"                       
## [163] "$P14B"                        "$P14E"                       
## [165] "$P14V"                        "$P14G"                       
## [167] "P14DISPLAY"                   "P14BS"                       
## [169] "P14MS"                        "$P15N"                       
## [171] "$P15R"                        "$P15B"                       
## [173] "$P15E"                        "$P15V"                       
## [175] "$P15G"                        "P15DISPLAY"                  
## [177] "P15BS"                        "P15MS"                       
## [179] "$P16N"                        "$P16S"                       
## [181] "$P16R"                        "$P16B"                       
## [183] "$P16E"                        "$P16V"                       
## [185] "$P16G"                        "P16DISPLAY"                  
## [187] "P16BS"                        "P16MS"                       
## [189] "CST SETUP STATUS"             "CST BEADS LOT ID"            
## [191] "CYTOMETER CONFIG NAME"        "CYTOMETER CONFIG CREATE DATE"
## [193] "CST SETUP DATE"               "CST BASELINE DATE"           
## [195] "Plate"                        "Sample"                      
## [197] "Stim"                         "Replicate"                   
## [199] "Sample Order"                 "Type"                        
## [201] "Comp"                         "ANALYSIS_PLAN_ID"            
## [203] "EXP_ASSAY_ID"                 "STDY_DESC"                   
## [205] "WELLROLE"                     "FH control"                  
## [207] "FILENAME"                     "transformation"              
## [209] "flowCore_$P1Rmax"             "flowCore_$P1Rmin"            
## [211] "flowCore_$P2Rmax"             "flowCore_$P2Rmin"            
## [213] "flowCore_$P3Rmax"             "flowCore_$P3Rmin"            
## [215] "flowCore_$P4Rmax"             "flowCore_$P4Rmin"            
## [217] "flowCore_$P5Rmax"             "flowCore_$P5Rmin"            
## [219] "flowCore_$P6Rmax"             "flowCore_$P6Rmin"            
## [221] "flowCore_$P7Rmax"             "flowCore_$P7Rmin"            
## [223] "flowCore_$P8Rmax"             "flowCore_$P8Rmin"            
## [225] "flowCore_$P9Rmax"             "flowCore_$P9Rmin"            
## [227] "flowCore_$P10Rmax"            "flowCore_$P10Rmin"           
## [229] "flowCore_$P11Rmax"            "flowCore_$P11Rmin"           
## [231] "flowCore_$P12Rmax"            "flowCore_$P12Rmin"           
## [233] "flowCore_$P13Rmax"            "flowCore_$P13Rmin"           
## [235] "flowCore_$P14Rmax"            "flowCore_$P14Rmin"           
## [237] "flowCore_$P15Rmax"            "flowCore_$P15Rmin"           
## [239] "flowCore_$P16Rmax"            "flowCore_$P16Rmin"           
## [241] "ORIGINALGUID"
keyword_vars<-c("$FIL","Stim","Sample Order","EXPERIMENT NAME") #relevant keywords
pd<-data.table(getKeywords(G,keyword_vars)) #extract relevant keywords to a data table

annotations<-data.table:::fread(ANNOTATIONS) # read the annotations from flowrepository
setnames(annotations,"File Name","name")
setkey(annotations,"name")
setkey(pd,"name")

head(pd)
##          name    Stim Sample Order EXPERIMENT NAME
## 1: 769097.fcs negctrl            9      0882-L-080
## 2: 769098.fcs negctrl            9      0882-L-080
## 3: 769099.fcs     CMV            9      0882-L-080
## 4: 769101.fcs sebctrl            1      0882-L-080
## 5: 769103.fcs sebctrl            2      0882-L-080
## 6: 769105.fcs sebctrl            3      0882-L-080
head(annotations)
##          name Condition Timepoint Individual          Sample Description
## 1: 431345.fcs   negctrl         2      080-1 PBMCs from healthy subjects
## 2: 431346.fcs   negctrl         2      080-1 PBMCs from healthy subjects
## 3: 431348.fcs   negctrl         5      080-1 PBMCs from healthy subjects
## 4: 431349.fcs   negctrl         5      080-1 PBMCs from healthy subjects
## 5: 431351.fcs   negctrl         2      080-2 PBMCs from healthy subjects
## 6: 431352.fcs   negctrl         2      080-2 PBMCs from healthy subjects
##              Sample Source Description Sample Source
## 1: [biological] Human peripheral blood  [biological]
## 2: [biological] Human peripheral blood  [biological]
## 3: [biological] Human peripheral blood  [biological]
## 4: [biological] Human peripheral blood  [biological]
## 5: [biological] Human peripheral blood  [biological]
## 6: [biological] Human peripheral blood  [biological]
# We match on the name column, which matches the sampleNames(G). 
pd<-data.frame(annotations[pd]) #data.table style merge
setnames(pd,c("Timepoint","Individual"),c("VISITNO","PTID")) #Rename Timepoint and Indivudual to VISITNO and PTID which are used in the template files
pData(G)<-pd #annotate

OpenCyto Gating

Now that you are more familiar with the basic objects in OpenCyto, we’ll proceed to do some automated gating of the data, and we’ll explore the consequences of modifying different arugments in the gating template.

Cloning an existing GatingSet

We need to clone the existing gating set containing the manually gated data if we want to perform OpenCyto automated gating and save our results.

automated<-clone(G)

Next we need to remove the existing gates. root is the first node.. don’t remove that. Remove the first population below it, which are singlets in this case “S”.

Rm("S",automated) #Rm is the API to remove a node and all nodes beneath it.

automated is now a GatingSet containing only compensated and transformed data, with no gates.

Next we will load the gating template.

gating_template<-gatingTemplate(TEMPLATE)
## Warning: Bumped column 10 to type character on data row 7, field contains
## 'K=2'. Coercing previously read values in this column from integer or
## numeric back to character which may not be lossless; e.g., if '00' and
## '000' occurred before they will now be just '0', and there may be
## inconsistencies with treatment of ',,' and ',NA,' too (if they occurred in
## this column before the bump). If this matters please rerun and set
## 'colClasses' to 'character' for this column. Please note that column type
## detection uses the first 5 rows, the middle 5 rows and the last 5 rows, so
## hopefully this message should be very rare. If reporting to
## datatable-help, please rerun and include the output from verbose=TRUE.
## expanding pop: cd8+/-
## 
## expanding pop: cd57+/-
## 
## expanding pop: cd57+/-
## 
## Adding population:boundary
## Adding population:singlet
## Adding population:viable
## Adding population:nonNeutro
## Adding population:DebrisGate
## Adding population:nonDebris
## Adding population:lymph
## Adding population:cd3
## Adding population:cd8+
## Adding population:cd8-
## Adding population:cd4pos
## Adding population:cd4neg
## Adding population:cd4
## Adding population:cd8
## Adding population:TNFa
## Adding population:TNFa
## Adding population:IFNg
## Adding population:IFNg
## Adding population:IL2
## Adding population:IL2
## Adding population:cd57+
## Adding population:cd57-
## Adding population:cd57+
## Adding population:cd57-
## Adding population:prf_gate
## Adding population:prf_gate
## Adding population:Prf
## Adding population:Prf
## Adding population:gzb_gate
## Adding population:gzb_gate
## Adding population:GzB
## Adding population:GzB

Don’t worry about the warning at the end. Reading the template processes it and performs some expansion of populations like cd8+/- into cd8+ and cd8-.

We can view the gating template with plot

plot(gating_template)
## 
## Attaching package: 'graph'
## 
## The following object is masked from 'package:plyr':
## 
##     join

plot of chunk plot_template

Next we’ll look at the first row of the template, the entry for the boundary filter.

options("warn"=-1)
head(fread(TEMPLATE)[1,])
##       alias      pop parent        dims gating_method        gating_args
## 1: boundary boundary   root FSC-A,SSC-A      boundary max=c(2.5e5,2.5e5)
##    collapseDataForGating groupBy preprocessing_method preprocessing_args
## 1:

The gating_method is listed as boundary, if we look at the list of available gating methods we see it is one of the methods listed:

listgtMethods()
## Gating Functions:
## ===  quadrantGate 
## ===  quantileGate 
## ===  rangeGate 
## ===  flowClust.2d 
## ===  mindensity 
## ===  cytokine 
## ===  flowClust.1d 
## ===  boundary 
## ===  singletGate 
## ===  tailgate 
## ===  quadGate.tmix 
## ===  quadGate.seq 
## Preprocessing Functions:
## ===  prior_flowClust 
## ===  warpSet 
## ===  standardize_flowset

Other parameters are an alias, which is how the population is named in the gating tree, and pop, which tells openCyto whether to keep the events within the gate or outside the gate. In this case, it keeps the events within the gate by default.

  • The parent column specifies the name of the parent population, which is “root”, since this is the first population we are defining.
  • The dims column defines the channels on which this gate is applied. Here, it is “FSC-A,SSC-A”, which are the forward and side scatter dimensions. We can verify this by looking at the flowFrame from a sample.
getData(automated[[1]],use.exprs=FALSE) #use.exprs=FALSE don't retrieve events.. faster.
## flowFrame object '9e2f2b56-b28e-4f1a-8ff2-cbd45407257e'
## with 0 cells and 0 observables:
##                     name       desc  range minRange maxRange
## $P1                 Time       <NA>   2987    0.000     2987
## $P2                FSC-A       <NA> 262143    0.000   262143
## $P3                FSC-H       <NA> 262143    0.000   262143
## $P4                SSC-A       <NA> 262143    0.000   262143
## $P5             <FITC-A>       TNFa   4098   -1.437     4097
## $P6  <PerCP Cy55 Blue-A>        CD8   4098   -1.437     4097
## $P7     <Pacific Blue-A>       CD57   4098   -1.437     4097
## $P8          <Am Cyan-A>       AViD   4098   -1.437     4097
## $P9              <APC-A>   Perforin   4098   -1.437     4097
## $P10       <Alexa 680-A> Granzyme B   4098   -1.437     4097
## $P11         <APC Cy7-A>        CD4   4098   -1.437     4097
## $P12  <PE Green laser-A>        IL2   4098   -1.437     4097
## $P13        <PE Tx RD-A>        CD3   4098   -1.437     4097
## $P14            PE Cy5-A       <NA>   4098   -1.437     4097
## $P15           PE Cy55-A       <NA>   4098   -1.437     4097
## $P16          <PE Cy7-A>       IFNg   4098   -1.437     4097
## 241 keywords are stored in the 'description' slot
  • The gating_args column are additional arguments to the gating_method (boundary). Here we tell the gating method that we want to include events up to a maximum of 2.5e5. Anything outside that range probably corresponds to cellular aggregates.
Why a boundary filter?

We need to clean up the data because subsequent gating steps, which do some modeling, perform better if the data distribution is well behaved, i.e. smooth and doesn’t have large spikes.

How does the entry in the template look?

Let’s take a look at the gating template object

gating_template@nodeData@data[["/boundary"]] # node named "/boundary"
## $pop
## An object of class "gtPopulation"
## Slot "id":
## [1] "/boundary"
## 
## Slot "name":
## [1] "boundary+"
## 
## Slot "alias":
## [1] "boundary"
gating_template@edgeData@data[["root|/boundary"]] #edge from root to the boundary node
## $gtMethod
## boundary(xChannel=FSC-A,yChannel=SSC-A) 
## 
## $isReference
## [1] FALSE
str(gating_template@edgeData@data[["root|/boundary"]]$gtMethod)
## Formal class 'gtMethod' [package "openCyto"] with 5 slots
##   ..@ name    : chr "boundary"
##   ..@ dims    : chr "FSC-A,SSC-A"
##   ..@ args    :List of 1
##   .. ..$ max: language c(250000, 250000)
##   ..@ groupBy : chr ""
##   ..@ collapse: logi FALSE

We see the node has an “id”, a “name”, and an “alias”, these are taken from the csv template. The edge information has an entry “gtMethod”, which contains a gtMethod object that has all the info to run the gating method on a sample. It contains the dimensions to gate, the name of the population, the arguments to the method, and groupBy describing whether the data should be grouped by some phenoData variable, and collapse defining whether the data should be collapsed across multiple samples for gating.

The gating template plot shows us that we have a population named “boundary” gated afte the “root” node using the “boundary” gating method (green arrow).

Reference gates

Reference gates are very useful and important to understand. They provide a way to define a gate and re-use that gate threshold later in the gating process. We’ll show how they’re used here to gate out debris.

Let’s look at the path from “viable” to “nonDebris”. It goes through two additional populations, “nonNeutro”, and “DebrisGate”.

head(fread(TEMPLATE)[3:6,]) # Thre four entries in the csv template.
##         alias         pop    parent  dims gating_method
## 1:     viable     viable-   singlet  AViD    mindensity
## 2:  nonNeutro  nonNeutro-    viable SSC-A    mindensity
## 3: DebrisGate DebrisGate+ nonNeutro FSC-A    mindensity
## 4:  nonDebris  nonDebris+    viable FSC-A       refGate
##                gating_args collapseDataForGating groupBy
## 1:  gate_range=c(500,1000)                              
## 2: gate_range=c(5e4,1.5e5)                              
## 3:   gate_range=c(0,1e+05)                              
## 4:              DebrisGate                              
##    preprocessing_method preprocessing_args
## 1:                                        
## 2:                                        
## 3:                                        
## 4:

These consist of a one-dimensional mindensity gate to exclude dead cells (the gate on the “AViD” parameter which looks for a minimum density cutpoint in the data distribution between 500 and 1000). Then, two minimum density cutpoints on side-scatter and forward scatter, which remove high side-scatter events, and debris, respectively. The gate_range argument to mindensity allows you to specify a range of the data where you expect the separation between the populations to occur. This is why it is important to have well-standardized experiments. If you have lots of instrument variation, the data will have very different distributions from sample to sample, and automated gating approaches will have a very difficult time with such data.

The final gate is a refGate, or “referenceGate”. It is applied to the viable population on the forward scatter channel, and uses the threshold derived from the DebrisGate population. Why not just apply a mindensity gate on forward scatter directly to the live cells? The reason is that high side-scatter events sometimes get in the way of identifying a good cutpoint for debris in the forward scatter channel, so these are removed first, then the resulting cutpoint is used directly on the live cells.

Two dimensional gates for lymphocytes

OpenCyto can use multidimensional mixture modeling tools such as flowClust which was developed for flow cytometry automated gating. We use this most frequently to gate on lymphocytes. The definition of the lymphocyte population, named “lymph” is shown below in the csv.

head(fread(TEMPLATE)[7,]) # The lymph population in the csv template.
##    alias   pop    parent        dims gating_method       gating_args
## 1: lymph lymph nonDebris FSC-A,SSC-A     flowClust K=2,quantile=0.99
##    collapseDataForGating groupBy preprocessing_method preprocessing_args
## 1:                                    prior_flowClust                K=2
  • The dims column lists two dimensions for gating.
  • The pop column lists one population with no + or - sign. This indicates to OpenCyto that this is a multidimensional gate.
  • The gating_method is “flowClust”. A 1d and 2d version of this can be found in the listgtMethods() output, but openCyto chooses the correct one based on context.
  • gating_args include standard arguments to flowClust. In this case we specify K=2 that we want to identify two populations, and quantile=0.99 that we want to include 99% of cells in the gate.
  • preprocessing_method uses prior_flowClust which looks at the data before gating it and estimates a data-driven prior for the two populations. This used to speed up convergence when fitting the model to individual samples.

The output is a single population named “lymph”. By default, openCyto will select the population with highest proportion. However, if that’s not desired, then you can pass a target=c(x,y) argument, where “x” and “y” are the approximate x and y location of the cell population of interest (usually it is fairly consistent across samples in an experiment). OpenCyto will look at the distance to this target of each cell population identified in each sample and return the one that is closest.

CD8 and “shorthand” notation

head(fread(TEMPLATE)[9,]) # The four entries in the csv template.
##    alias    pop parent dims gating_method                   gating_args
## 1:     * cd8+/-    cd3  cd8    mindensity gate_range=c(1500,2800),min=0
##    collapseDataForGating groupBy preprocessing_method preprocessing_args
## 1:

Above, you see the definition of the CD8+/- cell subset. Some things you should note that are different from what we’ve seen so far:

  • There is no alias, but a “*" asterisk character.
  • An “min” argument which prefilters the data before computing the density.
  • A “gate_range” argument that specifies where to look for a cutpoint after computing the density.
  • Most fluorescence parameters from FlowJo are transformed into the range [0,4095], so the gate_range here covers about 31.9489 percent of the data range.
  • OpenCyto performs an “expansion” of the shorthand pop argument “cd8+/-” to define two populations: “cd8+” and “cd8-”, named automatically, thus the empty alias field.

We can see the expanded populations in the gating template graph.

# Two nodes for the single input row "cd8+/-"
gating_template@nodeData@data[["/boundary/singlet/viable/nonDebris/lymph/cd3/cd8+"]]
## $pop
## An object of class "gtPopulation"
## Slot "id":
## [1] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8+"
## 
## Slot "name":
## [1] "cd8+"
## 
## Slot "alias":
## [1] "cd8+"
gating_template@nodeData@data[["/boundary/singlet/viable/nonDebris/lymph/cd3/cd8-"]]
## $pop
## An object of class "gtPopulation"
## Slot "id":
## [1] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8-"
## 
## Slot "name":
## [1] "cd8-"
## 
## Slot "alias":
## [1] "cd8-"

Putting it together to gate CD4 conditional on CD8

Things appear to get complex as we get to the CD4 and CD8 populations. But they’re actually not so bad.

Let’s start by actually running the gating on the first few samples.

#gate the first 5 samples
nodes(gating_template)
##  [1] "root"                                                           
##  [2] "/boundary"                                                      
##  [3] "/boundary/singlet"                                              
##  [4] "/boundary/singlet/viable"                                       
##  [5] "/boundary/singlet/viable/nonNeutro"                             
##  [6] "/boundary/singlet/viable/nonNeutro/DebrisGate"                  
##  [7] "/boundary/singlet/viable/nonDebris"                             
##  [8] "/boundary/singlet/viable/nonDebris/lymph"                       
##  [9] "/boundary/singlet/viable/nonDebris/lymph/cd3"                   
## [10] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8+"              
## [11] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8-"              
## [12] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8-/cd4pos"       
## [13] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8+/cd4neg"       
## [14] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4"               
## [15] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8"               
## [16] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/TNFa"          
## [17] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/TNFa"          
## [18] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/IFNg"          
## [19] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/IFNg"          
## [20] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/IL2"           
## [21] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/IL2"           
## [22] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/cd57+"         
## [23] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/cd57-"         
## [24] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/cd57+"         
## [25] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/cd57-"         
## [26] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/cd57-/prf_gate"
## [27] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/cd57-/prf_gate"
## [28] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/Prf"           
## [29] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/Prf"           
## [30] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/cd57-/gzb_gate"
## [31] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/cd57-/gzb_gate"
## [32] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd4/GzB"           
## [33] "/boundary/singlet/viable/nonDebris/lymph/cd3/cd8/GzB"
gating(gating_template,automated[1:10])
## Loading required package: parallel
## Gating for 'boundary'
## done.
## Gating for 'singlet'
## done.
## Gating for 'viable'
## done.
## Gating for 'nonNeutro'
## done.
## Gating for 'DebrisGate'
## done.
## Population 'nonDebris'
## done.
## Preprocessing for 'lymph'
## Gating for 'lymph'
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## Using the parallel (multicore) version of flowClust with 1 cores
## done.
## Gating for 'cd3'
## done.
## Gating for 'cd8+'
## done.
## Gating for 'cd4neg'
## done.
## Population 'cd8'
## done.
## Gating for 'cd57+'
## done.
## Population 'cd57-'
## done.
## Preprocessing for 'gzb_gate'
## Gating for 'gzb_gate'
## done.
## Population 'GzB'
## done.
## Preprocessing for 'prf_gate'
## Gating for 'prf_gate'
## done.
## Population 'Prf'
## done.
## Preprocessing for 'IL2'
## Gating for 'IL2'
## done.
## Preprocessing for 'IFNg'
## Gating for 'IFNg'
## done.
## Preprocessing for 'TNFa'
## Gating for 'TNFa'
## done.
## Population 'cd8-'
## done.
## Gating for 'cd4pos'
## done.
## Population 'cd4'
## done.
## Gating for 'cd57+'
## done.
## Population 'cd57-'
## done.
## Preprocessing for 'gzb_gate'
## Gating for 'gzb_gate'
## done.
## Population 'GzB'
## done.
## Preprocessing for 'prf_gate'
## Gating for 'prf_gate'
## done.
## Population 'Prf'
## done.
## Preprocessing for 'IL2'
## Gating for 'IL2'
## done.
## Preprocessing for 'IFNg'
## Gating for 'IFNg'
## done.
## Preprocessing for 'TNFa'
## Gating for 'TNFa'
## done.
## finished.

Now let’s plot the cd8-, cd4pos, and cd4 gates.

plotGate(automated[[1]],c("cd8-","cd4pos","cd4"),gpar=list(nrow=1)) #gpar adjusts graphical parameters so we can layout by row.

plot of chunk plot_cd4

Now we see more clearly what is happening. The “cd8-” gate defines the set of CD8- cells (71% in the samples, first panel), which are passed into the “cd4pos” gate, so CD4 is gated conditional on CD8 (second panel). The thresholds from these two gates are composed via the “reference gate” to define CD4+/CD8- T-cells, whose parent is CD3 (third panel).

plot of chunk unnamed-chunk-12

Similar approaches using “refGate” are used to gate Perforin, and Grazyme B in this data set.

Using annotations to help in gating.

We can use the annotations in pData to help OpenCyto group samples for gating. In this data set, which consists of antigen-stimulated T-cells and subject-matched non-stimulated controls, it makes sense to gate data by PTID (subject), and even by VISIT (subject visit).

groupBy

To group the data for gating by certain variables in the pData, add those variables to the groupBy column, using standard R notation for combining levels of factors. In this case, for the “TNFa” gate, we see that the groupBy column has the “PTID:VISIT” variables listed. So samples with unique levels of PTID and VISIT will be combined and passed to the gating routine.

fread(TEMPLATE)[14,]
##    alias   pop parent dims gating_method       gating_args
## 1:  TNFa TNFa+    cd4 TNFa      cytokine adjust=2,tol=2e-3
##    collapseDataForGating      groupBy preprocessing_method
## 1:                       PTID:VISITNO  standardize_flowset
##    preprocessing_args
## 1:

cytokine gate (tailgate)

The cytokines in this data set are gated using a routine called cyokine aka tailgate. These are interchangeable and alias the same routine.
There are two parameters of particular importance:

  • adjust specifies the amount of smoothing when estimating the density.
  • tol is the tolerance within which the first derivative of the density is allowed to appraoch zero (i.e. how flat) before a threshold is drawn.

The cytokine / tailgate is meant for use in a situation where there is one major cell population (perhaps a major negative peak), and very few positive cells, such that they may not register as a peak on a density estimate. The routine calculates the first derivative of the density and looks at the value on the (left or right) side of the density (side="left" or side="right" parameter). When the value is within tol of zero, it places a gate.

In general the routine works better than a fixed quantile for discriminating rare cell populations, and allows for variable levels of background.

Preprocessing via standardize_flowset

The cytokine or tailgate is pared with a preprocessing routine named standardize_flowset. This routine will take all the samples within a group, estimate a set of data transformations that will standarize the samples to each other, collapse the transformed data and pass it along to the gating routine, along with the transformation functions, and the original data.

The gating routine, if it receives the preprocessing results from standardize_flowset, will gate the collapsed data, then back-transform that gate location with the appropriate sample-specific transformation, and use that to gate the original sample-specific data.

The above enables the grouped data to have a common gate threshold on the transformed scale, and on that transformed scale, the data are standardized, thus comparable across samples.

Tailgate without preprocessing via standardize_flowset

If the preprocessing argument is omitted, then gating proceeds immediately using tailgate or cytokine gate. In order to pass grouped and collapsed samples on to the gating routine, you need to specify both the groupBy and collapseDataForGating columns in the template. Then a single collapsed flowSet is passed on to tailgate (without standardization), and each sample will share a common gate threshold on the untransformed scale.

Grouping and collapsing

In general groupBy specifies how the data are grouped before being passed to preprocessing if preprocessing is performed. For example, for flowClust, the groups could specify sets of samples that should be used to estimate a common prior.

The collapseDataForGating columns specifies whether the grouped data should be collapsed into a single flowFrame before being passed on to the gating routine. If this is the case, then a common threshold is estimated for the set of samples.

An example of this is the CD57+/- set of populations.

fread(TEMPLATE)[20,]
##    alias     pop parent dims gating_method                   gating_args
## 1:     * cd57+/-    cd4 cd57    mindensity gate_range=c(1000,3000),min=0
##    collapseDataForGating      groupBy preprocessing_method
## 1:                  TRUE PTID:VISITNO                     
##    preprocessing_args
## 1:

The gate above uses mindensity to find a threshold for the CD57+ vs CD57- populations. It finds a common threshold for unique levels of PTID and VISIT by setting the groupBy column with collapseDataForGating=TRUE. It does no preprocessing.

standardize_flowset is the only exception to the above. In order to get a common threshold for a group of samples on the standarized scale, you must specify a value for groupBy but set collapseDataForGating to FALSE, otherwise an error will be thrown.

Some rules about specifying population names in the pop column

Plus (+) and minus (-) signs are reserved characters. Your channel and marker names must not include + or - signs as these are interpreted as population modifiers that tell OpenCyto whether to keep the positive or negative part of a population (events in our out of the gate).

If your channel names have special characters like these, use marker names in the pop column instead.

Plug-in gating methods

You can plug in your own routines for gating_method and preprocessing_method by using the registerPlugins API.

This looks like the following:

args(registerPlugins)
## function (fun = NA, methodName, dep = NA, ...) 
## NULL
?registerPlugins

The registration routine requires a function fun, a name methodName, and a dep argument that specifies any library dependencies (for example if the gating routine requires functionality from a different package). Finally there are additional arguments that specify whether the routine is a “gating” or “preprocessing” routine.

The fun argument is a function that wraps around a gating routine and gives it a common interface. A gating method should have the following formal arguments:

A preprocessing method should have the following arguments:

Within the wrapper, you can extract the information you need to pass on to your gating or preprocessing routine. As defined by the formal arguments, you have access to the data in fr, the output of any preprocessing in pp_res, the channels via xChannel and yChannel (yChannel is always specified, and xChannel may be NULL), and the filter_id which you can use to name the gate that you return to OpenCyto.
Within the preprocessing you have access to the flowSet of data, the GatingSet which allows you to access other annotations, the gating method object via gm, and the channels to use for gating.

This allows the plugin-writer a considerable amount of flexibility and creativity to process the data as they please. For example, we can envision a plugin that would use FMO controls to set the gate thresholds. The preprocessing could match the xChannel and yChannel with FMO controls from the gating set in gs, and gate those controls and pass the thresholds as part of pp_res on to the gating method.

The gating method would construct the actual gates, or use those thresholds to refine some more complicated gating scheme.

The gating function must return a fiter object, defined in flowCore, or may instantiate its own type of filter. Most commonly, we return a polygonGate.

Finally, wrapper functions must begin with a dot (.) - So if you have a gating method you call “myMethod” in the methodName argument, the function passed to the fun argument should be named .myMethod

If registration succeeds, registerPlugins returns TRUE.

Plugin example - gating a stimulated sample based on a negative control

The following plugin will collapse negative controls and stimulated samples per subject based on variables in the groupBy template column.
It will then use a control column in the phenoData of the flowSet to identify which events belong to the control sample and which belong to the stimualted sample.
The gating routine will subset the control events, define a gate threshold and apply it to all samples in the group, including the stimulated samples.

# The gate method will subset the flowFrame on the "control" events and  gate those, returning a "negative control" gate. 
# This could work equally well with FMOs.
.negGate <- function(fr, pp_res, xChannel=NA, yChannel=NA, filterId="ppgate", ...){ 
   my_gate <- tailgate(fr[pp_res,],channel=yChannel, filter_id=filterId, ...)

   return(my_gate)
}
registerPlugins(fun=.negGate,methodName='negGate',dep=NA)
## Registered negGate
## [1] TRUE
# Identifies which events belong to the negative control sample vs the treatment sample.
# groupBy includes PTID
# Treatment information comes from the pData of the flowSet
# passes a vector of 0/1 for treamtent / control events
.ppnegGate <- function(fs, gs, gm, xChannel, yChannel, groupBy, isCollapse, ...) {
   d <- c()
   for(i in c(1:length(fs))) {
       d = c(d,rep.int(pData(fs[i])$control,nrow(exprs(fs[[i]]))))
   }
   return(as.logical(d))
}
registerPlugins(fun=.ppnegGate, methodName='ppnegGate', dep=NA, "preprocessing")
## Registered ppnegGate
## [1] TRUE