Contents

This document shows typical Topconfects usage with limma, edgeR, or DESeq2.

The first step is to load a dataset. Here, we’re looking at RNA-seq data that investigates the response of Arabodopsis thaliana to a bacterial pathogen. Besides the experimental and control conditions, there is also a batch effect. This dataset is also examined in section 4.2 of the edgeR user manual, and I’ve followed the initial filtering steps in the edgeR manual.

library(topconfects)

library(NBPSeq)
library(edgeR)
library(limma)

library(dplyr)
library(ggplot2)

data(arab)

# Retrieve symbol for each gene
info <- 
    AnnotationDbi::select(
        org.At.tair.db::org.At.tair.db, 
        rownames(arab), "SYMBOL") %>%
    group_by(TAIR) %>% 
    summarize(
        SYMBOL=paste(unique(na.omit(SYMBOL)),collapse="/"))
arab_info <- 
    info[match(rownames(arab),info$TAIR),] %>% 
    select(-TAIR) %>%
    as.data.frame
rownames(arab_info) <- rownames(arab)

# Extract experimental design from sample names
Treat <- factor(substring(colnames(arab),1,4)) %>% relevel(ref="mock")
Time <- factor(substring(colnames(arab),5,5))

y <- DGEList(arab, genes=as.data.frame(arab_info))

# Keep genes with at least 3 samples having an RPM of more than 2
keep <- rowSums(cpm(y)>2) >= 3
y <- y[keep,,keep.lib.sizes=FALSE]
y <- calcNormFactors(y)

1 limma analysis

1.1 Standard limma analysis steps

We use the voom-limma method.

design <- model.matrix(~Time+Treat)
design[,]
##   (Intercept) Time2 Time3 Treathrcc
## 1           1     0     0         0
## 2           1     1     0         0
## 3           1     0     1         0
## 4           1     0     0         1
## 5           1     1     0         1
## 6           1     0     1         1
fit <-
    voom(y, design) %>%
    lmFit(design)

1.2 Apply topconfects

Find largest fold changes that we can be confident of at FDR 0.05.

confects <- limma_confects(fit, coef="Treathrcc", fdr=0.05)

confects
## $table
##    confect effect AveExpr name      SYMBOL       
## 1  3.090   4.849  6.567   AT3G46280              
## 2  2.895   4.331  9.155   AT4G12500              
## 3  2.895   4.493  6.053   AT2G19190 FRK1/SIRK    
## 4  2.608   3.839  9.166   AT4G12490 AZI3         
## 5  2.608   3.952  6.615   AT1G51800 IOS1         
## 6  2.608   4.299  5.440   AT2G39530 CASPL4D1     
## 7  2.606   3.908  7.899   AT2G39200 ATMLO12/MLO12
## 8  2.606   3.710  8.729   AT5G64120 AtPRX71/PRX71
## 9  2.595   5.346  1.807   AT5G31702              
## 10 2.563   4.888  2.383   AT2G08986              
## ...
## 2184 of 16526 non-zero log2 fold change at FDR 0.05
## Prior df 18.2

Note: If not using voom or a similar precision weighting method, it may be important to use limma_confects(..., trend=TRUE) to account for a mean-variance trend, similar to how you would call limma::treat(..., trend=TRUE). You can check the need for this with limma::plotSA(fit).

1.3 Looking at the result

Here the usual logFC values estimated by limma are shown as dots, with lines to the confect value.

confects_plot(confects)

confects_plot_me2 produces a Mean-Difference Plot (as might be produced by limma::plotMD), and colors points based on thresholds for the confect values. Effect sizes confidently “> 0” correspond to traditional differential expression testing, and effect sizes confidently greater than larger values correspond to the TREAT test at that threshold. I have specified the breaks here explicitly, to aid comparison with other methods in similar plots below.

confects_plot_me2(confects, breaks=c(0,1,2))