The method in this package is an extension of the classic Nested Effects Models provided in the package . Nested Effects Models is a pathway reconstruction method, which takes into account effects of downstream genes. Those effects are observed for every knock-out of an upstream pathway gene, and the nested structure of observed effects can then be used to reconstruct the pathway structure. However, classic Nested Effects Models do not account for double knock-outs. In this package , one additional layer of complexity is added. For every two genes, acting on one gene together, the relationship is evaluated and added to the model as a logical gate. Genetic relationships are represented by the logics OR (no relationship), AND (functional overlap), NOT (masking or inhibiting) and XOR (mutual prevention from acting on gene C). Please see the references for a more detailed description of NEMs and epiNEMs.
Simulation and application results are imported from pre-calculated data sets to shorten the runtime of this vignette.
library(epiNEM)
The data should be in the form of binary effects stemming from knock-out data including single and double perturbations with effect reporters, e.g. genes, as rows and perturbations as columns. A one in row i and column j denotes an effect of reporter i for perturbation j. These effects are usually derived from comparing control/wild-type experiments with respective perturbation experiments (e.g. differential expression with edgeR, Robinson et al., 2010). The binary data is the main input for the epiNEM function, which contains the inference algorithm to estimate the underlying network structure of the data. For up to four signalling genes (=different perturbation targets) exhaustive search is available. For five or more a greedy search is implemented as an alternative.
data <- matrix(sample(c(0,1), 100*4, replace = TRUE), 100, 4)
colnames(data) <- c("A", "A.B", "B", "C")
rownames(data) <- paste("E", 1:100, sep = "_")
print(head(data))
## A A.B B C
## E_1 1 1 1 0
## E_2 1 1 1 1
## E_3 1 1 1 0
## E_4 1 1 1 1
## E_5 0 1 0 1
## E_6 0 0 1 0
res <- epiNEM(data, method = "exhaustive")
plot(res)
The plot shows the inferred network with the signaling genes (bright red), the inferred logic (green) and attachment of the number of effect reporters (grey).
Alternatively the input can be a larger matrix with single and double knock-outs. EpiNEM can perform a systematic analysis to identify most likely modulators for a signaling gene pair of a double knock-out.
data <- matrix(sample(c(0,1), 100*9, replace = TRUE), 100, 9)
colnames(data) <- c("A.B", "A.C", "B.C", "A", "B", "C", "D", "E", "G")
rownames(data) <- paste("E", 1:100, sep = "_")
res <- epiScreen(data)
## [1] "A.B"
## [1] "A"
## [1] "B"
## [1] "C"
## [1] "D"
## [1] "E"
## [1] "G"
## [1] "A.C"
## [1] "A"
## [1] "B"
## [1] "C"
## [1] "D"
## [1] "E"
## [1] "G"
## [1] "B.C"
## [1] "A"
## [1] "B"
## [1] "C"
## [1] "D"
## [1] "E"
## [1] "G"
plot(res)
plot(res, global = FALSE, ind = 1)
If the “global” parameter is set to FALSE, detailed results are plotted for each or a specific (“ind” parameter) pair of a double knock-out.
The results (logics) of the knock-out screens have been annotated according to the following legend of effects (black), where the x-axis denotes the gene and the y-axis denotes the type of knock-out. E.g. if C is regulated via the OR gate, it is effected by by both single and the double knock-out.
epiAnno()
We compare epiNEM to original NEMs (Markowetz et al., 2005), Boolean Nested Effects Models (B-NEM, link, Pirkl et al., 2016), Aracne (Margolin et al., 2006) and the PC algorithm (Kalisch et al., 2007) for 100 four node networks and respective data experiencing different levels of noise. The evaluation is done by accounting for the accuracy of discovered edges and in the case of B-NEM and epiNEM also the truth tables and logical gates.
We simulate code based on a ground truth and infer an optimal network with all aforementioned methods solely based on the data. The B-NEM package is necessary to run the full simulation: link. If the comparison to B-NEM is omitted, the B-NEM package is not necessary for epiNEM to function.
B-NEM is a method for general Boolean network inference based on multivariate read-out from combinatorial perturbation experiments. EpiNEM is specifically designed for knock-out screens including double and single knock-outs. Thus epiNEM is faster especially for large scale screens and more accurate, because it is designed for triples.
We boxplot running time, accuracy for the inferred edges, logical gates and expected data (truth table) for all five respectively two methods.
data(sim)
plot(sim)