Author: Jeff Johnston
Date: 2013-07-19 23:30:06
The transcription factor Zelda was found to bind “TAGteam” sites and to play a role in zygotic genome activation in Drosophila (Liang et al. 2008).
In this analysis, we will use Bioconductor (Gentleman et al. 2004) tools to identify genes with multiple TAGteam motifs in their promoters and perform a Gene Ontology (GO) analysis on these genes.
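As a rough illustration of the motif-counting step, the sketch below counts occurrences of a motif on both strands of each promoter sequence using base R only. It makes several assumptions that are not stated in this report: the motif is taken to be the canonical TAGteam heptamer `CAGGTAG`, the promoter sequences are made-up toy strings, and the helper names (`reverse_complement`, `count_motif`, `count_both_strands`) are hypothetical. The actual analysis most likely relied on Biostrings and BSgenome rather than regular expressions.

```r
# Count motif occurrences on both strands of each promoter sequence.
# Base-R sketch; the motif and the sequences are illustrative assumptions.

reverse_complement <- function(seq) {
  # chartr() swaps each base for its complement; rev() reverses the order
  comp <- chartr("ACGT", "TGCA", seq)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

count_motif <- function(seq, motif) {
  # A zero-width lookahead lets gregexpr() report overlapping matches too
  hits <- gregexpr(paste0("(?=", motif, ")"), seq, perl = TRUE)[[1]]
  if (hits[1] == -1) 0L else length(hits)
}

count_both_strands <- function(seq, motif) {
  rc <- reverse_complement(motif)
  n <- count_motif(seq, motif)
  # Skip the second pass if the motif is its own reverse complement
  if (rc != motif) n <- n + count_motif(seq, rc)
  n
}

# Toy promoter sequences (hypothetical, not taken from dm3)
promoter_seqs <- c(
  tx1 = "AAACAGGTAGTTTCTACCTGAA",  # one forward-strand and one reverse-strand site
  tx2 = "ACGTACGTACGT"             # no sites
)

motif_counts <- vapply(promoter_seqs, count_both_strands, integer(1),
                       motif = "CAGGTAG")
```

Counting both strands is a design choice of this sketch: a site on the reverse strand appears in the forward sequence as the reverse complement of the motif.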
Because we have 23017 transcripts but only 14869 genes, we will represent each gene by a single transcript: the one with the highest Zelda motif count.
The distribution of genes by Zelda motif count is shown below:
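One way to pick a single transcript per gene is to sort by gene and descending motif count, then keep the first row for each gene. The sketch below uses a small made-up data frame. The column names (`tx_id`, `gene_id`, `zelda_motifs`) and all IDs are hypothetical, and ties are resolved by original row order, which may differ from how this report handled them.

```r
# Hypothetical per-transcript motif counts (all IDs are made up)
tx_counts <- data.frame(
  tx_id        = c("txA1", "txA2", "txB1", "txC1", "txC2"),
  gene_id      = c("geneA", "geneA", "geneB", "geneC", "geneC"),
  zelda_motifs = c(1L, 3L, 0L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Sort so the highest count comes first within each gene; order() leaves
# unresolved ties in their original row order
tx_sorted <- tx_counts[order(tx_counts$gene_id, -tx_counts$zelda_motifs), ]

# Keep only the first (highest-count) transcript of each gene
best_tx <- tx_sorted[!duplicated(tx_sorted$gene_id), ]
```

After this step `best_tx` has exactly one row per gene, so its `zelda_motifs` column can be tabulated directly to obtain a per-gene distribution.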
Zelda_Motif_Count | Number_of_Genes |
---|---|
0 | 7024 |
1 | 5075 |
2 | 1992 |
3 | 547 |
4 | 185 |
5 | 31 |
6 | 12 |
7 | 3 |
Based on the above distribution, we will select those genes with at least 4 Zelda motifs and perform GO analysis to identify enriched GO categories using GOstats (Falcon & Gentleman 2006).
Below are the over-represented GO categories among these 231 genes (p < 0.001). P-values are rounded to five decimal places, so entries displayed as 0.00000 are smaller than 0.000005:
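The size of the selected gene set follows directly from the distribution table. The short check below re-enters the table by hand (the data frame and its column names are just a convenience for this sketch) and sums the rows at or above the threshold.

```r
# Distribution copied from the table above
motif_dist <- data.frame(
  zelda_motif_count = 0:7,
  number_of_genes   = c(7024, 5075, 1992, 547, 185, 31, 12, 3)
)

# All genes, and those with at least 4 Zelda motifs
total_genes    <- sum(motif_dist$number_of_genes)
selected_genes <- sum(motif_dist$number_of_genes[motif_dist$zelda_motif_count >= 4])

# Share of genes that pass the threshold
fraction_selected <- selected_genes / total_genes
```

The totals come to 14869 genes overall and 185 + 31 + 12 + 3 = 231 selected genes, or roughly 1.6% of all genes.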
GOBPID | Pvalue | OddsRatio | ExpCount | Count | Size | Term |
---|---|---|---|---|---|---|
GO:0006334 | 0.00000 | 12.40 | 2.4 | 22 | 130 | nucleosome assembly |
GO:0031497 | 0.00000 | 11.43 | 2.5 | 22 | 139 | chromatin assembly |
GO:0034728 | 0.00000 | 11.33 | 2.6 | 22 | 140 | nucleosome organization |
GO:0006333 | 0.00000 | 10.96 | 2.6 | 22 | 144 | chromatin assembly or disassembly |
GO:0065004 | 0.00000 | 10.36 | 2.8 | 22 | 151 | protein-DNA complex assembly |
GO:0071824 | 0.00000 | 9.26 | 3.0 | 22 | 166 | protein-DNA complex subunit organization |
GO:0006323 | 0.00000 | 8.54 | 3.2 | 22 | 178 | DNA packaging |
GO:0071103 | 0.00000 | 8.06 | 3.4 | 22 | 187 | DNA conformation change |
GO:0034622 | 0.00000 | 5.83 | 5.0 | 24 | 274 | cellular macromolecular complex assembly |
GO:0065003 | 0.00000 | 5.25 | 5.5 | 24 | 301 | macromolecular complex assembly |
GO:0006325 | 0.00000 | 5.35 | 5.1 | 23 | 282 | chromatin organization |
GO:0044260 | 0.00000 | 2.37 | 56.3 | 92 | 3088 | cellular macromolecule metabolic process |
GO:0043933 | 0.00000 | 3.84 | 8.3 | 27 | 454 | macromolecular complex subunit organization |
GO:0051276 | 0.00000 | 3.83 | 7.6 | 25 | 418 | chromosome organization |
GO:0006259 | 0.00000 | 3.80 | 7.0 | 23 | 384 | DNA metabolic process |
GO:0044237 | 0.00000 | 2.07 | 72.9 | 104 | 4002 | cellular metabolic process |
GO:0090304 | 0.00000 | 2.30 | 28.2 | 53 | 1547 | nucleic acid metabolic process |
GO:0043170 | 0.00000 | 2.02 | 69.0 | 99 | 3787 | macromolecule metabolic process |
GO:1901360 | 0.00000 | 2.15 | 35.3 | 61 | 1938 | organic cyclic compound metabolic process |
GO:0046483 | 0.00000 | 2.15 | 34.5 | 60 | 1895 | heterocycle metabolic process |
GO:0006139 | 0.00000 | 2.17 | 32.9 | 58 | 1806 | nucleobase-containing compound metabolic process |
GO:0006725 | 0.00001 | 2.12 | 34.2 | 59 | 1878 | cellular aromatic compound metabolic process |
GO:0044238 | 0.00002 | 1.91 | 80.3 | 108 | 4406 | primary metabolic process |
GO:0034641 | 0.00002 | 2.01 | 35.7 | 59 | 1957 | cellular nitrogen compound metabolic process |
GO:0022607 | 0.00002 | 2.68 | 11.4 | 27 | 627 | cellular component assembly |
GO:0044085 | 0.00004 | 2.54 | 12.5 | 28 | 684 | cellular component biogenesis |
GO:0007540 | 0.00006 | 82.18 | 0.1 | 3 | 5 | sex determination, establishment of X:A ratio |
GO:0006807 | 0.00007 | 1.90 | 40.4 | 63 | 2215 | nitrogen compound metabolic process |
GO:0071704 | 0.00007 | 1.82 | 85.7 | 111 | 4701 | organic substance metabolic process |
GO:0048598 | 0.00010 | 3.82 | 3.8 | 13 | 207 | embryonic morphogenesis |
GO:0007419 | 0.00026 | 10.24 | 0.6 | 5 | 32 | ventral cord development |
GO:0007538 | 0.00027 | 15.72 | 0.3 | 4 | 18 | primary sex determination |
GO:0010565 | 0.00033 | Inf | 0.0 | 2 | 2 | regulation of cellular ketone metabolic process |
GO:0016331 | 0.00060 | 4.65 | 1.9 | 8 | 104 | morphogenesis of embryonic epithelium |
GO:0045893 | 0.00060 | 3.31 | 4.0 | 12 | 217 | positive regulation of transcription, DNA-dependent |
GO:0045944 | 0.00064 | 3.77 | 2.9 | 10 | 159 | positive regulation of transcription from RNA polymerase II promoter |
GO:0016348 | 0.00065 | 23.47 | 0.2 | 3 | 10 | imaginal disc-derived leg joint morphogenesis |
GO:0036022 | 0.00065 | 23.47 | 0.2 | 3 | 10 | limb joint morphogenesis |
GO:0031325 | 0.00072 | 2.81 | 5.8 | 15 | 319 | positive regulation of cellular metabolic process |
GO:0010628 | 0.00077 | 3.21 | 4.1 | 12 | 223 | positive regulation of gene expression |
GO:0010604 | 0.00083 | 2.88 | 5.3 | 14 | 290 | positive regulation of macromolecule metabolic process |
GO:0040034 | 0.00088 | 20.53 | 0.2 | 3 | 11 | regulation of development, heterochronic |
GO:0009893 | 0.00089 | 2.75 | 5.9 | 15 | 326 | positive regulation of metabolic process |
GO:0048522 | 0.00094 | 2.26 | 10.7 | 22 | 585 | positive regulation of cellular process |
GO:0051254 | 0.00097 | 3.12 | 4.2 | 12 | 229 | positive regulation of RNA metabolic process |
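To make the columns of this table easier to interpret, the sketch below computes a one-sided hypergeometric P-value, an expected count, and a sample odds ratio for a single GO term in base R. It is an approximation of what a GOstats-style test reports, not the package's own code: the function name `go_overrep` and its arguments are hypothetical, the example numbers are invented, and the gene universe and annotated set size used in this report are not given here, so the values in the table cannot be reproduced from it.

```r
# One-sided hypergeometric test for over-representation of one GO term.
#   count      selected genes annotated to the term
#   size       universe genes annotated to the term
#   n_selected number of selected genes
#   n_universe number of genes in the universe
go_overrep <- function(count, size, n_selected, n_universe) {
  # P(X >= count) when n_selected genes are drawn without replacement
  p <- phyper(count - 1, size, n_universe - size, n_selected,
              lower.tail = FALSE)
  # Count expected by chance alone
  expected <- n_selected * size / n_universe
  # Sample odds ratio of the 2x2 table (selected or not) x (annotated or not)
  odds <- (count * (n_universe - size - n_selected + count)) /
    ((size - count) * (n_selected - count))
  c(Pvalue = p, OddsRatio = odds, ExpCount = expected)
}

# Hypothetical numbers, not taken from the analysis above
res <- go_overrep(count = 8, size = 40, n_selected = 100, n_universe = 5000)

# If every annotated gene is selected, size - count is 0 and the ratio is Inf
res_all <- go_overrep(count = 2, size = 2, n_selected = 100, n_universe = 5000)
```

Under this definition, an odds ratio of `Inf` such as the one for GO:0010565 (Count 2, Size 2) arises whenever all genes annotated to a term fall in the selected set. Terms with very small `Size` can therefore reach low P-values on only two or three genes and are worth interpreting with some caution.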
## R version 3.0.1 Patched (2013-07-10 r63263)
## Platform: x86_64-unknown-linux-gnu (64-bit)
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=C
## [4] LC_COLLATE=C LC_MONETARY=C LC_MESSAGES=C
## [7] LC_PAPER=C LC_NAME=C LC_ADDRESS=C
## [10] LC_TELEPHONE=C LC_MEASUREMENT=C LC_IDENTIFICATION=C
##
## attached base packages:
## [1] parallel stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] knitcitations_0.4-7
## [2] bibtex_0.3-5
## [3] xtable_1.7-1
## [4] org.Dm.eg.db_2.9.0
## [5] GOstats_2.27.0
## [6] graph_1.39.3
## [7] Category_2.27.2
## [8] GO.db_2.9.0
## [9] RSQLite_0.11.4
## [10] DBI_0.2-7
## [11] Matrix_1.0-12
## [12] lattice_0.20-15
## [13] ggplot2_0.9.3.1
## [14] plyr_1.8
## [15] BSgenome.Dmelanogaster.UCSC.dm3_1.3.19
## [16] BSgenome_1.29.0
## [17] Biostrings_2.29.13
## [18] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.9.0
## [19] GenomicFeatures_1.13.19
## [20] AnnotationDbi_1.23.17
## [21] Biobase_2.21.6
## [22] GenomicRanges_1.13.34
## [23] XVector_0.1.0
## [24] IRanges_1.19.19
## [25] BiocGenerics_0.7.3
## [26] knitr_1.3
## [27] BiocInstaller_1.11.3
##
## loaded via a namespace (and not attached):
## [1] AnnotationForge_1.3.10 GSEABase_1.23.0 MASS_7.3-27
## [4] RBGL_1.37.2 RColorBrewer_1.0-5 RCurl_1.95-4.1
## [7] Rsamtools_1.13.24 XML_3.98-1.1 annotate_1.39.0
## [10] biomaRt_2.17.2 bitops_1.0-5 colorspace_1.2-2
## [13] dichromat_2.0-0 digest_0.6.3 evaluate_0.4.4
## [16] formatR_0.8 genefilter_1.43.0 grid_3.0.1
## [19] gtable_0.1.2 httr_0.2 labeling_0.2
## [22] munsell_0.4.2 proto_0.3-10 reshape2_1.2.2
## [25] rtracklayer_1.21.8 scales_0.2.3 splines_3.0.1
## [28] stats4_3.0.1 stringr_0.6.2 survival_2.37-4
## [31] tools_3.0.1 zlibbioc_1.7.0