Genes with Zelda motif

Author: Jeff Johnston

Date: 2013-07-19 23:30:06

Background

The transcription factor Zelda binds to “TAGteam” sites and plays a role in zygotic genome activation in Drosophila (Liang et al. 2008).

Overview

In this analysis, we will use Bioconductor (Gentleman et al. 2004) tools to identify genes with multiple TAGteam motifs in their promoters and perform a GO analysis on these genes.
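The motif-counting step can be illustrated with a minimal Python sketch. This is not the Bioconductor code behind the report. It assumes the heptamer CAGGTAG as a stand-in for the TAGteam motif (the report does not state which motif definition was used), assumes that hits on both strands are counted, and uses a made-up promoter sequence.

```python
# Sketch only: the report itself was produced with R/Bioconductor.
# ASSUMPTION: CAGGTAG stands in for the TAGteam motif; the report does not
# say which motif definition or which strands were searched.
MOTIF = "CAGGTAG"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlapping matches."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def count_motif_both_strands(seq, motif=MOTIF):
    """Count motif hits on the forward and reverse strands of seq."""
    seq = seq.upper()
    hits = count_overlapping(seq, motif)
    rc = reverse_complement(motif)
    if rc != motif:  # a palindromic motif would otherwise be counted twice
        hits += count_overlapping(seq, rc)
    return hits


# Hypothetical promoter sequence, invented for illustration.
promoter = "TTCAGGTAGAACTACCTGGG"
print(count_motif_both_strands(promoter))  # one forward hit, one reverse hit
```

Applying such a function to every promoter sequence gives one motif count per transcript.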

Distribution of Zelda motifs in promoters

Because the annotation contains 23017 transcripts for 14869 genes, we will keep a single transcript per gene: the one whose promoter has the highest Zelda motif count.
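The per-gene selection can be sketched as follows. The gene and transcript identifiers and the counts are hypothetical, and the tie-breaking rule (keep the first transcript seen) is an assumption, since the report does not say how ties were resolved.

```python
def best_transcript_per_gene(records):
    """Keep, for each gene, the transcript with the highest motif count.

    records: iterable of (gene_id, transcript_id, motif_count) tuples.
    Ties keep the first transcript encountered (an assumption of this sketch).
    """
    best = {}
    for gene_id, transcript_id, motif_count in records:
        current = best.get(gene_id)
        if current is None or motif_count > current[1]:
            best[gene_id] = (transcript_id, motif_count)
    return best


# Hypothetical identifiers and counts, invented for illustration.
records = [
    ("geneA", "geneA-RA", 1),
    ("geneA", "geneA-RB", 4),
    ("geneB", "geneB-RA", 0),
    ("geneC", "geneC-RA", 2),
    ("geneC", "geneC-RB", 2),
]
per_gene = best_transcript_per_gene(records)
for gene_id, (transcript_id, motif_count) in per_gene.items():
    print(gene_id, transcript_id, motif_count)
```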

The distribution of genes by Zelda motif count is plotted below:

Figure 1.

Zelda_Motif_Count  Number_of_Genes
0                  7024
1                  5075
2                  1992
3                  547
4                  185
5                  31
6                  12
7                  3

GO analysis of genes with multiple Zelda motifs

Based on the above histogram, we will select those genes with at least 4 Zelda motifs and perform GO analysis to identify enriched GO categories using GOstats (Falcon & Gentleman 2006).
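As a quick arithmetic check, the counts tabulated above can be summed to confirm the total number of genes and the size of the selected set. The sketch below only re-uses the numbers from the distribution table; the variable names are ours.

```python
# Number of genes at each Zelda motif count, copied from the table above.
genes_by_motif_count = {0: 7024, 1: 5075, 2: 1992, 3: 547, 4: 185, 5: 31, 6: 12, 7: 3}

MIN_MOTIFS = 4  # threshold used for the GO analysis

total_genes = sum(genes_by_motif_count.values())
selected_genes = sum(
    n for count, n in genes_by_motif_count.items() if count >= MIN_MOTIFS
)

print(total_genes)     # 14869
print(selected_genes)  # 231
```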

Below are the over-represented GO categories among these 231 genes (p < 0.001). P-values are rounded to five decimal places, so 0.00000 denotes p < 0.000005:

GOBPID Pvalue OddsRatio ExpCount Count Size Term
GO:0006334 0.00000 12.40 2.4 22 130 nucleosome assembly
GO:0031497 0.00000 11.43 2.5 22 139 chromatin assembly
GO:0034728 0.00000 11.33 2.6 22 140 nucleosome organization
GO:0006333 0.00000 10.96 2.6 22 144 chromatin assembly or disassembly
GO:0065004 0.00000 10.36 2.8 22 151 protein-DNA complex assembly
GO:0071824 0.00000 9.26 3.0 22 166 protein-DNA complex subunit organization
GO:0006323 0.00000 8.54 3.2 22 178 DNA packaging
GO:0071103 0.00000 8.06 3.4 22 187 DNA conformation change
GO:0034622 0.00000 5.83 5.0 24 274 cellular macromolecular complex assembly
GO:0065003 0.00000 5.25 5.5 24 301 macromolecular complex assembly
GO:0006325 0.00000 5.35 5.1 23 282 chromatin organization
GO:0044260 0.00000 2.37 56.3 92 3088 cellular macromolecule metabolic process
GO:0043933 0.00000 3.84 8.3 27 454 macromolecular complex subunit organization
GO:0051276 0.00000 3.83 7.6 25 418 chromosome organization
GO:0006259 0.00000 3.80 7.0 23 384 DNA metabolic process
GO:0044237 0.00000 2.07 72.9 104 4002 cellular metabolic process
GO:0090304 0.00000 2.30 28.2 53 1547 nucleic acid metabolic process
GO:0043170 0.00000 2.02 69.0 99 3787 macromolecule metabolic process
GO:1901360 0.00000 2.15 35.3 61 1938 organic cyclic compound metabolic process
GO:0046483 0.00000 2.15 34.5 60 1895 heterocycle metabolic process
GO:0006139 0.00000 2.17 32.9 58 1806 nucleobase-containing compound metabolic process
GO:0006725 0.00001 2.12 34.2 59 1878 cellular aromatic compound metabolic process
GO:0044238 0.00002 1.91 80.3 108 4406 primary metabolic process
GO:0034641 0.00002 2.01 35.7 59 1957 cellular nitrogen compound metabolic process
GO:0022607 0.00002 2.68 11.4 27 627 cellular component assembly
GO:0044085 0.00004 2.54 12.5 28 684 cellular component biogenesis
GO:0007540 0.00006 82.18 0.1 3 5 sex determination, establishment of X:A ratio
GO:0006807 0.00007 1.90 40.4 63 2215 nitrogen compound metabolic process
GO:0071704 0.00007 1.82 85.7 111 4701 organic substance metabolic process
GO:0048598 0.00010 3.82 3.8 13 207 embryonic morphogenesis
GO:0007419 0.00026 10.24 0.6 5 32 ventral cord development
GO:0007538 0.00027 15.72 0.3 4 18 primary sex determination
GO:0010565 0.00033 Inf 0.0 2 2 regulation of cellular ketone metabolic process
GO:0016331 0.00060 4.65 1.9 8 104 morphogenesis of embryonic epithelium
GO:0045893 0.00060 3.31 4.0 12 217 positive regulation of transcription, DNA-dependent
GO:0045944 0.00064 3.77 2.9 10 159 positive regulation of transcription from RNA polymerase II promoter
GO:0016348 0.00065 23.47 0.2 3 10 imaginal disc-derived leg joint morphogenesis
GO:0036022 0.00065 23.47 0.2 3 10 limb joint morphogenesis
GO:0031325 0.00072 2.81 5.8 15 319 positive regulation of cellular metabolic process
GO:0010628 0.00077 3.21 4.1 12 223 positive regulation of gene expression
GO:0010604 0.00083 2.88 5.3 14 290 positive regulation of macromolecule metabolic process
GO:0040034 0.00088 20.53 0.2 3 11 regulation of development, heterochronic
GO:0009893 0.00089 2.75 5.9 15 326 positive regulation of metabolic process
GO:0048522 0.00094 2.26 10.7 22 585 positive regulation of cellular process
GO:0051254 0.00097 3.12 4.2 12 229 positive regulation of RNA metabolic process
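For readers unfamiliar with the Pvalue and ExpCount columns, the sketch below shows a plain hypergeometric over-representation test of the kind GOstats is built around. It is a simplified illustration, not a reproduction of the GOstats call used here: the report does not give the gene universe size or say whether the conditional variant of the test was used, so the numbers in the example are toy values.

```python
from math import comb


def expected_count(category_size, n_selected, universe_size):
    """Expected number of selected genes in a category under random sampling."""
    return category_size * n_selected / universe_size


def hypergeom_upper_tail(count, category_size, n_selected, universe_size):
    """P(X >= count) for a hypergeometric draw without replacement.

    universe_size genes in total, category_size of them annotated to the
    category, n_selected genes drawn, count of the drawn genes annotated.
    """
    upper = min(category_size, n_selected)
    favourable = sum(
        comb(category_size, i) * comb(universe_size - category_size, n_selected - i)
        for i in range(count, upper + 1)
    )
    return favourable / comb(universe_size, n_selected)


# Toy numbers, invented for illustration (not taken from the report).
p_value = hypergeom_upper_tail(count=2, category_size=4, n_selected=3, universe_size=10)
print(expected_count(category_size=4, n_selected=3, universe_size=10))  # 1.2
print(round(p_value, 4))  # 0.3333
```

A category is reported as over-represented when this upper-tail probability falls below the chosen cutoff, which is 0.001 in the table above.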

References

Session info

## R version 3.0.1 Patched (2013-07-10 r63263)
## Platform: x86_64-unknown-linux-gnu (64-bit)
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C         LC_TIME=C           
##  [4] LC_COLLATE=C         LC_MONETARY=C        LC_MESSAGES=C       
##  [7] LC_PAPER=C           LC_NAME=C            LC_ADDRESS=C        
## [10] LC_TELEPHONE=C       LC_MEASUREMENT=C     LC_IDENTIFICATION=C 
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] knitcitations_0.4-7                      
##  [2] bibtex_0.3-5                             
##  [3] xtable_1.7-1                             
##  [4] org.Dm.eg.db_2.9.0                       
##  [5] GOstats_2.27.0                           
##  [6] graph_1.39.3                             
##  [7] Category_2.27.2                          
##  [8] GO.db_2.9.0                              
##  [9] RSQLite_0.11.4                           
## [10] DBI_0.2-7                                
## [11] Matrix_1.0-12                            
## [12] lattice_0.20-15                          
## [13] ggplot2_0.9.3.1                          
## [14] plyr_1.8                                 
## [15] BSgenome.Dmelanogaster.UCSC.dm3_1.3.19   
## [16] BSgenome_1.29.0                          
## [17] Biostrings_2.29.13                       
## [18] TxDb.Dmelanogaster.UCSC.dm3.ensGene_2.9.0
## [19] GenomicFeatures_1.13.19                  
## [20] AnnotationDbi_1.23.17                    
## [21] Biobase_2.21.6                           
## [22] GenomicRanges_1.13.34                    
## [23] XVector_0.1.0                            
## [24] IRanges_1.19.19                          
## [25] BiocGenerics_0.7.3                       
## [26] knitr_1.3                                
## [27] BiocInstaller_1.11.3                     
## 
## loaded via a namespace (and not attached):
##  [1] AnnotationForge_1.3.10 GSEABase_1.23.0        MASS_7.3-27           
##  [4] RBGL_1.37.2            RColorBrewer_1.0-5     RCurl_1.95-4.1        
##  [7] Rsamtools_1.13.24      XML_3.98-1.1           annotate_1.39.0       
## [10] biomaRt_2.17.2         bitops_1.0-5           colorspace_1.2-2      
## [13] dichromat_2.0-0        digest_0.6.3           evaluate_0.4.4        
## [16] formatR_0.8            genefilter_1.43.0      grid_3.0.1            
## [19] gtable_0.1.2           httr_0.2               labeling_0.2          
## [22] munsell_0.4.2          proto_0.3-10           reshape2_1.2.2        
## [25] rtracklayer_1.21.8     scales_0.2.3           splines_3.0.1         
## [28] stats4_3.0.1           stringr_0.6.2          survival_2.37-4       
## [31] tools_3.0.1            zlibbioc_1.7.0