Compared with the wealth of information accumulated on gene expression, our understanding of gene regulation in metazoans remains limited. In this study, we report the first genome-wide mapping of DNase I hypersensitive sites in the multicellular model organism Caenorhabditis elegans using a high-resolution tiling microarray. As in the DNase-chip method developed by Crawford et al., DNA fragments flanking DNase I cleavage sites were captured by ligation to biotinylated adapters and amplified by PCR. Because replicating DNA forks are susceptible to DNase I digestion, Crawford et al. used non-replicating CD4+ T cells to reduce background. Here, we treated synchronized young adult hermaphrodite worms with floxuridine (FUdR) to block cell division, thereby reducing the background from DNA replication [6]. In this study, we identify only DHSs that are common to the mixture of all cell types at the young adult stage. However, different cell types within the worm could have drastically different gene expression and chromatin profiles. Subsequent studies of DHS profiles from primary tissues and at various developmental stages should therefore further increase our understanding of the dynamic regulation of expression at the level of chromatin structure in the nematode.
Consistent with previous studies of regulatory elements in the human genome [20], DHSs were found throughout the C. elegans genome. About one half of the DHSs map to intergenic regions, and two thirds of these intergenic DHSs lie within the proximal upstream or downstream regions of coding genes. In a recent study of human transcriptional promoters and enhancers, approximately 70% of putative distant regulatory elements detected by ChIP-on-chip assays in HeLa cells overlapped with DHSs [22]. We found DHSs within eight of the known C. elegans coding gene promoter regions identified by high-throughput yeast one-hybrid (Y1H) assays [23]. For example, Figure shows a DHS located within the promoter region of a gene expressed at the adult stage (T10B11.3), which is a member of the zinc finger transcription factor family [8]. We also found that one third of the intergenic DHSs map to regions more than 2 kb away from coding genes, suggesting that these may be candidate long-distance regulatory elements. However, because a considerable fraction of the intergenic DHSs lie near putative ncRNA loci, these DHSs may instead be regulatory elements targeting not yet identified non-coding genes.
An example of a DHS located within the known promoter region of a coding gene expressed at the adult stage.
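The split of intergenic DHSs into those proximal to coding genes and those more than 2 kb away can be illustrated with a small sketch. This is not the pipeline used in the study; the function names, the half-open coordinate convention, and the toy coordinates are assumptions made for illustration, and only the 2 kb threshold comes from the text.

```python
def overlaps(a, b):
    """True if two half-open intervals (start, end) share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def gap(a, b):
    """Distance in bp between two non-overlapping half-open intervals."""
    return max(b[0] - a[1], a[0] - b[1], 0)


def classify_dhs(dhs, genes, proximal_bp=2000):
    """Label a DHS as 'genic', 'proximal' or 'distal'.

    dhs   -- (start, end) of the hypersensitive site
    genes -- list of (start, end) coding genes on the same chromosome
    proximal_bp -- threshold taken from the text (2 kb); sites farther
                   than this from every gene are called 'distal'
    """
    if any(overlaps(dhs, g) for g in genes):
        return "genic"
    nearest = min(gap(dhs, g) for g in genes)
    return "proximal" if nearest <= proximal_bp else "distal"


# Hypothetical coordinates, for illustration only.
genes = [(1000, 2000), (10000, 12000)]
labels = [classify_dhs(d, genes) for d in [(1500, 1600), (2500, 2700), (5000, 5200)]]
```

A linear scan over genes is used for readability; for genome-scale data, a sorted gene list with binary search or an interval tree would be the more usual choice.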
We found that intronic DHSs are significantly less frequent than would be expected from the amount of genomic sequence occupied by introns, but about one fourth of all genic DHSs are nonetheless located in introns. A reasonable expectation is that these elements have regulatory activity targeting the host gene [24]. On the other hand, long-range regulatory elements can also lie in the introns of very distant genes; for example, the enhancer of the SHH gene was found within an intron of a gene located one Mb away in the human genome [26]. In addition, regulatory elements of non-coding RNAs have been reported in introns [14], and analysis of the genomic distribution of DHSs with respect to non-coding RNA loci showed that one third of the intronic DHSs surround known or putative small ncRNA loci [14].
Consistent with previous studies of the human genome, DHSs in the C. elegans genome were enriched in first exons, which are considered part of the core promoter [4]. In contrast, a considerable and significantly enriched proportion of the DHSs was also found in internal exons of the C. elegans genome. Such DHSs have been suggested to play a role in alternative splicing of the host gene [25], but could also be transcription factor binding sites that regulate the host gene [24]. In contrast to intergenic and intronic DHSs, only a small fraction (10%) of the exonic DHSs is located near non-coding RNAs, including 27 internal exonic DHSs near known or putative small ncRNA loci. For example, a DHS located in the second exon of a gene (C27H5.1) lies less than 50 bp downstream of a snoRNA locus (DQ789560.1) and less than 240 bp upstream of another snoRNA locus (CeN63) [14] (Figure ).
An example of a DHS located in an exon between two intronic snoRNAs.
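The statements above that intronic DHSs are depleted and exonic DHSs are enriched compare an observed count with the count expected from the fraction of the genome that each feature class occupies. The sketch below shows one common way to make such a comparison, a normal approximation to a binomial null. The text does not state which test was used, so the choice of test, the function names, and the numbers in the usage lines are all assumptions.

```python
import math


def enrichment_z(observed, total, genome_fraction):
    """Z-score of an observed DHS count under a binomial null.

    observed        -- number of DHSs falling in the feature class
    total           -- total number of DHSs
    genome_fraction -- fraction of the genome occupied by the feature class
    Negative values indicate depletion, positive values enrichment.
    """
    expected = total * genome_fraction
    sd = math.sqrt(total * genome_fraction * (1.0 - genome_fraction))
    return (observed - expected) / sd


def two_sided_p(z):
    """Two-sided p-value for a z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical counts, not taken from the study.
z = enrichment_z(observed=10, total=100, genome_fraction=0.5)
p = two_sided_p(z)
```

The normal approximation is adequate when both the expected count and its complement are reasonably large; for small counts an exact binomial or permutation-based test would be preferable.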
DHSs were also located within or close to pseudogenes. These DHSs could be regulatory elements of nearby coding genes, but they also raise the possibility that some presumed pseudogenes are active as non-coding genes. Nucleosomes have been observed to be depleted at active regulatory elements throughout the yeast genome [31]. In the C. elegans genome, we likewise found that approximately 70% of DHSs fall in nucleosome-free regions of mixed-stage worms. Highly conserved non-coding elements (CNEs) in nematodes have been reported to be associated with cis-regulatory elements [16], and DHSs, particularly distal intergenic DHSs, also showed a significant tendency to fall in regions that are conserved between the two nematode genomes. Future studies of conserved DHSs will help to determine what types of functional elements these regions represent.
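A figure such as the share of DHSs that fall in nucleosome-free regions is an interval-overlap fraction. A minimal sketch of that calculation follows, assuming half-open (start, end) coordinates on a single chromosome and counting any overlap of at least one base as a hit; the function name and the toy intervals are hypothetical, and the overlap criterion used in the study is not stated in the text.

```python
def fraction_overlapping(sites, regions):
    """Fraction of sites that overlap at least one region.

    sites, regions -- lists of half-open (start, end) intervals on the
                      same chromosome; returns 0.0 for an empty site list.
    """
    if not sites:
        return 0.0
    hits = 0
    for s_start, s_end in sites:
        # A site counts once, however many regions it overlaps.
        if any(s_start < r_end and r_start < s_end for r_start, r_end in regions):
            hits += 1
    return hits / len(sites)


# Toy intervals, for illustration only.
dhs_sites = [(0, 10), (20, 30), (40, 50)]
nucleosome_free = [(5, 8), (45, 60)]
frac = fraction_overlapping(dhs_sites, nucleosome_free)
```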
When exploring the relationship between DHSs and the expression of nearby coding transcripts, we found that the chromosomal distribution of DHSs was more strongly correlated with the distribution of genes expressed at the young adult (YA) stage than with the general distribution of annotated coding genes (Figure ). This was most pronounced for chromosome V, even though the proportion of genes expressed at the YA stage is lower on chromosome V than on the other chromosomes (Supplemental Table S3 in Additional file 1). Genes near DHSs were more likely to show elevated expression; nonetheless, some highly expressed genes had no nearby DHSs. There could be a variety of reasons for this, one of which might be that DHSs are associated not only with various functional regulatory elements but also with other epigenetic signals and non-regulatory structural elements that contribute to chromatin organization [2]. This implies that the relationship between the presence of DHSs and the expression of their neighboring coding genes may not be straightforward. We also found that not all DHSs detected after treatment with the lower concentrations of DNase I were observed after treatment with the higher concentrations, and vice versa. The reasons for this are not clear. Although the most likely explanation is stochastic variation in the material or the amplification process, we cannot exclude the possibility that sites differ in their sensitivity to different DNase I concentrations. It is also possible that variation in the completeness of digestion, caused by variation in DNase I concentration, leads to sequence-based bias in DNase I digestion [4] and to sequence-based differences in amplification or hybridization to the tiling microarray. The latter might be particularly true for DHSs located within or adjacent to genomic repeat regions, as such sequences are generally excluded from the tiling microarray design. Thus, high-throughput sequencing methods would be a valuable complementary strategy for further identification of DHSs in the C. elegans