The eukaryotic genome has been intensively studied in efforts to unravel the relationship between its structure and function. One attractive hypothesis is that chromosomes are organized into independent chromatin domains that contain individual genes or gene clusters, and that these domains play a regulatory role, being either permissive or repressive towards gene expression (1, 2). The ‘domain’ hypothesis is supported mainly by data from thoroughly investigated models such as the β-globin (10–17), growth hormone (18, 19), interleukin (20) and some other gene clusters. These clusters are composed of paralogous genes, which are probably the products of local gene duplications. Given this implied mechanism of cluster formation, the occurrence of related genes in a common chromatin domain may merely be an automatic consequence of gene duplications within a pre-existing domain. Whether or not the resulting structure is employed in co-regulation of the paralogous genes in the cluster, it is not clear that similar mechanisms operate to regulate gene expression elsewhere in the genome.
Recently, we (43) and others [reviewed in (25)] have revealed an abundance of clusters of co-expressed genes in the genomes of multicellular eukaryotes, leading to the tempting suggestion that these clusters outline chromatin domains. For example, according to our data (43), about one-third of the tissue-specific genes form clusters in the Drosophila genome. Spellman and Rubin (26) have shown that more than 20% of Drosophila genes (both constitutively and tissue-specifically expressed) are clustered according to their expression patterns. Here, we present evidence for a regulated chromatin domain that encompasses a cluster of non-homologous co-expressed genes. Considering that the majority of clusters of co-expressed genes consist of non-homologous genes (26, 43–45), this finding implies that the ‘domain’ hypothesis may be applicable to a substantial portion of the eukaryotic genome.
We have analyzed in detail a 15 kb region of the Drosophila genome in which five non-homologous testes-specific genes lie next to each other, forming an uninterrupted cluster of co-expressed genes (the 60D cluster). All genes in the 60D cluster are activated simultaneously in spermatocytes, as shown by RNA in situ hybridization and by northern analysis of bam86 mutants, which are arrested at the spermatogonial stage. On each side, the cluster is flanked by genes with broader tissue specificity of expression.
Our analysis of chromatin structure indicates that four of the testes-specific genes in the cluster share a regulated chromatin domain. While sensitivity to DNase I in larval testes was high across the entire region that included the 60D cluster and adjacent sequences, a segment showing high resistance to DNase I was observed in larval brains and in embryos. Sensitivity to DNase I is a conventional gauge of chromatin compactness. Actively transcribed genes are associated with DNase I-sensitive, unfolded chromatin, while the ‘closed’ chromatin structure typical of repressed genes is more resistant to DNase I (6–8, 11, 12, 14, 16, 21–23). Therefore, we interpret the observed patterns of DNase I sensitivity as follows: the chromatin of the 60D cluster and surrounding genes is ‘open’ in testes (where all these genes are expressed), but a segment encompassing most of the 60D cluster is ‘closed’ in somatic tissues, represented here by embryos and brain. This segment apparently represents a regulated domain of repressed chromatin, and it includes the genes CG13589, Crtp, Yu, CK2βtes and Pros28.1B. All these genes, except for CG13589, are transcribed in the male germ line only. The chromatin structure of the domain becomes ‘open’, i.e. ‘potentiated’ (35), before or at the primary spermatocyte stage, at which transcription of the cluster genes is initiated.
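The interpretive step above (a segment that is DNase I-sensitive in testes but resistant in every somatic tissue examined is a candidate regulated domain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration only: the amplicon labels `p1` to `p6`, the tissue keys and the True/False sensitivity calls are hypothetical and are not the measurements reported here, and the function `regulated_segments` is not part of any published pipeline.

```python
from typing import Dict, List, Tuple


def regulated_segments(
    order: List[str],
    sensitive: Dict[str, Dict[str, bool]],
    open_tissue: str,
    closed_tissues: List[str],
) -> List[Tuple[str, str]]:
    """Return (first, last) labels of each contiguous run of amplicons that is
    DNase I-sensitive in open_tissue and resistant in all closed_tissues."""
    segments: List[Tuple[str, str]] = []
    run: List[str] = []
    for amp in order:  # amplicons in chromosomal order
        calls = sensitive[amp]
        is_regulated = calls[open_tissue] and not any(
            calls[t] for t in closed_tissues
        )
        if is_regulated:
            run.append(amp)
        elif run:
            # A constitutively sensitive (or otherwise different) amplicon
            # ends the current run, i.e. marks a candidate domain border.
            segments.append((run[0], run[-1]))
            run = []
    if run:
        segments.append((run[0], run[-1]))
    return segments


# Hypothetical qualitative calls (True = DNase I-sensitive), not real data.
always_open = {"testes": True, "embryo": True, "brain": True}
testes_only = {"testes": True, "embryo": False, "brain": False}
order = ["p1", "p2", "p3", "p4", "p5", "p6"]
sensitive = {
    "p1": always_open,
    "p2": testes_only,
    "p3": testes_only,
    "p4": testes_only,
    "p5": always_open,  # constitutively sensitive spacer
    "p6": testes_only,
}

segments = regulated_segments(order, sensitive, "testes", ["embryo", "brain"])
print(segments)
```

In this toy input the constitutively sensitive amplicon `p5` splits the tissue-regulated amplicons into two separate candidate segments, which mirrors the kind of situation in which a run of co-expressed genes need not lie within a single chromatin domain.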
It is intriguing that, while the regulated chromatin domain does contain most of the testes-specific genes of the 60D cluster, the borders of the domain and of the cluster are not identical. On one side, the gene CG13589, which is included in the regulated domain, is not a member of the 60D cluster of testes-specific genes. Our RT–PCR data show that CG13589 is transcribed at a low level in testes, ovaries and heads, but not in embryos or larval brains. Thus, some aspects of the tissue specificity of CG13589 appear to be similar to those of the genes in the 60D cluster (namely, expression in testes and absence of expression in embryos and in larval brains). The shared chromatin domain may, at least partially, underlie this similarity. This suggests that genes included in a common chromatin domain can be expected to show significantly overlapping, but not necessarily identical, transcription patterns.
On the other side of the 60D cluster, we observed that a continuous stretch of co-expressed genes on the chromosome may occupy more than one chromatin domain. The border of the regulated domain is clearly positioned to the left of CG13581, leaving CG13581 outside the domain shared by the other testes-specific genes of the cluster. ‘Closing’ of chromatin over CG13581 was observed in somatic tissues, indicating the possible presence of a second domain of repressed chromatin in the region; however, the effect was not very significant. Whether or not the chromatin domain containing CG13581 is a regulated one, it is distinct from the domain that contains the other genes of the 60D cluster, being separated from it by a constitutively DNase I-sensitive region that contains amplicon A27. It is important to note that the constitutive DNase I sensitivity of amplicon A27 is not due to the presence of a DNase I-hypersensitive site (HSS). We tested this by Southern analysis of the 5 kb fragment between the genes Pros28.1B and CG13581 (encompassing amplicon A27) in embryos, using indirect end-labeling. We found nine ‘weak’ sites with increased sensitivity to DNase I; however, none of these mapped within A27 (Supplementary Figure S1).
This ‘border’ region may represent a fixed domain boundary established, e.g. by an insulator sequence [reviewed in (46–48)], although other types of boundary elements have been described (49, 50). Taken together, our studies indicate that while clusters of co-expressed genes do outline chromatin domains, the exact domain pattern cannot be deduced solely from the expression data.
Shared chromatin domains provide a tool for co-regulation of genes with similar expression patterns. For clusters of paralogous genes, a possible mechanism for the creation of multigenic domains is internal ‘stretching’ due to local gene duplications within domains. In the case of clusters of non-homologous co-expressed genes, shared domains could be created by integration of genes into pre-existing domains. This could partially canalize the evolution of newly integrated genes by pre-setting general aspects of their transcription patterns.
The finding of a regulated chromatin domain implies that there are factors that control its repressive state in somatic tissues and its potentiation in the male germ line, as has been shown, e.g. for a cluster of neuron-specific genes (51). One potential candidate for such a role is the gene aly, which is required for expression of a wide spectrum of genes during male gametogenesis and has been implicated in the regulation of chromatin structure (42). We analyzed the effect of aly deficiency on expression of the testes-specific genes in the 60D cluster, and found that one gene (CK2βtes) does not require aly function for its transcription. CK2βtes is located in the middle of the regulated chromatin domain, surrounded by other testes-specific genes that depend heavily on aly. Failure to ‘open’ the regulated domain due to the aly mutation would be expected to have a similar effect on all genes included in the domain; because CK2βtes is still transcribed, the domain is presumably ‘opened’ in the absence of aly. Therefore, the Aly protein is probably involved in more downstream steps of transcriptional regulation in testes.
The results presented here suggest that the extensive clustering of co-expressed genes observed recently in eukaryotic genomes does, in general, reflect the domain organization of chromatin, although exact domain borders may not correspond to the margins of gene clusters. ‘Opening’ or ‘closing’ of chromatin domains would outline the tissue specificity of expression of the genes included in these domains. Chromatin organization thus provides a tool for coordinated regulation of multiple genes. The pattern of ‘open’ and ‘closed’ chromatin domains along the chromosomes, if maintained through cell divisions, can shape the transcriptome of a cell lineage. Alternatively, modifications of this pattern during development may lead to coordinated changes in the expression of gene batteries, thus establishing new cell lineages.