The object of this chapter is to point out that large parts of the genome relate to other mechanisms than gene expression per se. Some relate to replication, genetic transmission and meiotic recombination, and others to the static and dynamic organisation of chromatin; the latter may bear, eventually, on gene expression. Indeed, one of the most striking conceptual developments in recent years was the gradual introduction of the notion of space in genome organisation and gene expression, in addition to the classical concepts of regulation in time and according to physiological change.
The nucleic acids carrying the genome and the gene expression machinery must fulfil at least two basic functions: (1) contain the information relating to the genes, and (2) serve as the physical support for this information. Function (1) is self-evident within the definition of gene and genon developed above, whereas the implications of function (2) are less clear.
First, the support of genetic information has to meet the requirements of several quite distinct functions, such as (1) long-term storage of genetic information, (2) its transmission from generation to generation, and (3) the intra-cellular mechanisms of gene expression, including selective transport of the transcripts to the sites of translation (see Fig. ) and post-transcriptional regulation in adaptation to physiological conditions. The chemically quite inert DNA seems well suited for the safeguarding and transmission of information, whereas the more reactive and flexible RNAs are, given their chemical and physical properties, better suited to adapt to the requirements of gene expression.
It is often forgotten that both DNA and RNA act a priori as the mechanical support of genetic information and have to conform to stringent rules deriving from their own physico-chemical properties. Concerning information storage and regulation at the DNA level, an important factor is, for instance, the rather high physical rigidity of the DNA double strand, which does not allow free and random movements, in particular under the conditions of high viscosity in the cell nucleus. There are limits to the folding up of hetero- and euchromatin and, e.g., to the rapid “flip-flop” movements of DNA loops assumed to operate according to some popular models (de Laat and Grosveld 2003). Furthermore, in what may be called “chromosome mechanics”, which comes into play in mitotic replication and meiotic recombination as well as in sister-chromatid exchange, these rules may sometimes supersede the information content relating to the genes per se.
As suggested above, the genomic DNA may have an architectural function, organising both the overall nuclear and the local chromatin structure. Cavalier-Smith (1978) already pointed out that there is a correlation between DNA organisation and chromosome architecture influencing both nuclear size and linear chromosome organisation; speaking of “nucleoskeletal DNA” (S-DNA), as opposed to “genic DNA” (G-DNA), he implicitly proposed a relation between DNA and the nuclear matrix. More recently, Képès proposed in his solenoid model that there is a correlation between transcription factor and promoter attachment sites and the higher order chromatin organisation; moreover, this organisation is suggested to be transcription-pattern dependent (Kepes and Vaillant 2003; Képès 2003). This points to an interdependence of 3D genome and transcription organisation, i.e., of the static and dynamic nuclear architecture, as discussed below within the Unified Matrix Hypothesis (cf. Scherrer 1989 and “The 3D DNA organisation according to the unified matrix hypothesis”). Actually, it can be assumed that many so-called transcription (initiation?) factors are proteins of the nuclear matrix, and that promoters (Auboeuf et al. 2007, 2005) may carry out some of their functions at the level of the RNA-dependent nuclear matrix (Ioudinkova et al. 2005; Razin et al. 2004).
Surprisingly neglected by current molecular biology is the fact that DNA and RNA have to operate in a 3D space; passive “crystallisation” or interaction of macromolecules cannot possibly explain all of genomic and cellular 3D organisation. (DNA “knows” that there is iron and light in the world, but seems to have “forgotten” that its environment is a 3D space!) When genes are being expressed, their reconstitution from RNA fragments in the course of splicing, as well as the physical transport of mRNA from the sites of transcription to those of expression, have to be organised in 3D space and necessitate a precise dynamic architecture in space and time. Within these mechanisms, which relate to semi-static and dynamic nuclear architecture, the positions of exons and the sites of RNA-protein interactions within the transcripts obey certain rules, which must be compatible with the selectivity of RNA processing and its implementation in 3D space. Furthermore, since it has become obvious that the genome is distributed in specific, experimentally identifiable sectors of the nuclear space, assigning specific positions to chromosomes and genomic domains, the organisation of the DNA itself in 3D must be taken into consideration. Figure outlines the conceptual consistency of organisation in space, common to DNA, RNA and proteins; the basis is the “architectural” necessity to place sites of action and interaction in precise 3D positions relative to each other. The mutual interdependence of the exonic fragments of genetic information and the biophysical properties of their physical support leads, inevitably, to the notion of additional genomic information necessary to govern these processes.
That the nuclear DNA might carry information other than that related to the genetic code could long be inferred from data pointing to its possible role in cellular structure. The C-value paradox (Cavalier-Smith 1978; Commoner 1964) showed a correlation of cellular and nuclear size (the prime architectural feature!) with DNA content. Later, when amphibian erythrocytes were compared in species whose DNA content varies up to 100-fold, these differences were found to bear on repetitive DNA; interestingly, in these species the complexity of the transcribed genome remains comparable (Rosbash et al. 1974). Furthermore, most of this repetitive DNA was found to be AT-rich, with few or no coding sequences.
That DNA may have a structural role independent of its gene content is also demonstrated by the phenomenon of the “petite” mutants in yeast (Bernardi 2005). Petite mutants have non-functional vestiges of mitochondria, which nevertheless contain normal-sized mitochondrial DNA. It was found that in such mutants the mitochondrial genes were progressively lost and, surprisingly, replaced by stretches of almost pure A + T (Bernardi 2005). There thus seems to exist a mechanism, subject to selective pressure, which keeps the length of mitochondrial DNA constant independent of the gene content. A similar case may exist in the kinetoplast of trypanosomes, where the DNA of the organelle is largely composed of gene-less A + T-rich stretches (Shapiro and Englund 1995).
In chromosomes also, there are DNA segments which relate to structure rather than gene content. The genome is subdivided into genomic domains. Their definition may be based either on the organisation of DNA, chromatin and/or chromosomes, or on functional considerations, such as units of replication or transcription. As pointed out in the “Cascade Regulation Hypothesis” (CRH; Fig. ), conceived in the 1960s (Scherrer and Marcaud 1968) and laid out in final form in 1980 (Scherrer 1980), the most straightforward illustration of genomic domains is given by the bands in the polytene chromosomes observed in some insects such as Diptera (Fig. C). Their salivary glands contain bona fide interphase cells, which actively express many genes, predominantly those at the basis of silk secretion. By order of magnitude, there are in Drosophila as many cytogenetically observable polytene chromosome bands as units of meiotic recombination (Judd et al. 1972; NCBI Map Viewer 2006); physical and genetic units of function hence coincide. Upon developmental or experimental activation, the so-called “RNA puffs” spring up from these bands (Fig. C); they are signs of transcriptional activity visible in the optical microscope (Grossbach 1974). A band may produce a single or several pre-mRNAs but corresponds, obviously, to a unit of transcriptional regulation. In one family of insects, the Sciaridae, the phenomenon of “DNA puffs” occurs, where DNA has to be replicated locally as a prerequisite for transcriptional activation (Glover et al. 1982). In this case, the unit of transcriptional control corresponds to a unit of replication as well (Fig. C and Lara 1987).
There is, thus, good reason to consider the interbands of polytene chromosomes as borders of genomic domains, all the more since some molecular biological and biophysical facts point to the same interpretation. Interband DNA has some qualities of insulators, as defined by molecular genetics (Gaszner and Felsenfeld 2006); such insulators are, e.g. in the case of the Drosophila gene Gipsy, visible in the cell nuclei after cytochemical staining (Gasser 2002; Gaszner and Felsenfeld 2006). Finally, and most interestingly, the interbands correspond to sites of Z-DNA formation (Nordheim et al. 1986).
The higher order organisation of DNA into genomic domains is embedded in the super-organisation of chromatin and chromosomes, which divide the genome into individual segments. Phenotypically very similar animals of closely related species may have vastly different numbers of chromosomes. Indeed, the fusion of the 40 telocentric chromosomes of Mus musculus into the 26, partly metacentric, chromosomes of Mus poschiavinus (Capanna et al. 1976) will still produce a mouse, albeit of a different size. And the 6 chromosomes of Muntiacus muntjak or the 46 of Muntiacus reevesi are able to condition an almost identical phenotype (cf. Lima de Faria 1980); they maintain, however, a similar pattern of R- and G-bands (cf. review in Sumner 1982). At this level of organisation, other types of genomic information are encoded which bear only indirectly on gene expression. We shall discuss here the 3D organisation of DNA and some phenomena which might be singled out as “chromosome mechanics”.
The 3D DNA organisation according to the Unified Matrix Hypothesis
The Unified Matrix Hypothesis (UMH) was an early attempt to give a logical interpretation to the apparently surplus DNA, casually labelled “junk” (Ohno 1972) (cf. discussion in Scherrer 1989). Starting from the C-value paradox, which shows a linear correlation between DNA content and relative cell size (Cavalier-Smith 1978), the proposition was made that a major part of the 95% of DNA not coding for proteins might have, essentially, an architectural function.
A straightforward illustration of this proposition was the phenomenon of ectopic pairing (Barr and Ellison 1976; Cohen 1976; Kaufman et al. 1948; Ananiev et al. 1981) of polytene chromosomes observed in the salivary glands of Drosophila and other systems of “giant” chromosomes (the latter are the result of DNA replication without disjunction of the daughter DNA strands, which remain physically aligned, up to 10,000-fold). Ectopic pairing consists of physical connections by cables of apparently nucleo-protein nature, linking distant sites within and between chromosomes (Fig. A). These connections typically run from interband to interband and between telomeres. They have been mapped in detail (Fig. B), providing genetically significant patterns (Kaufman et al. 1948). Of particular importance to the emerging matrix concept was the fact that several such ectopic cables suspend the nucleolus in a particular position relative to the chromosomes (see Ananiev et al. 1981 and Fig. A). They must, hence, include the DNA of the nucleolar organiser sequences. The nucleolus had already been known for some time to occupy specific positions in the nucleus of non-transformed cells differentiating normally (Fig. A). The idea thus arose that ectopic pairing might reveal a basic mechanism implemented in any normal interphase cell having normal chromosomes based on double-stranded, non-amplified DNA.
On this basis, the proposition was made within the UMH (Fig. C, D) that, quite generally, the nuclear DNA is organised in a 3D network in which proximal and distal chromosome sites are connected by bi-functional matrix attachment regions (MARs), keeping chromosome domains and sites of transcription in specific spatial positions (Fig. C, D). At those positions, transcripts are formed, processed and exported to the nuclear periphery. A straightforward example of this process is the nucleolus, where pre-rRNA is processed (see review in Tschochner and Hurt 2003) and from where rRNA is exported as a component of the ribosomal subunits (see also Fig. D).
The main conceptual implication was that sheer DNA length amounts to genetic information, independent of its sequence. This proposition of the UMH allowed a logical interpretation of several features hitherto difficult to understand, e.g. the phenomenon of the “Chromosome Field” (Lima de Faria 1979, 1983, 1980), which shows the topological maintenance in evolution of groups of genes within the chromosome organisation, as shown in Fig. E, F; it also allowed propositions to explain, for instance, the specificity of sites of chromosome crossing-over in some types of leukemic cells.
This is not the place to develop this theory further; suffice it to say that in recent years more and more relevant data could be placed within the originally loose frame of the UMH. The recent reports about “kissing chromosomes”, showing that distant chromosomal sites must be linked physically to allow the expression of specific genes within “3D gene regulation”, are a most eloquent illustration of this basic concept (Kioussis 2005; Spilianakis et al. 2005). In the meantime, more and more data have accumulated which point to a quite strict organisation of the genome and of gene expression in the nuclear space (Bolzer et al. 2005; Cremer and Cremer 2001; Cremer et al. 2000; Stadler et al. 2004). Genes seem to reside in specific places, and mRNA is brought to cytoplasmic sites that sometimes have functional significance, e.g. when muscle-specific mRNAs (or, respectively, RNPs) are transported to the intra-cellular sarcomeric plates of myotubes in order to be translated locally, where the proteins are to be assembled (Foucrier et al. 1999, 2001; Fulton and Alftine 1997).
Here we need only point out that there exist basic functions of DNA that are only indirectly related to gene expression. The UMH indicates a disjunction between the actual genome size, which varies vastly within the C-value correlation, in particular in its repetitive elements, and gene expression. As pointed out above, in the same group of species with vastly varying DNA content, the sequence complexity of the expressed genome may remain almost constant (Rosbash et al. 1974). However, the static and dynamic DNA architecture seems to fulfil vital functions, which are maintained in evolution independent of DNA and gene content.
Although the overall architectural function of DNA seems dissociated from the specific mechanisms of protein biosynthesis, an architectural function of the transcripts in gene expression has also become more and more evident. The observations of an RNA-dependent nuclear matrix (De Conto et al. 2000; Maundrell et al. 1981; Nickerson 2001; Penman et al. 1982) carried by the primary transcripts and their processing products (Ioudinkova et al. 2005) show that the genon-related program encoded in pre-mRNA and mRNA must also satisfy an architectural function, as originally suggested by the UMH (Scherrer 1989). We need to distinguish, however, this type of dynamic architectural function from the basic one, carried essentially and directly by the DNA, which is implemented prior to the onset of transcription and remains static in a given type of differentiated cell.
One may propose that the DNA defines the overall nuclear architecture per se and, in particular, the euchromatic part of chromatin, which is unfolded and DNase-sensitive. The directly DNA-dependent 3D network is more “static” than the dynamic RNA-dependent architecture. It is liable to modification, however, in the process of cell differentiation, when the relative proportions of hetero- and euchromatin are modified. The concept of “Quantal Mitosis” (see “Formation of differentiation-specific local chromatin networks and the DNA-derived nuclear matrix” section) proposed by Holtzer et al. (1975, 1972) was based on the fact that, in the course of differentiation, there are special types of cell division at which further differentiation is blocked, at precise stages, by substitution of thymidine (T) by bromodeoxyuridine (BudR), a substitution which is without any effect later on. BudR substitution reduces the dissociation constant of DNA-binding proteins, as already observed for the lac repressor (Wick and Matthews 1991).
On the other hand, there is the transcript-dependent, dynamic nuclear architecture resulting from RNA transcription, processing and transport. It is encoded in the (pre-)mRNA and its (pre-)genons. However, in both cases, the non-transcribed as well as the transcribed genome, the architectural function turns one-dimensional DNA and RNA into 3D structures, into which the coding parts are inserted. This conceptual deduction seems able to explain, to some extent, the 95% of “surplus” DNA in a logical manner.
Meiotic recombination, synaptonemal complex and chromosome mechanics
Another type of genetic information fixed by evolution into the genome without being directly involved in gene expression may be related to mechanisms termed, possibly, Chromosome Mechanics. This term relates again to the fact that the nuclear DNA not only carries several types of information, but is at the same time the mechanistic carrier of the information contained. Whereas molecules like DNA or RNA are carriers of information and of genon-related signals and provide, thus, information for the process of gene expression, the nuclear DNA in addition provides the structural organisation for the interaction of such biomolecules. Thus, here, in contrast to the typical fluid situation elsewhere in the cell where molecules have to find each other on the basis of mutual affinities, we see a spatial structure that enables specific interactions and prevents others. This is a type of information to be distinguished from the coding and regulatory information.
Applied to the genon concept, this means that in the nucleic acid backbone, within the cis-program of the holo-genon, coding, functional, and structural aspects are intertwined whereas in the transgenon the regulatory or controlling features dominate.
Thus, merely mechanistic criteria of the information carriers and their higher order complexes must be respected, such as solidity, flexibility and folding characteristics, appropriate chemical stability (DNA is granite and RNA butter), viscosity, etc. In some phases of physiological life, these physical and chemical criteria have to take precedence over the information contained in the signals carried by individual biomolecules.
A particularly interesting illustration of such phenomena is provided by meiotic recombination and sister chromatid exchange, which imply the formation of the synaptonemal complex as the physical basis of meiotic crossing over (Colaiacovo 2006; Kleckner 2006). There, the two DNA strands with their gene fragments in the derived chromatin structure have to align point by point, down to the individual exon, in order to allow precise breakage of the DNA strands and their ligation to the opposite ones. If this condition is not satisfied, as is often the case in interspecies crosses, meiotic recombination cannot proceed and the DNA is dissolved. Of course, in most species other barriers have evolved which preclude interspecies mating prior to the molecular interactions outlined. However, chromosome mismatch represents the ultimate molecular mechanism at the basis of the species barrier, as clearly visible in the case of crosses of horse and donkey (cf. Fig. in Scherrer 1989; Chandley et al. 1974) resulting in mule and hinny; those creatures, though going strong, are incapable of reproducing. This example is particularly telling since, surprisingly, fertile crosses between species having vastly different DNA content and cell size are possible in some cases (Bennett 1982), best illustrated by some plant species of the genus Secale (which, thanks to this phenomenon bearing on the size of seeds, are at the basis of the “green revolution” in world nutrition). There, the chromosome alleles of the parent species match and align, but their surplus DNA folds out from the strictly aligned axis of the synaptonemal complex in opposite loops of very different size (according to a proposition of Rees et al. 1982). This process is a particularly striking example of “chromosome mechanics”; it implies the existence of an independent mechanism which lays down signals for meiotic alignment and which seems to be largely independent of all other genomic information.