In plants, the 5S rRNA genes usually occur as separate tandems (S-type arrangement) or, less commonly, linked to 35S rDNA units (L-type). The activity of linked genes remains unknown so far. We studied the homogeneity and expression of 5S genes in several species from family Asteraceae known to contain linked 35S-5S units. Additionally, their methylation status was determined using bisulfite sequencing. Fluorescence in situ hybridization was applied to reveal the sub-nuclear positions of rDNA arrays.
We found that homogenization of L-type units went to completion in most (4/6) but not all species. Two species contained major L-type and minor S-type units (termed Ls-type). The linked genes dominate 5S rDNA expression while the separate tandems do not seem to be expressed. Members of tribe Anthemideae evolved functional variants of the polymerase III promoter in which a residing C-box element differs from the canonical angiosperm motif by as much as 30%. On this basis, a more relaxed consensus sequence of a plant C-box (5’-RGSWTGGGTG-3’) is proposed. The 5S paralogs display heavy DNA methylation, similar to their unlinked counterparts. FISH revealed a close association of 35S-5S arrays with the nucleolar periphery, indicating that transcription of 5S genes may occur in this territory.
We show that the unusual linked arrangement of 5S genes, occurring in several plant species, is fully compatible with their expression and functionality. This extraordinary 5S gene dynamics is manifested at different levels, such as variation in intrachromosomal positions, unit structure, epigenetic modification and considerable divergence of regulatory motifs.
Nuclear ribosomal DNA (rDNA) genes encoding 5S, 5.8S, 18S and 26S rRNA belong to the most important housekeeping genes, playing a central role in cell metabolism. In plant genomes there may be from several hundred up to tens of thousands of highly homogeneous copies of each gene. A high copy number of these genes is probably important to meet the increased demand for protein synthesis during plant development, but other functions, such as stabilization of the cell nucleus, have also been proposed. Each large 35S (45S in animals) rDNA unit contains 18S, 5.8S and 26S rRNA genes, the internal transcribed spacers (ITSs), and an intergenic spacer (IGS) (for review see ). The 35S units are organized in tandem arrays at one or several loci. The 5S rDNA encoding a 120-bp-long transcript has been traditionally considered to occupy separate chromosomal locations (hereafter S-type) in seed plants [5-9]. However, physical linkage of 5S and 35S genes predominates in the organization of rDNA in streptophyte algae and early diverging land plants such as mosses [10,11]. These studies led to the hypothesis that “liberation” of 5S genes from the 35S unit might have occurred in a common angiosperm ancestor after the separation from early diverging plants. However, the linked arrangement of 35S-5S units (hereafter L-type) was later found in several species from the genus Artemisia (from family Asteraceae, considered one of the most phylogenetically derived groups of angiosperms), first based on cytogenetic evidence [12,13] and subsequently confirmed through molecular studies. Additional studies showed that as many as 25% of Asteraceae members could have the unusual L-type arrangement of rDNA [15,16], and the L-arrangement has recently been found in the living fossil gymnosperm Ginkgo biloba. Whether the L- or S-type was the ancestral rDNA status in angiosperm species remains to be determined.
In most L-type genomes, the 5S insertion occurs in the IGS within 1 kb downstream of the 26S gene, and the corresponding transcript is encoded exclusively on the DNA strand opposite to that encoding the 26S rRNA [15,16].
Although, in the cell, there has to be a stoichiometric ratio of rRNA molecules, fundamental differences exist with respect to the transcriptional regulation of individual genes. The large polycistronic 35S transcript produced by RNA polymerase I (Pol I) is endonucleolytically processed to produce mature 18S, 5.8S and 26S rRNA molecules . The Pol I promoter, located within the 26-18S intergenic spacer, binds to a complex of transcription factors . In contrast, transcription of 5S genes is carried out by RNA polymerase III (Pol III) which requires an internal promoter within the gene in addition to the TFIIIA, TFIIIB and TFIIIC transcription factors. The tripartite structure of the Pol III internal promoter comprises an A-box, an IE (internal element) and a C-box, elements that are highly conserved in plants and animals . Epigenetic tools are another layer of expression control involved in the regulation of both types of rRNA genes. Silencing of non-transcribed gene copies is mediated by complex epigenetic mechanisms relying on chromatin modifications including DNA methylation [20,21].
While there is an increasing number of eukaryotic genomes with the L-arrangement of 5S genes [10,11,22] their expression patterns and epigenetic modifications have not yet been investigated. The remaining issue is which (if any) of the linked genes are expressed and functional. In this work, we addressed the following questions:
1) Do the L- and S-type loci occur simultaneously in a given genome? If so, which of them contribute to 5S expression?
2) How homogeneous are the 5S rRNA pools? Are regulatory elements conserved between the S- and L-type genes?
3) What are the DNA methylation and chromatin condensation patterns of genes with linked and unlinked arrangements?
We analyzed expression by RT-PCR, cloning and sequencing approaches. Bisulfite sequencing and FISH were used to determine DNA methylation and chromatin condensation levels.
We selected representative species known to have evolved a predominant L- or S-type arrangement of 5S rDNA, in order to cover all three subtribes in which the unusual linked arrangement arose (Anthemideae, Gnaphalieae and Heliantheae alliance) and whose 35S-5S units (IGS) had been previously sequenced. Leaf or seed material for the species Artemisia absinthium, A. tridentata, Elachanthemum intricatum, Helianthus annuus, Helichrysum bracteatum, Gnaphalium luteoalbum, Matricaria matricarioides, Tagetes patula (all Asteraceae) and Linum alpinum (Linaceae) was obtained either from wild populations or purchased. Plants were grown in the greenhouse of the Institute of Biophysics (Brno, CZ). Table 1 lists the provenance of the studied materials.
Genomic DNA (gDNA) was isolated following a CTAB protocol. RNAs were extracted from leaf material using the RNeasy Plant Mini kit (Qiagen, Germany). The purified RNAs were treated with Turbo™ DNase (Ambion, Applied Biosystems, USA) to remove traces of DNA contamination. To prepare cDNA, about 2 μg of RNA was reverse transcribed by Superscript reverse transcriptase (Invitrogen, USA) employing random nonamer primers. The gDNA and cDNAs were analyzed by PCR using the following primers: 5SgF: 5’-GGTGCGATCATACCAGCACT-3’, 5SgR: 5’-GGTGCAACACGAGGACTT-3’, IGS1692: 5’-CGGAACYACCAAAGCGAGTAAG-3’ (newly designed) and 26Spr1: 5’-AGACGACTTTAAATACGCGAC-3’. The 5S regions delimited by the primer sets were amplified using Taq polymerase (Roche, Germany) with the following PCR program: one cycle at 94 °C for 3 min; 29–35 cycles (depending on the amplicon) at 55 °C for 20 s, 72 °C for 30 s and 92 °C for 20 s; final extension at 72 °C for 7 min. The PCR products were separated on a 1.2% agarose gel, stained with ethidium bromide and photodocumented (Ultralum, USA). Fragments corresponding to 5S transcripts were cloned into the pDrive vector (Qiagen, Germany) and inserts were sequenced from both directions using the T7 and SP6 primers.
Bisulfite treatments were carried out on purified gDNA (~100 ng) using the EpiTect Bisulfite Kit (Qiagen, Germany). Primers amplifying the non-coding DNA strand, designed with the aid of the BISPRIMER program, were as follows: forward primer 5'-GTTCGGATTCAAAAAAAGGGGT-3' and reverse primer 5'-CGATCATACCARCACTAAT-3'. The PCR program consisted of: one cycle at 94 °C for 3 min; 35 cycles at 94 °C for 20 s, 55 °C for 20 s and 72 °C for 20 s; final extension at 72 °C for 7 min. The PCR products were separated on 1.2% agarose gels, purified using a PCR purification kit (Macherey-Nagel, Germany) and cloned into a TA vector (pDrive, Qiagen, Germany). Positive clones were PCR-screened using the vector SP6 and T7 primers. From 11 to 13 clones from each sample were sequenced (Eurofins MWG Operon, Germany).
The fresh root tips of Helichrysum bracteatum were pretreated with an aqueous solution of 0.05% colchicine at room temperature for 2.5–4 h and fixed in 3:1 (v/v) ethanol:acetic acid. Protoplasts were obtained using cellulolytic enzymes (0.4% pectinase [Macerozyme R10, Duchefa, Holland], 0.4% cytohelicase [Sigma C8274, USA] and 0.4% cellulase [Onozuka RS, Duchefa, Holland] in citrate buffer), dropped onto microscope slides, frozen and desiccated using liquid nitrogen and 70% ethanol. Before FISH, the slides were pre-treated with 50 μg mL⁻¹ RNase A for 1 h at 37 °C in a humid chamber. After washing three times in 2× SSC (2× standard saline citrate + 0.1% (w/v) sodium dodecyl sulfate), slides were dehydrated in an ethanol series (50%, 70% and 100%) and air-dried. Remnants of cytoplasm were removed by pepsin treatment (10 μg mL⁻¹ in 10 mM HCl, 4 min at room temperature). The slides were then washed, dehydrated in ethanol and fixed for 10 min in 3.7% formaldehyde in 1× PBS, washed three times in 2× SSC, dehydrated again and air-dried. The hybridization mix contained 50 ng μL⁻¹ (1000 ng/slide) of Cy3-labeled (GE Healthcare, Chalfont St Giles, England) 5S probe (a 116-bp-long insert of the cloned tobacco 5S rRNA gene) and 20 ng μL⁻¹ (400 ng/slide) of 35S rDNA probe (a 2.5-kb fragment of the 26S rRNA gene from tomato labeled with Spectrum Green, Abbott Molecular, IL, USA). The FISH hybridization mixture (20 μL per slide) consisted of labeled DNA probes, 4 μL of a 50% solution of dextran sulfate, 10 μL pure formamide, 0.5 μL TE buffer and 2 μL 20× SSC; it was denatured at 75 °C for 15 min and immediately cooled on ice. The mixture was applied to the slides, which were denatured in a thermocycler using a flat plate (5 min at 75 °C, 2 min at 65 °C, 2 min at 55 °C, 2 min at 45 °C), transferred into a prewarmed humid chamber and placed in an incubator.
After overnight hybridization at 37 °C, the slides were washed with 2× SSC, then 0.1× SSC (high stringency), at 42 °C for 10 min each, followed by washes with 2× SSC and 4× SSC + 0.1% Tween 20 at room temperature. Slides were rinsed in PBS and mounted in Vectashield (Vector Laboratories, Burlingame, CA, USA) containing DAPI (1 μg mL⁻¹). FISH signals were observed using an Olympus AX 70 fluorescence microscope equipped with a digital camera. Images were analyzed and processed using ISIS software (MetaSystems, Altlussheim, Germany).
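The quantities given for the hybridization mixture can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration of that bookkeeping, written in Python; the variable names are ours, and the volume left for the probes is inferred by subtraction rather than stated in the protocol.

```python
# Consistency check of the FISH hybridization mixture described above.
# All numbers come from the protocol text; names are illustrative only.
MIX_VOLUME_UL = 20.0  # hybridization mixture per slide (uL)

# Final probe concentrations in the mixture (ng/uL)
probe_conc_ng_per_ul = {"5S (Cy3)": 50.0, "35S (Spectrum Green)": 20.0}

# Non-probe components (uL)
components_ul = {
    "dextran sulfate, 50% stock": 4.0,
    "formamide": 10.0,
    "TE buffer": 0.5,
    "20x SSC": 2.0,
}

# Probe mass per slide = concentration x mixture volume
probe_mass_ng = {name: conc * MIX_VOLUME_UL
                 for name, conc in probe_conc_ng_per_ul.items()}

non_probe_volume_ul = sum(components_ul.values())
probe_volume_ul = MIX_VOLUME_UL - non_probe_volume_ul  # inferred, not stated

# Final concentrations after dilution into 20 uL
formamide_percent = 100.0 * components_ul["formamide"] / MIX_VOLUME_UL
dextran_percent = 50.0 * components_ul["dextran sulfate, 50% stock"] / MIX_VOLUME_UL
ssc_fold = 20.0 * components_ul["20x SSC"] / MIX_VOLUME_UL

print(probe_mass_ng)        # 1000 ng of 5S probe, 400 ng of 35S probe
print(probe_volume_ul)      # 3.5 uL remain for the labeled probes
print(formamide_percent, dextran_percent, ssc_fold)  # 50 %, 10 %, 2x
```

Under these figures the per-slide probe masses quoted in the text (1000 ng and 400 ng) follow directly from the stated concentrations and the 20 μL volume.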
Sequences were assembled by BioEDIT Sequence Alignment Editor 18.104.22.168  and aligned. The bisulfite data were processed and methylation density calculated using CyMATE software . Secondary structure modeling was carried out through an online tool at the Mfold Web Server (The RNA Institute, College of Arts and Sciences, University of Albany, State University New York). Public database searches were carried out through BLAST . Additional 5S sequences for comparative purposes were downloaded from the 5S RNA database .
Previous Southern blot hybridization revealed large amounts of linked 35S-5S units in Artemisia absinthium, Artemisia tridentata, Helichrysum bracteatum, Matricaria matricarioides and Tagetes patula. Here, we wished to determine whether any unlinked (S-type) units were present in these genomes. For sensitivity, we applied several PCR strategies (Figure 1) using primers specific for the 5S and 26S coding regions. In the case of a linked arrangement (5S copies flanked by non-5S DNA), only products corresponding to ~120-bp monomers would be amplified (Figure 1A). Correspondingly, the monomeric bands were amplified in A. absinthium, H. bracteatum, M. matricarioides and T. patula. In the case of a tandem arrangement, mono- and oligomeric products would be formed, the latter originating from polymerase read-through into neighboring units. This situation occurred in Linum alpinum, a species with a typical separate arrangement of rDNA units, which showed several oligomeric bands extending to a smear of an unresolved high-molecular-weight fraction (Figure 1C). Significantly, similar ladders, though with a shorter periodicity, were visualized in A. tridentata and G. luteoalbum (weak). Except for Linum, reactions using 26SPr1-5SgF primers (Figure 1B) produced 1–2 bands of <1 kb, confirming linked rDNA genotypes in all Asteraceae species studied (Figure 1D).
Next, we analyzed tandemly arranged genes in A. tridentata by cloning an oligomeric PCR product corresponding to a trimer (Figure 1C, arrow). Sequencing of three plasmid clones (GenBank: JX101914-JX101916) revealed that the clones contained a trimer (#4) and dimers (#3 and #7) of the 5S gene. A characteristic feature of the minor S-type units is an unusually short intergenic spacer (Additional file 1) whose size (58 bp) markedly differs from the average (100–900 bp) of 5S-5S spacers in plants [29,30]. The clones were highly similar to each other and to the L-type copies (Figure 2). Two gene copies (clones 3 and 4) harbored mutations within the A-box regulatory element.
Thus, A. tridentata and possibly G. luteoalbum contain rDNA in both linked and separate configurations of 5S genes. The ratio of gene copies is, however, shifted towards linked 35S-5S units. Therefore, this type of genomic arrangement was termed “Ls”.
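The relation between spacer length and the periodicity of the PCR ladder can be illustrated with a short calculation. The sketch below is a simplification under stated assumptions: it takes the monomer amplicon to be about 120 bp (the approximate gene length given in the text) and treats every tandem unit as one gene plus one 58-bp spacer; real band sizes depend on the exact primer positions, which are not modeled here.

```python
# Expected band sizes for PCR across a 5S-5S tandem array (illustrative).
GENE_LEN_BP = 120    # approximate length of the 5S gene (from the text)
SPACER_LEN_BP = 58   # 5S-5S spacer of the minor S-type units in A. tridentata


def expected_ladder(monomer_bp, repeat_unit_bp, n_bands):
    """Sizes of the monomer, dimer, trimer... products from a tandem array.

    Each additional unit read through by the polymerase adds one full
    repeat unit (gene + spacer) to the product length.
    """
    return [monomer_bp + k * repeat_unit_bp for k in range(n_bands)]


repeat_unit = GENE_LEN_BP + SPACER_LEN_BP   # 178 bp per tandem unit
ladder = expected_ladder(GENE_LEN_BP, repeat_unit, n_bands=3)
print(ladder)   # [120, 298, 476]
```

With the more typical plant spacers of 100–900 bp, the same formula gives repeat units of 220–1020 bp, which is in line with the shorter ladder periodicity observed for A. tridentata.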
To study the expression of 5S genes we analyzed RNA from five species with predominantly linked genotypes (A. absinthium, A. tridentata, H. bracteatum, M. matricarioides and T. patula) as shown in . Several potential 5S locus transcripts were examined by RT-PCR using different primer sets (Figure 3A, B). The 5SgF/5SgR primers would amplify a ~120-bp genic region corresponding to nearly the entire mature 5S transcript. A longer product would correspond to polymerase read-through into the neighboring unit. Such products may originate either from independent tandem arrays (Figure 3B) or from the second, incomplete 5S rDNA2 copy in the 26S-18S spacer (Figure 3A). The second, ~180-bp amplicon delimited by the IGS1692/5SgF primers comprises the entire 5S genic region plus about 60 bp of downstream IGS1 sequence. Finally, the third type of RT-PCR (26SPr1/5SgF primer set) maps potential 5S-IGS1-26S transcripts. The genic 5SgF/5SgR primers indeed amplified a ~120-bp fragment from all cDNA templates (Figure 3C), consistent with the typical length of a mature 5S transcript. In contrast to genomic PCR, no oligomeric or high-molecular-weight fragments were visualized after the RT-PCR reaction. The IGS1692/5SgF primer set also amplified bands of the expected size from A. tridentata and A. absinthium cDNA (Figure 3D). The 26SPr1/5SgF primers did not amplify products from any of the cDNAs (Figure 3E), while they did amplify a specific fragment from genomic DNA (Figure 1D).
The products of cDNA amplification were purified, cloned and sequenced. The alignment of cDNA and gDNA clones is shown in Figure 2. It is evident that, in each species, the cDNA clones were nearly identical to the gDNA clones derived from 5S rDNA1. Minor differences were attributable to only a few random mutations. Similarly, alignment of longer spacer sequences (IGS1692/5SgF) of cDNA and genomic clones also revealed nearly complete identity (Additional file 2). Consequently, phylogenetic dendrograms (ML, NJ) constructed from both genomic and cDNA sequences revealed species-specific clustering (not shown). While comparison of gDNA and cDNA clones failed to reveal substantial intragenomic polymorphisms, up to seven conserved variable sites (occurring in all units) were detected across the species. Surprisingly, three of them mapped to the Pol III promoter element (C-box) at positions 80–89 (Figure 2).
As mentioned earlier, the internal Pol III promoter comprises a tripartite motif composed of an A-box, an internal regulatory element (IE) and the C-box. Sequencing of multiple clones in the different analyzed species revealed a high level of conservation of the A-box and IE elements (Figure 2). However, the third part of the internal regulatory region, the C-box, located at 80–89, was only partially conserved. There were three substitutions in the 5’ region: one A>G transition at position +80 and two transversions, G>C and A>T, at +82 and +83, respectively. The 3’ end of the C-box was invariant. All cDNA clones (excepting random non-fixed mutations) from A. absinthium, A. tridentata and M. matricarioides contained the same (5’-GGCTTGGGTG-3’) variant of the C-box (termed C*-box), whereas the other studied species displayed the canonical 5’-AGGATGGGTG-3’ motif (Figures 2 and 4). The upstream sequences were less conserved, but the TATA box at about −20 was present in all genomic clones, including the one originating from minor separate loci in Ls species. In addition, there were multiple dT terminators in each IGS1, one or two immediately downstream of the last 5S gene nucleotide (Additional file 2), while only a single terminator was found in the S-type genomic clones from A. tridentata (Additional file 1). We did not identify any repeated elements (using the REPFIND tool, Vienna server, http://molbioltools.ca/Repeats_secondary_structure/server) within the IGS1 that have been proposed to function in the termination of transcription by Pol I.
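The three substitutions and the resulting 30% divergence can be reproduced by a position-by-position comparison of the two motifs quoted above. The following Python sketch is only an illustration; the function name is ours, and position numbering assumes that the first base of the C-box corresponds to gene position +80, as stated in the text.

```python
# Position-wise comparison of the two C-box variants quoted in the text.
CANONICAL_C_BOX = "AGGATGGGTG"   # canonical angiosperm motif
C_STAR_BOX = "GGCTTGGGTG"        # C*-box variant of the Anthemideae species
C_BOX_START = 80                 # gene position of the first C-box base


def cbox_differences(ref, var, start=C_BOX_START):
    """Return (position, ref_base, var_base) for every mismatching site."""
    if len(ref) != len(var):
        raise ValueError("motifs must have equal length")
    return [(start + i, r, v)
            for i, (r, v) in enumerate(zip(ref, var)) if r != v]


diffs = cbox_differences(CANONICAL_C_BOX, C_STAR_BOX)
divergence = len(diffs) / len(CANONICAL_C_BOX)

for pos, ref_base, var_base in diffs:
    print(f"+{pos}: {ref_base}>{var_base}")   # +80: A>G, +82: G>C, +83: A>T
print(f"divergence: {divergence:.0%}")        # divergence: 30%
```

All three differences fall in the 5’ half of the motif, while positions +84 to +89 are identical in both variants.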
The secondary structure of 5S rRNA is believed to be important for its function on ribosomes, since its pseudogenes usually deviate from the typical Y-shaped molecule. We wished to determine the influence of conserved substitutions (occurring in all clones of a given species) on the folding of RNA molecules. Using the web-based Mfold Web Server, the 5S rRNA secondary structures of three species (A. absinthium, T. patula and H. bracteatum) found to differ by several mutations were modeled (Figure 5). The alpha domain, considered to be the least conserved among land plants [11,28], showed a single polymorphic site at position +3. At this site, the G>T substitution was compensated by a C>A substitution at position +118, thus maintaining base pairing. Position +24 within loop B (the beta domain) was the most variable, occupied by A, C or T nucleotides. Polymorphism at this site seems to influence the size of loop B, the smallest being that of Tagetes. The gamma domain was formed by loop E, constituting part of the highly conserved A-box, and a small terminal loop D containing part of the C-box. Overall, the species-specific mutations did not seem to markedly influence domain structure.
Using bisulfite sequencing we examined DNA methylation of 5S genes occurring in two different genomic organizations, with the aim of addressing the question of whether the differential arrangement influences epigenetic patterns. We selected representative species with predominantly linked (Artemisia absinthium, Helichrysum bracteatum and Tagetes patula) and separate (Elachanthemum intricatum and Helianthus annuus) rDNA arrangements and analyzed the 89 bp of the 5S coding region delimited by our primers (Figure 1A). After the bisulfite treatment the amplified PCR products were cloned and sequenced. The results of bisulfite analysis are presented as diagrams at single-clone resolution (Additional file 3) and summarized in Table 2 and Figure 6. The CG and CHG sites were more frequently methylated than the non-symmetrical CHH sites, which is typical for plant DNA. There was also considerable variation between clones originating from the same individual (Table 2). For example, in Tagetes, a single clone (#17) contained only two methylated Cs (11%), while there were clones with as much as 44% methylation. Substantial variation in methylation densities also occurred between the species.
We analyzed the chromatin condensation patterns of 5S and 35S genes during different periods of the cell cycle (Figure 7). The 35S and 5S probes labeled with Spectrum Green and cyanine Cy3, respectively, were hybridized to Helichrysum bracteatum interphase and metaphase nuclei. The signals of both probes colocalized to one pair of homologs (field 2, pictures A-C), indicating that there was a single 35S-5S locus in this species. Similarly, the prophase nuclei (field 3, pictures A-C) showed two colocalized signals on already condensed chromosomes. In interphase (field 1, pictures A-C) two dark bodies representing nucleoli were visible in most cells. The 35S and 5S signals tended to associate around the nucleolus, and in some cases (upper field 1, pictures A-C) the decondensed signals spread into the nucleolus. Anaphase/telophase (D-F) chromosomes split into two chromatids. In each newly forming nucleus, one homolog started to decondense earlier than the other.
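The way methylation density is scored per sequence context (CG, CHG, CHH) can be sketched as follows. This is a minimal Python illustration and not the CyMATE algorithm used in the study: it assumes that clones are already aligned to the reference without gaps and read in the same strand orientation, that a retained C marks a methylated cytosine and a T at a reference C marks an unmethylated one. The reference and clone sequences are made-up toy data.

```python
# Minimal sketch of context-specific methylation scoring from bisulfite clones.
# Not the CyMATE implementation; sequences below are invented toy data.

def cytosine_context(seq, i):
    """Classify the cytosine at index i as 'CG', 'CHG' or 'CHH' (H = A, C, T).

    Returns None when the context runs past the end of the sequence.
    """
    if i + 1 >= len(seq):
        return None
    if seq[i + 1] == "G":
        return "CG"
    if i + 2 >= len(seq):
        return None
    return "CHG" if seq[i + 2] == "G" else "CHH"


def methylation_by_context(reference, clones):
    """Percentage of methylated cytosines per context across all clones."""
    counts = {"CG": [0, 0], "CHG": [0, 0], "CHH": [0, 0]}  # [methylated, total]
    for i, base in enumerate(reference):
        if base != "C":
            continue
        context = cytosine_context(reference, i)
        if context is None:
            continue
        for clone in clones:
            if clone[i] == "C":        # protected from conversion: methylated
                counts[context][0] += 1
                counts[context][1] += 1
            elif clone[i] == "T":      # converted by bisulfite: unmethylated
                counts[context][1] += 1
    return {ctx: (100.0 * m / t if t else None)
            for ctx, (m, t) in counts.items()}


reference = "ACGTCTGACATT"              # toy reference: one CG, one CHG, one CHH
clones = ["ACGTCTGATATT", "ACGTTTGATATT"]
result = methylation_by_context(reference, clones)
print(result)   # {'CG': 100.0, 'CHG': 50.0, 'CHH': 0.0}
```

The per-clone densities reported in the text (for example 11% versus 44% in Tagetes) correspond to applying the same counting to one clone at a time rather than to the pooled set.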
In higher eukaryotes (plants and animals), 5S genes occur either as separate arrays (S-type) or, less frequently, linked to large 35S units (L-type). While the expression of S-type genes has been thoroughly studied in the past, expression of L-type genes has not yet been addressed in any of these organisms. Here we show that in several representative plant species the linked 5S genes are expressed and dominantly contribute to cellular 5S rRNA pools.
Previous Southern blot and FISH analysis revealed largely homogenized 35S-5S units in several genera of the Asteraceae family. The current PCR analysis revealed minor 5S-5S tandems along with dominant 35S-5S units in two species (Artemisia tridentata and Gnaphalium luteoalbum). Separate tandems likely originate from loci that did not hybridize with the 26S probe on Southern blots. Quantitative estimates suggest that they represent less than 10% of 5S rDNA in Ls genomes. In contrast to linked 35S-5S genes, the 5S-5S tandems contained mutations in the A-box element (2/7 monomers), suggesting the occurrence of non-functional copies. However, more clones need to be analyzed to obtain statistical support for differential mutation frequencies between the arrays. The absence of S-type tandems in other L-type genomes (Figure 1) further indicates their frequent loss and/or rapid replacement by linked units. The question arises as to the location of minor S-type loci on chromosomes. While FISH on metaphase chromosomes of A. tridentata failed to reveal separate 35S and 5S signals, in interphase some sites were labeled more strongly with the 5S or the 35S rDNA probe. We therefore favor the hypothesis that minor 5S-5S tandems occur close to 35S-5S arrays or are interspersed between them.
Thus, four types of rDNA arrangement could be distinguished among different plant genera: (i) L-type, in which 35S-5S units are homogenized to completion (e.g., Helichrysum, Matricaria and Tagetes); (ii) Ls-type, in which mostly linked 35S-5S units occur along with minor separate 5S tandems (Artemisia and Gnaphalium); (iii) Sl-type, which is characterized by dominant independent 5S-5S tandems with a low abundance of 35S-5S units (Elachanthemum intricatum seems to be a representative of this group); and (iv) S-type, in which genomes contain independent 5S tandems, typical of most angiosperms.
The Ls genomes harbor dominant L-type units and minor S-type units. Nevertheless, both loci encode potentially transcribed genes. Since only a fraction of rRNA genes is usually transcribed in the cell (the rest is epigenetically inactivated), it was of interest to determine the origin of 5S rRNA transcripts, particularly in the Ls- and L-type genomes. The dominant expression of linked 5S genes is supported by the following observations: (i) primary 5S transcripts extending into IGS1 beyond the first termination signals were identified, which suggests that transcribed 5S sequences actually stem from linked 5S genes since a fraction of the IGS1 sequence is detected; (ii) no read-through transcripts were detected from tandemly arranged 5S-5S units in the Ls-type species A. tridentata; (iii) the RNAs derived from linked genes adopted a secondary structure typical of a functional molecule according to the RNA folding simulations performed; and finally, (iv) linked arrays contained undermethylated and decondensed chromatin fractions (Table 2 and Figure 6), likely corresponding to active genes. We therefore presume that the contribution from low-abundance tandem arrays or dispersed 5S genes in Ls- and L-type genomes to total rRNA pools is minor, if any.
One consequence of 5S and 35S linkage could be the putative transcription of both genes arising from read-through by both the RNA Pol I and Pol III enzymes (Figure 3). In mung bean (Vigna radiata), termination of 35S transcription occurs within 65 bp and 315 bp downstream of the 3' end of the 26S rRNA coding region. In cucumber (Cucumis sativus), several termination signals in the IGS were observed, the first being 350 bp downstream of the 26S gene. In Tagetes, the functional 5S rDNA1 insertion occurs just within ~200 bp downstream of the last 26S gene nucleotide (Additional file 2 and ), providing the possibility of formation of a long 35S-5S precursor and perhaps a double-stranded RNA. However, we were unable to identify any transcripts containing both 26S and 5S sequences (Figure 3). Thus, the transcription of both genes is probably efficiently terminated in the IGS1 and/or the genes are compartmentalized in the cell nucleus (discussed further below).
As previously noticed, some genes in A. absinthium contain a second 5S insertion (5S rDNA2) located distally to the 26S gene. However, the PCR product corresponding to 5S rDNA2 transcription (Figure 2) was not detected among the sequenced clones, supporting the hypothesis that it may represent a pseudogene and therefore is not transcribed. Nevertheless, the duplication may have evolutionary significance since a similar duplication with one functional and one non-functional 5S copy was observed in horsetail, Equisetum hyemale. This apparent parallelism may point to a common mechanism of 5S integration, and/or similar selection pressures in different organisms maintaining gene functionality.
Unlike most other genes, 5S rDNA contains essential regulatory elements within the internal controlling region (ICR). As a consequence, the 5S rRNA carries the promoter sequences of the genes from which it is transcribed, allowing the study of regulatory elements among cDNA sequences. The sequence and position of these elements (A- and C-box, Figure 2) are highly conserved across eukaryotes. It was therefore surprising that several closely related species from tribe Anthemideae evolved a variant of the C-box (C*) that differed from the angiosperm consensus by as much as 30% (Figure 4). The motif was present in cDNA clones of the respective species, and full congruence between cDNA and gDNA sequences was found, with little or no variation within the genome. Thus, units carrying the C*-box variant appeared to be functional. The closest relative of the C*-box was found in the alga Spirogyra. We therefore propose a revised, more relaxed version of the plant C-box consensus motif, hereafter written as 5’-RGSWTGGGTG-3’. Most variation is located in the 5’ half of the box. This is notable because mutations in this region are known to reduce transcription in Arabidopsis. Specifically, the +82G>T and +84T>C substitutions caused, respectively, partial or complete loss of transcription; in our new version, the C*-box contained C at +82 while the T at +84 remained invariant. This suggests that +84T might be critical for C-box functionality while the other three nucleotides in the 5’ half can be more variable. Moreover, the secondary structure does not seem to be influenced by the C-box polymorphisms (Figure 5), suggesting that the rRNA-TFIIIA interactions may not be impaired. However, since two out of three mutations were non-compensatory, their influence on tertiary structure and binding of other factors cannot be excluded.
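To make the relaxed consensus concrete, the IUPAC string can be expanded into a regular expression and tested against the motifs discussed above. The Python sketch below is only an illustration; the helper names are ours, and the two Arabidopsis-type mutants are constructed here by introducing the +82G>T and +84T>C substitutions into the canonical motif, assuming the C-box starts at position +80.

```python
import re

# Standard IUPAC nucleotide codes expanded to regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def iupac_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))


# Relaxed plant C-box consensus proposed in the text.
RELAXED_C_BOX = iupac_to_regex("RGSWTGGGTG")


def matches_c_box(motif):
    """True if a 10-nt motif fits the relaxed C-box consensus."""
    return RELAXED_C_BOX.fullmatch(motif.upper()) is not None


print(matches_c_box("AGGATGGGTG"))  # canonical angiosperm motif -> True
print(matches_c_box("GGCTTGGGTG"))  # C*-box of Anthemideae      -> True
print(matches_c_box("AGTATGGGTG"))  # +82G>T mutant (constructed) -> False
print(matches_c_box("AGGACGGGTG"))  # +84T>C mutant (constructed) -> False
```

Both expressed variants fit the consensus, whereas the two substitutions reported to impair transcription in Arabidopsis fall outside it, which agrees with keeping S at +82 and an invariant T at +84.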
There does not seem to be a simple correlation between the occurrence of the promoter variants and the genomic arrangement of rRNA genes. For example, the C*-box is found in some, but not all, L-type species. The phylogenetic analysis suggests its preferential occurrence in the tribe Anthemideae but, again, not all members seem to bear it (Figure 4). Sequence divergence may rather reflect the overall dynamics of the locus, which undergoes frequent elimination/homogenization cycles in this group of plants. In this context, Asteraceae species show diverse positions of rDNA loci on chromosomes [13,16,49], substantial genome size variation, certain phylogenetic incongruence between 35S and 5S markers, and even rearrangements between telomeric and rDNA repeats. The C*-box variant appears to be of recent origin and perhaps evolved after the divergence of Anthemideae from the rest of Asteraceae less than a few million years ago. To gain a better understanding of such C-box divergence, it will be interesting to analyze 5S promoters in many other Asteraceae species, as well as the transcription factors binding to them, in order to detect their possible co-evolution. Of note, TFIIIA is known to evolve rapidly (yeast and animal genes share only 20% similarity) and splicing of its primary transcript seems to be influenced by an exonized 5S insertion in plants.
Both S- and L-type species showed CG, CHG and CHH methylation patterns typical for plant repetitive DNA. Consistently, the methylation density at the different motifs followed the same descending order, CG > CHG > CHH, in line with previous studies of 5S methylation [41,55]. One can conclude that a relatively high level of methylation (usually higher than the genome average) is not restricted to the tandem arrangement but also occurs when 5S genes are organized as single or low-copy insertions. In other words, the tandem arrangement does not seem to be essential for 5S methylation. Within the tandemly arranged units, genes with low or no methylation are considered active, while highly methylated genes are heterochromatic and inactive. Variation in methylation density between clones might reflect epiallelic heterogeneity of arrays. Significantly undermethylated genes with as little as 11% methylation (Table 2) were detected, possibly originating from the highly active part of 35S-5S arrays. A relatively high level of methylation, particularly at non-CG motifs, was found in Helianthus (S-type). Helianthus annuus shows a pericentromeric location of 5S rDNA [15,38], while the less methylated 5S genes in Tagetes patula and Artemisia absinthium are located at (sub-)telomeric positions [12,15]. Consistent with this, 5S units located proximally to centromeres in Arabidopsis were more methylated than other, distally located genes.
It is known that RNA Pol I (which transcribes the 35S genes) occurs in the nucleolus while RNA Pol III (which transcribes 5S genes) is a nucleoplasmic protein. Thus, a single 35S-5S unit actively transcribed by one polymerase cannot be transcribed at the same time by the other polymerase. Strict compartmentalization of 35S and 5S transcription may also explain our failure to detect products of bidirectional 26S-5S transcription (Figure 3). Several models of spatial control of rDNA expression can be envisaged. First, there could be frequent reshuffling of genes between the nucleolus and nucleoplasm. This is unlikely considering that different transcription machineries are needed to execute the transcription of 5S and 35S genes. The second possibility is that a part of the megabase-sized array could be transcribed by Pol I while another part is transcribed by Pol III. Certainly, the L-type genomes harbor enough genes (several thousand copies) to allow the separation of arrays into transcription domains. Finally, regulation may occur at the level of individual chromosome sites. For example, one chromosome homolog could be involved in organizing the nucleolus and the transcription of 35S genes while the other homolog transcribes 5S genes. The FISH experiment in Figure 7 may provide some experimental support for this hypothesis. In late anaphase/telophase of Helichrysum bracteatum the nucleoli were apparently assembled on one highly decondensed homolog, while the other was highly condensed and probably not involved in nucleolus assembly. Such a dramatic difference in condensation patterns was not seen in interphase, in which rDNAs on both homologs were condensed and associated with the nucleolar periphery (Figure 7A-C).
Interestingly, the TFIIIA factor, essential for 5S transcription, seems to be concentrated at several nuclear foci including the nucleolus in Arabidopsis, suggesting that transcription of linked 5S genes may occur in close proximity to the nucleolus.
The present study provides evidence for a dominant contribution of linked 5S genes to the overall 5S rRNA pools in species with completely or partially homogenized 35S-5S arrays; that is, 5S genes are transcribed from these linked arrays. The unusual sequence variation found in the internal regulatory elements of 5S genes seems to be fully compatible with transcription, and an updated C-box consensus sequence that accommodates these variants has therefore been proposed. The methylation patterns of linked genes seem to be similar to those of their unlinked counterparts. As for the nuclear topology, the 35S-5S arrays closely associate with the nucleolus, suggesting that 5S transcription may occur in close proximity to the nucleolus, possibly at its periphery.
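As an illustration of how the relaxed C-box consensus proposed in this work (5'-RGSWTGGGTG-3') could be applied to sequence data, the following minimal Python sketch translates the IUPAC ambiguity codes into a regular expression and reports match positions. The helper names and the two test sequences are hypothetical and are not taken from the clones analysed here; only the consensus string comes from the study.

```python
import re

# Standard IUPAC nucleotide ambiguity codes (R = A/G, S = C/G, W = A/T, ...).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus string into a regular expression."""
    return re.compile("".join(IUPAC[code] for code in consensus.upper()))


# Relaxed plant C-box consensus proposed in this study.
C_BOX = consensus_to_regex("RGSWTGGGTG")


def find_c_box(sequence):
    """Return 0-based start positions of non-overlapping C-box matches."""
    return [match.start() for match in C_BOX.finditer(sequence.upper())]


# Hypothetical toy sequences for illustration only.
hits = find_c_box("TTAGGATGGGTGCC")    # R=A, S=G, W=A: fits the consensus
no_hits = find_c_box("TTCGGATGGGTGCC")  # C at the R position: no match
```

A scan of this kind only tests sequence identity with the consensus; whether a matching motif is functional still has to be established experimentally, as was done here through expression analysis.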
gDNA, Genomic DNA; cDNA, Complementary DNA; IGS, Intergenic spacer between the 26S and 18S rRNA genes; L-type, Linked arrangement of the 35S and 5S RNA genes; S-type, Separate arrangement of the 35S and 5S RNA genes; Ls-type, Linked arrangement with minor contribution of S arrangement; Sl-type, Separate arrangement with minor contribution of L arrangement; Pol I, RNA polymerase I; Pol III, RNA polymerase III; ICR, Internal controlling region; FISH, Fluorescence in situ hybridization.
The authors declare that they have no competing interests.
SG and AK designed the study and wrote the paper. SG carried out most of the molecular biology and cytogenetic experiments; AK carried out the molecular and bioinformatic studies and drafted the paper; LCK isolated RNA and prepared cDNAs. All authors read and approved the final manuscript.
Sequencing of 5S oligomers from A. tridentata. Alignment of sequenced clones. Coding regions are in bold letters. Boxes A and C are in yellow shading. The TATA box and termination signals are in red and blue, respectively. Asterisks indicate mutations.
Alignment of long 5S-IGS1 clones from A. absinthium. Alignment of 3 cDNA and 2 genomic (gDNA) clones containing 5S genic and intergenic sequences. Termination signals are underlined.
Structure of the 26S-5S intergenic spacer. Alignment of genomic clones. The first ~30 nucleotides represent the 3’end of the 26S gene. The last nucleotide belongs to the 5S coding region. Strand reading the 35S gene is shown; 5S is encoded by the bottom strand. Termination signals for Pol III transcription are highlighted. Note spacer length heterogeneity.
Bisulfite analysis of the 5S rDNA genic region (central part). CyMATE program outputs from sequencing of non-coding strands are shown. Filled symbols – methylated Cs; empty symbols – non-methylated Cs. The numbers below the diagrams indicate C residues in the alignments. Gaps in matrices were caused by sequence polymorphisms.
We wish to thank Dr. Jiří Široký (Academy of Science, CZ) for his advice on FISH and the anonymous reviewers for their insightful comments. This research was funded by the Grant Agency of the Czech Republic (P501-10-0208 and P501/12/G090), Academy of Science CZ (RVO68081707) and by the Spanish and Catalan governments (projects CGL2010-22234-C02-01 and 02/BOS, and project 2009SGR00439, respectively). SG benefited from a Beatriu de Pinós postdoctoral contract with the support of the Comissionat per a Universitats i Recerca (CUR) del Departament d’Economia i Coneixement de la Generalitat de Catalunya (Catalan government), and from a Short-term EMBO (European Molecular Biology Organization) fellowship.