The complete sequences of mitochondrial DNA (mtDNA) from the two budding yeasts Saccharomyces castellii and Saccharomyces servazzii, consisting of 25 753 and 30 782 bp, respectively, were analysed and compared to Saccharomyces cerevisiae mtDNA. While some of the traits are very similar among Saccharomyces yeasts, others have highly diverged. The two mtDNAs are much more compact than that of S.cerevisiae and contain fewer introns and intergenic sequences, although they have almost the same coding potential. A few genes contain group I introns, but group II introns, otherwise found in S.cerevisiae mtDNA, are not present. Surprisingly, four genes (ATP6, COX2, COX3 and COB) in the mtDNA of S.servazzii contain, in total, five +1 frameshifts. mtDNAs of S.castellii, S.servazzii and S.cerevisiae contain all genes on the same strand, except for one tRNA gene. On the other hand, the gene order is very different. Several gene rearrangements have taken place upon separation of the Saccharomyces lineages, and even a part of the transcription units have not been preserved. It seems that the mechanism(s) involved in the generation of the rearrangements has had to ensure that all genes stayed encoded by the same DNA strand.
The mitochondrial DNA (mtDNA) of baker's yeast, Saccharomyces cerevisiae, is 85.8 kb long and encodes subunits I, II and III of cytochrome c oxidase (COX1, COX2 and COX3), apocytochrome b (COB), subunits 6, 8 and 9 of ATPase (ATP6, ATP8 and ATP9) and a ribosome-associated protein (VAR1). It also contains a number of unidentified reading frames (URFs) and intron-related open reading frames (ORFs). In addition, the mitochondrial genome specifies the small and large ribosomal RNAs (SSU rRNA and LSU rRNA), 24 transfer RNAs (tRNAs) and the 9S RNA (Rpm1r) component of RNase P (1). All genes are encoded and transcribed from one strand, except for tRNAThr1CUN (1, reviewed in 2). The yeast mitochondrial genetic code is used in translation and differs from the universal code by AUA being read as methionine, UGA as tryptophan and CUN as threonine. The S.cerevisiae mtDNA is characterised by a very low GC content, 17–18%, and extensive intergenic regions, which comprise 62% of the genome (1,2). These regions are composed of long adenosine and thymidine (A+T) stretches, short guanosine and cytidine (G+C) clusters, and a special class of intergenic sequences, ori/rep, which are present in eight copies (2,3). The ori/rep repeats are involved in preferential transmission of mtDNA (4). A number of short repeats, direct or indirect, can be found within the S.cerevisiae mitochondrial sequence, especially within the intergenic regions (1,3). These short repeats are involved in the generation of petite mtDNAs. The initially proposed circular nature of the S.cerevisiae mtDNA molecule is still controversial (5).
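The codon reassignments described above can be illustrated with a minimal Python sketch. It assumes the universal code as the starting point and applies only the three changes named in the text (AUA to methionine, UGA to tryptophan, CUN to threonine), written here as DNA codons; the function and variable names are illustrative and are not taken from any program used in this study.

```python
# Universal (standard) genetic code in the conventional TCAG codon ordering.
BASES = "TCAG"
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    b1 + b2 + b3: STANDARD_AA[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Assumption: only the three reassignments stated in the text are applied.
YEAST_MITO = dict(STANDARD)
YEAST_MITO["ATA"] = "M"          # AUA read as methionine
YEAST_MITO["TGA"] = "W"          # UGA read as tryptophan
for third in BASES:
    YEAST_MITO["CT" + third] = "T"  # CUN read as threonine


def translate(dna, table):
    """Translate a DNA string in frame 0, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        amino_acid = table[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)
```

Translating the same short sequence with both tables shows why mitochondrial ORFs appear truncated when the universal code is applied: a UGA codon ends the reading frame under the universal code but is read through as tryptophan under the mitochondrial code.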
Within the hemiascomycetous yeasts, the size of the mtDNA varies from ~18.9 kb in Candida glabrata (6) to ~101 kb in Brettanomyces custersii (7). Pichia canadensis, formerly Hansenula wingei (8), Candida albicans (accession no. NC002653) and Yarrowia lipolytica (9) have mtDNA sizes of 27.7, 40.2 and 47.9 kb, respectively. Even in the genus Brettanomyces/Dekkera, mitochondrial genome sizes have been shown to vary from ~28 to ~101 kb (7,10). The mitochondrial genomes of some ascomycetous yeasts have been characterised by the presence of a VAR1 gene as in the yeasts S.cerevisiae, C.glabrata and P.canadensis (8,11–14). On the other hand, Y.lipolytica and Schizosaccharomyces pombe both lack VAR1 suggesting that only a limited number of ascomycetous yeasts have mtDNA containing VAR1 (9,15,16). Apart from Saccharomyces and Kluyveromyces yeasts (17) as well as S.pombe (15), mitochondrial genomes of several Ascomycetes, like Y.lipolytica (9), contain several genes encoding hydrophobic subunits of NADH dehydrogenase complex I. Finally, the mtDNA of Podospora anserina lacks both VAR1 and ATP9 (18).
The Saccharomyces genus contains several species that can be divided into a group of petite-positive and a group of petite-negative yeasts, of which the latter includes only Saccharomyces kluyveri (19). The petite-positive yeasts, which display both fermentative and respiratory metabolism, can be further divided into Saccharomyces sensu stricto yeasts (including S.cerevisiae) and Saccharomyces sensu lato yeasts (including Saccharomyces castellii and Saccharomyces servazzii). In addition, C.glabrata is very closely related to the sensu lato group (20). In the sensu stricto group the mtDNA sizes range from 64 to 86 kb, whereas the sensu lato group has mtDNA with sizes below 50 kb (19,21). The gene order also varies considerably within the Saccharomyces genus and is a result of a limited number of large rearrangements taking place during the yeast evolutionary history (21). Previously, these rearrangements were suggested to have been created by transposition- and/or inversion-like events (21,22). One should note that if a segment is inverted then the coding strand of the corresponding genes is changed. In the case of a transposition-like event, the coding strand can either be preserved or changed. Previously it has been demonstrated in vivo that the S.cerevisiae gene order could be rearranged by either homologous or illegitimate recombination (22). While the rearranged genomes have preserved the respiration capacity, they have exhibited a decreased transmission ability (4). Therefore, at least in S.cerevisiae, preservation of the gene order seems to be under strong selective pressure.
The complete sequence of the mitochondrial genome of S.castellii has recently been determined but not fully analysed (23). Most of the mtDNA sequence of S.servazzii has previously been obtained in the random sequencing tag project known as Génolevures (24). Along with the complete mtDNA sequence of S.servazzii, we present here a thorough analysis of the mitochondrial genomes of the two Saccharomyces sensu lato yeasts. The complete sequences of these two mitochondrial genomes, as well as the previously determined S.cerevisiae sequence, now provide sufficient data to reconstruct several events that have reshaped the Saccharomyces mtDNA molecules during yeast evolutionary history.
The mitochondrial sequence of the S.castellii type strain (NRRL Y-12630 = CBS4309) was determined and deposited to GenBank (accession no. AF437291) (23).
Much of the sequence of the mitochondrial genome of S.servazzii type strain CBS4311 was generated by the Génolevures project (24). Additional sequence was obtained from clones of the S.servazzii type strain genomic DNA library used in Génolevures. Sequence assembly was performed using programs phred (version 0.980904.c) and phrap (version 0.960731) with a minscore of 14 and a minmatch of 30 (25,26). The sequence compilation was edited with consed, version 10.38 (beta) (27) and deposited to GenBank (accession no. AJ430679). Each base of the S.servazzii mtDNA was covered at least twice. As far as the +1 T frameshift-containing regions are concerned, except for the frameshift region of COX3 obtained from the same strand on two independent clones, the frameshift region sequences were obtained in each case from both strands (phred score >40). The COB region has been sequenced from five independent clones, the COX2 region from four independent clones and the ATP6 from 10 independent clones.
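For readers unfamiliar with phred quality values, the threshold quoted above can be put on a probability scale using the standard phred definition, Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The short Python sketch below only restates that definition; the function name is illustrative.

```python
def phred_to_error_probability(q):
    """Convert a phred quality score Q into a base-call error probability P."""
    return 10 ** (-q / 10)
```

Under this definition a phred score of 40 corresponds to an estimated error probability of one in 10 000 per base call, so a score above 40 on both strands makes a sequencing error an unlikely explanation for an extra T residue.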
Searches of similarity to the already known and annotated mtDNA genomes of S.cerevisiae (1) and P.canadensis (8) were carried out by using variations of the Basic Local Alignment Search Tool (BLAST) (28), BLASTN (nucleotide similarity based search) and BLASTX (amino acid similarity based search), available on the world wide web. In some instances, sequences were analysed with various programs in the GCG environment (Genetics Computer Group, Madison, WI, USA), including FASTA (29). tRNAs were annotated using the program tRNAscan-SE 1.1 (30). Intron and endonuclease nomenclature follows that of Dujon (31) and Dujon et al. (32), respectively, and classification is according to Burke et al. (33) and Burke (34).
‘DNA atlas’ plots were used to visualise different features of the mtDNA molecules of S.castellii and S.servazzii (35). Perfect palindromes, which are two copies of a sequence located on opposite strands, were found using 30-nt windows and a repeat length of 7 nt. Global repeats within the entire genome were determined using a repeat length of 100 nt as the window. A direct repeat is a 100-bp sequence that is present in at least two copies and located on the same strand; inverted repeats are located on opposite strands (36). The DNA atlases of S.castellii, S.servazzii and other yeast mtDNA genomes are available on our web site (http://www.cbs.dtu.dk/services/GenomeAtlas/scast/mito_table.html).
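The palindrome criterion given above (a 7-nt repeat whose second copy lies on the opposite strand, searched within a 30-nt window) can be sketched in Python as follows. This is a simplified illustration and not the GenomeAtlas implementation: it assumes that both arms must fit inside the window, that the arms do not overlap, and that only exact matches count.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_perfect_palindromes(seq, repeat_len=7, window=30):
    """Return (i, j) pairs where seq[j:j+repeat_len] is the reverse
    complement of seq[i:i+repeat_len] and both arms fit in one window."""
    hits = []
    n = len(seq)
    for i in range(n - repeat_len + 1):
        target = reverse_complement(seq[i:i + repeat_len])
        # The second arm starts after the first arm and ends inside the window.
        last_start = min(i + window - repeat_len, n - repeat_len)
        for j in range(i + repeat_len, last_start + 1):
            if seq[j:j + repeat_len] == target:
                hits.append((i, j))
    return hits
```

For example, in `GGATTACAGGGTGTAATCC` the 7-mer `GGATTAC` at position 0 pairs with its reverse complement `GTAATCC` at position 12, so the two arms could form a hairpin stem on a single strand.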
The size and gene order of the mitochondrial genomes of S.castellii and S.servazzii were previously determined by restriction analysis and mapping (21). Sequencing of the whole mitochondrial genomes has shown that these two mtDNAs consist of 25 753 and 30 782 bp, respectively, and in Figure 1 they are presented as circular molecules. For S.castellii the A+T content of the entire genome and of the exonic ORFs (the eight structural genes only) is 79.6 and 77.5%, respectively. VAR1 differs from the average by having an A+T content of 94.2%, and ATP9 differs by having only 65.8% A+T. The intronic ORF encoded by S.castellii COX1.2 has an A+T content (76.9%) close to the average. For S.servazzii, the A+T content of the entire genome is 77.5%, whereas the exonic ORFs (including ORF1) have 75.8% A+T. The VAR1 gene differs from the average by having 89.0% A+T, and ATP9 differs by having only 69.3% A+T.
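The A+T percentages reported here are simple base-composition ratios. A minimal Python sketch of the calculation is given below; it assumes an unambiguous DNA string and is not the software used to produce the reported values.

```python
def at_content(seq):
    """Return the A+T content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    at_bases = seq.count("A") + seq.count("T")
    return 100.0 * at_bases / len(seq)
```

Applied separately to a whole genome and to the concatenated exonic ORFs, such a function would give the two kinds of figure quoted above.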
The S.castellii mtDNA molecule consists of 52.7% gene-encoding regions (tRNAs, rRNAs and eight protein-encoding genes) and 7.6% introns, leaving 39.7% for intergenic sequences. Similarly, S.servazzii mtDNA consists of 45.6% genes (tRNAs, rRNAs and nine protein-encoding genes), 11.4% introns and 43.0% intergenic sequences. Saccharomyces castellii and S.servazzii mitochondrial sequences are analysed for their coding regions, transcription units, gene order, introns and intergenic regions, as described below. Both mitochondrial sequences are compared to each other and to the S.cerevisiae sequence, as well as to other relevant fungal sequences, in order to deduce possible mechanisms that operated during evolution of the yeast mitochondrial genome.
The gene content of the mitochondrial genomes of S.castellii and S.servazzii is typical for the already characterised Saccharomyces, Kluyveromyces and Schizosaccharomyces yeasts, with the NADH dehydrogenase complex I subunits not being present. Due to similarity to the already known yeast mitochondrial genes, eight structural genes were identified as COX1, COX2, COX3 (cytochrome oxidase subunits I, II and III), ATP6, ATP8 and ATP9 (ATPase subunits 6, 8 and 9), COB (apocytochrome b) and VAR1 (ribosome-associated protein) (Fig. 1). All ORFs encoding structural genes start at an ATG initiation codon and terminate at a TAA termination codon, with two exceptions in S.servazzii. One is the URF called ORF1, which shows homology to Q0255, a hypothetical ORF of the mitochondrial genome of S.cerevisiae. Saccharomyces servazzii ORF1 probably encodes a maturase and has ATA as the start codon and TAG as the stop codon. The other exception is an intronic ORF that has ATG as the start and TAG as the stop codon. All ORFs in both species are encoded by the same strand.
Interestingly, S.servazzii ATP6, COX2 and COX3 ORFs each contain a +1 frameshift at amino acid positions 191, 229 and 249, respectively. The strong amino acid sequence conservation found between the S.servazzii deduced protein sequences and those of S.cerevisiae and S.castellii suggests that these frameshifts are due to one additional T residue within the 3′ end (Fig. 2). Most strains of C.glabrata, a species closely related to S.cerevisiae, also contain a +1 frameshift in COX2 (37). Interestingly, the +1 frameshifts in the COX2 genes of S.servazzii and C.glabrata are located near the 3′ end of COX2 and are only separated by five amino acids. In addition, S.servazzii COB contains two +1 frameshifts also caused by additional T residues located after the annotated intron at positions 74 and 82 and separated by 14 bp (Fig. 2).
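The effect of a single additional T residue on the reading frame can be shown with a toy Python example. The sequence and coordinates below are invented for illustration and are not taken from the S.servazzii genes; the sketch only shows how bypassing one nucleotide restores the downstream frame and makes no claim about how the frameshifts are handled in vivo.

```python
def read_codons(seq, skip_index=None):
    """Split seq into codons from position 0; if skip_index is given,
    the nucleotide at that index is bypassed (a +1 shift of the frame)."""
    if skip_index is not None:
        seq = seq[:skip_index] + seq[skip_index + 1:]
    return [seq[pos:pos + 3] for pos in range(0, len(seq) - 2, 3)]


# Hypothetical ORF and the same ORF carrying one extra T after the second codon.
reference = "ATGGCAGAATAA"
with_extra_t = "ATGGCATGAATAA"
```

Read without the skip, `with_extra_t` yields the codons ATG, GCA, TGA, ATA, which diverge from the reference after the second codon. Skipping the extra T at index 6 recovers ATG, GCA, GAA, TAA, identical to the reference. Restored similarity to homologous proteins downstream of the shift is the kind of evidence used above to place the additional T residues.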
Approximately 1.5 kb downstream of the COB gene in the S.castellii mtDNA molecule, there is a pseudo 3′ part of a COB sequence. This is seen as a global direct repeat in Figure 1 and indicates that a partial gene duplication has occurred. When the amino acid sequences of the two S.castellii repeat regions are compared to the 3′ end of the S.cerevisiae COB gene, the S.castellii pseudo 3′ part of COB shows higher similarity (84% identity) than the 3′ end of the true S.castellii COB (72% identity), whereas the identity between the two S.castellii 3′-COB sequences is only 75%. Since both copies of the COB 3′ part have a conserved amino acid sequence, they might both be functional. Another interesting observation is the lack of palindromic sequences (see later section) in the 1.5 kb region between the two 3′ parts.
Based on sequence similarity, the mitochondrial genomes of both S.castellii and S.servazzii contain the large subunit and small subunit ribosomal RNA (LSU and SSU) genes. The 5′ and 3′ boundaries are estimated from sequence comparison to rRNAs from S.cerevisiae (1). tRNA genes were predicted from their secondary structure. Saccharomyces castellii and S.servazzii each contain 23 tRNA genes. All annotated tRNA genes are encoded by the same strand as all the other genes, except for tRNAThr1. In comparison, S.cerevisiae and P.canadensis mtDNAs encode 24 and 25 tRNAs, respectively (1,8). The codon usage within the exonic ORFs and the corresponding tRNAs can be found in the Supplementary Material. The anticodons for all tRNAs are identical between S.castellii and S.servazzii, and the genetic code seems to be the same as the one used by the S.cerevisiae mitochondria. The tRNA gene that is missing in S.castellii and S.servazzii is that corresponding to the CGN family, which in S.cerevisiae encodes arginine. This codon family is used neither in any of the predicted exonic ORFs, nor in the intronic ORFs. The same phenomenon is seen in other yeasts, like C.glabrata (13), Kluyveromyces thermotolerans (38) and in exonic ORFs of Y.lipolytica (9). The CUN family does not specify leucine but instead threonine, as in S.cerevisiae and C.glabrata (reviewed in 39). The corresponding tRNA gene is found on the opposite strand in S.cerevisiae, S.castellii and S.servazzii, but on the same strand as all other genes in C.glabrata (13). For all three Saccharomyces yeasts, the predicted secondary structure contains eight nucleotides and not the normal seven in the anticodon loop (40) and the anticodon is UAG. Saccharomyces castellii and S.servazzii differ from S.cerevisiae by having an additional nucleotide in the D-loop, but interestingly the extra nucleotide is differently positioned. 
Several codons are missing in the S.castellii and S.servazzii mitochondrial genomes, and, as expected for A+T-rich genomes, all missing codons are G+C-rich.
In S.cerevisiae, the gene encoding Rpm1r of RNase P (RPM1) is found between tRNAfMet and tRNAPro and the three genes are transcribed together as one unit (1,41). This organisation is also seen in other yeasts, like C.glabrata, where co-transcription also occurs (42), and in Saccharomyces exiguus (43) and Kluyveromyces lactis (44). In S.castellii the same tRNA genes are separated by 627 bp of which at least 525 nt presumably encode the Rpm1r, starting only a few nucleotides from the 3′ end of tRNAfMet. Two regions, one near the 5′ end and the other near the 3′ end, have high identity to the conserved sequences that are found in the RPM1 of C.glabrata, K.lactis, Saccharomycopsis fibuligera and other Saccharomyces yeasts (42,43,45). These two regions make up the conserved helix P4 in the potential secondary structure (data not shown) that is similar to other known RNA components of RNase Ps (reviewed in 46). In the case of S.servazzii, the same two tRNA genes are separated by 382 bp of which 277 nt presumably encode the Rpm1r. The assumption is based on the same conserved features as mentioned for RPM1 of S.castellii. For comparison, the Rpm1r of K.lactis is 191 nt (44), that of C.glabrata is 227 nt (42), and that of S.exiguus is 277 nt (43), while that of S.cerevisiae is 453 nt (41).
In S.cerevisiae, the nonanucleotide motif sequence 5′-TATAAGTAA-3′ is considered to be the transcription initiation site (47) and is similar to what is found in C.glabrata, 5′-TATAAGTA-3′ (13) and P.canadensis, 5′-TATAAG(T/A)(A/T)-3′ (8). Based on the consensus sequences, we used the motif TATAAG to detect putative transcription initiation sites in S.castellii and S.servazzii mtDNAs, and several nonanucleotide motif sequences 5′-TA(T/a)AAG(A/T/g/c)(A/T/c/g)(A/T/c/g)-3′ and 5′-TA(T/a)AAG(A/T/g/c)(A/T/g/c)(A/c/t/g)-3′ were detected in S.castellii and S.servazzii, respectively (Fig. 1 and Supplementary Material). However, several of these motifs, located within genes or far upstream, are unlikely to be transcription initiation sites. Taking this into account, the consensus of the motif sequences is 5′-TATAAG(A/T)(A/T)(A/T)-3′ and 5′-TATAAG(T/A)(A/T)A-3′ for S.castellii and S.servazzii, respectively. Motifs with the core sequence 5′-TATAAG-3′ were not found upstream of S.servazzii anti-tRNAThr1, but instead two motifs with the core sequence 5′-TAAAAG-3′ were located in this region. The putative transcription initiation sites that were located <500 bp away from the downstream gene are indicated in Figure 1.
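A motif scan of the kind described above can be sketched with a regular expression. The Python example below assumes a search on one strand only, accepts either T or A at the third position of the core (TATAAG or TAAAAG), and reports the full nonanucleotide at each hit; the function name and the toy sequence are illustrative, and the sketch does not reproduce the distance filtering applied to the reported sites.

```python
import re

# Core TA(T/A)AAG followed by any three bases, wrapped in a lookahead so
# that overlapping occurrences are also reported.
PROMOTER_PATTERN = re.compile(r"(?=(TA[TA]AAG[ACGT]{3}))")


def find_promoter_motifs(seq):
    """Return (position, nonanucleotide) pairs for putative initiation motifs."""
    return [(m.start(), m.group(1)) for m in PROMOTER_PATTERN.finditer(seq.upper())]


toy_sequence = "CCCTATAAGTAACCCTAAAAGATTGG"
hits = find_promoter_motifs(toy_sequence)
```

In the toy sequence the scan reports TATAAGTAA at position 3 and TAAAAGATT at position 15. Each hit would still have to be judged by its distance to the downstream gene, as was done for the sites shown in Figure 1.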
In S.cerevisiae the dodecamer motif 5′-AATAATATTCTT-3′ is found downstream of most protein-coding genes and believed to be an endonucleolytic cleavage site needed for RNA processing of multi-gene transcripts (48–50). Likewise, the motif 5′-TATAATATTCTT-3′ has been found in C.glabrata (13). In S.castellii and S.servazzii similar motifs are found downstream of several genes and have the consensus sequences 5′-(A/T/c)ATAATATTC(A/c/t)(T/A)-3′ and 5′-(A/t/c)(T/a/g)TAATA(a)TTC(T/A)(A/T/c)-3′, respectively (see Supplementary Material). When only the motifs downstream of protein-encoding genes are considered, the consensus sequences change slightly to 5′-(A/t)ATAATATTC(A/c)(T/A)-3′ in S.castellii and 5′-(A/t)(T/a)TAATA(a)TTC(T/A/g)(T/A/C)-3′ in S.servazzii. Notice that the pseudo 3′ end of COB in the S.castellii mtDNA is followed by two perfect motifs. Interestingly, the consensus motif was found in the 3′ end of RNA genes, like in C.glabrata (13). The main difference between the putative 3′ processing sites in the two studied yeasts is that the second base is an A in S.castellii and a T in S.servazzii.
Transcription termination could occur in regions containing palindromic sequences. Therefore, the mitochondrial genomes of S.castellii and S.servazzii were examined for perfect palindromes (Fig. 1) and, in general, these were found in intergenic regions. As expected, the RPM1 as well as rRNA regions contain perfect palindromes to a much higher extent than the protein-coding regions, with the exception of the very AT-rich VAR1, and, not surprisingly, perfect palindromes were found in some of the S.castellii and S.servazzii introns.
The gene order of the mitochondrial genomes of S.castellii, S.servazzii and S.cerevisiae was compared (Fig. 3). Several of the gene clusters are shared with other related species. The three genes, COX1, ATP8 and ATP6, constitute the most conserved gene cluster among hemiascomycetous yeasts. This cluster has been found in S.cerevisiae, in all the Saccharomyces yeasts analysed by Groth et al. (21), in C.glabrata (13), K.thermotolerans (38), K.lactis (44, reviewed in 51), P.canadensis (8) and in Y.lipolytica (9), suggesting that this is a common ancestral unit. In S.cerevisiae they form a transcription unit (50), whereas in Y.lipolytica ATP8, ATP6, COX3 and ND4 are co-transcribed (9). The conservation of gene order and the presence of putative endonucleolytic cleavage sites downstream of COX1 and ATP6 in both S.castellii and S.servazzii support co-transcription of these three genes (Fig. 3). However, promoter motifs are found upstream of the first two genes in S.castellii and upstream of all three genes in S.servazzii, suggesting that at least COX1 could also be transcribed independently.
In S.cerevisiae, the tRNAGlu and COB genes are co-transcribed (see below), while ATP9 is the first gene in a transcription unit containing tRNASer1 and VAR1 (reviewed in 52). The latter unit is not conserved in the Saccharomyces sensu lato yeasts (Fig. 1), but it seems to have emerged in the lineage leading to the Saccharomyces sensu stricto yeasts (21). The gene order of protein-encoding genes in the related yeasts, S.exiguus (53–55), C.glabrata and K.thermotolerans (38) provides some insight on the fate of COX2. Together with S.castellii, these yeasts share a cluster of six protein-encoding genes, COB-COX1-ATP8-ATP6-ATP9-COX2, which presumably existed already in the ancestral mtDNA. It could be that COX2 in both S.cerevisiae and S.servazzii, as well as ATP9 in S.servazzii, have moved to a new position. In S.castellii and S.servazzii, COX2 is succeeded by a tRNA cluster with the conserved gene order (Fig. 3, cluster 6 and see below). Since promoter motifs are found upstream of both the tRNA cluster and the COX2 gene, the latter seems to be transcribed alone. A cluster shared between S.castellii and S.servazzii contains the three genes, tRNAPhe, anti-tRNAThr1 and tRNACys (Fig. 3, cluster 2). The same gene order is seen in C.glabrata, but in this yeast tRNAThr1 is located on the same strand as all other genes, which implies that tRNAThr1 at some point in evolution has changed strand. When the mtDNA of S.servazzii was examined for putative initiation sites upstream of the anti-tRNAThr1, only motifs with the altered core sequence 5′-TAAAAG-3′ were found. In addition, S.castellii and C.glabrata share the gene order of the following two genes, tRNAVal and COX3, with S.cerevisiae. Notice that in S.cerevisiae, tRNACys has moved to a new position and that the cluster containing tRNAPhe, anti-tRNAThr1 and tRNACys has moved downstream of COX3 in S.servazzii.
In conclusion, the gene order found in S.castellii, S.servazzii, S.cerevisiae and C.glabrata suggests that tRNAPhe-anti-tRNAThr1-tRNACys-tRNAVal-COX3 comprise a conserved ancient block.
Whereas one of the transcription units in S.cerevisiae presumably contains tRNAPhe, tRNAVal and COX3 (47, reviewed in 52), promoter motifs are found upstream of both tRNAPhe and tRNAVal in S.castellii (Fig. 1), and in S.servazzii, where rearrangements have occurred, promoter motifs are found upstream of tRNAVal, tRNAPhe and ATP9. While the genes in cluster 2 (Fig. 3) are contained in one transcription unit in S.cerevisiae, at least two transcription units are likely to be present in S.castellii and S.servazzii.
As in S.cerevisiae and K.lactis, the SSU and tRNATrp genes are adjacent in S.castellii. Whereas S.castellii has a perfect promoter motif upstream of SSU and another promoter motif just upstream of tRNATrp, S.cerevisiae only seems to have an active promoter upstream of SSU. Since the gene order of SSU and tRNATrp is conserved in several yeasts, SSU has supposedly moved to a new location in S.servazzii (Fig. 3, clusters 3 and 5). Saccharomyces castellii and S.servazzii share the gene order, tRNATrp-LSU-tRNAThr2, with C.glabrata. In S.cerevisiae one transcription unit contains the LSU gene followed by tRNAThr2 (56,57). Whereas S.servazzii has a promoter motif just upstream of LSU, S.castellii does not and the transcription unit may begin with tRNATrp (Fig. 1).
Saccharomyces castellii and S.servazzii share the five-gene cluster, from tRNAGlu to tRNAPro, with C.glabrata, whereas in S.cerevisiae this cluster has been split into two (Fig. 3, cluster 4), with the tRNAGlu and COB genes probably being transcribed as one unit from a promoter upstream of tRNAGlu (58). Whereas the arrangement of the two genes is the same in the Saccharomyces sensu lato yeasts, the presence of promoter motifs differs. Such motifs are found upstream of tRNAGlu (but within VAR1) and COB in S.castellii, while none is found in S.servazzii. The lack of suitable promoter motifs could suggest that these yeasts have transcription initiation sites with alternative motifs. Both S.castellii and S.servazzii have putative endonucleolytic cleavage sites downstream of COB (see Supplementary Material).
As previously mentioned, the gene order of tRNAfMet, RPM1 and tRNAPro is conserved in many different yeast species: S.cerevisiae (41), C.glabrata (42), S.exiguus (43) and K.lactis. Saccharomyces castellii has one perfect promoter motif upstream of tRNAfMet, three promoter motifs within the RPM1 and one putative endonucleolytic cleavage site downstream of RPM1, whereas S.servazzii has just one perfect promoter motif upstream of tRNAfMet (Fig. 1, see also Supplementary Material). As in S.cerevisiae and C.glabrata, these three genes are likely transcribed as one unit in both S.castellii and S.servazzii.
A large cluster containing seven tRNA genes from tRNALeu to tRNASer2 is shared not only between S.castellii and S.cerevisiae (Fig. 3, cluster 5) but also with C.glabrata and K.lactis, whereas it is split up into smaller pieces in S.servazzii (Fig. 3) and P.canadensis. The conservation continues between S.castellii and K.lactis, where tRNASer2 is followed by tRNAHis and tRNAMet. In addition to S.castellii and S.servazzii, tRNAMet and VAR1 are adjacent in C.glabrata. The comparison of the three Saccharomyces yeasts and K.lactis suggests that the three tRNA genes specifying asparagine, serine and histidine have switched position with SSU in S.servazzii and that tRNAHis has moved in front of tRNALeu in S.cerevisiae (Fig. 3, clusters 3 and 5).
In S.castellii and S.servazzii the tRNA cluster (Fig. 3, cluster 6) located downstream of COX2 and containing tRNATyr, tRNAAsn, tRNAAla, tRNAIle and tRNASer1, is also found in K.lactis. In C.glabrata, the first four tRNA genes have the same order, whereas in S.cerevisiae and P.canadensis these genes are shuffled. In S.cerevisiae, these four genes are part of the large tRNA cluster, which is not adjacent to tRNASer1 and VAR1 (Fig. 3, cluster 3). Based on the presence of promoter motifs, the five tRNA genes are likely to be transcribed together in S.castellii, while tRNATyr in S.servazzii might be transcribed alone (Fig. 1).
Mitochondrial introns are divided into two groups, I and II, based on their function, structure and splicing mechanism. Based on sequence homology of both the nucleotide sequence and the expected amino acid sequence, introns have been predicted to interrupt genes in both S.castellii and S.servazzii (Table 1). The S.castellii COX1 gene is interrupted by two introns of 487 and 1123 nt in length. The ORF from the first exon continues only ~10 nt into the first intron (Sca cox1.1) and encodes neither a separate gene, if not spliced, nor a free-standing ORF. The second intron in the S.castellii COX1 gene (Sca cox1.2) shows high identity to the S.cerevisiae intron, Sc cox1.4. In addition, it shows limited identity to the S.cerevisiae intron Sc cob.4 and the P.canadensis intron, Pc cox1.1. Together with exon1 and exon2, it makes a 1797 bp ORF encoding 598 amino acids that shows high similarity (73% identity) to the DNA endonuclease I-SceII from S.cerevisiae and 61% identity to a probable site-specific DNA endonuclease from P.canadensis. The deduced amino acid sequence contains two LAGLI-DADG motifs (Table 1).
Saccharomyces servazzii COX1 contains three introns of 475, 1471 and 1230 nt in length. The first intron, Ss cox1.1, does not encode an ORF, while Ss cox1.2 contains a 1167 bp free-standing ORF encoding 388 amino acids with one or maybe two LAGLI-DADG motif(s). The third intron in S.servazzii COX1, Ss cox1.3, encodes together with the upstream exons a hypothetical protein of 675 amino acids with two LAGLI-DADG motifs and with homology to an intron-encoded endonuclease from Y.lipolytica (Table 1). The S.castellii LSU gene is interrupted by a 356 nt intron, Sca lsu.1, at the same position as the intron in the S.cerevisiae LSU gene, Sc lsu.1, also known as the ω (omega) intron (31,59). The S.castellii intron has previously been described as a group I intron resembling Sc lsu.1, and was shown not to encode a homing endonuclease gene (HEG) like I-SceI in Sc lsu.1 (60). In S.servazzii, a 326 nt intron was found at the same position as described for other Saccharomyces yeasts, including S.cerevisiae and S.castellii. Apparently, this intron does not encode an I-SceI homologue (Table 1).
Finally, a 1669 nt intron is found in the S.servazzii COB gene, Ss cob.1, and contains a 1401 bp free-standing ORF encoding 466 amino acids that shows limited similarity to several intron-encoded ORFs, especially from mtDNA of P.anserina, and to a hypothetical ORF from the mtDNA of S.cerevisiae, Q0255. The deduced amino acid sequence contains one LAGLI-DADG motif (Table 1). In addition to the intronic ORFs, the URF (ORF1) in S.servazzii contains one or maybe two LAGLI-DADG motif(s).
Approximately two-thirds of the S.cerevisiae mtDNA consists of intergenic sequences (2), which contain approximately 200 repetitive A+T-rich spacers with an average size of 190 bp, separated by G+C-rich clusters. The S.cerevisiae ori/rep sequences are formed by three GC-clusters having the conserved sequences, cluster A, GGGGGTCCC, cluster B, GGGACCCGG and cluster C, CACCCACCCCCTCCCCC (3). Cluster C or basic monomeric penta-C units and the flanking sequences, r* and r, are believed to be the most ancient sequence elements in the ori sequences, since bi-directional DNA replication is initiated by RNA primers starting at r* and r and continuing into DNA chains at cluster C (3,61). In S.castellii only one element was identical to cluster A, whereas similarities to all three GC clusters were found at several locations but never within reasonable distance and/or in the same direction. The mtDNA of S.servazzii was also examined for ori/rep-like GC clusters. Only two sequence elements identical to cluster A were found, one on each strand. Several penta-C and penta-G units were found in the mtDNA of S.castellii and S.servazzii but never followed by sequences homologous to r or r′, which are described in de Zamaroczy et al. (62). The mtDNA of S.servazzii is approximately one-fifth larger than the mtDNA of S.castellii, has a higher GC content and almost three times as many GC-rich clusters.
The mtDNAs of S.castellii and S.servazzii were also examined for repeats. According to the analysis shown in Figure 1, a few global repeats were found, while several simple repeats and local repeats were found predominantly in the intergenic regions (D.W. Ussery, unpublished data; see http://www.cbs.dtu.dk/services/GenomeAtlas/scast/mito_table.html). In the S.cerevisiae VAR1 gene, two GC clusters (46 bp in length) of opposite orientation have been shown to be involved in recombination leading to size variation of this gene (63). No such GC cluster was found in the S.castellii or S.servazzii VAR1 genes. Instead, the S.castellii VAR1 gene contains one pair of large global (direct) repeats. This region contains a significant proportion of local repeats (direct, inverted, mirror, everted) and perfect palindromes (Fig. 1 and D.W. Ussery, unpublished data; http://www.cbs.dtu.dk/services/GenomeAtlas/scast/mito_table.html). In addition to VAR1, two other pairs of global repeats are found in the S.castellii mtDNA; one pair is located in the intergenic region upstream of tRNAVal and the other pair is seen as the implied duplication of the 3′ end of the COB gene (Fig. 1). Global repeats are also seen in the S.servazzii mtDNA; one pair of inverted repeats and two pairs plus one set of three direct repeats (Fig. 1). Most repeats are located in intergenic regions but one pair of direct repeats is located in the COX1 and the LSU gene.
Whether the S.cerevisiae mtDNA is circular or linear is controversial (5), but the present sequence analysis suggests that S.castellii and S.servazzii mtDNAs can exist as circular molecules at least for a short period of their life cycle (Fig. 1). On the other hand, several yeasts have been shown to have linear mtDNA molecules (64). Whereas the coding potential of the S.cerevisiae, S.castellii and S.servazzii mtDNAs is similar, they differ significantly in their size (Fig. 1). The mtDNA of S.cerevisiae consists of 16% genes, 22% introns and 62% intergenic sequences (2). Considering the distribution in the mtDNA of S.castellii (53, 7 and 40%) and S.servazzii (46, 11 and 43%) it seems that as the genome size increases, the amount of intergenic regions and introns rises, confirming that the size variation of the mtDNA among yeasts belonging to the Saccharomyces genus is mainly caused by the variable length of intergenic sequences and the presence or absence of introns (19). As in S.cerevisiae, all annotated genes are located on one strand, except for tRNAThr1 (Fig. 1).
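The arithmetic behind this comparison can be made explicit with a small sketch that converts the quoted percentages into approximate base-pair counts. The genome sizes are those reported for the complete sequences (25 753 and 30 782 bp) and the 85.8 kb given for S.cerevisiae; because the percentages are rounded, the resulting figures are only approximate. The function names are assumptions made for illustration.

```python
# Illustrative sketch only: approximate bp per category from rounded percentages.
# Sizes and percentages are the values quoted in the text; names are hypothetical.

CATEGORIES = ("genes", "introns", "intergenic")

GENOMES = {
    "S.castellii": (25_753, (53, 7, 40)),
    "S.servazzii": (30_782, (46, 11, 43)),
    "S.cerevisiae": (85_800, (16, 22, 62)),
}


def composition_bp(size_bp, percentages):
    """Convert percentage shares of a genome into approximate bp counts."""
    return {
        category: round(size_bp * pct / 100)
        for category, pct in zip(CATEGORIES, percentages)
    }


def noncoding_bp(name):
    """Approximate bp of introns plus intergenic sequence for one genome."""
    size_bp, percentages = GENOMES[name]
    comp = composition_bp(size_bp, percentages)
    return comp["introns"] + comp["intergenic"]


if __name__ == "__main__":
    for name, (size_bp, percentages) in GENOMES.items():
        print(name, size_bp, composition_bp(size_bp, percentages))
```

Under these rounded inputs the gene-coding share stays within roughly 13.6 to 14.2 kb in all three genomes, while the intron plus intergenic share grows with genome size, which is the pattern described above.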
Saccharomyces castellii and S.servazzii use the yeast mitochondrial genetic code for translation of their mitochondria-encoded genes, but a few small differences in codon usage exist among the three Saccharomyces species (see Supplementary Material). Whereas S.servazzii almost entirely uses the CUA codon of the CUN family, S.castellii uses equal amounts of CUA and CUU. Another significant difference is the absence of the lysine-specifying AAG codon in S.castellii, while almost one-tenth of the lysine codons in S.servazzii are AAG. The fact that the CGN family is missing in several yeasts, such as S.castellii, S.servazzii, C.glabrata, K.thermotolerans and Y.lipolytica, but is present in S.cerevisiae, contradicts the previously suggested time point in evolution at which the CGN family should have become non-coding, namely in the lineage leading to S.cerevisiae after the separation from C.glabrata (reviewed in 65). It is more likely that the CGN family first became non-coding somewhere in the lineage leading to the Hemiascomycetes and was later regained in S.cerevisiae. Another explanation could be that the CGN family independently became non-coding in several lineages. Interestingly, while mtDNA of S.cerevisiae encodes tRNAArg with the anticodon ACG, S.pombe and all metazoan mitochondria have a tRNAArg with the anticodon UCG (reviewed in 65). Since no tRNAArg is imported into the S.cerevisiae mitochondria (reviewed in 66) and A normally only pairs with U, it is not known if the tRNAArg is functional in S.cerevisiae. However, unconventional pairing of A with C, G and A has been seen with in vitro translation using anticodon AGU in Mycoplasma capricolum (67). In Y.lipolytica, CGN codons have been found in intronic ORFs that also contain mutations, suggesting that these intronic ORFs are pseudo-genes (9).
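The three deviations of the yeast mitochondrial code from the universal code (AUA read as methionine, UGA as tryptophan and CUN as threonine) and the kind of codon counting that underlies a codon usage table can be sketched as follows. This is a minimal illustration, not the annotation procedure used for these genomes; sequences are handled as DNA (U is converted to T), and the function names and the toy sequences in the usage note are assumptions.

```python
# Illustrative sketch only: universal vs. yeast mitochondrial translation
# and simple codon counting. Function names are hypothetical.
from collections import Counter
from itertools import product

BASES = "TCAG"
# Universal code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), STANDARD_AAS)
}

# Yeast mitochondrial code: only the three differences named in the text.
YEAST_MITO = dict(STANDARD)
YEAST_MITO["ATA"] = "M"          # AUA -> methionine
YEAST_MITO["TGA"] = "W"          # UGA -> tryptophan
for third in BASES:
    YEAST_MITO["CT" + third] = "T"   # CUN -> threonine


def _codons(dna):
    """Split a sequence into complete codons, dropping any trailing bases."""
    dna = dna.upper().replace("U", "T")
    return [dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3)]


def translate(dna, table):
    """Translate in frame 1 with the given codon table ('*' marks a stop)."""
    return "".join(table[codon] for codon in _codons(dna))


def codon_usage(dna):
    """Count how often each codon occurs in frame 1."""
    return Counter(_codons(dna))
```

For the invented sequence `ATACTATGAAAG`, `translate(seq, STANDARD)` gives `IL*K` whereas `translate(seq, YEAST_MITO)` gives `MTWK`. A codon that never occurs, such as a member of the CGN family in a genome that lacks it, simply has a count of zero in `codon_usage`.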
The S.servazzii mitochondrial genome contains several +1 frameshifts that affect four (ATP6, COX2, COX3 and COB) of the nine detected exonic ORFs. In COX2, a +1 C frameshift affecting the third base of the codon would result in a methionine. A +1 C frameshift in COX2 was described in all the tested strains of C.glabrata except for CBS138 (37). Recently, sequencing of the mtDNA of the latter strain confirmed the presence of the same frameshift (A. Malpertuy, personal communication). Interestingly, the +1 frameshifts in COX2 of S.servazzii and C.glabrata are located near the end of the ORF and are only separated by five amino acids. Another case of +1 frameshift in COX2 was described in a mutant of S.cerevisiae that was shown to be suppressed to a high extent by a mutation in the anticodon stem of a tRNASer, through a likely alteration of the pairing between the tRNA and mRNA molecules (68,69). The mechanism described for S.cerevisiae is unlikely to be used in S.servazzii, as this would require different tRNAs to be specifically mutated to bypass the five detected frameshifts. Other suppression mechanisms involve the insertion of an extra nucleotide in the anticodon loop of the tRNA that would sterically modify the interactions between the codon and the anticodon and facilitate a 4-nt translocation (reviewed in 70). Except in the anti-tRNAThr1, which has an extra G in the anticodon stem, as in that of other yeasts, no extra nucleotide was detected in the S.servazzii mtDNA tRNAs. In COX3, both a +1 T and a +1 C frameshift affecting the third base of the codon would conserve the tyrosine. In COB and in ATP6, any combination other than the suggested +1 T residues would either impair the sequence conservation or introduce a CGN codon. Removing the first T residue in COB (Fig. 2) ensures the conserved glycine at position 75 and prevents the occurrence of the unusual TGG codon, which is not used in the S.servazzii or S.castellii mtDNA (see Supplementary Material). Programmed frameshifting was previously described for translation of ORF2 of the S.cerevisiae retrotransposon Ty (71). Several mechanisms of RNA editing were also described in mitochondria of various organisms that rely on the post-transcriptional modification of tRNAs or mRNAs, like U-deletion in the mitochondria of trypanosomes (72).
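The bookkeeping involved in proposing which residue to remove at a +1 frameshift can be illustrated with a minimal sketch. It says nothing about the biological mechanism, which remains open in the discussion above; it only shows how skipping one extra nucleotide restores the downstream reading frame. The function name and the toy ORF are assumptions and do not correspond to any S.servazzii sequence.

```python
# Illustrative sketch only: effect of skipping one extra nucleotide on the
# reading frame. The function name and sequences are hypothetical.

def read_codons(dna, skip=()):
    """Split a sequence into codons after removing the 0-based positions in `skip`.

    Trailing bases that do not complete a codon are dropped.
    """
    skipped = set(skip)
    kept = "".join(base for i, base in enumerate(dna.upper()) if i not in skipped)
    return [kept[i:i + 3] for i in range(0, len(kept) - len(kept) % 3, 3)]


if __name__ == "__main__":
    in_frame = "ATGGGTTATTAA"                 # ATG GGT TAT TAA
    shifted = "ATGGGT" + "T" + "TATTAA"       # one extra T inserted at index 6
    print(read_codons(shifted))               # frame is lost after the insertion
    print(read_codons(shifted, skip=[6]))     # frame restored by skipping the extra T
```

In this toy case the unshifted reading runs through the original stop codon, whereas skipping the inserted T recovers the original codons. Choosing among candidate positions, as done above for COB and ATP6, would additionally require checking sequence conservation and the absence of unused codons such as CGN.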
Compared to S.cerevisiae, only a few introns were found in the mtDNA of S.castellii and S.servazzii and only introns belonging to groups IA and IB were present, whereas no introns had the 5′ or 3′ splice site consensus of group II introns (Table 1). It could be that the Saccharomyces sensu lato yeasts have lost the group II introns or that these have been acquired only by the Saccharomyces sensu stricto lineage. Some of the intronic ORFs (Table 1) and one URF (ORF1) in S.servazzii contain the LAGLI-DADG motif found in RNA maturases and DNA endonucleases (reviewed in 73). Although intron mobility has not been demonstrated in S.castellii and S.servazzii, the presence of LAGLI-DADG motifs in intronic ORFs suggests that these introns may be or may once have been mobile (74). In S.cerevisiae, the Sc cob.4 encodes a maturase, which excises both the Sc cob.4 and the Sc cox1.4 intron from the cob pre-mRNA and the cox1 pre-mRNA, respectively (75). Skelly and Maleszka (76) found that two out of 21 investigated species, namely B.custersii and C.glabrata, contain only a cox1.4-like intron and no cob.4-like intron, as in S.castellii. A similar situation is found in P.canadensis (8). However, in S.cerevisiae Sc cox1.4 encodes a potential maturase that can be triggered by a single point mutation (77), suggesting that the cox1.4-like introns in the mentioned yeasts are self-splicing.
The mtDNA of S.cerevisiae carries many promoter motifs but not all have been shown to be active (1,47,78). Apparently, S.castellii and S.servazzii have similar consensus sequences to that of S.cerevisiae (Fig. 1 and Supplementary Material). However, as in S.cerevisiae, not all these sites are likely to be active. The 3′ end of all S.cerevisiae mitochondrial mRNAs is formed by endonucleolytic processing within the conserved sequence, 5′-AAUAAUAUUCUU-3′ (reviewed in 52). Surprisingly, the putative endonucleolytic cleavage sites differ not only between S.castellii and S.servazzii, but also from that of S.cerevisiae and that of C.glabrata (see Supplementary Material). The appearance of putative transcription initiation sites and endonucleolytic cleavage sites can nevertheless be used to propose possible transcription units in the investigated yeasts. Figure 3 shows the comparison of common gene clusters in the completely sequenced mtDNA of the Saccharomyces yeasts: S.castellii, S.servazzii and S.cerevisiae. Several of these clusters are also shared with other related species, like the sensu lato yeast, C.glabrata (13) and the less closely related petite-negative yeast, K.lactis (44, reviewed in 51). The sequence analysis clearly demonstrates that transcription units have not been completely conserved during evolution of the Saccharomyces yeasts. Apparently, transcription units containing several genes have been broken into separate units and/or two units have been fused into a single one in some yeast lineages.
The mtDNA molecules and their gene order can be rearranged if a segment carrying mitochondrial genes is (i) inverted or (ii) moved to a new position. Both of these events can be mediated by short intergenic repeats. In the case of an inversion only the repeats adjacent to the segment are involved, while in the case of a movement the repeats adjacent to the segment as well as those present at the new site are involved in the recombination event. If a segment is inverted, the coding strand of its genes changes relative to the rest of the genome. On the other hand, if a segment is moved to a new position in the genome (a transposition-like event), the coding strand is either preserved or changed. The orientation of genes is conserved in all the compared yeasts, with the exception of the C.glabrata tRNAThr1. This suggests that there has been a strong pressure to keep the coding potential concentrated on one strand. Therefore, we propose that the rearrangements have likely occurred exclusively through transposition-like events, and not inversions, using short intergenic repeated sequences as sites of illegitimate recombination, or homologous recombination using tRNA genes, in such a way that the jumping genes preserved their orientation and stayed on the same coding strand. This mechanism does not rely on the presence of a circular mtDNA form. While inversions seem to have been ‘prohibited’ through evolution of the Saccharomyces yeasts, they could have operated in the C.albicans lineage. The complete sequencing of the mitochondrial genome of C.albicans revealed that approximately one-third of the genes are located on the opposite strand (79). Whereas the native mitochondrial genomes of the Saccharomyces yeasts concentrate all genes on a single strand, S.cerevisiae mutants having genes on both strands have been reported (22). These mutants are respiratory competent, but are less competitive in genetic crosses (reviewed in 4). 
An elevated transmission capacity of the mtDNAs may be the reason why the coding potential in Saccharomyces yeasts is preserved on one strand and, as a result, transposition is a preferred mechanism to create novel gene orders.
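The contrast between the two rearrangement types can be made concrete with a minimal sketch in which a genome is a list of (gene, strand) pairs. It models only the orientation-preserving kind of transposition proposed above and ignores the repeats that would mediate recombination. The function names and the gene order in the usage note are assumptions made for illustration and do not represent the actual gene order of any of the yeasts discussed.

```python
# Illustrative sketch only: how inversion and orientation-preserving
# transposition affect strand assignment. Names and gene order are hypothetical.

def invert(genome, start, end):
    """Invert genome[start:end]: gene order is reversed and strands are flipped."""
    flipped = [
        (name, "-" if strand == "+" else "+")
        for name, strand in reversed(genome[start:end])
    ]
    return genome[:start] + flipped + genome[end:]


def transpose(genome, start, end, new_index):
    """Move genome[start:end] to `new_index` (an index into the genome with
    the segment removed), keeping gene order and strand unchanged."""
    segment = genome[start:end]
    rest = genome[:start] + genome[end:]
    return rest[:new_index] + segment + rest[new_index:]


def single_strand(genome):
    """True if every gene lies on the same strand."""
    return len({strand for _, strand in genome}) == 1


if __name__ == "__main__":
    toy = [("COX1", "+"), ("ATP8", "+"), ("ATP6", "+"), ("COB", "+"), ("COX2", "+")]
    print(single_strand(invert(toy, 1, 3)))        # False: inverted genes change strand
    print(single_strand(transpose(toy, 1, 3, 3)))  # True: new order, same strand
```

Both operations produce a new gene order, but only the transposition-like move keeps all genes on one strand, which is the constraint that the observed gene orders appear to satisfy.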
The following features obtained for S.castellii and S.servazzii are available at NAR Online: a table showing sequence homology between mitochondrial-encoded proteins from various ascomycetous fungi, a table showing the codon usage in the exonic ORFs in the two mitochondrial genomes, a table listing the putative transcription initiation sites and a table listing the putative endonucleolytic cleavage sites.
We are grateful to A. Malpertuy (Institut Pasteur, Paris) for communicating results before publication. This work has been partially supported by grants from the Danish Research Council, the Novo Nordisk Foundation, the Carlsberg Foundation, the Plasmid Foundation, the Institut National de la Recherche Agronomique, the Centre National de la Recherche Scientifique and the GDR/CNRS 2354 ‘Génolevures II’.
DDBJ/EMBL/GenBank accession no. AJ430679