Debaryomyces hansenii, a yeast that participates in the elaboration of foodstuff, displays important genetic diversity. Our recent phylogenetic classification of this species led to the subdivision of the species into three distinct clades. D. hansenii harbors the highest number of nuclear mitochondrial DNA (NUMT) insertions known so far for hemiascomycetous yeasts. Here we assessed the intraspecific variability of the NUMTs in this species by testing their presence/absence first in 28 strains, with 21 loci previously detected in the completely sequenced strain CBS 767T, and second in a larger panel of 77 strains, with 8 most informative loci. We were able for the first time to structure populations in D. hansenii, although we observed little NUMT insertion variability within the clades. We determined the chronology of the NUMT insertions, which turned out to correlate with the previously defined taxonomy and provided additional evidence that colonization of nuclear genomes by mitochondrial DNA is a dynamic process in yeast. In combination with flow cytometry experiments, the NUMT analysis revealed the existence of both haploid and diploid strains, the latter being heterozygous and resulting from at least four crosses among strains from the various clades. As in the diploid pathogen Candida albicans, to which D. hansenii is phylogenetically related, we observed a differential loss of heterozygosity in the diploid strains, which can explain some of the large genetic diversity found in D. hansenii over the years.
Debaryomyces hansenii is a ubiquist, hemiascomycetous yeast that can be found in soil, fruits, and various manufactured foodstuff in which it participates by contributing to the maturation or as a contaminant. Its ability to grow at low temperatures and in high salinity environments makes it the most common yeast in cheeses, to which it brings a number of proteolytic and lipolytic activities and aromas in the course of maturation. D. hansenii has also been implicated as an emerging pathogen, sometimes under the name of Candida famata var. famata (see reference 17). Taxonomic classification of the species related to D. hansenii has always been subject to debate. Recent analyses have reinstated D. hansenii (previously D. hansenii var. hansenii), Debaryomyces fabryi (previously D. hansenii var. fabryi), and Debaryomyces subglobosus (previously Candida famata var. flareri) (13, 25). Phylogenetic analysis using conserved spliceosomal intron sequence comparison has shown that D. hansenii is a complex of species, which comprises at least four members: D. hansenii, Debaryomyces tyrocola, D. fabryi, and Candida flareri (previously Candida famata var. flareri) (18). In addition, our study has revealed the existence of at least three populations (clades 1 to 3) in D. hansenii, with the first one containing the strain CBS 767T, which has been entirely sequenced (8), and the last one containing Candida famata var. famata CBS 1795.
Most eukaryotic nuclear genomes contain pieces of mitochondrial sequences (designated NUMT [nuclear mitochondrial DNA] for nuclear sequences of mitochondrial origin) that result from the transfer of fragments of mitochondrial DNA (mtDNA) to the chromosomes. The number and size of the NUMTs varies greatly between eukaryotic genomes (33). A recent investigation of six hemiascomycetous yeasts has shown that even within this monophyletic group, the number of NUMTs varies greatly, from 1 in Kluyveromyces thermotolerans CBS 6340T to 145 in D. hansenii CBS 767T (36). The mtDNA is thought to invade nuclear genomes during the repair of chromosomal DNA double-strand breaks (DSB) by nonhomologous end joining (NHEJ), as shown experimentally in the yeast Saccharomyces cerevisiae (31, 44). The colonization of nuclear genomes by mtDNA is a dynamic evolutionary process, as observed in yeast and humans (3, 32).
D. hansenii harbors the highest number of NUMTs known so far for hemiascomycetous yeasts, making it of particular interest for NUMT studies. Conversely, NUMTs are potentially interesting markers to differentiate strains of this species. The 145 NUMTs of type strain CBS 767T are distributed in 86 loci (61 single NUMTs and 25 clusters). Most clusters (23 of 25) are mosaics of NUMTs formed from noncontiguous mtDNA fragments inserted in random orientation at the same chromosomal locus. In the other two clusters, the NUMTs are all in the same orientation and order as in the mitochondrial genome. These clusters (designated “processions”) correspond to a single ancient mtDNA insertion, followed by mutational decay, leaving recognizable mtDNA segments separated by more diverged sequences (36).
Few studies have attempted to evaluate the variability of NUMTs within the same species (2, 23, 32). Here, we have studied natural isolates to assess the intraspecific variability of the NUMT insertions in the nuclear genome of the yeast species D. hansenii. We were able to structure populations in this species, to determine the chronology of the NUMT insertions, and to correlate this chronology to the taxonomy of the D. hansenii complex species. Moreover, NUMT analysis revealed the existence of both haploid and diploid strains, the latter resulting from crosses between different D. hansenii clades.
The yeast strains used in this study were obtained from the CIRM-Levures (http://www.inra.fr/cirmlevures) and from the Centraalbureau voor Schimmelcultures (http://www.CBS.knaw.nl) and are listed in Table 1. Cells were routinely grown in YPD medium (1% yeast extract, 1% peptone, 1% glucose) at 28°C with shaking.
The oligonucleotide primers used in this study are available upon request. They were designed from the latest release of the genome of D. hansenii CBS 767T (http://www.genolevures.org). Genomic DNA was extracted as previously described (16). PCR amplification was performed as described in reference 18. PCR products were separated by electrophoresis on 1% agarose gels.
PCR fragments on both strands were sequenced by Genome express (Meylan, France), using primers that served for the PCR amplification. Sequences were processed with the phred/phrap/consed package (12). Sequences were analyzed with various programs in the GCG (Genetics Computer Group, Madison, WI) environment, including BLAST and FASTA. Sequence alignments were generated by using Clustal X (22) and were manually adjusted with GeneDoc (http://www.psc.edu/biomed/genedoc).
The NUMTs of D. hansenii strain CBS 767T analyzed here are described in reference 36, with the coordinates on the mitochondrial (GenBank accession number DQ508940) and nuclear (the last release of the genome of D. hansenii CBS 767T (http://www.genolevures.org)) genomes. The prefix letters p, m, and s in Table 2 indicate procession cluster, mosaic cluster, and single NUMT, respectively. The associated number refers to the position in the genome (36). Mosaic 26 (m26) is a new cluster made of a single D95 and a new NUMT adjacent to D95 (nuclear coordinates 58517 to 58482; mitochondrial coordinates 3630 to 3665), which was not detected previously.
Genomic DNA in agarose plugs was prepared according to the method in reference 25. Chromosomes were separated using a Bio-Rad Mapper apparatus in 0.5× Tris-borate-EDTA (TBE) running buffer at 14°C for 24 h with 500-s pulses at 3 V/cm and a 106° angle and for 24 h with 120-s pulses at 5.2 V/cm and a 120° angle in 1% pulse-field gel agarose (Bio-Rad) gels. Gels were stained in 0.5 μg/ml ethidium bromide for 1 h and destained for 3 h.
Cells were grown in YPD at 28°C to optical densities at 600 nm (OD600) of 0.5 to 0.8. A total of 3 ml of the cell cultures was centrifuged for 5 min at 2,500 rpm, and the supernatant was discarded. The pellet was resuspended in 100 μl of 70% ethanol and incubated at 4°C overnight. Cells were pelleted, and the supernatant was discarded. The cells were then washed with 1 ml of 50 mM sodium citrate at pH 7, and the cell count was measured using absorbance (1 OD600 unit is equivalent to 2.8 × 10⁸ cells). A total of 2.5 × 10⁶ cells were resuspended in 200 μl sodium citrate and treated with RNase at a final concentration of 1 mg/ml for 1 h at 37°C. A volume of 800 μl of 50 mM sodium citrate with propidium iodide was added to reach a final concentration of 50 μg/ml propidium iodide. Cells were sonicated by a Branson Sonifier B12 (Branson Sonic Power Co., Danbury, CT) for 20 s at 25 W. One ml of 50 μg/ml propidium iodide in sodium citrate buffer was added, and samples were analyzed. All flow cytometry measurements were performed in duplicate using a FACSCalibur system (BD Biosciences, San Jose, CA) equipped with an argon ion laser, with an excitation power of 15 mW at 488 nm. Forward scatter (FSC) was analyzed on linear scales, side scatter (SSC) was analyzed on logarithmic scales, and red fluorescence intensity (FL2) was analyzed on linear scales. Analysis gates were set around debris and intact cells on an FSC-versus-SSC dot plot. The fluorescence histograms of 30,000 cells were generated using the gated data. Data acquisition and analysis were performed using CellQuest software. The Saccharomyces cerevisiae haploid strain BY4741 and diploid strain FY1679 isogenic to strain S288C (43) were used for calibration.
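The dilution arithmetic in this protocol can be checked with a short sketch. It assumes that the stated conversion (1 OD600 unit equivalent to 2.8 × 10⁸ cells) applies per milliliter of suspension, which the text does not spell out; the function names are illustrative only and are not part of the published method.

```python
# Illustrative helpers for the sample-preparation arithmetic.
# ASSUMPTION: the OD600-to-cell conversion from the text is per ml of suspension.
CELLS_PER_OD_PER_ML = 2.8e8


def suspension_volume_ul(od600, target_cells=2.5e6):
    """Volume (in microliters) of a suspension at the given OD600
    that contains `target_cells` cells."""
    if od600 <= 0:
        raise ValueError("OD600 must be positive")
    cells_per_ml = od600 * CELLS_PER_OD_PER_ML
    return target_cells / cells_per_ml * 1000.0


def required_stock_conc(final_conc, added_volume, initial_volume):
    """Concentration the added solution must have so that the mixture
    reaches `final_conc` (same concentration units in and out;
    both volumes in the same units)."""
    if added_volume <= 0:
        raise ValueError("added volume must be positive")
    return final_conc * (initial_volume + added_volume) / added_volume


if __name__ == "__main__":
    # Volume holding 2.5e6 cells for a suspension read at OD600 = 0.5
    print(round(suspension_volume_ul(0.5), 2), "ul")
    # Propidium iodide concentration needed in the 800 ul that is added
    # to 200 ul of cells to reach 50 ug/ml in the mixture
    print(required_stock_conc(50.0, 800.0, 200.0), "ug/ml")
```

Under that per-milliliter assumption, a suspension read at OD600 = 0.5 would need roughly 18 μl to supply 2.5 × 10⁶ cells, and the 800 μl of added citrate buffer would need to carry propidium iodide at 62.5 μg/ml for the 1-ml mixture to reach 50 μg/ml.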
EMBL accession numbers for the sequences reported here range from FN434138 to FN434195.
We have examined the conservation of 21 NUMT insertions (four single NUMTs and 17 clusters of NUMTs, 16 mosaic and 1 procession) previously detected in strain CBS 767T (36) in a collection of 26 natural isolates of D. hansenii (Table 1). Primers for each of these 21 loci were designed on the sequence of CBS 767T and were used to amplify the nuclear DNA regions of these 26 strains, originating from various niches like seawater, fruit, and foodstuff (beef sausages and cheese). Among them, a total of 13 strains had been previously assigned to three different clades on the basis of sequence divergence (18), while the others have not been classified yet. Two Debaryomyces tyrocola strains, CBS 766T and CLIB 660, were used as outgroups.
The majority of the PCR products showed sizes similar to those of strain CBS 767T (Table 2), suggesting conservation. A total of nine PCR products were sequenced, and all sequences were identical to those of CBS 767T (data not shown). In a number of cases, PCR products had smaller sizes than those of CBS 767T, suggesting an absence of NUMTs at the considered locus (Fig. 1A). Sequencing of the PCR products confirmed the absence of NUMTs at the considered loci (see Fig. S1A in the supplemental material). We found one exception among the four strains of clade 2 (CLIB 617, CLIB 665, CLIB 667, and CLIB 698), where the smaller-sized band observed with the NUMT mosaic 26 region was due to a deletion of 268 bp of nuclear DNA located 106 bp downstream of the NUMT and not to the absence of the NUMT mosaic (see Fig. S1B in the supplemental material). The fact that these strains contain the same NUMT as the type strain CBS 767T and its close relatives indicates that the deletion occurred independently of the mtDNA insertion.
Larger PCR products were also found. The regions of NUMT mosaic 26 were slightly larger (99 bp, 67 bp, and 99 bp) in the strains of clade 3 (CBS 1795, CLIB 613, and CBS 5139, respectively). However, the sequence revealed an absence of mosaic 26, partially compensated in size by a new NUMT mosaic specific to clade 3 (see Fig. S2 in the supplemental material). This high polymorphism of PCR product size was generally observed in clade 3 strains, which led us to systematically sequence at least one strain of this clade.
Interestingly, the NUMT mosaic 10 regions in some clade 1 strains (CLIB 236, CLIB 249, CLIB 539, CLIB 541, CLIB 543, CLIB 629, and CLIB 700) are 499 bp larger than that in CBS 767T and slightly larger than those in the strains of clade 3 (CBS 1795, CLIB 613, and CBS 5139) (Fig. 1B). This will be analyzed in detail below.
Finally, in nine strains (CLIB 380, CLIB 542, CLIB 594, CLIB 608, CLIB 611, CLIB 657, CLIB 684, CLIB 685, and CLIB 702), amplification of four NUMT loci (single D18; mosaics 4, 10, and 15) revealed two PCR products of similar amounts, one with a size similar to the sizes of the clade 3 strains (corresponding to the absence of NUMT) and the other one with a size similar to the sizes of the clade 1 strains (corresponding to the presence of NUMT). The sequences of the latter products were identical to those of the clade 1 strains, whereas the sequences of the former products were identical to those of the clade 3 strains (Fig. 1C and 2). These results are consistent with the heterozygous diploid status of these strains, as previously observed in strains CLIB 594, CLIB 626, and CLIB 662 for the ACT1, the RPL31, and/or the RPL33 genes (18). Taken together, these observations indicate that the heterozygous diploids result from crosses between strains from clades 1-2 and strains from clade 3.
We concentrated our analysis on the eight most discriminatory loci out of the 21 loci previously tested to characterize and to classify the wider set of strains (77 in total) available at the CIRM-Levures. The PCR product size analysis led to the definition of 13 NUMT patterns, each corresponding to a group (see Table S1 in the supplemental material). The first group comprised CBS 767T and four other strains. The second group was composed of 22 strains identical to those of group 1, except for the larger-size NUMT mosaic 10 (Fig. 1B). A total of 19 strains, which differed from the group 1 only because of the absence of mosaic 14, form group 3 (Table 2; see Fig. S1A in the supplemental material). Group 4 was composed of three strains, which displayed an absence of NUMT in all eight tested loci. Remarkably, group 1 and 2 strains belong to clade 1, group 3 strains belong to clade 2, and group 4 contains the three typical C. famata strains making up clade 3.
The rest of the strains showed heterozygosity suggestive of diploidy and could be separated into nine groups on the basis of the distribution of this heterozygosity. Complete results are shown in Table S1 in the supplemental material. We found that in the heterozygous diploids, the m10 allele that does not carry a NUMT insertion exists as two forms, one corresponding to a PCR product with a size similar to that of the clade 3 strains and present in three strains, CLIB 611, CLIB 614, and CLIB 380, and one that is associated with a shorter PCR product (by about 20 bp), probably due to a deletion. This indicates that at least two different strains from clade 3 have contributed to the formation of the diploids. In order to estimate the number of contributors from clades 1 and 2 that were involved in these crosses, we analyzed the sequence of the only homozygous locus, m19, in the 28 diploid strains after specific PCR amplification. We found no sequence variation in the NUMT mosaic itself but instead found a deletion of 6 bp, GGAAGA, located 33 bp upstream of the first NUMT in eight strains (see Table S1 in the supplemental material), indicating that at least two different strains belonging to clades 1 and 2 were involved in the crosses. Overall, at least four hybridization events are likely to have occurred to form the heterozygous diploids tested here.
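The grouping logic described above (one presence/absence call per locus, then strains with identical call patterns placed in the same group) can be sketched as follows. This is a minimal illustration, not the procedure actually used: the band sizes, the size tolerance, and the strain labels in the usage lines are hypothetical, and a size-only call can be misleading (as with the mosaic 26 deletion in clade 2), which is why ambiguous products were sequenced.

```python
from collections import defaultdict


def call_locus(band_sizes_bp, with_numt_bp, without_numt_bp, tol_bp=15):
    """Call one locus from the PCR band sizes observed for one strain.

    Returns "+" (NUMT present), "-" (NUMT absent), "+/-" (both alleles
    amplified, suggestive of a heterozygous diploid) or "?" (no band
    matches either expected size, so sequencing would be needed).
    `tol_bp` is a hypothetical gel-resolution tolerance.
    """
    has_plus = any(abs(b - with_numt_bp) <= tol_bp for b in band_sizes_bp)
    has_minus = any(abs(b - without_numt_bp) <= tol_bp for b in band_sizes_bp)
    if has_plus and has_minus:
        return "+/-"
    if has_plus:
        return "+"
    if has_minus:
        return "-"
    return "?"


def group_strains(patterns):
    """Group strains that share exactly the same tuple of locus calls."""
    groups = defaultdict(list)
    for strain, pattern in patterns.items():
        groups[tuple(pattern)].append(strain)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical sizes: 900 bp with the NUMT, 600 bp without it.
    print(call_locus([900, 600], 900, 600))   # two products
    print(call_locus([905], 900, 600))        # single product near 900 bp
    # Hypothetical strains scored at two loci.
    print(group_strains({
        "strain_A": ("+", "+"),
        "strain_B": ("+", "+"),
        "strain_C": ("+/-", "+"),
    }))
```

Applied to eight loci, each distinct tuple of calls corresponds to one NUMT pattern; strains with at least one "+/-" call are the candidates for diploidy that the flow cytometry measurements then test.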
The mosaic 10 locus revealed a larger PCR product than our reference in seven strains (CLIB 236, CLIB 249, CLIB 539, CLIB 541, CLIB 543, CLIB 629, and CLIB 700). The sequence showed that these strains carried a longer NUMT mosaic, along with additional upstream sequences (Fig. 3). The distal left NUMT was extended by 20 bp, and a new NUMT of 50 bp was found immediately upstream of the cluster (positions 25480 to 25529 in mtDNA, within the last exon of COX1). BLASTX searching on the rest of the extra sequence (432 bp) showed that the first 239 bp were similar to those of the D. hansenii DEHA2G14058 ENO1-like gene (64% amino acid similarity), followed by a 120-bp sequence similar to that of the pDHL1 linear plasmid of D. hansenii (14). The remaining 73 bp did not give any BLAST hit. In CBS 1795 (clade 3), the entire region displayed similarity with ENO1 but did not carry mtDNA and linear plasmid DNA insertions. The clade 1 strains also underwent internal deletions. The presence of an ENO1 pseudogene devoid of insertions in the clade 3 lineage suggests that neither mitochondrial nor plasmid insertion triggered pseudogenization; instead, both contributed to the degradation of the pseudogene.
We also focused on NUMT D37, which overlaps a gene of unknown function (DEHA2C03388g) in type strain CBS 767T, where it could provide an extended N-terminal domain to the protein product (36). This NUMT was found in all strains of D. hansenii and in D. tyrocola. A comparison of the sequences of the D37 regions from CBS 1795 (clade 3), CLIB 660, and CBS 766T (D. tyrocola) showed that the NUMT of D. tyrocola CBS 766T was exactly as long as that of CBS 767T and had only four single nucleotide polymorphism (SNP) differences that do not create any in-frame stop codons in DEHA2C03388g, while the NUMTs of clade 3 CBS 1795 and D. tyrocola CLIB 660 created a frameshift, with a stop codon in the beginning of the gene (see Fig. S3 in the supplemental material).
Finally, we found that NUMT D114 is flanked by a duplication of 14 bp imperfectly repeated downstream from the NUMT (see Fig. S1E in the supplemental material). This observation suggests that the NUMT insertion was concomitant to a microrearrangement which duplicated 14 bp and that NUMT insertion may be associated with microrearrangements.
Out of the 77 strains analyzed here, a total of 28 strains carried at least one heterozygous locus for the presence/absence of NUMT, raising the possibility that these strains were diploid or aneuploid. We addressed this question by measuring their total DNA content using flow cytometry. Figure 4A and B clearly shows two peaks for CBS 767T and CLIB 667, corresponding to two subpopulations, one with a 1n content (G0/G1 phases) and the other with a 2n content (G2/M phases) of the 12.7-Mb genome. We concluded that these two strains were haploid, in agreement with previous results for CBS 767T (42). For CLIB 380 and CLIB 702, a clear shift of the two peaks toward a double amount of DNA was observed (Fig. 4C and D). The first peak (G0/G1) coincides with the second peak (G2/M) of the two haploid strains described above (Fig. 4). The second peak of CLIB 380 and CLIB 702, although a little flattened in our experiments, coincides with the second peak of the diploid S. cerevisiae strain FY1679 (not shown), consistent with these two strains being diploid.
The genome sizes of 19 Debaryomyces strains were estimated from the G0/G1 peak median fluorescence intensity (MFI), using the genome size of S. cerevisiae as the calibration. Results are shown in Table 3. A total of 10 strains have an estimated genome size between 13.4 and 18.9 Mb. The genome size of CBS 767T was estimated at 13.8 Mb, consistent with the size of 12.7 Mb deduced from the sequence without the ribosomal DNA (rDNA) repeats and the telomeric regions (8). Nine other strains have an estimated genome size between 29.0 and 33.2 Mb, corresponding to a 2n content in the G0/G1 phases and in agreement with the presence of heterozygous alleles of NUMTs.
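The text does not give the exact calibration formula, so the following is only a sketch of one plausible approach: a straight line fitted through the G0/G1 peak MFI values of the haploid and diploid S. cerevisiae calibrants, then applied to a sample's G0/G1 MFI. The MFI values and the S. cerevisiae haploid genome size used in the usage lines are assumptions for illustration and do not come from the article.

```python
def two_point_calibration(mfi_haploid, mfi_diploid, haploid_size_mb):
    """Return a function converting a G0/G1 MFI into a genome size (Mb).

    ASSUMPTION: fluorescence is linear in DNA content, and the diploid
    calibrant carries exactly twice the DNA of the haploid calibrant.
    """
    if mfi_diploid == mfi_haploid:
        raise ValueError("calibrant MFIs must differ")
    # The diploid calibrant adds one more haploid genome's worth of DNA.
    slope = haploid_size_mb / (mfi_diploid - mfi_haploid)
    intercept = haploid_size_mb - slope * mfi_haploid

    def mfi_to_mb(sample_mfi):
        return slope * sample_mfi + intercept

    return mfi_to_mb


if __name__ == "__main__":
    # Hypothetical MFI readings for the two calibrant strains, and an
    # assumed haploid S. cerevisiae genome size of about 12.1 Mb.
    to_mb = two_point_calibration(mfi_haploid=100.0,
                                  mfi_diploid=200.0,
                                  haploid_size_mb=12.1)
    for mfi in (115.0, 250.0):
        print(mfi, "->", round(to_mb(mfi), 1), "Mb")
```

Whatever calibration is used, the interpretation in the text rests on the two non-overlapping ranges of estimates (13.4 to 18.9 Mb versus 29.0 to 33.2 Mb) rather than on any single value.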
Pulsed-field gel electrophoresis (PFGE) was performed to compare haploid and diploid strains in order to get an insight into the diversity of chromosome structure in D. hansenii, with respect to the ploidy of the strains. In Fig. 5, chromosome separation of the reference CBS 767T displays, as expected, a total of six bands, the 1.6-Mb band being a doublet made of chromosomes C and D. Despite the chromosome length polymorphism shown by strains CBS 767T, CLIB 236, CLIB 617, CBS 1795, and CLIB 660 for some chromosomes, these strains tend to harbor a similar number of chromosomal bands, amounting to six or seven chromosomes per strain. On the other hand, at least nine bands, without considering all doublets, can be seen in the chromosome separations of strains CLIB 380, CBS 1102, CLIB 594, CLIB 662, and CLIB 702. Based on the intensity of the various bands, we can deduce the number of chromosomal bands approaching 14 in these strains, consistent with their diploid status.
In order to establish a chronology of the mtDNA insertions, a presence/absence profile (Table 2) was defined for each NUMT locus, and sets of NUMT loci sharing the same profile were constituted. Figure 6A summarizes the results obtained with each clade of D. hansenii and the D. tyrocola species being represented by a single strain. The first set contains two NUMT loci (procession 6 and D37 single NUMT) present in all strains of D. hansenii and in D. tyrocola CBS 766T, showing that the two insertions occurred in the common ancestor of the two species. The second set includes six loci specifically found in all D. hansenii clades but absent in D. tyrocola, indicating that the insertion events occurred in the ancestor of the three D. hansenii clades after the D. hansenii/D. tyrocola divergence. Twelve loci present only in clades 1 and 2 of D. hansenii constitute the third set and bear witness to insertions that occurred after their divergence from clade 3. Finally, only one mosaic (mosaic 14) differentiates clade 1 from clade 2, suggesting that this insertion is the most recent event in our data set. Moreover, the average percentage of sequence identity between the NUMTs and their mitochondrial counterparts for each set increases from 87% to 98%, in agreement with the inferred chronology (Fig. 6B).
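The construction of these sets can be sketched in a few lines: loci with the same presence/absence profile across the four taxa are pooled, and the pools are ordered from the most widely shared (oldest) to the most restricted (most recent). The ordering assumes that each insertion happened once and was not subsequently lost. The profiles for procession 6, D37, mosaic 10, and mosaic 14 follow the text; `set2_example` is a placeholder name, since the six loci of the second set are not listed individually here.

```python
from collections import defaultdict

# Column order of every profile tuple (1 = NUMT present, 0 = absent).
TAXA = ("D. tyrocola", "clade 3", "clade 2", "clade 1")


def chronology_sets(presence):
    """Pool loci sharing a presence/absence profile, oldest set first.

    ASSUMPTION: an insertion shared by more taxa predates one shared by
    fewer (single insertion event, no secondary loss).
    """
    sets = defaultdict(list)
    for locus, profile in presence.items():
        sets[tuple(profile)].append(locus)
    return sorted(sets.items(), key=lambda item: -sum(item[0]))


if __name__ == "__main__":
    toy_profiles = {
        "p6": (1, 1, 1, 1),            # procession 6
        "D37": (1, 1, 1, 1),
        "set2_example": (0, 1, 1, 1),  # placeholder for a second-set locus
        "m10": (0, 0, 1, 1),           # mosaic 10
        "m14": (0, 0, 0, 1),           # mosaic 14
    }
    for profile, loci in chronology_sets(toy_profiles):
        present_in = [t for t, p in zip(TAXA, profile) if p]
        print(loci, "present in", present_in)
```

The independent check reported in the text, that mean NUMT-to-mtDNA identity rises from the oldest set to the most recent one, is what supports this parsimony-based ordering.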
An interesting point in this chronology concerns mosaic 10, which belongs to the third set, signing an insertion that occurred in the ancestor of clades 1 and 2. Surprisingly, a subgroup of strains from clade 1 harbors the nontruncated conformation, from which the truncated conformation resulted by deletion of 499 bp (Fig. 3). This raises the question of whether the same deletion occurred twice in two different lineages or whether a group of strains recovered the nondeleted allele by recombination. The two conformations also differ by six SNPs and one indel of seven nucleotides, with the same sequence being found in the truncated conformations of clades 2 and 1, suggesting that these mutations are linked to the deletion event.
Taking the complete genome sequence of D. hansenii strain CBS 767T as a reference, we analyzed the variability of NUMT insertions within a population of strains of this species, using the closest relative species D. tyrocola as an outgroup. Analyses of the presence and absence of NUMTs were first performed on 28 strains from diverse origins, with 21 analyzed loci, and then on a larger panel of strains (77 in total), with 8 selected loci. The latter analysis led to the constitution of 13 groups. In a previous study, we had subdivided the D. hansenii species into three phylogenetic clades (18) on the basis of sequence divergence. NUMTs in the closely related clades 1 and 2 are recent insertions, as more than half of them (13 out of 21) could not be found outside these clades. Only two NUMT insertions (single D37 and procession 6) were found common to all the D. hansenii clades and to D. tyrocola, indicating that insertions were persistent when present in the ancestor of the two species. We found no variation of NUMT insertion within each D. hansenii clade, although we used strains from various origins. This is in contrast with the observed NUMT variability in maize inbred lines (23) and in honey bees (2). The lack of variability between isolates from the same clade could simply indicate there is little exchange between isolates from different clades through homologous recombination, in agreement with reference 42. Close examination of NUMT insertion distribution showed that the classification defined by NUMT insertions (Fig. 6) agrees with the phylogeny deduced from intron sequence analysis (18). This observation implies that NUMT insertions may be used for facilitating the classification of D. hansenii isolates, as suggested by Hazkani-Covo (15) in her study of primate NUMT polymorphism. In addition, our study shows for the first time in yeast that colonization of the nuclear genome by mtDNA is a continuous phenomenon.
Several hypotheses can be put forward to explain the formation of the NUMT clusters as follows, since mitochondrial DNA pieces can be found as single insertions or as clusters inserted at the same genomic location: (i) successive independent NUMT insertions at a specific DSB locus, which can be considered an insertion hot spot; (ii) a single-step insertion, which combines various mtDNA fragments at a specific DSB locus; and (iii) the insertion of an entire mitochondrial chromosome, followed by large-scale rearrangements. The fact that we did not observe any intermediate in the NUMT clusters studied here does not support the first hypothesis. Our results do not allow us to choose between the two other hypotheses, but the absence of large NUMTs (they are all more than 2 orders of magnitude smaller than the mitochondrial genome) in all hemiascomycetes analyzed so far supports the second hypothesis (36).
The simultaneous presence of a NUMT insertion (mosaic 10) and a fragment of a linear plasmid DNA in an ENO1 pseudogene is intriguing. A recent analysis of integrated copies of genes carried by DNA plasmids or RNA viruses (designated NUPAV, for nuclear sequences of plasmid and viral origin) into yeast nuclear genomes failed to detect mixed clusters of NUPAV and NUMT in D. hansenii CBS 767T (11a). However, we could find a mixed mosaic in CBS 767T: a NUPAV homologous to linear plasmid pDHL1 ORF2 is present immediately upstream of NUMT mosaic 20 (positions 1260073 to 1260126 of chromosome F) (our unpublished data). The discovery of these two mixed insertion sites supports a similar mechanism for NUMT and NUPAV insertion. We speculate that mitochondrial and plasmid DNA have inserted simultaneously in both cases, although independent NUMT and NUPAV insertions cannot be ruled out, especially for mosaic 10, where low traces of the ENO1 gene were detected between the two elements. It is notable that the NUPAV/NUMT mosaic 10 was not responsible for the inactivation of the ENO1-like gene, in contrast to a number of observations involving NUMT insertions in the generation of pseudogenes (28, 37, 40, 41).
The persistence of NUMT D37 is particularly interesting, as this NUMT overlaps a gene at its 5′ end, extending the protein-coding sequence in D. hansenii CBS 767T. The presence of this NUMT in D. tyrocola CBS 766T, which conserves the upstream open reading frame, supports the hypothesis that the NUMT could act positively on the evolution of the protein by addition of new sequence. However, the disruption of this open reading frame in the NUMT part of the gene in the strains from clade 3 and in strain CLIB 660 of D. tyrocola argues against this hypothesis. A positive role for NUMTs that could create new exons has been reported recently; however, the phenomenon is much more obvious in humans and plants than in the yeast S. cerevisiae, where it appears to be very rare or to have ceased (26). The same may be true for other yeasts.
With the NUMT insertion analysis, we confirmed and expanded our previous observations (18), namely, that (i) a number of D. hansenii isolates carry heterozygous alleles with sequences belonging to the three clades, mainly clades 2 and 3, and (ii) the heterozygosity is not conserved for all markers. The original finding of D. hansenii heterozygous diploids was surprising, since it was claimed that D. hansenii individual cells mate very rarely and hardly undergo genetic exchange (42). The presence of two distinct PCR products at the same NUMT loci could have been interpreted as the presence of two tandemly repeated sequences, with only one having received a NUMT insertion. Sequence analysis performed here rules out the latter hypothesis, as we show that sequences, being 100% identical to that of one strain of each clade, are clearly brought by different contributors (Fig. 2) (our unpublished data).
The PFGE analysis of some isolates that harbored heterozygous alleles has shown that the number of chromosomal bands in these isolates is close, if not equal, to double of that of CBS 767T, consistent with the idea that these strains could be diploid. Indeed, previous PFGE analyses had raised the possibility that some D. hansenii strains were diploids or polyploids (4, 27). The ploidy level measured here by flow cytometry supports our proposal, since all the strains tested that are heterozygous for at least one NUMT marker carried an amount of DNA corresponding to a 2n genome. In Zygosaccharomyces bailii and in Zygosaccharomyces rouxii, both haploid and diploid isolates were also found (34, 39). A recent study of the progeny of Candida lusitaniae crosses showed that it contained haploid, aneuploid, and diploid cells (30). Some of the isolates tested here may also be aneuploid or aneupolyploid.
Interestingly, different types of rDNA sequences were found in diploid strains of Z. rouxii (39), and one strain was proposed to derive from a recent hybridization between two related species (12). In D. hansenii, three loci of rDNA were detected from the genome sequence of clade 1 CBS 767T (8), suggesting that even the type strain CBS 767T could be derived from previous hybridization events. The hybrid nature of rDNA was recently claimed to be representative of the origin of the strains from the analysis of a large number of S. cerevisiae genomes (20). We had previously observed that some D. hansenii isolates carry genetic material from at least three different contributors (18). According to our flow cytometry measurements, some strains could carry supernumerary chromosomes, since they have a variable genome size, similar to one of the Z. rouxii strains analyzed in reference 39. This difference in DNA content could also reflect a variable contribution from mtDNA; it has been observed exceptionally, in Kluyveromyces bacillisporus, that half of the total cellular DNA was mitochondrial (21).
From the presence of heterozygous diploid NUMT loci, whose sequences completely match sequences from haploid strains belonging to clades 1 or 2 and 3, and from a close to doubled number of chromosomes in addition to a 2n DNA content, we conclude that some D. hansenii strains are diploid and result from hybridization between strains from clades 1 or 2 and strains from clade 3. In the nine diploid strains (CLIB 380, CLIB 542, CLIB 608, CLIB 611, CLIB 657, CLIB 684, CLIB 685, CLIB 702, and CLIB 594), only 5 out of 11 discriminatory loci (excluding mosaic 5 because it was not amplified in clade 3 strains) are heterozygous for the presence/absence of NUMT insertion. Most of the markers tested are homozygous for clades 1 and 2 alleles (see Table S1 in the supplemental material), indicating that a massive loss of heterozygosity (LOH) has taken place. LOH has been observed in the related species, the pathogenic diploid C. albicans, upon infection (10), after recovery of patients treated with antifungals (6, 9), or during commensalism (7). Like C. albicans isolates, a number of D. hansenii isolates are diploid and display LOH, as previously observed using multilocus sequence typing (MLST) (18) and confirmed here by NUMT insertion analysis. It has been proposed that LOH was due to gene conversion, allelic recombination/break-induced replication, or chromosome loss and reduplication (6, 11). In D. hansenii diploids, the use of the first mechanism, gene conversion, is either unlikely or very rare because of its requirement for high sequence similarity. Therefore, allelic recombination by break-induced replication may be favored by this yeast. D. hansenii diploids also display considerable PFGE variability compared to that of haploid isolates, and karyotype changes have also been associated with LOH in C. albicans (11). We speculate that the D. hansenii PFGE variability is linked to LOH, in agreement with the proposal of Diogo et al. (7) on chromosome rearrangements.
Yeast hybrids are increasingly found in fermentation. Originally limited to the Saccharomyces (for a review, see references 29 and 38), they are now found in other genera, too (12, 19, 39). Yeast diploids and hybrids display robust characteristics, such as tolerance to environmental stress or good fermentation capacities in wine or beer making. The D. hansenii diploids analyzed here were all isolated from cheese, suggesting that this may apply to cheese making. In contrast to the Saccharomyces species involved in alcoholic fermentation, the nature of the genetic variability in D. hansenii is different, since hybridization is followed by LOH, which can amplify this genetic diversity.
We are grateful to Marie-Christine Wagner and Hinde Benjelloun for their help with the flow cytometry experiments at the flow cytometry platform of the Institut Pasteur (Paris).
This work was supported in part by the Agence Nationale pour la Recherche grant no. ANR-05-BLAN-0331, the Agence Nationale pour la Recherche grant no. ANR-08-ALIA-7, and the GDR CNRS 2354 Genolevures. The work was supported by the INRA, CNRS, Institut Pasteur, and Université P. & M. Curie. B.D. is a member of the Institut Universitaire de France.
†Supplemental material for this article may be found at http://ec.asm.org/cgi/content/full/9/3/449/DC1.
Published ahead of print on 4 January 2010.