The pathogenic yeast Candida dubliniensis is phylogenetically very closely related to Candida albicans, and both species share many phenotypic and genetic characteristics. DNA fingerprinting using the species-specific probe Cd25 and sequence analysis of the internal transcribed spacer (ITS) region of the ribosomal gene cluster previously showed that C. dubliniensis is comprised of three major clades comprising four distinct ITS genotypes. Multilocus sequence typing (MLST) has been shown to be very useful for investigating the epidemiology and population biology of C. albicans and has identified many distinct major and minor clades. In the present study, we used MLST to investigate the population structure of C. dubliniensis for the first time. Combinations of 10 loci previously tested for MLST analysis of C. albicans were assessed for their discriminatory ability with 50 epidemiologically unrelated C. dubliniensis isolates from diverse geographic locations, including representative isolates from the previously identified three Cd25-defined major clades and the four ITS genotypes. Dendrograms created by using the unweighted pair group method with arithmetic averages that were generated using the data from all 10 loci revealed a population structure which supports that previously suggested by DNA fingerprinting and ITS genotyping. The MLST data revealed significantly less divergence within the C. dubliniensis population examined than within the C. albicans population. These findings show that MLST can be used as an informative alternative strategy for investigating the population structure of C. dubliniensis. On the basis of the highest number of genotypes per variable base, we recommend the following eight loci for MLST analysis of C. dubliniensis: CdAAT1b, CdACC1, CdADP1, CdMPIb, CdRPN2, CdSYA1, exCdVPS13, and exCdZWF1b, where “Cd” indicates C. dubliniensis and “ex” indicates extended sequence.
Candida dubliniensis is a pathogenic yeast species that is phenotypically, genetically, and phylogenetically very closely related to Candida albicans, the yeast species most commonly associated with infection in humans (49, 51). Despite their close relationship, C. albicans is significantly more pathogenic (49, 50). C. dubliniensis is most commonly associated with oral carriage and infection in human immunodeficiency virus (HIV)-infected and diabetic patients, although it has been identified as a minor constituent of the commensal floras in the oral cavities of healthy individuals (40, 49, 50). Although C. dubliniensis has also been recovered from patients with systemic infections, its incidence is far lower than that of C. albicans. While the latter is responsible for 40 to 60% of cases of candidemia, C. dubliniensis has been identified in only 1 to 2% of blood culture yeast samples (11, 15, 27, 29-31). These epidemiological data are reflected in the results of animal infection model studies that demonstrated that C. albicans is significantly more pathogenic than C. dubliniensis (22, 47, 56). The reasons for the lower virulence of C. dubliniensis relative to that of C. albicans have not been investigated in detail; however, recently published data showed that C. dubliniensis has a reduced capacity to produce hyphae that results in lower levels of colonization and tissue invasion (47).
In order to be able to perform meaningful and informative epidemiological studies of Candida isolates, it is essential to be able to discriminate between unrelated strains of the species of interest. Ideally, strain differentiation methods should be highly discriminatory, reproducible, and suitable for the analysis of large numbers of isolates. To date, DNA fingerprinting using the species-specific, semirepetitive-sequence-containing DNA probe Cd25 has been the most widely applied and informative tool used for C. dubliniensis epidemiology and population analysis. When first developed, data generated using this probe showed that C. dubliniensis is comprised of two distinctive major clades, termed Cd25 fingerprint groups I and II (21, 26). Cd25 group I isolates are all closely related, with an average similarity coefficient (SAB) value of approximately 0.8 (range, 0.8 to 0.86), where A and B represent two different strains. These isolates comprise the majority of isolates investigated to date recovered in many countries around the world, mainly from HIV-infected individuals (4, 21, 26). Furthermore, sequence analysis of the internally transcribed spacer (ITS) region of the rRNA gene cluster revealed that Cd25 group I isolates consist of a single ITS genotype: ITS genotype 1 (21). In contrast, Cd25 group II isolates are more diverse, with an average SAB value of 0.52 (range, 0.07 to 0.67) (4), and consist of three separate ITS genotypes (ITS genotypes 2 to 4), which correspond to distinct subclades within the Cd25 group II fingerprinting clade (21). More recently, a third major clade, termed Cd25 group III, was identified among isolates from Egypt and Saudi Arabia and displayed an average SAB value of 0.35 (range, 0.16 to 0.54) (4). The DNA fingerprints of Cd25 group III isolates are very distinctive relative to those of isolates from Cd25 groups I and II, and ITS sequence analysis revealed that they belong to ITS genotypes 3 and 4 (4). 
All Cd25 group III isolates examined to date exhibit resistance to 5-flucytosine (4).
DNA fingerprinting analysis of large numbers of C. albicans isolates using the species-specific, repetitive-sequence-containing Ca3 probe has demonstrated that the population structure of the species is complex, with five major clades, some of which appear to be associated with specific geographic locations (6, 42, 46). In comparison, the population structure of C. dubliniensis determined with the Cd25 probe is significantly less complex (4, 21, 26). Although DNA fingerprinting has been shown to be a very useful tool in the molecular epidemiological analysis of C. albicans and C. dubliniensis populations, it is time-consuming, expensive, and not conducive to interlaboratory comparisons. Many other molecular strain-typing techniques have been applied to the analysis of Candida species (e.g., karyotyping and randomly amplified polymorphic DNA fingerprinting). However, all of these methodologies also suffer from drawbacks, particularly in relation to reproducibility. In the late 1990s, multilocus sequence typing (MLST), a technique based on the nucleotide sequence analysis of a set of housekeeping genes, was developed for the population analysis of several bacterial species (28). This technique has also been applied to the analysis of the diploid yeast C. albicans (9, 10, 54) and to other Candida species (17, 25, 53). In 2002, Bougnoux et al. identified six housekeeping gene loci that allowed accurate and reproducible discrimination between unrelated C. albicans isolates (9). A study by Tavanti et al. used four of these loci and a further four loci and also showed high levels of discrimination (54). These two groups of researchers subsequently revised the combination of loci used in the MLST analysis of C. albicans, aiming to identify the minimum number of loci required to maintain the high discriminatory standard of the scheme (10). 
An agreed consensus scheme between the different laboratories examined seven loci, AAT1a, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b, and allowed the application of MLST to the analysis of C. albicans epidemiology and population structure (7-9, 16, 36, 52). MLST analysis has since been demonstrated to be as sensitive as DNA fingerprinting (43), and due to the nature of the data (i.e., DNA sequences of specific loci), these can be used to create a large database generated by multiple laboratories (7). In the case of C. albicans, it has also been shown that the strain groupings identified by MLST correlate with clades of C. albicans organisms identified using the species-specific DNA fingerprinting probe Ca3 (52). When applied to specific epidemiological studies, the use of MLST has identified intrafamilial transmission of C. albicans from the human digestive tract, strain maintenance, replacement, and microevolution (8).
The purpose of the present study was to determine the usefulness of MLST for investigating the epidemiology and population structure of C. dubliniensis. In addition, we hypothesized that if the same set of MLST loci currently used for C. albicans could be applied to the analysis of C. dubliniensis, then comparative sequence data could be used to provide valuable information concerning the evolutionary relatedness of the two species.
The 50 epidemiologically unrelated C. dubliniensis isolates used in this study are shown in Table 1. Isolates were selected from diverse geographical locations and included representatives of the three Cd25 major fingerprinting clades and all four ITS genotypes. The majority of the isolates studied have been described previously; however, a number of new clinical isolates were also included (Table 1). These new isolates were initially presumptively identified on the basis of dark-green colony coloration on Candida-selective chromogenic agar (Oxoid Ltd., Hampshire, United Kingdom) and on the basis of hyphal fringe production on Pal's agar after 48 h of growth at 30°C as described by Al Mosaid et al. (3). The identities of presumptive C. dubliniensis isolates were further confirmed using the API ID 32C yeast identification system (bioMérieux, Paris, France) (37). Definitive identification of C. dubliniensis was confirmed by PCR using primers specific for the CdACT1-associated intron as described previously (18).
Isolates were routinely cultured on potato dextrose agar (Oxoid) medium, pH 5.6, at 37°C. Liquid cultures were grown overnight in yeast extract-peptone-dextrose broth at 37°C in an orbital incubator (Gallenkamp, Leicester, United Kingdom) at 200 rpm.
A selection of 50 C. albicans isolates (Table 2) was chosen to represent a range of the 17 MLST clades recently described by Odds et al. (35). Sequence data for each of the 50 C. albicans isolates were available at http://test1.mlst.net/ for the seven collaborative consensus MLST loci, AAT1a, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b (35), and at http://calbicans.mlst.net/ for the RPN2 locus (9). All locus sequences were treated in a manner identical to that used for the C. dubliniensis sequence data, as described below.
Analytical-grade or molecular biology-grade chemicals were purchased from Sigma-Aldrich Ireland Ltd. (Tallaght, Dublin, Ireland) or Fisher Scientific Ltd. (Loughborough, United Kingdom). Enzymes were purchased from the Promega Corporation (Madison, WI) or New England Biolabs Inc. (Beverly MA) and used according to the manufacturers' instructions. Custom-synthesized oligonucleotides were purchased from Sigma Genosys Biotechnologies Europe Ltd. (Pampisford, Cambridgeshire, United Kingdom).
Isolates were grown overnight in 5 ml of yeast extract-peptone-dextrose broth, and cells from 1.5 ml of culture were harvested by centrifugation. DNA was extracted from the cells as described by Gallagher et al. (20).
Template DNA was tested in separate PCR amplification experiments with each of the primer pairs G1F/G1R, G2F/G2R, G3F/G3R, and G4F/G4R to identify the ITS genotype of the isolate (21). Genotypes are ascribed based on the nucleotide sequences of the ITS1 and ITS2 regions and of the intervening 5.8S rRNA gene (57). Template DNAs from the four reference C. dubliniensis isolates (CD36, genotype 1; Can4, genotype 2; CD519, genotype 3, and p7718, genotype 4) previously described by Gee et al. (21) were used in control amplification reactions. Each PCR was carried out with one pair of ITS genotype-specific primers and the universal fungal primers RNAF/RNAR (19), which amplify approximately 610 bp from all fungal large-subunit rRNA genes and were used as an internal positive control. Genotyping experiments were performed on a minimum of two occasions, with each isolate being tested with separately prepared C. dubliniensis template DNA.
Due to the high level of sequence homology between the majority of C. albicans and C. dubliniensis open reading frames (33), all loci previously examined for the purpose of MLST analysis in C. albicans were also investigated for their potential use with C. dubliniensis. The usefulness of the six C. albicans genes examined by Bougnoux et al. (9) (i.e., ACC1, VPS13, GLN4, ADP1, RPN2, and SYA1) and the additional four loci described by Tavanti et al. (54) (i.e., AAT1a, AAT1b, MPIb, and ZWF1b) was assessed. The C. albicans MLST locus sequences were used in separate BLAST searches against the C. dubliniensis genome sequence database (the Wellcome Trust Sanger Institute C. dubliniensis genome sequence project, http://www.sanger.ac.uk/sequencing/Candida/dubliniensis/). For each MLST locus, sequences were aligned using the CLUSTAL W sequence alignment computer program (55), and the C. albicans primer binding regions were identified in the alignment. Any nucleotide differences in the primer binding regions between the species were adjusted in the corresponding C. dubliniensis oligonucleotides to facilitate optimum amplification. All of the C. dubliniensis-optimized primer pairs yielded a single PCR product of the expected size (ranging in size from 400 bp to 700 bp) (Table 3).
PCR assays were carried out in 50-μl reaction volumes containing a 200 μM concentration of each deoxynucleoside triphosphate, 1.25 U of GoTaq polymerase (Promega), 10 μl (1×) of GoTaq FlexiBuffer (Promega), 3 μM MgCl2, 100 pmol of each primer, and 1 ng of the DNA template. PCR products were purified using a QIAquick 96-well PCR purification kit (Qiagen Science, MD) and were sequenced on both strands using the same primers that had been used previously for amplification. DNA sequencing reactions were performed commercially by Cogenics (Essex, United Kingdom) using an ABI 3730xl DNA analyzer.
Sequence analysis was performed by examination of chromatogram files using the ABI prism Seqscape software, version 2.0 (Applied Biosystems, Foster City, CA). The sequences of all of the loci examined are provided in the supplemental material. Numbers were assigned to unique genotypes for each locus, and genotype numbers were then combined to yield a diploid sequence type (DST) number. All genotype numbers and DST numbers are also available in the supplemental material. Maximum-parsimony trees and dendrograms based on analysis by the unweighted pair group method with arithmetic averages (UPGMA) were constructed using the Bionumerics software package, version 4.6 (Applied Maths NV, Sint-Martens-Latem, Belgium), based on concatenated C. albicans and C. dubliniensis MLST sequences. The discriminatory power of each MLST scheme was determined using Hunter's formula (24).
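The numbering step described above (unique genotype numbers per locus, combined into a DST) can be illustrated with a minimal sketch. The function names, the toy isolate names, the locus labels, and the short sequences are all hypothetical and are not taken from the supplemental material; numbering by order of first appearance is an assumption, and heterozygous sites are assumed to be encoded as IUPAC ambiguity codes so that each distinct diploid sequence string is one genotype.

```python
def assign_numbers(items):
    """Number distinct items 1, 2, 3, ... in order of first appearance."""
    seen = {}
    return [seen.setdefault(item, len(seen) + 1) for item in items]


def diploid_sequence_types(isolates):
    """Map each isolate to (per-locus genotype profile, DST number).

    isolates: dict of isolate name -> dict of locus name -> sequence string.
    """
    names = list(isolates)
    loci = sorted(next(iter(isolates.values())))
    # One genotype number per distinct sequence at each locus.
    per_locus = {
        locus: assign_numbers([isolates[name][locus] for name in names])
        for locus in loci
    }
    # The profile is the tuple of genotype numbers across all loci.
    profiles = [tuple(per_locus[locus][i] for locus in loci)
                for i in range(len(names))]
    # Each distinct profile receives its own DST number.
    dst_numbers = assign_numbers(profiles)
    return {name: (profiles[i], dst_numbers[i]) for i, name in enumerate(names)}


# Hypothetical toy data: "R" marks a heterozygous A/G site.
toy = {
    "isoA": {"locus1": "ACGT", "locus2": "GGCC"},
    "isoB": {"locus1": "ACGT", "locus2": "GGCC"},
    "isoC": {"locus1": "ACRT", "locus2": "GGCC"},
}
result = diploid_sequence_types(toy)
```

In this toy set, isoA and isoB share a profile and therefore a DST, while the single heterozygous site in isoC gives it a new genotype at locus1 and a new DST.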
Linkage disequilibrium was assessed using the index of association, as described by Smith et al. (44) and as calculated with the Multilocus 1.3 software package, available at http://www.agapow.net/software/multilocus/ (1), using genotype numbers for all loci (scheme C) from all 50 C. dubliniensis isolates. The levels of significance for nonrandom association between loci were computed under the null hypothesis of a freely recombining population (panmixia).
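As an illustration of this kind of test, the sketch below computes an index of association in its pairwise-distance form (observed variance of the number of mismatched loci between isolate pairs, divided by the sum of the per-locus variances, minus 1) and estimates significance by shuffling genotype numbers within each locus. It is not the Multilocus 1.3 implementation; the function names, the toy profiles, and the number of randomizations are assumptions made for the example.

```python
import random
from itertools import combinations


def _variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def index_of_association(profiles):
    """I_A for a list of genotype-number tuples (one tuple per isolate)."""
    n_loci = len(profiles[0])
    # For every pair of isolates, record 1 per locus where they differ.
    mismatches = [[int(a[j] != b[j]) for j in range(n_loci)]
                  for a, b in combinations(profiles, 2)]
    observed = _variance([sum(row) for row in mismatches])
    expected = sum(_variance([row[j] for row in mismatches])
                   for j in range(n_loci))
    return observed / expected - 1.0


def permutation_p_value(profiles, n_rand=1000, seed=0):
    """Shuffle each locus independently to simulate free recombination."""
    rng = random.Random(seed)
    observed = index_of_association(profiles)
    columns = [list(col) for col in zip(*profiles)]
    hits = 0
    for _ in range(n_rand):
        shuffled = [rng.sample(col, len(col)) for col in columns]
        if index_of_association(list(zip(*shuffled))) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_rand + 1)


# Hypothetical, fully clonal toy sample: two profiles, five isolates each.
toy_profiles = [(1, 1, 1)] * 5 + [(2, 2, 2)] * 5
ia, p_value = permutation_p_value(toy_profiles)
```

Shuffling within a locus preserves its genotype frequencies, so the expected variance is unchanged across randomizations and only the association between loci is destroyed. The sketch assumes at least one polymorphic locus; otherwise the expected variance is zero.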
The stability and reproducibility of the sequence data at each MLST locus were assessed by carrying out the sequence analysis in duplicate on two randomly selected isolates. For each locus and isolate, the duplicate DNA extractions, PCRs, and sequencing reactions were carried out independently. Resulting sequence duplicates for each isolate were compared to each other.
All loci previously examined for the purpose of MLST analysis in C. albicans were also investigated for their potential use with C. dubliniensis. Homologous genes were found in each case, and comparison of the sequences of the complete open reading frames of the orthologous pairs revealed homologies ranging from 89% to 94%, while the parts of the genes that were analyzed for MLST purposes displayed 88% to 100% sequence identity (Table 3).
The current consensus MLST scheme for C. albicans examines 2,883 nucleotides from seven loci: AAT1a, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b. For the purposes of this study, this scheme is referred to as scheme A (Table 4). The corresponding scheme A loci were examined in the 50 C. dubliniensis isolates included in this study. The scheme A loci resulted in 23 single nucleotide polymorphisms (SNPs) from the equivalent 2,883 nucleotides (0.8%) and identified 33 genotypes. In scheme A, only one site (i.e., position 186 in SYA1) showed a polymorphism at the same position in both C. albicans and C. dubliniensis, whereas all other variable sites were at different locations in the two species. Tavanti et al. analyzed eight loci (54) (AAT1a, AAT1b, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b) for sites of heterozygosity among 50 isolates of C. albicans. The group identified 87 sites that displayed polymorphisms, 71 of which (81.6%) also displayed heterozygosity. The same loci were studied for sites of heterozygosity in the 50 isolates of C. dubliniensis (Table 1), and of the 30 sites displaying variability, 9 (30%) displayed heterozygosity.
Due to the low levels of polymorphism observed in the C. dubliniensis loci examined, we investigated whether increasing the length of the sequences examined had the potential to increase the discriminatory power of the method. To achieve this, additional nucleotide sequence data (range, 100 bp to 280 bp) at each of the loci in both the 5′ and 3′ directions of the original sequence fragment were also analyzed for the potential presence of polymorphic sites. Sequence analysis of the extended fragments in CdVPS13 and CdZWF1b (exCdVPS13 and exCdZWF1b, respectively) revealed the presence of additional SNPs (Table 3). The inclusion of these two extended sequences together with the sequences of the other five loci generated a second MLST scheme, termed scheme B (Table 4), which was made up of the sequences of the CdAAT1a, CdACC1, CdADP1, CdMPIb, CdSYA1, exCdVPS13, and exCdZWF1b loci. Examination of exCdVPS13 revealed one extra SNP, which gave rise to two extra genotypes (Table 5), and exCdZWF1b yielded two further SNPs, which in turn gave rise to one extra genotype. Scheme B resulted in a new total of 3,285 nucleotides, 26 (0.8%) of which displayed SNPs, resulting in four additional genotypes (Table 5).
In a further attempt to improve the discriminatory power of this method for C. dubliniensis, three other sequence segments in C. dubliniensis that were previously investigated for possible use in the MLST analysis of C. albicans were also analyzed, although these were not included in the final consensus C. albicans MLST scheme. Sequences from these additional loci, CdAAT1b, CdGLN4, and CdRPN2 (Table 3), comprised an additional 1,051 nucleotides, 11 of which displayed SNPs (1.05%). The 10 sequence fragments (the scheme B loci and the three additional loci) were all analyzed together in a third scheme, termed scheme C (Table 4), which was based on the sequences of the CdAAT1a, CdAAT1b, CdACC1, CdADP1, CdGLN4, CdMPIb, CdRPN2, CdSYA1, exCdVPS13, and exCdZWF1b loci. CdGLN4 displayed four of the further 11 SNPs; however, it gave rise to only two genotypes among the 50 isolates investigated. Similarly, CdAAT1b displayed four SNPs, giving rise to five genotypes, and the CdRPN2 locus contained three polymorphic sites, which gave rise to three genotypes. Scheme C analyzed a total of 4,336 nucleotides, 37 of which (0.85%) displayed SNPs, resulting in a total of 48 genotypes from 50 C. dubliniensis isolates. The minimum number of MLST loci for maximum discrimination among isolates of C. dubliniensis was determined according to the highest number of genotypes per variable base in each locus (Tables 4 and 5) and is referred to as scheme D, consisting of the eight loci CdAAT1b, CdACC1, CdADP1, CdMPIb, CdRPN2, CdSYA1, exCdVPS13, and exCdZWF1b. Scheme D displayed a total of 32 SNPs, resulting in 40 genotypes from the 50 isolates of C. dubliniensis investigated. Polymorphic sites and resulting genotypes are summarized in Tables 4 and 5.
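The SNP percentages and the genotypes-per-variable-base criterion can be rechecked with simple arithmetic. The sketch below uses only counts quoted in the text for schemes A to C and for the three additional loci; the variable and function names are illustrative, and the per-locus values for the remaining seven loci (given in Table 5) are not reproduced here.

```python
# (SNP count, nucleotides examined) as quoted in the text.
scheme_counts = {
    "A": (23, 2883),
    "B": (26, 3285),
    "additional loci": (11, 1051),
    "C": (37, 4336),
}


def snp_percent(snps, nucleotides):
    """Percentage of examined nucleotide positions that are polymorphic."""
    return 100.0 * snps / nucleotides


# (genotypes, variable bases) for the three additional loci, from the text.
additional_loci = {
    "CdAAT1b": (5, 4),
    "CdGLN4": (2, 4),
    "CdRPN2": (3, 3),
}


def genotypes_per_variable_base(genotypes, variable_bases):
    return genotypes / variable_bases


percentages = {name: round(snp_percent(*counts), 2)
               for name, counts in scheme_counts.items()}
ranking = sorted(additional_loci,
                 key=lambda locus: genotypes_per_variable_base(*additional_loci[locus]),
                 reverse=True)
```

On these figures CdGLN4 yields the fewest genotypes per variable base of the three additional loci, which is consistent with its absence from scheme D.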
The effect of nucleotide polymorphism on the resulting amino acid sequence was investigated by mapping the triplet codons for each gene fragment and examining the effect that the SNP had on each codon in question. Nucleotide polymorphisms and amino acid substitutions are summarized in Table 6. Of the 37 SNPs identified in the 10 C. dubliniensis loci examined (with substitution of CdVPS13 for exCdVPS13 and CdZWF1b for exCdZWF1b), 13 (35%) resulted in nonsynonymous amino acid changes, while all of the remaining nucleotide changes resulted in synonymous polymorphisms. Seven of the 13 nonsynonymous polymorphisms affected the resulting amino acid substantially, i.e., affecting the pH (4 of 7) or polarity (3 of 7) of the amino acid.
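The codon-mapping step can be sketched as follows. The example assumes the alternative yeast nuclear genetic code, in which CTG is read as serine rather than leucine, as is the case for C. albicans and C. dubliniensis. The function name, the `frame` parameter, and the nine-base fragment are hypothetical and are not taken from Table 6.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
# Alternative yeast nuclear code: CTG is translated as serine.
CODON_TABLE["CTG"] = "S"


def classify_snp(fragment, position, alt_base, frame=0):
    """Return (reference aa, alternative aa, is_synonymous) for one SNP.

    fragment: coding-strand sequence; position: 0-based index of the SNP;
    frame: number of leading bases before the first complete codon.
    """
    codon_start = position - ((position - frame) % 3)
    ref_codon = fragment[codon_start:codon_start + 3]
    offset = position - codon_start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    return ref_aa, alt_aa, ref_aa == alt_aa


# Hypothetical fragment: ATG GCT CTG encodes Met-Ala-Ser under this code.
toy_fragment = "ATGGCTCTG"
third_position_change = classify_snp(toy_fragment, 5, "C")   # GCT -> GCC
first_position_change = classify_snp(toy_fragment, 3, "A")   # GCT -> ACT
```

The third-position change leaves alanine unchanged (synonymous), whereas the first-position change replaces alanine with threonine (nonsynonymous). A SNP falling in an incomplete codon at either end of a fragment is not handled by this sketch.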
Sequence analysis was performed in duplicate for each of two randomly selected isolates (Table 1) per MLST locus, using independently prepared DNA, PCR, and sequencing reactions. For each MLST locus, duplicate sequences for each isolate showed 100% sequence identity, displaying full conservation of both SNPs and sites of heterozygosity at each locus.
Examination of the C. albicans MLST locus set (scheme A) in the 50 C. dubliniensis isolates investigated identified 20 unique DSTs based on the unique combinations of the genotype numbers for the seven loci examined. Application of Hunter's formula (24) to this data set indicates that MLST using scheme A has a discriminatory power of 0.899 when applied to C. dubliniensis, compared with a value of 0.996 when applied to C. albicans (35). Extension of the CdVPS13 and CdZWF1b loci (scheme B) resulted in a further 2 DSTs, and adding the CdAAT1b, CdGLN4, and CdRPN2 loci (scheme C in Table 4) resulted in a further 4 DSTs, giving a total of 26 DSTs from the 10 loci. Extending the CdZWF1b and CdVPS13 gene fragments and including the three other loci in scheme C increased the discriminatory power to 0.9102 (Table 4). DST 4 was the most common DST identified using the scheme A set of loci in C. dubliniensis, corresponding to 14 of the 50 isolates examined. The set of seven loci identified 11 DSTs that were unique to single C. dubliniensis isolates. When the larger set of loci was used in scheme C, the same 14 isolates from DST 4 in scheme A also shared an identical sequence type, this time termed DST 7, which remained the most common DST identified for C. dubliniensis. Using this larger set of loci, 19 of the 26 DSTs were unique to single isolates.
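For readers unfamiliar with Hunter's formula, the discriminatory power is D = 1 − Σ n_j(n_j − 1) / [N(N − 1)], where N is the number of isolates and n_j is the number of isolates assigned to the j-th type. The sketch below is an illustration only: the cluster sizes are hypothetical, chosen merely to match the counts reported for scheme A (50 isolates, 20 DSTs, a largest DST of 14 isolates, and 11 single-isolate DSTs), and they are not the distribution observed in this study.

```python
from collections import Counter


def hunter_index(type_labels):
    """Hunter's index of discrimination for a list of per-isolate type labels."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels).values()
    # Probability that two isolates drawn without replacement share a type.
    same_type = sum(c * (c - 1) for c in counts) / (n * (n - 1))
    return 1.0 - same_type


# Hypothetical cluster sizes (assumption, not study data).
cluster_sizes = [14, 5, 5, 4, 3, 2, 2, 2, 2] + [1] * 11
labels = [dst for dst, size in enumerate(cluster_sizes, start=1)
          for _ in range(size)]
d_value = hunter_index(labels)
```

With these assumed sizes the index is close to the reported scheme A value, which shows how strongly a single large DST of 14 isolates depresses the discriminatory power.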
A UPGMA dendrogram was constructed based on the sequence data from all 10 loci (scheme C) examined in C. dubliniensis by using the Bionumerics version 4.6 software program (Fig. 1). At a cutoff node of 99.7% sequence homology, the dendrogram revealed the presence of three major clades of isolates termed C1 to C3, which showed a significant degree of correlation with the major clades previously identified by fingerprinting using the Cd25 fingerprint probe (Fig. 1 and Table 1). Clades C1 and C2 corresponded to the previously identified Cd25 fingerprint groups I and II, respectively, whereas clade C3 included strains previously identified as belonging to Cd25 fingerprint groups II and III (Fig. 1 and Table 1). Furthermore, clade C1 consisted solely of ITS genotype 1 isolates, clade C2 consisted solely of ITS genotype 2 isolates, whereas clade C3 consisted of ITS genotype 3 and 4 isolates.
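The clustering idea can be sketched in a few lines. This is not the Bionumerics procedure: it is a minimal average-linkage (UPGMA) agglomeration on the proportion of differing sites between concatenated sequences, stopped once the closest clusters are more than 0.3% apart (the complement of the 99.7% cutoff). The function names and toy sequences are hypothetical, and heterozygous IUPAC codes are simply compared as characters, which is an assumption.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def upgma_clusters(sequences, cutoff):
    """Merge clusters by average linkage until none are within the cutoff."""
    names = list(sequences)
    dist = {frozenset(pair): p_distance(sequences[pair[0]], sequences[pair[1]])
            for pair in combinations(names, 2)}
    clusters = [(name,) for name in names]

    def average_distance(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        (i, j), closest = min(
            (((i, j), average_distance(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1])
        if closest > cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(tuple(sorted(c)) for c in clusters)


# Hypothetical 1,000-base toy sequences.
toy_sequences = {
    "iso1": "A" * 1000,
    "iso2": "C" + "A" * 999,
    "iso3": "A" * 990 + "G" * 10,
    "iso4": "A" * 989 + "T" + "G" * 10,
}
toy_clades = upgma_clusters(toy_sequences, cutoff=0.003)
```

In the toy data, iso1 and iso2 differ at 0.1% of sites, as do iso3 and iso4, while the two pairs are about 1.1% apart on average, so a 0.3% cutoff yields two clusters.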
Genetic diversity and linkage disequilibria were assessed by using statistics implemented in the Multilocus 1.3 software (1) and genotypes obtained for all loci investigated in this study (scheme C). Each of these statistics tested the null hypothesis of a freely recombining population. Highly significant linkage disequilibria (P < 10⁻⁵ with 100,000 randomizations) were found for both the total collection of 50 C. dubliniensis isolates and a reduced collection of 26 isolates that represented each of the 26 DSTs defined by scheme C (i.e., that did not contain repeated genotypes) (data not shown). These results provide evidence that the sample of C. dubliniensis isolates analyzed in this study represents a clonal population.
C. albicans MLST sequences were available for the seven consensus MLST loci (i.e., AAT1a, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b) and an extra locus, RPN2. The 3,189 nucleotides from these eight loci formed a fifth scheme, termed scheme E (Table 4). All locus sequences were concatenated and treated as one sequence for each of 50 C. albicans isolates and the corresponding sequences in 50 C. dubliniensis isolates in order to allow comparison between the two species using MLST. These concatenated sequences were used to construct a maximum-parsimony tree in order to assess comparative phylogenies between the species (Fig. 2). The maximum-parsimony tree displays the comparative divergence within the two species, as well as the level of relatedness between the two species. The maximum-parsimony tree demonstrated that C. albicans isolates belonging to different MLST clades can differ at as many as 90 nucleotide sites and that isolates belonging to the same MLST clade can differ at as many as 31 nucleotide sites (data not shown). In contrast, our data showed that C. dubliniensis consisted of three closely related clades, termed clades C1, C2, and C3 (Fig. 2), that show complete agreement with the clades described by the UPGMA dendrogram (Fig. 1). C. dubliniensis isolates from the three MLST clades differ from each other at a minimum of 10 nucleotide sites (range, 10 to 23 nucleotides). Isolates belonging to clade C1 differ at a maximum of 10 nucleotide sites, isolates belonging to clade C2 differ at a maximum of 6 nucleotide sites, and isolates belonging to clade C3 differ at a maximum of 8 nucleotide sites (data not shown).
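Within-clade maxima and between-clade ranges of the kind quoted above can be derived from concatenated sequences by counting differing sites for every pair of isolates. The sketch below shows one way to do this; the function names, toy sequences, and clade labels are hypothetical, and heterozygous sites are compared as plain characters, which is an assumption about how such sites would be scored.

```python
from itertools import combinations


def site_differences(seq_a, seq_b):
    """Number of aligned positions at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(seq_a, seq_b))


def clade_difference_ranges(sequences, clade_of):
    """Return (max differences within each clade, (min, max) between clades)."""
    within, between = {}, {}
    for a, b in combinations(sorted(sequences), 2):
        d = site_differences(sequences[a], sequences[b])
        clade_a, clade_b = clade_of[a], clade_of[b]
        if clade_a == clade_b:
            within[clade_a] = max(within.get(clade_a, 0), d)
        else:
            key = tuple(sorted((clade_a, clade_b)))
            low, high = between.get(key, (d, d))
            between[key] = (min(low, d), max(high, d))
    return within, between


# Hypothetical toy data.
toy_seqs = {
    "a1": "AAAAAAAAAA",
    "a2": "AAAAAAAAAC",
    "b1": "GGGAAAAAAA",
    "b2": "GGGTAAAAAA",
}
toy_clade_of = {"a1": "C1", "a2": "C1", "b1": "C2", "b2": "C2"}
within_max, between_range = clade_difference_ranges(toy_seqs, toy_clade_of)
```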
MLST has previously been shown to be a useful tool in the analysis of the epidemiology and population structure of C. albicans (8-10, 35, 52, 54). The purpose of the present study was to determine the usefulness of MLST in the analysis of C. dubliniensis. In addition, we investigated whether the same set of MLST loci currently used for C. albicans could be applied to the analysis of C. dubliniensis due to the high levels of nucleotide sequence homology (~90%) shared by the two species, thus allowing comparative sequence data to be used to provide valuable information concerning the evolutionary relatedness of the two species.
The current MLST scheme in use for isolates of C. albicans examines seven loci: AAT1a, ACC1, ADP1, MPIb, SYA1, VPS13, and ZWF1b (scheme A in Table 4). SNPs have been detected in 172 of these 2,883 nucleotides (6%) in C. albicans (5). To date, 1,391 isolates of C. albicans have been examined, identifying 1,005 DSTs (35). When applied to C. dubliniensis, the identical scheme demonstrated poor levels of discrimination, identifying only 20 DSTs from 50 isolates. Discriminatory power was improved by extension of the CdVPS13 and CdZWF1b loci and by incorporation of an additional three loci, AAT1b, GLN4, and RPN2, which had also previously been examined for use in the MLST analysis of C. albicans. The scheme (i.e., scheme C) which examined the largest number of loci (Table 4) identified 26 DSTs from the 50 C. dubliniensis isolates. The 10 loci included in scheme C were concatenated and used in the construction of a UPGMA dendrogram (Fig. 1), which clustered the isolates into three distinct major clades at a cutoff sequence homology of 99.7%. A maximum-parsimony tree (Fig. 2) was also constructed using concatenated sequence data, this time with a further scheme, scheme E (Table 4), which was based on the seven loci from scheme A and an extra locus, RPN2. Sequence data for these eight loci were available for both C. albicans and C. dubliniensis, thus enabling the comparative study of the population structures for both species. Our study shows that the C. albicans clades are more divergent than those observed in C. dubliniensis (Fig. 2). The maximum-parsimony tree identified three distinct major clades (C1 to C3) in the population of C. dubliniensis, which were identical to those of the UPGMA dendrogram (Fig. 1). 
Overall, 257 nucleotides of the 3,189 nucleotides analyzed (8%) were identified as being different between the two species, correlating with the level of sequence homology typically exhibited between the two species (33).
The population structure of C. dubliniensis as determined by MLST correlated with the population structure previously determined using the complex fingerprinting probe Cd25 and on the basis of ITS genotypes (4, 21, 26). In previous studies, isolates assigned to ITS genotype 1 belonged exclusively to Cd25 group I (4, 21, 26). Similarly, by the MLST method, all of the ITS genotype 1 isolates tested belonged to MLST clade C1 (Fig. 1 and Table 1). Cd25 group II was previously shown to consist of isolates belonging to ITS genotypes 2, 3, and 4 (21). The MLST method breaks Cd25 group II into two groups, assigning all of the ITS genotype 2 isolates to MLST clade C2 exclusively and assigning the ITS genotype 3 and 4 isolates to MLST clade C3 (Fig. 1 and Table 1). Interestingly, a number of ITS genotype 3 and 4 isolates in MLST clade C3 displayed the same DST and localized to the same area on both the UPGMA and maximum-parsimony trees (Fig. 1 and 2). These isolates were all previously associated with Cd25 fingerprint group III (Table 1) and displayed high levels of resistance to the antifungal drug 5-flucytosine (4). The agreement between these three methods suggests that MLST may be applied as a reliable alternative method for studying the population structure of C. dubliniensis. MLST is less time-consuming, more reproducible, and more conducive to comparison between laboratories than other methods previously used for this purpose.
The best set of eight loci (CdAAT1b, CdACC1, CdADP1, CdMPIb, CdRPN2, CdSYA1, exCdVPS13, and exCdZWF1b) proposed for maximum discrimination among isolates of C. dubliniensis using the minimum number of MLST loci possible was determined according to the highest number of genotypes per variable base in each locus and is referred to as scheme D (Table 4). The AAT1b locus is no longer included among the more discriminatory seven loci currently recommended for use in C. albicans MLST studies (10), and therefore the corresponding CdAAT1b locus should be replaced with the CdAAT1a locus for the purpose of comparative population analysis between the two species. The eight loci recommended for the purposes of comparative population analysis between C. albicans and C. dubliniensis using MLST are therefore from scheme E: CdAAT1a, CdACC1, CdADP1, CdMPIb, CdRPN2, CdSYA1, CdVPS13, and CdZWF1b (Table 4).
The MLST data suggest a relatively low level of divergence in the population structure of C. dubliniensis relative to that of C. albicans. Interestingly, this is in contrast to the findings of electrophoretic karyotypic analysis showing that isolates of C. dubliniensis have significantly different major karyotypic patterns due to the processes of microevolution and chromosomal rearrangement (21). The low level of sequence variation throughout the population of C. dubliniensis suggests that MLST may not be ideal for local epidemiological studies (e.g., an outbreak in a hospital); in this instance, karyotypic analysis may be more appropriate. Karyotype analysis was able to distinguish the Cd25 group I and II isolates to a large extent but was unable to distinguish among the ITS genotypes. In comparison, the MLST method is able to distinguish readily among certain ITS genotypes, most notably the ITS genotype 1 and 2 isolates. ITS genotype 3 and 4 isolates could not be distinguished reliably by MLST, due to the lack of sequence variation among these isolates, the majority of which to date have been recovered from the Middle East. However, results may possibly be improved with the inclusion of a larger number of ITS genotype 3 and 4 isolates from geographical locations outside of the Middle East. A possible reason for the low level of discrimination is the relatively small collection of isolates studied. However, isolates were recovered from a broad range of geographical locations and included isolates recovered from both carriage and systemic infection. Another possible reason for the low level of sequence variation and heterozygosity is the lack of divergence within the population of C. dubliniensis strains. Results of genotypic diversity and linkage disequilibrium analyses suggested that the sample of 50 C. dubliniensis isolates investigated in this study represents a clonal population. 
However, it is important to emphasize that the sample number was relatively small, even though many of the isolates were recovered from disparate geographic locations around the world. Furthermore, it is possible that a more diverse population of C. dubliniensis strains exists in nonhuman hosts and that this is not reflected in the present study (see below). Finally, the recent divergence of C. dubliniensis from its ancestor C. albicans suggests a C. dubliniensis population that is less divergent than that of C. albicans, as the latter has had more time to diverge into major and minor clades. C. dubliniensis is a poor pathogen in comparison to C. albicans, since it rarely causes infections in healthy individuals and therefore may be under less pressure to adapt to different host environments. However, it is also possible that the natural host/reservoir for C. dubliniensis is not humans. Candida species have also been recovered from nonhuman hosts, such as dogs, cats, birds, and chameleons (12, 14, 41, 52). Recent environmental studies have reported the recovery of ITS genotype 1 C. dubliniensis isolates from Ixodes uriae ticks on the Great Saltee Island off southeastern Ireland (34), the supposed source being bird excrement. Candida species such as C. albicans, C. guilliermondii, and C. tropicalis have been recovered previously from the gastrointestinal tracts and cloacae of birds (12, 14, 23). It is conceivable that humans represent only a minority host for C. dubliniensis and that the isolates studied to date represent a relatively small subpopulation of the species. In order to investigate this possibility further, it will be necessary to increase the number of isolates analyzed by MLST, including isolates from avian and possibly other nonhuman hosts. We anticipate that such additional studies will identify additional MLST clades. These studies are currently in progress. Further studies will also include investigating the levels of recombination events in C. dubliniensis, as Odds et al. have recently shown that a high frequency of recombination is apparent in C. albicans (35).
This study was supported by the Microbiology Research Unit, Dublin Dental School & Hospital. D.D. is the recipient of a Ph.D. fellowship from the Institut National de la Recherche Agronomique.
Published ahead of print on 5 December 2007.
†Supplemental material for this article may be found at http://jcm.asm.org/.