This communication describes the consensus multi-locus typing scheme established by the Cryptococcal Working Group I (Genotyping of Cryptococcus neoformans and C. gattii) of the International Society for Human and Animal Mycology (ISHAM) using seven unlinked genetic loci for global strain genotyping. These genetic loci include the housekeeping genes CAP59, GPD1, LAC1, PLB1, SOD1, URA5 and the IGS1 region. Allele and sequence type information are accessible at http://www.mlst.net/.
Cryptococcus neoformans, the agent of cryptococcosis, was considered a homogeneous species until 1949, when the existence of four serotypes was revealed based on the antigenic properties of its polysaccharide capsule. Such heterogeneity of the species, however, remained obscure until the two morphologically distinct teleomorphs of C. neoformans were discovered during the mid-1970s [2,3]. The teleomorph Filobasidiella neoformans was found to be produced by strains of serotypes A and D, while F. bacillispora was found to be produced by strains of serotypes B and C. Ensuing studies revealed numerous differences between the anamorphs of the two Filobasidiella species with regard to their ecology, epidemiology, pathobiology, biochemistry and genetics.
Presently, the etiologic agent of cryptococcosis is classified into two species: C. neoformans, with two varieties, C. neoformans var. grubii (serotype A) and C. neoformans var. neoformans (serotype D), as well as an AD hybrid; and C. gattii (serotypes B and C). Intra-species genetic diversity has also been revealed as more genotyping methods have been applied to each serotype. In addition, inter-species hybrid strains of AB and BD serotypes have been described [8,9]. As a result, the number of scientifically valid species within C. neoformans has become a controversial issue, because taxonomists differ on the appropriate definition of a species. Several research groups are focusing on the molecular determination of the number of genetically diverse sub-groups within each serotype. The molecular methods employed by each group to define these sub-groups range from DNA fingerprinting [10,11] and PCR fingerprinting based on minisatellite- (M13) or microsatellite-specific primers (e.g., (GACA)4 or (GTG)5) [12–16], through random amplification of polymorphic DNA (RAPD) analysis [17–20], amplified fragment length polymorphism (AFLP) analysis [21–23], restriction fragment length polymorphism (RFLP) analysis of the URA5 [16,24] and PLB1 genes, the use of IGS sequences and multigene sequence analysis [27, Meyer et al. unpublished data], to multi-locus sequence typing (MLST) [23,28] and multi-locus microsatellite typing (MLMT) [29,30]. This research has revealed associations between geographic origin and particular genotypes, implying an epidemiologic significance of certain genotypes. Different methods have resulted in different numbers of sub-groups or different nomenclature for those sub-groups. However, because the results obtained by the different genotyping methods have not been cross-referenced, there is currently no agreement on a universally acceptable genotyping method for this important human pathogen.
In recognition of the urgent need for a standardized, globally acceptable typing method, Cryptococcal Working Group I, ‘Genotyping of Cryptococcus neoformans and C. gattii’, was formed under the umbrella of the International Society for Human and Animal Mycology (ISHAM) at the beginning of 2007, uniting all the major research groups involved in molecular strain typing of the C. neoformans complex. The members of this ISHAM working group met at the 3rd Trends in Medical Mycology (TIMM3) Meeting in Torino, Italy, in October 2007 and reviewed all the typing techniques in use. The group selected multi-locus sequence typing (MLST) as the method of choice for future strain typing in light of its high discriminatory power and its reproducibility between different laboratories. The working group also chose standard reference strains representing the eight known major molecular types of the agent of cryptococcosis, as well as the nomenclature for each genotype.
As a result of the Torino meeting, the working group recognized that the different genotyping methods used by the different research groups lead to corresponding major genotypes for the agents of cryptococcosis (Table 1). The two main typing systems in use are PCR fingerprinting with primers specific for minisatellite (M13) [14,16] or microsatellite (GACA)4 DNA [13,15], and AFLP analysis. In both typing schemes, over 2000 isolates were grouped into eight major molecular types. With some exceptions [26,31], the molecular types of C. neoformans correlate with the serotypes: C. neoformans var. grubii, serotype A, consists of molecular types VNI=AFLP1 and VNII=AFLP1A; the hybrid serotype AD comprises VNIII=AFLP3; and C. neoformans var. neoformans, serotype D, corresponds to VNIV=AFLP2. C. gattii consists of VGI=AFLP4, VGII=AFLP6, VGIII=AFLP5 and VGIV=AFLP7, all of which correspond to serotype B or C [16,21, unpublished data]. Based on these findings, all cryptococcal working group members present in Torino agreed to use the VNI–VNIV and VGI–VGIV nomenclature, since it correlates with the current concept of two species and represents the global population structure based on more than 2000 C. neoformans and C. gattii isolates, among which C. neoformans var. grubii (serotype A=VNI) is the most prevalent molecular type worldwide.
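For readers who handle typing data from both systems, the correspondences stated in this paragraph can be held in a small lookup table. The following Python sketch is illustrative only: the names `MOLECULAR_TYPES` and `aflp_to_pcr_type` are hypothetical, the table contains nothing beyond the equivalences given in the text, and the exceptions to the serotype correlation [26,31] are not modelled.

```python
# Correspondence between PCR fingerprinting types (VN/VG), AFLP types,
# taxon and serotype, exactly as stated in the text. Names are hypothetical.
MOLECULAR_TYPES = {
    "VNI":   {"aflp": "AFLP1",  "taxon": "C. neoformans var. grubii",     "serotype": "A"},
    "VNII":  {"aflp": "AFLP1A", "taxon": "C. neoformans var. grubii",     "serotype": "A"},
    "VNIII": {"aflp": "AFLP3",  "taxon": "C. neoformans AD hybrid",       "serotype": "AD"},
    "VNIV":  {"aflp": "AFLP2",  "taxon": "C. neoformans var. neoformans", "serotype": "D"},
    "VGI":   {"aflp": "AFLP4",  "taxon": "C. gattii",                     "serotype": "B or C"},
    "VGII":  {"aflp": "AFLP6",  "taxon": "C. gattii",                     "serotype": "B or C"},
    "VGIII": {"aflp": "AFLP5",  "taxon": "C. gattii",                     "serotype": "B or C"},
    "VGIV":  {"aflp": "AFLP7",  "taxon": "C. gattii",                     "serotype": "B or C"},
}

# Reverse index so that an AFLP designation can be translated to VN/VG.
_AFLP_TO_PCR = {entry["aflp"]: name for name, entry in MOLECULAR_TYPES.items()}


def aflp_to_pcr_type(aflp_type: str) -> str:
    """Translate an AFLP genotype label into the agreed VN/VG nomenclature."""
    key = aflp_type.strip().upper()
    if key not in _AFLP_TO_PCR:
        raise KeyError(f"Unknown AFLP type: {aflp_type!r}")
    return _AFLP_TO_PCR[key]
```

A call such as `aflp_to_pcr_type("AFLP6")` returns the VN/VG label for that AFLP type, so results reported under either system can be compared under the agreed nomenclature.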
To enable global standardization, the working group also agreed to use a set of standard strains representing each of the eight major molecular types. This set includes the molecular type strains used in PCR fingerprinting or URA5-RFLP analysis, plus additional strains representing type cultures or strains that are used in major cryptococcal genome projects (Table 2). All standard strains are publicly available from the CBS-Fungal Biodiversity Centre (CBS) (http://www.cbs.knaw.nl), the American Type Culture Collection (ATCC) (http://www.atcc.org) or the Fungal Genetics Stock Center (FGSC) (http://www.fgsc.net). The corresponding collection numbers are listed in Table 2.
To overcome the problems with inter-laboratory reproducibility associated with the two commonly used typing techniques, PCR fingerprinting and AFLP analysis, the working group decided to use multi-locus sequence typing (MLST) as the method of choice for future cryptococcal strain typing. MLST has become the leading typing approach for epidemiological investigations of microorganisms. MLST, originally developed for bacteria, indexes the sequence variation in approximately 400–500 bp of each of five to ten genes, primarily housekeeping genes. This technique has proven to be highly discriminatory for a number of human pathogenic fungi: Candida albicans, Candida glabrata, Candida tropicalis, Coccidioides spp. and Histoplasma capsulatum. Most published MLST schemes are made publicly available as online databases at http://www.mlst.net/ and http://pubmlst.org/, where they serve as tools for the wider scientific community. In the case of the Cryptococcus species complex, two different MLST schemes have been introduced to type isolates of C. neoformans and C. gattii, using twelve and eight unlinked loci, respectively.
In the first study, 12 unlinked polymorphic loci (MPD1, TOP1, MP88, CAP59, URE1, PLB1, CAP10, GPD1, TEF1, SOD1, LAC1 and the IGS1 ribosomal RNA intergenic spacer region), which are dispersed on nine different chromosomes, were used to type 102 globally obtained serotype A strains. MLST differentiated three major groups among the studied isolates, corresponding to VNI, VNII and VNB, a Botswana-specific genotype closely related to VNI. In connection with this study, a central web-based database was created at www.mlst.net (http://cneoformans.mlst.net/), allowing online determination of the alleles and sequence types of C. neoformans serotype A strains.
The second study used eight unlinked polymorphic loci (SXIa or SXIα, IGS1, TEF1, GPD1, LAC1, CAP10, PLB1 and MPD1), of which two are mating type locus specific and cannot be amplified for all strains, to type 202 C. gattii strains. For a more detailed analysis of 9 closely related strains, these loci were supplemented by 22 additional gene loci (HOG1, BWC1, CNB1, TOR1, CAC1, CRG1, URE1, FHB1, BWC2, CNA1, CBP1, TSA1, STE7, FTR1, PAK1, CAP59, ICL1, GPA1, GPB1, RAS1, CCP1 and TRR1) to investigate the origin of the Vancouver Island outbreak isolates. MLST differentiated all four major molecular types of C. gattii (VGI, VGII, VGIII and VGIV) and highlighted two possible origins (Australia or South America) for the outbreak strains.
Statistical analysis using Simpson’s index of diversity revealed that, for both previously studied MLST data sets, a minimum of seven loci is required to differentiate between the sequence types of all strains (Fig. 1). For the Litvintseva et al. MLST data set, the following loci resulted in the highest discrimination of the investigated strains: CAP59, IGS1, GPD1, LAC1, PLB1, MP88 and SOD1, with a Simpson’s index of diversity of 0.9632. For the Fraser et al. MLST data set, the most discriminatory loci were GPD1, IGS1, TEF1, LAC1, MPD1, CAP10 and PLB1, which resulted in a Simpson’s index of diversity of 0.9319.
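The kind of calculation described here can be sketched in a few lines of Python. The sketch below is not the software used for the analysis; it assumes the unbiased form of Simpson’s index that is commonly applied to typing schemes, D = 1 − Σ n_j(n_j − 1) / (N(N − 1)), where N is the number of strains and n_j the number of strains of the j-th type. The function names and the exhaustive search over locus subsets are hypothetical choices made for illustration.

```python
from collections import Counter
from itertools import combinations


def simpsons_diversity(types):
    """Simpson's index of diversity for a list of type labels.

    Assumes the unbiased form D = 1 - sum(n_j * (n_j - 1)) / (N * (N - 1)).
    """
    n = len(types)
    if n < 2:
        raise ValueError("at least two strains are required")
    counts = Counter(types)
    return 1.0 - sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def best_locus_subset(profiles, loci, k):
    """Return the k-locus subset with the highest diversity index.

    profiles: one dict per strain, mapping locus name -> allele type.
    Every combination of k loci is tried, which is feasible for the
    small numbers of loci used in MLST schemes.
    """
    best = None
    for subset in combinations(loci, k):
        # A strain's type under this subset is its tuple of allele types.
        types = [tuple(p[locus] for locus in subset) for p in profiles]
        d = simpsons_diversity(types)
        if best is None or d > best[1]:
            best = (subset, d)
    return best
```

Running `best_locus_subset` for increasing values of `k` on a table of allele profiles shows how the index rises as loci are added and at what point adding a further locus no longer separates additional strains.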
Both MLST schemes utilized highly polymorphic loci, which resulted in stable and reproducible typing systems that were able to distinguish between closely related strains. While using as many genetic loci as possible would enhance the discriminatory power of the MLST scheme, it is more pragmatic to achieve the maximal level of differentiation with a minimal set of genetic loci. The ideal MLST scheme for the Cryptococcus species complex should fulfill two criteria: (i) it should amplify and type the same genes from all five serotypes/eight molecular types using the same set of primers, and (ii) the selected genes should contain sufficient sequence diversity to produce a discriminatory typing scheme. Taking these criteria into account, the working group selected a set of seven gene loci for a cryptococcal consensus MLST scheme, based on the results of the previously published studies by Litvintseva et al. and Fraser et al., and on additional unpublished data obtained by Meyer et al. and Fisher et al. Special emphasis was placed on loci that exhibited the largest number of different allele types and that could be amplified with the same primers from all eight major molecular types identified previously for C. neoformans and C. gattii. The selected loci comprise six housekeeping genes, CAP59, GPD1, LAC1, PLB1, SOD1 and URA5, of which three code for cryptococcal virulence factors, namely the polysaccharide capsule (CAP59), melanin synthesis (LAC1) and cell invasion (PLB1), and the intergenic spacer IGS1, which was selected for its high allelic diversity.
All of the MLST loci proposed here, except for the CAP59 locus, are similar to those used previously, which enables incorporation of, and comparison with, all previously obtained data. The region of the CAP59 locus proposed for the consensus MLST scheme is a different fragment of the CAP59 gene from that used by Litvintseva et al. (Fig. 2). This new fragment was chosen because it can be amplified from all eight molecular types using the same primers.
An additional locus, TEF1, which also showed high discriminatory power when used for C. neoformans var. grubii and for C. gattii molecular type VGII, was excluded from the consensus typing scheme. This was based on the fact that sequence data are only available for C. neoformans var. grubii and technical problems had been encountered when amplifying this locus. However, this locus may offer additional discrimination in some of the eight major molecular types.
To enable amplification of all seven loci from the eight major molecular types of C. neoformans and C. gattii, the previously published primers were tested on all eight major molecular types in three of the six laboratories that collaborated in the development of the consensus MLST scheme presented here (Teun Boekhout’s laboratory at the CBS, June Kwon-Chung’s laboratory at the NIH, Matthew Fisher’s laboratory at Imperial College, Wieland Meyer’s laboratory at the University of Sydney, Tom Mitchell’s laboratory at Duke University, and Maria Anna Viviani’s laboratory at the Università degli Studi di Milano). Satisfactory amplifications were obtained for all loci except SOD1, for which two different sets of primers were finally used to amplify either VNI–VNIV for C. neoformans or VGI–VGIV for C. gattii. The specific primers and the suggested amplification conditions for the seven gene loci are given in Table 3. Primer directions are listed according to the orientation in the genome sequence of strain H99 at the Broad Institute (http://www.broad.mit.edu). Variations in the quality of the amplification products, resulting from the Taq DNA polymerase, the PCR machine or the PCR conditions used, were observed between participating laboratories. For that reason, the amplification conditions given in Table 3 should serve only as a guideline that may be optimized by individual laboratories.
Allele types for C. neoformans were assigned according to Litvintseva et al. and for C. gattii according to Fraser et al., where applicable. The exact start and end points for the sequence of each analyzed locus are given in Table 3, based on the H99 genome sequence at the Broad Institute (http://www.broad.mit.edu/); these may change over time as more strains are studied. The latest sequence cut points are listed on the webpage for each locus. To standardize the assignment of allele types (AT) and sequence types (ST), a centralized, globally accessible MLST database will be established at www.mlst.net/. The online software NRDB (http://linux.mlst.net/nrdb/nrdb.htm) allows automatic retrieval of allele and sequence types and will assign a new allele and sequence type to any submitted unknown sequence. These are then uploaded to the database by a database curator. The designated curators can be contacted via the website.
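The logic of allele type and sequence type assignment can be illustrated with a minimal Python sketch. It is not the NRDB software or the mlst.net database; the class name `MlstRegistry`, its methods and the sequential numbering of new types are hypothetical, and sequences are assumed to have been trimmed to the agreed start and end points beforehand. In the real scheme, new types are assigned only after review by a curator.

```python
# The seven consensus loci, in the order given in the text.
LOCI = ("CAP59", "GPD1", "LAC1", "PLB1", "SOD1", "URA5", "IGS1")


class MlstRegistry:
    """Toy in-memory registry of allele types (AT) and sequence types (ST)."""

    def __init__(self):
        # Per locus: trimmed sequence -> allele type number.
        self.alleles = {locus: {} for locus in LOCI}
        # Allelic profile (tuple of seven allele types) -> sequence type number.
        self.sequence_types = {}

    def allele_type(self, locus, sequence):
        """Return the allele type of a sequence, registering it if unknown."""
        seq = sequence.strip().upper()
        table = self.alleles[locus]
        if seq not in table:
            table[seq] = len(table) + 1  # next unused number (assumption)
        return table[seq]

    def sequence_type(self, sequences):
        """Return (ST, allelic profile) for one isolate.

        sequences: dict mapping each locus in LOCI to its trimmed sequence.
        """
        profile = tuple(self.allele_type(locus, sequences[locus]) for locus in LOCI)
        if profile not in self.sequence_types:
            self.sequence_types[profile] = len(self.sequence_types) + 1
        return self.sequence_types[profile], profile
```

Two isolates receive the same sequence type only if their sequences are identical at all seven loci; a single nucleotide difference at any one locus yields a new allele type at that locus and therefore a new sequence type.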
In conclusion, the ISHAM working group on ‘Genotyping of Cryptococcus neoformans and C. gattii’ proposes the following set of genetic loci as an international standard for multi-locus sequence typing of C. neoformans and C. gattii: CAP59, GPD1, LAC1, PLB1, SOD1, URA5 and IGS1.
The authors would like to thank Matthew O’Sullivan for allowing us to use the software page developed as part of his PhD to determine the number of gene loci essential for an MLST scheme based on Simpson’s index of diversity. This work was supported by an NH&MRC project grant #352303 to Wieland Meyer. June Kwon-Chung was supported by funds from the intramural program of the National Institute of Allergy and Infectious Diseases, National Institutes of Health, USA. Matthew Fisher and David Aanensen were supported by the Wellcome Trust. Sitali Simwami was supported by the BBSRC, UK. Ferry Hagen was supported by funds from the Odo van Vloten Foundation. Anastasia P. Litvintseva and Thomas G. Mitchell were supported by a US Public Health Service NIH grant AI 25783. Luciana Trilles was supported by a CAPES scholarship from the Ministério da Educação, Brazil. Sirada Kaocharoen was supported by the Chulalongkorn University Graduate Scholarship to commemorate the 72nd anniversary of His Majesty King Bhumibol Adulyadej, Thailand.
Declaration of interest: The authors report no conflicts of interest. The authors alone are responsible for the content and writing of the paper.