Molecular Analysis of C. gattii VGII Outbreak vs. Global Isolates
To examine the C. gattii outbreak isolates collected from 2005 to 2009, an in-depth stepwise molecular analysis was applied to each isolate, and the genotypes were compared with other global genotypes. In total, 20 markers were selected for analysis. These markers include both coding and noncoding genomic regions and vary in size and allelic diversity. Additionally, all of the markers are randomly distributed among the chromosomes in the most recent assembly of the reference C. gattii VGI genome, WM276. Initially, all isolates were sequenced at eight MLST markers and four variable number of tandem repeats (VNTR) markers. Next, global isolates were selected for diversity, and several isolates from each of the primary genotypes in the expansion region were chosen for sequence analysis at eight additional MLST loci, bringing the total number of genetic markers analyzed for these isolates to 20. As expected, the MLST markers were more conserved, while the VNTR markers allowed higher-resolution differentiation between isolates that appeared identical by MLST analysis. The generated datasets were then concatenated both without and with VNTR data.
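The effect of concatenating the marker sets can be illustrated with a minimal sketch, assuming each isolate is summarized as a tuple of allele numbers per locus. The isolate names, allele numbers, and the `assign_sequence_types` helper are hypothetical, and the ST labels it produces are arbitrary counters, not the ST numbers reported in this study.

```python
from typing import Dict, Tuple


def assign_sequence_types(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Label each distinct allele profile with an ST number, in order of first appearance."""
    st_of_profile: Dict[Tuple[int, ...], int] = {}
    st_of_isolate: Dict[str, int] = {}
    for isolate, profile in profiles.items():
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        st_of_isolate[isolate] = st_of_profile[profile]
    return st_of_isolate


# Hypothetical allele numbers (not data from the study).
mlst = {"iso_A": (1, 1, 1), "iso_B": (1, 1, 1), "iso_C": (2, 1, 3)}
vntr = {"iso_A": (5, 5), "iso_B": (5, 6), "iso_C": (7, 5)}

# Concatenate the MLST and VNTR profiles for each isolate.
combined = {name: mlst[name] + vntr[name] for name in mlst}

mlst_only_sts = assign_sequence_types(mlst)
combined_sts = assign_sequence_types(combined)

print(mlst_only_sts)   # iso_A and iso_B share one ST by MLST alone
print(combined_sts)    # the VNTR alleles separate iso_A from iso_B
```

In this toy example two isolates that are indistinguishable by MLST alone fall into separate sequence types once the VNTR alleles are appended, which is the behavior described for the combined dataset.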
Geographic dispersal of pathogenic C. gattii genotypes in the United States.
Markers used in the study are dispersed in the genome.
Clustering analysis of global VGII isolates shows high global diversity.
Expanded molecular analysis reveals increased divergence in VGIIc.
The combined analysis of the results presented here and a 30-marker MLST analysis conducted previously reveals several findings of interest in relation to VGII genotypes in the region. From the analysis of 34 markers (30 MLST/4 VNTR), we show that the Vancouver Island VGIIa/major isolates are identical at all loci to several recent isolates from Washington and Oregon, as well as a historical clinical isolate (1970s), NIH444, from Seattle. Additionally, the VGIIb/minor isolates from Australia and Vancouver Island are identical at all 34 loci, and also identical to VGIIb/minor isolates from Oregon at 20 loci (16 MLST/4 VNTR). Furthermore, all VGIIc isolates to date are identical across all 20 loci examined. However, we are also able to discriminate the outbreak VGIIa genotype from an environmental VGIIa isolate from California, CBS7750, and clinical VGIIa isolates CA1014 and ICB107 from California and Brazil, respectively, at one or more MLST/VNTR loci. It is clear from prior studies that the VGIIa/major and VGIIb/minor isolates are clonal lineages, and here we confirmed that this is the case for the nine VGIIc/novel isolates, based on seven-locus MLST analysis of the global VGII population (Figure S1).
The largest and most comprehensive dataset arose from the combined analysis of seven MLST and four VNTR loci, resulting in a total of 41 sequence types (STs). This dataset was generated from clinical, veterinary, and environmental C. gattii isolates (Figure S1, Table S3). From the analysis, it is clear that the VGIIa/b/c clusters are related to each other but distinct. In addition, the data show that the VGIIa/major clade clusters closely with VGIIc, further validating prior reports that examined a more limited number of loci. VGIIc (ST21) also shares high sequence identity with ST34, represented by a mating type a clinical isolate from Colombia, suggesting that the VGIIc genotype may have resulted from a-α mating, even though all isolates related to the Pacific NW outbreak are exclusively α mating type. Additionally, Vancouver Island isolates from our collection that had not been fully typed by MLST were sequenced at two loci to determine whether any were unrecognized VGIIc isolates (n = 56) (Figure S2). Of these, 51 were found to be VGIIa, five were VGIIb, and none were VGIIc, consistent with previous data from the region. Thus, VGIIc appears to remain exclusive to the United States, specifically Oregon, and has never been reported from Vancouver Island, the mainland of Canada, Washington State, or elsewhere globally.
Within the VGIIa/major cluster, based on the initial MLST analysis of 30 loci, only a single isolate (ICB107) could be distinguished from the other VGIIa isolates, and at only one locus. To further investigate this homogeneous population, which causes the vast majority of the outbreak-related morbidity and mortality, we expanded the molecular analysis to include highly variable regions of the genome. The application of these VNTR markers, in combination with the MLST markers, allowed us to resolve five independent STs within the VGIIa/major genotype and related isolates.
These five sequence types (ST1, ST2, ST3, ST13, ST30) contained a total of 44 isolates (Table S3). The canonical VGIIa/major outbreak genotype, ST1, contained the vast majority of the 44 isolates (n = 38). As expected based on previous models of the C. gattii outbreak expansion, ST1 consisted of isolates exclusively from the initial outbreak and expansion zones, including British Columbia, Washington, and Oregon (Table S3). These results further support the hypothesis that the epicenter of the outbreak was on Vancouver Island, beginning in the late 1990s, with a direct expansion into neighboring mainland British Columbia and subsequently into the United States. The only exception in this dataset is isolate NIH444, an older isolate from the region that was recovered from a patient sputum sample in Seattle in the early 1970s and is also identical at all 34 markers examined. This suggests that the VGIIa/major genotype responsible for most of the outbreak cases may have been circulating in the region prior to the outbreak. The travel history of this patient is unknown, and could therefore have involved exposure on Vancouver Island. Overall, this analysis provides increased evidence that the outbreak genotype is unique to the region thus far, and molecularly distinct from closely related isolates from both California and South America.
While the homogeneous nature of the VGIIa/major isolates based on robust molecular typing validated previous models, an underlying diversity within this group was also discovered. First, we further validated that the isolate ICB107 (ST13), from Brazil, was indeed distinct from the ST1 VGIIa/major clade. This isolate differs at one MLST marker (LAC1) and three VNTR markers (VNTR3, VNTR15, VNTR34). Additionally, the high-resolution sequence analysis was able to discriminate other VGIIa isolates that were collected from California. These include isolate CBS7750 (ST3), collected from the environment in San Francisco in 1990, and isolate CA1014 (ST2), which was isolated from a patient with HIV infection in southern California. These two isolates differ from ST1 by unique mutations within the VNTR7 and VNTR34 loci, respectively. This shows that similar VGIIa genotype isolates have been found elsewhere, but that none are identical to those circulating as part of the ongoing Vancouver Island outbreak. Whether these isolates result from drift from ST1, or whether ST1 arose from one of these related genotypes, is not known.
In addition to discriminating VGIIa isolates that were not from the outbreak region, we also found a novel ST, ST30, which is highly similar to ST1, but divergent at a unique region of VNTR34. Interestingly, all three of the ST30 isolates are exclusively from Oregon, including two human clinical cases and one marine mammal case (Table S3). These results are consistent with an expansion followed by genetic drift in the highly variable VNTR loci. Isolates of ST30 have not been detected on Vancouver Island, indicating that this divergence is recent, and likely occurred after the expansion of ST1 into the United States. Alternatively, both ST1 (VGIIa/major) and ST30 may have been present for a long period, with only ST1 having been transferred to Vancouver Island.
To gain insights into the potential origins of the VGIIc genotype, and to assess its position within the overall VGII clade, clustering analysis was applied. Analysis of the combined dataset including 41 sequence types generated from 115 C. gattii isolates shows that the VGIIc genotype is independent of, but similar to, VGIIa. The closest relationship determined from the analysis was to ST34, an isolate from Colombia, which is also of the opposite a mating type. Moving beyond the direct branch, the VGIIc genotype appears to share sequence similarities with global isolates from South America, Africa, and also European isolates with likely African origins based on collected clinical case histories. Additionally, the VGIIc group shares the IGS1 allele with isolates from Australia, further obscuring the possible origins and necessitating a more thorough analysis.
When the clustering analysis was expanded to include additional MLST loci, both with and without the VNTR markers, the relationships of VGIIc to other global genotypes were further elucidated, with close relationships observed with global isolates from South America, Africa, Europe (Greece), and Australia (Table S4). These results increase the comprehensiveness of the analysis, and allow predictions of the relationship of this genotype to global isolates. Examination of alleles illustrates that, when the analysis is expanded, the VGIIc group appears to be more divergent from VGIIa and VGIIb. Each allele represented in green was initially denoted as an allele that was unique to the VGIIc genotype, with a total of seven such alleles. To further elucidate the possible origins of these alleles, isolates selected based on their global diversity were sequenced at these loci. Identical matches for four of the seven VGIIc-unique alleles were identified in isolates from Brazil, Australia, Europe, and European isolates with likely African origins, while three alleles (SXI1α, HOG1, and CRG1) remain unique to this novel genotype and have been seen only in Oregon thus far.
To further characterize the genetic relationships among the global isolates in relation to the outbreak isolates, maximum likelihood (ML) analysis was applied. Initially, the isolates were characterized at 15 MLST loci, excluding the MAT locus so that both α and a isolates could be included. This analysis indicates that VGIIc may be more distantly related to the VGIIa/major genotype than initially observed. In addition, analysis of the 15 MLST loci shows a possible relation of VGIIc with isolates from South America, Africa, Europe, and Australia. When this analysis was expanded to also include the four VNTR loci, similar results for the global comparisons of all genotypes and the relation of VGIIc to global isolates were observed. For these reasons, additional sampling and analysis will be necessary to determine more precisely whether this novel virulent genotype originated locally or in an under-sampled region.
In addition to clustering analyses, TCS haplotype-mapping software was applied to establish the evolutionary histories of the MLST alleles examined during the analysis (Figure S3). From the sequence results, all of the VGIIc isolates were determined to be 100% identical, suggesting a recent emergence in which all of the isolates are clonally derived. To test this hypothesis, the TCS analysis was used to examine individual loci and determine which alleles are likely ancestral, intermediate, or recently derived. Of the sixteen loci examined, eight were consistent with VGIIc possessing the ancestral allele, six alleles were distal nodes at the terminal ends of their respective haplotype networks, and two alleles occupied intermediate positions.
Haplotype networks define allele ancestry.
Evidence for recombination within the VGII molecular type.
Alleles with ancestral genotypes are less informative because these alleles may not have diversified over time in the VGIIc lineage for various reasons, including selection pressures and overall lack of diversity at the allele. When only non-ancestral alleles were examined, 75% lay at the distal ends of their haplotype maps. Intriguingly, the three VGIIc alleles unique to the genotype (SXI1α, HOG1, and CRG1) all have distal placements. Additionally, the most recent ancestor to VGIIc in all three cases can be shown to derive from isolates that are from South America and Australia, indicating that VGIIc may have emerged out of one of these regions. While other regions including Europe and North America can be seen, no other regions are observed for all three of these alleles. These distal placements are consistent with a recent divergence of the unique VGIIc lineage. The haplotype analysis, in combination with the lack of any underlying diversity within the nine VGIIc isolates analyzed, indicates a recent emergence of this novel virulent genotype in Oregon.
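The 75% figure follows directly from the locus counts reported for the haplotype networks (eight ancestral, six distal, two intermediate). A short check of that arithmetic, with variable names chosen here purely for illustration:

```python
from fractions import Fraction

# Counts taken from the text: placement of the VGIIc allele in each of the 16 haplotype networks.
placements = {"ancestral": 8, "distal": 6, "intermediate": 2}

total_loci = sum(placements.values())
non_ancestral = total_loci - placements["ancestral"]

# Share of the non-ancestral VGIIc alleles that sit at distal (terminal) positions.
distal_share = Fraction(placements["distal"], non_ancestral)

print(total_loci, non_ancestral, float(distal_share))
```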
To examine the role that recombination may have played in the population structure of the VGII molecular type, we conducted paired allele analysis for 25 representative global isolates (Figure S4). The discovery of all four possible allele combinations between two unlinked loci (AB, ab, Ab, aB) serves as evidence for likely recombination. From this analysis, we show that isolates collected from South America, Africa, and Australia appear to be involved in recombination events. Representative VGIIa/major, VGIIb/minor, and VGIIc/novel isolates were found among groups of recombinant isolates. A group of ten isolates, all α, from South America and Africa (Figure S4) appeared most commonly as recombinant partners, although several a mating type isolates were also involved, less frequently. In further support, when we examined the number of genotypes present by region and compared these data to the total number of genotypes represented (Figure S1), it is clear that South American and African populations are more diverse than isolates from North America, which are more clonal. Additionally, while the observed diversity in Australia was lower than in South America and Africa, this may be attributable to sampling bias toward clonal regions, as prior studies have shown that this continent is a region with high levels of recombination due to both same-sex and opposite-sex mating events. In addition to the paired allele analysis, allele diagrams were constructed to observe possible recombination within individual MLST loci (Figure S5). The most parsimonious explanation for allelic diversity in 11 of the MLST loci analyzed is consecutive and/or independent mutations within the population. Within the four remaining loci, there exists at least one hybrid allele that may be the result of a recombination event between two hypothesized parental alleles in the global VGII population (Figure S5). Phenotypic mating assays were conducted and illustrate that the VGIIa/major (α), VGIIc/novel (α), and VGII mating type a genotypes, as well as several of the proposed parental contributors from the allelic and genotypic recombination analysis, are fertile and produce spores when mated with fertile VGIII isolates (Table S5). Taken together, this suggests that both α-α and a-α mating events may be contributing to the formation of recombinant genotypes as well as the production of infectious spores. There were no examples of alleles introgressed into VGII from VGI, VGIII, or VGIV, in accord with findings that the four VG molecular types likely represent cryptic species. In summary, these results suggest that recombination events may be critical driving forces in the evolution of C. gattii VGII diversity, which may in part contribute to the generation of genotypes displaying increased virulence.
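The paired allele criterion described above (all four combinations AB, ab, Ab, aB observed between two loci) can be sketched as follows. This is a minimal illustration, not the software used in the study: the `four_gamete_pairs` helper, the isolate names, the locus names, and the allele labels are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def four_gamete_pairs(profiles: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return locus pairs where two alleles at each locus occur in all four combinations."""
    loci = sorted({locus for alleles in profiles.values() for locus in alleles})
    hits = []
    for locus_1, locus_2 in combinations(loci, 2):
        # Allele combinations actually observed among isolates typed at both loci.
        observed = {
            (p[locus_1], p[locus_2])
            for p in profiles.values()
            if locus_1 in p and locus_2 in p
        }
        alleles_1 = sorted({a for a, _ in observed})
        alleles_2 = sorted({b for _, b in observed})
        found = any(
            {(a, b), (a, d), (c, b), (c, d)} <= observed
            for a, c in combinations(alleles_1, 2)
            for b, d in combinations(alleles_2, 2)
        )
        if found:
            hits.append((locus_1, locus_2))
    return hits


# Hypothetical isolates and alleles (not data from the study).
example = {
    "iso1": {"locus1": "A", "locus2": "B", "locus3": "X"},
    "iso2": {"locus1": "a", "locus2": "b", "locus3": "X"},
    "iso3": {"locus1": "A", "locus2": "b", "locus3": "X"},
    "iso4": {"locus1": "a", "locus2": "B", "locus3": "X"},
}

recombinant_pairs = four_gamete_pairs(example)
print(recombinant_pairs)
```

Here only the first two loci show all four combinations; the third locus is monomorphic and so cannot contribute. As in the text, a positive result is treated as evidence for likely recombination rather than proof of it.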
Proposed recombinant alleles and hypothesized parental contributors.
VGIIc/novel and VGIIa/major Outbreak Isolates Are Hypervirulent
It has recently been shown that intracellular proliferation rate (IPR) values for cryptococcal cells within macrophages are positively correlated with virulence in the murine model for cryptococcosis. To further elucidate the potential virulence of outbreak isolates collected from the United States, proliferation rates of selected isolates were tested and compared to other isolates for which proliferation data had been previously obtained. In total, IPR values for eight of the nine VGIIc isolates were measured. In addition, the type strains for VGIIa/major (R265) and VGIIb/minor (R272) were included as controls, and previously published data for other VGIIa and VGIIb isolates were included for comparison. On the basis of individual strains, seven of the eight VGIIc/novel isolates showed high IPR levels, with only a single outlier (EJB52) that had a low IPR value (0.97). Taken together, the median IPR value for VGIIc is significantly closer to that of VGIIa/major than to that of VGIIb/minor. These results indicate that the VGIIc genotype has a similar intracellular phenotype, and thus a similar virulence profile, to the VGIIa/major genotype. This is noteworthy because previous analysis showed that the VGIIa/major genotype isolates from the outbreak had unusually high IPR values, and the VGIIc isolates from the same outbreak are here shown to have similarly high IPR values.
In vitro analyses of intracellular proliferation and mitochondrial morphology provide evidence the VGIIc genotype is hypervirulent.
Another unique feature of the outbreak VGIIa/major isolates is the ability to form highly tubular mitochondria after intracellular parasitism, a characteristic that correlates with both IPR and murine virulence. To explore the mitochondrial morphology of VGIIc isolates, we examined selected isolates in DMEM medium and after exposure to macrophages. This analysis included two VGII environmental isolates (CBS8684, CBS7750) and four of the VGIIc/novel isolates. As expected, the vast majority of the mitochondria for all six isolates were non-tubular after exposure to DMEM medium alone. However, after exposure to macrophages, three of the four VGIIc isolates tested showed significantly higher percentages of tubular morphology. The lone VGIIc isolate that did not exhibit this morphology (EJB52) was the same isolate that also had a low IPR value, and is thus an overall outlier for the VGIIc genotype.
When IPR was plotted against the percentage of cells exhibiting tubular morphology, the two measures showed a statistically significant correlation, with an R² value of 0.85. These results further indicate that the VGIIc genotype is phenotypically similar to the Vancouver Island VGIIa/major outbreak strains. Our results also support similar mechanisms regulating the increased virulence seen in the novel VGIIc genotype. The exact roles that the mitochondrial tubular morphology might play in virulence are not yet known. However, the distinct phenotype is clearly unique to the outbreak isolates and is correlated with an increased ability to grow and divide within host innate immune cells.
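For readers unfamiliar with the statistic, the sketch below shows how an R² value of this kind can be computed, assuming a simple least-squares straight-line fit (the text does not state the fitting method used). The `r_squared` helper and the input values are hypothetical placeholders and are not measurements from this study.

```python
from typing import Sequence


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Coefficient of determination for a least-squares straight-line fit of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, with at least two points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    # For a single predictor, R^2 equals the squared Pearson correlation.
    return sxy ** 2 / (sxx * syy)


# Placeholder values only: a perfectly linear toy relationship, not study data.
toy_ipr = [1.0, 2.0, 3.0, 4.0]
toy_percent_tubular = [10.0, 20.0, 30.0, 40.0]

toy_r2 = r_squared(toy_ipr, toy_percent_tubular)
print(toy_r2)
```

With real measurements, one value pair per isolate would replace the toy lists, and a value near 1 would indicate that most of the variation in tubular morphology is accounted for by a linear relationship with IPR.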
The VGIIc isolates were found to be highly virulent in the murine inhalation model of infection. Two studies were conducted to examine virulence. In the first murine experiment, a total of six isolates (n = 5 animals/isolate) were examined, including two VGIIc isolates. The VGIIa/major isolate R265 served as a positive control for high virulence, based on prior studies, and the VGIIc isolates EJB15 and EJB18 showed virulence similar to that of this well-characterized virulent isolate. Additionally, two VGIIa isolates that are not hypothesized to be from the current Vancouver Island outbreak, NIH444, which is fully identical across 34 markers, and CA1014, which differs from R265 at VNTR34, showed a significant reduction in virulence compared to the high virulence isolates (P<0.05). Finally, in accordance with previous studies, the VGIIb/minor type strain R272 from Vancouver Island was avirulent in this model.
Isolates from the United States outbreak are hypervirulent.
The analysis of virulence within the VGII genotype was extended in a second experiment, in which 12 isolates (n = 9–10 animals/isolate) were examined. This study included two VGIIa/major isolates from the outbreak zone, two VGIIb/minor isolates from the outbreak zone, five of the novel VGIIc isolates, two VGIIa-related isolates that are not part of the outbreak, and the C. neoformans type strain, H99. The H99 isolate used (H99S) has been shown to be highly virulent in the murine model of infection. As expected, all five of the VGIIc isolates from Oregon, the VGIIa/major isolates from Vancouver Island and Oregon, and the highly virulent H99 isolate exhibited a high level of virulence (median survival = 20.6 days). The VGIIb/minor isolates tested were significantly decreased in virulence compared to the more virulent VGIIa and VGIIc genotypes (P<0.005). The VGIIb isolate R272 was avirulent, whereas the VGIIb isolate EJB53 from Oregon exhibited significantly less virulence than the VGIIa/major and VGIIc isolates (P<0.005, median survival = 46 days). As in the first animal study, two VGIIa isolates that differ at one or more molecular markers from the major VGIIa outbreak genotype were also tested. The environmental isolate CBS7750 and a clinical isolate from South America, ICB107, were significantly attenuated (P<0.005). These results provide further evidence that these isolates are related to but distinguishable from isolates that are specific to the Vancouver Island outbreak and subsequent United States expansion, and are decreased in their ability to mount fatal infections in a mouse intranasal instillation model of infection.
The cause of infection was further evaluated by histopathological analysis of lung sections recovered from two infected animals per isolate at sacrifice. Harvested organs were processed and sectioned for slides with H&E staining. The lungs from the virulent isolates showed significant inflammation and numerous cryptococcal cells dispersed throughout the alveoli, in accordance with severe pulmonary infection. Our findings show that there are no major clinical differences between pulmonary infections with the infectious genotypes VGIIa/major and the novel VGIIc genotype. These results further support similar disease progression caused by these two highly virulent outbreak genotypes.