The release of several whole genome sequences of C. burnetii has facilitated the development of genotyping schemes and the characterization of collections. Due to the complexity of culturing C. burnetii and select agent restrictions, collections of C. burnetii are small and rare, making it all the more important to build upon existing work in order to facilitate inter-laboratory comparisons among the collections that are available. Such comparisons will lead to a better understanding of the phylogeographic distribution of this pathogen historically, at present, and in the future.
In this study, we exploit SNP signatures from the readily available MST scheme for C. burnetii and convert them into inexpensive, high-throughput, transferable assays that can be used to quickly determine the genomic group and MST genotype of existing samples. This study further adds 41 isolates from a large collection of C. burnetii maintained in the United States to the total number of strains with MST information, thereby expanding the data available for worldwide comparisons while demonstrating proof of principle of our methods. We hope this will encourage further genotyping that builds upon our understanding of the phylogeography of C. burnetii as new isolates are collected.
The strength of the typing scheme presented is that it allows for accurate identification of genotypes and rapid characterization of new isolates or field-collected samples from natural outbreaks or from a suspected intentional release. The CDC will be evaluating the use of the method for such purposes and for forensic investigations, association of particular strains with virulence, reservoir specificity, and the geographic origin of strains. An example of this use is shown in the analysis of genotype 8, which appears to be associated with chronic human disease, particularly Q fever endocarditis, with goats as the reservoir. It may be prudent to evaluate acute human infections associated with contact with goats and goat farming products to determine whether genotype 8 strains are involved. Human infections identified as being due to genotype 8 strains may warrant more intense scrutiny and follow-up evaluation because of the potential for development of chronic disease. While the treatment regimen for all strains of C. burnetii is identical, it may be appropriate to emphasize the treatment of acute illness caused by certain strains associated with chronic infections. Of particular interest, however, is that none of the 35 genotype 8 isolates has yet been associated with acute human infection, suggesting that these isolates may cause a benign or asymptomatic acute infection that is therefore generally not treated and may subsequently develop into chronic disease such as endocarditis. Additional studies, and the analysis of difficult-to-identify asymptomatic acute infections, will be necessary to determine the full pathogenic potential of the genotype 8 strains.
While we provide the signatures that will discriminate among all MST genotypes, we did not develop all such signatures into assays. There are 44 phylogenetic branches supported by SNPs, yet we designed assays for SNPs on only 14 branches. We selected the major branches for assay development along with a small number of other branches that would narrow down the list of possible genotypes within a genomic group for the bacterial strains tested. To do this, we designed and tested assays in an iterative fashion. As we did not have access to samples from each MST genotype, we did not design assays for each one, because we would not have been able to test such assays thoroughly. The signatures for every branch on the tree, however, are listed in Table S1 and could readily be developed into assays if needed. Also, some differences among MST genotypes are due solely to indels and were therefore not incorporated into assays that could discriminate between such genotypes.
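The logic of narrowing down possible genotypes with branch-specific assays can be sketched in a few lines of Python. This is an illustration only, not the procedure used in this study: the branch names (B1 to B3), genotype labels (g1 to g5), and the function candidate_genotypes are hypothetical placeholders, and the sketch assumes each assay reports whether an isolate carries the derived or the ancestral SNP state for one branch.

```python
from typing import Dict, Set

# Hypothetical toy data: branch names and genotype labels are placeholders,
# not the branches or MST genotypes described in the study.
ALL_GENOTYPES: Set[str] = {"g1", "g2", "g3", "g4", "g5"}
BRANCH_MEMBERS: Dict[str, Set[str]] = {
    "B1": {"g1", "g2", "g3"},  # genotypes descending from branch B1
    "B2": {"g1", "g2"},        # B2 is nested within B1
    "B3": {"g4", "g5"},
}


def candidate_genotypes(
    all_genotypes: Set[str],
    branch_members: Dict[str, Set[str]],
    calls: Dict[str, str],
) -> Set[str]:
    """Return the genotypes still compatible with a set of assay calls.

    calls maps a branch name to "derived" or "ancestral"; any other value
    (for example "no_call") leaves the candidate set unchanged.
    """
    candidates = set(all_genotypes)
    for branch, state in calls.items():
        members = branch_members[branch]
        if state == "derived":
            # Isolate sits on or below this branch.
            candidates &= members
        elif state == "ancestral":
            # Isolate cannot belong to any genotype below this branch.
            candidates -= members
    return candidates


example_calls = {"B1": "derived", "B2": "ancestral", "B3": "no_call"}
print(sorted(candidate_genotypes(ALL_GENOTYPES, BRANCH_MEMBERS, example_calls)))
```

In this toy case the derived call on B1 and the ancestral call on B2 together leave a single candidate, g3, which mirrors how a small number of well-chosen branches can restrict the list of possible genotypes within a genomic group.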
While rapidity and simplicity are important advantages of SNP-based genotyping assays, another benefit is robustness. As SNP mutation rates are low, the likelihood of convergent or reverse mutations is low, making homoplasy unlikely in the absence of selection. Furthermore, homoplastic data are not likely to result in incorrect phylogenetic placement, as the homoplastic signal will conflict with the congruent signal produced by other loci. This redundancy can occur even if a single locus is selected to represent each branch (canonical SNP). Of the 63 strains and 7 whole genome sequences genotyped against 14 SNP loci, there was only a single instance of homoplasy (1/972 data points), confirming the evolutionary stability of these signatures. Because SNPs from multiple branches are assayed against each isolate, the scheme provides a level of redundancy that makes it easy to spot suspicious results, which arise when two assays on different parts of the phylogeny place the same isolate in two mutually exclusive clades, as was seen with sample L 35 and assay Cox18bp166.
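The redundancy check described above, in which two assays place one isolate in mutually exclusive clades, can be expressed as a simple test for disjoint clades. The following Python sketch is illustrative only and was not part of the analysis in this study; the branch names, genotype labels, and the function find_conflicts are hypothetical, and two branches are treated as mutually exclusive when they share no descendant genotypes.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

# Hypothetical toy phylogeny: names are placeholders, not study data.
BRANCH_MEMBERS: Dict[str, Set[str]] = {
    "B1": {"g1", "g2", "g3"},
    "B2": {"g1", "g2"},  # nested within B1, so B1 and B2 are compatible
    "B3": {"g4", "g5"},  # shares no genotypes with B1 or B2
}


def find_conflicts(
    branch_members: Dict[str, Set[str]],
    calls: Dict[str, str],
) -> List[Tuple[str, str]]:
    """List pairs of branches that are both called derived but whose
    descendant genotype sets are disjoint (mutually exclusive clades)."""
    derived = sorted(b for b, state in calls.items() if state == "derived")
    conflicts = []
    for a, b in combinations(derived, 2):
        if branch_members[a].isdisjoint(branch_members[b]):
            conflicts.append((a, b))
    return conflicts


consistent = {"B1": "derived", "B2": "derived", "B3": "ancestral"}
suspicious = {"B1": "derived", "B2": "derived", "B3": "derived"}
print(find_conflicts(BRANCH_MEMBERS, consistent))
print(find_conflicts(BRANCH_MEMBERS, suspicious))
```

A flagged pair does not by itself say which assay is wrong; as in the text, it marks the result as suspicious so that the isolate can be retested or the locus examined for homoplasy.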
Of note, the typing scheme presented here is designed to be fully comparable with MST genotype data and not to identify novel genotypes. However, it is reasonable to assume that novel genotypes may be found. For example, in silico analysis of the whole genome sequence of Dugway 5J108–111 in this work revealed novel alleles at 8/10 loci, resulting in a novel genotype; see also Mediannikov et al., where isolates from ticks yielded new genotypes that were composed of different combinations of already known alleles rather than new alleles.
The typing scheme presented here is compatible with current as well as emerging genotyping technologies. SNP-based assays are highly amenable to adaptation to different platforms, chemistries (TaqMan, SYBR), and a variety of allelic detection machinery. For example, TaqMan assays have the potential for extremely sensitive detection and have been shown to successfully genotype single molecules. Such sensitivity means that these assays can be used to genotype samples collected from the environment without the need for culturing. As these assays are all sequence based, they are also compatible with whole genome sequencing and, unlike VNTR assays, are unlikely to be sensitive to differences among sequencing platforms. Analysis of whole genome sequence data will reveal alleles at all MST loci, allowing rapid placement of a genome onto the MST phylogeny. Because whole genome comparisons will reveal more pairwise polymorphisms than MST, sequenced genomes from the same MST genotype may yield SNPs that can be developed into SNP-based assays for testing against other samples within that genotype for added resolution.
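Because the signatures are sequence based, querying a whole genome sequence for a given SNP can in principle be as simple as locating the flanking sequence and reading the base between the flanks. The Python sketch below is a minimal illustration under stated assumptions and is not the in silico pipeline used in this work: the function names, the toy genome, and the flanking sequences are hypothetical, the genome is assumed to be an uppercase string of A, C, G, and T, and the flanks are assumed to match exactly and uniquely.

```python
import re
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def call_snp(genome: str, left_flank: str, right_flank: str) -> Optional[str]:
    """Return the base between the two flanks, read in the orientation of
    the flanks, or None if the site is absent or not unique on a strand."""
    pattern = re.compile(
        re.escape(left_flank) + "([ACGT])" + re.escape(right_flank)
    )
    # Search the sequence as given, then its reverse complement.
    for seq in (genome, revcomp(genome)):
        hits = pattern.findall(seq)
        if len(hits) == 1:
            return hits[0]
    return None


# Hypothetical toy sequence and flanks, for illustration only.
toy_genome = "TTGACCAGTACGGATCCA"
print(call_snp(toy_genome, "CCAGT", "CGGAT"))
```

A production implementation would also need to tolerate mismatches in the flanks and handle multiple contigs, for example by using an alignment tool rather than exact string matching.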
In summary, while whole genome sequencing of every sample is currently impractical, the method we describe here can serve as a bridge between conventional PCR-based genotyping and whole genome sequencing. We have developed 14 assays whose data can be used to place C. burnetii isolates into the phylogenetic context of six genomic groups and 35 MST genotypes, allowing for comparison of existing and new isolates. As these are sequence-based signatures, data collected using these assays will remain useful even as platforms and technologies change, and the same signatures can be queried using in silico methods as more C. burnetii isolates are whole genome sequenced.