Zoonotic infections are a growing threat to global health. Chlamydia pneumoniae is a major human pathogen that is widespread in human populations, causing acute respiratory disease, and has been associated with chronic disease. C. pneumoniae was first identified solely in human populations; however, its host range now includes other mammals, marsupials, amphibians, and reptiles. Australian koalas (Phascolarctos cinereus) are widely infected with two species of Chlamydia, C. pecorum and C. pneumoniae. Transmission of C. pneumoniae between animals and humans has not been reported; however, two other chlamydial species, C. psittaci and C. abortus, are known zoonotic pathogens. We have sequenced the 1,241,024-bp chromosome and a 7.5-kb cryptic chlamydial plasmid of the koala strain of C. pneumoniae (LPCoLN) using the whole-genome shotgun method. Comparative genomic analysis, including pseudogene and single-nucleotide polymorphism (SNP) distribution, and phylogenetic analysis of conserved genes and SNPs against the human isolates of C. pneumoniae show that the LPCoLN isolate is basal to human isolates. Thus, we propose based on compelling genomic and phylogenetic evidence that humans were originally infected zoonotically by an animal isolate(s) of C. pneumoniae which adapted to humans primarily through the processes of gene decay and plasmid loss, to the point where the animal reservoir is no longer required for transmission.
Zoonotic infections from wildlife were recently suggested to be the most significant growing threat to global health of all the emerging infectious diseases (16). Chlamydia comprises a group of obligate intracellular bacterial parasites responsible for a variety of diseases in humans and animals, including several zoonoses. In 1999, Everett et al. proposed a reassignment from the single genus Chlamydia into two genera, Chlamydia and Chlamydophila, based on apparent differential clustering of the 16S rRNA genes (10). This change has not been widely accepted by the chlamydial research community; thus, reversion to the single genus Chlamydia was recently recommended (33). Accordingly, we use the Chlamydia nomenclature here.
Chlamydia pneumoniae (previously known as TWAR) was first recognized as a distinct species in 1988 (6) and is widespread in human populations, causing acute respiratory disease with effective human-to-human transmission by aerosol (30). It has also been associated with several human chronic diseases, including asthma (35), atherosclerosis (41), stroke (9), and late-onset Alzheimer's disease (1). C. pneumoniae was initially identified solely in humans; however, its host range is now the most cosmopolitan of all the chlamydiae, encompassing both warm- and cold-blooded animals such as horses, koalas, and other marsupials and amphibians and reptiles (3). Populations of the Australian koala (Phascolarctos cinereus) are widely infected with two species of Chlamydia: C. pecorum and C. pneumoniae (3). While C. pecorum infections are present at ocular and urogenital sites, C. pneumoniae infections are commonly found in the koala respiratory tract and are linked to symptoms of respiratory disease (40), which is consistent with acute human C. pneumoniae disease. Transmission of C. pneumoniae between animals and humans has not been documented; however, two other chlamydial species, C. psittaci and C. abortus, are well known zoonotic pathogens transmitted from birds and ruminants (21) that cause psittacosis, a life-threatening pneumonia, and abortion, respectively. Here we propose on the basis of compelling genomic and phylogenetic evidence that C. pneumoniae, a major human pathogen that is essentially clonal, was originally derived from an animal source.
The genome sequences of four epidemiologically distinct human-derived C. pneumoniae isolates were previously determined (18, 27, 31). These isolates are perceived as being genetically homogeneous; this is supported by the fact that there are fewer than 300 single-nucleotide polymorphisms (SNPs) scattered around the chromosome in no discernible pattern (Fig. 1; Table 1). Such a degree of similarity between temporally and geographically disparate isolates supports a relatively recent clonal expansion of human C. pneumoniae isolates (26) but is otherwise uninformative for deciphering the evolutionary origin(s) of this pathogen. Accordingly, we sequenced the complete genome of the koala C. pneumoniae isolate LPCoLN, seeking molecular insight into host specificity, evolutionary origin, and pathogenicity.
The koala C. pneumoniae isolate LPCoLN was originally isolated from a nasal swab of a captive koala showing signs of respiratory illness. C. pneumoniae was detected by PCR and gene sequencing, and LPCoLN was grown in vitro in HEp-2 cell monolayers. No other bacterium or virus was recovered from the nasal swab.
The complete genome sequence of C. pneumoniae LPCoLN was determined using the whole-genome shotgun method (23). Physical and sequencing gaps were closed using a combination of primer walking, generation and sequencing of transposon-tagged libraries of large-insert clones, and multiplex PCR (36). Identification of putative protein-coding genes and annotation of the genome were performed as previously described (23). An initial set of coding sequences (CDSs) predicted to encode proteins was identified with GLIMMER (7). CDSs consisting of fewer than 30 codons were eliminated. Frameshift and point mutations were corrected or designated “authentic,” as previously described (23). Functional assignment, identification of membrane-spanning domains, determination of paralogous gene families, and identification of regions of unusual nucleotide composition were performed as previously described (23). Sequence alignments were generated using methods described previously (23).
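The length filter described above (discarding predicted CDSs of fewer than 30 codons) can be illustrated with a minimal Python sketch. It is not the annotation pipeline used in the study: the function names and the toy sequences are hypothetical, and counting the stop codon toward the total is an assumption, since the text does not say how codons were counted.

```python
from typing import Dict

MIN_CODONS = 30  # threshold stated in the text


def codon_count(cds: str) -> int:
    """Number of complete codons in a nucleotide CDS (assumption: stop codon included)."""
    return len(cds) // 3


def filter_short_cdss(cdss: Dict[str, str], min_codons: int = MIN_CODONS) -> Dict[str, str]:
    """Keep only CDSs with at least `min_codons` codons."""
    return {name: seq for name, seq in cdss.items() if codon_count(seq) >= min_codons}


# Hypothetical toy predictions, not real LPCoLN sequences.
predicted = {
    "cds_a": "ATG" + "GCT" * 40 + "TAA",  # 42 codons: kept
    "cds_b": "ATG" + "GCT" * 10 + "TAA",  # 12 codons: eliminated
    "cds_c": "ATG" + "GCT" * 28 + "TAA",  # exactly 30 codons: kept (not fewer than 30)
}
kept = filter_short_cdss(predicted)
print(sorted(kept))
```

A CDS of exactly 30 codons is retained here because the text eliminates only those with fewer than 30.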
C. pneumoniae LPCoLN and the genomes of four previously sequenced human-derived C. pneumoniae isolates (18, 27, 31) (GenBank accession numbers: CWL029, AE001363; TW-183, AE009440; AR39, AE002161; and J138, BA000008) were compared at the nucleotide level by suffix tree analysis using MUMmer (8) and the data parsed by custom Perl scripts. Predicted C. pneumoniae genes were compared by BLAST against the complete set of genes from other chlamydial genomes using an E-value cutoff of 10^-5. Synteny and BLAST score ratio analyses were performed as previously described (25).
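As a rough illustration of the BLAST score ratio (BSR), the sketch below divides the bit score of a query's hit in another genome by the bit score of the query against itself, after discarding hits above the E-value cutoff. The `Hit` record, the gene names, and all scores are hypothetical. Treating the cutoff as inclusive is an assumption. The 0.8 threshold is the one the text applies to shared protein clusters.

```python
from typing import Dict, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    evalue: float
    bits: float


E_VALUE_CUTOFF = 1e-5  # stated in the text
BSR_THRESHOLD = 0.8    # applied in the text to shared protein clusters


def blast_score_ratio(hit_bits: float, self_bits: float) -> float:
    """Bit score of the hit divided by the query's self-hit bit score."""
    if self_bits <= 0:
        raise ValueError("self-hit bit score must be positive")
    return hit_bits / self_bits


def conserved_queries(hits: List[Hit], self_bits: Dict[str, float]) -> Dict[str, float]:
    """Best BSR per query among hits passing the E-value cutoff, kept when >= threshold."""
    best: Dict[str, float] = {}
    for h in hits:
        if h.evalue > E_VALUE_CUTOFF:
            continue  # hit is not significant enough to consider
        bsr = blast_score_ratio(h.bits, self_bits[h.query])
        if bsr > best.get(h.query, 0.0):
            best[h.query] = bsr
    return {q: r for q, r in best.items() if r >= BSR_THRESHOLD}


# Hypothetical scores for three genes.
self_scores = {"geneA": 500.0, "geneB": 300.0, "geneC": 200.0}
hits = [
    Hit("geneA", "ref1", 1e-120, 480.0),  # BSR 0.96: conserved
    Hit("geneB", "ref2", 1e-30, 150.0),   # BSR 0.50: below threshold
    Hit("geneC", "ref3", 0.5, 190.0),     # fails the E-value cutoff
]
print(conserved_queries(hits, self_scores))
```

Normalizing by the self-hit score makes the ratio comparable across proteins of different lengths, which a raw bit score is not.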
High-quality synonymous SNPs (sSNPs) were identified by comparing the predicted genes on the closed genome of C. pneumoniae strain AR39 with the LPCoLN genome sequence using MUMmer (8). A polymorphic site was considered high quality when its underlying sequence comprised at least three sequencing reads with an average Phred quality score greater than 30 (11). sSNPs in CWL029, TW-183, and J138 were similarly identified, although no assessment of quality could be made, as quality scores are not available for these genomes. Concatenated sSNPs for the individual C. pneumoniae isolates were further analyzed under the HKY85 substitution model (13) with 200 bootstrap replicates, and the results were used to generate an unrooted phylogenetic tree according to the PhyML algorithms (12).
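The high-quality criterion stated above (at least three reads, average Phred score greater than 30) is simple enough to sketch in Python. The `SnpSite` record and the example sites are hypothetical, and averaging one score per covering read is an assumption about how the mean was taken. The Phred definition itself is standard: a score Q corresponds to an error probability of 10^(-Q/10), so Q = 30 means roughly one miscall in 1,000 bases.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnpSite:
    position: int
    read_quals: List[int]  # one Phred score per read covering the site (assumed layout)


def phred_to_error_prob(q: float) -> float:
    """Standard Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


def is_high_quality(site: SnpSite, min_reads: int = 3, min_mean_q: float = 30.0) -> bool:
    """At least `min_reads` reads and a mean Phred score strictly greater than `min_mean_q`."""
    if len(site.read_quals) < min_reads:
        return False
    return sum(site.read_quals) / len(site.read_quals) > min_mean_q


# Hypothetical polymorphic sites.
sites = [
    SnpSite(1_200, [40, 35, 38]),  # 3 reads, mean 37.7: high quality
    SnpSite(5_431, [40, 40]),      # only 2 reads: rejected
    SnpSite(9_020, [30, 30, 30]),  # mean exactly 30, not greater: rejected
    SnpSite(12_777, [20, 35, 40]), # mean 31.7: high quality
]
high_quality = [s.position for s in sites if is_high_quality(s)]
print(high_quality)
```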
One hundred eleven clusters of shared proteins, with a BLAST score ratio greater than or equal to 0.8 (25), were identified among the C. pneumoniae isolates and C. pecorum E58 (G. S. A. Myers, unpublished data), C. muridarum Nigg (27) (GenBank no., AE002160), C. caviae GPIC (28) (GenBank no., AE015925), C. psittaci 6BC (Myers, unpublished), and C. abortus S26/3 (39) (GenBank no., CR848038). Protein clusters were aligned using ClustalX (37) and back translated into nucleotide alignments using TRANSALIGN, part of the EMBOSS software package (29). Concatenated aligned genes, spanning a total of 121,674 positions with a sequence similarity of 82.2% and identity of 58.8%, were further analyzed under the HKY85 substitution model (13) with 200 bootstrap replicates, and the results were used to generate an unrooted phylogenetic tree according to the PhyML algorithms (12).
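The back-translation and concatenation steps can be sketched as follows. This is a simplified illustration of the idea, not the EMBOSS implementation: each aligned residue is replaced by its codon from the unaligned CDS, each gap becomes three gap characters, and the per-gene codon alignments are then joined per isolate. Function names and toy sequences are hypothetical.

```python
from typing import Dict, List


def back_translate(aligned_protein: str, cds: str) -> str:
    """Thread an unaligned CDS onto its gapped protein alignment, codon by codon."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]
    residues = sum(1 for aa in aligned_protein if aa != "-")
    if residues > len(codons):
        raise ValueError("CDS is too short for the aligned protein")
    out = []
    next_codon = 0
    for aa in aligned_protein:
        if aa == "-":
            out.append("---")  # a protein gap spans one codon
        else:
            out.append(codons[next_codon])
            next_codon += 1
    return "".join(out)


def concatenate(gene_alignments: List[Dict[str, str]]) -> Dict[str, str]:
    """Join per-gene alignments into one sequence per isolate, in the given gene order."""
    taxa = list(gene_alignments[0])
    for aln in gene_alignments:
        if set(aln) != set(taxa):
            raise ValueError("every gene alignment must contain the same isolates")
    return {t: "".join(aln[t] for aln in gene_alignments) for t in taxa}


# Hypothetical toy genes for two isolates.
gene1 = {
    "LPCoLN": back_translate("MA-K", "ATGGCTAAA"),
    "AR39": back_translate("MAGK", "ATGGCCGGTAAA"),
}
gene2 = {
    "LPCoLN": back_translate("ML", "ATGCTT"),
    "AR39": back_translate("ML", "ATGCTG"),
}
supermatrix = concatenate([gene1, gene2])
print(supermatrix)
```

Aligning at the protein level and then restoring codons keeps gaps in frame, so synonymous differences such as GCT versus GCC remain visible in the nucleotide alignment.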
The sequences of the C. pneumoniae LPCoLN chromosome and plasmid have been deposited in GenBank with the accession numbers CP001713 and CP001714, respectively.
C. pneumoniae LPCoLN possesses a single, circular chromosome of 1,241,024 bp, slightly larger (by approximately 10 kb) than the human-derived C. pneumoniae isolates. The small cryptic chlamydial plasmid (7,655 bp) that is absent from all characterized human C. pneumoniae isolates is present in the koala strain and is highly conserved relative to published chlamydial plasmid sequences. LPCoLN has 1,095 predicted CDSs, with 988 (90.2%) CDSs conserved, 14 (1.3%) divergent, and 93 unique relative to the human C. pneumoniae isolate AR39 (Fig. 1; Table 2). Most unique CDSs encode hypothetical proteins with no currently discernible function (see Table S1 in the supplemental material).
Comparative genomic and proteomic analyses (25) show that the LPCoLN genome is highly similar to and syntenic with the four sequenced human-derived isolates (see Fig. S1 in the supplemental material). However, unlike the small number of SNPs found between the human-derived isolates, 6,213 SNPs (3,298 synonymous and 2,915 nonsynonymous) separate the genomes of LPCoLN and human isolate AR39 (Tables 1 and 3). Phylogenetic analysis of all C. pneumoniae isolates based on SNPs (Fig. 2) and 111 highly conserved genes from across all sequenced animal chlamydial genomes (Fig. 3) indicates that LPCoLN is basal to the sequenced C. pneumoniae isolates from humans. Thus, while LPCoLN is a contemporary isolate, phylogeny places it closer to a presumptive ancestor of the C. pneumoniae isolates found in human populations.
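The scale of the contrast can be checked with a short calculation using only figures quoted in the text. The derived values (SNPs per kilobase, synonymous fraction, fold difference) are computed here for illustration and are not reported in the study. The fold difference is a lower bound because the text gives the human-isolate figure only as "fewer than 300".

```python
GENOME_BP = 1_241_024        # LPCoLN chromosome length, from the text
SYNONYMOUS = 3_298           # LPCoLN vs AR39, from the text
NONSYNONYMOUS = 2_915        # LPCoLN vs AR39, from the text
HUMAN_SNP_UPPER_BOUND = 300  # "fewer than 300" among the human-derived isolates


def snp_summary() -> dict:
    total = SYNONYMOUS + NONSYNONYMOUS
    return {
        "total": total,
        "per_kb": total / (GENOME_BP / 1000),
        "synonymous_fraction": SYNONYMOUS / total,
        "min_fold_vs_human": total / HUMAN_SNP_UPPER_BOUND,
    }


summary = snp_summary()
for key, value in summary.items():
    print(key, round(value, 3))
```

The two SNP classes sum to the stated total of 6,213, which works out to about five SNPs per kilobase and at least a 20-fold excess over the variation seen among the human-derived isolates.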
The genome-wide SNP distribution observed in the koala isolate compared to the human-derived isolates provides further evidence for a zoonotic origin of C. pneumoniae recovered from humans. There are 10 noteworthy regions of SNP accumulation (Fig. 1 and 4), representing genomic hot spots that are likely evolving at different rates in C. pneumoniae from koalas and humans. Notably, many of the human isolates' CDSs within these hot spots are truncated or fragmented relative to LPCoLN, suggesting ongoing gene decay processes, with presumed concomitant loss of function in human-derived C. pneumoniae. Several of these hot spots encode known virulence or metabolic factors that display sequence polymorphisms and are variably represented in other chlamydial strains and species, including the polymorphic membrane protein (PMP) family, secreted type III secretion effectors, and enzymes involved in the biosynthesis of chorismate, a precursor of aromatic amino acids (Fig. 1 and 4). Gene truncation and fragmentation are also evident at several of these loci within the human-derived isolates, suggesting that microevolutionary processes are also ongoing in human C. pneumoniae. Of the human isolates, CWL029 consistently exhibits a higher degree of gene truncation and fragmentation in several hot spots; AR39 shows the least, with TW-183 and J138 being intermediate.
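One common way to locate regions of SNP accumulation is to count SNPs in fixed windows along the chromosome and flag windows well above the genome-wide mean. The sketch below illustrates that general approach only: the text does not state how the 10 hot spots were delineated, so the window size, the threefold threshold, and the toy coordinates are all assumptions.

```python
from typing import List, Tuple

Window = Tuple[int, int, int]  # (start, end, SNP count), 1-based inclusive coordinates


def windowed_snp_counts(positions: List[int], genome_length: int, window: int) -> List[Window]:
    """Count SNPs in consecutive non-overlapping windows."""
    n_windows = -(-genome_length // window)  # ceiling division
    counts = [0] * n_windows
    for p in positions:
        if not 1 <= p <= genome_length:
            raise ValueError(f"position {p} lies outside the genome")
        counts[(p - 1) // window] += 1
    return [
        (i * window + 1, min((i + 1) * window, genome_length), c)
        for i, c in enumerate(counts)
    ]


def hot_spots(windows: List[Window], fold: float = 3.0) -> List[Window]:
    """Windows whose SNP count exceeds `fold` times the mean count per window."""
    mean = sum(c for _, _, c in windows) / len(windows)
    return [w for w in windows if w[2] > fold * mean]


# Hypothetical SNP coordinates on a toy 10-kb genome.
snp_positions = [150, 2300, 4010, 4100, 4200, 4350, 4500, 4700, 4900, 7600, 9800]
windows = windowed_snp_counts(snp_positions, genome_length=10_000, window=1_000)
print(hot_spots(windows))
```

On a circular chromosome the choice of origin shifts the window boundaries, so overlapping or sliding windows may be preferable for real data.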
The largest SNP hot spot corresponds to the plasticity zone (PZ), a region that encapsulates much of the sequence diversity in all chlamydial genomes (28). The PZ, which has been shown to contain host- and/or tissue-specific genes in other chlamydial species (27, 28), appears to be fully intact in the koala isolate but is highly fragmented in all sequenced human-derived C. pneumoniae isolates. While many small CDSs appear to be unique to human-derived C. pneumoniae, comparison to LPCoLN reveals that several of these are actual remnants of four larger genes (Fig. 1) that are part of a previously unknown 11-member gene family encoding predicted membrane-bound proteins with predicted membrane-spanning domains.
There are only three genes of known function that are present in the human isolates and absent from LPCoLN: guaBA and add, required for the synthesis of GMP, a precursor for the synthesis of guanine nucleoside triphosphates, located in the PZ of the human-derived C. pneumoniae isolates. However, in three of the four human isolates, guaB is fragmented, indicating that it is presumably not essential for human infection. Such gene fragmentation patterns are observed only in the genomes of human isolates and are discernible only by comparison to the koala LPCoLN genome. This unidirectional pattern of gene fragmentation seen throughout the human-derived C. pneumoniae genomes not only supports the phylogenetic analyses (Fig. 2 and 3), suggesting that animal-derived C. pneumoniae predates human-derived C. pneumoniae, but also suggests that animals were the original hosts for C. pneumoniae.
Prior to this study, C. pneumoniae genome sequences were available for only four isolates, all of human origin. These genomes showed surprisingly high similarity, with an approximate total of only 300 SNPs between them. Such a high degree of genomic conservation has been hypothesized to be evidence that C. pneumoniae was recently transmitted to humans followed by a rapid spread throughout human populations, giving little opportunity for genomic changes. This level of homology within C. pneumoniae is in contrast to the degree of genetic variability seen in the other chlamydial species, in particular C. trachomatis, which is thought to have infected humans throughout human evolution (4, 19, 32, 38). The host range of C. pneumoniae has been expanded significantly in the last 10 years, with infections reported in horses (34), reptiles (3), amphibians (2, 3, 14), and several Australian marsupials, including koalas (40) and bandicoots (20). Previous DNA sequence comparisons have focused on 16S rRNA and ompA genes. While these analyses have revealed differences between strains of human and animal origins, these differences have been minimal and relatively uninformative with regard to determinants of host specificity. Our whole-genome analysis of the koala LPCoLN isolate of C. pneumoniae has provided insight into the genetic differences between animal- and human-derived C. pneumoniae and the putative evolutionary events that have governed the spread of this organism and shows that human isolates of C. pneumoniae exhibit more heterogeneity than previously thought.
The chlamydial cryptic plasmid is present in some chlamydial species, including C. pneumoniae N16, isolated from horses (24), but is absent from others. The role of the plasmid in chlamydial biology is still largely unknown. All human-derived C. pneumoniae isolates studied to date lack the plasmid; however, the koala isolate carries a full-length chlamydial plasmid containing all eight CDSs found in other chlamydial cryptic plasmids. Mitchell et al. (22) reported a much higher growth rate in vitro for the koala LPCoLN isolate than for the human isolate AR39; one or more of the genes present on the cryptic plasmid may account for this higher growth rate.
The most compelling evidence to support the idea that LPCoLN is either ancestral or closely related to an ancestral form of C. pneumoniae human isolates is the presence of several putatively full-length CDSs in LPCoLN, which are fragmented in human-derived C. pneumoniae, forming clusters of pseudogenes (Fig. 1 and 4). The membrane attack complex (MAC)/perforin gene, which has been associated with virulence in other intracellular pathogens, including Toxoplasma (17) and Plasmodium (15), is a 2,457-bp CDS in LPCoLN but is partially truncated in all four human-derived isolates due to an 840-bp deletion toward the 5′ end. Although the function of chlamydial MAC/perforin is currently unknown, we predict that it may be involved in host cell egression and invasion similar to Toxoplasma. The truncated version seen in the human isolates may then reflect adaptation to a specific niche within humans.
All pmpE and pmpG orthologs are intact in the koala strain but are fragmented in several of the human isolates (Fig. 1). The PMP family of proteins is considered to represent the expansion of progenitor proteins proposed to be involved in key roles such as adherence, immune evasion, and proinflammatory responses. In addition, orthologs of the Inc family of proteins are also extensively fragmented in the human isolates (Fig. 4). Inc proteins are a diverse family of chlamydial type III secreted effectors; IncA has been localized to the outer face of the inclusion membrane and is involved in the homotypic fusion of multiple inclusions of C. trachomatis. The apparent ongoing loss of several functional pmp and inc genes in the human-derived isolates of C. pneumoniae again suggests an adaptation to the human host. It is conceivable that different PMP and Inc profiles confer differential niche specificity in different hosts. The fragmentation of functional pmp and inc alleles in human-derived C. pneumoniae may therefore represent an example of convergent evolution of the two species in response to properties that are specific to humans (e.g., a more effective immune response against PmpE antigens in humans or the relative unavailability of cell surface receptors to PmpE in humans versus koalas).
Our analysis of the koala C. pneumoniae LPCoLN genome sequence, combined with phylogenetic analyses of all C. pneumoniae SNPs and the conserved chlamydial CDSs and the patterns of CDS fragmentation and plasmid loss in human-derived C. pneumoniae isolates, provides strong evidence that human isolates of C. pneumoniae have derived from zoonotic C. pneumoniae, supporting the conclusion of the selected SNP analysis of Rattei et al. (26). Thus, we propose that C. pneumoniae was originally an animal pathogen that crossed the species barrier to humans through ongoing reductive evolutionary processes and has adapted to the point where human isolates of C. pneumoniae no longer require an animal reservoir for transmission.
A limitation of our study is that it is based on the genome sequences of only one animal-derived C. pneumoniae isolate, the koala LPCoLN strain, and of four human-derived C. pneumoniae isolates. Hence, key questions such as how many times this host species jump occurred before terminal adaptation and which specific animal host the human-derived C. pneumoniae actually originated from cannot be addressed with this set alone. In addition to the full genome sequence from the LPCoLN isolate of koala C. pneumoniae, we also obtained and analyzed a second koala C. pneumoniae isolate, EBB. This isolate was obtained from a pharyngeal swab from a koala in a wild population from a location geographically separate from that of LPCoLN. Nine genes were sequenced from the EBB isolate, and in all cases the sequences were 100% identical to the sequences obtained from LPCoLN (data not shown).
The koala C. pneumoniae LPCoLN isolate has been relatively well characterized with regard to morphological and in vitro growth characteristics. Coles et al. (5) reported that LPCoLN produced large inclusions in both human and koala monocytes and in HEp-2 cells. Koala C. pneumoniae was able to induce foam cell formation both with and without added low-density lipoprotein, in contrast to TW-183, which produced increased foam cell formation only in the presence of low-density lipoprotein. More recently, Mitchell et al. (22) compared the in vitro growth characteristics of LPCoLN with those of the human isolate AR39. LPCoLN displayed inclusions of size and morphology clearly distinct from those of the human isolate and had a much shorter doubling time (3.4 to 4.9 h versus 5.9 to 8.7 h) when grown in HEp-2 cell monolayers. Rates of inclusion fusion were also much higher with LPCoLN (100%) than with AR39 (30 to 40%). These biological differences between koala- and human-derived C. pneumoniae are consistent with the range of genomic differences that we identified in this work. Such phenotypic studies demonstrate the compensatory power of comparative pathogenomics in a genetically intractable organism such as C. pneumoniae. Moreover, the ability to compare genome sequences of organisms infecting different hosts provides snapshots of the evolutionary process as if frozen in time. The search is now on to find C. pneumoniae isolates from animals that are most closely related to human-derived isolates, as this will better indicate when this host species jump may have occurred.
Our findings indicate that the high prevalence and disease burden of C. pneumoniae in humans may represent a major evolutionary and public health corollary of zoonotic infections—the emergence of a full-fledged human pathogen, transmitted without the original animal vector, causing substantial acute and chronic disease sequelae.
This work was supported by National Institute of Allergy and Infectious Diseases grant 1R01AI051472.
We thank former TIGR and current IGS faculty, the TIGR/IGS Informatics group for expert advice and assistance, and the JCVI Sequencing Facility.
Published ahead of print on 11 September 2009.
†Supplemental material for this article may be found at http://jb.asm.org/.