Pathogenic uncultivable treponemes, similar to syphilis-causing Treponema pallidum subspecies pallidum, include T. pallidum ssp. pertenue, T. pallidum ssp. endemicum and Treponema carateum, which cause yaws, bejel and pinta, respectively. Genetic analyses of these pathogens revealed striking similarity among these bacteria and also a high degree of similarity to the rabbit pathogen, T. paraluiscuniculi, a treponeme not infectious to humans. Genome comparisons between pallidum and non-pallidum treponemes revealed genes with potential involvement in human infectivity, whereas comparisons between pallidum and pertenue treponemes identified genes possibly involved in the high invasivity of syphilis treponemes. Genetic variability within syphilis strains is considered as the basis of syphilis molecular epidemiology with potential to detect more virulent strains, whereas genetic variability within a single strain is related to its ability to elude the immune system of the host. Genome analyses also shed light on treponemal evolution and on chromosomal targets for molecular diagnostics of treponemal infections.
The genus Treponema (Krieg et al., 2010) comprises several human pathogens that cause chronic infections, including several species of oral treponemes (e.g. T. denticola, T. parvum, and T. putidum; Visser and Ellen, 2011). Another member of the genus, Treponema pallidum subspecies pallidum (TPA), is the causative agent of the sexually transmitted disease syphilis, a chronic, multi-stage human infection characterized by variable clinical symptoms. In contrast to treponemes isolated from the mouth (subgingival plaque), not a single strain of T. pallidum has been steadily propagated under in vitro conditions. Moreover, genome sequencing has revealed that the genomes of oral treponemes are considerably larger (e.g. 2.84 Mb for T. denticola; MacDougall and Girons, 1995; Seshadri et al., 2004) than those of T. pallidum strains (1.14 Mb; Fraser et al., 1998). In this review, we will focus on the genetic diversity of obligatory pathogenic Treponema pallidum strains and on related uncultivable treponemes.
Pathogenic treponemes similar to syphilis-causing TPA include Treponema pallidum subspecies pertenue (TPE), the causative agent of yaws, Treponema pallidum subspecies endemicum (TEN), the causative agent of endemic syphilis, and Treponema carateum, the causative agent of pinta. Besides these human pathogens, a rabbit pathogen, Treponema paraluiscuniculi, has been shown to be very similar to the syphilis treponeme (Strouhal et al., 2007). Pathogenic treponemes of this group differ in two principal etiopathogenic parameters, host range and degree of invasivity, with TPA being the most invasive, TPE being moderately invasive and T. carateum being non-invasive. Treponema paraluiscuniculi is not pathogenic to humans, whereas TPA and TPE strains can infect both humans and rabbits (isolates of T. carateum have not yet been described).
Sequencing of the whole TPA Nichols genome in 1998 (Fraser et al., 1998) started an era of whole genome analyses of pathogenic treponemes. Results to date have increased our understanding of pathogenic treponemes and provided possibilities for molecular diagnostics of syphilis and yaws. Moreover, whole genome sequences of uncultivable pathogens have been used in a number of subsequent studies (Brinkman et al., 2006, 2008; McGill et al., 2010; McKevitt et al., 2003, 2005; Mikalová et al., 2010; Šmajs et al., 2002, 2005; Strouhal et al., 2007; Titz et al., 2008; Weinstock et al., 2000a).
In many pathogenic bacteria, including both Gram-negative and Gram-positive bacteria, pathogenicity islands are important bacterial virulence determinants encoding a number of virulence factors (Schmidt and Hensel, 2004). Genes present in the genomic islands can be transferred from bacterium to bacterium via horizontal gene flux and the foreign DNA can also be incorporated into the host DNA. In addition, other mobile genetic elements, including conjugative and other plasmids, bacteriophages, transposons, and insertion elements can encode virulence factors. In contrast to this strategy, several pathogenic bacteria appear not to contain pathogenicity islands, including Mycobacterium spp., Chlamydia spp., most streptococcal species, and spirochetes. The common theme of pathogenic bacteria lacking pathogenicity islands is adaptation to a specific host environment, usually accompanied with genome size reduction and loss of the ability to replicate outside a host. Adaptation to a single or only a few hosts is usually accompanied by reduced horizontal gene transfer. Moreover, several bacterial pathogens have a highly clonal population structure (monomorphic bacterial pathogens), including Bacillus anthracis (Van Ert et al., 2007), Burkholderia mallei (Godoy et al., 2003), Chlamydophila pneumoniae (Rattei et al., 2007), Mycobacterium leprae (Monot et al., 2005), Mycobacterium tuberculosis complex (DosVultos et al., 2008), Salmonella enterica serovar Typhi (Kidgell et al., 2002; Roumagnac et al., 2006) and Yersinia pestis (Achtman et al., 1999, 2004). Many of these monomorphic bacteria have unknown immediate ancestors (Achtman, 2008). One of the potential explanations for such genetic monomorphism is the fact that humans are often secondary hosts for monomorphic pathogenic bacteria and therefore represent a rather recent branch off the main evolutionary course of these bacteria (Achtman, 2008). 
The genome size of pathogenic treponemes is much reduced, being approximately one fourth to one fifth the size of the E. coli genome. Pathogenic uncultivable treponemes can be classified as genetically monomorphic bacteria (Achtman, 2008) in the process of adaptation to a narrow host environment without known mechanisms of horizontal gene transfer and no obvious ancestors. Relatively subtle genetic changes are expected to determine the differences in treponemal host specificity and pathogenicity (see Table 1).
Pathogenic uncultivable treponemes were originally considered as separate species based on disease symptomatology and epidemiology. Moreover, these bacteria clearly differed from nonpathogenic T. phagedenis and T. refringens treponemes (Miao and Fieldsteel, 1978). Later, it became clear that TPA and TPE strains were almost identical in DNA hybridization experiments (Miao and Fieldsteel, 1980) and this fact led to reclassification of both organisms into a single species (Smibert, 1984). Recent data from the T. paraluiscuniculi genome analysis (Šmajs et al., 2011) revealed that the genetic diversity between this strain and TPA strains, on a genome wide scale, was less than 2% and thus T. paraluiscuniculi represents a T. pallidum subspecies, rather than a new species. This fact further supports the genetic compactness of uncultivable treponemal pathogens and indicates that small genetic changes can result in profound changes in pathogenesis and host range. Therefore, every nucleotide change should be considered to have the potential for changing bacterial virulence.
Spirochetes that cause rabbit genital lesions were first denoted as Spirochaeta paralues-cuniculi (syphilis-like spirochete; Jacobsthal, 1920). Since that time, several papers have described rabbit infections with this organism, occurring both in the wild and among laboratory animals; the pathogen has since been renamed T. paraluiscuniculi (DiGiacomo et al., 1983, 1984, 1985; Smith and Pesetsky, 1967). Rabbit venereal spirochetosis can be sexually transmitted and leads to mucocutaneous lesions of the genitoanal region, which are characterized by erythema, edema and/or crusting ulcers. In fact, T. paraluiscuniculi appears to be well adapted to rabbits, which develop active infections resembling syphilis in humans. However, unlike syphilis, no vertical transfer of T. paraluiscuniculi in rabbits has been documented. T. paraluiscuniculi is not infectious to humans; this non-infectivity was experimentally demonstrated via intradermal inoculation of three volunteers (Graves and Downes, 1981; Levaditi et al., 1921). In contrast to the situation in humans, intradermal inoculation of rabbits with T. paraluiscuniculi leads to chronic lesions with a serological response to T. paraluiscuniculi.
Previous studies indicated that the 5′ and 3′ flanking regions of the 15-kDa lipoprotein gene (tpp15) distinguished the human pathogens from T. paraluiscuniculi strains (Centurion-Lara et al., 1998), thus providing the first evidence of genetic differences between these pathogens. Other studies (Giacani et al., 2004; Gray et al., 2006) identified heterogeneity in the paralogous tpr genes. Whole genome fingerprinting and microarray DNA analysis of the T. paraluiscuniculi Cuniculi A strain revealed indels and prominent sequence changes in 38 gene homologs and six intergenic regions of the Cuniculi A genome compared to the genome of T. pallidum subsp. pallidum Nichols (Strouhal et al., 2007). Interestingly, most of the observed differences were identified in tpr genes or in their vicinity. In addition to tpr genes, sequence analysis of heterologous chromosomal regions identified 14 additional genes with frameshift mutations (Strouhal et al., 2007). Harper et al. (2008a) identified over 50 nucleotide changes between TPA strains and T. paraluiscuniculi; additionally, differences were also observed in the arp gene (Harper et al., 2008b). The genome sequence of Treponema paraluiscuniculi, strain Cuniculi A, was recently determined (Šmajs et al., 2011) using a combination of several high-throughput sequencing methods. An unrooted tree constructed from whole genome sequence alignments is shown in Fig. 1A. The T. paraluiscuniculi genome was found to contain a high number of pseudogenes and gene fragments (51) and the genome size was reduced compared to the TPA and TPE genomes. The Cuniculi A genome (1,133,390 bp) is 4.6, 5.9, and 6.1 kb smaller than the genomes of TPA Nichols (1,138,006 bp; Fraser et al., 1998), TPA Chicago (1,139,281 bp; Giacani et al., 2010), and TPA SS14 (1,139,457 bp; Matějková et al., 2008), respectively.
Since at least part of the Nichols population is known to contain a 1.3 kb tprK-like insertion, localized in the intergenic region between the TP0126 and TP0127 genes (Šmajs et al., 2002), and since the arp gene (TP0433-TP0434) contains 14 repetitive sequence motifs, each 60 bp in length (Pillay et al., 1998), instead of the 7 repetitions reported by Fraser et al. (1998), the Cuniculi A genome is, in fact, 6.2 kb smaller than the Nichols genome. Whole genome nucleotide diversity (estimate ± SD) between the Cuniculi A genome and the TPA Nichols, Chicago, and SS14 genomes ranged from 0.01016 ± 0.00508 to 0.01028 ± 0.00514, indicating an extremely close relatedness between the rabbit pathogen, which is not pathogenic to humans, and the syphilis strains.
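The size differences quoted above for the published genome lengths can be rechecked with a few lines of Python. This is only a minimal sketch: the dictionary and function names are illustrative, 1 kb is taken as 1,000 bp, and the calculation uses the published lengths only, without the adjustment for the tprK-like insertion and the arp repeats that leads to the 6.2 kb figure.

```python
# Published genome lengths in bp, as quoted in the text.
GENOME_BP = {
    "Cuniculi A": 1_133_390,
    "TPA Nichols": 1_138_006,
    "TPA Chicago": 1_139_281,
    "TPA SS14": 1_139_457,
}


def size_difference_kb(reference, other, lengths=GENOME_BP):
    """Return how much smaller `reference` is than `other`, in kb (1 kb = 1000 bp)."""
    return (lengths[other] - lengths[reference]) / 1000


# Differences between Cuniculi A and each TPA genome, rounded to one decimal.
differences = {
    name: round(size_difference_kb("Cuniculi A", name), 1)
    for name in GENOME_BP
    if name != "Cuniculi A"
}

for name, diff in differences.items():
    print(f"Cuniculi A is {diff} kb smaller than {name}")
```

The rounded values reproduce the 4.6, 5.9 and 6.1 kb differences given for the Nichols, Chicago and SS14 genomes.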
Sequencing of the T. paraluiscuniculi genome (Šmajs et al., 2011) allowed identification of groups of treponemal genes that are important in TPA strains for infection of human hosts. Two features stand out. First, 134 (13.2%) of Cuniculi A genes encoded proteins identical to Nichols proteins, indicating negative selection of the corresponding genes and conservation of the proteins. These proteins were mostly involved in translation and general metabolism; however, for 35 of them, no function was predicted, suggesting as yet unknown essential functions in treponemal metabolism. Second, 84 Cuniculi A genes (32 with predicted function) were found to contain frameshifts, major deletions or other major sequence changes (defined as changes causing continuous amino acid replacements comprising 10 or more residues, or 20 or more dispersed amino acid replacements). When the number of affected genes in different functional groups was compared to the number of affected genes in the general metabolism group, three groups of genes, namely (i) virulence factors, (ii) genes with an unknown function and (iii) genes involved in DNA metabolism, contained significantly more frameshifts and major sequence changes. Since the median transcript level of affected genes with unknown function during experimental TPA Nichols rabbit infection (Šmajs et al., 2005) was considerably higher than the median transcript level of all genes of unknown function (1.46 versus 0.86), these genes represent promising candidates for important virulence factors of T. pallidum. Moreover, the changes in genes linked to DNA metabolism suggest a possible role for these genes in the acceleration of T. paraluiscuniculi evolution.
Treponema pallidum ssp. pertenue (TPE) is the causative agent of yaws, a tropical disease with an estimated prevalence of about 2 million cases worldwide (World Health Organization, 1998). TPE was discovered in 1905 (Castellani, 1905) and unlike syphilis, yaws is not vertically transmitted and is mostly characterized by skin, joint, soft tissue and bone affections. In general, yaws treponemes are considered less virulent compared to syphilis treponemes and this fact is likely encoded in the TPE genome.
DNA hybridizations performed with TPA and TPE strains, in 1980, revealed that TPE strain Gauthier and TPA strain Nichols were identical within the limits of the resolution of the technique (i.e. about 2% of the genome differences; Miao and Fieldsteel, 1980). This observation resulted in reclassification of both organisms into a single species (Smibert, 1984).
The first papers describing genetic differences between syphilis and yaws treponemes appeared about 20 years ago (Noordhoek et al., 1989, 1990). However, the observed difference in gene TP1038 (tpf-1 gene in TPA, tyf-1 gene in TPE) at position 122, between TPA Nichols and TPE CDC 2575, was not specific for TPE strains. Walker et al. (1995) cited unpublished data indicating a difference in the 16S rRNA gene between a single TPA and a single TPE strain. Later, Centurion-Lara et al. (1998) and Stamm et al. (1998) described genetic changes differentiating TPA and TPE strains in the 5′ and 3′ flanking regions of tpp15 and in the tprJ gene, respectively. Differences between TPA and TPE strains were also found in the gpd gene (TP0257) at position 579 (Cameron et al., 1999), in tp92 (Cameron et al., 2000), and in the tprI and tprC genes (Centurion-Lara et al., 2006).
Harper et al. (2008a) found over a dozen nucleotide changes differentiating TPA and TPE strains and also TPA and TEN strains. Moreover, genetic analysis of the arp gene revealed repeat motifs differentiating venereal and non-venereal strains (Harper et al., 2008b).
Mikalová et al. (2010) compared the genomes of 4 TPA strains (Nichols, SS14, DAL-1 and Mexico A) to those of 3 TPE strains (Samoa D, CDC-2 and Gauthier) and the Fribourg-Blanc isolate (Fribourg-Blanc and Mollaret, 1969) using the whole genome fingerprinting technique (WGF; Strouhal et al., 2007; Weinstock et al., 2000a). Restriction target site analysis, comprising detection of 1773 individual restriction sites, found a similar structure in all investigated genomes. The unclassified simian Fribourg-Blanc isolate was found to cluster with TPE strains but not with TPA strains. Most of the identified genetic differences between TPA and TPE strains were localized in six chromosomal regions, mostly around tpr genes. Based on WGF data, the genome sequence identity between TPA and TPE strains was estimated to be 99.63% or higher.
Complete genome sequences of three TPE strains (Samoa D, CDC-2, and Gauthier) were determined using next-generation sequencing techniques (Čejková et al., 2011). The genome lengths ranged between 1,139,330 bp and 1,139,744 bp and a similar genome structure was found among TPE strains as well as between TPA and TPE strains. The overall identity between TPA and TPE genomes was found to be over 99.64% (Table 1). Compared to four TPA strains (Nichols, DAL-1, SS14, and Chicago), changes consistently present in all three TPE strains were considered potentially responsible for the increased virulence of syphilis strains. Sequencing of 3 pertenue genomes (Čejková et al., 2011) revealed 692 out of 983 TPE protein-coding genes (70.4%) to encode identical proteins or identical proteins with strain specific changes, 194 (19.7%) genes to encode proteins with 1 amino acid substitution, and 97 (9.9%) TPE genes to encode proteins containing two or more amino acid replacements and/or other major sequence changes. Similar to results found in the Cuniculi A genome, increased numbers of affected genes were found among hypothetical genes and genes encoding predicted virulence factors.
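The percentages reported for the three categories of TPE protein-coding genes follow directly from the gene counts. The short Python sketch below recomputes them; the category labels are shortened paraphrases chosen for illustration, and only the counts come from the text.

```python
# Counts of TPE protein-coding genes per category, as quoted in the text.
gene_counts = {
    "identical or strain-specific changes only": 692,
    "one amino acid substitution": 194,
    "two or more replacements or major changes": 97,
}

# Total number of protein-coding genes covered by the three categories.
total_genes = sum(gene_counts.values())

# Share of each category, in percent, rounded to one decimal place.
percentages = {
    label: round(100 * count / total_genes, 1)
    for label, count in gene_counts.items()
}

print(f"total genes: {total_genes}")
for label, pct in percentages.items():
    print(f"{label}: {pct}%")
```

The three categories add up to the 983 genes stated in the text, and the rounded shares match the quoted 70.4%, 19.7% and 9.9%.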
Since the divergent proteins comprised mostly those predicted as virulence factors, hypothetical genes with major sequence changes consistently present in all TPE and TPA strains, are candidates for important virulence factors of syphilitic treponemes. This prediction is supported by the relatively high transcription rate of the genes in TPA Nichols during experimental rabbit infections (Šmajs et al., 2005). Interestingly, several of these genes were also found to be altered in T. paraluiscuniculi Cuniculi A, suggesting their role in syphilis pathogenesis (Šmajs et al., 2011).
Genetic differences between TPA strains and the Fribourg-Blanc isolate have been examined in several studies (Gray et al., 2006; Harper et al., 2008a, 2008b) and also in the whole genome fingerprinting study by Mikalová et al. (2010). The Fribourg-Blanc treponeme was isolated in 1962 from a baboon in Guinea, Africa. Although infected baboons showed no clinical signs of infection, the treponemes isolated from these animals were able to infect hamsters (Fribourg-Blanc and Mollaret, 1969). Moreover, experimental inoculation showed that humans could also be infected (Smith, 1971; Smith et al., 1971). A close relationship between the Fribourg-Blanc treponemes and TPE strains was found by Gray et al. (2006), based on the phylogeny of the tprC and tprI genes. The WGF study (Mikalová et al., 2010) grouped TPA strains into a separate cluster compared to TPE strains. The Fribourg-Blanc isolate clustered with TPE strains, although more distantly than the TPE strains clustered with one another (Mikalová et al., 2010). Interestingly, the Fribourg-Blanc treponemes were shown to contain the largest genome (1140.4 kb) of all uncultivable treponemes characterized on the genome level. However, because this size is explained by an identified duplication of DNA in the intergenic region between TP0696 and TP0697, it is unlikely that the Fribourg-Blanc isolate harbors any unique DNA region that is missing in the TPA and TPE strains (Mikalová et al., 2010).
Mapping of the genetic diversity among TPA strains is an important step in molecular typing and in epidemiology of syphilitic strains. Genetic differences among TPA type strains were first identified in the tprD gene (Centurion-Lara et al., 2000a) and in the TP0126-TP0127 intergenic region (Marra et al., 2006). In addition to genetic changes among TPA strains, phenotypic differences comprising neuroinvasivity were observed among TPA strains (Tantalo et al., 2005). The WGF analysis revealed several differences among the Nichols, DAL-1, SS14, and Mexico A strains (Mikalová et al., 2010). The estimated sequence identity among TPA strains based on WGF data was 99.92% or higher.
Sequencing of the complete genome of TPA SS14 (Matějková et al., 2008) and the Chicago strain (Giacani et al., 2010) revealed a full list of nucleotide changes among these genomes. When compared to the Nichols strain, sequencing of the SS14 strain revealed 327 single nucleotide changes, 14 deletions and 18 insertions. However, this set excludes changes found in the highly variable tprK gene (Matějková et al., 2008). In general, the observed genetic diversity among TPA strains is similar to the genetic diversity seen in other monomorphic bacterial species, including B. anthracis (Pearson et al., 2004), S. enterica serovar Typhi (Roumagnac et al., 2006), and Y. pestis (Achtman et al., 2004).
Nucleotide diversities computed between pairs of whole genome TPA sequences are shown in Table 2. The diversity between the Nichols and Chicago strains was estimated at more than 150 nt changes and one larger insertion (Giacani et al., 2010). Since this insertion was already identified in a part of the Nichols population (Šmajs et al., 2002; Matějková et al., 2008; Mikalová et al., 2010), it does not represent a real difference between the two genomes. Moreover, the real number of nucleotide changes differentiating the Nichols and Chicago genomes is likely to be considerably lower (dozens of nucleotide changes), because some of the apparent differences result from sequencing errors present in the Nichols genome sequence. The diversity between the SS14 and Mexico A genomes (~80 nucleotide changes; Pětrošová, unpublished results) is also small, while the diversity between the Nichols and SS14 groups (several hundred nucleotide changes) is moderately larger. This clustering was also observed in comparisons of 8.3 – 8.4 kb concatenated nucleotide sequences comprising variable chromosomal regions (see Fig. 1B).
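For two aligned sequences, a pairwise nucleotide diversity can be expressed as the proportion of compared sites that differ. The Python sketch below illustrates that idea under stated assumptions: the function name is hypothetical, alignment columns containing a gap are skipped (one of several possible conventions), and the two toy sequences are invented for illustration. The text does not specify how the values in Table 2 were computed, so this is not a reconstruction of that procedure.

```python
def pairwise_diversity(seq_a, seq_b, gap="-"):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap are not counted (an assumption).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip alignment columns containing a gap
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Toy alignment, invented for illustration only.
toy_a = "ATGCATGCAT"
toy_b = "ATGAAT-CAT"
toy_diversity = pairwise_diversity(toy_a, toy_b)
print(f"{toy_diversity:.4f}")
```

In the toy alignment, nine columns are compared and one differs, so the function returns 1/9.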
Several studies have mapped the genetic diversity among TPA DNA isolated from clinical material of syphilis patients. The first typing system for detection of genetic diversity (Pillay et al., 1998) revealed 16 subtypes among 46 typeable DNA samples, isolated from laboratory strains and clinical specimens from the USA, Madagascar, and South Africa, based on the detected number of repetitions in the arp gene and RFLP polymorphisms in amplified tpr genes. Similar analysis of 45 typeable samples from Arizona, USA, yielded 10 genotypes (Sutton et al., 2001), and both studies together revealed 22 different TPA subtypes. Pillay et al. (2002) described 35 TPA subtypes among 161 typeable specimens from South Africa, yielding a total of 44 subtypes identified worldwide. Pope et al. (2005) mapped molecular subtypes among 23 typeable specimens isolated in North and South Carolina and identified 7 subtypes. Molepo et al. (2007) identified 4 subtypes in 13 typeable samples from patients with neurosyphilis. Florindo et al. (2008) identified 3 subtypes among 42 typeable specimens from Lisbon, Portugal. Cole et al. (2009) identified 6 subtypes among 58 typeable specimens from Scotland. Martin et al. (2009a) typed 36 samples from Shanghai, China, and identified 4 subtypes, and Martin et al. (2010) typed 43 samples in western Canada and identified 4 subtypes. Altogether, more than 50 individual subtypes have been identified among 467 typeable treponemal samples. Katz et al. (2010) extended the original CDC typing system (Pillay et al., 1998) with analysis of TP0279 (rpsA gene), in which the number of repeats in the G homopolymer was used for further typing. Among 69 specimens isolated in San Francisco, 8 individual subtypes were identified.
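The sample total given above can be rechecked by tabulating the nine studies listed before it. The Python sketch below does so; the tuple layout and variable names are illustrative, and the numbers are those quoted in the text. Subtype counts are not summed, because the same subtype can appear in more than one study (for example, 16 and 10 subtypes in the first two studies correspond to 22 distinct subtypes, not 26).

```python
# (study, typeable samples, subtypes found) as quoted in the text.
typing_studies = [
    ("Pillay et al., 1998", 46, 16),
    ("Sutton et al., 2001", 45, 10),
    ("Pillay et al., 2002", 161, 35),
    ("Pope et al., 2005", 23, 7),
    ("Molepo et al., 2007", 13, 4),
    ("Florindo et al., 2008", 42, 3),
    ("Cole et al., 2009", 58, 6),
    ("Martin et al., 2009a", 36, 4),
    ("Martin et al., 2010", 43, 4),
]

# Total number of typeable samples across the nine studies.
total_samples = sum(samples for _, samples, _ in typing_studies)
print(f"typeable samples: {total_samples}")

# Samples per subtype in each study: a crude view of how finely each
# sample set was resolved, not a formal discriminatory index.
for study, samples, subtypes in typing_studies:
    print(f"{study}: {samples / subtypes:.1f} samples per subtype")
```

The nine studies add up to the 467 typeable samples stated in the text; the 69 San Francisco specimens of Katz et al. (2010) are not part of that total.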
Moreover, several studies mapped mutations causing macrolide resistance among TPA clinical samples (Katz et al., 2010; Lukehart et al., 2004; Martin et al., 2009a, 2009b; Matějková et al., 2009; Mitchell et al., 2006; Tipple et al., 2011; Van Damme et al., 2009), and an alarming increase in the number of patients infected with TPA strains harboring the A2058G mutation has been detected in the USA (Katz et al., 2010; Mitchell et al., 2006). Another mutation leading to macrolide resistance (A2059G) was described recently (Matějková et al., 2009), and similar frequencies of the A2058G and A2059G mutations were found in clinical samples isolated from syphilis patients in the Czech Republic (Woznicová et al., 2010; Flasarová et al., 2011). Both A2058G and A2059G mutations were also found in clinical samples isolated in the USA (Pillay et al., 2011).
Recently, 15 TPA isolates and clinical specimens isolated from 158 syphilis patients in the USA, China, Ireland, and Madagascar were typed with an improved version of the CDC typing system (Marra et al., 2010), which added TP0548 gene sequencing analysis (Flasarová et al., 2006; Woznicová et al., 2007) to the originally described typing system. Compared to the original CDC typing system (Pillay et al., 1998), which identified 14 subtypes, 25 subtypes were identified with the improved version (Marra et al., 2010). As shown with molecular typing studies, genetic diversity within TPA strains is geographically specific, suggesting human population-specific sets of TPA strains.
To date, several examples of treponemal intrastrain heterogeneity have been published, indicating that genetically distinct subpopulations of individual TPA strains exist during infection of human or animal hosts. Stamm and Bergen (2000) identified intrastrain genetic heterogeneity in the tprK gene of the Nichols and SS14 TPA strains. In addition, variability was also found in the tprJ gene of the SS14 strain (Stamm and Bergen, 2000). Centurion-Lara et al. (2000a) found intrastrain heterogeneity in the tprK genes of an additional three TPA strains (Sea 81-4, Bal 7, Bal 73-1). tprK sequences found in treponemes isolated from syphilis patients with primary chancres were diverse, but clustered within a sample and not among samples (LaFond et al., 2003). In addition, sequencing of tprK from DNA samples (corresponding to the V3-V5 regions of TprK) isolated from patients with syphilis revealed 48 samples (out of 279) with multiple tprK sequences in one sample (Heymans et al., 2009), indicating that individual patients often contain multiple TPA strains expressing variants of the tprK gene.
Another variable region, characterized by the presence or absence of a 1.3 kb DNA sequence, was found in the Nichols intergenic region between TP0126 and TP0127. In a library of TPA DNA constructed in E. coli, 6 out of 21 clones contained the 1.3 kb insertion and 15 lacked it (Šmajs et al., 2002). DNA regions corresponding to the 1.3 kb sequence, which is similar to the tprK gene, were found in all TPA (SS14, DAL-1, Mexico A) and TPE genomes (Samoa D, Gauthier, CDC-2, Fribourg-Blanc) that have been tested (Mikalová et al., 2010).
Except for the variable regions of the tprK gene or tprK-similar sequences, sequencing of the complete TPA SS14 genome has revealed at least 43 nucleotide positions in which the sequence varied within one treponemal strain (Matějková et al., 2008). Multiple intrastrain variants were detected in additional tpr genes (tprC,I,J), in the intergenic region between tprI and tprJ, in TP0402 (flagellum specific ATP synthase), TP0971 (membrane antigen), and TP1029 (hypothetical protein). However, the genome of the SS14 strain was not analyzed systematically for intrastrain heterogeneity and therefore, additional variable genetic loci can be expected (Matějková et al., 2008).
Syphilis is a multi-stage disease that is capable of being sexually transmitted, has the potential to invade the central nervous system, and, when untreated in women, can lead to frequent transplacental congenital infections. In contrast, yaws is a disease with predominantly cutaneous or mucocutaneous and bone manifestations (Antal et al., 2002). Although both diseases are distinguished on the basis of epidemiological characteristics and clinical symptoms, these differences do not appear to be fundamental. Yaws can progress to central nervous system and cardiovascular infections as well as to congenital infections (Román and Román, 1986). Moreover, the transmission route of syphilis and yaws appears to reflect opportunity, rather than inherent differences between TPA and TPE strains (Mulligan et al., 2008). Although experimental infection with TPA or TPE strains did not result in complete cross-protection, suggesting differences in the pathogenesis of syphilis and yaws (Miller, 1973; Schell et al., 1982), a similar situation was also found with different TPA strains (Turner and Hollander, 1957). In general, the invasivity of TPE strains appears to be intermediate compared to the more invasive TPA strains and the less invasive T. carateum (Antal et al., 2002).
The small genome size of TPA strains (Fraser et al., 1998; Giacani et al., 2010; Matějková et al., 2008) appears to be the reason for the drastic reduction of treponemal metabolic activities, resulting in long generation times (longer than 30 hours; Fieldsteel et al., 1981), sensitivity to oxygen and growth temperature (Stamm et al., 1991) and obligatory host-dependent growth. Despite its reduced metabolic activities, TPA is an extremely successful pathogen characterized by a low infection dose for humans (as low as about 10 treponemes, with a median infection dose of 57 bacteria; Magnuson et al., 1956), the ability to infect any type of human tissue, immune escape resulting in persistence in the host for years or even decades, and the ability to cross the placenta and infect the fetus. Despite efficient host infection, no clear virulence factors have been identified on the basis of sequence analyses of the genomes, except for several genes encoding putative hemolysins. However, recombinant expression of these genes in E. coli did not result in a hemolytic phenotype, suggesting a primary role for these genes other than cytolysis (Weinstock et al., 2000b; D. Šmajs, unpublished results). These data indicate the presence of a new, yet unknown, repertoire of genes encoding virulence determinants specific for treponemal/spirochetal pathogens.
In general, TPA strains are characterized by low toxicity and high invasiveness, caused, at least to a certain extent, by the corkscrew-like motility of treponemes, which allows them to penetrate low viscosity human tissues. The presence of several genes encoding components of treponemal chemotactic sensory systems suggests a possible role for chemotaxis in treponemal pathogenesis.
Differences in specific genes between (collectively) the genomes of TPA strains and the Cuniculi A genome are likely to include genes involved in infectivity of TPA strains in humans (for list of genes see Šmajs et al., 2011). In addition, genetic differences between TPA genomes and both the T. paraluiscuniculi and TPE genomes (e.g. tprA,C,D,F,I,J,K,L; recQ; TP0136; tp92, arp; mcp; and hypothetical genes) may be involved in determination of syphilis-specific symptomatology (i.e. invasion of the central nervous system, sexual transmission, and transplacental transmission).
Sequencing of the genome of the rabbit pathogen, which is unable to infect humans, has revealed dozens of genes with major sequence changes that have the potential to significantly affect their function (Šmajs et al., 2011). Although there is no direct evidence on the role of these genes in treponemal host specificity (Fig. 2), at least three lines of indirect evidence support this prediction: (i) these genes are actively transcribed during experimental rabbit infections (Šmajs et al., 2005, 2011), (ii) some of these genes show predicted positive selection in TPE – TPA comparisons (15 genes under positive selection after exclusion of genes with possible recombination events and frameshift mutations; Čejková et al., 2011), whereas several of them show neutral evolution in TPA – T. paraluiscuniculi comparisons, suggesting their importance in human pathogens and their decay in the rabbit pathogen, and (iii) these genes are more numerous in the group of predicted virulence genes than among genes encoding components of general metabolism. One of the best studied genes, arp, contains a variable number of repetitive 60 bp sequences in its central region (Pillay et al., 1998), and variability was also found in the sequence of the motif (Liu et al., 2007). Moreover, the repetitive motifs of Arp proteins were found to be immunogenic and to contain a fibronectin-binding domain (Liu et al., 2007). Proteins containing tandem repetitions are often outer membrane virulence factors that help the pathogen survive within the host (Denoeud and Vergnaud, 2004). Such proteins have been identified in the genomes of Haemophilus influenzae (Hood et al., 1996), Neisseria meningitidis (Saunders et al., 2000) and Mycoplasma hyorhinis (Citti et al., 1997). Harper et al. (2008b) classified the repetitive motifs of the arp gene into several groups and correlated the variability of repetitive motifs with sexual transmission.
Similar variability in the number of invariant repetitions was found in the TP0470 gene (Mikalová et al., 2010; Strouhal et al., 2007). Moreover, the conserved hypothetical protein TP0470 was found to be immunogenic (Brinkman et al., 2006; McKevitt et al., 2005).
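The repeat-number variability described above for arp and TP0470 can be illustrated with a short script. The following is only a minimal sketch, not the method used in the cited studies: the function names, the mismatch threshold and the toy sequences are assumptions introduced for illustration. It counts the longest run of back-to-back units of a fixed length (60 bp by default, as reported for arp) in which each unit differs from the preceding one at no more than a chosen number of positions, so that slightly divergent repeat motifs are still counted.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def longest_tandem_run(seq, unit_len=60, max_mismatches=6):
    """Largest number of consecutive units of length unit_len in which every
    unit differs from the previous one at <= max_mismatches positions.

    unit_len=60 follows the arp repeat length given in the text;
    max_mismatches=6 is an arbitrary illustrative threshold (assumption).
    Returns 0 when no two adjacent units are similar enough.
    """
    best = 0
    # Try every possible start position, because the first repeat unit
    # need not begin at the start of the supplied sequence.
    for start in range(len(seq) - unit_len + 1):
        count = 1
        pos = start
        while (pos + 2 * unit_len <= len(seq)
               and hamming(seq[pos:pos + unit_len],
                           seq[pos + unit_len:pos + 2 * unit_len]) <= max_mismatches):
            count += 1
            pos += unit_len
        if count > 1:
            best = max(best, count)
    return best


# Toy data (hypothetical, 10 bp unit for readability; not a real arp motif).
unit = "ACGTTGCAAT"
variant = "ACGTAGCAAT"            # one substitution relative to `unit`
exact = "GGGGG" + unit * 4 + "CCCCC"
mixed = unit + variant + unit

print(longest_tandem_run(exact, unit_len=10, max_mismatches=0))
print(longest_tandem_run(mixed, unit_len=10, max_mismatches=1))
```

With a nonzero mismatch threshold, flanking sequence that happens to resemble the repeat can inflate the count, so in practice the threshold would have to be tuned to the motif variability of the gene in question.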
Genes specifically affected in TPE genomes (TP0671 encoding ethanolamine phosphotransferase, hypothetical genes) could code for factors increasing invasivity of TPA strains (Fig. 2). Apart from the tpr genes, only TP0077, TP0326, TP0520, and TP0936 were among the predicted virulence factors on a list containing 67 possible virulence factors (Weinstock et al., 1998). This discrepancy likely reflects our insufficient understanding of the pathogenesis of TPA and TPE infections. The Campylobacter jejuni phosphoethanolamine transferase modifies the lipooligosaccharide lipid anchor (lipid A) and the flagellar rod protein, FlgG (Cullen and Trent, 2010). Since treponemes do not contain lipopolysaccharides, the TP0671 product may instead be involved in the modification of the proteins encoded by the flgG-1 and flgG-2 genes (TP0960, TP0961), which are present in both syphilis and yaws treponemes.
Genomic data provide sequence information that can be used to predict genes involved in pathogenesis by estimating the type of selection acting on them. The type of selection has been predicted for several treponemal genes based on available sequences from T. paraluiscuniculi, TPE and TPA strains. As shown in Table 3, positive selection was identified between several orthologous sequences. The predicted type of selection, together with the assumption that T. paraluiscuniculi is a descendant of TPA or TPE strains in the process of adaptation to rabbits (Šmajs et al., 2011), can be used to identify genes involved in the pathogenesis of particular treponemal diseases. The tp92 (TP0326) gene is under positive selective pressure (Table 3) in TPE and TPA treponemes and therefore has a potential role in syphilis and/or yaws pathogenesis. In contrast, this gene in the Cuniculi A strain is under purifying (negative) selection, indicating an important cellular function but no major role in adaptation to rabbits or immune escape during rabbit syphilis. Conversely, the arp (TP0433) gene appears to have been positively selected in the Cuniculi A strain (Harper et al., 2008b) compared to TPA/TPE strains, indicating its possible role in rabbit syphilis. The mcp (TP0488) gene appears to be positively selected in the TPA – TPE comparison, suggesting a role in adaptation of the yaws and/or syphilis treponemes.
Of 78 investigated TPE divergent genes, positive selection was predicted in 15 orthologous genes. The positively selected genes were often found to encode exported or membrane proteins, including predicted lipoproteins and outer membrane proteins, suggesting that these proteins may be responsible for differences in pathogenesis between syphilis and yaws (Čejková et al., 2011). Predicted positive selection in metabolic, transport, and cell-process genes suggests that positive selection of these genes could be correlated with adaptive processes, such as climate adaptation of treponemes.
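The distinction between positive, neutral and purifying selection discussed above is commonly expressed as the ratio of nonsynonymous to synonymous change between two orthologous coding sequences. The text does not state which estimator the cited studies used, so the following is only a simplified, count-based sketch in the spirit of the Nei-Gojobori approach. Its assumptions are: the input sequences are already codon-aligned, gap-free and contain only uppercase A, C, G and T; codons that differ at more than one position are skipped; changes to stop codons are counted as nonsynonymous; no multiple-hit correction is applied; and the tolerance around a ratio of 1 is arbitrary. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code in NCBI order (bases T, C, A, G; third position fastest).
# The bacterial code (table 11) has the same amino acid assignments.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sites(codon):
    """Number of synonymous sites in one codon (between 0 and 3)."""
    sites = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODON_TABLE[mutant] == CODON_TABLE[codon]:
                sites += 1 / 3
    return sites


def pn_ps(seq_a, seq_b):
    """Return (pN, pS), the proportions of nonsynonymous and synonymous
    differences per nonsynonymous and synonymous site, respectively."""
    if len(seq_a) != len(seq_b) or len(seq_a) % 3:
        raise ValueError("sequences must be aligned, gap-free and a multiple of 3 long")
    syn_sites = nonsyn_sites = syn_diffs = nonsyn_diffs = 0.0
    for i in range(0, len(seq_a), 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        n_diff = sum(x != y for x, y in zip(a, b))
        if n_diff > 1:
            continue  # simplification: multi-hit codons are ignored
        s = (synonymous_sites(a) + synonymous_sites(b)) / 2
        syn_sites += s
        nonsyn_sites += 3 - s
        if n_diff == 1:
            if CODON_TABLE[a] == CODON_TABLE[b]:
                syn_diffs += 1
            else:
                nonsyn_diffs += 1
    pn = nonsyn_diffs / nonsyn_sites if nonsyn_sites else 0.0
    ps = syn_diffs / syn_sites if syn_sites else 0.0
    return pn, ps


def selection_type(pn, ps, tolerance=0.1):
    """Label the pN/pS ratio; the tolerance around 1 is an illustrative choice."""
    if ps == 0:
        return "undetermined"
    omega = pn / ps
    if omega > 1 + tolerance:
        return "positive"
    if omega < 1 - tolerance:
        return "purifying"
    return "neutral"


# Toy example (hypothetical sequences): one synonymous change, TTT -> TTC.
pn, ps = pn_ps("TTTGGG", "TTCGGG")
print(pn, ps, selection_type(pn, ps))
```

A ratio well above 1 points to positive selection, a ratio near 1 to neutral evolution, and a ratio well below 1 to purifying selection. For genes as similar as treponemal orthologs the number of differences is small, so any real analysis would need a statistical test rather than a fixed tolerance.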
Sequencing of the Cuniculi A genome revealed gene fusions compared to the originally annotated Nichols genome (Fraser et al., 1998) where 52 Nichols orthologs were fused into 25 genes in the Cuniculi A genome (Šmajs et al., 2011). Ongoing resequencing of the Nichols genome (P. Pospíšilová, personal communication) has revealed that most of the observed gene fusions are also present in the Nichols genome. Since the Nichols strain was originally isolated in 1912 from a patient with neurosyphilis (Nichols and Hough, 1913) and, since then, propagated in rabbits, questions regarding newly emerged, adaptive, nucleotide changes important for rabbit infectivity have appeared. Despite extended propagation in rabbits, the Nichols strain is still virulent as demonstrated by human inoculations (Magnuson et al., 1956) as well as accidental infections of laboratory personnel (Chacko, 1966; Fitzgerald et al., 1976). Since whole genome sequencing of the Nichols strain (Fraser et al., 1998) and Nichols resequencing (Nichols 4787, S.J. Norris) used different Nichols preparations, we examined highly related TPA genomes (Chicago, DAL-1, and Mexico A) for the presence of the newly identified changes (Šmajs et al., 2011). Since similar sets of nucleotide changes were also found in TPA Chicago (Giacani et al., 2010) and the preliminary DAL-1 and Mexico A genomes (unpublished data), the potential presence of recently emerged intrastrain adaptive mutations in the Nichols genome was excluded. Instead, the observed changes represent sequencing errors in the published Nichols genome (Fraser et al., 1998; Šmajs et al., 2011).
In treponemal genomes, there are 12 paralogous tpr genes (or pseudogenes) classified into 3 subfamilies (subfamily I, tprC,D,F,I; II, tprE,G,J; and III, tprA,B,H,K,L). Several of the encoded proteins, belonging to all three subfamilies, have recently been predicted to be outer membrane proteins (TprB,C,D,E,F,I,J; Cox et al., 2010). In addition to these proteins, Centurion-Lara et al. (1999) showed that tprK encodes an outer membrane protein, although others have questioned this finding (Hazlett et al., 2001). Recently, Giacani et al. (2010) showed that the host immune response selects new TprK variants during experimental infection, which is consistent with the surface localization of TprK. Subfamily II tpr genes are expressed differently in different TPA isolates, with G homopolymers of variable length in the promoter regions (Giacani et al., 2007, 2009). Since most Tpr proteins elicit an antibody response and the expression of tpr genes is time and strain specific, a role for these proteins in the persistence of treponemal infections in immunocompetent hosts has been suggested (Leader et al., 2003). The TprK protein is the most variable treponemal protein, with several variants present within one strain of TPA (except for the rabbit-propagated Nichols strain; LaFond et al., 2006). Moreover, it has been shown that most variable regions of TprK showed increased sequence diversity upon experimental rabbit infection (LaFond et al., 2003). The extreme variability of the tprK gene is generated through a gene conversion mechanism (Centurion-Lara et al., 2004), causing TprK to undergo antigenic variation, which may in turn promote chronic infection. Whereas T-cells recognize conserved regions of TprK, antibodies are directed against the variable regions (Morgan et al., 2002a). Immunization with recombinant Nichols TprK led to partial protection against a Nichols challenge, but provided less protection against TPA with heterologous TprK (Centurion-Lara et al., 1999; Morgan et al., 2002b, 2003).
The fact that tpr genes are the most variable gene family in treponemal strains suggests their role in the pathogenesis of treponematoses as well as in host specificity (Centurion-Lara et al., 2000a, 2000b). The variability of tprK appears to be one of the principal mechanisms of immune evasion, allowing reinfection of previously infected hosts (Palmer et al., 2009).
Most of the work on the origin and evolution of syphilis is based on paleontological findings; however, several studies have already started to use genetic data. Gray et al. (2006) analyzed 6 tpr genes in TPA, TPE, TEN, the Fribourg-Blanc isolate and T. paraluiscuniculi strains and identified intragenomic recombination as an important mechanism of tpr gene evolution. Moreover, this work does not support an origin of TPA strains in recent history (less than 500 years ago). Harper et al. (2008a) performed genetic analyses of 21 chromosomal regions in 26 treponemal strains and isolates. This work proposes that T. pallidum as a species originated in the Old World as a non-venereal infection and then spread to the New World as yaws. Subsequently, treponemal strains from the Americas were introduced to Europe and became the syphilis-causing strain. However, the evolutionary order of TPA and TPE strains cannot be unambiguously inferred from this analysis (Mulligan et al., 2008), especially in the light of the genome decay identified in T. paraluiscuniculi (Šmajs et al., 2011), which suggests that T. paraluiscuniculi is a descendant of an ancestor of pallidum and pertenue strains rather than the opposite. Measurement of the evolution rate in treponemes and its comparison with paleopathological evidence and with the evolution rates of other bacteria is consistent with the divergence of syphilis from other human treponematoses 5-16.5 thousand years ago (de Melo et al., 2010).
Historically, there have been several hypotheses on the origin of syphilis, including the Columbian, Pre-Columbian, Alternative and the Unitarian hypotheses. Whereas the Columbian hypothesis presumed the existence of syphilis in America with subsequent import by Columbus' crew into Europe (Mays et al., 2003; Naranjo, 1994), the Pre-Columbian hypothesis assumed the existence of syphilis and other treponematoses in the Old and New World a long time before Columbus' era and the misidentification of pre-Columbian syphilis cases as leprosy or other diseases (Meyer et al., 2002; Powel and Cook, 2005). According to the Unitarian hypothesis, all treponemal diseases (syphilis, yaws, bejel, and pinta) are caused by the identical microorganism and disease symptomatology reflects differences in climate and social factors (Powel and Cook, 2005). The Alternative theory suggests trans-species transmission of syphilitic treponemes from African baboons (Livingstone, 1991).
Based on consistent genetic differences found between strains causing yaws and syphilis (Gray et al., 2006; Harper et al., 2008a, 2008b; Čejková et al., 2011), the Unitarian hypothesis can be rejected. The genetic distance between TPA and TPE strains appears to be inconsistent with a recent emergence of syphilis in the 15th century and TPA strains appear to be at least several thousand years old (de Melo et al., 2010; Gray et al., 2006). However, one cannot exclude a very rapid adaptation of a parasite to its host under the right circumstances.
Since bone alterations caused by syphilis, yaws and bejel are distinctive for these diseases (Rothschild, 2005) and the osseotype of syphilis in the pre-Columbian era was found in the New World but not in Europe, Africa or Asia (Rothschild, 2005; Rothschild and Rothschild, 2000), transmission of syphilis by Columbus' crew into Europe is a plausible explanation. However, other authors suggest that syphilis was present in other parts of the world, in addition to the New World, in the pre-Columbian era (de Melo et al., 2010). Support for the Alternative hypothesis (Livingstone, 1991) comes from the fact that, concurrently with Columbus' expeditions, exploration of the African continent was also intensifying, which could have increased exposure to a possible reservoir of yaws treponemes in the human (and primate) populations of Africa.
Although our knowledge regarding the evolution of syphilis treponemes is rather fragmentary, genetic and genomic studies have revealed several points: (i) there is a close genetic relationship between the Fribourg-Blanc isolate and yaws strains (Gray et al., 2006; Harper et al., 2008a; Mikalová et al., 2010) as well as between treponemes causing infections in wild baboons and TPE strains (Knauf et al., 2011), which indicates a common origin of these treponemes; moreover, the Fribourg-Blanc treponeme has the largest genome among the investigated TPA and TPE strains (Mikalová et al., 2010); (ii) genetic diversity between TPA and TPE strains is consistent with TPA evolution over several thousand years (de Melo et al., 2010; Gray et al., 2006); and (iii) pathogenic treponemes evolve by decreasing genome size while adapting to their hosts (Šmajs et al., 2011). The available genetic and genomic data are thus not fully consistent with any single one of the remaining hypotheses regarding the origin of syphilis, but may combine some aspects of all of them: an origin in African baboon treponemal strains, evolution of TPA strains over several thousand years, and a sudden import of TPA strains from the New World into Europe by Columbus' crew.
Apart from the tpr genes and the arp gene, the TP0136, TP0548, TP0326 and TP0488 genes were identified as those with the greatest nucleotide differences among the TPA strains tested, including Nichols, SS14, DAL-1, Mexico A, Grady, MN-3, Philadelphia-1, Philadelphia-2, and Bal-73-01 (D. Šmajs, unpublished results). All the investigated strains clustered in two subclusters containing either the Nichols or the SS14 strain (Fig. 1B). Together with the 23S rDNA locus, the TP0136 and TP0548 genes have already been tested for typing of syphilis-causing strains isolated in the Czech Republic. Among typeable treponemal DNA samples taken from 64 patients, nine different genotypes were identified (Flasarová et al., 2011). More importantly, the identified genotypes were found to combine independently with each other and also with the number of repetitions in the arp gene and the restriction profile of the tprEGJ genes (Pillay et al., 1998). The recently improved CDC typing system combines arp and tprEGJ gene typing (Pillay et al., 1998) with analysis of G repeats in the rpsA gene (TP0279) (Katz et al., 2010) or with sequencing of part of the TP0548 gene (Marra et al., 2010). Since human populations are partially separated based on geographic origin, religion, native language, etc. (e.g. T. pallidum DNA from patients in Madagascar shared only one of six subtypes with isolates from patients in the USA; Marra et al., 2010), it is likely that population-specific typing systems will be needed for precise epidemiological mapping of syphilis.
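The way independent typing loci combine to increase resolution can be sketched in a few lines. The example below is purely illustrative: the allele labels ("A", "B"), the sample records and the composite label format are hypothetical placeholders, not data or nomenclature from the cited studies. It pairs a sequence-based type (TP0136, TP0548, 23S rDNA) with an arp repeat number and tprEGJ restriction pattern, then counts how many distinct types are resolved with and without the second component.

```python
from collections import Counter
from typing import NamedTuple


class SequenceType(NamedTuple):
    """Alleles at the three sequenced loci (labels are placeholders)."""
    tp0136: str
    tp0548: str
    rdna_23s: str


class RepeatRflpType(NamedTuple):
    """arp repeat number plus tprEGJ restriction pattern."""
    arp_repeats: int
    tpr_pattern: str


def composite_label(seq_type, rr_type):
    """Join both typing schemes into one label (format is an assumption)."""
    return (f"{rr_type.arp_repeats}{rr_type.tpr_pattern}|"
            f"{seq_type.tp0136}-{seq_type.tp0548}-{seq_type.rdna_23s}")


# Hypothetical patient samples, invented for illustration only.
samples = [
    (SequenceType("A", "A", "A"), RepeatRflpType(14, "d")),
    (SequenceType("A", "A", "A"), RepeatRflpType(14, "d")),
    (SequenceType("A", "A", "A"), RepeatRflpType(15, "d")),
    (SequenceType("B", "A", "A"), RepeatRflpType(14, "d")),
]

by_sequence = Counter(seq for seq, _ in samples)
by_composite = Counter(composite_label(seq, rr) for seq, rr in samples)

print(len(by_sequence), "sequence types")
print(len(by_composite), "composite types")
print(by_composite.most_common(1))
```

Because the two components vary independently, the composite scheme separates samples that either scheme alone would group together, which is the rationale for combining loci in a typing system.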
Whole genome analyses of TPE strains (Mikalová et al., 2010) and subsequent whole genome sequencing (Čejková et al., 2011) revealed several regions suitable for molecular diagnosis of the yaws-causing strains, including the TP0266 and TP0316 genes, the intergenic region (IGR) between TP0548-TP0549, and the TP1030-TP1031 genes. The observed indels ranged between 33 and 635 nt in length and can be found in the Samoa D genome (GenBank acc. no.: CP002374). However, these regions need to be analyzed in a larger number of TPE strains. Interestingly, all of these indels were also found in the Fribourg-Blanc genome. However, several specific indels differentiate the Fribourg-Blanc isolate from other individual TPE strains (Mikalová et al., 2010). Ongoing whole genome sequencing of the TEN strain Bosnia A (D. Šmajs, unpublished data) has already revealed indels in IGR TP0085-TP0086 and in genes TP0136, TP0326, and TP0865, ranging between 13 and 0.06 kbp, which differentiate the bejel treponeme from both the TPA and TPE strains. Further work will test TPE- and TEN-specific molecular signatures as targets for specific molecular detection of yaws and bejel treponemes.
We are still some distance from understanding the genetic basis of syphilis pathogenesis and the strategies that T. pallidum uses to survive, invade and adapt in human hosts, especially when we consider its ability to cause life-long infections when untreated and its ability to infect all human tissue types. In spite of our relative lack of understanding of the pathogenesis of T. pallidum, it is nonetheless easy to treat T. pallidum infections with antibiotics. Because of this and other features, such as a relatively short incubation period, it should be theoretically possible to eradicate syphilis (Rompalo et al., 2001) in the relatively near future. The future will reveal whether the discovery of the molecular strategies that T. pallidum uses in the pathogenesis of human infection will come sooner than the eradication of syphilis, or vice versa.
This work was supported by grants from the Grant Agency of the Czech Republic (310/04/0021), the Ministry of Health of the Czech Republic (NT11159-5/2010) and the Ministry of Education of the Czech Republic (VZ MSM0021622415) to D.S.