RNA transcript levels in the syphilis spirochete Treponema pallidum subsp. pallidum (Nichols) isolated from experimentally infected rabbits were determined by the use of DNA microarray technology. This characterization of the T. pallidum transcriptome during experimental infection provides further insight into the importance of gene expression levels for the survival and pathogenesis of this bacterium.
Treponema pallidum subsp. pallidum is the causative agent of syphilis, a sexually transmitted disease characterized by multiple stages, widespread dissemination, persistent infection for years to decades, and varied clinical manifestations (18). Closely related spirochetes cause yaws, pinta, and endemic syphilis. This group of bacteria has not been cultured continuously under in vitro conditions, although some multiplication can be obtained in a tissue culture system (8, 17). T. pallidum can be maintained by intratesticular or intradermal inoculation of rabbits, causing an experimental infection that in many ways resembles the primary stage of syphilis. The inability to culture T. pallidum in vitro prevents the use of mutational analysis and other common molecular genetic approaches to study this bacterium and has limited the information available on the physiology and pathogenesis of the organism.
The 1.14-Mbp genome of T. pallidum subsp. pallidum Nichols was sequenced in 1998, and 1,039 open reading frames (ORFs) were predicted (12, 30). The availability of the complete genome sequence permits the use of genomic approaches to study this organism. DNA microarray-based gene expression profiling has been used to study differences in gene expression in several pathogenic and nonpathogenic bacteria under various environmental conditions (19-21, 24, 27). Moreover, DNA microarray techniques have been employed to map the genomic differences among strains of the same bacterial species (3, 7, 13).
The intradermal or intratesticular inoculation of rabbits with T. pallidum results in lesions that in many ways resemble the primary lesions of human syphilis (8, 28). As in human chancres, T. pallidum can reach extremely high concentrations in experimentally infected rabbits, permitting the isolation of 10^10 to 10^11 organisms from infected testicular tissue at 10 to 12 days postinoculation. This source of host-adapted T. pallidum permits analyses of gene expression during infection. In this study, we report the construction of a T. pallidum DNA microarray and the profiling of T. pallidum subsp. pallidum Nichols gene transcript levels during experimental infections of rabbits.
T. pallidum subsp. pallidum (Nichols) was maintained by rabbit inoculation and purified by Hypaque gradient centrifugation as described previously (2, 12). Chromosomal DNA was prepared as described by Fraser et al. (12). For RNA isolation, T. pallidum (Nichols) was purified from rabbit testicular tissue at 10 or 11 days postinoculation by Hypaque centrifugation, and RNAs were purified by the RNAzol B method based on guanidine thiocyanate phenol-chloroform extraction (TelTest, Friendswood, Tex.). Care was taken to maintain the samples at 4°C at all times and to minimize the processing time (3 to 4 h). RNA processing is unlikely to occur during the guanidine and phenol RNA extraction procedure. Gel electrophoresis of the samples was performed to evaluate the quality of RNA isolation. A smear with two strong bands corresponding to 23S and 16S RNAs and with discrete bands representing precursor RNAs was observed, indicating the integrity of the RNA molecules. To further assess the quality of RNA samples, transcript levels of 93 genes in two independent RNA preparations were examined by reverse transcription and real-time PCR amplification. A quantitation of gene expression yielded a high degree of correlation (r = 0.94), indicating that there was no bias in the relative amounts of mRNA species in individual RNA isolates.
We developed a microarray chip for T. pallidum subsp. pallidum Nichols containing all 1,039 predicted ORF PCR products together with a set of control PCR products from Escherichia coli, Shigella sp., and Enterococcus sp. (see the supplemental material). The annotation of ORFs was taken from the genome sequence (12), and most of the PCR products were amplified from the pUNI-D-TOPO cloning vector containing individual cloned T. pallidum genes (15). Some of the genes were amplified either from treponemal cosmid clones (23) or from isolated T. pallidum chromosomal DNA. Most of the individual PCR products covered the complete gene length, whereas products for genes with predicted signal sequences lacked the gene segment corresponding to those sequences (15). Negative control spots with no DNA were also included. The PCR products were purified by the use of Multiscreen 96-well filter plates (Millipore, Bedford, Mass.) according to the manufacturer's recommendations. Alternatively, some of the PCR products were purified by use of a PCR purification kit (QIAGEN, Valencia, Calif.).
T. pallidum chromosomal DNA (0.25 to 0.75 μg) was labeled by use of the Klenow enzyme (New England Biolabs, Beverly, Mass.) and random nonamers with a CyScribe first-strand cDNA labeling kit (Amersham Pharmacia Biotech, Piscataway, N.J.) protocol, with minor modifications. Labeling was performed at room temperature for 2.5 h. The total RNA (10 to 20 μg) was labeled with Superscript II reverse transcriptase (Invitrogen, Carlsbad, Calif.) as described in the supplemental material. The pretreated slides were hybridized simultaneously with labeled DNA and cDNAs by use of a CyScribe first-strand cDNA labeling kit (Amersham Pharmacia Biotech). Slides were then scanned with a ScanArray 5000 scanner (Packard BioScience, Meriden, Conn.), and the fluorescence intensities of individual spots were measured with ImaGene software (BioDiscovery, Marina del Rey, Calif.). The data analysis method was adapted from the work of Revel et al. (20) (see the supplemental material). The microarray data represent the means of three separate hybridizations (of two different RNA preparations), i.e., nine possible hybridizations for each gene.
The T. pallidum gene microarray used for these studies was constructed on glass slides by the use of PCR products. The 1,039 annotated T. pallidum ORFs (12) were named TP0001 to TP1041 (the loci TP0495 and TP0635 are not valid; for more information, see www.tigr.org). The average gene length of T. pallidum genes is 1,023 bp. Of the 1,039 ORFs, 1,034 gave positive signals for both labeled chromosomal DNA and cDNAs labeled with the total RNA as a template. The microarray signals of five ORFs (TP0156, TP0161, TP0518, TP0777, and TP1032) were below the threshold limit of the DNA channel and were omitted. All of these ORFs are relatively short (402, 60, 690, 222, and 432 bp, respectively) and code for conserved hypothetical (TP0156 and TP0518) or hypothetical (TP0161, TP0777, and TP1032) proteins. The cohybridization of Cy3-labeled T. pallidum DNA and Cy5-labeled cDNAs (or the reverse) yielded highly reproducible results in three independent experiments performed for this study. Moreover, comparable data were obtained with different RNA preparations, indicating that minor variations in the RNA preparations did not affect the overall results. Data were calculated as average expression ratios representing average, normalized ratios of cDNA to DNA fluorescent signals for replicate spots on each microarray and for replicate experiments. A value of 1 corresponds to the mean transcript level for all genes on the array. When expressed as a log value, the mean value is 0; values of >0 represent higher than average transcript levels, whereas those of <0 are below the average level. The cohybridization of differentially labeled cDNA and DNA preparations with each microarray chip permitted the normalization of effects resulting from different gene lengths, varied quantities of target DNA in each spot and of total RNA per hybridization, and other factors (29). This approach allowed us to compare gene expression levels among treponemal genes under given conditions. 
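As a rough illustration of the ratio calculation described above, the following Python sketch computes per-gene log ratios of cDNA to DNA signal and centers them so that the array-wide mean is 0. It is not the analysis procedure used in this study (which was adapted from Revel et al. and is described in the supplemental material). The function name, the use of base-10 logarithms, the averaging of replicates on the log scale, and the assumption that intensities are already background corrected are all illustrative choices.

```python
import math
from statistics import mean


def normalized_log_ratios(cdna, dna):
    """Return per-gene log10(cDNA/DNA) ratios centered on an array mean of 0.

    cdna, dna: dicts mapping a gene ID to a list of spot intensities
    (replicate spots and/or replicate hybridizations, in matching order).
    Intensities are assumed to be positive and background corrected.
    """
    raw = {}
    for gene in cdna:
        # One log ratio per replicate, then the average over replicates.
        logs = [math.log10(c / d) for c, d in zip(cdna[gene], dna[gene])]
        raw[gene] = mean(logs)
    # Subtracting the array-wide mean makes 0 the average transcript level.
    center = mean(list(raw.values()))
    return {gene: value - center for gene, value in raw.items()}
```

With this convention, a gene whose centered value is 1 has a cDNA/DNA ratio 10-fold above the array average, and a value of -1 is 10-fold below it.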
Gene expression profiling experiments performed with many organisms that can be grown under different conditions (10, 19-21, 24, 27) cannot be performed with T. pallidum because this spirochete cannot be continuously grown under in vitro conditions. Thus far, we have been unable to isolate sufficient RNAs after intradermal inoculations of rabbits and time-limited cultivation of T. pallidum under in vitro conditions to perform DNA microarray experiments. Although the cDNA signal depends on the efficiency of reverse transcription, which is different for different mRNA species, the random priming used in this study to label mRNA may minimize these differences. cDNA signals were detected for all treponemal genes, and observed differences in their expression varied by >2 orders of magnitude. This was consistent with the microarray experimental design (1). The fact that positive cDNA signals were detected for nearly all genes suggests that all genes of T. pallidum are expressed during experimental infection. This might imply that all of the genes that are not necessary for interacting with the mammalian host were lost during adaptation.
The expression of treponemal genes along the chromosome varied to a wide extent; the average log ratios in windows containing 10 genes are shown in Fig. 1. The region with the highest average transcription rate was located between genes TP0187 and TP0250, which encompass a large ribosomal protein operon. Interestingly, this region was previously described to be underrepresented in a bacterial artificial chromosome library of T. pallidum DNA in E. coli (23).
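The windowed averaging mentioned above can be sketched as follows. The text does not state whether the 10-gene windows overlap, so consecutive, non-overlapping windows are assumed here; the function name and return format are likewise illustrative.

```python
from statistics import mean


def window_means(log_ratios, size=10):
    """Average log ratios over consecutive, non-overlapping windows of genes.

    log_ratios: list of per-gene log ratios in chromosomal order.
    Returns a list of (start_index, window_mean) tuples; the final window
    may contain fewer than `size` genes.
    """
    return [
        (start, mean(log_ratios[start:start + size]))
        for start in range(0, len(log_ratios), size)
    ]
```

Plotting the window means against the start index gives a chromosome-wide profile in which highly transcribed regions appear as peaks.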
The coordinates and gene numbering of the T. pallidum genome sequence begin at the predicted origin of replication; thus, roughly the first 520 genes are replicated in the forward (+) direction, whereas the other half of the genome is replicated in the reverse (−) direction. In most bacterial genomes, the majority of genes are transcribed in the same direction as that in which DNA replication proceeds (22), and this replication-transcription codirectionality appears to apply especially for highly transcribed genes (31). The distribution of treponemal genes in the expression level intervals for each chromosomal half is shown in Fig. 2. The tendency toward replication-transcription codirectionality was also found for T. pallidum, but higher transcription of genes oriented in the direction of replication was not observed. Given the long doubling time of T. pallidum (>30 h) in vivo and in vitro (9, 11), it is possible that exceptionally high transcription rates are not needed to keep up with cellular multiplication.
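One simple way to quantify codirectionality is to count the genes whose coding strand matches the direction of replication in their half of the chromosome. The sketch below assumes that genes are listed in chromosomal order starting at the origin, that the first `n_forward` genes are replicated in the forward direction, and that strands are encoded as "+" or "-"; these conventions and the function name are illustrative and are not taken from the study.

```python
def codirectional_fraction(strands, n_forward):
    """Fraction of genes transcribed in the direction of replication.

    strands: list of "+" or "-" per gene, in chromosomal order from the
    predicted origin of replication.
    n_forward: number of leading genes replicated in the forward (+)
    direction; the remaining genes are replicated in the reverse (-) direction.
    """
    codirectional = 0
    for index, strand in enumerate(strands):
        replication = "+" if index < n_forward else "-"
        if strand == replication:
            codirectional += 1
    return codirectional / len(strands)
```

Applying the same count separately to genes grouped by expression level would show whether highly transcribed genes are more strongly codirectional than the rest.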
T. pallidum genes were sorted according to their relative degrees of expression standardized to the amount of T. pallidum chromosomal DNA. The 100 genes with the highest transcript levels are shown in Table 1. For brevity and clarity, ORFs encoding hypothetical proteins (41 ORFs) were omitted from the table, but they are included in the supplemental material (Tables S1 and S2). Forty-one ORFs encoding hypothetical proteins (with an average ORF length of 624 bp) were randomly dispersed along the chromosome. Some of these formed clusters of two to three ORFs, indicating an operon organization. Besides the expected genes for ribosomal and polypeptide processing proteins (17 genes), the most highly transcribed genes were the flagellar filament and cytoplasmic filament protein genes (5 genes, including flaB-1-3, flaA, and cfpA), genes for lipoproteins or other prominent membrane proteins (11 genes, namely, tp47, TP0663, tpD, tmpC, tmpA, tpp15, tmpB, tpp17, p83/100h, tpn38b, and tap1), genes encoding chaperonins (3 genes, namely, groEL, groES, and dnaK), genes for proteins involved in redox balance (5 genes encoding alkyl hydroperoxidase C, flavodoxin, thioredoxin, pyruvate oxidoreductase, and desulfoferrodoxin), 4 genes for chemotaxis proteins (cheX, cheY, mcp2-1, and cheA), and genes encoding metabolic and other functions (14 genes). High gene expression activities were also detected for genes encoding glycolytic pathway enzymes (e.g., TP0844, encoding glyceraldehyde-3-phosphate dehydrogenase) and for one V-type ATPase operon (TP0424 to TP0430), whereas the other ATPase operon (TP0527 to TP0533) was transcribed at low levels. These findings may reflect the dependence of T. pallidum on the glycolytic pathway for energy production, on the maintenance of optimal redox conditions related to its microaerophilism, and on chemotaxis to localize to optimal tissue environments (17).
Most of the prominent proteins recognized by two-dimensional gel electrophoresis (16) were shown to be encoded by genes with high degrees of gene expression (Fig. 3). Two relatively minor spots corresponded to an average (1.01) gene transcription level (TP1016 [tpn39b], coding for a basic membrane protein) and to a below-average level (0.41) (TP0545 [mglB-1], coding for a periplasmic galactose binding protein). Overall, there appears to be a strong correlation between the level of gene transcripts and protein abundance in T. pallidum.
T. pallidum genes with the lowest cDNA/DNA ratios, coding for 62 proteins with assigned functions, are shown in Table 2. Genes encoding flagellar biosynthesis and other components of the cell envelope (12 genes), components of DNA metabolism (11 genes, e.g., genes encoding subunits of DNA polymerases I and III), and metabolic and other functions (11 genes) constituted the major groups. Eight genes for transporters (with specificities for oligopeptides, K+, thiamine, carnitine, and sugars), eight genes encoding various cellular processes (putative hemolysins and cell division proteins), four genes involved in transcription and translation, and four regulators were found within this group. In addition, four genes with internal authentic frame shifts were detected. Possible reasons for these low transcription levels range from constitutively weak promoters to tightly regulated genes under in vivo conditions. The latter explanation might apply to at least some of the genes encoding transporters, components of the cell envelope, and proteins for DNA metabolism. In contrast to genes encoding components of periplasmic flagella (14) (flaB-1, flaB-2, flaB-3, and flaA), which were the most highly expressed genes during infection, low transcription levels were observed for some biosynthetic flagellar genes (fliQ, fliR, flhA, and flhB). These last genes encode proteins with significant homologies to bacterial proteins associated with the flagellar export apparatus (25). The remaining 38 ORFs encoding hypothetical proteins were localized along the whole chromosome, with few clusters. The average ORF length within this group (1,276 bp) was considerably longer than the corresponding average length of ORFs encoding hypothetical proteins with high transcript levels, suggesting that these ORFs represent weakly transcribed genes. Investigations of changes in gene expression by more sensitive methods for these genes in T. pallidum grown under different conditions (e.g., in time-limited in vitro cultures) will be needed to detect changes induced by environmental changes and to unravel the regulatory mechanisms.
A set of genes encoding possible virulence factors of T. pallidum was predicted and identified by Weinstock et al. (30). The corresponding expression levels of these genes are shown in Table 3. Since the treponemes investigated for this study were isolated from rabbit testes, expression profiling showed gene expression during infection of an animal host. The most highly transcribed genes code for membrane or surface-exposed proteins, and a majority of them are known antigens. Genes encoding putative hemolysins and regulators and genes involved in polysaccharide biosynthesis and other functions were found to be expressed at moderate or relatively low levels. Similarly, the T. pallidum repeat (tpr) genes were found to be expressed at relatively low levels. The tpr genes encode paralogous proteins with sequence similarity to the major surface protein (Msp) of Treponema denticola (12). This multigene family is found only in the genus Treponema, induces an antibody response during infection, and exhibits heterogeneity both within and between the T. pallidum subspecies and strains examined (4-6); therefore, it is thought that the Tpr proteins may be involved in pathogenesis and/or immune evasion. Our results indicate that the tpr genes are expressed at relatively low levels, consistent with the difficulties encountered in detecting Tpr protein expression in T. pallidum (J. M. Hardham and S. J. Norris, unpublished data).
To determine the accuracy of the microarray data and to test their consistency, we performed real-time PCR amplifications of selected genes. Eighty-four genes with varied levels of expression were tested by real-time PCRs using the same RNA preparations as those used for the microarray experiments. Primers for real-time PCR amplifications were designed with the Primer3 program (http://www-genome.wi.mit.edu/cgi-bin/primer/primer3_www.cgi/), and the lengths of PCR products were set to be between 150 and 250 bp. cDNA samples were prepared as described for microarray experiments, without the addition of Cy-labeled nucleotides. cDNAs were used at a 10-ng/μl concentration with primers (10 pmol of each/reaction) and SYBR green master mix (Applied Biosystems). The ABI 7900 sequence detection system (Applied Biosystems) was used according to the manufacturer's instructions. The relative quantitation method (ΔΔCT) was used to evaluate the real-time PCR data. All amplifications were performed at least three times. For each amplification, the resulting ΔΔCT values were normalized to the corresponding value obtained with the control gene TP0426 (encoding the V-type ATPase subunit A-1) and were log transformed. Average log ratios from three independent experiments were computed and correlated to the log ratios obtained by the DNA microarray method. For microarray experiments, TP0426 had a cDNA/DNA signal ratio of close to 1. A similar normalization was also applied to relevant ratios obtained for microarray experiments. A high degree of correlation (r = 0.94) was achieved, with a relatively narrow 95% confidence interval (± 0.46). The real-time reverse transcription-PCR (RT-PCR) data for the genes shown in Tables 1 to 3 can be compared to the ratios obtained by the use of DNA microarrays.
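The comparison of real-time PCR and microarray data can be illustrated with a simplified Python sketch. It expresses each target relative to a reference gene (TP0426 in this study) from threshold cycle (CT) values and then computes a Pearson correlation between two sets of log ratios. The sketch assumes perfect doubling of product per cycle and uses base-10 logarithms; it is a reference-gene simplification and not the full ΔΔCT procedure or the instrument software used here. The function names are hypothetical.

```python
import math
from statistics import mean


def log_relative_expression(ct_gene, ct_reference):
    """log10 abundance of a target relative to the reference gene.

    Assumes the amount of product doubles every cycle, so the ratio is
    2 ** -(ct_gene - ct_reference).
    """
    delta_ct = ct_gene - ct_reference
    return -delta_ct * math.log10(2)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = mean(xs), mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / math.sqrt(var_x * var_y)
```

A target that crosses the threshold two cycles earlier than the reference is estimated to be fourfold more abundant (log10 of about 0.60). Passing the per-gene real-time PCR log ratios and the matching microarray log ratios to `pearson_r` gives the kind of correlation coefficient reported above.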
A correlation between DNA microarray and real-time RT-PCR data for tpr genes (Table 3) suggests that expression ratios of individual paralogous genes sharing regions of high nucleotide homology can be altered in microarray experiments because of the cross-hybridization of cDNAs of other paralogous genes. It thus appears that real-time PCR amplifications can detect wider differences in gene expression and also render more accurate results for genes belonging to the same paralogous family. These two differences might be explained by the principal differences between the two methods, i.e., their sequence specificities and effective concentration ranges. Microarray expression analysis as performed in this study detects transcripts along their entire length and thus may be subject to the confounding effects of cross-hybridization. In contrast, the quantitative RT-PCR method only detects gene transcripts recognized by primers (and the distances between primer sites) and thus is likely to be more specific for paralogous genes. Microarrays also have a more limited working range restricted to ~2 to 3 orders of magnitude due to the relatively narrow range between the signal detection threshold and signal saturation (1).
This study represents the first global analysis of gene transcription in T. pallidum by utilizing organisms extracted from rabbit testicular tissue at the height of infection. As an obligate pathogen of humans, T. pallidum has evolved to occupy a relatively homeostatic environment and hence may have jettisoned most of the genes that are not necessary for interaction with the host and also its capacity to alter gene expression. The apparent lack of a heat shock response (16, 24) and the dramatic effect that temperature has on T. pallidum replication both in vivo and in vitro (8, 26) may exemplify this reduced adaptability. However, given the occurrence of latent infections (in both humans and experimentally infected rabbits) and late gummatous, neurologic, and cardiovascular forms of the disease in humans, it is likely that gene expression is altered in these different tissue environments. This question could be approached to some extent by examining the transcription levels of T. pallidum genes during intradermal rather than intratesticular infections of rabbits, although fewer organisms can be recovered from the resulting dermal lesions. Further analyses of the genes expressed at high levels during infection may provide insight into the potential roles of their products in the pathogenesis of syphilis.
This work was supported by grants from the U.S. Public Health Service to G.M.W. (R01 DE12488 and R01 DE13759), S.J.N. (R01 AI49252), and T.P. (AI45842) and by grants of the Grant Agency of the Czech Republic (310/04/0021) and the Ministry of Health of the Czech Republic (NI7351-3/2003) to D.S.
†Supplemental material for this article may be found at http://jb.asm.org/.