Members of the Streptococcus bovis group are important causes of endocarditis. However, factors associated with their pathogenicity, such as adhesins, remain uncharacterized. We recently demonstrated that endocarditis-derived Streptococcus gallolyticus subsp. gallolyticus isolates frequently adhere to extracellular matrix (ECM) proteins. Here, we generated a draft genome sequence of an ECM protein-adherent S. gallolyticus subsp. gallolyticus strain and found, by genome-wide analyses, 11 predicted LPXTG-type cell wall-anchored proteins with characteristics of MSCRAMMs, including a modular architecture of domains predicted to adopt immunoglobulin (Ig)-like folding. A recombinant segment of one of these, Acb, showed high-affinity binding to immobilized collagen, and cell surface expression of Acb correlated with the presence of acb and collagen adherence of isolates. Three of the 11 proteins have similarities to major pilus subunits and are organized in separate clusters, each including a second Ig-fold-containing MSCRAMM and a class C sortase, suggesting that the sequenced strain encodes three distinct types of pili. Reverse transcription-PCR demonstrated that all three genes of one cluster, acb-sbs7-srtC1, are cotranscribed, consistent with pilus operons of other gram-positive bacteria. Further analysis detected expression of all 11 genes in cells grown to mid to late exponential growth phases. Wide distribution of 9 of the 11 genes was observed among S. gallolyticus subsp. gallolyticus isolates with fewer genes present in other S. bovis group species/subspecies. The high prevalence of genes encoding putative MSCRAMMs and pili, including a collagen-binding MSCRAMM, among S. gallolyticus subsp. gallolyticus isolates may play an important role in the predominance of this subspecies in S. bovis endocarditis.
Members of the Streptococcus bovis group have long been regarded as opportunistic pathogens and are important causes of endocarditis in humans. These organisms are the most frequently encountered clinical isolates among group D streptococci and the second-most important streptococcal cause of endocarditis after “oral streptococci” (4, 15, 55) which, however, are now classified into many different species and even different genera (16, 28). S. bovis group isolates are considered to be commensals in the intestinal tract, but they have been found only in ~2.5% to 15% of the human population (9, 24, 38, 58). Surprisingly, there is a clear correlation—for unknown reasons—between the isolation of S. bovis group organisms as the causative agent of endocarditis and the presence of cancerous lesions and polyps in the intestinal tracts of patients (3, 20, 24, 57), with the first suggestion of this association dating back to 1951 (36). Among patients with carcinoma of the colon, an increased S. bovis carriage rate of 56% was found (24). Whether this increased occurrence is a cause or effect of colon cancer is unclear. However, the association of S. bovis bacteremia or endocarditis with colon cancer raises the intriguing hypothesis that premalignant and malignant lesions in the intestinal tract, perhaps coupled with intestinal overgrowth of S. bovis, could facilitate translocation of this organism through the disrupted mucosal barrier and provide access to systemic circulation.
The classification of the heterogeneous S. bovis group has seen several revisions in recent years. Isolates from human infections were previously divided into three biotypes, designated as biotype I, biotype II/1, and biotype II/2. The majority of S. bovis group endocarditis isolates have been identified as biotype I, which was recently reclassified as Streptococcus gallolyticus subsp. gallolyticus (59). Isolates belonging to the genetically related biotype II/2 were assigned the classification Streptococcus gallolyticus subsp. pasteurianus, and the taxonomically synonymous Streptococcus macedonicus and Streptococcus waius (59) were combined into a third subspecies, Streptococcus gallolyticus subsp. macedonicus. The least frequently endocarditis-associated biotype II/1 strains were separated to form a new species, Streptococcus infantarius (60).
A number of surface proteins have been well characterized in gram-positive bacteria, including other streptococci, enterococci, and staphylococci. Many of these belong to the “microbial surface component recognizing adhesive matrix molecules” (MSCRAMM) family (45). Recent studies have demonstrated that MSCRAMMs from staphylococci and enterococci contain an N-terminal nonrepeated region called the A domain, which typically represents the primary ligand binding region and is usually composed of two to three subdomains, each of which adopts an immunoglobulin (Ig)-like fold. Structural and biochemical studies with the fibrinogen-binding MSCRAMMs ClfA from Staphylococcus aureus and SdrG from Staphylococcus epidermidis have led to a so-called “dock, lock, and latch” ligand binding model, in which two of the subdomains containing Ig-like folds cooperatively bind a linear fibrinogen peptide ligand (50). A variation of this binding model, the “collagen hug,” includes an extended loop between the two subdomains, which facilitates binding of the large collagen triple helix by the collagen-binding MSCRAMMs Cna of S. aureus and Ace of Enterococcus faecalis (31, 73). The A domain is typically followed by a variable number of repeats, which may function as ligand binding domains in some cases (14). At the C-terminal end, MSCRAMMs contain a well-conserved cell wall-anchoring (CWA) domain, which consists of an LPXTG-like motif, a hydrophobic transmembrane segment, and a stretch of positively charged amino acids (aa). Attachment of the CWA domain to the cell wall peptidoglycan is catalyzed by a family of cell surface transpeptidases called sortases, which covalently couple the threonine residue in the LPXTG motif to peptidoglycan (35).
While the genes encoding the MSCRAMM proteins described above are usually organized as individual genes, several gene clusters encoding similar CWA proteins with predicted Ig-folded structures have recently been described. These clusters often contain an associated class C sortase, which tethers the CWA proteins into extended filamentous structures, called pili, on the cell surface (69). During pilus elongation, it appears that the major subunits are linked covalently to each other by sortase through the LPXTG motif and a lysine residue, which is conserved in a “pilin” motif of the major subunits (69). Pili have recently been characterized in a number of gram-positive bacteria, including other streptococci (68), and we reported on the Ebp pilus operon of E. faecalis, which is important for biofilm formation as well as experimental endocarditis and urinary tract infections (41, 65), and the recently identified pilus clusters of Enterococcus faecium (17, 62). In Streptococcus pyogenes, a genomic island known as the FCT region (fibronectin-binding protein, collagen-binding protein, and T antigen) includes a pilus cluster with the T antigen as the major pilus subunit and the collagen-binding MSCRAMM Cpa as an accessory pilus subunit. These pilus proteins, as well as those of Streptococcus agalactiae and Streptococcus pneumoniae, have also been shown to be involved in adhesion to other extracellular matrix (ECM) proteins (19) and host cells (1, 2, 25, 26, 32, 33, 43).
We have recently demonstrated that human endocarditis isolates of the S. bovis group frequently adhere to individual proteins of the host ECM (63). To our knowledge, however, no published data are available on adhesins of the S. bovis group, regardless of their long-recognized and important association with endocarditis in humans. In the current study, an ECM protein-adherent S. bovis group endocarditis isolate (S. gallolyticus subsp. gallolyticus) was selected for whole-genome sequencing and subsequent genome-wide analysis for genes encoding putative MSCRAMM and pilus family proteins. Expression of the identified genes in the sequenced strain and their prevalence among other S. bovis group endocarditis isolates were then assessed. One of the genes, acb (encoding adhesin to collagen of the S. bovis group), was selected for further functional and phenotypic characterization.
Two collections of previously published S. bovis group isolates obtained from blood cultures of endocarditis patients in two U.S. hospitals were used in this study. The first collection included 17 isolates, consisting of 15 S. gallolyticus subsp. gallolyticus isolates, 1 S. gallolyticus subsp. pasteurianus isolate, and 1 Streptococcus infantarius subsp. coli isolate, and they were originally identified as S. bovis biotype I, II/1, or II/2 (63, 67). The second collection included 13 S. bovis group isolates, consisting of S. bovis biotype I and II/2 and S. macedonicus (18), and also included 7 other S. bovis group isolates (J. M. Steckelberg, unpublished data). To adopt the recently revised S. bovis group classification, subspecies-level identification of isolates in the second collection was performed by colony hybridizations using a biotype I-specific probe and 16S rRNA gene sequencing of selected isolates, as described previously (63), as well as a PCR-based identification method (67).
Collagen types I (rat tail), IV (human placenta), and V (human placenta) were purchased from Sigma Chemical Co. (St. Louis, MO).
Genomic DNA from S. gallolyticus subsp. gallolyticus strain TX20005 (designated as S. bovis biotype I strain 2703 elsewhere) was isolated using cetyl-trimethylammonium bromide (CTAB), as described previously (72). Cell walls of cells grown overnight were weakened with mutanolysin and lysozyme before cell lysis. DNA sequencing was performed using the 454 Life Sciences pyrosequencing strategy (34). Read-pair information (i.e., paired reads generated from the two ends of the same DNA fragments but assembled in different contigs) was used to create higher-order scaffolds of contigs.
Open reading frames (ORFs) in the unannotated contigs were predicted with Glimmer. Initially, we performed homology searches in the sequenced S. gallolyticus subsp. gallolyticus strain TX20005 genome with a locally installed stand-alone BLAST (ftp://ftp.ncbi.nih.gov/blast/) against an extensive database of over 90 previously identified or putative adhesins from both gram-positive and gram-negative bacteria. As a complementary approach to locate other potential cell surface adhesin-encoding genes, all ORFs longer than 70 amino acid residues were searched for the presence of an expanded LPXTG motif, [LYF]PX[TSA][GANS], within 60 C-terminal amino acid residues. This sequence covers over 95% of known sequence variations within this motif. For maximal inclusion of potential CWA proteins, we also included [LYFPSIV][PGSA]X[TSA][GANS], which is based on the variation found by Boekhorst et al. (6) and the Pfam database of LPXTG motifs (Pfam entry, Gram_pos_anchor; accession number PF00746). To include other known sortase-dependent cell wall sorting signals in our analysis, we used an extended NPQTN-like motif N[PSA][QK]T[NA] as a third search string. The retrieved ORFs were then analyzed for the presence of a hydrophobic transmembrane domain (TMHMM Server v2.0 [www.cbs.dtu.dk/services/TMHMM-2.0/]) and a short stretch of positively charged residues following the anchoring motif. Similarity searches were carried out with PSI-BLAST and ClustalW multiple alignments. Secondary and tertiary structures were predicted by using the 3D-PSSM (3D Position Specific Scoring Matrix) and PHYRE (Protein Homology/analogY Recognition Engine) fold recognition servers (http://www.sbg.bio.ic.ac.uk/~3dpssm/; http://www.sbg.bio.ic.ac.uk/phyre/html/index.html). The presence of an N-terminal signal peptide sequence was analyzed by the SignalP server (www.cbs.dtu.dk/services/SignalP/). 
Conserved domains were analyzed with reverse position-specific BLAST (http://www.ncbi.nlm.nih.gov/blast/Blast.cgi) and with similarity searches against the InterPro database of protein families, domains, and repeats (http://www.ebi.ac.uk/interpro/). Amino acid sequence repeats were identified with the RADAR algorithm (http://www.ebi.ac.uk/Radar) and further refined with multiple sequence alignments.
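The motif screen described above can be sketched in a few lines of Python. This is only an illustration of the search logic under stated assumptions, not the pipeline used in the study: the three patterns and the two thresholds (ORFs longer than 70 residues, motif within the last 60 residues) are taken from the text, whereas the function name, the dictionary keys, and the toy sequence are hypothetical. The follow-up checks for a transmembrane segment and a positively charged tail (done with TMHMM in the study) are not reproduced here.

```python
import re

# Search strings from the text, with X written as "." (any residue).
# Lookahead groups are used so that overlapping matches are also reported.
MOTIFS = {
    "LPXTG_expanded": re.compile(r"(?=([LYF]P.[TSA][GANS]))"),
    "LPXTG_relaxed": re.compile(r"(?=([LYFPSIV][PGSA].[TSA][GANS]))"),
    "NPQTN_like": re.compile(r"(?=(N[PSA][QK]T[NA]))"),
}

MIN_ORF_LENGTH = 70      # only ORFs longer than 70 residues are screened
C_TERMINAL_WINDOW = 60   # the motif must lie within the last 60 residues


def find_sorting_signals(protein):
    """Return (motif name, 0-based start, matched text) for each hit
    that lies entirely within the C-terminal window of the protein."""
    if len(protein) <= MIN_ORF_LENGTH:
        return []
    offset = len(protein) - C_TERMINAL_WINDOW
    tail = protein[offset:]
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(tail):
            hits.append((name, offset + match.start(), match.group(1)))
    return hits


# Hypothetical toy sequence (not a real ORF): an LPKTG motif followed by
# an uncharged stretch and a short positively charged tail.
toy = "M" + "A" * 80 + "LPKTG" + "A" * 20 + "KRRK"
for hit in find_sorting_signals(toy):
    print(hit)
```

Because the relaxed pattern is a superset of the expanded one, a canonical LPXTG motif is reported under both names; in practice the hits would be merged by position before the downstream filters are applied.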
The DNA region encoding aa 35 to 338 (predicted A region consisting of N1 and N2 subdomains) of Acb was amplified from genomic DNA of strain TX20005 using the primers listed (see Table S1 in the supplemental material) and cloned into the expression vector pQE30 (Qiagen) to obtain pTEX21000. The cloned acb sequence was confirmed by sequencing. The corresponding expressed Acb segment was designated as rAcb35, based on its calculated molecular mass of 35,118. Plasmid pTEX21000 was then electroporated into Escherichia coli XL1 Blue, and expressed rAcb35 containing an N-terminal His6 tag was purified from lysed cells using nickel-affinity chromatography under native conditions, as described previously (40, 64). Protein concentrations were determined by absorption spectroscopy at 280 nm using a calculated molar absorption coefficient value of 39,880 M⁻¹ cm⁻¹. The molecular mass of purified rAcb35 was determined using matrix-assisted laser desorption ionization-time of flight (MALDI-TOF) mass spectrometry from a protein sample dialyzed into H2O.
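As a worked illustration of the concentration measurement, the Beer-Lambert relation A = ε · c · l can be applied with the coefficient and molecular mass given above. The absorbance reading of 0.5 and the 1-cm path length in this sketch are hypothetical values chosen for the example; they are not data from the study.

```python
# Values stated in the text for rAcb35.
EPSILON_280 = 39_880.0      # molar absorption coefficient, M^-1 cm^-1
MOLECULAR_MASS = 35_118.0   # calculated molecular mass, g/mol


def concentration_from_a280(absorbance, path_cm=1.0):
    """Convert an A280 reading to (molar concentration, mg/ml)
    using the Beer-Lambert law, A = epsilon * c * l."""
    molar = absorbance / (EPSILON_280 * path_cm)
    mg_per_ml = molar * MOLECULAR_MASS  # g/liter is numerically equal to mg/ml
    return molar, mg_per_ml


# Hypothetical reading: A280 = 0.5 in a 1-cm cuvette.
molar, mg_per_ml = concentration_from_a280(0.5)
print(f"{molar * 1e6:.2f} uM, {mg_per_ml:.2f} mg/ml")
```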
Binding of the recombinant His6-tagged rAcb35 to collagen types I, IV, and V was tested using a previously described assay with minor modifications (39, 62). In brief, 1 μg of each ECM protein was coated in 100 μl phosphate-buffered saline in Immulon 2HB (Thermo Scientific) 96-well microplate wells. Wells were incubated with various concentrations of rAcb35, and bound His6-tagged protein was detected with anti-His6 monoclonal antibody (GE Healthcare), followed by alkaline phosphatase-conjugated anti-mouse antibody (Bio-Rad). p-Nitrophenyl phosphate (Sigma) was used for signal detection.
Polyclonal antiserum against rAcb35 was generated in rats, and the titer was determined using an enzyme-linked immunosorbent assay (ELISA). Surface expression of Acb by S. gallolyticus subsp. gallolyticus isolates grown in brain heart infusion (BHI) broth for 1, 6, and 16 h from an initial inoculum of an optical density at 600 nm (OD600) of 0.01 was detected by a whole-cell ELISA (42, 62) using a 1:1,000 dilution of anti-rAcb35 antiserum, followed by 1:5,000-diluted horseradish peroxidase (HRP)-conjugated goat anti-rat IgG (Jackson Immunochemicals). Preimmune serum was used as a negative control.
To detect Acb surface expression by flow cytometry, bacteria grown in BHI to late logarithmic growth phase from an inoculum of OD600 of 0.01 (3 h) were labeled using a 1:1,000 dilution of rat antiserum raised against rAcb35 or preimmune serum followed by 1:400-diluted goat anti-rat F(ab′)2 fragment-specific IgG conjugated with R-phycoerythrin, as described previously (23, 62). Cells were then fixed in 1% paraformaldehyde in phosphate-buffered saline and analyzed with a Coulter EPICSXL AB6064 flow cytometer (Beckman Coulter) and System II software.
Total RNA was isolated from S. gallolyticus subsp. gallolyticus strain TX20005 grown in BHI to an OD600 of 0.3, 0.6, and 1.0 using the RNAprotect bacteria reagent and RNeasy mini kit (Qiagen) according to the manufacturer's recommendations and treated twice with 20 U RQ1 DNase (Promega) for 30 min at 37°C. DNase was removed using the RNeasy mini kit and purification protocol (Qiagen). Total RNA was then reverse transcribed with specific primers (see Table S1 in the supplemental material) using the SuperScript One-Step reverse transcription-PCR (RT-PCR) with the Platinum Taq kit (Invitrogen) according to the manufacturer's instructions. As an internal control for each RNA isolation, a 513-bp fragment of the gyrase A gene of TX20005, which we identified in the sequenced TX20005 genome, was amplified using the Sb gyrA F and Sb gyrA R primers (see Table S1 in the supplemental material). Reactions without RT were used as controls to verify the lack of DNA contamination in the total RNA preparation.
Preparation of colony lysate membranes of S. bovis group isolates (with the sequenced strain TX20005 as a positive control and E. faecalis strain OG1RF as a negative control) and hybridization under high-stringency conditions were performed as described previously (11, 66). DNA probes of 400 to 500 bp in length were obtained by PCR amplification of S. gallolyticus subsp. gallolyticus TX20005 genomic DNA using the primers listed (see Table S1 in the supplemental material) and were radiolabeled by using the RadPrime DNA labeling system according to the manufacturer's recommendations (Invitrogen, Carlsbad, CA).
GenBank accession numbers for the nucleotide sequences of the 11 MSCRAMM-like proteins and five other predicted CWA proteins reported here for S. gallolyticus subsp. gallolyticus strain TX20005 are GQ497715, GQ497716, GQ497717, GQ497718, GQ497719, GQ497720, GQ497721, GQ497722, GQ497723, GQ497724, GQ497725, GQ497726, GQ497727, GQ497728, GQ497729, and GQ497730.
Selection of one of the S. gallolyticus subsp. gallolyticus isolates, strain TX20005, for whole-genome sequencing was based on its adherence to ECM proteins and identical pulsed-field gel electrophoresis pattern with two other isolates in our collection, suggesting it may represent a widespread clonal type among clinical isolates (63). The current draft genome sequence of TX20005 consists of 77 unassembled contigs with 37.8 times the coverage of the estimated 2.3-Mb-sized genome (X. Qin, G. M. Weinstock, B. E. Murray, et al., unpublished results). Hence, we considered this sequence appropriate for the in silico analysis of this study.
Our first approach for identifying potential adhesin-encoding genes in the TX20005 genome focused on BLAST similarity searches to locate ORFs which shared significant sequence similarity to a collection of previously characterized or putative MSCRAMM-type and other adhesins, including those from other streptococci. As a result, two such ORFs were identified and named Acb (for adhesin to collagen of the S. bovis group) (see below) and Sbs16 (for S. bovis group surface protein). Acb showed the most similarity (52 to 71%) to the A domains of the collagen binding MSCRAMMs Cne from Streptococcus equi (27), Cna from S. aureus (52), and Acm from E. faecium (42) (Table 1). The E. faecalis MSCRAMM EF1269 (64) was identified as the closest homolog of Sbs16 (54% similarity), followed by the fibrinogen binding A domain regions of the staphylococcal MSCRAMMs ClfA, ClfB, FnBpB, and SdrG.
To identify CWA proteins that are more distantly related over the rest of their sequences and, thus, might not have been detected by the search described above, we searched for the presence of a CWA domain in the sequenced genome and found 15 ORFs (named Sbs proteins, similar to Sbs16 described above) with predicted LPXTG-like motif cell-wall sorting signals. Thirteen of these contained a predicted N-terminal signal peptide for Sec-dependent secretion (with a cutoff level of ≥0.73 using SignalP), while no signal peptide could be detected in two ORFs (Sbs4 and Sbs5). As expected, these 15 ORFs included Acb described above. Sbs16, however, was not found by this search, likely due to the location of sbs16 at the 3′ end of a yet-unassembled contig, leading to a prediction of a prematurely truncated protein and a missing CWA domain.
With the exception of Acb and Sbs16, extensive similarity was not found in the other predicted CWA proteins to the A domains of known MSCRAMMs from staphylococci and enterococci or other adhesins, including those from other streptococci. However, structural analysis (61, 62, 64) using two complementing fold recognition programs, 3D-PSSM and PHYRE, revealed that 11 of the CWA proteins (Table 1) contained regions predicted to fold into Ig-like structures with high probability with an estimated content of three to six repeated Ig-like-folded modules. Based on the similar domain organization and predicted structure with repeated Ig-folded modules, as in known MSCRAMMs, we consider these 11 CWA proteins as good candidates for an adhesive function. The number of potential MSCRAMM-encoding genes in the S. gallolyticus subsp. gallolyticus genome is in a range similar to that previously identified in the genomes of other gram-positive pathogens (7, 54, 62, 64). While the current draft sequence assembly is predicted to represent a high coverage of the genome and include the majority of the ORFs of this organism (2,418 predicted ORFs), we cannot rule out the possibility that completion of the remaining sequence gaps could lead to identification of additional protein(s) with MSCRAMM-like features.
The predicted N-terminal signal peptides of the 11 MSCRAMM-like proteins range from 25 aa to 47 aa and are followed by nonrepeated regions (called A regions in MSCRAMMs) that show large size variation (ranging from 179 aa to 1,579 aa in length). Five of the proteins also contain C-terminal repeats preceding the CWA domain (Fig. 1A); these are typically present in MSCRAMMs (called B repeats). Although only Acb and Sbs16 show extensive similarity to the A domains of staphylococcal and enterococcal MSCRAMMs, our subsequent sequence analyses revealed that the remaining nine proteins contain regions with some homology to regions of other CWA proteins containing Ig-folded domains. As summarized in Table 1, these include the streptococcal antigen I/II family (Sbs2), which consists mostly of CWA proteins and is implicated in binding to collagen, fibronectin, fibrinogen, and laminin, formation of biofilm, as well as internalization into host cells (5, 44, 46, 47); the Ig-folded B repeats of Cna and the corresponding repeats of the collagen-binding MSCRAMM CbpA of Arcanobacterium pyogenes (Sbs12 and Sbs13); the internalin family proteins of Listeria spp. (Sbs15), a large group of well characterized mostly CWA proteins that mediate internalization into epithelial cells (12, 29); and pilus subunit proteins of other streptococci and enterococci (Sbs7, Sbs11, and Sbs14) (see below). Interestingly, two of the proteins, Sbs6 and Sbs10, are apparent CWA enzymes (a protease and glycosyl hydrolase, respectively) with predicted C-terminal Ig-folded domains. While we found two proteins, Acb and Sbs16, with similarity to MSCRAMMs of enterococci and staphylococci and three predicted major pilus subunit proteins, it is noteworthy that our genome-wide analyses did not identify homologs of MSCRAMM-type adhesins from major pathogenic streptococci, such as the fibronectin-binding protein F or collagen-binding Cpa. Thus, the overall content of MSCRAMM-like proteins in S. gallolyticus subsp. gallolyticus appears most similar to that of enterococci (previously included in group D streptococci together with the S. bovis group) (62, 64).
Of note, our genome search also identified an ORF (not included in the Sbs proteins numbered in this study), showing 86 to 89% similarity to a family of streptococcal fibronectin binding adhesins, FbpA of Streptococcus gordonii, Fbp54 of S. pyogenes, and PavA of S. pneumoniae (10, 13, 21); their homologs are also present in enterococci and staphylococci. However, these proteins lack an identifiable signal peptide region, any known CWA domain, and are not predicted to be Ig folded. Hence, this ORF was not included in the current study.
Further sequence and fold analyses predicted that the nonrepeated A region of Acb is likely to consist of two subdomains, designated as N1 and N2, which are structurally homologous to those found in previously characterized MSCRAMMs (31, 50, 73). The predicted N1N2 region of Acb also contains a sequence similar to the conserved signature motif of MSCRAMMs, TYTFTDYVD, and a putative latch sequence, which is important for ligand binding of Cna, Ace, and SdrG (8, 31, 50, 73). Both of these motifs in Acb aligned perfectly with those of known MSCRAMMs in multiple alignments. To determine whether Acb is involved in binding to ECM proteins, we expressed a segment consisting of the complete N1N2 region as an N-terminal His6-tagged protein. The purified rAcb35 protein migrated on sodium dodecyl sulfate-polyacrylamide gel electrophoresis as expected and without the presence of other protein contaminants or degradation products (Fig. 2). MALDI-TOF analysis confirmed the expected size of the protein (calculated Mw of 35,118 versus MALDI-TOF Mw of 35,063) and lack of posttranslational proteolytic processing. As seen in Fig. 2, rAcb35 showed dose-dependent and saturable binding to collagen types I, IV, and V in a solid-phase ELISA-type ligand binding assay, while no binding was detected to bovine serum albumin. The highest affinity of rAcb35 was observed to collagen type I, with an apparent KD (equilibrium dissociation constant) of 4.5 × 10⁻⁸ M, followed by collagen type IV (KD = 3.2 × 10⁻⁷ M) and collagen type V (KD = 5.3 × 10⁻⁷ M). These estimated affinity values for collagen are in a range similar to or lower than those reported previously for Cna, Acm, Ace, or CbpA (42, 48, 51, 53).
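The text does not state how the apparent KD values were derived from the binding curves. One common approach for dose-dependent, saturable binding is to fit a one-site model, signal = Bmax · [P] / (KD + [P]), where [P] is the concentration of added protein. The sketch below assumes that model and uses a simple log-spaced scan over KD (for each candidate KD, the least-squares Bmax has a closed form). The concentrations and the noise-free synthetic signals are hypothetical; only the KD of 4.5 × 10⁻⁸ M used to generate them comes from the text.

```python
def one_site_signal(conc, bmax, kd):
    """Expected signal for one-site saturable binding."""
    return bmax * conc / (kd + conc)


def fit_one_site(concs, signals, kd_min=1e-10, kd_max=1e-4, steps=2000):
    """Scan KD on a log-spaced grid and return the (Bmax, KD) pair
    that minimizes the sum of squared residuals."""
    best = None
    for i in range(steps + 1):
        kd = kd_min * (kd_max / kd_min) ** (i / steps)
        occupancy = [c / (kd + c) for c in concs]
        # For a fixed KD the model is linear in Bmax, so the
        # least-squares Bmax is sum(x*y) / sum(x*x).
        bmax = (sum(x * y for x, y in zip(occupancy, signals))
                / sum(x * x for x in occupancy))
        sse = sum((y - bmax * x) ** 2 for x, y in zip(occupancy, signals))
        if best is None or sse < best[0]:
            best = (sse, bmax, kd)
    return best[1], best[2]


# Hypothetical protein concentrations (M) and synthetic, noise-free signals
# generated with the collagen type I KD reported in the text.
concs = [1e-9, 5e-9, 1e-8, 5e-8, 1e-7, 5e-7, 1e-6]
signals = [one_site_signal(c, 1.0, 4.5e-8) for c in concs]
bmax_fit, kd_fit = fit_one_site(concs, signals)
print(f"Bmax = {bmax_fit:.3f}, apparent KD = {kd_fit:.2e} M")
```

Under this model the apparent KD equals the protein concentration that gives half-maximal signal, which is why a lower KD corresponds to a higher affinity.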
To determine if Acb is expressed on the cell surface of S. gallolyticus subsp. gallolyticus, we initially analyzed eight isolates (including the sequenced strain TX20005) grown in BHI to three time points (1 h, 6 h, and 16 h) using whole-cell ELISA. Surface expression of Acb was detected in six of the isolates, which also carry the acb gene (TX20001, TX20005, TX20007, TX20008, TX20013, and TX20014), while no Acb expression was observed in two strains that lack acb (TX20004 and TX20010) (see Fig. S1 in the supplemental material; summarized in Fig. 3B). Our subsequent quantification of Acb surface expression by flow cytometry with cells grown to late exponential growth phase detected generally similar expression levels with the eight isolates, thus confirming our observations with whole-cell ELISA (Fig. 3A). Among the six acb-positive isolates, the order of highest to lowest mean fluorescence intensities was TX20005, TX20001, TX20008, TX20007, TX20013, and TX20014, while the proportion of positive cells ranged from 85.8% to 29.9% among these isolates. No significant labeling of cells was detected with the acb-negative isolates TX20004 and TX20010. Although these results are generally in good agreement with the whole-cell ELISA described above, TX20005 showed lower Acb surface expression levels when assessed by whole-cell ELISA (cells immobilized onto wells) than by flow cytometry (cells in liquid phase). This may be due to less-efficient coating of TX20005 onto the ELISA wells or Acb being less accessible on the surface of immobilized cells of TX20005 than on those of other isolates. Taken together, these results suggest that Acb expression is common during in vitro growth among S. gallolyticus subsp. gallolyticus isolates. Furthermore, a clear correlation was found when cell surface expression of Acb by the eight isolates and presence of acb in their genomes (see below) were compared with our previously published adherence of these S. gallolyticus isolates to collagen (Fig. 3B) (63), suggesting that Acb may mediate the observed adherence of S. gallolyticus subsp. gallolyticus to collagen. On the other hand, we recently found two other acb-negative isolates which nevertheless exhibited adherence to collagen type I (63), suggesting that another collagen-binding determinant is also likely to exist in some of the isolates.
Three of the 11 predicted CWA proteins with Ig-like folds, Sbs7, Sbs11, and Sbs14, show homology to major pilus subunit proteins that have been identified recently in several gram-positive species (Table 1). Each of the three ORFs is organized in a separate gene cluster that includes a second gene encoding an Ig-folded CWA protein (Acb, Sbs12, or Sbs15, respectively) preceding the predicted major pilus subunit (Fig. 1B). Immediately downstream of sbs7, sbs11, and sbs14, there is an ORF in each of the three clusters encoding a predicted class C sortase, which is thought to be specific for polymerization of pilus subunits into pili.
Most previously characterized pilus gene loci in other gram-positive cocci, such as other streptococci and enterococci, consist of three structural pilus genes, one coding for the major pilus subunit and two for minor or accessory subunits, in addition to one or more adjacent sortase-encoding gene(s). The presence of three putative pilus gene clusters in S. gallolyticus subsp. gallolyticus consisting of one major pilus subunit and another MSCRAMM-like protein, which may function as an accessory pilus protein, has not been reported before in other gram-positive bacteria. However, some subtypes of the heterogeneous FCT region pili found in S. pyogenes (e.g., FCT-1 of serotype M6) (37, 49) consist of only one accessory pilus protein, in addition to PrtF1, which appeared not to be assembled into these pili (37). The recently described second pilus type of S. pneumoniae (PI-2) is encoded by a two-gene cluster; however, the predicted accessory subunit occurs as a pseudogene, suggesting that this pilus is composed of only the pilus backbone protein (2).
It is noteworthy that the gene coding for the collagen-binding MSCRAMM, Acb, is located in the same gene cluster as the putative major pilus subunit gene, sbs7, whereas its staphylococcal and enterococcal homologs, cna, acm, and ace, all occur as individual genes. This raises an intriguing question of whether Acb, unlike its known counterparts in other gram-positive bacteria, is incorporated into pilus fibers during their assembly instead of or in addition to being anchored to the cell wall peptidoglycan. However, this remains a subject for future studies.
Consistent with our prediction of Ig-like-folded structures in the putative pilus subunit proteins, crystal structures of an accessory pilus subunit GBS52 of S. agalactiae and the major pilus subunit Spy0128 of S. pyogenes were recently shown to consist of two domains, each folding into an Ig-like structure with a topology similar to that found in the crystal structures of the individually organized MSCRAMMs ClfA, SdrG, Cna, and Ace (22, 26). Of note, an electron microscopy analysis published almost a decade ago revealed pilus-like structures on the surface of S. gallolyticus cells, although their structural components were not identified (71), and therefore it is not clear whether the putative pilus proteins identified here were involved in the assembly of these structures.
Taken together, the three gene clusters of S. gallolyticus subsp. gallolyticus resemble known pilus operons in several ways. (i) The predicted ORFs share sequence similarities and predicted Ig-like-folded structures with known pilus subunit proteins. (ii) The corresponding genes are organized in clusters and include an adjacent class C sortase. (iii) A pilin motif, required for polymerization of the subunits during pilus assembly (69, 70), was found in Sbs7, and potential degenerate motifs in Sbs11 and Sbs14 and the E box, which may be important for attachment of accessory subunits to the pilus backbone, are also present in all three proteins, with the central glutamic acid 100% conserved. (iv) Finally, the surrounding regions consist of ORFs homologous to transcriptional regulators, a signal peptidase, DNA-acting enzymes, and transposases, similar to those found around pilus clusters of other gram positive bacteria, where these clusters are often located in genomic islands (30, 37, 56). Therefore, these features suggest that the three gene clusters identified here code for pilus proteins and possibly enable S. gallolyticus subsp. gallolyticus to express three distinct pili on the cell surface.
The genetic organization of the three gene clusters suggests that the two Ig-folded proteins and possibly also the adjacent sortase in each cluster may be cotranscribed. To examine this further, we selected the acb-sbs7-srtC1 cluster for transcriptional analysis. Our search for transcriptional terminators identified a 15-bp inverted repeat 3 bp downstream of srtC1, whereas no such sequences were found between the three individual genes. This inverted repeat is predicted to form a stem-loop structure (free energy, −23.07 kcal mol⁻¹) and is thus likely to function as a ρ-independent transcriptional terminator. We next performed RT-PCR using a collection of internal primer pairs (one for each of the three genes, acb, sbs7, and srtC1) and primer pairs designed to amplify regions in between these three genes. As shown in Fig. 4, the three internal primer pairs amplified fragments of the expected sizes, thus demonstrating that acb, sbs7, and srtC1 are expressed in TX20005 under in vitro growth conditions. Intergenic primers amplified the regions between acb-sbs7 and sbs7-srtC1, while no PCR product was detected with the primer pair designed to amplify the upstream region of acb encompassing the predicted promoter region or with the primer pair that includes the 3′ end of srtC1 and extends beyond the predicted transcriptional terminator. Therefore, our results indicate that the three genes in this cluster produce a single polycistronic mRNA transcript and are thus organized as an operon, a characteristic of known pilus gene clusters (41, 62).
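The kind of terminator search mentioned above can be illustrated with a minimal scan for perfect inverted repeats, that is, a stem sequence followed after a short loop by its reverse complement. This is a sketch only: the tool actually used for the search is not named in the text, the function names, loop-length limits, and toy sequence are hypothetical, and no free-energy calculation is attempted.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem_len, min_loop=3, max_loop=10):
    """Return (start, loop length, stem sequence) for every perfect
    inverted repeat with the given stem length and allowed loop sizes."""
    hits = []
    for start in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem = seq[start:start + stem_len]
        partner = reverse_complement(stem)
        for loop in range(min_loop, max_loop + 1):
            right = start + stem_len + loop
            # A slice running past the end is shorter than the partner,
            # so it can never compare equal.
            if seq[right:right + stem_len] == partner:
                hits.append((start, loop, stem))
    return hits


# Hypothetical toy sequence: a 9-nt stem, a 4-nt loop, and the
# reverse complement of the stem, flanked by T runs.
stem = "GCCGGATAC"
toy = "TTT" + stem + "TTTT" + reverse_complement(stem) + "TTTTTT"
print(find_inverted_repeats(toy, stem_len=9))
```

A real terminator search would also tolerate mismatches in the stem and would score the candidate hairpins by predicted folding energy, as was done for the repeat downstream of srtC1.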
To determine whether, in addition to sbs7 and acb (see above), the remaining nine predicted ORFs represent actively transcribed genes, we performed a transcriptional analysis using RT-PCR and an internal primer pair for each gene. PCR amplification was detected for all 11 genes during early, mid, and late exponential growth phases, although at distinctly variable levels and with sbs2 and sbs10 producing only weak PCR bands, especially sbs2 in early and late exponential growth phases (see Fig. S2 in the supplemental material). Hence, these results indicate that the 11 genes are expressed in S. gallolyticus subsp. gallolyticus TX20005 during several stages of in vitro growth, consistent with the previously observed adherence phenotypes to ECM proteins among S. gallolyticus subsp. gallolyticus isolates (63) and further supporting a possible role for these genes in adherence.
Since the positions of 20 of the 37 S. bovis group isolates in the recently revised classification scheme of the S. bovis group were unclear (59, 60), we first updated their species/subspecies identification to conform to the current criteria using a combination of hybridization, PCR analysis, and 16S rRNA gene sequencing for selected isolates (63, 67). As a result, the 37 S. bovis group isolates were determined to consist of 31 S. gallolyticus subsp. gallolyticus isolates, 4 S. gallolyticus subsp. pasteurianus isolates, 1 S. gallolyticus subsp. macedonicus isolate, and 1 S. infantarius subsp. coli isolate.
DNA hybridizations of the 37 S. bovis group isolates showed that, among the 31 S. gallolyticus subsp. gallolyticus isolates, the majority of the genes (9 of 11) were widely distributed, with each of these 9 genes present in 74% to 100% of the isolates. Three of these nine genes (sbs2 and the predicted pilus cluster sbs14-sbs15) were detected in every S. gallolyticus subsp. gallolyticus isolate tested (Table 2). The remaining two genes, sbs11 and sbs12, which are predicted to form one of the three pilus-encoding clusters, were present in 29% and 26% of the S. gallolyticus subsp. gallolyticus isolates, respectively. The 11 genes were found less often among the other S. bovis group isolates tested, with only 1 to 5 genes present in each of the five isolates belonging to the two other S. gallolyticus subspecies (S. gallolyticus subsp. pasteurianus or macedonicus), while 3 of the genes were present in the genetically more distant S. infantarius subsp. coli isolate.
On average, the 31 S. gallolyticus subsp. gallolyticus isolates carried 8 of the 11 genes tested, while the four S. gallolyticus subsp. pasteurianus isolates carried only 3, and the two remaining isolates, S. gallolyticus subsp. macedonicus and S. infantarius subsp. coli, carried 6 and 3 genes, respectively, further emphasizing the different distribution patterns of the 11 genes within the S. bovis group.
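The two summary statistics used here, the percentage of isolates carrying a given gene and the mean number of genes carried per isolate, can both be derived from a presence/absence table of hybridization calls. The sketch below shows that calculation on a small hypothetical table; the isolate and gene names and the 0/1 calls are invented for illustration and are not the data in Table 2.

```python
from statistics import mean


def gene_prevalence(calls):
    """Percentage of isolates carrying each gene (1 = present, 0 = absent)."""
    genes = sorted({gene for row in calls.values() for gene in row})
    n_isolates = len(calls)
    return {
        gene: 100.0 * sum(row.get(gene, 0) for row in calls.values()) / n_isolates
        for gene in genes
    }


def mean_genes_per_isolate(calls):
    """Average number of genes detected per isolate."""
    return mean(sum(row.values()) for row in calls.values())


# Hypothetical presence/absence calls, NOT the study's hybridization results.
toy_calls = {
    "isolate_1": {"geneA": 1, "geneB": 1, "geneC": 0},
    "isolate_2": {"geneA": 1, "geneB": 0, "geneC": 0},
    "isolate_3": {"geneA": 1, "geneB": 1, "geneC": 1},
    "isolate_4": {"geneA": 1, "geneB": 1, "geneC": 0},
}

prevalence = gene_prevalence(toy_calls)
average = mean_genes_per_isolate(toy_calls)
print(prevalence, average)
```

Grouping the rows by species or subspecies before calling these functions would give the per-group averages of the kind compared in the text.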
Of note, all 11 genes were preserved in the three isolates with identical pulsed-field gel electrophoresis patterns (TX20005, TX20006, and TX20011), unlike in the 11 other nonrelated S. gallolyticus subsp. gallolyticus isolates (63), further indicating that these isolates represent a common or more pathogenic lineage. The preservation of these genes in the majority of S. gallolyticus subsp. gallolyticus isolates derived from endocarditis patients suggests that they may give this organism an advantage in the human host, e.g., by facilitating adherence and colonization of intestinal tissues, such as cancerous cells, translocation into the bloodstream, and/or attachment to heart valves.
In summary, our genome sequencing and in silico analyses of the ECM-adherent S. gallolyticus subsp. gallolyticus strain TX20005 identified 11 proteins with features of MSCRAMMs and pilus proteins: 2 of these are homologous to well-established collagen- and fibrinogen-binding MSCRAMMs of enterococci and staphylococci, and 6 are organized in predicted pilus-encoding gene clusters. We demonstrated that one of these proteins, Acb, binds to collagen with high affinity as a recombinant protein, and we also observed an association between Acb surface expression, presence of the acb gene in the genome, and adherence to collagen by S. gallolyticus subsp. gallolyticus isolates. Furthermore, one of the predicted pilus-encoding clusters was demonstrated to be expressed in S. gallolyticus subsp. gallolyticus TX20005 during in vitro growth and organized as an operon. Except for one of the predicted pilus gene clusters, sbs12-sbs11, the genes were found to be carried by the majority of endocarditis-derived S. gallolyticus subsp. gallolyticus isolates, but they were considerably less abundant in other S. bovis group species/subspecies. The widespread prevalence of multiple predicted adhesin-encoding genes may have important implications for our understanding of the interactions of this pathogen with its human host. It is of particular interest, and our future goal, to find additional binding targets for these proteins and to assess whether they are involved in the pathogenesis of S. gallolyticus subsp. gallolyticus.
This work was supported by the J. Ralph Meadows Professorship in the Department of Internal Medicine, University of Texas Medical School, to Barbara E. Murray.
Published ahead of print on 28 August 2009.
§Supplemental material for this article may be found at http://jb.asm.org/.