A single strain of Bacteroides fragilis synthesizes eight distinct capsular polysaccharides, designated PSA to PSH. These polysaccharides are synthesized by products encoded by eight separate polysaccharide biosynthesis loci. The genetic architecture of each of these eight loci is similar, including the fact that the first gene of each locus is a paralog of the first gene of each of the other PS loci. These proteins are designated the UpxY family, where x is replaced by a to h, depending upon the polysaccharide locus from which it is produced. Mutational analysis of three separate upxY genes demonstrated that they are necessary and specific for transcription of their respective polysaccharide biosynthesis operon and that they function in trans. Transcriptional reporter constructs, reverse transcriptase PCR, and deletion analysis demonstrated that the UpxYs do not affect initiation of transcription, but rather prevent premature transcriptional termination within the 5′ untranslated region between the promoter and the upxY gene. The UpxYs have conserved motifs that are present in NusG and NusG-like proteins. Mutation of two conserved residues within the conserved KOW motif abrogated UpaY activity, further confirming that these proteins belong to the NusG-like (NusGSP) family. Alignment of highly similar UpxYs led to the identification of a small region of these proteins predicted to confer specificity for their respective loci. Construction of an upaY-upeY hybrid that produced a protein in which a 17-amino-acid segment of UpaY was changed to that of UpeY altered UpaY's specificity, as it was now able to function in transcriptional antitermination of the PSE biosynthesis operon.
Species of the genus Bacteroides are collectively the most abundant gram-negative bacteria of the human intestinal microbiota, and a conserved feature of these symbionts is the synthesis of a large number of capsular polysaccharides. A single strain of Bacteroides fragilis, for example, synthesizes eight distinct capsular polysaccharides, PSA to PSH (16). The two capsular polysaccharides of B. fragilis NCTC9343 for which structures have been elucidated, PSA and PSB, have repeat units comprised of four and six monosaccharides, respectively, several of which are di- or trideoxymonosaccharides (3). Numerous genes are required to synthesize such complex polysaccharides, and most of the genes necessary for the synthesis of the PSs are contained within each of the eight PS biosynthesis loci (9, 10, 12, 13). These eight loci range in size from 11 kb to 23.6 kb and are scattered throughout the B. fragilis genome rather than localized to a single genetic area.
The eight PS biosynthesis loci have a common genetic architecture (Fig. 1A). Each PS biosynthesis locus is arranged as a single operon with a promoter upstream of the first gene. Therefore, transcription initiated from the single promoter must be maintained with great fidelity over large distances. Seven of the eight PS promoters are contained within a segment of invertible DNA located between 19- to 25-bp inverted repeats (IRs) (16). The invertible DNA regions between the IRs range in size from 168 to 193 bp, and their inversions are mediated by a single global DNA invertase, Mpi (14), resulting in phase-variable expression of each of the capsular PSs. There is a region of 126 to 224 bp of untranslated DNA between the downstream IR and the first gene of each locus. The proteins encoded by the first gene of each locus are 38 to 84% similar to each other and have been designated the UpxY proteins (13), where the x is replaced by the letter designation of the specific PS (UpaY to UphY). The products encoded by the second gene of each locus are also similar to each other and are designated the UpxZ family. For each of the eight loci, the genes downstream of these two upx genes encode enzymes involved in the synthesis of the respective PS. We have shown that the promoter regions and the upx genes of each of the eight loci are conserved in the species but that there is great heterogeneity just downstream in the PS biosynthesis genes, which results in the synthesis of structurally distinct PSs by different B. fragilis strains (9, 11, 13).
Based on conservation of the upxY genes in each PS locus and the lack of similarity of their encoded proteins to known products involved in PS biosynthesis, we predicted that these proteins are involved in regulation of PS synthesis (16). Each of the eight UpxY proteins contains an identifiable KOW motif, initially described as confined to NusG and the large subunit ribosomal protein families (17). In Escherichia coli, NusG interacts with RNA polymerase (RNAP) and has a global function in modulation of the transcription of most E. coli genes. NusG is highly conserved in eubacteria, and its gene is contained within a genomic region that includes secE (15), ribosomal protein genes, and rpoB and rpoC. The UpxYs do not share extensive similarity with the characterized NusG of E. coli, and the B. fragilis genome encodes a bona fide NusG ortholog (BF4020) whose gene is contained in a conserved genetic region similar to that of E. coli. Therefore, the UpxY proteins likely constitute a family of NusG-like proteins. A recent phylogenetic analysis of NusG homologs from 511 prokaryotic genomes revealed three distinct families (4). The first are the true NusG orthologs, the second are the archaeal NusGs, and the third have been designated NusGSP, for specialized paralog of NusG (4). The family of NusGSP proteins differs from the true NusGs in that their sequences are more divergent and their genes are contained in diverse genetic loci. The only well-characterized NusGSP is RfaH, which functions to prevent premature transcriptional termination of several operons of E. coli rather than having a global nonspecific function characteristic of NusG. RfaH recognizes a region of nucleic acid downstream of the promoter in the 5′ untranslated region (UTR) known as the ops element (21), which is present in several operons including the capsular polysaccharide biosynthesis locus, pilus operon, and hemolysin operon. The RNAP pauses at the ops site, allowing RfaH to specifically engage (1). 
RfaH remains stably complexed with RNAP and increases its apparent transcription elongation rate, thereby decreasing pausing and premature transcriptional termination within operons (2).
Operon-specific transcriptional antitermination is likely a common feature of the NusGSP family, but no other members have been molecularly characterized. Although RfaH is specific for operons containing the ops element, it affects the transcription of heterologous operons that are distantly located from the respective gene on the E. coli chromosome. In this study, we tested the prediction that the UpxY proteins are a family of NusG-like (NusGSP) factors that act specifically in transcriptional antitermination of the operons from which they are encoded. The novelty of this system lies in the large number of related UpxY NusGSP factors present in a single organism, allowing us to identify the regions of these proteins that confer operon-specific regulation.
All primers used in this study are listed in Table S1 in the supplemental material.
Escherichia coli DH5α containing recombinant plasmids was grown in L broth or on L agar plates containing ampicillin (100 μg ml−1) and/or kanamycin (50 μg ml−1). B. fragilis NCTC9343 was the parental strain of all mutants. B. fragilis strains were grown anaerobically in basal medium or on brain heart infusion plates supplemented with hemin (50 μg ml−1) and menadione (0.5 μg ml−1), with gentamicin (200 μg ml−1) and erythromycin (5 μg ml−1) added where appropriate.
Internal deletions of upaY, upeY, and uphY were made by a standard allelic replacement technique such that 443 bp of the 519-bp upaY, 445 bp of the 519-bp upeY, and 510 bp of the 540-bp uphY were deleted. DNA segments upstream and downstream of each region to be deleted were PCR amplified and cloned by three-way ligation into the Bacteroides conjugal suicide vector pJST55 (26). The resulting plasmid was conjugally transferred into B. fragilis NCTC9343, and cointegrates were selected on the basis of erythromycin resistance (Emr) encoded by pJST55. The cointegrate strain was passaged, plated on nonselective medium, and replica plated on medium containing erythromycin. Em-sensitive (Ems) colonies were screened by PCR to select colonies with the mutant genotype. For complementation studies, upaY, upeY, and uphY genes were PCR amplified, cloned into the expression vector pFD340 (23), and conjugally transferred to their respective deletion mutant.
For Western immunoblot analyses, bacteria were boiled in LDS sample buffer and subjected to electrophoresis using NuPAGE 4 to 12% gradient sodium dodecyl sulfate-polyacrylamide gels with morpholineethanesulfonate buffer (Invitrogen). The contents of the gels were transferred to polyvinylidene difluoride membranes, and polysaccharide expression was tested with specific antiserum generated using a standard protocol (9, 10, 12, 13). Alkaline phosphatase-labeled anti-rabbit immunoglobulin G (Pierce) served as the secondary antibody, and the membranes were developed with 5-bromo-4-chloro-3-indolylphosphate/Nitro Blue Tetrazolium substrate (KPL, Gaithersburg, MD).
Northern blotting was performed using total RNA extracted from late-log-phase cultures (optical density at 600 nm [OD600] of ~0.8) with the RNeasy minikit (Qiagen, Valencia, CA) using RNAprotect (Qiagen) and on-column DNase digestion with an RNase-free DNase set (Qiagen). Twenty micrograms of total RNA was mixed with formaldehyde load dye (Ambion, Austin, TX), and the samples were loaded on a 1% formaldehyde gel (Northern Max 10× denaturing gel buffer; Ambion) and run at 120 V with 1× morpholinepropanesulfonic acid gel running buffer (Ambion). The RNA was then transferred to a Bright Star-Plus positively charged membrane (Ambion) by downward transfer with Northern Max transfer buffer (Ambion) according to the manufacturer's instructions. Hybridization was performed with PCR-amplified probes. Probe labeling and signal detection were performed with the ECL direct nucleic acid labeling and detection system (Amersham Pharmacia, Piscataway, NJ).
Alteration of the two conserved glycine (G) residues to alanine (A) in the KOW motif of UpaY was achieved with the QuikChange XL site-directed mutagenesis kit (Stratagene). The upaY gene cloned into the pFD340 expression vector was used as a template for mutagenesis. Plasmid pMCL59 expresses a protein with the G122 changed to an A, and plasmid pCMB1 expresses a mutant protein with the G130 changed to an A. Using pMCL59 as a template, plasmid pMCL61 was created, which expresses a mutant protein with both G residues changed to As. pMCL63 was created by mutating pMCL61 such that the residue at amino acid (aa) 122 was restored to G. Each of these plasmids was mobilized into the ΔupaY background, and PSA expression was monitored by Western blotting.
xylE-fusion clones were created by using the reporter plasmid pLEC23 (14). For the PSA region, these clones contained a DNA fragment that began just inside the upstream IR containing the promoter in the “on” orientation and extending 47 bp (clone 1), 75 bp (clone 2), 90 bp (clone 3), 102 bp (clone 4), 112 bp (clone 5), 127 bp (clone 6), and 178 bp (clone 7) downstream of the transcriptional start site. For the PSE region, these clones contained DNA beginning just inside the upstream IR containing the promoter in the on orientation and extending 44 bp (clone 8), 93 bp (clone 9), 135 bp (clone 10), and 229 bp (clone 11) downstream of the transcriptional start site. These 11 PCR-amplified products were cloned into pLEC23 in the proper orientation for transcription of the downstream xylE gene. The plasmids were mobilized into B. fragilis NCTC9343 and mutant backgrounds, and XylE activity was quantified as previously described (27). XylE assays were performed in duplicate, and the results are expressed as means ± standard deviations.
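The reporting convention for the duplicate XylE assays (mean ± standard deviation) can be sketched in a few lines of Python. This is only an illustration: the readings are hypothetical placeholders rather than data from this study, the function name is invented for the example, and the use of the sample standard deviation (n − 1 denominator) is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize_duplicates(readings):
    """Return (mean, sample standard deviation) for replicate readings.

    Assumption: the sample SD (n - 1 denominator) is used; the text only
    says "means ± standard deviations".
    """
    return mean(readings), stdev(readings)


# Hypothetical duplicate XylE readings in arbitrary units.
# These are placeholders for illustration, NOT values from Table 1.
example_readings = [10.0, 14.0]

avg, sd = summarize_duplicates(example_readings)
print(f"{avg:.2f} ± {sd:.2f}")
```

With only two replicates per clone, the standard deviation mainly indicates how closely the duplicates agree and should not be read as a precise estimate of assay variability.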
RNA was reverse transcribed from the PSA region with the primers indicated below in Fig. 3 (see also Table S1 in the supplemental material). Total RNA was isolated as described above, except it was further DNase treated with the DNA-free kit (Ambion). Reverse transcription-PCR was performed with the SuperScript III one-step reverse transcription-PCR (RT-PCR) system with Platinum Taq DNA polymerase (Invitrogen, Carlsbad, CA). A separate set of all reactions was performed without the addition of reverse transcriptase.
Deletions in the 5′ UTR of PSA were constructed so that 24 bp (ΔA), 26 bp (ΔB), and 25 bp (ΔC) were removed by allelic replacement (see Fig. 3, below). DNA segments upstream and downstream of the regions to be deleted were PCR amplified and cloned by three-way ligation into the Bacteroides conjugal suicide vector pJST55. The resulting plasmid was conjugally transferred into B. fragilis NCTC9343, and cointegrates were selected on the basis of Emr encoded by pJST55. The cointegrate strain was passaged, plated on nonselective medium, and replica plated on medium containing erythromycin. Ems colonies were screened by PCR to detect colonies that had the mutant genotype.
Construction of the upaY-upeY hybrid was performed by a six-step process using the QuikChange XL site-directed mutagenesis kit (Stratagene, La Jolla, CA). upaY cloned into pFD340 was first mutated by deletion of a 51-bp region corresponding to amino acids 62 to 78 of the protein, creating plasmid pMCL56. Next, pMCL56 was used as a template for the addition of the first 15 nucleotides from upeY, yielding pMCL60, which was subsequently used as a template for the addition of another 15 bp of upeY, yielding pMCL62. pMCL62 was further modified by the addition of 11 more base pairs of upeY, yielding pMCL66. pMCL66 was further altered by the addition of six more base pairs of upeY, yielding pMCL72. pMCL72 served as a template for the addition of the last 4 bp of the upeY sequence, yielding the final plasmid pMCL74 containing the upaY-upeY hybrid cloned into the Bacteroides expression vector pFD340. Each plasmid was sequenced to confirm incorporation of the correct modifications.
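The numbers in this construction can be checked for internal consistency: the 51-bp deletion should correspond to the 17 codons spanning aa 62 to 78, and the five successive upeY insertions should add back the same number of base pairs so that the reading frame is preserved. The following Python sketch performs only that arithmetic, using the values stated above; the variable names are illustrative.

```python
# Arithmetic check of the upaY-upeY hybrid construction described above.
# All numbers come from the text; variable names are illustrative only.

first_aa, last_aa = 62, 78                 # UpaY residues replaced in the hybrid
codons_replaced = last_aa - first_aa + 1   # inclusive count of residues
deleted_bp = 51                            # deletion that created pMCL56

# Base pairs of upeY added at each subsequent step
# (yielding pMCL60, pMCL62, pMCL66, pMCL72, and pMCL74, in that order).
inserted_bp_per_step = [15, 15, 11, 6, 4]
total_inserted_bp = sum(inserted_bp_per_step)

# One deletion step plus five insertion steps.
total_steps = 1 + len(inserted_bp_per_step)

print(codons_replaced, codons_replaced * 3, total_inserted_bp, total_steps)
# prints: 17 51 51 6
```

The totals agree: 17 codons are removed, 51 bp are restored from upeY, and the procedure comprises six mutagenesis steps.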
As a first step in determining the necessity of the UpxY proteins for synthesis of their respective PS, deletion mutants were made of three upxYs: upaY, upeY, and uphY. Western immunoblot analysis demonstrated that each of these upxYs is necessary for expression of its respective PS but not for expression of heterologous PSs. In addition, specific PS synthesis is restored when the respective upxY is supplied to its mutant in trans on a plasmid (Fig. 1B). These data demonstrate that the UpxY proteins are necessary for PS synthesis, are specific in positively regulating the synthesis of their respective PS, and function in trans.
To determine if the defect in PS synthesis observed in upxY mutants occurs at the transcriptional level, we performed Northern blot analysis. PSA transcript was assayed using the second gene of the operon, upaZ, as a probe. Deletion of upaY resulted in undetectable levels of PSA transcript, but transcription was restored to detectable levels when upaY was supplied in trans to the mutant (Fig. 1C). A second locus, PSE, was assayed in a similar manner to determine if transcriptional regulation was a general property of the UpxY products. Due to promoter inversions and phase-variable expression of the capsular polysaccharides, PSE is expressed by many fewer cells than PSA from an in vitro-grown wild-type population. Although this wild-type level of PS synthesis is easily detectable with antibody specific to PSE, the PSE transcript is barely detectable from a wild-type population. Therefore, to better detect PSE transcript, we used an mpi mutant in which the PSE promoter is locked in the on orientation and therefore constitutively expresses PSE. Deletion of upeY from this mutant resulted in the expected abrogation of PSE synthesis, which was restored to a detectable level when upeY was supplied in trans (Fig. 1D). Northern blot analysis of these strains revealed that upeY is necessary for transcription of the PSE operon (Fig. 1E), just as upaY is necessary for transcription of the PSA operon.
The data to this point are consistent with the UpxY products functioning as NusGSP factors. Despite the lack of amino-acid-level similarity of the UpxYs to the characterized NusG of E. coli, each protein contains the pfam02357 NusG domain and a KOW motif (17) (Fig. 2A). The presence of these sequences further suggests that the UpxYs are NusG-like proteins. An alignment of the KOW motifs of each UpxY protein, the NusG proteins of B. fragilis and E. coli, and RfaH of E. coli is shown in Fig. 2A. There are conserved glycine residues at positions 4 and 12 within the motif in each of these proteins. The conservation of these residues suggests that they may be essential to protein function. The conserved G at residue 12 (designated as residue 11 in the original KOW description) was found to be invariant within this family of proteins (17). To determine if this residue is necessary for UpxY activity, upaY was mutated so that the G at position 12 of the KOW motif (position 130 of UpaY) was replaced by an alanine. When this altered gene was added in trans to the upaY deletion mutant, PSA synthesis was restored (Fig. 2B); thus, this conservative substitution within the KOW motif did not abrogate UpaY activity. Next, a separate upaY mutant was created so that the G at position 4 of the KOW motif (position 122 of UpaY) was changed to an A. This mutant gene was also able to restore PSA synthesis to the upaY deletion mutant when added in trans. A third upaY mutant was created so that the resulting protein carried both substitutions: G122A and G130A. This altered protein lost its activity and was not able to restore PSA synthesis to ΔupaY. To confirm that the two described substitutions in the protein accounted for the lack of UpaY activity and not a spurious mutation resulting from the PCR-based mutagenesis process, we restored the G at position 122 in the upaY double mutant. 
This mutant protein, which now was only altered from wild-type UpaY at G130A, restored UpaY activity (Fig. 2B). These data demonstrate that two conservative substitutions within the KOW motif of UpaY completely abrogate its activity. There have been very few studies that have addressed the importance of the KOW motif for the activity of NusG and NusG-like factors. Based on their conservation and the lack of activity when both sites are altered, we predict that this region confers a structural or functional property that is common to all NusG and NusGSP factors.
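The residue numbering and the mutational logic above can be summarized in a short Python sketch. It checks that the two stated correspondences (motif position 4 = G122, motif position 12 = G130) imply a single numbering offset, and that the four constructs described in the text fit the rule "activity is lost only when both glycines are substituted". The inferred start of the motif at UpaY residue 119 assumes ungapped, linear numbering and is not stated in the text; function and variable names are illustrative.

```python
# Correspondences given in the text: KOW-motif position -> UpaY residue.
anchors = {4: 122, 12: 130}

# If numbering is linear and ungapped (an assumption), both anchors must
# imply the same offset between motif position and UpaY residue number.
offsets = {residue - pos for pos, residue in anchors.items()}
offset = offsets.pop() if len(offsets) == 1 else None


def motif_to_upay(pos):
    """Convert a KOW-motif position to a UpaY residue number."""
    if offset is None:
        raise ValueError("anchor positions imply inconsistent offsets")
    return pos + offset


# Constructs from the text: substitutions carried and whether PSA
# synthesis was restored in the upaY deletion mutant.
constructs = {
    "pCMB1": ({"G130A"}, True),
    "pMCL59": ({"G122A"}, True),
    "pMCL61": ({"G122A", "G130A"}, False),
    "pMCL63": ({"G130A"}, True),   # G122 restored in the double mutant
}


def predicted_active(substitutions):
    """Rule under test: inactive only if both glycines are replaced."""
    return not {"G122A", "G130A"} <= substitutions


consistent = all(
    predicted_active(subs) == observed for subs, observed in constructs.values()
)

print(offset, motif_to_upay(1), consistent)
# prints: 118 119 True
```

The pMCL63 revertant is the informative control here: it carries the same single substitution as pCMB1 but is derived from the inactive double mutant, so its activity argues against an unintended second-site mutation.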
NusG has roles in both transcriptional termination and antitermination (6, 7, 19, 22). Based on the phenotypes and transcriptional analyses of the upxY deletion mutants and the activity of the well-characterized NusGSP RfaH, we predicted that the UpxYs function to prevent premature transcriptional termination of their respective PS operon. RfaH engages with RNAP at a site downstream of the promoter and does not affect transcription initiation. To determine at what step in the transcriptional process the defect occurs in the ΔupaY mutant, we made transcriptional fusions using the promoterless xylE reporter plasmid pLEC23 (14). The PSA regions cloned into this vector contain DNA beginning just inside of the upstream IR of the PSA region (so that the promoters could not undergo DNA inversion) and included the promoter in the on orientation, the downstream IR, and various amounts of DNA in the UTR and into upaY, as shown in Fig. 3A. These seven transcriptional fusion reporter clones were conjugally transferred into both the wild type and the ΔupaY mutant, and XylE activity was quantified from each using a standard assay (27). As shown in Table 1, clones 1 and 2 produced equivalent amounts of XylE in both the wild-type and the ΔupaY backgrounds, showing that UpaY is not required for transcription initiation. Inclusion of DNA 90 bp from the transcriptional start site greatly reduced XylE activity from both the wild type and ΔupaY (clones 3, 4, and 5). It may be that this region induces RNAP pausing, preventing transcription into downstream sequences. When additional downstream DNA is included in the transcriptional fusion construct (clones 6 and 7), transcription is restored in the wild type but is severely decreased in the ΔupaY background. Therefore, the inclusion of DNA in the vicinity of primers 6 and 7 may allow RNAP to overcome its paused state in a UpaY-dependent manner. 
As transcriptional fusions past upaY cannot be analyzed in this assay due to restoration of upaY to the deletion mutant, we used RT-PCR to further examine transcriptional defects. Relative levels of PSA transcript were analyzed from the wild type and ΔupaY by using RT-PCR, which confirmed that transcriptional initiation is not affected in the ΔupaY mutant but that transcriptional read-through into downstream genes is severely reduced in the ΔupaY background (Fig. 3B). PSA transcript extending into the second gene of this 10-gene operon was barely detectable in the ΔupaY mutant.
The PSE operon was analyzed in a similar manner using transcriptional reporter constructs to determine if UpeY had a similar function in preventing transcriptional termination. Four PSE-xylE clones were constructed, all of which contained DNA with the promoter locked on and extending to various regions downstream of the promoter (Fig. 3C). These four transcriptional fusion reporter clones were conjugally transferred into both the wild type and the ΔupeY mutant, and the XylE activity resulting from each is shown in Table 1. The results are consistent with those of the PSA locus and demonstrate that UpeY is not necessary for transcriptional initiation but that the transcript is largely terminated before the first gene in the absence of UpeY. These data show that transcriptional termination/antitermination in the 5′ untranslated regions of these loci is important for PS expression and transcriptional control of these long operons.
As the UpxYs are specific for their respective loci, there must be locus-specific nucleotide sequences within the 5′ UTR that dictate this specificity. RfaH has been shown to require the 9-bp ops element, which recruits RfaH and mediates RNAP pausing, thereby allowing engagement of RfaH with RNAP and restoring transcription elongation. Each of the seven capsular PS biosynthesis loci with invertible promoters has a 126- to 224-bp UTR between the downstream IR and the start codon of the respective upxY. As the XylE data indicate that the transcriptional defects in the ΔupaY and ΔupeY mutants occur early in the transcript, we predicted this 5′ UTR is where the UpxYs engage RNAP. To determine if the UTR of the PSA locus is necessary for PSA expression, we made three separate deletions in this region, removing 24 bp (ΔA), 26 bp (ΔB), and 25 bp (ΔC), as shown in Fig. 3A. The ΔA deletion is in a region upstream of where a transcriptional difference was detected between the wild type and the ΔupaY mutant by the XylE assay. The ΔB deletion removes a segment of DNA up to primer 3 of the xylE constructs, which is where the XylE activity was nearly abrogated even in the wild type. The ΔC deletion is in the region corresponding to primers 4 to 6 and includes the segment of DNA that restored transcriptional read-through into xylE. All deletions were significantly downstream of the inverted repeat and upstream of the start codon of upaY. Deletion of each of these three regions abrogated expression of PSA (Fig. 4A). Northern blot analysis of PSA transcription using upaY as a probe demonstrated that the effect of the deletions was at the transcriptional level (Fig. 4B). Therefore, it is likely that sequences contained throughout the 5′ UTR are necessary for UpaY to engage RNAP. It is possible that the region deleted in ΔA is involved in forming an RNA secondary structure with downstream regions of the UTR.
The UpxYs are a family of eight similar proteins, and therefore they offer a unique opportunity to identify regions of these molecules that dictate specificity for their respective loci. Since some of the proteins are only 38% similar to each other, comparing all eight proteins together reveals many regions that differ within this family. However, by comparing very similar UpxY proteins, we can identify regions that are likely involved in conferring specificity. UpaY and UpeY are two of the most similar UpxYs, sharing 84% similarity along their lengths. Comparison of the sequences of UpaY and UpeY revealed two regions that differ substantially, one region from aa 62 to 78 and another region spanning aa 108 to 120 (Fig. 5A). Similar analysis of two other UpxYs, UpdY and UphY, which are only 47% and 49% similar to UpaY but 82% similar to each other, revealed that they differ substantially from each other in only a few regions. Comparison of the regions that differ between UpaY and UpeY with those that differ in UpdY and UphY revealed only one common region of difference, corresponding to the region at aa 62 to 78 of UpaY. To determine if this region dictates specificity of the UpxY proteins for their specific locus, a hybrid gene was created in expression vector pFD340 by a six-step mutational process that yielded a protein with the primary sequence of UpaY except that aa 62 to 78 were changed to those of the UpeY protein (Fig. 5A). This hybrid was tested for its ability to restore PS synthesis to both the ΔupaY and ΔupeY mutants when added to these mutant strains in trans (plasmid pMCL74). As shown in Fig. 5B, this hybrid protein is still able to restore PSA synthesis to the ΔupaY mutant to some degree, demonstrating that the small region altered compared to that of PSE is not sufficient to completely abrogate its activity as a transcriptional antitermination factor for the PSA locus. 
This hybrid protein, however, acquired UpeY activity and is able to fully restore PSE synthesis to the ΔupeY mutant. Therefore, alteration of a 17-aa portion of UpaY to that of UpeY was sufficient to confer UpeY activity. These data demonstrate that this small segment of UpxY proteins, which differs most substantially between highly similar paralogs, is critical in determining the specificity of the UpxY products for their respective PS locus.
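The comparative reasoning used to nominate the specificity region (find where each pair of highly similar paralogs diverges, then keep only the regions that diverge in both pairs) can be illustrated with a small Python sketch. This is not the procedure used in this study, which relied on inspection of sequence alignments; the toy sequences, function names, and thresholds below are hypothetical, and the sketch assumes that both pairs have already been placed in one shared, gap-free alignment coordinate system.

```python
def divergent_regions(a, b, max_gap=1, min_len=4):
    """Return 1-based (start, end) spans where two aligned sequences differ.

    Mismatches separated by at most `max_gap` matching positions are merged
    into one span; spans shorter than `min_len` are discarded as noise.
    Both thresholds are arbitrary choices for this illustration.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    spans = []
    for pos, (x, y) in enumerate(zip(a, b), start=1):
        if x == y:
            continue
        if spans and pos - spans[-1][1] <= max_gap + 1:
            spans[-1][1] = pos          # extend the current span
        else:
            spans.append([pos, pos])    # start a new span
    return [(s, e) for s, e in spans if e - s + 1 >= min_len]


def shared_regions(regions_1, regions_2):
    """Return the overlaps between two lists of (start, end) spans."""
    overlaps = []
    for s1, e1 in regions_1:
        for s2, e2 in regions_2:
            start, end = max(s1, s2), min(e1, e2)
            if start <= end:
                overlaps.append((start, end))
    return overlaps


def substitute(seq, start, end, fill="X"):
    """Replace 1-based inclusive positions start..end with `fill`."""
    return seq[:start - 1] + fill * (end - start + 1) + seq[end:]


# Hypothetical 30-residue toy sequence; NOT a real UpxY sequence.
base = "MKTLAVIGDSRNEQHWYFPCMKTLAVIGDS"

# Toy "pair 1": two divergent blocks plus one isolated difference.
pair1_partner = substitute(substitute(substitute(base, 6, 12), 16, 16), 21, 26)
# Toy "pair 2": a single divergent block overlapping the first block above.
pair2_partner = substitute(base, 7, 12)

regions_pair1 = divergent_regions(base, pair1_partner)
regions_pair2 = divergent_regions(base, pair2_partner)
print(regions_pair1, regions_pair2, shared_regions(regions_pair1, regions_pair2))
# prints: [(6, 12), (21, 26)] [(7, 12)] [(7, 12)]
```

In the toy data, only one block diverges in both pairs, mirroring the way a single common region of difference (aa 62 to 78 of UpaY) emerged from the UpaY/UpeY and UpdY/UphY comparisons.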
In this study, we characterized the function of the UpxY family of proteins of B. fragilis and demonstrated that they are members of a recently described NusGSP family, along with the well-characterized E. coli protein RfaH. Like RfaH, the UpxY proteins are nonessential operon-specific factors, in this case necessary for the transcription of the polysaccharide biosynthesis loci of B. fragilis. These proteins are not involved in the regulation of transcription initiation, but rather prevent premature termination of transcription. We show that the 5′ UTR is essential for the transcription of the PSA operon and that a upaY deletion mutant is severely attenuated for transcription beyond the 5′ UTR. These data suggest that UpaY engages with RNAP in the 5′ UTR in a manner similar to RfaH, a property that may be common to NusGSP factors.
There is a large body of work regarding the function, significance, and molecular characteristics of NusG, which affects the transcription of most genes of E. coli. NusG is an essential factor that associates with RNAP, increasing Rho-dependent termination of most genes, and is involved in transcriptional antitermination of a few operons (20, 24). Unlike NusG, which is involved in the transcription of native genes, RfaH controls horizontally acquired regions (2, 18), suggesting that the NusGSP family acts to increase expression of foreign genes (4), in contrast to NusG, which has a role in suppression of foreign genes (8). The eight polysaccharide biosynthesis loci are each heterogeneous between different strains of B. fragilis, suggesting recent acquisition by horizontal DNA transfer. In fact, prior analyses of the G+C content of the PSA, PSB, and PSC biosynthesis loci suggested that these large regions were acquired by multiple horizontal transfer events (9, 10, 12, 13). Therefore, the UpxYs provide another example of NusGSP regulation of horizontally acquired “foreign” DNA regions.
Regulation by RfaH and regulation by the UpxY products also differ in some respects. RfaH affects the transcription of several different operons, whereas each UpxY protein regulates only a single operon. RfaH regulates transcription of heterologous operons that are located in regions of the E. coli genome distant from its own gene, whereas the UpxYs affect the transcription of the operons from which they are encoded. In addition, the UpxYs are a family of very closely related NusGSP proteins, whereas RfaH is the only NusGSP of E. coli.
UpxY orthologs are not exclusive to B. fragilis. The presence of multiple PS biosynthesis loci is a general characteristic of Bacteroides species, and most of the polysaccharide biosynthesis loci of other Bacteroides species have upxY homologs as the first gene. Therefore, UpxY-mediated transcriptional antitermination of PS biosynthesis operons is a common property of this bacterial genus. The size of the PS biosynthesis loci is a factor that likely contributes to the conservation of this regulatory mechanism. These loci are large (~10 to 25 kb), and in most cases, every polysaccharide biosynthesis protein encoded within the operon is necessary for a respective polysaccharide to be synthesized. Therefore, these operons must be transcribed in their entirety with high fidelity. RfaH decreases pausing of the transcription elongation complex and reduces both Rho-dependent and Rho-independent termination (1). Engagement of RNAP with the UpxY factors likely has a similar function in increasing the relative rate of RNAP, decreasing its pausing, and providing a mechanism for faithful transcription of long operons.
Our mutational data reveal that conserved residues within the KOW motif are necessary for activity. Structural comparisons of RfaH and NusG reveal significant differences in the C-terminal regions of these proteins. The C terminus of NusG is comprised of β-sheets (25), whereas the C terminus of RfaH has an α-helical coiled-coil domain (5). There is a 14-aa flexible linker region between the N- and C-terminal regions of RfaH and NusG (5). The KOW motif is partially contained within the flexible linker region of RfaH (5), but it begins just after the flexible linker region of NusG (25). Based on the conservation of this motif and the fact that conserved substitutions of both glycines at positions 4 and 12 of the motif abrogate UpaY activity, this region likely confers important structural or functional properties to these proteins.
The presence of eight related, operon-specific UpxY factors within a single organism allowed us to analyze regions of these proteins that confer specificity for their respective operons. Comparisons of highly similar UpxY proteins identified a region in the N-terminal half predicted to mediate this specificity. Replacement of a 17-aa region of UpaY with that of UpeY was sufficient to confer UpeY activity to this hybrid protein. Identification of the exact segments of RfaH that recognize the ops element on the noncoding DNA strand of RfaH-regulated operons is not as straightforward, as these studies could not benefit from comparing highly orthologous proteins. However, alanine substitutions in a cluster of residues located in the N domain impair RfaH's ability to bind the nontemplate DNA, based on lack of RNAP delay at the ops site (5). Therefore, the N-terminal half of RfaH may also dictate its specificity for ops element-containing operons, as the N-terminal half of the UpxYs appears to do for their respective loci. It is predicted that RfaH may first engage with RNAP and then a site formed by the complex may recognize the ops element. Whether such a mechanism occurs in the recognition of specific nucleic acid regions by the UpxYs cannot be excluded. This study sets the foundation for more in-depth analyses of both the nucleic acid regions in the 5′ UTR and the regions of the UpxY proteins themselves that mediate PS operon-specific transcriptional antitermination.
We thank Corinna Krinos, Katja Weinacht, and Colin Bloor for preliminary analyses.
This work was supported by National Institutes of Health NIAID grant AI044193.
Published ahead of print on 2 October 2009.
†Supplemental material for this article may be found at http://jb.asm.org/.