Ubiquitous surface protein A molecules (UspAs) of Moraxella catarrhalis are large, nonfimbrial, autotransporter proteins that can be visualized as a “fuzzy” layer on the bacterial surface by transmission electron microscopy. Previous studies attributed a wide array of functions and binding activities to the closely related UspA1, UspA2, and/or UspA2H protein, yet the molecular and phylogenetic relationships among these activities remain largely unexplored. To address this issue, we determined the nucleotide sequence of the uspA1 genes from a variety of independent M. catarrhalis isolates and compared the deduced amino acid sequences to those of the previously characterized UspA1, UspA2, and UspA2H proteins. Rather than finding these proteins to be conserved, we observed a striking divergence of individual UspA1, UspA2, and UspA2H proteins resulting from the modular assortment of unrelated “cassettes” of peptide sequence. The exchange of certain variant cassettes correlates with strain-specific differences in UspA protein function and confers differing phenotypes upon these mucosal surface pathogens.
Moraxella catarrhalis is a gram-negative, unencapsulated pathogen responsible for a significant portion of childhood otitis media (22, 43) as well as infectious exacerbations of chronic obstructive pulmonary disease in adults (44, 56). Over the past decade, much of the research effort concerning M. catarrhalis focused on the identification of proteinaceous surface antigens and determinations of their potential role in virulence expression (32, 60). Of these surface antigens, the Hag (23, 30), OmpCD (29, 54), filamentous hemagglutinin-like proteins (7, 53), McaP (37), and type IV pilus proteins (40) have been shown to be involved in bacterial adherence to human cells or components of the respiratory tract in vitro, while CopB (4, 12), transferrin-binding proteins (39, 55), and lactoferrin-binding proteins (11, 21) are likely essential for iron acquisition.
Perhaps the most intensively studied potential virulence factors of M. catarrhalis are the nonfimbrial ubiquitous surface protein A (UspA) molecules, initially described about 14 years ago. At first, these two proteins were thought to be a single gene product (25, 34) until it was discovered that two genes encode distinct proteins sharing certain epitopes (2). UspA1 and UspA2 are encoded by open reading frames located far apart in the M. catarrhalis genome (62). Both proteins are autotransporter molecules that mediate their own insertion and subsequent translocation of functional extracellular domains into the outer membrane. Based on their C-terminal regions, UspAs appear to be trimeric autotransporters (17, 20, 33). Their large size and extended structure can be visualized by transmission electron microscopy as a “fuzzy” layer surrounding the bacterial cell (28, 50). The closely related UspA2H protein appears to be a hybrid that most closely resembles UspA1 along its N-terminal half and UspA2 in its C-terminal region (36).
Early studies of M. catarrhalis uspA1 and uspA2 mutants indicated that the former protein had adhesive properties and functioned to bind M. catarrhalis to Chang conjunctival epithelial cells in vitro, whereas the expression of UspA2 was essential for M. catarrhalis to resist the bactericidal activity of normal human serum (2). Subsequent studies have shown that both of these proteins have the potential to express multiple different functional activities, ranging from attachment to host gene products to affecting biofilm development by some strains of M. catarrhalis growing in vitro (51). UspA1 has been shown to bind carcinoembryonic antigen-related cellular adhesion molecule 1 (CEACAM1) (26, 27), fibronectin (59), laminin (58), and the serum complement factors C3 (46) and C4b-binding protein (45). UspA2 has also been reported to bind most of these same proteins except that CEACAM1 binding has not been described. UspA2 also binds the serum protein vitronectin and apparently uses this mechanism to help some strains of M. catarrhalis resist killing by normal human serum (5, 6).
Previous but limited molecular characterization of the UspA proteins revealed that UspA1, UspA2, and UspA2H share a number of homologous repeats and “motifs” (2, 15), a finding which suggested that the makeup of UspA proteins might be interchangeable. Despite the large number of functions attributed to the UspA proteins, the molecular characterization of these proteins has been limited largely to alleles encoded by few prototypical strains. We (11a) and others (36) have noted strain-specific differences in UspA protein function, yet no systematic structure-function analyses have been performed to date. In the present report, we sequenced and characterized the “motifs” and repeats of a large number of UspA proteins and further defined these shared peptide sequences. In addition, we show that individual UspA proteins consist of a modular arrangement of sequence cassettes, some of which encode recently defined functional motifs.
PCR was used to amplify DNA encoding the uspA1 and uspA2 genes from M. catarrhalis chromosomal DNA as described previously (6, 36, 51). Nucleotide sequence data derived from automated sequencing systems were initially subjected to analysis by using MacVector software (version 6.5; Oxford Molecular Group, Campbell, CA).
While the crystal structure of a UspA protein has not been solved to date, the basic structure of these proteins can be inferred from structural studies of other autotransporter proteins due to their sequence similarity to the Neisseria meningitidis NalP (5, 48), Yersinia enterocolitica YadA (48), and Haemophilus influenzae Hia (42) proteins, the crystal structures of which have been solved in whole or in part. NalP is a conventional autotransporter, whereas YadA is the prototype of a trimeric autotransporter (35). A model of the entire YadA structure has been generated by combining a crystal structure of the amino (N)-terminal domain (47) with predicted structures of the stalk and the C-terminal membrane-spanning domains (18). As such, the UspA proteins are also predicted to possess three distinct domains (Fig. 1A): (i) an amino-terminal leader peptide or signal sequence, (ii) the secreted mature protein (or passenger domain), and (iii) a C-terminal translocation domain responsible for the formation of a pore in the outer membrane to allow the passage of the passenger domain to the cell surface (35). By analogy to YadA, the N-terminal portion of the passenger domain likely forms a “head” that is connected to an extended “stalk” region, which itself is bound to the membrane-spanning region of the translocation domain.
The UspA proteins have been subdivided into three basic types: UspA1, UspA2, and UspA2H (3, 36). UspA1 and UspA2 can be distinguished by differences in amino acid sequences within the head and membrane-spanning regions, yet they share homology within the stalk region (Fig. 1B). UspA2H is a “hybrid” protein containing a head region similar to that of UspA1 while having the UspA2-like C-terminal region (Fig. 1B). Based on analyses of the C-terminal region of these three proteins, all three appear to be members of the trimeric autotransporter family (17, 33).
Our recent demonstration that UspA1 proteins from different M. catarrhalis strains differ with respect to host cell receptor specificities (11a) prompted us to analyze the diversity within each UspA protein group, considering the clear modular arrangement of variant sequences apparent from these analyses. Herein, we will discuss structural considerations of the variant sequences on each domain separately before discussing their functional implications.
UspA1 and UspA2H were previously shown to possess homology within their N-terminal domains. This region is characterized by a number of sequence motifs, including a series of “GGG repeats” followed by an “FAAG domain” (36). We have sequenced the genes encoding additional UspA1 proteins and combined these data with data from the previously sequenced UspA1 and UspA2H genes (Table 1) to further define these “motifs” (36). Using WebLogo (19) software, we refined the N-terminal GGG repeat consensus sequence using 150 repeats taken from all available UspA1 and UspA2H sequences (19 variants) (Fig. 2A), leading to the generation of a logo showing a highly conserved consensus sequence. The number of GGG repeats varies between 5 and 16 repeats within individual UspA1/UspA2H proteins, with each GGG repeat being predicted to form an antiparallel β-strand based on alignments with the N-terminal YadA structure (19, 28, 47). This observation highlights a substantial variation in the size of the distal portion of the head region resulting from a difference in the number of conserved core repeats among individual alleles.
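The idea behind a consensus logo can be illustrated with a minimal Python sketch. It is not the WebLogo implementation and was not used in this study: it takes a set of pre-aligned, gap-free repeats of equal length and reports, for each column, the most frequent residue and the information content in bits (log2(20) minus the Shannon entropy of the column), omitting the small-sample correction that logo software usually applies. The function name `column_profile` and the toy repeats are hypothetical and are not actual UspA sequences.

```python
import math
from collections import Counter


def column_profile(aligned_repeats):
    """Return a list of (consensus residue, information content in bits),
    one entry per alignment column.

    Assumes the repeats are already aligned, gap-free, and of equal length.
    """
    lengths = {len(repeat) for repeat in aligned_repeats}
    if len(lengths) != 1:
        raise ValueError("repeats must be pre-aligned to the same length")

    max_bits = math.log2(20)  # upper bound for a 20-letter amino acid alphabet
    profile = []
    for column in zip(*aligned_repeats):
        counts = Counter(column)
        total = len(column)
        # Shannon entropy of the residue distribution in this column
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        consensus_residue = counts.most_common(1)[0][0]
        profile.append((consensus_residue, max_bits - entropy))
    return profile


# Hypothetical toy repeats for illustration only (not real UspA1 GGG repeats)
toy_repeats = ["GGGSA", "GGGTA", "GGGSV", "GAGSA"]
profile = column_profile(toy_repeats)
consensus = "".join(residue for residue, _ in profile)

for position, (residue, bits) in enumerate(profile, start=1):
    print(f"position {position}: {residue} ({bits:.2f} bits)")
```

Columns that are invariant across the repeats reach the maximum information content, whereas columns with mixed residues score lower, which is what the letter heights in a sequence logo convey.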
Downstream of the GGG repeat lies the previously defined FAAG motif (35, 36) (Fig. 2B), which contains a high degree of sequence similarity to the head region of YadA (36). A recently solved crystal structure of the YadA head reveals a structural repeat of 11 amino acids containing an NSVAIG-S sequence, which forms two short antiparallel strands (47). While the link between the NSVAIG-S repeats varies slightly in size and composition, a strikingly conserved G residue occurs at the fifth position of each repeat, allowing the formation of a left-handed parallel β-roll (Fig. 2C). We analyzed the M. catarrhalis FAAG motif for NSVAIG-S-like sequences. Each UspA FAAG contained five core NSVAIG-S-like sequences, each being linked to the next one by short sequences that varied in size and composition to a greater degree than is apparent among YadA variants. Using WebLogo software, we created consensus sequences (Fig. 2D) for the available YadA SVAIG-S (42 repeats) and UspA FAAG (91 repeats) sequences. The homology between these sequences is linked primarily to the structural regions, suggesting a similar fold, with various lengths and compositions of the linker sequences. While the head of YadA is responsible for the collagen binding properties of YadA (35), it remains unclear whether such linker variability contributes to antigenic or functional differences among UspA1 and/or UspA2H variants.
While the UspA1 N-terminal region has a high degree of overall similarity to that of YadA, the N-terminal region of UspA2 does not share similarity in either sequence or predicted structure with either YadA or UspA1. The N-terminal region of UspA2 itself has also remained largely uncharacterized due to an apparent lack of homology between the few fully sequenced UspA2 variants. To understand this diversity, we sequenced additional uspA2 alleles and compared them to previously sequenced variants. This analysis revealed that the N-terminal domains can be clearly divided into two different groups, which we have termed NTER2A and NTER2B (Fig. 3). The UspA2 protein from strain P44 is the only variant that does not belong to either group; it appears to be NTER2B related, but the N-terminal domain has been replaced by a duplicated “HDD” stalk motif (described below). It is clear that UspA2 proteins can have significantly different N-terminal regions based on the amino acid sequence alone. Any function ascribed to the N-terminal domain of UspA2 must be considered in the context of these two protein families.
While the N-terminal regions of individual UspA proteins are clearly distinct based upon primary amino acid sequence, the stalk region consists of a combination of repeats that occur in all UspA proteins and distinct sequence “motifs” that are associated primarily with either UspA1 or UspA2. We further refined each of the previously described “motifs” by using all available sequences to create consensus logos. Our inclusion of new alleles has also allowed us to define previously uncharacterized sequence motifs shared among the UspA proteins. Each “motif” located within the stalk region is depicted as a sequence logo in Fig. 4, together with the number of repeats used to define each consensus. Adjacent stalk repeats are often connected by short (typically 4 to 8 residues [K/QADIAKN]) linker sequences, which tend to be conserved at junctions between specific combinations of stalk repeats but vary when an adjacent repeat is different. Whether linker sequences are determined by protein structural requirements or required for nucleotide recombination remains unclear. While the specific arrangement of individual stalk repeats will be detailed below, it is pertinent that certain “motifs” appear to be restricted to UspA1, UspA2, and/or UspA2H, whereas others are shared among these different groups.
The C-terminal regions of UspAs form a structurally conserved membrane-spanning domain (63). Each domain contains a coiled-coil region that links the stalk region to the membrane-spanning translocation domain. In contrast to the primary stalk domain, there is almost complete sequence identity shared among the C-terminal domains (CTER1) of UspA1 proteins and among the C-terminal domains (CTER2) of the UspA2 and UspA2H proteins, although CTER1 and CTER2 are clearly distinct from each other (Fig. 5). In each case, the most carboxyl-terminal sequence encodes four β-strands that would be predicted to weave back and forth through the outer membrane, reminiscent of both Hia and YadA. Structural modeling of YadA illustrates that three membrane-spanning monomers coalesce to form a single β-barrel that spans the membrane 12 times. The α-helical portion of each monomer passes up through the center of this trimer, forming a coiled-coil structure that forms the base of the stalk domain. This trimeric structure is remarkable considering that the proteins are autotransporters that direct their own insertion and assembly within the outer membrane.
By defining the conserved motif and repeat sequences, we are able to consider the composition of distinct UspA variants. The C-terminal region of UspA1 variants is highly conserved among diverse clinical isolates, with all 14 available UspA1 proteins possessing a NINNY-KASS-FET sequence immediately adjacent to the CTER1 motif (Fig. 6). The first variability in modular arrangement involves the CEACAM motif that has been shown to mediate adherence to CEACAM (14, 26). This motif is either truncated or completely absent from some UspA1 variants, and both of these changes disrupt CEACAM-mediated adherence (11a). The CEACAM-binding phenotype of all of the M. catarrhalis strains included in the present study strictly correlates with the presence of an intact CCM motif (Fig. 6), highlighting the importance of allelic sequence-specific analyses in the assignment of function to UspA proteins.
The CEACAM motif is preceded by a highly conserved LAAY-KASS sequence (Fig. 4 and 6). The stalk regions of UspA1 proteins then contain a variable number of VEEG repeats, a NINNY repeat, and then an additional VEEG sequence (Fig. 6). This region was previously shown to mediate binding to the extracellular matrix protein fibronectin and/or attachment to Chang epithelial cells (59). This binding appears to correlate with a complete ordered VEEG-NINNY-VEEG sequence, as UspA1ATCC 43617 lacks the N-terminal VEEG, and its binding to Chang cells in vitro is diminished (11a). As discussed below, the presence or absence of an intact VEEG-NINNY-VEEG sequence strictly correlates with the ability of UspA2 and UspA2H variants to adhere to Chang cells, indicating that shared sequences explain the overlapping function of UspA proteins from these various groups.
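The correlation described above amounts to asking whether a variant carries the three cassettes VEEG, NINNY, and VEEG contiguously and in that order. The following Python sketch expresses that check under the assumption that each variant has already been annotated as an ordered list of cassette names running from the N terminus to the C terminus. The helper name `contains_ordered_run` and the two toy architectures are hypothetical illustrations, not the annotated alleles from this study, and a positive result is only a sequence-based prediction, not a measured binding phenotype.

```python
def contains_ordered_run(cassettes, run):
    """Return True if `run` occurs as a contiguous, ordered block in `cassettes`."""
    cassettes = list(cassettes)
    run = list(run)
    width = len(run)
    return any(
        cassettes[start:start + width] == run
        for start in range(len(cassettes) - width + 1)
    )


# Cassette order associated in the text with fibronectin/Chang cell binding
FIBRONECTIN_RUN = ("VEEG", "NINNY", "VEEG")

# Hypothetical cassette annotations (N terminus -> C terminus), for illustration only
toy_variants = {
    "variant_full": [
        "GGG", "FAAG", "VEEG", "VEEG", "NINNY", "VEEG", "LAAY-KASS", "CEACAM",
    ],
    "variant_missing_first_veeg": [
        "GGG", "FAAG", "NINNY", "VEEG", "LAAY-KASS", "CEACAM",
    ],
}

predicted = {
    name: contains_ordered_run(architecture, FIBRONECTIN_RUN)
    for name, architecture in toy_variants.items()
}

for name, has_run in predicted.items():
    print(f"{name}: intact VEEG-NINNY-VEEG = {has_run}")
```

The same helper could be pointed at any other ordered cassette combination, which is one way to keep functional predictions tied to individual alleles rather than to a UspA group as a whole.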
The N termini of UspA1 (Fig. 6) and UspA2H (Fig. 7B) both consist of a variable number (5 to 16) of GGG repeats followed by a FAAG motif. However, the region between the FAAG motif and the fibronectin-binding sequence is remarkably divergent and can be subdivided based upon apparent phylogenetic relationships within this region. For this reason, we have termed this region the UspA1 variable region (U1VR). Phenotypic analysis of strains expressing diverse U1VR domains may reveal sequence motifs that mediate other functions attributed to UspA1, including its ability to bind laminin (58) and at least two different proteins involved in the complement cascade (45, 46).
UspA2 proteins follow a trend very similar to that of UspA1 proteins in terms of sequence conservation. The CTER2 motif (Fig. 5B) is very highly conserved among UspA2 and UspA2H variants and consists of a membrane-spanning region linked to a coiled-coil stalk. CTER2 is linked to a FET motif (Fig. 7B) in all cases but one: UspA2TTA24 contains a portion of the CEACAM domain in place of the FET sequence. Based upon previous work (11a, 26, 27), it is not likely that this small portion of the CEACAM motif is sufficient for CEACAM binding. However, this arrangement highlights the potential for the exchange of cassette-like sequences between UspA1 and UspA2 coding sequences and clearly indicates that function must be assigned to individual alleles rather than to UspA protein groups.
The C-terminal stalk region of the UspA2 protein mimics what is seen in the UspA1 protein with a NINNY-KASS repeat except that UspA2 proteins completely lack the CEACAM-binding sequence, generating a LAAY-KASS-NINNY-KASS sequence. Rather than the single LAAY-KASS repeat present in UspA1 proteins (Fig. 6), UspA2 proteins have a variable number (two to four repeats) of LAAY-KASS sequences closely linked to one or more VEEG motifs (Fig. 7A). While VEEG and NINNY repeats exist in all complete UspA2 sequences available, they only rarely exist in a VEEG-NINNY-VEEG format that correlates with fibronectin binding. This is consistent with the fact that only a small subset of UspA2 variants bind to fibronectin and those that do display an obvious fibronectin/Chang cell binding motif (Fig. 7A).
As mentioned above, the UspA2 variants tend to display one of two distinct N-terminal “motifs” (NTER1 or NTER2) (Fig. 3). These motifs are linked to a variable region similar to that observed in UspA1, highlighting the diversity in UspA2 proteins among different strains. To highlight the fact that most stalk motifs found in the UspA2 variable region differ from those found in UspA1, we have termed this region the UspA2 variable region (U2VR) (Fig. 7).
It was previously demonstrated that serum resistance could be conferred on serum-sensitive M. catarrhalis strain 317 by exchange of the UspA2MC317 variable region (HDD-SIE) with that of UspA2O35E (LAAY-KASS-TAEER) from serum-resistant M. catarrhalis strain O35E (5). However, both of these UspA2 proteins contain an NTER1 motif, and the specific requirements for serum resistance remain unclear because both UspA27169 and UspA2O12E also confer serum resistance on their respective M. catarrhalis strains despite having what are clearly different UspA2 variable regions.
UspA proteins were previously assigned to one of the three main groups: UspA1, UspA2, or UspA2H. Our detailed analysis of 34 UspA amino acid sequences has revealed a remarkable diversity among the three UspA groups as well as the potential to exchange variable motifs between them. Structurally, the interstrain variability appears most evident within the N-terminal region, where the UspA1 and UspA2H proteins possess a wide variability in the number of GGG repeats, each of which forms an antiparallel β-strand, followed by an FAAG motif. This variability will undoubtedly cause large changes in the size of the head domain (Fig. 1 and 2), which is otherwise analogous to that of the Yersinia YadA protein (47, 57). In stark contrast, the UspA2 N terminus lacks any similarity to the UspA1/UspA2H/YadA structure. This is the greatest difference among these proteins, yet its impact awaits definition of a function attributable to the novel head structure. At the amino acid sequence level, the highly variable U1VR and U2VR regions (Fig. 6 and 7) extend the overall diversity by changing the length of the coiled-coil stalk and by conferring different binding phenotypes on the UspA1 and UspA2H proteins.
While the modularity of UspA proteins may facilitate immune escape due to the variation of peptide sequences exposed at the bacterial surface, it also clearly affects the bacterial phenotype. Perhaps the most striking examples of this involve host cellular attachment via CEACAM1 receptors and adherence to Chang cells, with the latter trait appearing to be mediated by binding to the extracellular matrix protein fibronectin. CEACAM binding is common to a variety of UspA1 variants (27), yet it is clearly not a property observed in all clinical isolates (Fig. 6) (11a). Sequence analysis reveals a direct link between the presence of a complete CEACAM-binding motif and CEACAM binding. While CEACAM binding is not apparent in any of the UspA2 variants tested (11a), the presence of a portion of the CEACAM-binding domain on UspA2TTA24 clearly illustrates the potential for exchange between UspA classes.
Rather than CEACAM receptor binding, Chang cell binding by M. catarrhalis is associated with the ability of UspA proteins to bind the extracellular matrix component fibronectin (59). This activity has been attributed to UspA1 and/or UspA2, yet the majority of UspA1 and UspA2H variants characterized contain the fibronectin-binding motif, whereas only a minority of the UspA2 proteins characterized in this study possessed it. Laminin binding was previously linked to the N-terminal regions of UspA1 and UspA2 (58); however, ascribing this function to a particular sequence is less clear because the amino acid sequences of laminin-binding UspA protein variants are generally unavailable.
Serum resistance is associated primarily with UspA2 expression (2, 6), but it is not inherent to all UspA2 variants (2, 5, 6), and different UspA2 variants appear to have different effects on the complement cascade (5, 45, 46). For example, while the complement components C4b (6, 45, 46) and C3 (45) both bind to UspAs, it is clear that C4b binding is restricted to certain M. catarrhalis strains (46) and that either UspA1 or UspA2 is of primary importance in serum resistances of different strains (6, 45). The issue of serum resistance in M. catarrhalis is further complicated by the fact that other mutations have been reported to have an adverse effect on the serum resistance of some strains. The inactivation of the genes encoding CopB (1), outer membrane protein CD (29), and outer membrane protein E (10) was shown to reduce serum resistance. In addition, the expression of at least three genes encoding lipooligosaccharide biosynthesis enzymes (38, 52, 64) is required for wild-type levels of serum resistance in M. catarrhalis. To date, however, only the UspA2 protein has been shown to be directly involved in the expression of serum resistance (5, 6).
In summary, while various studies have revealed novel UspA protein functions in a prototypical strain, it is clear that the function conferred by different UspA variants may differ widely. Moreover, it is enticing to consider that the various sequence “cassettes” evident in the UspA variants characterized to date may confer heretofore unrecognized functions. Future analyses must obviously consider the sequence of a particular UspA variant being studied and place the results in the context of known allelic differences. Considering the natural genetic competence of M. catarrhalis (13), it seems likely that the structural and functional determinants of each UspA protein are phylogenetically fluid, allowing the acquisition of various combinations of binding functions. This combinatorial nature of UspA proteins makes it essential to understand how these different functions interact, with an aim to ascertain whether certain bacterial phenotypes differentiate between asymptomatic commensalism and pathogenesis upon M. catarrhalis infection.
This work was supported by funding from Canadian Institutes of Health Research grant no. MOP-15499 to S.D.G.-O. and U.S. Public Health Service grant no. AI36344 to E.J.H. S.D.G.-O. is supported by New Investigator Awards from the Canadian Institutes of Health Research and is a recipient of the Province of Ontario Premier's Research Excellence Award.
We thank John Nelson, Anthony Campagnari, David Goldblatt, Richard Wallace, Steven Berk, Merja Helminen, and Frederick Henderson for providing the wild-type isolates of M. catarrhalis used in this study.
Editor: J. N. Weiser
Published ahead of print on 4 August 2008.