The cell wall envelope of gram-positive bacteria is a macromolecular, exoskeletal organelle that is assembled and turned over at designated sites. The cell wall also functions as a surface organelle that allows gram-positive pathogens to interact with their environment, in particular the tissues of the infected host. All of these functions require that surface proteins and enzymes be properly targeted to the cell wall envelope. Two basic mechanisms, cell wall sorting and targeting, have been identified. Cell wall sorting is the covalent attachment of surface proteins to the peptidoglycan via a C-terminal sorting signal that contains a consensus LPXTG sequence. More than 100 proteins that possess cell wall-sorting signals, including the M proteins of Streptococcus pyogenes, protein A of Staphylococcus aureus, and several internalins of Listeria monocytogenes, have been identified. Cell wall targeting involves the noncovalent attachment of proteins to the cell surface via specialized binding domains. Several of these wall-binding domains appear to interact with secondary wall polymers that are associated with the peptidoglycan, for example teichoic acids and polysaccharides. Proteins that are targeted to the cell surface include muralytic enzymes such as autolysins, lysostaphin, and phage lytic enzymes. Other examples of targeted proteins are the surface S-layer proteins of bacilli and clostridia, as well as virulence factors required for the pathogenesis of L. monocytogenes (internalin B) and Streptococcus pneumoniae (PspA) infections. In this review we describe the mechanisms for both sorting and targeting of proteins to the envelope of gram-positive bacteria and review the functions of known surface proteins.
The cell wall of gram-positive bacteria is host to a wide variety of molecules and serves a multitude of functions, most of which are critical to the viability of the cell. Although the primary function of the cell wall is to provide a rigid exoskeleton for protection against both mechanical and osmotic lysis (694, 695), the cell wall of gram-positive bacteria also serves as an attachment site for proteins that interact with the bacterial environment. Over the past decade, it has become apparent that gram-positive bacteria have evolved a number of unique mechanisms by which they can immobilize proteins on their surface. These mechanisms involve either the covalent attachment of protein to the peptidoglycan or the noncovalent binding of protein to either the peptidoglycan or secondary wall polymers such as teichoic acids.
This review describes our current knowledge about surface proteins of gram-positive bacteria and the mechanisms of their anchoring to the cell wall. Functions performed by the various wall proteins are incredibly diverse. For example, many covalently linked surface proteins of gram-positive pathogens are thought to be important for survival within an infected host (713). Other wall-targeted proteins are responsible for the controlled synthesis and turnover of the peptidoglycan at specific sites (division septa) during cell growth and division (348). It is believed that these enzymes are targeted to the division sites through a noncovalent interaction with specifically localized septal receptors. Still other surface proteins of gram-positive bacteria, including the internalin B molecule of Listeria spp., lysostaphin, and S-layer proteins, are immobilized to the cell surface by binding to secondary wall polymers present throughout the cell wall.
To facilitate this discussion of the mechanisms of protein attachment, we briefly discuss the mechanisms of protein secretion in these bacteria. We also summarize what is known about the structure, assembly, and turnover of the cell wall of gram-positive bacteria. For more detailed treatises on these subjects, we refer the reader to other excellent reviews (252, 707).
Gram-positive bacteria are simple cells. On the basis of morphological criteria three distinct cellular compartments can be distinguished: the cytosol, a single cytoplasmic membrane, and the surrounding cell wall (261). Some gram-positive bacteria synthesize a large polysaccharide capsule, whereas others elaborate a crystalline layer of surface proteins (739); both structures may envelop the entire cell. Spore-forming gram-positive bacteria, such as Bacillus subtilis, generate morphologically distinct daughter cells by a developmental program of asymmetric cell division (501). The cell wall of spores differs from that of mother cells and contains specific sets of proteins (645, 852). Some gram-positive bacteria divide without separating their cell walls and thus continue to grow as strings of cells (streptococci) or as clusters (staphylococci). Figure 1 is a transmission electron micrograph of thin-sectioned Staphylococcus aureus cells, revealing the characteristic morphology of the subcellular compartments of gram-positive bacteria.
The cell wall of gram-positive bacteria is a peptidoglycan macromolecule with attached accessory molecules such as teichoic acids, teichuronic acids, polyphosphates, or carbohydrates (302, 694). The glycan strands of the cell wall consist of the repeating disaccharide N-acetylmuramic acid-(β1-4)-N-acetylglucosamine (MurNAc-GlcNAc) (254, 255). Glycan strands vary in length and are estimated to contain 5 to 30 subunits, depending on the bacterial species investigated (270, 331, 749). In most cases, the d-lactyl moiety of each MurNAc is amide linked to the short peptide component of peptidoglycan (256, 564, 792). Wall peptides are cross-linked with other peptides that are attached to a neighboring glycan strand (787–789, 791), thereby generating a three-dimensional molecular network that surrounds the cell and provides the desired exoskeletal function (463, 760) (Fig. 2).
Cell wall synthesis in gram-positive bacteria can be divided into three separate stages that occur in distinct subcellular compartments: the cytoplasm, the membrane, and, finally, the cell wall itself (759, 761) (Fig. 3). Although we discuss here peptidoglycan synthesis in S. aureus, all bacteria use a nearly identical synthesis pathway. Peptidoglycan synthesis begins by generating UDP-MurNAc from UDP-GlcNAc and phosphoenolpyruvate (287, 758). Five amino acids are linked to UDP-MurNAc in four consecutive steps: l-Ala, d-isoGlu, l-Lys, and, finally, the d-Ala–d-Ala dipeptide (370, 554, 572). Synthesis of d-Ala–d-Ala requires two enzymes. The first, d-Ala racemase, converts l-Ala to d-Ala, while the second, d-Ala ligase, creates a peptide bond between two d-Ala residues (113, 578, 823). Synthesis of each peptide (amide) bond within the wall peptide consumes 1 high-energy phosphate (ATP) (812). The end product of the cytoplasmic segment of the cell wall synthesis pathway, UDP-MurNAc–l-Ala–d-isoGlu–l-Lys–d-Ala–d-Ala (UDP-MurNAc-pentapeptide, Park’s nucleotide), has been solubilized by acid extraction of gram-positive cells and purified (616, 617). Radiolabeled Park’s nucleotide was used as substrate for the in vitro synthesis of peptidoglycan, which provided a powerful assay for the elucidation of this biochemical pathway (116). UDP-MurNAc-pentapeptide is phosphodiester linked to an undecaprenyl-pyrophosphate carrier molecule at the expense of UDP (C55–PP-MurNAc–l-Ala–d-isoGlu–l-Lys–d-Ala–d-Ala, or lipid I) (13, 371). UDP-GlcNAc is linked to the muramoyl moiety to generate the disaccharide lipid II precursor [C55–PP-MurNAc(-l-Ala–d-isoGlu–l-Lys(Gly5)–d-Ala–d-Ala)-β1-4-GlcNAc] (335–337). Lipid II is further modified by the addition of amino acids to the ε-amino of lysine (12, 417). Figure 3 shows these reactions for the biosynthesis of staphylococcal peptidoglycans, which contain five glycine residues linked to l-Lys. Three glycyl-tRNA species are thought to be dedicated to this biosynthetic pathway (281, 672, 673, 755). Finally, the modified lipid II precursor is translocated across the cytoplasmic membrane and serves as the substrate for the assembly of peptidoglycan (Fig. 3).
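The energy bookkeeping for the cytoplasmic stage can be tallied in a few lines. The following Python sketch is only an illustration of the stoichiometry stated above (one ATP per amide bond of the wall peptide); the names ADDITION_STEPS and summarize_precursor are hypothetical, and the sketch deliberately ignores the racemase step, the pentaglycine cross-bridge, and the lipid-linked steps.

```python
# Units added to UDP-MurNAc in the cytoplasm, in the order given in the text.
# Each entry: (label of the unit, residues it contributes).
# Names and structure here are illustrative assumptions, not from the review.
ADDITION_STEPS = [
    ("L-Ala", ["L-Ala"]),
    ("D-isoGlu", ["D-isoGlu"]),
    ("L-Lys", ["L-Lys"]),
    ("D-Ala-D-Ala dipeptide", ["D-Ala", "D-Ala"]),  # preformed by D-Ala ligase
]

ATP_PER_AMIDE_BOND = 1  # stoichiometry as stated in the text


def summarize_precursor(steps):
    """Return (precursor name, residue count, amide bond count, ATP count)."""
    residues = []
    amide_bonds = 0
    for _label, unit in steps:
        amide_bonds += 1              # bond joining the unit to the growing chain
        amide_bonds += len(unit) - 1  # bonds already inside a preformed unit
        residues.extend(unit)
    name = "UDP-MurNAc-" + "-".join(residues)
    return name, len(residues), amide_bonds, amide_bonds * ATP_PER_AMIDE_BOND


if __name__ == "__main__":
    name, n_res, n_bonds, n_atp = summarize_precursor(ADDITION_STEPS)
    print(name)
    print(f"{n_res} residues, {n_bonds} amide bonds, {n_atp} ATP")
```

Under these assumptions the four addition steps plus the bond inside the preformed dipeptide give five amide bonds, and hence five ATP, per UDP-MurNAc-pentapeptide.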
Cell wall assembly is catalyzed by penicillin binding proteins (PBPs) (251). The high-molecular-weight PBPs are bifunctional enzymes that have recently been categorized into one of two classes based on sequence similarity (251, 272). Class A PBPs promote both the polymerization of glycan from its disaccharide precursor, i.e., the successive addition of MurNAc(-l-Ala–d-isoGlu–l-Lys–d-Ala–d-Ala)-GlcNAc to C55–PP-MurNAc(-l-Ala–d-isoGlu–l-Lys–d-Ala–d-Ala)-GlcNAc, and the transpeptidation (cross-linking) of wall peptides (570). The latter reaction results in the proteolytic removal of the d-Ala at the C-terminal end of the pentapeptide and the formation of a new amide bond between the amino group of the crossbridge and the carbonyl group of d-Ala at position 4 (791). The transpeptidation reaction of staphylococcal peptidoglycan is depicted in Fig. 3. This reaction is the target of penicillin and other β-lactam antibiotics, which mimic the structure of d-alanyl–d-alanine (791). After cleavage, the β-lactam ring continues to occupy the active site serine residue of PBPs, thereby inhibiting PBPs (871, 872). Less is known about the class B PBPs; however, they are speculated to be involved in morphogenetic networks (272).
Not all cell wall peptides are cross-linked to their neighboring peptidoglycan strands. Unsubstituted (free) peptides may be preserved by trimming the terminal d-Ala off the pentapeptides (839). This reaction is catalyzed by other, low-molecular-weight PBPs that function as carboxypeptidases (372, 805). The reactions carried out by transpeptidases and carboxypeptidases are similar in nature. Transpeptidases use an amino group as a nucleophile to resolve the acyl-enzyme intermediate between the active-site hydroxyl of serine and the carbonyl of d-alanine, whereas carboxypeptidases use water as a nucleophile (251). Consequently, there are sequence and structural similarities between these enzymes (251). The ratio between transpeptidation and carboxypeptidation is thought to be important in generating a three-dimensional structure of the cell wall (463). For example, different degrees of cross-linking could allow wall synthesis at distinct angles and the generation of curves in otherwise cylindrical cells. However, this assumption has never been tested for the distinctly shaped gram-positive bacteria. Peptidoglycan with a low degree of cross-linking is much more sensitive to degradation by cell wall hydrolases (760). It is conceivable that the degree of cross-linking plays a role in specifying sites on the cell wall that are more prone to either degradation or additional synthesis.
While the repeating disaccharide [MurNAc-(β1-4)-GlcNAc] is found in all bacterial peptidoglycans, the wall peptides differ in composition between bacterial species (710). Some organisms, such as Listeria, replace l-Lys with another diamino acid, in this case m-diaminopimelic acid (710). Others add amino acids to the side chain ε-amino of l-Lys to synthesize extended peptidoglycan cross-bridges (710). Figure 2 provides an example of the peptidoglycan structures of three different gram-positive bacteria. The wall peptides can either be cross-linked with those of another glycan strand, such that the ε-amino of l-Lys or any other amino acid added at this position will be amide linked to the carbonyl of d-Ala in the wall peptide, or be trimmed to generate an un-cross-linked wall tetrapeptide (710).
The cell wall of many gram-positive bacteria is modified by O-acetylation of MurNAc at C-6 (790). Although the enzymatic mechanism for this decoration remains unknown, it does confer resistance to cell wall degradation by animal lysozymes (97). Other chemical modifications include additions of teichoic acids, phosphorylation, and attachment of carbohydrates (see below). Recent advances in cell wall structure have been made by analyzing peptidoglycan breakdown products by reverse-phase high-pressure liquid chromatography combined with mass spectrometric analysis (144, 244, 269, 801). These techniques allow the characterization of different degrees of cross-linking as well as the identification of new cross-linking reactions within the cell wall (145).
Gram-positive bacteria synthesize several compounds that decorate their peptidoglycan exoskeleton (694). Based on structural differences, one can distinguish teichoic acids, teichuronic acids, lipoteichoic acids, lipoglycans, and polysaccharide modifications (694). Over the past 30 years, the complete structures of some of these compounds as well as the genes required for their synthesis have been characterized. This research has been reviewed in detail elsewhere, and we provide here a brief summary of these elements, since they may be important for protein targeting mechanisms (17, 205, 643).
All gram-positive bacteria are thought to synthesize anionic polymers that are covalently attached to the peptidoglycan or tethered to a lipid anchor moiety (18, 23). Typically these polymers consist of polyglycerol phosphate (Gro-P), poly-ribitol phosphate (Rit-P), or poly-glucosyl phosphate (Glc-P), all of which may be glucosylated and/or amino acid esterified (205, 643). Figure 4 compares the structure of staphylococcal cell wall teichoic acid with that of two other bacteria. A polymer of 30 to 50 Rit-P subunits is phosphodiester linked to 2 or 3 Gro-P residues which are linked to the disaccharide ManNAc(β1-4)GlcNAc (16, 183, 184, 700, 701). The GlcNAc moiety is (1–6) phosphodiester linked to MurNAc within the peptidoglycan repeating disaccharide (MurNAc-GlcNAc) (128, 318, 438). The (Gro-P)n-ManNAc-GlcNAc tether between wall teichoic acids and the peptidoglycan is referred to as the linkage unit (15). Although many gram-positive bacteria are known to synthesize identical linkage units, some species modify its structure with other carbohydrates whereas a few others use entirely different compounds to tether their wall teichoic acids (15).
Modifications of the repeating subunits differ from one bacterial species to another (17). Both Gro-P and Rit-P repeating units can be decorated with a variety of different sugars and can also be esterified with d-alanine (78–80). Synthesis of the backbone structure is thought to occur in the bacterial cytoplasm (78, 643). Synthesis begins by fusing UDP-GlcNAc to prenol-phosphate (862, 863) (Fig. 5). The resulting GlcNAc-PP-prenol is then mannosylated with UDP-ManNAc to yield ManNAc-GlcNAc-PP-prenol (873). The lipid-anchored cell wall linkage unit serves as an attachment site for the addition of two or three Gro-P units and is extended by Rit-P, which is added from CDP-ribitol substrates (267, 268, 521, 642). The end product of this synthetic pathway is presumably translocated across the cytoplasmic membrane via an ATP binding cassette transporter system (472). Attachment of wall teichoic acid to MurNAc proceeds during cell wall assembly; however, the enzymatic machinery required for this process is still unknown.
Esterification of Gro-P or Rit-P with d-Ala occurs on the surface of the cytoplasmic membrane, i.e., after teichoic acids have been translocated across the membrane (580). Four gene products are required for this process, during which d-Ala is first linked to a diacyl carrier protein and then linked to an undecaprenol phosphate lipid carrier (139, 316, 317, 580, 592, 641). Lipid-linked d-Ala is translocated across the cytoplasmic membrane and serves as a substrate for esterification. This reaction is presumably used for the d-Ala ester modification of cell wall teichoic acids and lipoteichoic acids (579). Cell wall teichoic acids are thought to be uniformly distributed over the entire peptidoglycan exoskeleton (806). Some teichoic acid decorations, for example the esterified d-Ala, are unstable, and S. aureus continuously reesterifies d-Ala residues (431, 433).
Although the structures of cell wall teichoic acids are largely known and some of the genes involved in their synthesis appear to be essential for the growth of gram-positive bacteria, the physiological role of these molecules is still not completely understood (642). It is conceivable that the negatively charged teichoic acids function to capture divalent cations or provide a biophysical barrier to prevent the diffusion of substances (205, 207). However, these claims have been largely speculative and experiments that directly prove or disprove them are difficult to design. Cell wall teichoic acids appear to be the binding sites for some enzymes that cleave the bacterial peptidoglycan (333). For example, the LytA amidase of S. pneumoniae binds to the choline moiety of the cell wall teichoic acids of this organism (351, 352). Conceivably, the affinity for teichoic acids directs murein hydrolases to the cell walls of specific species (discussed below). Thus, teichoic acids may serve as species-specific decorations which allow gram-positive bacteria to synthesize an envelope structure that is chemically distinct from the envelope of other organisms that display an otherwise identical peptidoglycan exoskeleton.
Lipoteichoic acids are polyanionic polymers inserted in the outer leaflet of the cytoplasmic membrane via a lipid moiety (205). The polymer extends through the cell wall peptidoglycan onto the surface of gram-positive cells. The precise function of lipoteichoic acids is unknown (205). The cell wall teichoic acids and lipoteichoic acids possess different chemical structures in all gram-positive bacteria except Streptococcus pneumoniae. For example, S. aureus synthesizes poly-Rit-P cell wall teichoic acids, which are modified at the 2′ and 4′ hydroxyl with GlcNAc and d-Ala (701). Staphylococcal lipoteichoic acids are composed of poly-Gro-P, which can be modified at the 2′ hydroxyl of Gro-P with either GlcNAc or esterified d-Ala (165, 208–210). The lipoteichoic acid moiety of S. aureus is attached to Glc(1-4)Glc-(1-3)diacylglycerol (433).
Lipoteichoic acids of other bacteria have different repeating subunits and/or different lipid anchors. Figure 6 compares the structure of staphylococcal lipoteichoic acid with that from S. pneumoniae (205). Lipoglycans are related structures that can be distinguished from lipoteichoic acids by the absence of phosphate from the repeating subunits (205). Synthesis of the lipoteichoic acids is thought to occur on the surface of the cytoplasmic membrane. Thus, the substrates of each lipoteichoic acid constituent must be translocated across the cytoplasmic membrane prior to assembly (205). Indeed, each of the precursor molecules is tethered to a lipophilic moiety: either glycolipid, phosphatidylglycerol, hexosyl-1-phosphoundecaprenol, or undecaprenyl-d-Ala (106, 241) (Fig. 7).
Lipoteichoic acid synthesis begins in the cytoplasm with the synthesis of phosphatidic acid from glycerol, ATP, and two acyl side chains (Fig. 7). Phosphatidic acid is converted to phosphatidyl-glycerophosphate and then to phosphatidylglycerol, which serves as a substrate for the polymerization of Gro-P (266). This synthetic scheme requires large amounts of diacylglycerol, which are recycled to phosphatidic acid or used for the biosynthesis of other membrane lipids (433). Glycosylation of lipoteichoic acid requires hexose linked to undecaprenyl, which is presumably synthesized in the cytoplasm and translocated across the membrane (206, 432, 482). It is conceivable that lipoteichoic acids, similarly to the cell wall teichoic acids of S. pneumoniae, also serve as species-specific decorations of the peptidoglycan exoskeleton.
Protein secretion has been studied extensively in Escherichia coli, and the paradigms established through this work are thought to be true for all bacterial and eukaryotic cells (171, 640, 654). Proteins destined for translocation across the cytoplasmic membrane are marked by a signal (leader) peptide (65) that is generally composed of a core of 15 to 20 hydrophobic residues flanked at the N-terminal end by positively charged residues (182, 728). For secreted proteins, signal peptides are proteolytically removed by signal (leader) peptidases upon translocation across the cytoplasmic membrane (136, 153). Signal peptides are necessary and sufficient for protein translocation across membranes if the fused polypeptide substrate can be maintained in an export-competent state, a function that can be achieved by one of two separate pathways (142, 808). In one pathway, a signal recognition particle (SRP) can bind to the signal peptides of nascent chains and temporarily arrest their ribosomal translation (824–826). The SRP of E. coli is thought to be a ribonucleoprotein particle consisting of Ffh (also known as P48) and 4.5S RNA (46, 648, 678, 762). Translation resumes after the SRP-ribosome complex docks onto its membrane receptor (FtsY), thereby delivering the nascent polypeptide to the Sec translocation channel (551, 808). Alternatively, signal peptide-bearing precursors may be translocated after their synthesis has been completed, i.e., by a posttranslational translocation process. Binding of a secretion chaperone, SecB, can maintain these precursors in an unfolded, translocation-competent state (452, 661). The SecB protein binds to the mature part of signal peptide-bearing precursors but may also interact with the SecA protein, a component of the secretion machinery itself (90, 151, 341, 662). Once the precursor has been initiated into the secretory pathway by its signal peptide, the SecB chaperone dissociates from it, allowing the translocation of the full-length polypeptide across the membrane (197). The two pathways converge at the translocation channel, and the choice of pathway may be a function of the hydrophobicity of the signal peptide (807, 808). Recent observations suggest that the SRP pathway may target proteins destined for the inner membrane to the Sec translocase, whereas the SecB pathway seems to be used by proteins that are secreted across the membrane (721, 804).
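The general architecture of a signal peptide described above (a hydrophobic core of roughly 15 to 20 residues preceded by positively charged residues) can be turned into a crude sequence check. The Python sketch below is an illustration only: the residue sets, the 35-residue N-terminal window, the function names, and the toy sequences are all assumptions made here, and dedicated prediction tools use far more refined models than this run-length heuristic.

```python
# Assumed residue classes for this sketch (not taken from the review).
HYDROPHOBIC = set("AVLIMFWC")
POSITIVE = set("KR")


def longest_hydrophobic_run(segment):
    """Return (start index, length) of the longest run of hydrophobic residues."""
    best_start, best_len = -1, 0
    run_start, run_len = 0, 0
    for i, aa in enumerate(segment):
        if aa in HYDROPHOBIC:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len > best_len:
                best_start, best_len = run_start, run_len
        else:
            run_len = 0
    return best_start, best_len


def has_signal_peptide_features(seq, window=35, min_core=15):
    """True if the N-terminal window holds a hydrophobic core of at least
    min_core residues that is preceded by at least one K or R."""
    n_term = seq.upper()[:window]
    start, length = longest_hydrophobic_run(n_term)
    if length < min_core:
        return False
    return any(aa in POSITIVE for aa in n_term[:start])


if __name__ == "__main__":
    # Toy sequences invented for illustration, not real proteins.
    core = "LLAVALLAVLAALLAVA"
    print(has_signal_peptide_features("MKKR" + core + "SQAEDTT"))  # charged N terminus
    print(has_signal_peptide_features("MSTE" + core + "SQAEDTT"))  # no K/R before core
```

The thresholds are exposed as parameters because the text notes that signal peptides of gram-positive bacteria tend to be longer and more hydrophobic than those of gram-negative bacteria, so a single fixed cutoff would not suit both.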
The gene products necessary for the initiation of leader peptide-containing precursors into the secretory pathway were identified through mutants that restored the β-galactosidase activity of LacZ fusions to the C terminus of signal peptide-bearing proteins (600). These LacZ fusion proteins are substrates for export but arrest at an undefined step of the secretion pathway and block the export of other precursors. The results of the sec screens generally matched those of a suppressor screen (prl, for “protein localization”) that scored for the export of a mutant polypeptide harboring a defective signal peptide (181). The functionality of the Sec proteins in membrane translocation of precursor proteins has been elegantly demonstrated in vitro (309, 563). SecYEG is the preprotein translocase channel and requires the SecA ATPase to push polypeptides through a hydrophilic channel (172, 174, 179, 811). The SecDF and YajC proteins function at a later step, perhaps in regulating the activity of SecA (173).
Homologs of SecA, SecD, SecE, SecF, SecY, and YajC have been identified in the sequenced genome of the gram-positive organism B. subtilis; however, no homologs of SecB and SecG were found (454). It is not clear whether SecB and SecG are dispensable for secretion in B. subtilis or whether they have been replaced with other polypeptides that do not display homology to the E. coli gene products. Ffh, 4.5S RNA, and FtsY are conserved between E. coli and B. subtilis (102, 456, 774). Another striking difference between E. coli and B. subtilis is the number of signal peptidases. The gram-positive organism appears to encode seven signal peptidases, whereas E. coli contains only two. Two of the Bacillus genes specify class 2 signal peptidases for the maturation of lipoproteins, whereas the other five signal peptidases are similar to the E. coli lepB gene product. Signal peptides of gram-positive bacteria differ from those of their gram-negative counterparts by usually being longer and more hydrophobic and possessing more charge in their N-terminal ends (818, 819). Although these signal peptides of gram-positive organisms generally function in gram-negative bacteria, the same is not always true when signal peptides of gram-negative bacteria are expressed in gram-positive organisms (716). Could it be that gram-positive bacteria require additional components to recognize signal peptides? This has been tested biochemically for both B. subtilis and S. aureus by purifying membranes with bound ribosomes and comparing their protein content with that of membranes that did not contain ribosomes (2, 3). Although several different polypeptides could be identified as a secretory (S) complex, analysis of the cloned sequences revealed pyruvate dehydrogenase, an enzyme of intermediary metabolism that is presumably not directly involved in protein secretion (329, 330).
The existence of a periplasmic space in gram-positive bacteria that is analogous to the one found in gram-negative bacteria has been a matter of speculation for many years (276). Morphological inspection has at times revealed a narrow space between the cytoplasmic membrane and the cell wall peptidoglycan (276). Visualization of this space depended on the mode of sample preparation (275). Others have emphasized an operational definition for the periplasm of gram-positive bacteria as a subcellular compartment (544); i.e., it should contain a subset of secreted proteins that are specifically targeted to this compartment (644).
Several proteins that are periplasmic in gram-negative bacteria were observed to be lipid modified in gram-positive bacteria (264). These lipoproteins function to capture specific import substrates, such as carbohydrates, and deliver them to transport machinery embedded within the cytoplasmic membrane. The homologous proteins in gram-negative cells are generally soluble in a periplasmic space that is bounded by both the bacterial inner and outer membranes. Hence, the lipoyl moiety might be a targeting device to retain polypeptides on the membrane surface of gram-positive bacteria (264, 769, 770). For example, β-lactamase (Bla) is a soluble periplasmic enzyme that confers resistance to β-lactam antibiotics in gram-negative bacteria (444) while the β-lactamases of S. aureus and B. licheniformis are lipoproteins (584, 585). β-Lactamase precursors in gram-positive bacteria appear to be substrates for both type I and type II signal peptidase, resulting in mixed populations of secreted, soluble Bla as well as glyceride-modified, membrane-anchored Bla (573). Mutant Bla exported solely by a type I signal peptide is secreted into the extracellular medium but fails to protect staphylococci from β-lactam antibiotics (573). Thus, tethering of such enzymes to the cytoplasmic membranes is an important factor in both lowering the local concentration of inhibitors such as penicillin and raising the local concentration of necessary nutritional resources.
The high incidence of streptococcal and staphylococcal human diseases at the beginning of this century sparked the interest of many medical investigators in these microbes. Initial efforts concentrated on the characterization of the predominant antigens during infection by gram-positive bacteria with the rationale of developing protective vaccines (285, 466). Although vaccines for Streptococcus pyogenes and Staphylococcus aureus are still not commercially available, this research rapidly identified and characterized protein and carbohydrate structures on bacterial surfaces. Lancefield and colleagues classified streptococci on the basis of carbohydrate antigens. The group A streptococcus (GAS), S. pyogenes, is the causative agent of pharyngitis, purulent skin lesions, and postinfectious sequelae. GAS strains display M protein on their surface and can be typed with antisera raised against the N-terminal portion of this α-helical coiled-coil molecule (467). More than 100 types are known, and it was recognized early that these type-specific antibodies conferred immunity to infection of strains carrying one specific M type by promoting the phagocytic killing of GAS (461, 467). Hence, much effort was directed at characterizing the M protein molecule as an antiphagocytic antigen (212).
In 1958, Jensen observed that several S. aureus strains could precipitate immunoglobulins (Ig) from both human and preimmune animal serum in gel precipitation assays (380). The nonimmune precipitation of antibodies is due to the binding of protein A on the staphylococcal surface to the Fc portion of Ig (222). However, antibodies directed specifically at protein A are not protective during animal infections, and strains deficient in protein A do not display a significant reduction in their pathogenic potential in an animal experimental system (403). Although we do not know the precise role of protein A during S. aureus infections, this molecule has served as a model system for immunological, structural, and microbiological studies (588).
It is widely assumed that surface proteins of gram-positive bacteria might interact with eukaryotic proteins as a means of establishing residence at unique locations or evading the immune system. Although these assumptions have not always been rigorously tested, they have increased research interest in this field. The advent of molecular cloning techniques yielded a rapid accumulation of DNA sequence and allowed structural comparison of surface proteins. The current completion of several microbial genome sequences will soon provide us with a comprehensive view of all sequences. Therefore, we assume that future work will concentrate on the physiological and biochemical characterization of surface proteins in gram-positive bacteria.
Initial experiments to characterize surface proteins focused on solubilizing and purifying streptococcal M protein and staphylococcal protein A from the bacteria. Lancefield used acid extraction of streptococci to release M proteins from the cell surface (467). This treatment caused partial hydrolysis of polypeptides but not of the bacterial cell wall, and released peptide fragments were recovered in the supernatant after centrifugation of acid extracts. Treatment with detergents, bases, or proteases or by boiling also released some M protein; however, all of these preparations yielded either only peptide fragments or minute amounts of material, indicating that the solubilization technique was insufficient (212). This problem was overcome by the use of cell wall-hydrolytic enzymes, which enabled the efficient solubilization of full-length surface proteins (212, 340). The biochemical analysis of protein A was facilitated by its ability to be purified by affinity chromatography on matrices containing linked immunoglobulins after its enzymatic solubilization from the staphylococcal cell wall (340). When purified S. aureus cell walls were digested with lysostaphin, a bacteriolytic enzyme secreted by Staphylococcus simulans bv. staphylolyticus, protein A was released as a homogeneous population (737). Egg white lysozyme, an N-acetylmuramidase, does not effectively degrade the staphylococcal cell wall. Nevertheless, Sjöquist et al. were able to purify small amounts of lysozyme-released protein A and to show that it migrated more slowly on sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) than did the lysostaphin-released counterpart, suggesting an increase in molecular mass (738). Acid hydrolysis of the lysozyme-released species yielded the amino sugars GlcNAc and MurNAc, which were not observed for lysostaphin-released protein A (738). On the basis of these observations, Sjöquist et al. concluded that protein A must be linked to the staphylococcal cell wall (738).
Cloning and sequencing of the emm (streptococcal M protein) and spa (staphylococcal protein A) genes revealed open reading frames that specified polypeptides harboring an N-terminal signal peptide and a C-terminal hydrophobic domain (346, 803). Because hydrophobicity is a universal signal for membrane anchoring of polypeptides (138), it seemed plausible that surface proteins could be tethered to the cytoplasmic membrane of gram-positive bacteria rather than linked to the cell wall. Comparison of several additional surface protein gene sequences allowed a more detailed analysis of their primary structures and revealed a striking C-terminal homology, namely, the presence of an LPXTG sequence motif, where X is any amino acid, followed by a C-terminal hydrophobic domain and a tail of mostly positively charged residues (216). The remarkable conservation of the LPXTG motif in surface proteins of gram-positive bacteria suggested that this element must be involved in the anchor mechanism of surface proteins in gram-positive bacteria.
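The tripartite C-terminal organization described above (an LPXTG motif, followed by a hydrophobic domain and a tail of mostly positively charged residues) lends itself to a simple sequence scan. The following Python sketch is illustrative only: the function name find_sorting_signal, the hydrophobic residue set, the length thresholds, and the toy sequence are assumptions made here and are not parameters from the studies cited.

```python
import re

# Assumed residue classes and thresholds for this sketch (not from the review).
HYDROPHOBIC = set("AVLIMFWG")
POSITIVE = set("KR")


def find_sorting_signal(seq, min_hydrophobic=12, tail_len=8, min_positive=2):
    """Look for the last LPXTG motif and check that it is followed by a
    hydrophobic stretch and a positively charged C-terminal tail.
    Returns a dict describing the hit, or None if the pattern is not met."""
    seq = seq.upper()
    matches = list(re.finditer(r"LP[A-Z]TG", seq))  # X = any amino acid
    if not matches:
        return None
    motif = matches[-1]                 # sorting signals are C-terminal
    downstream = seq[motif.end():]

    # Longest uninterrupted run of hydrophobic residues after the motif.
    longest = run = 0
    for aa in downstream:
        run = run + 1 if aa in HYDROPHOBIC else 0
        longest = max(longest, run)

    # Count K/R residues in the last tail_len residues of the protein.
    n_positive = sum(aa in POSITIVE for aa in downstream[-tail_len:])

    if longest >= min_hydrophobic and n_positive >= min_positive:
        return {
            "motif": motif.group(),
            "motif_start": motif.start(),
            "hydrophobic_run": longest,
            "tail_positive": n_positive,
        }
    return None


if __name__ == "__main__":
    # Toy sequence invented for illustration, not a real surface protein.
    toy = "MKKAAA" + "LPETG" + "AVLIVLAGLLAVGA" + "RRKEN"
    print(find_sorting_signal(toy))
```

Requiring all three elements, rather than the pentapeptide alone, reflects the observation in the text that the motif is conserved together with the downstream hydrophobic domain and charged tail; a bare LPXTG match would occur by chance in many unrelated proteins.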
To characterize the C-terminal part of M proteins, Pancholi and Fischetti digested the surface-exposed portion of this molecule with trypsin (613). Streptococci were washed to remove trypsin as well as cleaved peptide fragments and subsequently digested with phage lysin amidase, which resulted in a spectrum of released M-protein fragments with increasing mass when analyzed by SDS-PAGE (613). The fastest-migrating species was further purified and characterized by Edman degradation as well as amino acid composition. The data indicated that the C-terminal end of M protein, beginning at residue 302, is uniformly protease protected (613). Amino acid analysis revealed that the C-terminal peptide did not contain phenylalanine and suggested that the phenylalanine-containing hydrophobic domain might be absent from mature M protein (612). In a subsequent study of the possible membrane anchor cleavage activity, these data were interpreted as indicating a cleavage between the proline and the serine of the LPSTG motif (612), an estimate that was close to the later-identified cleavage site between the threonine and the glycine of the LPXTG motif within protein A (see below). Phage lysin-released anchor fragments do not contain amino sugars (396). This can be explained by the amidase activity of this enzyme, which removes the glycan component of the cell wall from its crosslinked peptide backbone (613). Thus, although this has never been demonstrated, the data corroborate a model in which the C-terminal end of M proteins may also be covalently linked to the peptidoglycan.
Chemical analysis of the peptidoglycan of several other bacterial species including S. pneumoniae (476), S. mutans, S. sanguis (64, 680), Lactobacillus fermenti (822), and Mycobacterium tuberculosis (853) revealed the presence of significant amounts of nonpeptidoglycan amino acids. Analysis of the cell walls of S. mutans led to the conclusion that it contained covalently attached proteins, and it was suggested that this may be a feature common to the cell walls of all gram-positive species (577, 689).
Staphylococcal protein A has been used as a model system to study the anchoring of surface proteins in gram-positive bacteria (716). The cytoplasmic protein A precursor is exported and processed to generate the mature anchored species within 1 min of its synthesis. As described above, the anchored mature species is accessible to protease on the bacterial surface and requires enzymatic release from the staphylococcal cell wall for solubility. Lysostaphin cleaves at the pentaglycine crossbridge of the staphylococcal cell wall and solubilizes protein A as a uniform species on SDS-PAGE (716). In contrast, muramidase cleaves the glycan strands of the cell wall and muramidase-released protein A appears as a spectrum of bands on SDS-PAGE, all of which migrate more slowly than the lysostaphin-released counterpart (715). The notion that these mass differences must be the result of linked peptidoglycan was confirmed by an experiment in which muramidase-released protein A is digested with lysostaphin, which converts all material to the same uniform mobility as that observed for protein A directly solubilized by lysostaphin (715). Because of the uniform migration of lysostaphin-released fragments on SDS-PAGE, the anchoring point of surface proteins must be more proximal to the pentaglycine crossbridge, i.e., the cleavage site of lysostaphin, than to the glycan chains where muramidase cuts.
Protein A lacking a C-terminal sorting signal, i.e., the LPXTG motif, hydrophobic domain, and charged tail, is secreted into the medium and thus does not require the enzymatic digestion of the cell wall in order to be soluble when cells are boiled in SDS (716) (Fig. 8). Mutant protein A with the charged tail truncated behaves similarly. Lysostaphin-solubilized protein A migrates faster on SDS-PAGE than the secreted, unanchored species does, suggesting that the polypeptide chain is proteolytically cleaved during the anchoring process (716). To confirm that the secreted species comprised the complete (unprocessed) polypeptide chain, a protein A mutant that contains a single cysteine at the C terminus has been generated. Metabolic labeling of this cysteine revealed that the secreted protein A mutant is uncleaved and that some form of processing must be required for the anchoring of surface proteins (716).
Mutants with mutations within the LPXTG motif have a different phenotype. These polypeptides are not secreted into the medium but remain cell associated. Compared to the wild type, the LPXTG mutants do not reveal the characteristic size differences of lysostaphin- and muramidase-released species, indicating that these substances are not covalently linked to the cell wall (715). Furthermore, the mutant polypeptides also migrate more slowly on SDS-PAGE than their anchored counterparts do, suggesting that processing, i.e., proteolytic cleavage and cell wall linkage, is disrupted. Thus, the two different mutant phenotypes reveal that surface protein anchoring is arrested at distinct steps. The hydrophobic domain and charged tail cause protein A to be retained from the secretory pathway, whereas the LPXTG motif is needed for cleavage and linkage to the cell wall.
To determine the signal sufficient for cell wall anchoring of proteins normally secreted into the medium or lipid anchored to the cytoplasmic membrane, C-terminal protein A sequences have been fused to staphylococcal enterotoxin B (Seb), β-lactamase (BlaZ), or E. coli alkaline phosphatase (PhoA) exported by the protein A signal peptide (715, 716). The C-terminal 35 residues of protein A, comprising the LPXTG motif, hydrophobic domain, and charged tail, are sufficient for cell wall anchoring of each of these fusion proteins. Homologous sequence elements of other surface proteins also function to signal cell wall anchoring (715). When fused to the C terminus of secreted reporter proteins, some of these signals function in a manner indistinguishable from that of the protein A signal. Others fail to signal anchoring but can be mutated to gain this function. In all cases examined, these mutations alter the spacing between the LPXTG motif and the charged tail. The absolute number of residues between the LPXTG motif and the charged tail does not appear to be critical. It has been proposed that the folding of the hydrophobic domain determines the correct spacing between the flanking elements required for proper recognition of the sorting signal (715).
Retention is signaled by the positively charged residues within the charged tail of the sorting signal. Although a single arginine is most effective in signaling retention, this residue can be replaced by lysine when other positively charged residues are present. Increasing the spacing between the LPXTG motif and the charged tail further does not interfere with cell wall sorting but proves to be toxic for staphylococci. The reason for this phenomenon is not known (715).
The LPXTG motif is conserved within the sorting signals of all known wall-anchored surface proteins of gram-positive bacteria (Table 1). The threonine (T) displays some variation in that either alanine or serine can be found at this position. A threonine-to-alanine substitution was tested in the sorting signal of staphylococcal protein A, and this mutation does not affect anchoring, which suggests that the identified sequences are indeed functional sorting signals. Other mutations in the LPXTG motif, for example a proline (P)-to-asparagine (N) mutation, do abolish sorting. Nevertheless, a rigorous mutagenesis of this sequence element has never been performed, and the basis for the sequence conservation is thus unknown. The simplest explanation for the conservation of the LPXTG motif may be the preservation of a proteolytic cleavage site. Sortase, the presumed enzymatic activity that cleaves at this point and catalyzes the transpeptidation reaction, may require a specific sequence or fold for substrate recognition. It is surprising that other transpeptidation mechanisms, for example that of glycosylphosphatidylinositol (GPI) anchoring, seemingly have much less stringent substrate requirements (see below).
The hydrophobic domain and charged tail together are thought to retain the polypeptide from the secretory pathway and provide an opportunity for the proteolytic cleavage of surface proteins. One way in which this might be achieved is to anchor the polypeptide in the cytoplasmic membrane. If so, fusion of the C-terminal hydrophobic domain to other secreted proteins such as PhoA or Seb should insert the hybrids in the membrane by acting as a stable transmembrane domain. This, however, is not the case, and the mutant proteins are instead found to be peripherally associated with the membrane (715). Furthermore, mutations or truncations of the charged tail, even at positions that cannot play a role in membrane insertion, lead to the secretion of the polypeptide into the surrounding medium. Perhaps the C-terminal hydrophobic domain and charged tail function to arrest translocation of the polypeptide within the secretion channel but without permitting the actual insertion into the membrane. This arrest in translocation could suffice to allow cleavage of the polypeptide and concomitant cell wall anchoring. A simple test of this hypothesis might be to engineer sorting signals with increased hydrophobicity to allow membrane anchoring. Once inserted into the membrane, the surface protein could diffuse away from the site of cell wall sorting without permitting a transpeptidation reaction at the LPXTG motif.
The proteolytic cleavage site during the anchoring of surface proteins has been determined by purifying and sequencing the C-terminal cleavage fragment (574). This has been done by using a hybrid protein in which the mature part of maltose binding protein is fused to the C-terminal end of the sorting signal. The hybrid protein is exported by its N-terminal signal peptide and cleaved, and the N-terminal portion is linked to the bacterial cell wall. In contrast, the C-terminal cleavage fragment with the fused maltose binding protein domain remains within the cytoplasm and can be purified by affinity chromatography (574). Edman degradation reveals that the cleavage site is located between the threonine and the glycine of the LPXTG motif (Fig. 9). Deletion of the LPXTG motif abrogates cleavage as well as anchoring, indicating that the two events are linked. Proteolytic cleavage at the LPXTG motif absolutely requires protein export by the Sec machinery. When cells are poisoned with sodium azide, which at low concentrations is an inhibitor of the SecA motor of preprotein translocase (223, 601), no secretion or cleavage of the fusion protein occurs. Similarly, protein A mutants harboring a defective signal peptide are not cleaved at the LPXTG motif. This result suggests that the site of proteolytic cleavage may be either in the membrane, i.e., during secretion of protein A, or on its extracytoplasmic side, which coincides with the locus for cell wall assembly.
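The position of the scissile bond can be stated compactly in sequence terms. The short Python sketch below merely restates the cleavage site reported above (between the threonine and the glycine of LPXTG) as a string operation; the function name and the example peptide are hypothetical, and the choice to cut at the last motif in the sequence is an assumption made for the example.

```python
import re
from typing import Optional, Tuple


def cleave_at_lpxtg(precursor: str) -> Optional[Tuple[str, str]]:
    """Split a precursor between the T and the G of its last LPXTG motif.

    Returns (N-terminal fragment ending in ...LPXT,
             C-terminal fragment starting with G...),
    or None when no motif is present.
    """
    seq = precursor.upper()
    matches = list(re.finditer(r"LP.TG", seq))
    if not matches:
        return None                      # no motif, no cleavage
    cut = matches[-1].start() + 4        # index of the G that follows the T
    return seq[:cut], seq[cut:]


# Hypothetical toy peptide used only to illustrate the cut position.
fragments = cleave_at_lpxtg("AAALPETGVVVRR")
```

In this picture the first fragment corresponds to the portion that becomes linked to the cell wall and the second to the C-terminal cleavage fragment that was sequenced by Edman degradation. A sequence without the motif returns no fragments, in keeping with the observation that deletion of the LPXTG motif abrogates cleavage.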
To date, the only cell wall anchor structure to be solved in molecular detail is that of staphylococcal protein A (575, 713, 796, 797). The structure was determined through the use of specifically designed hybrid proteins that enabled the purification of large amounts of surface protein after solubilization from the cell wall with muralytic enzymes. The C-terminal anchor structure was removed and isolated from the full-length hybrid surface protein by proteolysis or cyanogen bromide cleavage at sites specifically engineered into the hybrid protein. The resulting C-terminal anchor peptides allowed detailed studies of associated cell wall structures by chemical analysis, Edman degradation, and mass spectrometry.
In an early study, a maltose binding protein harboring the C-terminal sorting signal of protein A was solubilized with lysostaphin, purified, and subjected to limited proteolysis to identify C-terminal anchor peptides (713). The C-terminal threonine of the LPXTG motif was found to be amide linked to a triglycine peptide. Because the pentaglycine crossbridge of the staphylococcal cell wall is also the site of lysostaphin cleavage, it was proposed that surface proteins are amide linked to the free amino group of the wall crossbridges. Because both the LPXTG motif and the free amino group of wall crossbridges are conserved features in gram-positive bacteria, this mechanism might be universal for all organisms expressing surface proteins via a C-terminal sorting signal (574, 713).
In later studies, a hybrid protein containing the C-terminal sorting signal of protein A fused to Seb was used in similar experiments to analyze the C-terminal anchor structure of surface protein solubilized with several other muralytic enzymes (575, 796, 797). In these studies, six histidyl residues placed at the fusion junction between Seb and the sorting signal facilitated the purification of the full-length protein by chromatography on nickel resin. C-terminal anchor peptides were obtained as follows. The purified surface protein was first cleaved with cyanogen bromide at a methionyl immediately adjacent to the six histidyl. The anchor peptides were then purified by another round of affinity chromatography on nickel resin. As mentioned above, surface proteins solubilized from staphylococcal peptidoglycan with muramidases migrate as a ladder of bands on SDS-PAGE (Fig. 10). Muralytic amidases, i.e., enzymes that cut at the amide linkage between the d-lactyl of MurNAc and the l-Ala of the wall peptide, also solubilize surface protein as a ladder of bands (575). Mass-spectrometric analysis of C-terminal anchor peptides from the hybrid surface protein solubilized with these enzymes confirmed that the distinct pattern observed on SDS-PAGE is due to the attachment of a variable number of linked peptidoglycan subunits (575). Amidase-solubilized surface proteins were found to be linked to one or more peptidoglycan subunits of the structure [NH2–l-Ala–d-iGlu–l-Lys(Gly5)–d-Ala–d-Ala], whereas the muramidase-solubilized surface proteins were linked to subunits of the structure MurNAc[-l-Ala–d-iGlu–l-Lys(Gly5)–d-Ala–d-Ala]-GlcNAc. The attached peptidoglycan subunits were found to be linked to one another through their pentaglycine crossbridges. As would be expected, redigestion of amidase- and muramidase-solubilized surface protein with lysostaphin removed the linked peptidoglycan subunits (575). The structure of a surface protein linked to a peptidoglycan dimer is shown in Fig. 11.
On the basis of sequence homology and limited biochemical analysis, the murein hydrolase of staphylococcal bacteriophage φ11 was predicted to be an amidase (835). Hybrid surface proteins solubilized from the staphylococcal cell wall with the φ11 hydrolase migrate as a doublet of bands on SDS-PAGE, in marked contrast to the ladder pattern observed when the protein is solubilized with other well-characterized amidases (796) (Fig. 10). Analysis of φ11 hydrolase-solubilized C-terminal anchor peptides revealed that surface proteins were linked to a single peptidoglycan subunit, of which approximately 50% lacked the GlcNAc-MurNAc disaccharide, indicating that cross-linked peptidoglycan subunits had been removed (796). Recent results reveal that the φ11 hydrolase displays two activities, the expected amidase as well as a d-Ala-Gly peptidase activity (575, 576).
Genetic analysis of staphylococcal methicillin resistance has provided new insights into the synthesis of the peptidoglycan crossbridge (43). Staphylococcal strains expressing PBP2a (PBP2′) are resistant to most β-lactam antibiotics including methicillin (310, 520). Genetic screens designed to identify other elements necessary for methicillin resistance yielded mutations in at least 10 different genes (44, 45, 146). Some of these genes are involved in the synthesis of the pentaglycine crossbridge (145, 331, 513, 756) or the amidation of d-isoglutamyl within the wall peptide (292, 394). Staphylococcal strains harboring mutations in the femA, femB, or femX gene synthesize altered cell wall crossbridges with either three glycyl (femB), one glycyl (femA), or a combination of no or one glycyl. The last phenotype has been reported for a mutant with a combination of a femA mutation and a second one leading to a partially nonfunctional FemX protein (442). Although this has not yet been demonstrated directly, it seems likely that FemA, FemB, and presumably also FemX catalyze the addition of glycyl to the ε-amino group of l-lysyl in the wall peptide portion of lipid II. Ton-That et al. tested whether the sorting reaction could proceed in staphylococcal strains carrying mutations in any of the three genes femA, femB, and femX (797). Although sorting was slowed in all fem mutant strains, surface protein was found anchored to peptidoglycan-bearing crossbridges with only three or one glycyl residues as well as crossbridges containing seryl moieties. However, species anchored directly to the ε-amino of l-lysyl in the wall peptide were not observed, suggesting that the sorting reaction has restricted substrate specificity for the amino group acceptor of this amide bond exchange mechanism (797).
Although it is now clear that protein A and other staphylococcal surface proteins are linked to the pentaglycine cell wall crossbridge, the specific peptidoglycan substrate of the sorting reaction has thus far not been identified. If the sorting reaction requires mature, assembled peptidoglycan as a substrate, staphylococcal protoplasts in which the peptidoglycan had been removed enzymatically should be unable to catalyze the cleavage of sorting precursors. Movitz studied the synthesis of protein A in staphylococcal protoplasts and reported that most of this polypeptide was released into the extracellular medium (561). A similar result was observed for the secretion of M protein by streptococcal protoplasts (809). Staphylococcal protoplasts generated by lysostaphin digestion and osmotically stabilized with sucrose cleaved sorting precursors similarly to intact cells, suggesting that the mature, assembled cell wall is not required for the sorting reaction (797a). Protein A molecules that were cleaved at the LPXTG motif were released into the extracellular medium. Nevertheless, it is not clear whether these released protein A molecules contain linked peptidoglycan fragments.
On the other hand, if the sorting reaction uses the peptidoglycan synthesis precursor, lipid II, as a substrate, inhibition of peptidoglycan synthesis with known antibiotics might also interfere with the sorting reaction. This question has also been addressed by Movitz (562). Although the anchoring mechanism was unknown at that time, the author suspected that protein A was linked to peptidoglycan and measured the amount of pulse-labeled protein A in the cell walls of staphylococci that had been treated with vancomycin. The results showed that vancomycin inhibited cell wall synthesis but did not interfere with the anchoring of protein A. Movitz concluded that protein A may be linked to mature, assembled cell wall. Vancomycin binds to d-alanyl–d-alanine within the wall peptide, and treatment of staphylococci with this antibiotic leads to the accumulation of lipid II molecules (789). If lipid II served as substrate for the sorting reaction, Movitz’s experiments could not have distinguished between protein A molecules that were linked to lipid II and others that were tethered to the assembled cell wall. We think that the identification of the substrate for the sorting reaction will require biochemical studies with purified sortase enzyme as well as the rigorous characterization of presumed sorting intermediates.
The sorting of surface proteins to the cell wall of gram-positive bacteria begins in the cytoplasm with the initiation of precursor molecules into the protein export (Sec) pathway via their N-terminal leader peptides (Fig. 9). By default, this pathway leads to the secretion of polypeptides into the surrounding medium of staphylococci; however, the C-terminal hydrophobic domain and charged tail function to retain protein A along the pathway, perhaps within the preprotein translocase. This retention allows recognition of the LPXTG motif and a proteolytic cleavage between the threonine and the glycine of the LPXTG motif. The carbonyl of threonine may be acylated to an active-site thiol or hydroxyl of sortase, and a subsequent nucleophilic attack of the free amino at the end of the pentaglycine cell wall crossbridge results in the transpeptidation of surface protein to peptidoglycan precursor. The sorting intermediate may subsequently be incorporated into the cell wall. Proteolytic degradation of anchored proteins as well as the physiological turnover of peptidoglycan will finally remove the anchored polypeptides.
Braun’s murein lipoprotein (LPP) serves as a paradigm for the synthesis and structure of cell wall-linked proteins in gram-negative bacteria (85) (Fig. 12). The N-terminal signal peptide of LPP directs the polypeptide into the secretory pathway (367). A cysteine residue within the signal peptide is glyceride modified via a thioester linkage prior to proteolytic cleavage at its amino group (305). The amino group of the N-terminal cysteine is acylated prior to the insertion of LPP into the outer membrane (485). The 70-residue polypeptide of mature LPP trimerizes in the periplasmic space (84, 87). About one-third of all lipoprotein is covalently linked to the peptidoglycan layer, thereby tethering the outer membrane to the peptidoglycan skeleton of E. coli (86). Lipid modification of murein lipoprotein first requires membrane-embedded glyceryl transferase and O-acyl transferase activities (240, 289, 783). The glycerol-modified LPP precursor is then cleaved by signal peptidase (type II), and the liberated amino group of cysteine is modified by an N-acyl transferase (365). The ε-amino of the C-terminal lysine is amide linked to the carboxyl group at the D center of meso-diaminopimelic acid within the E. coli cell wall peptides (88, 89). Neither the chemical reaction nor the enzymes responsible for this linkage have been characterized. It is conceivable that each trimeric murein lipoprotein unit is covalently linked to the cell wall. It also seems plausible that the amide bond between LPP and cell wall is generated at the expense of another amide bond, namely, that between meso-diaminopimelic acid and d-Ala in cell wall tetrapeptides, similar to penicillin-sensitive transpeptidation and the sorting mechanism of surface proteins anchored to the cell wall of gram-positive bacteria.
Some proteins displayed on the apical surface of polarized eukaryotic cells are modified by the GPI anchor (187, 198, 527) (Fig. 13). These polypeptides are synthesized with an N-terminal signal peptide and a C-terminal hydrophobic domain that functions to transiently retain these polypeptides in the endoplasmic reticulum (ER) membrane. Precursors are recognized in the ER as substrates by a GPI-modifying enzyme which proteolytically cleaves the peptide backbone upstream of the hydrophobic domain and amide links the N-terminal cleavage fragment to the free amino group of ethanolamine within the GPI moiety (69). GPI-decorated surface protein species are thought to be sorted at the Golgi and trans-Golgi network to lipid vesicles containing ceramide rafts that are marked for fusion with the apical portion of the cytoplasmic membrane (96, 495, 674). The exact mechanism of vesicle targeting is not known; however, the apical membrane differs in lipid composition from the basolateral membrane, which does not contain sphingomyelin and cerebrosides (203, 874). GPI-anchored surface proteins can be internalized by the budding of surface lipids into structures referred to as caveolae (421). These vesicles are destined for fusion with lysosomes, which results in the degradation of both the GPI-anchored proteins and their bound ligands (522).
Synthesis of the GPI moiety occurs in the ER and begins on the cytoplasmic leaflet of the membrane (523, 524, 526, 813). After linkage of GlcNAc to phosphatidylinositol, a three-mannose core glycan is added and capped with phosphoethanolamine. The GPI moiety can translocate to the luminal side of the ER membrane and then serve as a substrate for transamidation (187, 775) (Fig. 14). The requirements for substrate recognition of precursor proteins have been thoroughly investigated (248, 434, 435, 548, 549, 556, 557). No conserved amino acids are present at or near the cleavage site, a situation more reminiscent of N-terminal leader peptides than of C-terminal cell wall sorting signals. Nevertheless, extensive mutagenesis of this region revealed that the residue at the cleavage site (ω) could be glycine, alanine, cysteine, serine, or asparagine but was none of the other 15 amino acids found in proteins. The ω+1 position, i.e., the residue at the N terminus of the C-terminal cleavage fragment, could tolerate almost all substitutions except proline. However, ω+2 appears to require the presence of either glycine, serine, threonine, or alanine, i.e., small-chain and uncharged amino acids (435). Furthermore, although the hydrophobic domains of GPI-modified proteins suffice to transiently insert precursors into the membrane, these sequences are also not characteristic membrane anchors and often lack the terminal charged residues as well as the degree of hydrophobicity found in peptide membrane anchors. Their function may therefore be similar to that of the C-terminal hydrophobic domain and charged tails of gram-positive sorting signals, i.e., to temporarily retain the polypeptide chains from the secretory pathway and to permit a C-terminal transpeptidation reaction. Single point mutations within the hydrophobic domain suffice to stabilize the peptide membrane anchor and allow insertion into the basolateral portion of the cytoplasmic membrane (828).
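The residue preferences at ω, ω+1, and ω+2 summarized above can be written as a small rule check. The Python sketch below encodes only those three positional rules; it treats every residue other than proline as acceptable at ω+1, which simplifies the "almost all substitutions" reported, and it ignores the hydrophobic domain and any spacing requirement. The function names and the test peptides are hypothetical.

```python
# Positional rules as summarized in the text; everything else is simplified.
OMEGA_ALLOWED = set("GACSN")         # residues reported at the cleavage site
OMEGA_PLUS_2_ALLOWED = set("GSTA")   # small, uncharged residues at omega+2


def is_permissive_omega_site(seq: str, i: int) -> bool:
    """Check the omega, omega+1, and omega+2 rules at 0-based position i."""
    if i < 0 or i + 2 >= len(seq):
        return False
    return (seq[i] in OMEGA_ALLOWED
            and seq[i + 1] != "P"               # proline excluded at omega+1
            and seq[i + 2] in OMEGA_PLUS_2_ALLOWED)


def candidate_omega_sites(seq: str) -> list:
    """Return every position that satisfies the three positional rules."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 2) if is_permissive_omega_site(seq, i)]


# Hypothetical toy peptide: only the serine at index 2 passes all three rules.
sites = candidate_omega_sites("KDSAGLL")
```

Because these rules are permissive, many positions in a typical sequence will satisfy them, which illustrates the contrast drawn above with the strictly conserved LPXTG motif of gram-positive sorting signals.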
Genes encoding the enzymes involved in synthesis of the GPI anchor were identified by screening for yeast mutants defective in the incorporation of [3H]inositol into the GPI-anchored protein GAS1 (604). Mutants with mutations in three genes, gpi-1, gpi-2, and gpi-3, are defective in GPI anchoring and temperature sensitive for growth, suggesting that these genes may be essential. Strains carrying mutations in any of three genes were found to be defective in the first step of GPI anchoring, the transfer of UDP-GlcNAc to phosphatidylinositol (478–480, 838). A second approach screened for yeast mutants defective in the surface presentation of GPI-anchored agglutinin, and the identified mutants could be grouped into six complementation classes, gpi-4 to gpi-9 (33). One of those strains, carrying a mutant gpi-8 allele, synthesizes a complete GPI moiety but does not transfer it onto precursor proteins, whereas all other mutants are defective in GPI assembly (32). GPI8 is a type I transmembrane ER protein with homology to cysteine proteases (32). Another transmembrane ER protein, GAA1, is also required for the attachment of GPI anchors to polypeptides (300, 338). Furthermore, it has been reported that soluble ER components are also necessary for the anchoring process (814). The role of these factors in catalyzing the transpeptidation reaction of GPI anchoring will have to be determined in a biochemical assay.
In vitro GPI anchoring of precursor protein has been demonstrated independently by several different laboratories (196, 525, 802). Mayor et al. established an assay that allows measurements of GPI anchoring without protein translocation into microsomal membranes (525). This is by far the simplest system and should allow characterization of the required enzymatic activities. The in vitro reaction requires precursor protein and GPI but no ATP or GTP, consistent with a transpeptidation (transamidation) mechanism. It is conceivable that the GPI8 enzyme catalyzes this reaction via an active-site sulfhydryl; a definitive result should be forthcoming.
Cole and Hahn examined the sites of cell wall and surface protein synthesis in Streptococcus pyogenes (127). This gram-positive organism forms chains of coccal cells due to the incomplete separation of its peptidoglycan. Thus, although the individual cells are round, the chain appearance permits an easy orientation of cell division sites at the contact points between cells. New cell wall appeared to be synthesized only at the contact points between cells. This experiment was extended to the synthesis of surface proteins by first proteolytically removing all surface (M) proteins and then detecting the newly synthesized polypeptide by immunofluorescent staining with type-specific antiserum. All newly synthesized M protein first appeared at the contact sites between streptococcal cells and then extended slowly over the entire surface of the gram-positive cell. This result suggests that streptococci incorporate their surface proteins together with peptidoglycan at a defined site. Such localization may be required to coordinate protein sorting and cell wall synthesis, but these sites could also represent defined secretion sites in the cell wall peptidoglycan of gram-positive bacteria.
Pancholi and Fischetti observed that streptococci pretreated with phage lysin amidase would release large amounts of M protein (612). Phage lysin treatment is performed at pH 5.5 for optimal enzymatic activity, and during this treatment relatively little surface protein is released. However, once the pH of the buffer containing predigested streptococci was adjusted to pH 7.0 or higher, large amounts of M protein were released into the medium. The authors investigated the enzymatic activity that releases M proteins into the medium, assuming that it would cleave a membrane anchor of M proteins. The releasing activity is inhibited by pHMB, zinc, or other divalent cations. In view of the sorting hypothesis described above, it is conceivable that the measured solubilization is that of cell wall-linked M proteins. A change in the pH of intact streptococci does not lead to the release of M proteins.
The release of surface protein upon binding of specific antibody has been investigated for S. mutans P1 protein (476a–476c). Incubation at pH 6 caused the release of the surface protein antigen with its bound antibody from the streptococci (476a, 476b). The biological relevance of this was tested in an experiment that measured the detachment of S. mutans cells from a biofilm, which occurred only under conditions that also permitted the release of surface protein (476c). Shedding of P1 antigen required live cells, suggesting a pathogenic mechanism conceptually similar to another established paradigm, the shedding of variable surface antigen by trypanosome species (227). These organisms anchor their surface antigen via a GPI moiety to the cytoplasmic membrane, and cleavage with a phospholipase C allows the release of these peptides into the surrounding medium upon incubation with cross-linking antibodies. This phenomenon is thought to be one essential aspect of a complex mechanism by which trypanosomes establish antigenic variation of their surface protein antigen.
The domain of protein A that is protected from added protease by the staphylococcal cell wall (region X) has a remarkable repeat structure and is about 175 residues long (291). The fact that thermolysin digestion generated a uniformly protected C-terminal peptide suggests that the distance between the cell wall-anchoring point and the surface-displayed portion is similar for all protein A molecules. Hence, protein A molecules that are linked to the cell wall do not “migrate” toward the cell surface, as would be implied by a model in which the cell wall is synthesized from the inside out, like the bark of an aging tree. Rather, all protein A molecules are linked to the cell wall and remain at that distance as the cell wall grows longitudinally over the growing cytoplasmic membrane. If this is so, the distance of the wall-spanning segment between the anchoring and the folded portions of surface protein should be critical in determining the functionality of these molecules. This was tested for S. aureus fibronectin-binding protein (FnBPB) expressed in Staphylococcus carnosus, and it was found that decreasing amounts of wall-spanning region caused reduced surface display of a fused lipase or β-lactamase domain (757). Furthermore, if the wall-spanning segment of FnBPB was shorter than 90 amino acids, the activity of the recombinant enzymes was reduced or abolished, presumably because the peptidoglycan exoskeleton interfered with the folding of the fused lipase or β-lactamase domains.
Other surface proteins harbor wall-spanning segments that also display a repeat structure, albeit with differences in sequence. Protease protection experiments on M proteins have yielded results similar to those described for protein A above (613). The wall-spanning domain of clumping factor (ClfA) of S. aureus consists of tandem repeats of an aspartyl-serine dipeptide. This domain functions to present the fibrinogen binding domain on the staphylococcal surface, and successive truncations of the repeat domain cause reductions in surface display (308).
Many gram-positive species colonize the mucosal surfaces of humans or animals without ever causing disease. These species constantly interact with the host immune system, which prevents their penetration into deeper tissues and establishment of disease. Although the cellular immune system and the phagocytosis of bacteria are critical in defending the host mucosal surfaces, it is thought that the humoral immune system also contributes to this defense via secretory IgA molecules (581). Other, more pathogenic microorganisms have the ability to penetrate mucosal surfaces and multiply within the host. On a first encounter with such microbes, the host may be defenseless, and the outcome of this interaction might result in disease. Pozzi and coworkers developed the gram-positive commensal organism Streptococcus gordonii as a delivery system to generate a mucosal immune response against such pathogens (215, 649, 650). Surface display of epitopes or entire proteins requires an N-terminal signal peptide and a C-terminal sorting signal for cell wall anchoring (716). For increased stability and surface display, several vectors that allow fusions with the scaffold of surface proteins of gram-positive bacteria are available (593). Inoculation of these engineered S. gordonii strains into naive animals led to colonization of mucosal surfaces and to the generation of a mucosal immune response (158, 540–542). Similar strategies for the immunization of mucosal surfaces have been established for Staphylococcus xylosus and Lactococcus lactis, two other nonpathogenic commensal species (583, 631, 752).
Surface display of engineered proteins might also be useful for biotechnological purposes. For example, recombinant antibodies are needed for diagnostic and therapeutic approaches. To facilitate their selection, these antibodies are often displayed on the surface of filamentous phages or on the surface of E. coli (518). Certain limitations apply, because these antibodies have to fit within the scaffold of phage proteins that are incorporated into an infectious particle. Several laboratories have used staphylococci as hosts for the expression of recombinant surface proteins. Engineered S. carnosus and S. xylosus were shown to display IgG as well as several other polypeptides on the cell surface (288, 582, 583, 697, 757). Surface-displayed antibodies may be manipulated by genetic or chemical means, and this technology may provide some advantage over phage display.
Surface proteins are displayed on the cell surface in order to interact with a substrate located in the surrounding environment. These proteins can have a wide range of functions including binding of host tissues, binding to specific immune system components, protein processing, nutrient acquisition, and interbacterial aggregation for the conjugal transfer of DNA. This section briefly reviews the various proteins identified thus far that possess C-terminal cell wall sorting signals. It is intended to provide a reference for those who are interested in the functions of various surface proteins and to highlight the incredible functional diversity found in this class of proteins. It is important to note that covalent attachment of proteins to the peptidoglycan has been conclusively demonstrated only for a few of the proteins reviewed below. Future studies should determine conclusively whether cell wall sorting is truly a universal process among gram-positive bacteria. It should be noted, however, that no proteins containing cell wall-sorting signals have been identified in the spore-forming gram-positive genera such as Bacillus and Clostridium.
Anchored surface proteins must span the thick cell wall of gram-positive bacteria in order to display their functional domains to the surrounding environment. Despite their diverse range of functions, several common themes in the design of cell wall-anchored proteins of gram-positive bacteria are apparent (Fig. 15). N-terminal domains that contain the binding or catalytic activities are frequently followed by a set of repeat domains that may or may not possess activity. Often there is also a proline-rich stretch of amino acids immediately preceding the LPXTG motif that is thought to introduce random coils in the structure to assist in traversing the peptidoglycan network (212). It is possible, however, that such a domain has another function, since the proline-rich domains of several proteins are found to be more N-terminal than are regions predicted to span the cell wall.
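The arrangement described above (a proline-rich stretch immediately preceding the LPXTG motif) can be illustrated with a minimal Python sketch. The sequence below is invented for illustration and is not a real protein. The 20-residue window, the function name, and the choice to report the last match are all assumptions rather than an established method.

```python
import re

# Hypothetical C-terminal fragment invented for illustration; not a real protein.
TOY_C_TERMINUS = "AKDEPSPTPEPKPEPTPAPLPETGEENPFIGTTVFGGLSLALGAALLAGRRREL"

def find_sorting_motif(seq, window=20):
    """Locate the last LPXTG match and measure proline content just upstream.

    Returns (motif_start, proline_fraction) or None if no motif is found.
    The window size is an arbitrary illustrative choice.
    """
    matches = list(re.finditer(r"LP.TG", seq))  # X = any residue
    if not matches:
        return None
    start = matches[-1].start()
    upstream = seq[max(0, start - window):start]
    pro_frac = upstream.count("P") / len(upstream) if upstream else 0.0
    return start, pro_frac

result = find_sorting_motif(TOY_C_TERMINUS)
print(result)
```

A pattern match of this kind only flags candidates. As noted in the text, covalent attachment to the peptidoglycan has been demonstrated experimentally for only a few proteins.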
Many cell wall-sorted surface proteins contain a number of tandem repeat domains that can vary in size from a few to several hundred amino acids. Striking examples are the alpha and Rib proteins of group B streptococci, which possess 9 or 13 repeats identical in both amino acid and nucleotide sequence (837). Strain-to-strain variation in the exact number of tandem repeat domains has been observed for many surface proteins, indicating that these domains are targets for recombination and/or duplication. Several mechanisms are likely to be involved in the generation of these repeat domains (139, 419, 420).
Much is known about the Ig binding surface proteins of gram-positive bacteria. Since the discovery of staphylococcal protein A, several other Ig binding proteins have been characterized and many of their genes have been cloned and sequenced. In 1977, Myhre and Kronvall proposed three different classes of Fc receptors based on the ability of certain bacteria to bind several different Ig subtypes from various mammalian species (569). Types I and II were the designations given to staphylococcal protein A and the streptococcal Fc receptors (M proteins), respectively. Type III was the designation given to the Fc binding protein of group C and G streptococci (protein G). These designations were established before the genes encoding any of the Ig binding proteins had been determined; however, this nomenclature is still widely used today. For proteins A, G, and L, the three-dimensional structures of the antibody binding domains have been determined (143, 286, 854). With the exception of protein L, all surface protein Ig receptors bind to the heavy-chain portion of the antibody at the interface between the Fc CH2 and CH3 domains (231). Remarkably, despite their similar binding specificity, proteins A and G and the streptococcal Fc receptors (M proteins) have no structural similarities and appear to be the result of convergent evolution (231). Although several gram-positive bacterial species possess IgG binding activity, only two, the group A and B streptococci, have demonstrated the ability to bind IgA. Similar to the situation for the IgG receptors, the IgA receptors of GBS and GAS are unrelated to each other and are also most probably the result of convergent evolution.
Adherence to host tissues is the first step in the establishment of an infection. Many pathogenic gram-positive bacteria do not express structures such as pili and instead utilize cell wall-anchored surface proteins to adhere either to the extracellular matrix (ECM) or to other components present in host tissues. Höök and colleagues have given the name MSCRAMMs (microbial surface components recognizing adhesive matrix molecules) to surface proteins that bind to components of the ECM (621, 624). MSCRAMMs have been found in almost all pathogenic gram-positive species, and their modular design and common binding domains suggest that they have arisen from a series of recombinational events and horizontal gene transfer. Many MSCRAMMs are capable of binding to more than one ECM component, and a single strain often possesses several different proteins that bind the same host component. For example, S. pyogenes has at least three proteins capable of binding fibronectin while Streptococcus dysgalactiae and some S. aureus strains have two (see below). MSCRAMMs and other adhesive surface proteins are the subject of a number of reviews (621, 624, 844).
Other surface proteins bind a wide variety of serum proteins including albumin, complement regulatory factors, soluble forms of fibronectin and fibrinogen, and the proinflammatory molecules plasmin(ogen) and kininogen. In addition, several bacterial surface proteins bind molecules present on the surface of host cells, a function that may help the bacteria either adhere or invade. Surface proteins that bind serum molecules or molecules found on the cell surface are not technically defined as MSCRAMMs even though these molecules may play a role in adherence to host tissues (624).
Several examples have emerged where a metabolic enzyme is surface attached, particularly in cases where the function of the enzyme is to break down a large, nontransportable nutrient polymer into smaller subunits that can subsequently be taken up into the cell by a permease system. This is the case for both the dextranases and fructosidases of the mutans streptococci and the casein peptidase of lactococci. Not all surface-localized enzymes perform a nutrient acquisition function for the cell, however. For example, the nisin leader peptidase is involved in the processing of the nisin precursor to an active form by the removal of its leader peptide. Another surface protease, the streptococcal C5a peptidase, is involved in the degradation of this chemoattractant, presumably to prevent the recruitment of polymorphonuclear neutrophils (PMNs) and macrophages to the site of infection (see below).
Both the conjugal transfer of DNA and the formation of dental plaque require that bacterial cells be able to adhere to one another through the specific interactions of surface proteins (160, 168, 441). In a majority of cases, the gram-positive bacteria employ cell wall-anchored surface proteins for this purpose. A notable exception is the fimbriae of the actinomycetes, whose subunits surprisingly contain C-terminal cell wall sorting signals (see below).
The work of Rebecca Lancefield, which began over 70 years ago, demonstrated that streptococcal surface proteins are mediators of virulence as well as targets for acquired immunity (467). Early studies determined that the surface proteins of GAS displayed a high degree of antigenic diversity. These findings led to the currently used typing system for GAS that is based on antisera raised against a set of variable surface antigens designated M, T, and R (212, 468). Of these, the M antigens are the most serologically diverse, with over 80 different serotypes identified thus far (see below). Considerable effort has been put into determining the relationship between the antigenic profile of a strain of GAS and its ability to cause disease (213, 420).
The GAS are also subclassified by whether they express the serum opacity factor (OF), an apoproteinase whose expression causes serum to become opaque (see below). Approximately half of streptococcal isolates express OF, and this expression pattern generally correlates with certain M serotypes. For example, M types 2, 4, 8, 9, 13, 22, 49, 59, 62, and 76 are often found to be OF+ whereas M types 1, 3, 5, 6, 12, 14, 18, 19, 24, 55, 57, and 80 are usually OF−. The M41 strain D421 is OF+, whereas another M41 strain, D463, is OF−, demonstrating that exceptions to this general rule exist (658). A table compiling the various M, T, and OF types has been published (389), and streptococcal surface proteins and typing systems have been the subject of a number of excellent past reviews (212, 419, 467). Therefore, this section is focused primarily on findings that have been made in the past decade.
It has recently become possible to analyze the genetic elements responsible for the antigenic diversity displayed by GAS at the nucleotide level. Until a decade ago, it was assumed that the antigenic properties of the various M types were due to the expression of a single variable M antigen. However, it is now evident that many strains of GAS, including almost all OF+ isolates, express more than one member of the M-protein family (295, 607, 632, 753) and that there can be considerable genetic diversity within a single M serotype (152, 627). In addition, there are many cases where M proteins from strains considered to be of a different M serotype have identical N termini (193, 848), suggesting that some type-specific antisera may be directed against more than one of the expressed M proteins (see below). Such findings have led many to reassess the previous assumptions that stemmed from early serological studies.
The genes encoding the OF, the C5a peptidase, and M proteins are all regulated at least in part by a multiple gene regulator called Mga (formerly virR or mry [111, 718, 729]) (117, 470, 634, 639). The mga gene encodes a 62-kDa DNA binding protein that activates transcription by binding to a consensus 45-nucleotide binding site found in the promoters of the regulon members (533, 536). Mga also positively regulates the expression of the genes encoding a secreted complement inhibitor (sic) (9, 427), streptococcin A (scnA), a cysteine protease (speB), and an oligopeptide permease system (opp) while down-regulating the expression of streptolysin S (639). There is some evidence, however, that other elements are involved in the regulation of these genes, including transcriptional terminators that serve to down-regulate the expression of the scpA gene and a neighboring open reading frame (71, 637, 639, 652). Mga bears some homology to the effector proteins of bacterial two-component regulatory systems, although a cognate sensor-kinase for this protein has not been identified. The expression of Mga and surface proteins in GAS is regulated by a variety of environmental conditions (534) including elevated levels of carbon dioxide (110, 595, 637) and/or the growth phase (535).
Mga regulons are ubiquitous throughout GAS (633), but their exact organization varies from strain to strain, particularly in the number and types of M-protein genes present (295, 296, 344, 632). In all cases, the mga gene is located immediately upstream of the M-protein genes (mrp, emm, or enn) and the gene encoding the C5a peptidase (scpA).
The M family of streptococcal surface proteins is composed of the related Emm (class I and II), Mrp (FcrA), and Enn proteins. The Mga regulons of OF+ strains of GAS always contain all three M-family proteins in the order mrp, emm, and enn followed by the scpA gene. In contrast, the organization of the Mga regulons in OF− strains is much more variable and may include either one, two, or all three of the M-family genes, although the emm gene is always found (344). In at least one case, an additional open reading frame encoding a surface protein of unknown function (orfX) controlled by Mga has been found downstream of scpA (639).
Members of the M family of proteins are elongated dimeric alpha-helical coiled-coil molecules that can bind a variety of host components including Igs (5, 7, 75, 229, 320), fibrinogen (358, 409, 830, 850), kininogens (35), plasminogen (42), and albumin (229, 669) as well as factors that inhibit the deposition of complement on the bacterial surface (354, 780). A source of some confusion is that different laboratories have assigned different names to similar M-family proteins (e.g., the various class II Emm proteins have been designated Arp, Sir, or Emm). To assist the reader, the different names and ligands of several M-protein family members that have been sequenced are summarized in Table 2.
The designation “M protein” has traditionally been reserved for members of the M family that are known to possess antiphagocytic properties, whereas all similar proteins with unknown or no antiphagocytic properties are generally designated “M-like.” Recent studies, however, have blurred this distinction, and it has become necessary to reassess the classical definition of an M protein. Separate laboratories have found that both the Mrp and Emm proteins can have antiphagocytic properties in a given strain (638, 781). It is also unclear whether M proteins play an equal role in protection from phagocytosis in every strain (137, 496) or whether a given M protein will have antiphagocytic properties when expressed in all strains (361, 781). Also, recent findings suggest that heparin, added as a blood anticoagulant in most phagocytosis assays, may interfere with the function of some M proteins by competing against the binding of factor H (62, 445, 781) (see below). If this is true, the antiphagocytic function of proteins previously considered to be M-like may have to be reassessed. For these reasons, we choose to adopt the functionally neutral terms Mrp, Emm, and Enn used by other laboratories to identify individual proteins based on their specific structural features while using the term “M protein” to designate all members of the family irrespective of their antiphagocytic function (345, 635, 781, 848). The mechanisms by which M proteins are thought to prevent phagocytosis are discussed below.
Sequencing of the various M proteins has revealed several common features in their molecular design, which are summarized in Fig. 16. The N-terminal hypervariable domains of mature M proteins account for the observed antigenic diversity of these molecules. In contrast, the central and C-terminal domains are almost completely conserved within each specific M subfamily (420). Throughout almost the entire M-protein sequence are heptad repeat motifs whose first and fourth amino acids are apolar. This arrangement is typical of coiled-coil molecules and serves to display the hydrophobic amino acids on the face of an alpha helix serving as a surface for dimerization (212). The heptad repeats are found even in the highly variable N-terminal domains, although not always in an optimal distribution, indicating that there is considerable selective pressure to retain the coiled-coil conformations (212, 217, 515, 586).
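The heptad pattern described above (apolar residues at the first and fourth positions of each seven-residue repeat) can be made concrete with a small Python sketch. It scores each of the seven possible registers by the fraction of first and fourth positions occupied by apolar residues. The apolar residue set, the function name, and the toy sequence are assumptions chosen for illustration. This is not a validated coiled-coil predictor.

```python
# One common choice of apolar residues; published hydrophobicity sets differ.
APOLAR = set("AVLIMFWC")

def heptad_register_scores(seq):
    """For each of the 7 possible registers, return the fraction of
    first ('a') and fourth ('d') heptad positions that are apolar."""
    scores = []
    for offset in range(7):
        hits = total = 0
        for i, aa in enumerate(seq):
            pos = (i - offset) % 7  # 0 = first position, 3 = fourth position
            if pos in (0, 3):
                total += 1
                hits += aa in APOLAR
        scores.append(hits / total if total else 0.0)
    return scores

# Invented idealized repeat: leucine at the first and fourth positions.
toy = "LKELEDK" * 4
scores = heptad_register_scores(toy)
best = max(range(7), key=lambda k: scores[k])
print(best, scores[best])
```

In a real hypervariable N-terminal domain, where the text notes that the repeats are not always optimally distributed, the best register would be expected to score below the idealized value obtained for this toy sequence.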
All members of the M family possess an N-terminal leader peptide and a typical C-terminal cell wall sorting signal. The cell wall localization of M and M-like proteins was determined as early as 1952 (27, 228, 696), and it was demonstrated that purified cell walls could be used as a source for M protein (410, 446). Covalent attachment to the cell wall has been demonstrated by the fact that M protein solubilized from the cell wall with phage-associated lysin migrates as a ladder of bands on SDS-PAGE. More recently, it has been shown that the M6 protein is cell wall anchored when expressed in the gram-positive lactic acid bacteria (631).
The class I Emm proteins are found almost exclusively in OF− GAS. They were originally distinguished from the class II Emm proteins on the basis of their reactivity with a set of monoclonal antibodies raised against their C-terminal repeats (49). The binding properties of the class I Emm proteins are remarkably diverse (Table 2), but most have been shown to bind fibrinogen and/or albumin. These binding functions appear to have been divided between the Emm (class II) and Mrp proteins in the OF+ strains of GAS (see below).
The class II Emm proteins are typically found in GAS of the OF+ lineage and are the molecules usually recognized by M-typing antisera. These proteins were initially distinguished from their class I counterparts due to their inability to react with certain monoclonal antibodies directed against the C repeats of class I Emm molecules. None of the class II Emm proteins have been found to bind fibrinogen, a function that seems to be compensated for by the Mrp proteins in the OF+ strains. In addition, all class II Emm proteins studied thus far bind Ig, although their exact binding preferences differ (see Table 2).
Mrp proteins are usually found in GAS strains of the OF+ lineage and are always encoded immediately downstream of the mga gene (Fig. 16). All Mrp proteins characterized thus far bind both human IgG (subclasses 1, 2, and 4) and fibrinogen (753). They can be distinguished from the Emm and Enn proteins by the absence of the conserved C-repeat domains. Instead, these proteins possess two or more copies of a different type of central repeat designated “A repeats” (295, 607), in which the IgG binding properties are thought to reside (8, 312). Upstream of the conserved repeats in both the Mrp and class II Emm proteins is a domain of unknown function that is rich in glutamate and glutamine (EQ domains). Unlike in the Emm proteins, the N-terminal domains of the Mrp proteins are not highly variable. Sequencing of the N-terminal regions of Mrp proteins from 37 different M serotypes revealed the existence of only six different types of N-terminal Mrp domains (447). In at least four cases (Mrp2, Mrp22, Mrp49, and Mrp64/14), an Mrp protein has shown evidence of being antiphagocytic (638, 781), and this property may be due to their ability to bind fibrinogen (see below).
The genes encoding the Enn proteins are found almost exclusively in OF+ strains as the third open reading frame following mga in the Mga regulon (Fig. 16). Instead of the EQ-rich domains found in the class II Emm and Mrp proteins, the “C-repeat” domains of Enn proteins are usually preceded by a conserved set of amino acids (EDLKTTLAKTTKEN). The Enn proteins, like the Mrp proteins, do not display the same high degree of variability displayed by the Emm proteins (847). Recombinant Enn proteins expressed in E. coli are generally found to have IgA binding activity (51).
Thus far, no members of the Enn family have been demonstrated to have antiphagocytic properties. Under laboratory conditions, it has been demonstrated that the expression of enn genes is consistently more than 30-fold lower than the expression levels found for both mrp and emm (51, 381, 496, 849, 877). The low expression level has led to the hypothesis that the enn genes act primarily as a “sequence reservoir” for recombination with the emm and mrp genes. Indeed, some unusual “mosaic” Enn proteins including protein H, Enn 5.8193, and Enn 64/14 all appear to have been generated by recombination between an emm gene and an enn gene (see below). Low expression of Enn has not been demonstrated in situ, however, and it is possible that the Enn proteins do play a role during an infection.
It was demonstrated long ago that GAS strains are resistant to phagocytosis in human blood in the absence of specific antibodies directed against the M proteins (467). In 1979, it was demonstrated that M-protein-deficient strains of GAS could be efficiently opsonized and cleared by the alternative complement pathway (55, 630). The alternative complement pathway ultimately results in the deposition of C3b on the surface of the bacteria, which can subsequently act as a receptor for phagocytic immune system cells. Jacks-Weis and colleagues found that strains of GAS expressing M protein accumulated approximately fourfold less C3b on their surface than did mutant strains (374). Several laboratories have since attempted to elucidate the molecular basis of the antiphagocytic properties of the M proteins and their possible effect on the alternative complement pathway. These studies have variously attributed the antiphagocytic property of the M proteins to their ability to bind fibrinogen (355), factor H (62, 214, 354), FHL-1, C4BP (392, 393, 780), or C1q (443).
Factor H, FHL-1, and C4BP belong to the RCA (for “regulators of complement activation”) family of molecules that are capable of regulating complement activity. The proteins in this family are composed of multiple, highly homologous domains known as short consensus repeats (SCRs) (356). SCR domains are each composed of approximately 60 amino acids, of which 4 cysteines and a few other residues are conserved and provide the structural framework for the domain while the remaining amino acids are thought to contribute to the binding specificity. SCRs, like many other common protein domains, are modular and are believed to act independently of other domains in a protein (28). Factor H is a soluble molecule composed of 20 SCRs, which compose almost the entire protein (356). FHL-1 (for “factor H-like”) is a truncated variant of factor H generated by alternate mRNA splicing and is composed of the first seven SCR domains of factor H (356, 553). C4BP is a soluble regulator of complement that is composed of seven alpha chains and one beta chain, which contain eight and three SCRs each, respectively (356).
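The SCR counts given above imply some simple totals, which the following Python sketch works out. All numbers come from the paragraph. The variable and function names are illustrative, and the residue estimate relies on the approximate figure of 60 amino acids per SCR, so it is only a rough guide.

```python
# Values taken from the text above; names are illustrative only.
APPROX_SCR_LENGTH = 60  # residues per SCR (approximate)

SCRS_PER_CHAIN = {"factor H": 20, "FHL-1": 7, "C4BP alpha": 8, "C4BP beta": 3}
C4BP_CHAINS = {"C4BP alpha": 7, "C4BP beta": 1}  # chains per C4BP molecule

def c4bp_total_scrs():
    """Total SCRs in one C4BP molecule: 7 alpha chains x 8 + 1 beta chain x 3."""
    return sum(n * SCRS_PER_CHAIN[chain] for chain, n in C4BP_CHAINS.items())

def approx_scr_residues(n_scrs):
    """Rough number of residues contained in n_scrs SCR domains."""
    return n_scrs * APPROX_SCR_LENGTH

print(c4bp_total_scrs())                                # 59
print(approx_scr_residues(SCRS_PER_CHAIN["factor H"]))  # 1200
print(approx_scr_residues(SCRS_PER_CHAIN["FHL-1"]))     # 420
```

On these figures a single C4BP molecule carries 59 SCRs, and the SCRs of factor H account for roughly 1,200 residues, consistent with the statement that they compose almost the entire protein.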
The finding that an M protein could bind a down-regulator of the alternative complement pathway was first demonstrated for factor H (354). The binding of factor H to the class I M6 protein was mapped to the C-repeat region (214); however, this finding has not been without some controversy. For example, the C-repeat domain was shown to bind albumin by another laboratory (8, 229, 230), although the possibility exists that the C repeats can bind multiple ligands. In addition, fibrinogen was found to compete for factor H binding to the M6 protein even though fibrinogen binding is believed to occur in the B repeats of M6, upstream of the C repeats (355). It was suggested in that study, however, that fibrinogen could form a complex with factor H and that this fibrinogen-factor H complex would be responsible for providing protection from phagocytosis (355). This hypothesis may explain the previously observed antiphagocytic properties of fibrinogen. One study of a GAS strain (JRS251) expressing an M6 protein lacking the C repeats indicated that this strain was no longer able to bind factor H even though it was still resistant to phagocytosis (628). A more recent study, however, has demonstrated that these mutant cells can bind factor H, albeit at lower levels (724). Given these data, it appears unlikely that the binding site of factor H lies within the C-repeat domain.
Recent findings have suggested that the physiologically relevant binding partner of the class I Emm proteins is not factor H but, rather, FHL-1 (392, 445). Surprisingly, the binding of FHL-1 occurs through the hypervariable N-terminal domain of the M protein. Three separate laboratories have identified a domain found in both factor H and FHL-1 (SCR7) as being responsible for interacting with the M proteins (62, 445, 724). Given these data, it is difficult to envision how the N-terminal domains would be capable of selectively binding FHL-1 over factor H. It has been noted, however, that the properties of FHL-1 and factor H differ both structurally and functionally (326). In addition, it has been demonstrated that the binding of radiolabeled FHL-1 to the M5 protein can be inhibited only weakly by factor H and not at all by fibrinogen and that FHL-1 bound to bacteria retains its complement inhibitory function (392). Also, the binding of FHL-1 appears to occur under physiologically relevant conditions whereas the binding of factor H is inhibited by fibrinogen at the concentrations present in plasma (355, 392). These data suggest that there are two factor H binding sites on the M5 protein, one of which binds factor H (see above) while the other has a greater affinity for FHL-1 and lies outside of the fibrinogen binding and C-repeat regions. Such a model is in agreement with other observations made by Sharma and Pangburn (724) and would explain why M proteins lacking C repeats retain antiphagocytic activity (628). It should be noted that the characterization of the N-terminal FHL-1 binding domain was carried out by fusing domains onto an M protein without FHL-1 binding activity. This type of “gain-of-function” binding study avoids certain experimental concerns raised by the negative data obtained from binding studies that employ deletion mutants, such as the possibility of deleterious effects caused by conformational changes.
Several Emm proteins, particularly those in the class II subfamily, do not have the ability to bind either factor H or FHL-1. Instead, it has been demonstrated that many of these proteins bind C4BP. Similar to the binding studies of FHL-1, it was determined that the binding of C4BP occurs through the hypervariable N-terminal domain (393). The FHL-1 and C4BP binding data suggest a model whereby the N-terminal hypervariable domains of M proteins have evolved the ability to bind complement regulatory molecules (392). Interestingly, no consensus amino acid motif has been identified in the hypervariable regions demonstrated to bind C4BP (393). These data suggest that the N-terminal domains of the M proteins arise from two opposing selective pressures, the ability to maintain their antiphagocytic function while simultaneously altering their antigenic structure (393). This model may explain why the N-terminal variable domain is so critical to the function of M proteins and has not been lost due to selective pressure. It would also explain why mutant M6 proteins lacking C-repeat domains retain their antiphagocytic function.
Very recently, C1q, another complement component, was suggested to bind to the M proteins FcrA76 (Mrp76) and M5 (443). The binding studies were carried out on crude acid extracts, and the identification of the two M proteins was based solely on their apparent masses on SDS-PAGE. More studies are therefore necessary to demonstrate whether this binding actually occurs through the M proteins and is biologically relevant in vivo.
At this point it should be mentioned that there is still some debate about whether M proteins are either necessary or sufficient for protection from phagocytosis in all GAS strains. For example, a mutant strain that lacks a hyaluronic acid capsule but expresses wild-type levels of M protein is sensitive to phagocytosis whereas the parent strain (strain 87-282) is resistant (841). In a later study, it was shown that strain Vaughn (M type 24) required fibrinogen for optimal resistance to opsonization and phagocytosis whereas, in surprising contrast, strain 87-282 (M type 18) was opsonized by C3b in either the presence or absence of fibrinogen but in both cases remained resistant to phagocytosis (137). These results suggest that in some strains the deposition of C3b does not necessarily lead to phagocytic killing. It has recently been reported that a mutant of 87-282, in which the M protein was insertionally inactivated, remained resistant to phagocytic killing in blood (496). The 87-282 mutant strain used in these studies still had an intact enn gene, although the Enn expression levels appeared to be quite low. The Emm18.1 protein from this strain was demonstrated to confer protection from phagocytosis to a strain of a different M type, indicating that the M protein from strain 87-282 is indeed functional (134). Nevertheless, these data suggest that the hyaluronic acid capsule can, by itself, protect the bacteria from phagocytosis in strains like 87-282 whereas other strains also require the expression of an M protein.
The ability of many different M proteins to bind Ig has been clearly demonstrated. The IgA binding site of Arp4 (Emm4) has been mapped by separate laboratories to a short stretch of residues near the N terminus (50, 391). A consensus binding site was identified and found to be present in most known IgA binding M proteins including Arp60, Enn4, Enn2.2, Enn50, and Sir22. Binding to IgG has also been demonstrated, but the location of the binding sites is much less clear. It is apparent that some M proteins bind human IgG1, IgG2, and IgG4 whereas others bind only IgG3 and a few M proteins have the ability to bind all four subclasses (summarized in reference 75). This leads to the tempting speculation that two or more different binding sites exist on M proteins, one of which binds IgG1, IgG2, and IgG4 while the other binds IgG3. Indeed, Björck and colleagues have found two separate IgG binding sites on protein H and M1. Domains homologous to the protein H IgG binding site were also found in the A repeats of the Mrp proteins, suggesting that such domains may bind IgG1, IgG2, and IgG4 (229). The IgG binding site of M1 is homologous to a region found in Arp4, a protein that can bind only a limited subset of IgG molecules, including IgG3 but not IgG1, IgG2, or IgG4 (8). Despite this correlation, the protein H and M1 IgG binding domains are found only in a subset of the Emm proteins known to have IgG binding activity. Obviously, more experiments must be performed to determine the locations of the various IgG binding domains.
As mentioned above, surface proteins can be released from some gram-positive cells by the action of specific proteases. In GAS, biologically active fragments of protein H, M1, and C5a peptidase can all be released from the cell surface by the action of SCP, a streptococcal cysteine protease (39). It has recently been shown that a soluble protein H-antibody complex can mediate the activation of the classical (antibody-mediated) pathway of complement (40). When immobilized on a surface, however, the protein H-antibody complex was found to instead inhibit the activation of complement. These findings suggest that the role of the Ig binding domains may involve inhibition of the classical complement-mediated pathway at the surface of the bacteria while the soluble forms of these proteins may serve to deplete complement at sites further away from the bacteria (40).
M proteins have been suggested to play a role in the generation of an inflammatory response by the binding of fibrinogen (831), kininogen (34), or plasminogen (76). By doing so, the bacteria may gain the ability to invade deeper tissues by increasing vascular permeability, dissolving clots, and disrupting the tight interactions within the ECM. Björck and colleagues tested the kininogen binding abilities of 49 strains of GAS and found that 41 were able to bind radiolabeled kininogen to their surface (35). This binding ability correlated well with the presence of M protein, since none of six M-protein-deficient strains were capable of binding kininogens. Direct analysis of kininogen binding to three M proteins revealed that the interaction occurs within the N-terminal portion of the M protein at a site that does not overlap the albumin, fibrinogen, or IgG binding sites (35).
Plasminogen is the biologically inactive precursor to plasmin, a fibrinolytic molecule speculated to aid in the invasive properties of cells (76). The role of plasmin(ogen) binding on the surface of GAS is not entirely clear but is apparently of some importance since more than 10 different species of invasive bacterial pathogens have been shown to bind these molecules (76). Most strains of GAS possess a surface-exposed glyceraldehyde-3-phosphate dehydrogenase-like molecule (SDH) that binds plasmin and not plasminogen (614). In addition, five GAS strains, of M serotypes 33, 41, 52, 53, and 56, that are usually associated with skin infections possess M proteins with plasminogen binding activity (42, 112). The common plasminogen binding motif in these M proteins was identified and localized to a set of tandem repeats found within the N-terminal domain (112). Surprisingly, Pancholi and Fischetti have recently reported the presence of a surface-localized alpha-enolase (SEN) that binds both plasmin and plasminogen (610). The finding that SEN is expressed ubiquitously throughout all strains of GAS conflicts with the finding that surface binding of plasminogen is a rare occurrence (112, 610). The reason for this discrepancy is unclear. Both SEN and SDH belong to a recently identified group of unusual enzymes that have the ability to bind plasma components. Their role in virulence and even their surface localization are under considerable debate, since they do not possess N-terminal leader peptides. These proteins are discussed further below.
Like plasminogen, kininogen is a biologically inactive precursor molecule. Proteolytic activation of kininogen generates the peptide hormone bradykinin, a vasoactive molecule that may assist in the generation of a localized inflammatory response near the site of infection. Binding of M protein to either plasminogen or kininogen does not interfere with the ability of that molecule to be proteolytically converted into its biologically active form by its cognate protease (34, 112).
It has long been speculated that M proteins may also act as adhesins to host tissue, but the mechanisms by which this may occur remain unclear (180), although adhesion has recently been implicated in the ability of GAS to generate a proinflammatory response in keratinocytes (829). Expression of the Emm5, Emm18, and Emm24 proteins in a deletion mutant of strain Vaughn enables these cells to adhere to HEp-2 cells. The M6 protein has been shown to mediate adherence to both HEp-2 cells (832) and keratinocytes (597). The binding of M6-expressing cells to HEp-2 cells appears to require fucose-containing glycoproteins (833). On the other hand, the ligand on the surface of keratinocytes responsible for binding M proteins was identified as the membrane cofactor protein (MCP or CD46). Interestingly, MCP is a membrane-bound complement regulator that contains SCRs and belongs to the RCA family of proteins like C4BP, factor H, and FHL-1. The binding of M proteins to the MCP molecule is speculated to occur through the C-repeat domains and can be inhibited by factor H (596). In contrast, binding to HEp-2 cells appears to require the N-terminal domain of the M6 protein (832). It is clear that other receptors are responsible for mediating streptococcal adherence to many other cell types including fibroblasts (597). Other proteins implicated in the adhesion of GAS to host tissues are the fibronectin binding proteins FBP54 (132) and SfbI/protein F (see below).
A fundamental question in the biology of M proteins is how they evolve to vary their antigenic epitopes without disturbing the critical contacts necessary to maintain their structural and ligand binding characteristics. This question is particularly valid in light of the findings that the N-terminal hypervariable domains bind complement regulatory proteins. Both sequencing of several M protein genes and PCR analysis have allowed several laboratories to study the mechanisms by which sequence variation can occur. M proteins generate sequence diversity by point mutations, small insertions and deletions, and intragenic recombination between tandem repeat domains (212, 306, 347).
Analysis of several Mga regulons by PCR suggested that the evolution of the Mga regulons involved gene duplication followed by sequence divergence (344). Other studies have shown that the GAS also possess the ability to horizontally transfer gene segments, not only with group A organisms but possibly also with other streptococcal groups (420, 730, 849). Unusual mosaic genes that possess sequence elements from both enn and emm genes have been observed in three GAS strains, indicating that M proteins probably have the ability to recombine with one another (635, 849). Recently an insertion sequence has been identified within the Mga regulon of strain AP1, suggesting at least one possible mechanism by which the transfer of gene segments could occur (41). Another laboratory found another insertion sequence, IS1548, immediately downstream of the scpA gene in several GAS and GBS strains (277). It is possible that the considerable genetic diversity observed in the streptococcal Mga regulons is assisted by the presence of nearby insertion sequences.
The discovery of horizontal transfer of M-protein genes has challenged the previous assumption that the relatedness of GAS strains could be determined through the serologic and genetic properties of their M proteins. A comparison of 79 different M serotypes by multilocus enzyme electrophoresis revealed that the variation of the emm genes and overall genetic relationship between the various GAS strains are not congruent (848). This finding was supported by another study of several clinical M type 1 isolates that found that identity of scpA and emm genes between strains did not correlate with similarity in overall genetic structure (568). Therefore, the fact that two GAS strains possess identical or similar M serotypes does not by necessity indicate that the two strains are closely related. These findings have significant implications for the epidemiological study of GAS.
In summary, M proteins are multifunctional binding proteins that have evolved the ability to bind several of the common building block domains of eukaryotic plasma and matrix proteins including albumins, fibrinogens, fibronectins, Ig, plasmins, and proteins containing SCRs. These binding functions may serve the multiple purposes of protecting the bacteria from the various phagocytic mechanisms that exist within the host, assisting in the adhesion to certain cell types, generating an inflammatory response, and triggering the invasion of deeper tissues. The conserved C repeats may mediate adhesion to keratinocytes by binding MCP and/or factor H, but the binding properties of this domain are still the subject of considerable debate. Indeed, despite recent advances, the role of M proteins in protection from phagocytosis remains something of an enigma.
Many different pathogenic bacterial species have the ability to bind fibronectin, and the ability of GAS to bind fibronectin was established long before genes encoding the protein receptors responsible for the binding had been cloned. Fibronectin itself is a very large, multifunctional glycoprotein that is found as a disulfide-linked, soluble dimer in most body fluids or in a matrix form that is insoluble and is a major component of both basement membranes and ECM. Fibronectin has multiple binding domains that are capable of binding to a wide variety of compounds such as collagens, fibrin, heparin, and actin. It stands to reason, therefore, that GAS would use this molecule as a target for binding because it is ubiquitous throughout the human body and is especially prevalent at sites, such as the dermis, where an infection would be initiated.
Many factors have been implicated in the binding of fibronectin to the surface of GAS. Although some early studies had suggested that lipoteichoic acid is responsible for the binding of GAS to fibronectin, later studies refuted those claims. There are reports in the literature of at least five streptococcal fibronectin binding factors including Emm3 (666) and protein H (230), SDH (614), OF/SfbII (448, 658), FBP54 (133), and protein F/SfbI (303, 778). Given the rather promiscuous binding abilities of fibronectin itself, the results of any in vitro binding study must be viewed with caution until the biological relevance of such an interaction can be clearly demonstrated.
The best-characterized group of streptococcal fibronectin binding proteins are the SfbI/protein F family members. SfbI and protein F are essentially identical fibronectin binding proteins that were cloned from different strains of GAS. Southern blot analysis of several streptococcal isolates has revealed that SfbI-like proteins are present in approximately 70% of GAS strains. The gene encoding protein F, prtF, was cloned by screening a streptococcal expression library in E. coli for the expression of a protein which could inhibit the binding of radiolabeled fibronectin to the surface of GAS (303). A role for this protein in binding fibronectin has been demonstrated, since mutants generated by insertional inactivation are unable to adhere to either fibronectin or respiratory epithelial cells (303) and the exogenous addition of purified protein F/SfbI is able to inhibit the binding of fibronectin to the surface of GAS. More convincingly, expression of protein F in both nonadherent streptococci and enterococci enables these bacteria to bind fibronectin and respiratory epithelial cells (304).
The SfbI/F proteins contain a rather large N-terminal leader peptide followed by a slightly variable aromatic region of approximately 200 amino acids and a set of conserved proline-rich repeat domains (RD1). The fibronectin binding region begins immediately downstream from RD1 and consists of a stretch of 43 amino acids (UR or UFBD) followed by up to five repeat domains (RD2) which are homologous in primary sequence to other fibronectin binding proteins including that of S. aureus (720, 777, 778). Both the number of repeats and the sequence variation within the N-terminal variable region account for the heterogeneity between strains. Binding studies have demonstrated that the minimal functional unit of the RD2 domains is generated at the junction between two adjacent repeats and that the RD2 and UR domains of protein F bind different domains of the fibronectin molecule (609). A more recent study has indicated that protein F can also bind fibrinogen through the N-terminal variable domain that precedes the RD1 repeat domains (414).
One of the terminal steps of both the alternative and classical complement cascades is the proteolytic conversion of the full-length C5 protein into the biologically active C5a and C5b molecules by the C5 convertase complex. C5b plays a role in the membrane attack complex, which is generally ineffective against gram-positive bacteria. C5a, on the other hand, is a potent chemoattractant that recruits PMNs to the site of infection (360). To inhibit the recruitment of these PMNs, the streptococci produce a protease that proteolytically inactivates C5a (845, 846).
The gene for the C5a peptidase (scpA) encodes a 1,167-residue protein with a molecular mass of approximately 128 kDa (118). A 31-residue N-terminal leader peptide and a very large N-terminal catalytic domain are followed by four short repeats that lie almost immediately upstream of the cell wall sorting signal. The enzymatic domain bears significant homology to the subtilisin family of proteases, but, unlike the broad activities displayed by many members of this class of enzymes, the C5a peptidase shows remarkable specificity: it is unable to cleave the full-length C5 molecule and cleaves only a single bond within C5a (125).
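The relationship between residue count and molecular mass quoted above can be checked with a quick back-of-the-envelope calculation. The sketch below assumes the common rule of thumb of roughly 110 Da per amino acid residue; that average and the function name are assumptions of this example, not values taken from the cited work.

```python
# Rule-of-thumb average mass of one amino acid residue in a polypeptide.
# This value is an assumption of the sketch, not a figure from the review.
AVG_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Return an approximate polypeptide mass in kDa from its residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# scpA product: 1,167 residues, reported as approximately 128 kDa.
print(f"scpA product: about {estimate_mass_kda(1167):.0f} kDa")
```

An estimate of this kind is only approximate, because actual mass depends on amino acid composition and on any processing of the precursor, such as removal of the leader peptide.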
Insertional inactivation of the scpA gene demonstrated that the C5a peptidase plays a role in virulence in a mouse model (386). Mutant strains were cleared more efficiently than wild-type strains were. In addition, the mutant strains were trafficked to the lymph nodes whereas the wild-type strain was found predominantly in the spleen. The C5a peptidase has been investigated as a vaccine candidate due to its widespread conservation throughout GAS (385). Genes nearly identical to scpA have also been found in both group B and G streptococci, although the expression of scpA in the group G streptococci appears to be restricted to the specific strains that are capable of infecting humans (119, 123, 124).
OF is an apoproteinase responsible for the ability of some strains of GAS to cause opalescence in serum (703). As mentioned above, the expression of OF is usually restricted to GAS strains that express class II M proteins. The cloning of OF in 1995 by two independent groups showed unequivocally that OF is distinct from M proteins (448, 658).
Whereas Rakonjac et al. cloned the OF gene on the basis of its ability to cause opalescence in serum (658), Kreikemeyer et al. cloned the gene based on its ability to bind fibronectin (448) and hence named the protein SfbII. The OF/SfbII protein itself is very large (1,025 residues) and contains an N-terminal leader peptide and a C-terminal cell wall sorting signal in which a seryl residue is substituted for the threonyl of the LPXTG motif. A domain repeated 2.5 times near the C terminus of the protein is homologous to the RD2 repeats from SfbI/protein F, and purified fusion proteins containing these domains are capable of competitively inhibiting the binding of radiolabeled fibronectin to streptococcal cells. Both groups noted that the gene is present only in a subset of streptococcal isolates, which correlates with what was previously known about OF (see above).
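A sorting signal with serine in place of threonine would be missed by a search for the strict LPXTG consensus. As a minimal sketch, the pattern can be relaxed to accept either residue at the fourth position. The function name and the two peptide fragments below are hypothetical and invented for illustration; they are not sequences from OF/SfbII or any other protein.

```python
import re

# LPXTG consensus with serine also allowed at the fourth position,
# as described for OF/SfbII. "." stands for X (any residue).
SORTING_MOTIF = re.compile(r"LP.[TS]G")


def find_sorting_motifs(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched motif) pairs for LPX[T/S]G in seq."""
    return [(m.start(), m.group()) for m in SORTING_MOTIF.finditer(seq.upper())]


# Hypothetical C-terminal fragments, invented for illustration only.
canonical_fragment = "AKKEDLPETGEEAAV"
serine_variant_fragment = "AKKEDLPASGEEAAV"

print(find_sorting_motifs(canonical_fragment))
print(find_sorting_motifs(serine_variant_fragment))
```

A pattern match alone does not establish a functional sorting signal, which also includes the hydrophobic domain and charged tail that follow the motif.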
T antigens form a group of approximately 25 serologically distinct molecules found on the surface of group A, B, C, and G streptococci that have the common property of being resistant to digestion with trypsin (437, 469). Although antibodies to T antigens are generated during the course of an infection, these antibodies are not protective against subsequent infections, and therefore very few studies on the biological relevance of these molecules have been carried out. Although T antigens are expressed independently of M antigens and more than one T antigen can be expressed simultaneously on the same strain, most M types are associated with a single T type (389). T antigens have been used as a means of subclassifying different strains of GAS and are used in cases where M antigen, due to either a lack of available typing sera or a lack of expression, cannot be used.
Although two T antigens have been purified (390, 502), to date the nucleotide sequence of only one T-antigen gene, tee6 from the M6 serotype strain D471, has been determined (714). The tee6 gene encodes a 55-kDa polypeptide of 537 amino acids (T6) that possesses a C-terminal cell wall sorting signal. The protein is rich in serine, threonine, and lysine residues and is rare among the streptococcal surface proteins in containing no discernible repeat domains. DNA hybridization studies demonstrated that the 5′ end of the tee6 gene hybridized with chromosomal DNA of only 10 of the 25 known T types, and probes constructed from the 3′ end of tee6 hybridized with even fewer isolates (397). Indeed, chromosomal DNA from several T-type isolates failed to hybridize with any tee6 probe. These findings leave open the possibility that the T antigens are actually several different molecules that have in common only their ability to resist trypsin digestion.
Group B streptococci (GBS or S. agalactiae) are responsible for the majority of cases of neonatal sepsis and meningitis in developed countries (25). GBS are frequently present in the vaginal flora of humans and can be transmitted to the neonate during or before birth. GBS can also cause invasive infections in adults, particularly the immunocompromised and the elderly (195). GBS can be subclassified immunologically according to their polysaccharide capsules, which fall into at least nine distinct types (436, 842). A majority of the invasive GBS infections are due to strains expressing the type III capsular polysaccharide, although strains expressing other capsular types are increasingly gaining in importance (66). The GBS can also be grouped according to the expression of a set of surface proteins designated Cα (alpha), Cβ, and Rib (202). In experiments with animal models, it has been determined that antibodies against either the polysaccharide capsule or the surface proteins are protective against infection with GBS (52, 685, 751, 843). Capsular polysaccharides can be poorly immunogenic, however, and this has led many laboratories to focus on the use of surface proteins as possible vaccines (471, 842). Unfortunately, research on the GBS surface proteins has focused primarily on their vaccine potential and relatively little has been determined about the role of these proteins in virulence.
Cα and Rib are homologous trypsin-resistant proteins found on the surface of the GBS. Rib is almost always found on the more invasive GBS strains of capsular type III, whereas Cα is usually expressed on strains expressing other capsular types (220, 751). Recently another Cα-like protein has been identified on a type V strain, but the sequence of this protein has not yet been published (464, 465). Rarely, if ever, are Cα and Rib expressed together on the same cell (220). The genes encoding Cα (bca) and Rib (rib) were sequenced from the type Ia GBS strain A909 and the type III strain BM110, respectively (550, 837). The two proteins have a similar organization with unusually long leader peptides followed by N-terminal domains of ≈170 amino acids that are 61% identical to each other. The N-terminal domains are followed by a set of 9 (Cα) or 12 (Rib) tandem repeat domains that in both cases are followed by a classical C-terminal cell wall sorting signal. The repeat domains of Cα and Rib contain 82 and 79 amino acids, respectively, and are 47% identical to one another. Strikingly, there is no variation between individual repeats in either protein, even at the nucleotide level. In both cases the repeat domains constitute greater than 70% of the total length of the protein.
Both Cα and Rib proteins are capable of eliciting protective immunity against GBS, but immunity against one does not confer immunity against GBS strains expressing the other even though the two proteins are highly homologous (471, 751). Since almost all strains of GBS express either Cα or Rib, a vaccine composed of both proteins should, in theory, provide broad-range immunity against these bacteria. The number of repeat domains among Cα proteins of different clinical isolates of GBS can vary significantly but averages between 8 and 9 (509). It is apparent that the repeat domains in both Cα and Rib are subject to insertions and deletions due to homologous recombination. There is some evidence that strains of GBS expressing Cα antigens with fewer repeat domains are less susceptible to the host immune response than are those that express Cα antigens containing larger numbers of repeats (280, 510). Conversely, it has been shown that Cα proteins with fewer repeat domains are more immunogenic (278, 279). It would therefore appear that the selective advantage conferred by Cα molecules containing fewer repeats is counterbalanced by their ability to generate a more vigorous immune response.
The role of Rib and Cα in the pathogenesis of GBS is not understood; however, a recent experiment has shown that a mutant GBS strain in which the bca gene was insertionally inactivated is attenuated in its ability to cause disease in mice (483). More studies are necessary to determine the exact role of these proteins in virulence.
The Cβ antigen is found on less than half of GBS strains, and many of these strains appear to express a truncated, secreted form of the protein (81, 512). The role of Cβ in virulence is unclear, but the protein has a high affinity for serum IgA and, under some conditions, secretory IgA (52, 81, 487, 691). Immunization of pregnant mice with purified Cβ can confer resistance against infection with GBS onto neonatal pups (511). The gene encoding Cβ (bac) was cloned from GBS strains SB35 (319) and A909 (382) by two independent laboratories. Both genes possess a 37-residue leader peptide and a classical C-terminal cell wall sorting signal. Approximately 180 residues upstream of the sorting signal is an unusual domain composed almost entirely of a triplet repeat, in which the first residue is a proline, the second residue alternates between a positively and a negatively charged amino acid, and the third residue is uncharged. The two genes differ slightly in the proline-rich repeat region, with that from the SB35 strain containing two short deletions. Analysis of recombinant Cβ expressed in E. coli indicated that the Cβ protein contains two or more separate IgA binding domains within the N-terminal half of the molecule, designated A and B (382). A later study mapped the binding site of the A domain to a region centered around a hexapeptide motif (MLKKIE), but it was not conclusively determined that this motif actually participated in the binding of IgA (383). Neither IgA binding domain on Cβ bears any significant homology to the IgA binding domains found in M proteins (50, 391). The Cβ typing antisera appear to recognize a separate domain that lies C-terminal to the IgA binding domain(s) (319).
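The triplet repeat described above (proline, then a residue whose charge alternates in sign from one triplet to the next, then an uncharged residue) can be expressed as a simple pattern search. The following is a minimal sketch under stated assumptions: K and R are treated as positively charged, D and E as negatively charged, and every other residue, including histidine, as uncharged. The function name and the test sequence are hypothetical and are not taken from the Cβ sequence.

```python
import re

POSITIVE = set("KR")  # assumed positively charged residues
NEGATIVE = set("DE")  # assumed negatively charged residues

# One triplet: proline, a charged residue, then a residue that is not K/R/D/E.
# At least three consecutive triplets are required to call a run.
TRIPLET_RUN = re.compile(r"(?:P[KRDE][^KRDE]){3,}")


def find_triplet_runs(seq: str) -> list[tuple[int, str, str, bool]]:
    """Return (start, run, charge signs, strictly_alternating) for each run."""
    runs = []
    for m in TRIPLET_RUN.finditer(seq.upper()):
        run = m.group()
        # The charged residue is the second position of every triplet.
        signs = "".join(
            "+" if run[i] in POSITIVE else "-" for i in range(1, len(run), 3)
        )
        alternating = all(a != b for a, b in zip(signs, signs[1:]))
        runs.append((m.start(), run, signs, alternating))
    return runs


# Hypothetical sequence, invented for illustration only.
print(find_triplet_runs("MSTPKAPEAPKAPEAGG"))
```

The minimum run length of three triplets is an arbitrary threshold chosen for the sketch; a real analysis would tune it to the length of the repeat region being examined.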
Streptococci in Lancefield groups C and G (GCS and GGS) comprise a group of closely related species that are capable of causing disease in both humans and a wide variety of animals. The large-colony forms of GCS, including the alpha-hemolytic S. dysgalactiae and beta-hemolytic S. equi (subspp. equi, equisimilis, and zooepidemicus), are common veterinary pathogens. Less often, these strains will infect humans, with S. equi subsp. equisimilis (S. equisimilis) being the most frequently isolated along with the small-colony form, S. anginosus (56). GGS are generally not given a species designation and are also associated with disease in both humans and animals.
Clinical isolates of GCS and GGS can possess antiphagocytic M proteins similar to those found in GAS (56, 57, 129, 395, 745). However, several studies have suggested that the expression of such proteins is limited to the specific strains that can infect humans (555, 717, 731). As mentioned previously, the GGS and GAS appear to be able to share their M protein genes by horizontal transfer (730, 750). A recent study of 38 human GGS isolates revealed that each possessed a single class I emm gene that could be assigned to one of 14 distinct types based on the N-terminal sequence of its protein (717). The sequences of some of these proteins were similar to known N-terminal sequences from GAS M proteins, whereas others were found to be quite divergent.
S. equi and S. zooepidemicus frequently colonize the nasopharynx and genital mucosa of horses. S. equi is the causative agent of strangles, a highly contagious disease of horses, whereas S. zooepidemicus is a normal mucosal commensal organism that can cause disease in an opportunistic fashion. Recent genetic studies have indicated that S. equi (also known as S. equi subsp. equi) is actually a variant strain of the more genetically diverse S. zooepidemicus. Strains of S. zooepidemicus express an acid-extractable surface antigen (SzP) that, like the GAS M proteins, is antigenically variable among strains. In contrast, S. equi appears to express two invariant surface proteins, one of which is related to SzP while the other appears to be unique to this strain (SeM).
The genes encoding the SzP protein have been sequenced from S. equi (SzPSe) and S. zooepidemicus W60 (SzPW60), as has the gene encoding SeM (784, 785). Both SzP and SeM bind equine fibrinogen, and it appears that these proteins can limit the deposition of equine C3 on the bacterial surface (72, 73, 784). Antibodies against SzP are opsonigenic against S. zooepidemicus (but not against S. equi), whereas S. equi is opsonized by antibodies against SeM (784). For these and other reasons, the SzP and SeM proteins were designated “M-like.” However, despite their functional similarities and the presence of an N-terminal leader peptide and C-terminal cell wall sorting signal, SeM and SzP do not bear significant homology to each other or to the M proteins of GAS. SzPSe and SzPW60 are 85% homologous to one another and do not possess large tandem repeats; however, the proline-rich region of each is composed of several copies of a 4-residue repeat, PEPK. SeM contains two tandem repeats of 21 residues each and several shorter C-terminal tandem repeats. No specific functions have yet been ascribed to any domain of either SeM or SzP.
As mentioned above, Myhre and Kronvall, in 1977, classified the IgG binding activities of the gram-positive bacteria, giving the distinction “type III” to proteins with binding properties similar to those found on the GCS and GGS (569). Shortly thereafter, Myhre and Kronvall also identified an albumin-binding activity on the surface of the GCS and GGS (569a). It has since become apparent that a single protein, protein G, confers both the IgG and albumin binding properties on these bacteria. Although possessing multiple ligand binding properties similar to those found in some GAS M proteins, protein G and related proteins are structurally distinct from the M proteins. Over the past decade, protein G-like molecules have been identified in several group C and G species of human and bovine origin as well as in S. dysgalactiae (MIG and MAG) (398, 399, 401), S. zooepidemicus (ZAG) (400), and the distantly related anaerobe Peptostreptococcus magnus (see below).
Protein G was first isolated based on its IgG binding activity from the group G strain G148 (60) and from the group C strain C26RP66 (667, 668). The albumin binding properties of this molecule were not described until the gene had been isolated and expressed in E. coli (59). Genes encoding protein G have been sequenced from strains G148 (290, 603), GX7805 (204), GX7809 (194), G43, and the group C strain C43 (734), allowing a detailed comparison of their structures. Unlike the GAS M proteins, the binding activities of the protein G family are contained within distinct modular domains, whose number can vary from strain to strain. The globular structures of these domains are more reminiscent of staphylococcal protein A (see below) than the extended alpha-helical coiled-coil structure of the M proteins.
The binding of albumin and that of IgG occur within separate domains of protein G, each encoded by a set of tandem repeats (6, 735). The modular albumin binding domains (GA modules) are contained within the N-terminal region of protein G, whereas the IgG binding domains are encoded in a more C-terminal region of the molecule. The structures of both domains have been determined (286, 387, 388). GA modules are found in a wide variety of streptococcal and peptostreptococcal albumin binding proteins; however, these domains are not present in all protein G molecules from human clinical isolates (388, 733, 734). Protein G also binds the protease inhibitor α2-macroglobulin (α2M), but this binding is inhibited by IgG and it is unclear whether it is physiologically relevant (736).
MIG and MAG are surface proteins cloned from two different strains of S. dysgalactiae, whereas ZAG was isolated from a strain of S. zooepidemicus (399–401). Earlier studies had suggested that S. zooepidemicus possessed a unique type (type V) of IgG binding protein; however, sequence analysis revealed that MIG, MAG, and ZAG are similar to type III protein G. Like protein G, MAG and ZAG bind albumin (MIG does not), and all three of these proteins have been shown to bind α2M. The binding of α2M occurs through an N-terminal domain unique to MIG, MAG, and ZAG that is not found in protein G.
Fibronectin binding proteins have been cloned from the GCS species S. dysgalactiae, S. equisimilis, and S. zooepidemicus (490–492). S. dysgalactiae has two adjacent genes, fnbA and fnbB, that encode fibronectin binding proteins (490). Unlike the two fibronectin binding proteins from S. aureus, however, FnbA and FnbB are quite dissimilar and are most probably not the product of gene duplication. In fact, the C-terminal fibronectin binding domains of FnbA and FnbB each have greater homology to protein F than they do to one another. The reasons why this bacterium contains two proteins with this activity are unclear, although under laboratory conditions only FnbB appears to be expressed (490). The C-terminal domain of FnbB is also highly homologous to the fibronectin binding protein of S. equisimilis (491).
The fibronectin binding protein of S. zooepidemicus, FNZ, is structurally similar to protein F, including the presence of nearly identical fibronectin binding repeat domains and the presence of a UR-like sequence. The UR domain of FNZ, however, lies between the first and second fibronectin binding repeats rather than immediately upstream as is found in protein F. FNZ, at ≈60 kDa, is small in comparison to the much larger (>100-kDa) fibronectin binding proteins of S. dysgalactiae and S. equisimilis (492).
The pneumococci are one of the leading causes of septicemia, meningitis, and lower respiratory tract infections in humans. Despite rapid and aggressive antibiotic therapy, pneumococcal infections can lead to a variety of debilitating complications including hearing loss. The polysaccharide capsule expressed by these bacteria is required for virulence but in itself is not toxic. Although proteins are obviously important, their role in the pathogenicity of these bacteria has not been extensively studied (620). S. pneumoniae produces two hemolysins, one of which, the well-studied pneumolysin, has been clearly demonstrated to contribute to the disease process (38, 109, 620, 663). In addition, the pneumococci produce two adhesins, PspA and PsaA, which may also contribute to the ability of these bacteria to establish an infection within a host (48, 875). Only two cell wall-anchored enzymes, a neuraminidase and a β-N-acetylglucosaminidase, have thus far been cloned in S. pneumoniae. Both of these glycosidases probably play a role in virulence as well.
It has long been appreciated that all clinical isolates of S. pneumoniae possess neuraminidase activity (425, 608). Neuraminidases, found in a wide variety of pathogenic microbes, cleave terminal sialic acid residues from a variety of glycosylated proteins, lipids, and oligosaccharides found both in body fluids and on the surfaces of cells. Neuraminidase activity correlates with virulence in pneumococcal meningitis, and early studies with crudely purified pneumococcal neuraminidase demonstrated its toxicity in mice (426, 608). At least two genes encode neuraminidases in the pneumococci (47, 108). One gene, nanB, encodes a putative secreted protein, while a gene 4.5 kb upstream, nanA, encodes a protein with the typical characteristics of a gram-positive surface protein (47, 107). NanA and NanB have no significant homology to one another and appear to have optimal activities at different pH values, suggesting that they may act at different sites or stages during an infection (47). NanA is >100 kDa and has four copies of a consensus bacterial neuraminidase sequence motif (SXDXGXTW), suggesting that it belongs to the “large” family of bacterial neuraminidases (107, 675). The surface localization of NanA has been confirmed by immunoelectron microscopy (107).
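As an illustration of how copies of the SXDXGXTW consensus could be counted in a protein sequence, the following is a minimal sketch that treats X as any residue. The function name and the toy sequence are hypothetical and are not taken from NanA; real motif searches usually tolerate some degeneracy that this exact-match pattern does not.

```python
import re

# Consensus quoted in the text: S-X-D-X-G-X-T-W, where X is any residue.
# A lookahead is used so that overlapping occurrences would also be reported.
NEURAMINIDASE_MOTIF = re.compile(r"(?=(S.D.G.TW))")

def find_motif(seq):
    """Return (1-based start position, matched text) for each occurrence."""
    return [(m.start() + 1, m.group(1))
            for m in NEURAMINIDASE_MOTIF.finditer(seq.upper())]

# Hypothetical toy sequence built from three motif-like segments and spacers.
toy = "MK" + "SADGGKTW" + "LLP" + "SRDNGKTW" + "AAQ" + "SWDAGSTW" + "KK"
hits = find_motif(toy)
for position, text in hits:
    print(position, text)
```

Applied to a real sequence, a count of four hits would be consistent with the description of NanA above, although an exact-match pattern may miss diverged copies.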
The exact role(s) that the neuraminidases play in virulence is unclear, although it has been suggested that removal of sialic acid from surface molecules can “unmask” carbohydrate ligands for pneumococcal adhesion (449, 489). A recent experiment demonstrated that a strain of S. pneumoniae with an interrupted nanA gene was not defective in its ability to cause cochlear damage in an experimental guinea pig meningitis model whereas strains deficient in pneumolysin were avirulent (856). These strains, however, still possessed an intact nanB gene, although the authors claimed that the activity from this gene was minimal.
The gene for the pneumococcal β-N-acetylglucosaminidase (strH) was cloned from a phage lambda expression library from S. pneumoniae by screening for plaques with enzymatic activity (122). StrH is very large (144 kDa) and possesses both an N-terminal leader peptide and a typical C-terminal cell wall sorting signal. The protein also contains two rather large (335-residue) tandem repeat domains, which account for approximately half of the mature protein. The repeats, separated by approximately 100 residues, each contain a 30-residue sequence motif that is homologous to conserved sequences found in other bacterial hexosaminidases. Host tissues are abundant in surface molecules containing GlcNAcβ1-4-linked residues, a substrate for the enzyme, but the role of the StrH protein in the biology of S. pneumoniae is unclear.
The oral cavity is typically host to several species of gram-positive bacteria, most notably the viridans group streptococci. Important among the viridans group are S. mutans and S. sobrinus, which have been implicated as primary causes of dental caries (376, 497). Colonization of the oral cavity requires that the bacteria bind to salivary components, to other bacteria, and to both soluble (dextran) and insoluble (mutan) extracellular polysaccharides that are synthesized by the bacteria themselves. Surface hydrophobicity has been speculated to play a role in the ability of bacteria to adhere to the oral pellicle, and surface proteins have been implicated in contributing to overall surface hydrophobicity.
Surface proteins from these organisms have been widely studied for their role in oral colonization as well as their potential as vaccine targets. At least 20 polypeptides are exposed on the cell surface of S. gordonii and S. sanguis (14, 377). Sequence analysis of several of these surface proteins has revealed that viridans streptococci use similar adhesins to attach to both host tissue components and other bacteria (160).
The streptococcal antigen I/II proteins have been found on the surface of almost every oral streptococcal species studied thus far. Antigen I/II (688) of S. mutans serotype c has also previously been designated B antigen (689), IF (359), P1 (221), MSL-1 (150), and PAc (598). Genes homologous to antigen I/II have been cloned and sequenced in S. mutans serotype f (sr protein), S. gordonii (SspA and SspB), and S. sobrinus (SpaA and pag).
These proteins are almost identical and have both N-terminal leader peptides and C-terminal cell wall sorting signals. They are large (1,500 to 1,566 amino acid residues) and, like many other surface proteins, contain multiple binding activities and repeat domains (378). I/II antigens have been reported to be able to bind salivary glycoproteins as well as other oral microbes. The structure and function of this group of proteins have been the subject of several recent reviews (375, 376, 378).
The surface localization of P1 and its attachment to the streptococcal cell wall have recently been demonstrated (353, 565, 566). A mutation was isolated in S. mutans GS-5 that resulted in the complete loss of surface attachment of P1 (565). Sequencing of the GS-5 variant revealed a frameshift mutation that resulted in the production of an antigen I/II variant lacking the C-terminal end. Homonylo-McGavin and Lee found that antigen I/II of S. mutans could be localized to the surface of S. mutans, S. gordonii, and Enterococcus faecalis only if the C-terminal domain was intact (353). In addition, antigen I/II remained associated with the cell after boiling in SDS only if the cell wall sorting signal was present. This was confirmed in a later study in which it was shown that removal of the charged tail led to the complete solubilization of antigen I/II from the cell surface (566).
CshA is a very large (≈290-kDa, 2,508-amino-acid) surface protein which was identified in S. gordonii (538, 539). Cells defective in the expression of this protein have decreased cell surface hydrophobicity and a defect in their ability to coaggregate with oral actinomycetes (538, 539) and other bacteria (537), suggesting that this protein is probably necessary for colonization and adherence to the oral pellicle. Indeed, cshA mutants are defective in their ability to colonize the oral cavity of mice (539).
Like many other surface proteins, much of CshA is composed of repeating domains (539). The N-terminal domain of the mature protein (amino acids 42 to 878) is nonrepetitive, whereas the central portion of the protein (amino acids 879 to 2417) is composed of 16 repeat regions of ≈100 amino acids each. All of the binding regions thus far identified in CshA have been mapped by antibody inhibition studies to the N-terminal nonrepetitive domain (537). The C terminus of CshA is necessary for the retention of the polypeptide on the cell surface, since C-terminal mutations which truncate CshA to a protein of ≈260 kDa result in the complete secretion of CshA into the growth medium and a subsequent loss of surface hydrophobicity (538).
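The domain coordinates quoted above can be checked with simple arithmetic. The sketch below assumes that the coordinates are 1-based and inclusive and that the 16 repeats tile the central region without gaps; the variable names are illustrative only.

```python
# Coordinates quoted in the text (assumed 1-based, inclusive).
TOTAL_RESIDUES = 2508
NONREPETITIVE = (42, 878)
REPEAT_REGION = (879, 2417)
N_REPEATS = 16

def span_length(start, end):
    """Number of residues in an inclusive coordinate range."""
    return end - start + 1

nonrepetitive_len = span_length(*NONREPETITIVE)        # 837 residues
repeat_region_len = span_length(*REPEAT_REGION)        # 1539 residues
mean_repeat_len = repeat_region_len / N_REPEATS        # about 96 residues
c_terminal_len = TOTAL_RESIDUES - REPEAT_REGION[1]     # 91 residues

print(nonrepetitive_len, repeat_region_len,
      round(mean_repeat_len, 1), c_terminal_len)
```

Under these assumptions the mean repeat length comes to roughly 96 residues, in line with the stated figure of ≈100 amino acids per repeat, and 91 residues remain downstream of the repeat region.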
Oral streptococci produce polysaccharides that have a short half-life in dental plaque and are thought to be used for short-term sugar storage. These polymers are synthesized by a set of glucosyl-transferases (see below). Unlike the insoluble mutan polysaccharides, which are composed of α1-3 linkages and are unable to be enzymatically degraded, the storage polysaccharides are composed of glycosidic bonds that are susceptible to enzymatic lysis. β-d-Fructosidase (FruA) is a surface-associated enzyme capable of liberating fructose through the degradation of levans as well as sucrose, raffinose, and inulins. The fruA gene was cloned and sequenced from S. mutans, but Northern blot analysis suggests that homologues are likely to exist in all of the mutans streptococci (103). FruA is notable in having almost no proline or glycine residues in the area of the protein predicted to span the cell wall.
Interestingly, a study of S. mutans grown in a chemostat demonstrated that over 95% of the fructosidase activity was located in the supernatant (105), whereas the vast majority of the activity was cell associated in batch culture (103). An investigation into this phenomenon revealed that most of the fructosidase can be found in the supernatant when the bacterium is grown at a pH of 7.0, whereas approximately half is cell surface bound when the organism is grown at a pH of 5.0. The study also demonstrated that the release of fructosidase into the extracellular medium can be inhibited by copper, a phenomenon also observed for the release of antigen I/II (104).
Cell surface dextranases cleave glycosidic bonds and are thought to be involved both in the metabolic utilization of sugar and in controlling the amount and content of extracellular polymerized glucans (126, 249). The streptococcal dextranases degrade dextran into isomaltosaccharides (362). Genes encoding extracellular dextranases have been cloned from S. mutans (363), S. sobrinus (827), S. salivarius (594), and S. suis (723), although only the dextranases from S. mutans and S. sobrinus possess C-terminal cell wall sorting signals. The S. mutans and S. sobrinus dextranases are 47% homologous to one another, although the S. sobrinus enzyme is somewhat larger. In addition, the two enzymes have similar enzymatic properties and pH optima, indicating functional similarity (827). Inactivation of the dextranase gene of S. mutans appears to increase the ability of the bacteria to adhere to smooth surfaces but has no effect on sucrose-dependent cell-cell aggregation (126).
Under conditions of stress, certain strains of S. mutans express a glucan binding protein that enables these cells to aggregate in the presence of dextran (704). The gene for this protein, gbpC, has recently been cloned and has been shown to encode a ≈60-kDa protein that contains both an N-terminal leader peptide and classic C-terminal cell wall sorting signal (704). The glucan binding activity of GbpC has been demonstrated in extracts of E. coli which harbor a plasmid encoding the gene (704). GbpC has limited homology to streptococcal antigen I/II, and the protein cross-reacts with anti-antigen I/II antisera. More work is required to determine why there appear to be several glucan binding proteins in S. mutans. The only other glucan binding protein to be sequenced thus far has no C-terminal cell wall sorting signal (26).
Wall-associated protein antigen A (WapA [201, 689]) and antigen III (686) were recently shown to be identical (687). WapA is a 45-kDa surface antigen of S. mutans, which has been implicated in helping the bacteria to bind smooth surfaces and to undergo sucrose-dependent aggregation (655), although this is a matter of some dispute (307, 687). Also disputed is its ability to serve as an effective vaccine against dental caries (687).
The actinomycetes, including Actinomyces naeslundii and, perhaps, A. odontolyticus, are some of the most common bacteria in the oral flora. Like the oral streptococci, these bacteria are involved in the early stages of plaque development and have the ability to bind both the tooth surface and other oral bacteria (441, 484). A. naeslundii binds to the tooth surface via a fimbrial structure, designated type 1, which has affinity for proline-rich proteins that coat the tooth enamel (121, 257). The adherence of actinomycetes to eukaryotic host cells and other bacteria, including several streptococcal species, is directed by a separate set of fimbriae designated type 2. Type 2 fimbriae appear to utilize a lectin-like activity to bind to the surface of target cells, and this binding can often be disrupted by high concentrations of sugars such as lactose (298, 440).
Strains of A. naeslundii may express either one or both types of fimbriae on their surface. One strain, T14V, expresses both types of fimbriae and has been used as a model system for the study of these structures in detail at the molecular level (120). The gene encoding the type 1 fimbrial subunit (fimP) of strain T14V has been cloned and sequenced, as has the gene encoding the type 2 fimbrial subunits (fimA) of T14V and another strain, WVU45 (867, 868). All three cloned subunit genes encode proteins of ≈54 to 59 kDa that have typical N-terminal leader peptides as well as C-terminal cell wall sorting signals. The type 2 subunit from T14V has a very high degree of homology (≈65% identity) to the type 2 subunit from WVU45 but only moderate homology (≈31% identity) to the type 1 subunit from T14V (868, 869).
The presence of the cell wall sorting signals is somewhat unexpected, and their possible role in fimbrial assembly is unclear. Accessory genes that appear to be involved in the assembly of intact fimbriae from individual subunits have been identified for both the type 1 and 2 fimbriae from strain T14V, but no assembly pathway has been established and specific roles for these genes have not yet been determined (869, 870). A clue to the role of the C-terminal cell wall sorting signal may be revealed by the fact that it is not possible to dissociate assembled fimbriae into their individual monomer subunits, suggesting that they may be covalently linked to one another. Additionally, antibodies raised to the C-terminal domain of a recombinant monomer subunit are unable to recognize the intact fimbriae, suggesting that the C-terminal domain may be proteolytically removed (869). This evidence, although far from conclusive, has led Yeung et al. to suggest that perhaps the C-terminal sorting signal directs the covalent multimerization of the fimbrial subunits (869). If this were true, it would represent a novel use of the sorting machinery to synthesize a complex fimbrial organelle. Verification of this model will require the structural characterization of both free and complex-bound fimbrial subunits.
The gram-positive anaerobe Peptostreptococcus magnus is a common human commensal organism that can also cause a variety of diseases including vaginosis and urethritis. Approximately 10% of clinical isolates express a unique Ig receptor, protein L (PPL), that has the unusual ability to bind the variable portions of Ig kappa light chains (4, 58, 188, 587, 855). This unusual binding specificity confers upon PPL the ability to bind antibodies of almost any subclass, and therefore this molecule has generated much interest as a tool for purifying and studying Ig. The gene encoding PPL has been cloned and sequenced from P. magnus 312 (413) and 3316 (567). A comparison of their structures reveals that these proteins have a modular design similar to that found for protein G from the GGS. PPL3316 and PPL312 possess four and five tandem repeat domains, respectively, that encode the Ig light-chain binding modules, whose three-dimensional structure has recently been determined (854). The two proteins also possess common tandem repeat domains of unknown function. PPL3316, but not PPL312, possesses four copies of the albumin binding GA modules similar to those found in protein G, and, as expected, strain 3316 binds albumin whereas strain 312 does not (413, 567).
A physiological role for the light-chain binding function of PPL has been proposed. Addition of strain 312 to a culture of human basophils was found to stimulate the release of histamines, whereas no such stimulation was observed upon the addition of a strain that lacks PPL (619). A similar effect was observed upon the addition of purified recombinant PPL to the cultures, presumably due to cross-linking of the membrane-bound IgE receptors present on these types of host cells.
Approximately half of P. magnus strains that do not possess PPL bind albumin through a receptor known as PAB. The gene encoding PAB has been sequenced from strain ALB8 and was found to encode a 43-kDa protein that, like protein G and PPL, is strikingly modular in its design (141). Although PAB lacks the kappa light-chain binding domains found in PPL, it does have many similarities to PPL, including the presence of a “GA module” as well as a sequence homologous to one of the repeats of unknown function from PPL312. The structural similarities between PAB and PPL seem to indicate that these two proteins are actually derived from a common ancestral molecule that evolved through the shuffling of the relevant binding modules (140). Indeed, a recent study has identified short nucleotide sequences within PPL and PAB that appear to facilitate the shuffling of these modular domains in a manner similar to the swapping of exons in eukaryotic cells (140).
Enterococci possess a unique and highly efficient mechanism for the conjugal transfer of plasmid DNA (166, 178). Expression of this transfer system in donor cells is induced by an oligopeptide mating pheromone that is secreted by recipient cells (167). Induction results in the increased expression of surface protein (aggregation substance [AS]) on the donor that acts as an adhesin specific for the surface of recipient cells (239). An additional surface protein (surface exclusion protein) is involved in preventing donor cells from conjugating with identical donors (170). The pheromone-induced expression of aggregation protein appears to be controlled by a novel posttranscriptional mechanism (36, 37). The genetics of this system has recently been reviewed in detail (168, 169). A similar conjugative plasmid transfer system has recently been identified and sequenced in the Enterococcus faecium plasmid pHKK701 (315).
The genes encoding AS from three conjugal plasmids, pCF10 (411), pAD1 (237), and pPD1 (236), have been sequenced, and it is believed that almost all enterococcal conjugal plasmids encode similar aggregation proteins (238, 339, 857). The three known AS proteins are all large (≈145-kDa) proteins that have significant sequence homology to each other and possess a typical C-terminal cell wall sorting signal as well as a proline-rich domain. The domain structure of these proteins is similar to that found in both the antigen I/II proteins of the viridans streptococci and the CluA protein of L. lactis, consistent with their roles as mediators of bacterial aggregation. The ligand for AS, designated enterococcal binding substance (EBS), is unidentified to date. The inactivation of several unlinked genes is required to obtain an EBS-negative phenotype, and it is speculated that lipoteichoic acid may be a component (711). The expression of both AS and EBS has been suggested to play a role in virulence in addition to their role in conjugal transfer of DNA (602, 711).
As mentioned above, surface exclusion proteins reduce the probability of mating between enterococci carrying identical plasmids, although the exact mechanism underlying this phenotype is unclear. The surface exclusion proteins of the plasmids of pAD1 (840) and pCF10 (411) have been sequenced. Both genes encode homologous 95-kDa proteins that have both N-terminal leader peptides and C-terminal cell wall sorting signals. The gene encoding the surface exclusion protein of plasmid pPD1 has also been sequenced and has been found to contain an insertion sequence that results in a truncated protein without a C-terminal cell wall sorting signal (339). Cells containing plasmid pPD1 lack the surface exclusion phenotype, strongly suggesting that cell wall anchoring is required for the surface exclusion phenotype (339).
Not unlike their close relatives the enterococci, certain strains of lactococci are able to exchange genetic information through conjugal mating. Recently a gene encoding the aggregation factor was cloned from Lactococcus lactis subsp. cremoris (271). The gene encodes a polypeptide of 1,243 residues that has a modular structure similar to other surface proteins. Sequence analysis revealed homology to the aggregation proteins of E. faecalis. Three central domains have homology to both the E. faecalis domains and the antigen I/II proteins from the viridans streptococci.
Lantibiotics are ribosomally synthesized antimicrobial oligopeptides that are produced by several gram-positive species. These (β-methyl)lanthionine-containing peptides display activity primarily against other gram-positive bacteria and have been the subject of considerable interest because of their ability to inhibit food spoilage (147). Nisin, produced by L. lactis, is one of the best characterized of the lantibiotics. The genetics and modification of nisin and other lantibiotics have recently been the subject of several reviews (154, 190, 373, 693).
Nisin is synthesized as a 57-amino-acid precursor peptide in the cytoplasm, at which point it undergoes a series of posttranslational modifications before being exported from the cell. The N-terminal 23 amino acids of pre-nisin serve as an atypical leader peptide that assists in the secretion of pre-nisin from the cytoplasm by a putative ATP binding cassette transporter (186, 451). The exported nisin pre-peptide is inactive due to the presence of the N-terminal leader sequence but is converted to an active form upon removal of its leader peptide by a surface-located serine protease (product of the nisP gene). NisP has also been implicated in providing immunity to nisin, although the mechanism by which this would occur is unclear (866).
Sequence analysis of NisP has revealed that it contains both an N-terminal leader sequence and C-terminal cell wall sorting signal (810). Neither the surface localization nor attachment of NisP to the peptidoglycan has been verified experimentally. It is unclear why such an attachment would be necessary, given that functionally homologous lantibiotic-peptidases from other species do not have a cell wall sorting signal in their sequence (154).
During growth in milk, the primary source of amino acids for the lactococci are the milk proteins (caseins), since the supply of free amino acids in milk is relatively limited (405). The caseins are initially broken down by the bacterium into small peptides, which are subsequently taken up by an oligopeptide permease system (453). The imported peptides can then be further degraded into individual amino acids by a series of intracellular proteases. The complete protease system of L. lactis has recently been reviewed (453).
The casein degradation pathway is initiated by a cell wall-bound serine protease called PrtP or CEP (192, 744). The prtP gene has been cloned and sequenced in a number of L. lactis strains (155, 429, 439), and each displays >95% homology to the other prtP genes. Sequence analysis has revealed that the proteins encoded by prtP are approximately 200 kDa and contain both an N-terminal leader peptide and a C-terminal cell wall sorting signal. The substrate specificity of the PrtP proteinase varies from strain to strain (192) but generally falls into one of two classes, designated PI and PIII, depending on the specific casein that the enzyme preferentially degrades (453). It has been demonstrated that PrtP is synthesized as a preproenzyme and, after being exported from the cytosol, is activated by the action of another cell surface protease, PrtM (293, 821), which is a lipoprotein (294). The attachment of PrtP to the cell wall and the role of the C terminus had been suggested several years before the cell wall sorting mechanism had been elucidated for protein A in S. aureus (155, 821).
Interestingly, a distantly related gram-positive bacterium, Lactobacillus paracasei, has a surface protein (PepP) which is homologous to the lactococcal PrtP. The putative C-terminal cell wall sorting sequence is unusual in that the ultimate glycine of the consensus motif has been replaced with an alanine (LPKTA) (343). It remains to be proven whether this sorting signal is functional.
Listeria monocytogenes is an invasive intracellular pathogen that can infect animal hosts from within the intestinal lumen. Once inside the host, these bacteria invade eukaryotic cells and replicate intracellularly, thereby escaping clearance by the humoral immune response (131). The ability of L. monocytogenes to enter eukaryotic cells has been traced to a family of surface and secreted proteins, the internalins. Thus far, seven members of the internalin family have been identified (InlA to InlC, InlC2, and InlD to InlF), and all share certain structural features (162, 235). All feature an N-terminal leucine-rich-repeat domain (LRR) followed by a conserved inter-repeat region (IR) (Fig. 15). InlA, InlC2, and InlD to InlF possess cell wall sorting sequences at their C termini; InlB is targeted to the cell envelope (see below) (82); and InlC (also called IrpA) is secreted (159, 185, 493). It has been demonstrated that the LRRs and IR of InlA are sufficient to confer an invasive phenotype when expressed on the surface of latex beads or the noninvasive bacteria L. innocua or E. faecalis (475). Similar findings have recently been made for InlB (83, 83a).
InlA binds to E-cadherin, and it has been shown that this interaction is critical for internalin-mediated invasion of epithelial cell lines (543). It has also been demonstrated that the LRR and IR are both necessary and sufficient for this interaction (475). No ligand has been identified for any of the other internalin family members. InlA is one of the few proteins for which a covalent linkage to the cell wall has been demonstrated (474).
The purpose of multiple internalin proteins is unclear, but it has been hypothesized that they confer tropism toward different cell types. For example, it has been suggested that InlB is critical for the invasion of Listeria into hepatocytes as well as several other cell types (161, 368). This result is disputed by in vivo experiments in which it was determined that InlB deletion mutants are able to invade hepatocytes (283) but are unable to escape the phagocytic vacuole and replicate intracellularly (284). Indeed, the precise role of any of the internalins in vivo is not yet entirely clear (131).
Staphylococci are common, highly virulent nosocomial pathogens that are receiving widespread attention due to the appearance of strains that are resistant to all useful antibiotics. Primary to their ability to establish infection in the hospital setting is their ability to bind biomaterial deposited on catheters and other foreign bodies (226). With the exception of protein A, every staphylococcal wall-anchored surface protein thus far characterized is an MSCRAMM. The regulation of surface proteins in S. aureus is widely believed to be under the control of the agr virulence regulon, which induces the expression of surface proteins during growth at low cell density (384, 591, 665). However, a recent study of the collagen adhesion protein suggests that it is under the control of another, overlapping regulatory system, sar (262).
It has been reported that the staphylococcal epidermal cell differentiation inhibitor (EDIN) (763) has a cell wall sorting signal; however, this report appears to be incorrect (215). The C-terminal domain of the protein contains a sequence, LPRGT, which is similar but not identical to the consensus LPXTG motif, and has a tail containing two lysine residues (366). EDIN, however, does not possess a significantly hydrophobic stretch of residues at its C-terminal end. In addition, the protein is found primarily in culture supernatants, and amino acid analysis demonstrates that EDIN is not processed at its C-terminal end, thereby indicating that this protein is not anchored to the cell wall (366, 763).
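The reasoning applied to EDIN above, namely checking for an LPXTG motif, a hydrophobic stretch, and a positively charged tail, can be sketched as a simple sequence screen. This is only an illustrative heuristic: the hydrophobic alphabet, window sizes, and thresholds are assumptions chosen for the example, the function names are hypothetical, and both test sequences are invented toy sequences rather than real protein sequences.

```python
import re

LPXTG = re.compile(r"LP.TG")      # consensus motif; X is any residue
HYDROPHOBIC = set("AVLIMFWG")     # assumed crude hydrophobic alphabet
POSITIVE = set("KR")              # positively charged residues

def sorting_signal_features(seq, region=40, window=15,
                            min_hydrophobic=12, tail_len=8):
    """Score three features of a C-terminal sorting signal (all cutoffs assumed)."""
    c_term = seq.upper()[-region:]
    motif = LPXTG.search(c_term) is not None
    # Slide a window over the C-terminal region and look for a hydrophobic stretch.
    hydrophobic = any(
        sum(aa in HYDROPHOBIC for aa in c_term[i:i + window]) >= min_hydrophobic
        for i in range(len(c_term) - window + 1)
    )
    # Require at least two positively charged residues near the extreme C terminus.
    charged_tail = sum(aa in POSITIVE for aa in c_term[-tail_len:]) >= 2
    return {"motif": motif, "hydrophobic": hydrophobic, "charged_tail": charged_tail}

def looks_anchored(seq):
    """True only when all three features are present."""
    return all(sorting_signal_features(seq).values())

# Hypothetical toy sequences, not real proteins.
anchored_like = "MKKTEEAAQ" + "LPETG" + "EENPFIGTTVFGGLSLALGAALLAG" + "RRREL"
edin_like = "MKKTEEAAQS" + "LPRGT" + "DSNEQSTDNEQSKDNK"

print(sorting_signal_features(anchored_like))
print(sorting_signal_features(edin_like))
```

In this sketch the EDIN-like toy sequence has a charged tail but fails both the motif and the hydrophobicity checks, mirroring the argument in the text. Such a screen cannot replace the experimental evidence cited above, which rests on the location of the protein and on analysis of its C-terminal end.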
Staphylococcal protein A was the first nonimmune Ig binding protein identified (222). Since its discovery, it has become the best studied and most widely used surface protein of gram-positive bacteria. Early studies established the localization of protein A to the cell wall and even suggested that the protein was covalently attached to the staphylococcal peptidoglycan. The sequence of protein A was determined in 1983 and 1984; however, it should be noted that the earlier publications included a sequencing error at the 3′ end of the gene that eliminated the proper translation of the coding sequence for the cell wall sorting signal (291, 803). The IgG binding properties of protein A are due to the presence of five (in some strains, four) almost identical globular domains that bind the Fc regions of IgG1, IgG2, and IgG4. The three-dimensional structure of a protein A-antibody complex revealed that the protein A-Ig interaction occurs at the CH2 and CH3 domains of the Fc fragment (143). Despite all that is known about the structure and binding properties of protein A, almost nothing is known about its exact role in virulence. Studies on protein A mutants in animal model systems have yielded conflicting results, suggesting that other proteins may play a greater role in staphylococcal pathogenesis than protein A (226, 618).
The gene encoding the collagen adhesion protein of S. aureus (cna) was cloned from a phage expression library by screening with antibodies raised against a purified collagen adhesion protein (626). Like MSCRAMMs of other gram-positive bacteria, the Cna protein has a large unique N-terminal region followed by a set of 187-amino-acid repeats, a proline-rich region, and a classic cell wall sorting signal. The collagen-binding domain had been localized to the unique N-terminal domain (622, 625), an observation now verified by the recent determination of the crystal structure (773). The role of the repeat domains, which can vary in number from strain to strain (772), remains unclear.
The fact that cna is both necessary and sufficient for the binding of S. aureus to cartilage has been demonstrated convincingly in vitro (772). In addition, cna mutants are less virulent in a rat experimental endocarditis model and in a mouse septic arthritis model (623). However, a recent study of clinical isolates from patients with endocarditis or bone or joint infection found that strains from only half of these patients possessed the cna gene, which indicates that collagen binding is probably not a prerequisite for these types of infection (692). The potential of collagen adhesin as a staphylococcal vaccine is being explored (589).
S. aureus was the first bacterium in which specific binding to fibronectin could be demonstrated, although the gene product responsible remained elusive for almost a decade (462). A >200-kDa fibronectin binding protein was purified from S. aureus Newman (191, 233), and soon thereafter a gene encoding a similar activity was cloned from strain 8325-4 (218). Sequencing of this gene, fnbA, revealed that it encodes a surface protein, designated FnBPA, with a predicted Mr of 108,000 (727). The gene for another fibronectin binding protein, FnBPB, was discovered only 682 nucleotides downstream of the fnbA gene (402), although not all S. aureus strains possess both genes (226).
Both FnBPA and FnBPB are similar in their general organization to the FnBPs from other gram-positive cocci such as Streptococcus pyogenes (see above). FnBPA is a 982-residue protein composed of a unique N-terminal domain that is interrupted by two copies of a 35-amino-acid repeat, termed B1 and B2 (727). Downstream from the B repeats are 3.5 repeats of 38 amino acids that are homologous to repeat domains found in the other fibronectin binding MSCRAMMs from S. pyogenes and S. dysgalactiae. FnBPB does not possess B repeats and is only 45% similar to FnBPA in the N-terminal domain, but it is essentially identical to FnBPA in all other aspects (402).
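As a rough consistency check between the residue counts and the molecular masses quoted in this section, a protein mass can be approximated from its length. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb rather than a value taken from the text; actual masses depend on amino acid composition and on processing of the leader peptide and sorting signal.

```python
AVG_RESIDUE_DA = 110.0  # assumed rule-of-thumb average mass per residue

def approx_mass_kda(n_residues, avg_da=AVG_RESIDUE_DA):
    """Approximate polypeptide mass in kDa from its residue count."""
    return n_residues * avg_da / 1000.0

# Residue counts quoted in the text.
fnbpa_kda = approx_mass_kda(982)    # FnBPA, predicted Mr of 108,000
csha_kda = approx_mass_kda(2508)    # CshA, described as approximately 290 kDa

print(round(fnbpa_kda, 1), round(csha_kda, 1))
```

For FnBPA the estimate of about 108 kDa agrees with the predicted Mr and is far below the >200-kDa size reported for the protein purified from strain Newman. The estimate for CshA falls somewhat below the quoted ≈290 kDa, which is within the error expected of such an approximation.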
A study of S. aureus 8325-4 revealed that binding of S. aureus to fibronectin-coated coverslips was not significantly affected in either an fnbA or fnbB single mutant but was completely abolished in a double mutant. Complementation with either one of the two fnb genes restored the fibronectin binding phenotype to near wild-type levels (282), thereby demonstrating that both genes can be expressed in S. aureus and that both are capable of mediating staphylococcal adherence to fibronectin. Antibodies to this protein are capable of providing protection against infection in several model animal systems, indicating that this protein may be a potential vaccine candidate (514, 683, 684, 708). The role of this protein in pathogenicity is still the subject of some debate, since mutants do not display a virulence defect in a rat model of endocarditis (219).
S. aureus forms clumps in plasma due to its ability to bind fibrinogen, a multisubunit glycoprotein that is proteolytically converted into fibrin, a major component in blood clots. Fibrinogen is deposited in large quantities at wound sites and is quickly deposited on synthetic material such as catheters (226). Fibrinogen binding, therefore, is seen as a major primary virulence determinant in S. aureus, especially in infections due to the presence of foreign bodies.
The fibrinogen binding protein of S. aureus was identified by transposon mutagenesis followed by screening for mutants unable to clump in plasma (531). All four mutants isolated had a defect in a gene (clfA) encoding an 896-amino-acid protein that harbors an N-terminal leader peptide and a C-terminal cell wall sorting signal. The protein has an N-terminal nonrepetitive domain that is responsible for binding fibrinogen. The clumping-factor protein does not possess a typical proline-rich wall-spanning segment but instead has a very unusual stretch of 154 amino acids composed almost entirely of alternating serine and aspartate residues. This domain is not involved in ligand binding and has recently been shown to be required for the proper display of the molecule on the surface of the cell (308). Mutants with mutations in clfA are defective for adherence to fibrinogen-coated surfaces including coated catheter material, suggesting that clumping factor is involved in the establishment of infections due to the presence of foreign bodies (226).
It had been proposed that coagulase, a mostly secreted enzyme found in S. aureus, was responsible for the clumping phenotype, since it has been shown to bind fibrinogen in vitro (67). Coagulase mutants, however, were still able to clump in plasma and were unable to cause clotting, whereas clfA mutants were unable to clump but were still able to clot plasma (532). These results indicate that the fibrinogen binding clumping factor is the product of the clfA gene and not of coa. A detailed kinetic study to address the relative contributions of coagulase and clumping factor to the binding of S. aureus to fibrinogen-coated surfaces was carried out with a radial-flow chamber (157). By using this method, it was possible to determine the binding constants of intact cells under a variable set of shear force conditions. Clumping factor, not coagulase, was found to be primarily responsible for both the initial attachment and the resulting adhesion.
Like S. aureus, the coagulase-negative bacterium Staphylococcus epidermidis is a frequent nosocomial pathogen and has been shown to bind to fibrinogen deposited on synthetic material such as catheters (629). The gene encoding a fibrinogen binding protein from S. epidermidis HB has recently been cloned and has been designated fbe (590). Although the two proteins are not identical in primary sequence, the basic structure and organization of the Fbe protein are almost identical to those of the fibrinogen binding protein from S. aureus (see above), including the unusual region of alternating serine and aspartate residues immediately preceding the cell wall sorting signal. The fibrinogen binding site was identified by using a phage display technique, and the domain was mapped to the same region found to be responsible for fibrinogen binding in the S. aureus homologue. Strangely, although this protein does bind fibrinogen and is similar to the clumping factor from S. aureus, S. epidermidis cells do not clump in the presence of plasma (590).
The protein secretory pathway (Sec) releases polypeptides into the surrounding medium of gram-positive bacteria. Some of these secreted products bind to the bacterial surface and play critical roles in the synthesis and turnover of the peptidoglycan exoskeleton as well as in the pathogenesis of animal infections. We describe here what is known about the mechanisms by which secreted proteins are directed to the cell surface and the physiological role these proteins play.
The peptidoglycan of gram-positive organisms is hydrolyzed at specific times and sites during physiological growth of the exoskeleton (261). This is accomplished by murein hydrolases. These enzymes have been studied extensively, and an excellent review summarizes what is known about their activities and physiological roles (725). Murein hydrolases can be grouped according to their enzymatic activities; the various peptidoglycan cleavage sites of these enzymes are diagrammed in Fig. 17 (250). N-Acetylmuramidases (muramidase) and N-acetylglucosaminidases (glucosaminidase) cleave MurNAc(β1-4)GlcNAc and GlcNAc(β1-4)MurNAc, respectively (793). Lytic transglycosylases and lysozymes also attack the glycan chains (349); however, these enzymes have not been found in gram-positive bacteria and are not discussed here. Several different classes of enzymes attack the peptide backbone of the cell wall. N-Acetylmuramoyl-l-alanine amidase (amidase) cleaves the amide bond between the d-lactyl group of MurNAc and the amino group of l-Ala. Lysostaphin and Myxobacter AL1 proteases are glycyl-glycine endopeptidases that cleave the pentaglycine crossbridge of staphylococcal peptidoglycans (189, 709). Another class of enzyme cleaves the attachment site of the peptidoglycan crossbridges: d-Ala–X (250). In staphylococci, the d-Ala is linked to pentaglycine, and this bond is cleaved by the murein hydrolase of phage φ11 (LytA) (576). In other bacterial species, the d-Ala can be linked to l-Ala–l-Ala, d-iAsn, or the side chain amino group of m-diaminopimelic acid, d-ornithine, and l-Lys (710). Enzymatic activities that cleave these peptide bonds of d-Ala have been found; however, the primary structures of these enzymes have hitherto not been identified (250). Carboxypeptidases trim the murein pentapeptide subunits by removing the terminal residue of d-Ala–d-Ala; these membrane-anchored PBPs do not hydrolyze the peptidoglycan and are therefore also not considered here (761).
Sequence alignment of murein hydrolases of gram-positive bacteria revealed that most of these enzymes display a domain structure (404). In general, murein hydrolases harbor an N-terminal signal peptide followed by a second domain containing the enzymatic activity (404). In addition, these proteins harbor repeat structures that flank either the N- or C-terminal side of the enzymatic domain (404). Figure 18 displays the domain structure of murein hydrolases from S. aureus. Although the enzymatic domains are often conserved between enzymes from different species, the repeat domains are not. The functions of these repeat domains have been studied most extensively for Streptococcus pneumoniae LytA amidase, Staphylococcus simulans lysostaphin (Lst), and S. aureus Atl and are discussed in detail below. Ghuysen et al. reported sequence and structural similarity between the noncatalytic regions of muramidases and amidases from Clostridium acetobutylicum and Bacillus spp. and the crystallized structure of Streptomyces albus G Zn dd-peptidase (253). This region is required for the binding of dd-peptidase to insoluble peptidoglycan, suggesting that murein hydrolases from several other gram-positive bacteria retain this domain to bind to their substrate (253). This substrate binding domain is distinct from targeting domains, which direct some murein hydrolases to specific sites on the bacterial cell surface (see below).
Staphylococcus simulans bv. staphylocolyticus secretes lysostaphin (709), which hydrolyzes the peptidoglycan of all the staphylococci that synthesize peptidoglycan with pentaglycine crossbridges (95, 878) (Fig. 19). In this way, lysostaphin functions as a bacteriocin to kill competing microorganisms in mixed bacterial populations. Lysostaphin is synthesized as a preproenzyme and initiated into the secretory pathway by an N-terminal leader peptide. The proenzyme is released into the culture medium and contains 15 tandem repeats of a 13-residue peptide at the N-terminal end (324). Prolysostaphin is 4.5-fold less active than mature lysostaphin, and the N-terminal repeats are removed in a growth phase-dependent manner by a secreted cysteine protease (782). Mature lysostaphin cleaves the peptidoglycan of S. aureus cells much more actively than it cleaves the peptidoglycan of its S. simulans host (314). There appear to be two reasons for this. First, S. simulans cells elaborate an immunity factor that causes the incorporation of serine residues into the cell wall crossbridge (782). Although, on balance, only one or two serine residues are found in the otherwise poly-glycyl crossbridge, this modification renders S. simulans resistant (immune) to this bacteriocin (782). The S. simulans gene required for incorporation of the serine residues, lif (lysostaphin immunity factor), displays striking homology to the femA and femB genes (145, 331, 513). The latter genes are thought to specify enzymes which synthesize the cell wall crossbridge by using lipid II peptidoglycan precursor and charged Gly-tRNA as substrates (43). Consistent with the notion that Lif is the functional equivalent of a FemA/FemB enzyme is the observation that Lif can complement femB mutant staphylococci (798). Furthermore, immediately adjacent to the immunity factor gene is a locus encoding seryl-tRNA, suggesting that Lif may also utilize charged tRNAs as substrates (782).
Recently, a glycyl-glycine endopeptidase of Staphylococcus capitis was described and characterized (764, 765). The molecular structure, targeting mechanism, and immunity of this enzyme appear to be identical to those described for lysostaphin.
Purified lysostaphin binds much more avidly to S. aureus than to S. simulans cells (21, 519). When added to mixed bacterial populations, purified lysostaphin kills 1,000 S. aureus cells for every S. simulans cell (21). This selectivity requires the C-terminal targeting domain of Lst, which is located in the 92 C-terminal amino acids (21). When this domain is deleted from lysostaphin, the mutant enzyme no longer binds to staphylococci and loses the ability to distinguish between S. aureus and S. simulans cells (21). Furthermore, when the targeting domain is linked to the C terminus of secreted enterotoxin B, the fusion protein is exported but directed to the cell wall compartment of S. aureus cells. Similarly, when the domain is fused to the C terminus of reporter proteins such as glutathione S-transferase, the recombinant protein binds to the surface of S. aureus cells but not of S. simulans cells (21). Hence, lysostaphin is specifically directed to the surface of S. aureus by its C-terminal targeting domain (20). This targeting mechanism functions in mixed bacterial populations and therefore must be species specific. In other words, murein hydrolases such as lysostaphin are probably designed such that the targeting domain recognizes receptors found on the surface of specific species of microorganisms. It follows that the targeting domain cannot recognize peptidoglycan alone, because its chemical structure is largely conserved in all gram-positive bacteria (20).
Not all murein hydrolases function to kill microorganisms. Staphylococci divide in a direction perpendicular to the previous cell division plane (258, 259, 261). These coccal cells require hydrolysis of their thick peptidoglycan layer at the midcell to synthesize new cell wall and separate the dividing daughter cells (261). This depends on the Atl enzyme, because atl mutants show characteristic defects in cell separation and grow as large clusters of staphylococci (606, 766). The atl gene specifies a large polypeptide (preproenzyme) with several domains (225, 605). An N-terminal signal peptide is required for export (22). Secreted Pro-Atl is processed at two sites, residues 198 and 775 (22, 605). These cleavage events generate an N-terminal propeptide of unknown function as well as two domains with enzymatic functions, amidase and glucosaminidase (732, 786). Three repeat domains are located at the center of pro-Atl, such that mature amidase retains two C-terminal repeats and mature glucosaminidase retains one N-terminal repeat domain (605). The repeat domains function to direct Atl to the future cell division site of staphylococci (22), a structure named the equatorial surface ring (766, 767, 864) (Fig. 20). Targeting of pro-Atl occurs before it is processed; however, mature amidase and glucosaminidase remain bound at the surface ring structure (22, 864). The significance of the timing of processing and targeting is thus far not understood. Fusion of the repeat domains to reporter proteins directs the hybrid proteins to the equatorial surface rings, demonstrating that the repeat domains of Atl properly address the enzyme to the cell division site (22). This mechanism implies that a specific receptor for Atl might be located at this site to attract the secreted enzyme for localized hydrolysis (22). Although this hypothesis provides a model for cell separation in gram-positive bacteria, several questions are left unanswered.
To define a spatial location on the cell surface, staphylococci presumably synthesize a specific decoration of the peptidoglycan. This decoration must be positioned in a manner that defines both the equatorial surface ring and future cell division sites. The same dilemma exists in the bacterial cytoplasm, where cell division occurs via constriction of an FtsZ ring structure which is assembled at midcell from soluble components and then constricts to fuse the membrane, thereby separating two daughter cells (54, 505). It is conceivable that the definition of the cell division site is provided by the synthesis of a chemical decoration in the rigid exoskeleton, which might provide an anchoring point for cell wall hydrolases and the membrane fusion machinery alike (681, 682).
As mentioned previously, coagulase-negative staphylococci, such as S. epidermidis, are important nosocomial pathogens that adhere to foreign bodies implanted into human tissues (629). Adherence of staphylococci to catheters and other materials is followed by the production of large amounts of biofilm (629). Götz and coworkers screened a collection of Tn917 insertional mutants of S. epidermidis for their inability to produce biofilm and/or adhere to polystyrene surfaces (321). Wild-type staphylococci synthesize a linear carbohydrate polymer composed of β1-6-linked 2-deoxy-2-amino-d-glucopyranosyl, in which almost all moieties are N-acetylated, named polysaccharide intercellular adhesin (PIA) (506, 507). One class of Tn917 mutants is defective in the production of PIA, and the mutation maps to the icaABC operon (323, 508). The ica genes specify the enzymatic machine that synthesizes PIA in both mutant S. epidermidis and S. carnosus, a microorganism otherwise unable to synthesize biofilm (323). A second class of mutants is unable to adhere to polystyrene surfaces, and its Tn917 insertion maps to the autolysin gene of S. epidermidis (atlE) (322). Binding of staphylococci to foreign materials is thought to be aided by human plasma proteins. A search for bound plasma proteins reveals that vitronectin may bind to AtlE (322). The authors observed less vitronectin binding to mature Atl-amidase and Atl-glucosaminidase than to pro-Atl, suggesting that the propeptide may be required for binding (322). The autolysin of S. saprophyticus is thought to be involved in binding to fibronectin proteins (325). This microorganism causes a large number of female urinary tract infections and expresses a surface protein which binds to fibronectin (246). A search for fibronectin binding activity of cloned staphylococcal sequences revealed the aas gene, which is homologous to the atlE and atl genes (325). Mutant S. saprophyticus strains carrying a defective aas gene were unable to bind either fibronectin or erythrocytes, and the fibronectin binding activity of purified Aas protein was mapped to the R1 to R3 repeats (325). Thus, the autolysin of staphylococci appears to be a multifunctional protein, required for cell wall hydrolysis at a designated site as well as for attachment of bacteria to eukaryotic proteins and/or mammalian tissues.
Bacteriophage lambda of E. coli employs the S lysis protein (holin) to disrupt the host cell membrane (11, 63, 245, 664). This mechanism allows phage-encoded intrabacterial murein hydrolases to leave the cytoplasm and degrade the peptidoglycan sacculus, which is located in the periplasmic space of gram-negative bacteria (61, 494). Without murein hydrolysis, phage particles could not be released into the extracellular medium (664). A similar strategy is used by the bacteriophages of gram-positive organisms (679). For example, S. aureus phage φ11 expresses a murein hydrolase (LytA) devoid of an N-terminal signal peptide (834, 835). The LytA protein has a C-terminal targeting signal similar to that of lysostaphin (21), and its N-terminal domains code for both its d-Ala–Gly endopeptidase and amidase activities (576). Adjacent to the lytA gene is an open reading frame specifying a lysis protein with sequence homology to the holins of bacteriophages of gram-negative bacteria (70). A similar genetic arrangement to that of φ11 has been observed for staphylococcal phages 80α and Twort (68, 498). Thus, although this has never been demonstrated directly, it seems likely that phage-encoded murein hydrolases are released from the cytoplasm of gram-positive bacteria only after the holin-induced disruption of the cytoplasmic membrane that occurs during the phage lytic cycle.
Several groups have searched for autolysins, enzymes that degrade the cell wall of gram-positive bacteria at the end of logarithmic growth. Autolysin activity can be measured on agar plates containing heat-killed bacteria (274, 606). Mutant colonies that fail to hydrolyze peptidoglycan can be screened from a population of transposon insertion mutants. Several such mutants were isolated in S. aureus, and mutations in lytA, atl, lytM, and lytRS were found in this screen (98, 605, 660, 835). It now appears that two of these genes encode staphylococcal autolysins. The Atl enzyme, specifying amidase as well as glucosaminidase activity, is described above. The lytM gene encodes a protein harboring an N-terminal signal peptide followed by an enzymatic domain that is homologous to that of lysostaphin and probably functions as a glycyl-glycine endopeptidase (660). LytM does not harbor any of the known targeting signals of either lysostaphin or Atl, and its location on the staphylococcal surface is still unknown. The lytA gene is encoded by bacteriophage φ11, which has lysogenized strain 8325 (834). The lytRS genes display homology to a two-component regulatory system involved in signal transduction (98). The lytRS locus controls the activity of two other genes, lrgA and lrgB, encoding a staphylococcal holin lysis protein (LrgA) similar to those described for phages φ11, 80α, and Twort and a protein with unknown function (LrgB) (99).
Murein hydrolases of gram-positive bacteria appear to fall into two classes: enzymes that are encoded by genes on the bacterial chromosome, exported by an N-terminal signal peptide, and presumably involved in physiological cell wall turnover (class I), and enzymes that are encoded on the chromosome of bacteriophages, devoid of signal peptides, and exported by a general lysis mechanism (class II). Genes of the latter class were found in the screen described above because lysogenic phages are induced at the end of logarithmic growth.
S. pneumoniae synthesizes a major autolysin which hydrolyzes the cell walls of these bacteria in stationary-phase cultures (795). The enzyme is thought to exist in two states, a cytoplasmic inactive species and a membrane-associated active counterpart (91). The inactive species (E) accumulates in cells that are grown in ethanolamine, whereas the more active enzyme (C) accumulates in cultures grown in media containing choline (91). The inactive enzyme has been purified to homogeneity by affinity chromatography on choline-Sepharose (91) or by DEAE ion-exchange chromatography (702). The pure enzyme can be activated by the addition of choline (352), suggesting that choline is a necessary cofactor (265). Presumably, LytA undergoes a conformational change when encountering choline, resulting in enzymatic activity (350–352, 698). Cloning and sequencing of the lytA gene revealed an open reading frame specifying a 31-kDa polypeptide (242). The open reading frame displays sequence homology to the gene sequence encoding enzymatic domains of many other known amidases (677). However, six repeat domains located at the C-terminal end of LytA (242, 699) are not found in the amidase sequences from other bacterial species but are present in a variety of phage-encoded murein hydrolases of S. pneumoniae (499). When expressed in E. coli and purified to homogeneity, these LytA repeats were shown to bind choline and hence have been named choline binding repeats. Thus, LytA binds to choline-containing teichoic acids, resulting in its enzymatic activation and peptidoglycan hydrolysis on the cell surface (350–352). The common repeat structure of pneumococcal cell wall teichoic acid and lipoteichoic acid (31, 207, 211, 379) [Glc(β1-3)AATGal(α1-4)GalNAc(6-Cho-P)(α1-3)GalNAc(6-Cho-P)(β1-1)ribitol-P] suggests that LytA can bind to both structures.
The lytA open reading frame did not reveal an N-terminal signal peptide (242), and either LytA is not exported under physiological conditions or its export may occur by a pathway other than the known Sec machinery (156). When expressed in E. coli, LytA appears to be translocated across the cytoplasmic membrane and was detected by immunogold labeling on the periplasmic side of the cytoplasmic membrane (156). Immunogold labeling of S. pneumoniae revealed LytA staining on the cell surface, indicating that the enzyme is also translocated and surface displayed (156). Future work is needed to further characterize the export mechanism of LytA.
Briles et al. identified PspA (pneumococcal surface protein A) as a major antigen during animal infections by S. pneumoniae (93). Antibodies raised against purified PspA protect mice from challenge with virulent strains (92) and pneumococcal pspA mutants are less virulent in a mouse model system (530). PspA displays size and antigenic variation; however, the immunodominant antibodies are opsonic (phagocytic) and cross-reactive with PspA species of other strains (528, 529, 659, 779). Thus, PspA might serve as a vaccine candidate in protecting animals from pneumococcal infections. Sequencing of the structural gene revealed that PspA is synthesized with an N-terminal signal peptide (875). Because much of the N-terminal part of the mature polypeptide showed a 7-residue periodicity of hydrophilic and hydrophobic residues, it is likely that this domain of PspA assumes a coiled-coil structure. At the C terminus, PspA contains repeat domains homologous to those of LytA (875). Shortening of the C-terminal domain to five repeats abolished surface display and released the mutant protein into the culture supernatant (876). These results are comparable to those obtained for LytA, which showed that binding to the pneumococcal surface required as many as four repeat domains (243, 699). The LytA repeat domains are not only required but also sufficient for binding to choline-containing teichoic acids. This has been demonstrated both by expressing fusion proteins with LytA repeats in pneumococci (676, 746) and by adding purified recombinant proteins to cultures of S. pneumoniae (135).
Bacillus and Clostridium are spore-forming genera. During sporulation, the mother cell undergoes asymmetric cell division to engulf and nurture the spore (501). The cell wall peptidoglycan and associated structures (cortex) of the spore are distinct from those of cells growing under vegetative conditions (646, 836, 852). During germination, the mother and spore peptidoglycan are dissolved and the released spore once again grows as a vegetative cell (101). Sequencing of the B. subtilis genome revealed at least 16 genes with homologies to known murein hydrolases, most of which are amidases (454). To characterize these enzymes, several investigators have used zymography, i.e., the activity measurement of renatured proteins after their separation on SDS-PAGE by overlaying with heat-killed bacteria (748). Murein hydrolase activity can be visualized as a zone of bacteriolysis representing a protein with unique mobility. When this experiment is performed with proteins of B. subtilis cell extracts, at least 10 proteins with bacteriolytic activity can be revealed (224), suggesting that most if not all of the 16 genes specifying murein hydrolases are indeed expressed. Clues to the function of these genes came from the generation of specific knockout mutations. The two most prominent zymographic species, a 90-kDa glucosaminidase (LytD) and a 50-kDa amidase (LytC), may not function as autolysins, because strains carrying knockout mutations in their structural genes showed either no (lytD) or only a slight (lytC) decrease in autolysis of stationary-phase cultures (460, 516, 517). No effects on cell separation, sporulation, or germination of this organism were observed. The lytC gene is located within the lytRABC operon specifying the regulatory transcription factor LytR, lipoprotein LytA, and LytB, the modifier protein for the LytC amidase (473).
LytB binds to LytC, altering its enzymatic activity from a random to a processive amidase removal of peptide substituents along single glycan strands (332, 333). LytB is similar in sequence to SpoIID (458, 473), a factor required for dissolution of the forespore septum with the mother cell (364). Transcription of lytRABC is regulated in part by the flagellar transcription factor sigma D (327), which also controls all lytD expression (459, 473). Because sigD mutants grow as long filaments, it is suspected that this transcription factor may control the expression of murein hydrolases required for cell division (328). There appears to be at least one enzyme other than lytC involved in this, because lytC mutants do not display filamentation (748). With the completion of the B. subtilis genome sequence, it is presumed that all open reading frames specifying murein hydrolases have been identified. Little is known, however, about the targeting of any of these enzymes. The SpoIID protein displays sequence homology to the modifier protein (LytB) of the LytC amidase (458). Whether this homologous domain is involved in directing these proteins to the septum is still unknown. Mother cell lysis depends on the compensatory effect of the LytC (CwlB) and CwlC hydrolases, suggesting that these two enzymes have redundant functions (457). The CwlD hydrolase is thought to be required for spore cortex maturation (19, 645, 719). Germination of spores also requires murein hydrolases, and several enzymes appear to fulfill compensatory functions, for example CwlJ and SleB (369, 558, 559). Whether any of these Bacillus enzymes are directed to their subcellular location by a specific targeting event is also still unknown.
Clostridium difficile, the causative agent of pseudomembranous colitis and antibiotic-associated colitis in humans, releases two toxins, TcdA and TcdB, which are thought to be major virulence factors of this organism (406, 408, 424). Both toxins are targeted into the cytosol of eukaryotic cells, where they function as glucosyltransferases to modify small GTP binding proteins such as Rho, Rac, and Cdc42 by using UDP-Glc as a cosubstrate (10, 815). Glucosylation leads to inactivation of GTPase activity and of the signaling function of these molecules, causing actin rearrangements, fluid loss, and, finally, cell death (407). Injection of glucosylated Rho into eukaryotic cells has the same cytotoxic effect as injection of TcdB, indicating that Rho is the main target of toxin action and that the glucosylated form is dominant over unmodified Rho (407). C. sordellii and C. novyi toxins also glucosylate small GTP binding proteins, although the alpha-toxin of C. novyi employs UDP-GlcNAc as a substrate for modification (647, 722).
Neither TcdA, TcdB, nor any other member of a family of large clostridial toxins is synthesized with an N-terminal signal peptide (815). Although the mechanism by which these toxins are secreted and/or released from the bacteria is unknown, once in the extracellular milieu, toxins display binding affinity for mammalian cells and hence are taken up into the eukaryotic cytosol (705). Clostridial toxins A and B are very large polypeptides (308 and 270 kDa) (29, 706), composed of an N-terminal domain containing glucosyl transferase activity (342, 651) and a C-terminal domain with tandem repeats of a 30-residue peptide (705). The repeat domains are responsible for binding to carbohydrate compounds, and TcdA recognizes Galα1-3Galβ1-4GlcNAc (450). Such carbohydrate structures are displayed on the surface of animal cells, accounting for the hemagglutination of rabbit erythrocytes by TcdA (800). Although similar in sequence, TcdB does not hemagglutinate the same cells, presumably because its repeats do not recognize the same carbohydrates (800). Antibodies directed against the C-terminal repeats of TcdA prevent toxicity in a mouse model system, as does preincubation of cells with recombinant proteins carrying the C-terminal repeat domains but not the N-terminal domain with enzymatic activity (705). Thus, the C-terminal repeat domains function as a targeting device that directs clostridial toxins to certain eukaryotic cells.
Several authors observed that the repeat elements found in clostridial toxins have some sequence similarity to the repeat domains of streptococcal glucan binding proteins as well as the LytA repeats of pneumococcal proteins (816, 817, 860). This suggested that the ability to bind carbohydrate elements may be a common property of these repeat structures. In keeping with this hypothesis, aromatic amino acid residues (W, Y, and F) are found regularly in the repeat structures of proteins of gram-positive bacteria with targeting elements, and they may serve as stacking devices for the interaction with carbohydrate ring structures, as has been observed for sugar binding proteins in the periplasm of gram-negative bacteria (860).
InlB is a member of the internalin family of bacterial invasins and promotes the entry of Listeria into hepatocytes and several other cell lines (161, 163, 368). The broad tropism apparently conferred by InlB might indicate that the InlB receptor is more widespread than E-cadherin, the receptor for InlA. During the infection of mice, Listeria has been observed in the liver, and InlB is thought to be essential for multiplication in this organ (130, 161). InlB is synthesized as a precursor bearing an N-terminal signal peptide (235). The N-terminal domain of mature InlB displays sequence homology to other members of the internalin family whereas the C-terminal 232 amino acids are homologous to the targeting domain of a Listeria amidase (82). When added externally to Listeria, purified InlB binds to the bacterial surface (82). Binding of InlB is dependent on the C-terminal targeting signal, since mutant proteins without this sequence are found in the extracellular medium (82). The addition of purified InlB to other bacteria or to latex beads also promotes uptake of these organisms or particles into hepatocytes, indicating that InlB alone is sufficient to promote this function (83). It is not clear why Listeria chooses to display InlB via a cell wall-targeting mechanism whereas all other internalins are apparently linked (sorted) to the bacterial peptidoglycan. To investigate this further, Cossart and coworkers designed a recombinant InlA protein which was displayed on the listerial surface via the C-terminal targeting signal of InlB (82). This hybrid protein was functional in promoting invasion in a manner indistinguishable from wild-type InlA (162, 164). It is conceivable that Listeria regulates the surface display of its different internalins (162) and that this regulation is, at least in part, carried out by their anchoring and release from the cell surface.
Listeria p60 is required for cell growth and intestinal invasion of this organism (334, 861). p60 homologs have been found in many listerial species (100). The N- and C-terminal domains of p60 are conserved in all Listeria species; however, the central sequences are variable (100). Cytotoxic T cells that recognize p60 epitopes are protective against Listeria infections in a mouse model system, suggesting that this surface protein represents an immunodominant antigen (74, 247). Purified p60 displays murein hydrolase activity, and overexpression of p60 leads to the appearance of significantly shortened Listeria cells. The C-terminal domain of p60 displays homology to the repeat domains of an autolysin of Enterococcus faecium (234), suggesting that p60 may be targeted to the cell surface and required for peptidoglycan hydrolysis, similar to the staphylococcal Atl enzyme.
Pancholi and Fischetti described the purification of several different glycolytic enzymes from S. pyogenes cell extracts (614). SDH (streptococcal surface dehydrogenase, a glyceraldehyde-3-phosphate dehydrogenase) and enolase were isolated from cell wall extracts obtained after enzymatic peptidoglycan hydrolysis and removal of the resulting protoplasts by sedimentation (610). Immunoelectron microscopy, reactivity of affinity-purified antibodies, and enzyme assays with intact streptococci have each indicated that these glycolytic enzymes are found in two locations, the bacterial cytoplasm and the streptococcal surface (610). The surface-exposed and cytoplasmic enzymes appear to be one and the same species, synthesized from a single gene and devoid of an N-terminal signal peptide. Clearly, these glycolytic enzymes are not exported, sorted, or targeted to the envelope of this gram-positive bacterium by any of the established pathways. It will be interesting to see by which mechanism streptococci can display enzymes in two different locations. SDH binds to a variety of eukaryotic proteins (fibronectin, lysozyme, myosin, and actin) (615). Furthermore, purified SDH activates eukaryotic protein tyrosine kinase and protein kinase C by ADP-ribosylation (611). Inhibitors of protein kinase activity prevented S. pyogenes invasion of pharyngeal cells, suggesting that host protein phosphorylation plays a role in the uptake of this pathogen (615). Streptococcal enolase binds to plasmin and plasminogen (610). The precise role of this binding in streptococcal invasion and disease is not yet clear. Others have reported that high-affinity plasminogen binding to the surface of S. pyogenes is caused by certain M-protein-like proteins (see above) (858).
S. mutans is one of the principal etiologic agents of dental caries (497). This pathogen synthesizes both water-soluble α(1-6) glucan and insoluble α(1-3) glucan by hydrolyzing dietary sucrose (311) and polymerizing the resulting monomeric glucose residues (455). Both types of glucan are components of dental plaque, and the water-insoluble species mediates the adherence of S. mutans to smooth dental surfaces (299). Synthesis of glucan is stimulated by the GtfA enzyme, which acts as a sucrose phosphorylase to synthesize Glc-1P (200, 690). The Gtf-I (gtfB and gtfC) (301) and Gtf-S (gtfD) (301) enzymes are responsible for synthesizing water-insoluble and water-soluble glucans, respectively, and streptococcal strains carrying mutations in any of these genes display a defect in the pathogenesis of dental caries in a pathogen-free rat model (865). Sequence analysis of the gtf genes and their encoded polypeptides revealed that each of these enzymes is exported by an N-terminal signal peptide. A large N-terminal enzymatic domain is followed by a C-terminal domain with tandem repeats of 20 to 48 amino acids (26, 199, 263, 653, 859). Successive truncations of the C-terminal repeats yielded enzymes that were still active but had impaired binding activity for α(1-6)-linked glucans, the substrates for their glucosyltransferase activity (199). Thus, the C-terminal repeat domains of Gtf enzymes target these enzymes to their glucan substrates, which are deposited on the dental surface. As mentioned above, the repeat sequences are homologous to those found in large clostridial toxins and pneumococcal choline binding domains (817).
S-layers are two-dimensional crystalline arrays of proteinaceous subunits that cover the outer surface of many different kinds of unicellular organisms (53, 741, 742). S-layers have been identified in a range of species from the Archaea, Bacteria, and Eucarya and diverge greatly in their structures and functions. We make no attempt here to cover the many aspects of the hundreds of S-layers that have been identified and will instead focus on what is known about the mechanisms by which these S-layers are targeted to and assembled on the cell surface of gram-positive bacteria. For a more extensive treatment of the biology of S-layers, the reader is directed to a series of recent reviews (24, 53, 547, 657, 739).
S-layers are formed by the entropy-driven aggregation of monomer subunits which, in most cases, are exported from the cytoplasm via the general secretory pathway by an N-terminal leader peptide. S-layers are not covalently attached to the cell surface and can usually be extracted either as sheets (30) or as individual subunits in the presence of dissociating agents such as 5 M lithium chloride (500) or EDTA (768) or chaotropic denaturants such as guanidine hydrochloride or urea (357, 740, 743). In many cases the monomeric subunits can spontaneously reassemble into a two-dimensional lattice once the dissociating agent is removed. Lattices formed in vitro usually bear the same hallmark geometric crystal pattern of the original cell-bound S-layer and are often capable of reassociation with the bacterial cell surface. The spontaneous nature by which S-layers assemble means that the primary energetic cost to the cell of manufacturing the S-layer stems from synthesizing the enormous number of subunits necessary to cover the entire cell surface. S-layer proteins are often the most abundant species in cells, comprising up to 15% of the total cellular protein (742). This is not surprising considering that approximately 5 × 10^5 subunits are required to cover an average-sized cell (741).
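The order of magnitude of the subunit count quoted above can be checked with a back-of-the-envelope calculation. The sketch below is illustrative only: the cell dimensions (a rod 1 µm wide and 4 µm long), the 100 nm² unit cell, and the four subunits per unit cell are assumed round values chosen for the example and are not taken from the cited studies.

```python
import math

def spherocylinder_area_nm2(diameter_um, length_um):
    """Surface area (nm^2) of a rod with hemispherical caps.

    For total length L and diameter d, the cylinder contributes pi*d*(L - d)
    and the two caps contribute pi*d^2, giving pi*d*L overall.
    """
    area_um2 = math.pi * diameter_um * length_um
    return area_um2 * 1e6  # 1 um^2 = 1e6 nm^2

def subunits_to_cover(area_nm2, unit_cell_area_nm2, subunits_per_unit_cell):
    """Number of subunits needed to tile the given area with a 2-D lattice."""
    return area_nm2 / unit_cell_area_nm2 * subunits_per_unit_cell

# Assumed, illustrative values (not from the source):
area = spherocylinder_area_nm2(diameter_um=1.0, length_um=4.0)
n = subunits_to_cover(area, unit_cell_area_nm2=100.0, subunits_per_unit_cell=4)
print(f"{n:.1e} subunits")
```

Under these assumptions the estimate comes out at roughly half a million subunits per cell. Different lattice symmetries or cell sizes shift the result, but not by orders of magnitude.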
S-layers are usually surface exposed, although in at least one strain of Bacillus anthracis the S-layer can coexist with a surface-exposed capsule and it appears that the expression and display of one does not interfere with that of the other (545). Multiple S-layers can be simultaneously expressed on a single cell, and it is believed that in some cases one S-layer can serve as the primary attachment or crystallization surface for other layers (418, 428, 546, 799). The S-layer in Bacillus stearothermophilus is the envelope component responsible for the attachment of a surface-displayed amylase (175, 177), indicating that these layers can serve as targeting ligands for other proteins.
Only recently has any insight been gained into the manner by which S-layers are attached to the gram-positive cell surface. Recent structural studies have revealed that several S-layer subunits share a region of homology in a domain that has been shown to interact with the cell wall (481, 504, 599). These surface layer homology (SLH) domains have been identified in S-layer subunit protein sequences from several gram-positive species as well as in some other types of surface proteins including the cellulosome of Clostridium thermocellum (477). All SLH domains identified thus far have been found at the very beginning or end of the mature proteins (503). An SLH domain is usually composed of either one or three repeating SLH motifs of approximately 50 to 60 residues (503); a comparison of these SLH motifs reveals that they are fairly divergent, with an average identity of 27% and only one universally conserved residue (504).
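To make the notions of "average identity" and "universally conserved residue" concrete, the sketch below computes both from a set of pre-aligned sequences. It is a minimal illustration, not the method used in the cited comparison: the three short aligned strings are invented toy data rather than real SLH motifs, and ignoring gapped columns when scoring identity is an assumed convention.

```python
from itertools import combinations

def percent_identity(a, b):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)

def average_pairwise_identity(aligned):
    """Mean percent identity over all unordered pairs of aligned sequences."""
    scores = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return sum(scores) / len(scores)

def universally_conserved_columns(aligned):
    """0-based indices of gap-free columns that hold the same residue in every sequence."""
    return [
        i for i, col in enumerate(zip(*aligned))
        if "-" not in col and len(set(col)) == 1
    ]

# Hypothetical toy alignment (not real SLH motif sequences).
toy = ["ATDFKG", "SVDYKA", "A-DWRG"]
avg = average_pairwise_identity(toy)
conserved = universally_conserved_columns(toy)
print(f"average identity: {avg:.1f}%, conserved columns: {conserved}")
```

In the toy alignment only the third column is invariant, showing how a motif family can keep a single conserved position while its average pairwise identity remains low.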
SLH domains have been speculated to interact either directly with peptidoglycan (599) or with other wall-associated polymers. Recent studies of three different S-layer proteins from different strains of B. stearothermophilus have identified two apparently unique secondary wall polymers as the surface ligands responsible for the adherence of the S-layer to the cell surface (176, 670). The SLH receptor on the surface of PV72/p2 was isolated by extracting isolated cell walls with hydrofluoric acid (670). Chemical analysis determined that the receptor molecule was a polymer composed of GlcNAc and ManNAc in a molar ratio of 3.5:1 and with an approximate molecular mass of 20,000 Da (670).
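The reported molar ratio and molecular mass can be turned into a rough estimate of chain composition. The sketch below assumes that the polymer consists solely of glycosidically linked GlcNAc and ManNAc residues, each contributing the anhydro N-acetylhexosamine residue mass of about 203.19 Da (221.21 Da minus one water). Any additional substituents on the polymer would lower these counts, so the numbers are indicative only.

```python
# Mass of one N-acetylhexosamine residue within a glycan chain (free sugar minus water).
HEXNAC_RESIDUE_DA = 203.19

def residue_counts(polymer_mass_da, ratio_glcnac, ratio_mannac,
                   residue_mass_da=HEXNAC_RESIDUE_DA):
    """Split an estimated total residue count according to a molar ratio.

    Assumes (for illustration) that the two sugars account for the entire mass.
    """
    total = polymer_mass_da / residue_mass_da
    parts = ratio_glcnac + ratio_mannac
    return total * ratio_glcnac / parts, total * ratio_mannac / parts

glcnac, mannac = residue_counts(20_000, 3.5, 1.0)
print(f"GlcNAc ~{glcnac:.0f} residues, ManNAc ~{mannac:.0f} residues")
```

Under these assumptions a 20,000-Da chain corresponds to on the order of 100 sugar residues, of which roughly three-quarters are GlcNAc.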
An S-layer subunit from Thermus thermophilus containing a single SLH motif seems to bind directly to the peptidoglycan (599). Although T. thermophilus is not taxonomically classified as a gram-positive bacterium, it is known to have a similar subcellular architecture. Chemical analysis of the envelope from this bacterium has indicated that the peptidoglycan does not contain any associated secondary wall polymers (656). It therefore appears likely that SLH domains have evolved the ability to recognize a wide variety of surface ligands, including, in some cases, the peptidoglycan itself. This property is perhaps reflected in the loose sequence conservation found among the various SLH domains.
It is important to note that S-layer proteins from several gram-positive bacteria do not have SLH domains, demonstrating that different adherence mechanisms have evolved for different S-layers. The PS2 protein of Corynebacterium glutamicum contains a C-terminal hydrophobic domain of approximately 79 residues that can be removed by protease treatment, resulting in an intact S-layer sheet that is not able to adhere to the Corynebacterium cell surface (114, 115). More recently, a secondary wall polymer ligand has been partially characterized for the S-layer proteins from B. stearothermophilus ATCC 12980 and the PV72 variant, PV72/p6 (176), both of which lack SLH domains. The composition of this wall polymer appears to differ significantly from that of the ligand for the PV72/p2 S-layer protein; the polymer appears to be composed of glucose and glucosamine in roughly equal amounts (176). It is almost certain that other adherence mechanisms will become apparent as S-layers from more gram-positive bacteria are studied in the future.
We have summarized here the currently known surface proteins of gram-positive bacteria and the mechanisms of their anchoring to the cell wall envelope. The completion of several microbial genome sequences will soon provide us with a more comprehensive view of the number and nature of these molecules. If this information is combined with biochemical analysis, our understanding of surface protein function will probably expand rapidly. Nevertheless, future work must also further investigate the mechanisms of surface protein anchoring to the cell wall envelope. The cell wall sacculus provides rigid exoskeletal functions for microorganisms and requires proper targeting of surface proteins for its concerted assembly and turnover. A detailed knowledge of these mechanisms appears absolutely necessary for our understanding of other physiological events such as bacterial cell division and separation. We apologize to all those authors whose work was not mentioned here either because of space constraints or because of our oversight.
We are indebted to Marjorie Russel and Peter Model for their interest and relentless encouragement. We thank Gunnar Lindahl, Maria K. Yeung, Pascale Cossart, Agnès Fouet, and Stéphane Mesnage for their comments that much improved the manuscript. We also thank Tadashi Baba, Hung Ton-That, Sarkis Mazmanian, Vincent Lee, Kumaran Ramamurthi, Debbie Anderson, and Kym F. Faull for their support and contributions to this work.
O.S. acknowledges grant support (AI33985 and AI38897) from the Infectious Disease Branch, NIAID, NIH. W.W.N. was supported by a Cellular and Molecular Biology Training Grant (GM 07185-22) from the Public Health Service to UCLA.