Staphylococcus aureus is an opportunistic pathogen and the major causative agent of numerous hospital- and community-acquired infections. Staphylococcus epidermidis has emerged as a causative agent of infections often associated with implanted medical devices. We have sequenced the ~2.8-Mb genome of S. aureus COL, an early methicillin-resistant isolate, and the ~2.6-Mb genome of S. epidermidis RP62a, a methicillin-resistant biofilm isolate. Comparative analysis of these and other staphylococcal genomes was used to explore the evolution of virulence and resistance between these two species. The S. aureus and S. epidermidis genomes are syntenic throughout their lengths and share a core set of 1,681 open reading frames. Genome islands in nonsyntenic regions are the primary source of variations in pathogenicity and resistance. Gene transfer between staphylococci and low-GC-content gram-positive bacteria appears to have shaped their virulence and resistance profiles. Integrated plasmids in S. epidermidis carry genes encoding resistance to cadmium and species-specific LPXTG surface proteins. A novel genome island encodes multiple phenol-soluble modulins, a potential S. epidermidis virulence factor. S. epidermidis contains the cap operon, encoding the polyglutamate capsule, a major virulence factor in Bacillus anthracis. Additional phenotypic differences are likely the result of single nucleotide polymorphisms, which are most numerous in cell envelope proteins. Overall differences in pathogenicity can be attributed to genome islands in S. aureus which encode enterotoxins, exotoxins, leukocidins, and leukotoxins not found in S. epidermidis.
The staphylococci are a diverse group of bacteria that cause diseases ranging from minor skin infections to life-threatening bacteremia. In spite of large-scale efforts to control their spread, they persist as a major cause of both hospital- and community-acquired infections worldwide. In the hospital setting alone, they are responsible for upwards of one million serious infections per year (41). The two major opportunistic pathogens of this genus, Staphylococcus aureus and Staphylococcus epidermidis, colonize a sizable portion of the human population. The predominant species, S. epidermidis, is fairly widespread throughout the cutaneous ecosystem, whereas S. aureus is carried primarily on mucosal surfaces. Within this context, staphylococci generally have a benign symbiotic relationship with their host. However, breach of the cutaneous organ system by trauma, inoculation needles, or direct implantation of medical devices enables the staphylococci to gain entry into the host and acquire the role of a pathogen. S. epidermidis is primarily associated with infections of implanted medical devices, such as prosthetic heart valves and joint prostheses (49). On the other hand, S. aureus is a more aggressive pathogen, causing a range of acute and pyogenic infections, including abscesses, bacteremia, central nervous system infections, endocarditis, osteomyelitis, pneumonia, urinary tract infections, chronic lung infections associated with cystic fibrosis, and several syndromes caused by exotoxins and enterotoxins, including food poisoning and scalded skin and toxic shock syndromes (32, 41).
Successive acquisition of resistance to most classes of antimicrobial agents, such as penicillins, macrolides, aminoglycosides, chloramphenicol, and tetracycline, has made treatment and control of staphylococcal infections increasingly difficult. The widespread use of methicillin and other semisynthetic penicillins in the late 1960s led to the emergence of methicillin-resistant S. aureus (MRSA) and S. epidermidis (MRSE), which continue to persist in both the health care and community environments (45). Currently, more than 60% of S. aureus isolates are resistant to methicillin and some strains have developed resistance to more than 20 different antimicrobial agents (40). The remaining effective therapy against most strains of multidrug-resistant staphylococci, including MRSA and MRSE, is the glycopeptide antibiotic vancomycin (51). However, the emergence in 1997 (6) of S. aureus with intermediate levels of resistance to vancomycin (vancomycin-intermediate S. aureus) and the more recent emergence of S. aureus with high levels of resistance to vancomycin (vancomycin-resistant S. aureus) (7) have limited its effectiveness. Finally, the increasing incidence of hypervirulent community-acquired S. aureus (45, 48) has become a major concern to the global health community and reinforced the critical need for new methods of control and treatment.
We have determined the complete genome sequences of S. aureus COL, an early MRSA isolate, and S. epidermidis RP62a, an MRSE biofilm-producing clinical isolate. Comparison of these genomes with other sequenced staphylococcal genomes provides insights into genome features that contribute to increasing pathogenicity in S. aureus and has led to the identification of novel genome islands in S. epidermidis that may contribute to the evolution of this species from a commensal pathogen to a more aggressive pathogen.
S. aureus COL was obtained from Brian Wilkinson (Illinois State University), who has maintained the culture as a frozen stock since 1976. The COL strain was reportedly isolated as a penicillinase-negative strain in the early 1960s from the operating theatre in a hospital in Colindale, England (17, 43). COL was one of the first MRSA isolates to be identified and has been used extensively in biochemical investigations of methicillin and vancomycin resistance (16).
S. epidermidis RP62a (ATCC 35984) is a slime-producing strain isolated during the 1979 to 1980 Memphis, Tennessee, outbreak of intravascular catheter-associated sepsis (9, 10). RP62a is capable of accumulated growth and subsequent biofilm formation, which contribute to its pathogenicity in foreign-body infections (22).
Other strains used for comparative genomic analyses are S. aureus Mu50 (28), N315 (28), and MW2 (3) and S. epidermidis ATCC 12228 (54). Mu50 is a clinical MRSA strain isolated in 1996 from a Japanese patient with infection of a surgical incision site which was resistant to vancomycin therapy (19, 20). N315 is a Japanese clinical MRSA isolate identified in 1982 (28, 37). MW2 is a highly virulent community MRSA strain isolated in 1998 from a 16-month-old girl in North Dakota and initially associated with four pediatric deaths in Minnesota and North Dakota (3, 5). S. epidermidis ATCC 12228 is a non-biofilm-forming reference strain also isolated in the United States (2, 54).
S. aureus strain COL and S. epidermidis strain RP62a were sequenced to closure by the random shotgun method, with cloning, sequencing, and assembly completed as described previously for genomes sequenced at The Institute for Genomic Research (TIGR) (39). One small-insert plasmid library (2.0 to 3.0 kb) and one medium-insert plasmid library (10 to 12 kb) were constructed for each strain by random mechanical shearing and cloning of genomic DNA. In the initial random-sequencing phase, eightfold sequence coverage was achieved from the two libraries (one sequenced to fivefold coverage and the other sequenced to threefold coverage). The sequences from the respective strains were assembled separately with TIGR Assembler or Celera Assembler (www.tigr.org). All sequence and physical gaps were closed by editing the ends of sequence traces, primer walking on plasmid clones, and combinatorial PCR, followed by the sequencing of the PCR product.
An initial set of open reading frames (ORFs) that likely encode proteins was identified with GLIMMER (14), and those shorter than 90 bp as well as some of those with overlaps were eliminated. A region containing the likely origin of replication was identified, and bp 1 was designated adjacent to the dnaA gene, located in this region. All ORFs were searched against a nonredundant protein database as previously described (39). Frameshifts and point mutations were detected and corrected where appropriate. The remaining frameshifts and point mutations are considered authentic, and the corresponding regions were annotated as authentic frameshift or authentic point mutation, respectively. The ORF prediction and gene family identifications were completed by methodology described previously (39). Two sets of hidden Markov models (HMMs) were used to determine ORF membership in families and superfamilies. These included 721 HMMs from Pfam, version 2.0, and 631 HMMs from the TIGR ortholog resource. TMHMM (27) was used to identify membrane-spanning domains in proteins.
For the identification of species-specific and strain-specific genes, all predicted ORFs from the TIGR-sequenced staphylococcal genomes (S. aureus COL and S. epidermidis RP62a) and published staphylococcal genomes (S. aureus N315 and Mu50 [28], S. aureus MW2 [3], and S. epidermidis ATCC 12228 [54]) were searched against an in-house database composed of 195 prokaryotic, 8 eukaryotic, 175 phage, 63 virus, and 46 plasmid genomes with WU-BLASTP (http://BLAST.wustl.edu). Those genes that matched a nonself genomic sequence at a P value of ≤10⁻⁵, an identity of ≥35%, and match lengths of at least 75% of the length of both query and subject sequences were considered nonunique. These comparisons were used to generate match tables (see Supplemental Table 6 [http://www.tigr.org/tdb/staphylococcus; all supplemental tables are at this website]). Single nucleotide polymorphisms (SNPs) were identified by comparing the genome of S. aureus COL to those of S. aureus N315, Mu50, and MW2 and by comparing the genome of S. epidermidis RP62a to that of S. epidermidis ATCC 12228 with MUMmer (15). Because we did not have access to underlying sequence data for published staphylococcal genomes, identification of SNPs was based on the final draft sequence. By mapping the position of the SNP to the annotation in the S. aureus COL and S. epidermidis RP62a genomes, it was possible to determine the location of the SNP (intergenic versus intragenic) and its effect on the deduced polypeptide (synonymous versus nonsynonymous). For each deduced polypeptide, the degree of relatedness across strains was calculated by using a BLAST score ratio. The BLASTP raw score was obtained for the alignment against itself (REF_SCORE) and the most similar protein in the query strains (QUE_SCORE). Scores were normalized by dividing the QUE_SCORE for each query genome by REF_SCORE. Normalized scores were plotted as xy coordinates.
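The uniqueness thresholds and the BLAST score ratio described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the `Hit` record, its field names, and the reading of "match length" as the aligned length compared against each sequence length are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One BLASTP match to a nonself sequence (hypothetical record layout)."""
    p_value: float
    percent_identity: float
    align_len: int    # aligned length in residues (assumed meaning of "match length")
    query_len: int
    subject_len: int


def is_nonunique(hit: Hit) -> bool:
    """Apply the three thresholds stated in the text to a single nonself hit."""
    covers_query = hit.align_len >= 0.75 * hit.query_len
    covers_subject = hit.align_len >= 0.75 * hit.subject_len
    return (
        hit.p_value <= 1e-5
        and hit.percent_identity >= 35.0
        and covers_query
        and covers_subject
    )


def blast_score_ratio(que_score: float, ref_score: float) -> float:
    """Normalize QUE_SCORE by REF_SCORE (the protein's raw score against itself)."""
    if ref_score <= 0:
        raise ValueError("REF_SCORE must be positive")
    return que_score / ref_score
```

Under these assumptions, a ratio near 1 indicates a nearly identical protein in the query strain and a ratio near 0 indicates no close counterpart. Plotting the ratios obtained against two query genomes as the x and y coordinates gives the kind of scatter plot the text describes.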
A comparative database of all staphylococcal ORFs was generated for position effect determination by identifying all matches among the six sequenced genomes by a BLAST-Extend-Repraze (BER) search (P < 0.1; bit score > 50). These BER matches were then run through Position Effect software (TIGR) to determine conservation of gene order. The query and hit genes from each match were defined as anchor points in gene sets composed of adjacent genes, with up to 10 genes upstream and downstream from each anchor gene used in creating the gene sets. An optimal alignment between the ordered gene sets was calculated by using percent similarity from BER and applying a linear gap penalty of 100. Positive-scoring optimal alignments containing gene sets of four or more matching genes were stored in the database.
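The gene-order comparison above can be sketched as follows. This is not the TIGR Position Effect software, whose internals are not described here. The sketch assumes a local (Smith-Waterman style) alignment, assumes that gene pairs without a BER match cannot be aligned to each other, and uses hypothetical names throughout (`gene_set`, `align_gene_sets`, `should_store`); only the flank of 10 genes, the gap penalty of 100, and the "positive score with four or more matches" rule come from the text.

```python
from typing import Dict, List, Tuple

GAP_PENALTY = 100   # linear gap penalty stated in the text
FLANK = 10          # genes taken upstream and downstream of each anchor
MIN_MATCHES = 4     # minimum matching genes for an alignment to be stored


def gene_set(genome: List[str], anchor_index: int, flank: int = FLANK) -> List[str]:
    """Return the anchor gene plus up to `flank` neighbors on each side."""
    start = max(0, anchor_index - flank)
    return genome[start:anchor_index + flank + 1]


def align_gene_sets(
    set_a: List[str],
    set_b: List[str],
    similarity: Dict[Tuple[str, str], float],
    gap: float = GAP_PENALTY,
) -> Tuple[float, int]:
    """Best local alignment of two ordered gene sets.

    `similarity` maps (gene_a, gene_b) to percent similarity for pairs that
    have a match. Returns (score, matched_pairs) for the best-scoring cell.
    """
    rows, cols = len(set_a) + 1, len(set_b) + 1
    # Each cell holds (score, matched_pairs) for the best alignment ending there.
    table = [[(0.0, 0)] * cols for _ in range(rows)]
    best = (0.0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            candidates = [(0.0, 0)]  # option to start a new alignment
            pair = (set_a[i - 1], set_b[j - 1])
            if pair in similarity:
                score, matches = table[i - 1][j - 1]
                candidates.append((score + similarity[pair], matches + 1))
            for score, matches in (table[i - 1][j], table[i][j - 1]):
                candidates.append((score - gap, matches))  # skip one gene
            table[i][j] = max(candidates)
            best = max(best, table[i][j])
    return best


def should_store(score: float, matches: int) -> bool:
    """Keep positive-scoring alignments with four or more matching genes."""
    return score > 0 and matches >= MIN_MATCHES
```

Because percent similarity never exceeds 100, a gap penalty of 100 means that skipping a gene costs at least as much as one perfect match contributes, so in this sketch only blocks with largely preserved gene order reach the storage threshold.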
Nucleotide sequences for S. aureus COL (accession numbers CP000046 for the chromosome sequence and CP000045 for the plasmid sequence) and S. epidermidis (accession numbers CP000029 for the chromosome sequence and CP000028 for the plasmid sequence) have been deposited at GenBank. The genome sequences and the annotation of the TIGR-sequenced strains are available in the TIGR Comprehensive Microbial Resource at www.tigr.org. The S. aureus COL SAXXXX and S. epidermidis RP62a SEXXXX locus numbers are listed as SACOLXXXX and SERPXXXX, respectively, in GenBank.
S. aureus is one of the leading causes of infectious disease in hospital settings and, recently, is an increasing cause of disease in the community (45, 48). Since the emergence of MRSA in the 1970s, S. aureus has continued to acquire additional antimicrobial resistance factors to the point where some isolates are resistant to more than 20 different antimicrobial agents (40). Development of antimicrobial resistance factors along with additional virulence factors and their movement through this species have likely occurred through gene transfer mediated by mobile genome islands, bacteriophage, plasmids, transposons, and insertion sequences (IS). The most recent example of such gene movement is the acquisition of the Enterococcus faecalis Tn1546 vancomycin resistance element by plasmid-mediated transfer into S. aureus (52). S. epidermidis, the less-virulent member of this genus frequently associated with hospital-acquired and biomedical device infections, has also acquired multiple resistance factors through similar processes. It is likely that gene transfer among multiple members of the staphylococcal species is a frequent event, allowing for adaptation to shifting host environments.
General genome features of S. aureus COL and S. epidermidis RP62a, along with those of S. aureus N315 (28), Mu50 (28), and MW2 (3) and S. epidermidis ATCC 12228 (54) are presented in Supplemental Table 1 (http://www.tigr.org/tdb/staphylococcus). The genome sequences of two clinical S. aureus isolates, MSSA476 and MRSA252, were published (21) just prior to submission of the manuscript and were not included in our whole-genome comparisons. Significant aspects of MSSA476 and MRSA252 are, however, included within the following results and discussion. Whole-genome analysis indicated that the genomes of S. aureus and S. epidermidis are syntenic throughout a well-conserved core region (data not shown), with differences the result of genomic elements including genome islands (νSa, νSe, SCCmec, and staphylococcal cassette chromosome [SCC]-like elements), integrated prophage, IS elements, composite transposons, and integrated plasmids (Table 1) which are associated with disease and virulence. These genomic elements make up approximately 7% of the S. aureus COL genome and 9% of the S. epidermidis RP62a genome, percentages that are similar to those for other gram-positive pathogens, such as group A streptococcus (~10%) (4) but lower than that for Enterococcus faecalis (25%) (39).
Seven pathogenicity genomic islands (νSa), in positions conserved across all sequenced genomes, have been identified in S. aureus (Table 1 and Fig. 1). These islands carry approximately one-half of the S. aureus toxins or virulence factors, and allelic variation of these genes, along with presence or absence of individual νSa, contributes to the pathogenic potential of this species (Table 2). For example, island νSa3 is unique to S. aureus MW2 and carries allelic forms of enterotoxin genes sel2 and sec4, which may contribute to its increased virulence. On the other hand, S. aureus MRSA252 (21) has a novel island, SaPI4, that contains homologs of pathogenicity proteins found in previously characterized νSa1 and νSa2 islands (Table 1) but does not carry known virulence genes. Our analysis identified a novel genomic island, νSaγ (νSeγ), that is found in all S. aureus and S. epidermidis genomes (Table 1 and Fig. 2). The S. epidermidis νSeγ allele contains genes for a cluster of four members of the phenol-soluble modulin (PSM) family, a potential virulence factor of S. epidermidis (38, 50). The S. aureus νSaγ allele contains a cluster of two PSM genes and a small secondary cluster of exotoxin genes similar to those in νSaα. Our analysis of S. epidermidis RP62a and ATCC 12228 also identified two integrated plasmids, νSe1 and νSe2 (Table 1 and Fig. 2; Supplemental Table 2), which contain prophage integrase genes in a structure similar to that for S. aureus genome islands. While neither νSe1 nor νSe2 carries virulence factors found in S. aureus, the νSe1 island in RP62a contains genes for cadmium resistance and the νSe2 island in ATCC 12228 encodes a second strain-specific sortase (encoded by srtC) not found in other staphylococci and two strain-specific LPXTG cell surface attachment proteins with likely roles in adhesion to host tissue (Supplemental Table 2).
Five types of integrated prophage were identified, with at least one phage in every genome except that of S. epidermidis ATCC 12228 (Table 1 and Fig. 1). In S. aureus COL, an L54-like phage (30), which we have named COL, was integrated near the 3′ end of the lipase gene (geh). The Sa3 phage, which is integrated into the beta-hemolysin gene (hlb) of S. aureus N315, Mu50, MW2, MRSA252, and MSSA476, was not found in S. aureus COL. A single Bacillus subtilis SPβ-like phage (29) was identified in S. epidermidis RP62a (Table 1; Supplemental Table 3; see Fig. S1 in the supplemental material), where it is inserted in att sites within yeeE. Comparative genome hybridization of multiple S. epidermidis clinical isolates (S. Gill, unpublished data) shows that acquisition of the SPβ-like phage is unique to RP62a and likely a recent event. The SPβ-like phage is a mosaic structure carrying multiple staphylococcal IS elements and genes encoding a staphylococcal nuclease and an RP62a-specific LPXTG surface protein, indicating that multiple recombination events have likely occurred following entry of the phage into RP62a.
Three types of SCCmec islands (types I, II, and IVa) (23, 24) were previously identified among the S. aureus COL, N315, Mu50, and MW2 genomes (Table 1; Supplemental Table 4; see Fig. S2 in the supplemental material). The SCCmec islands are characterized by a set of site-specific recombinase genes (ccrA and ccrB) which promote site-specific integration into an att site within orfX and a mecA gene which encodes resistance to methicillin (1, 23). Our analysis of S. epidermidis RP62a identified a type II SCCmec which is 98% identical at the nucleotide level and identical in gene organization (with the exception of the region from pUB110 flanked by IS431mec) to that of the S. aureus type II SCCmec. Acquisition of additional transposon and IS elements, such as Tn554, by SCCmec types II and III corresponds with the need for S. aureus to survive the increased use of antibiotics in clinical environments. Previous structural analysis of identified SCCmec elements suggests that the ccrAB genes may form an independent mobile SCC element that mediates staphylococcal interspecies transfer of antimicrobial or virulence genes (25, 26). Our analysis of S. epidermidis ATCC 12228 has identified such an SCC element, named SCCpbp4 (also identified by Mongkolrattanothai et al.), which lacks mec but which contains two pairs of ccrA1 and ccrB genes along with multiple IS elements, a restriction-modification system (hsdS and hsdM), and genes encoding penicillin binding protein 4 (pbp4) and resistance to mercury and cadmium (see Fig. S1 in the supplemental material). The presence of two ccrAB pairs and multiple putative att sites in addition to orfX suggests that SCCpbp4 is the result of two independent insertion events. The existence of SCCpbp4 in S. epidermidis and a novel SCCmec-like element (SCCfar) in the genome of MSSA476 (21) suggests that similar SCC elements capable of transferring virulence factors between S. aureus and S. epidermidis may already exist within these species.
The seven types of IS elements identified in S. aureus and S. epidermidis are randomly distributed throughout their genomes (Supplemental Table 1; Fig. 1). A new staphylococcal IS element, ISSep1, was identified in both S. epidermidis genomes. Composite transposons Tn554 and Tn4001 were identified in S. aureus N315 and Mu50 and S. epidermidis RP62a, respectively, but not in S. epidermidis ATCC 12228 or S. aureus COL.
Multiple copies of the GC-rich STAR (S. aureus repeat element) signature sequence (11) were found dispersed throughout intergenic regions of the S. aureus and S. epidermidis genomes (Fig. 1). STAR elements are more abundant in S. aureus, but in neither species are they associated with regions of atypical genome composition or with predicted mobile genes. A single copy of the extragenic CRISPR (clustered regularly interspaced palindromic repeats) DNA repeat element (20, 21) was identified near the dnaA gene at the replication origin of the S. epidermidis RP62a genome (Supplemental Table 5; Fig. 1). CRISPRs were not identified in S. epidermidis ATCC 12228 or in the S. aureus genomes.
A comparison of the six staphylococcal genomes against each other revealed (i) a total of 454 species-specific genes that are common to the S. aureus COL, N315, Mu50, and MW2 genomes but not found in S. epidermidis RP62a or ATCC 12228, (ii) a total of 286 species-specific genes that are common to the S. epidermidis RP62a and ATCC 12228 genomes but not found in the S. aureus genomes, (iii) 332 strain-specific genes that are found in S. epidermidis ATCC 12228 but not in S. epidermidis RP62a, and (iv) 346 strain-specific genes that are found in S. epidermidis RP62a but not in S. epidermidis ATCC 12228 (Supplemental Table 6). A core set of 1,681 genes common among all strains and both species was also identified (Supplemental Table 6). The majority of the unique genes can be accounted for by the presence or absence of prophage and genomic islands. For example, the 127-kb SPβ-like prophage in S. epidermidis RP62a (Table 1; Fig. 1; see Fig. S1 in the supplemental material) represents approximately 5% of the RP62a genome. Similarly, νSa1 in S. aureus COL represents 0.5% of the genome and carries the staphylococcal enterotoxin B gene (seb), a major virulence factor.
Comparative analysis of the S. aureus isolates suggested variations in the evolutionary history of the pathogenicity islands, some of which appear to have been created as a result of integration and subsequent mobilization of resident prophage into other members of this species (31, 42). Movement of these islands, such as mobilization of νSa1 (SaP1) by phage 80α (31, 42), into multiple S. aureus isolates may enable them to evolve and grow through the acquisition of additional virulence genes. For example, of the seven identified pathogenicity islands (Table 1), νSa1 and νSa2 share conservation in gene order in strains COL and MW2, respectively. In S. aureus COL, however, SEB (seb) is encoded in the same position as the toxic shock syndrome toxin (encoded by tsst) in MW2, suggesting either gene displacement or independent acquisition events through phages. In νSa4, COL and MW2 share an att site into which four ORFs, including a phage integrase gene and remnants of phage, have inserted. The same att site in N315 and Mu50 is occupied by a νSa4 which contains additional genes, including the enterotoxin K (sel), enterotoxin C (sec3), and tsst genes. The similarity of the phage integrases suggests that the same phage integrated in all strains but that the sek, entC, and tsst genes have been lost from COL and MW2.
The role of genome islands as vectors of virulence determinants in the staphylococci contrasts with what is observed in the other low-GC-content gram-positive pathogens for which we have complete genomes available. For example, the Listeria monocytogenes genomes demonstrate a high degree of synteny, with variations in the genomes due to extensive SNPs (35). This suggests that the adaptation to infectious diseases in this species relies on small but specific genomic differences. In Bacillus anthracis, the majority of differences across strains are also in SNPs (but to a smaller degree than in Listeria) and in the presence or absence of the anthrax toxin genes carried on plasmid pX02. By comparison, S. aureus seems to demonstrate variable capabilities of virulence, depending on a combination of both genome islands in the form of phage and pathogenicity islands, as well as the presence of SNPs (see below). These differences may be the most significant factor contributing to the successive acquisition of resistance, as well as virulence factors.
Acquisition of virulence factors also appears to occur as a result of plasmid-mediated gene transfer between staphylococci and other low-GC-content gram-positive pathogens. For example, our analysis of the S. epidermidis RP62a and ATCC 12228 genomes revealed the presence of a cap operon (capABC) and gamma-glutamyl transpeptidase gene (Fig. 3) similar to that found on the B. anthracis pX02 plasmid, where it encodes the polyglutamate capsule, which is essential for B. anthracis virulence. Experimental verification of a functional polyglutamate capsule in S. epidermidis remains to be done, but polyglutamate may play a role in the formation of S. epidermidis biofilms. Phylogenetic analysis of cap genes in the operon (Fig. 3) indicates that the acquisition of this locus may have been the result of a plasmid-mediated transfer event from an ancestor of the bacilli to S. epidermidis. However, note that a number of species-specific metabolic functions, such as acetoin dehydrogenase and polyphosphate synthesis, that are encoded by complete operons in S. epidermidis could also be the result of gene loss by a common ancestor.
A total of 22,888, 22,160, and 19,599 SNPs were found in the genomes of S. aureus Mu50, N315, and MW2, respectively, compared to that of strain COL. Of these SNPs, 6,447, 6,088, and 5,390 resulted in a nonsynonymous (NS) change in amino acid sequence in strains Mu50, N315, and MW2, respectively (Table 3 and Fig. 1). A total of 10,297 SNPs were found in the genome of S. epidermidis ATCC 12228 compared to that of strain RP62a. Of these SNPs, 2,579 resulted in an NS change in amino acid sequence. In S. aureus, SNPs are clustered in genome islands and, when they are grouped by function, more SNPs fall in genes with cell envelope functions than in genes of any other functional category (~20% of total SNPs for Mu50, N315, and MW2) (Fig. 1; see Fig. S3 in the supplemental material). In S. epidermidis, although the majority of SNPs are found in genes for hypothetical proteins, a significant number (12.5% of the total SNPs for ATCC 12228) are in genes encoding proteins with cell envelope functions (Fig. 1; see Fig. S4 in the supplemental material). Variations in cell envelope or surface proteins, such as the LPXTG/NPQTN proteins and fibronectin binding proteins (encoded by fnbA and -B), likely reflect their immunogenicity and the high level of protective antibodies against these proteins which are present in human sera (18). Changes in amino acid sequence within highly immunogenic domains of these proteins may enable the bacteria to evade attack by the immune system.
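The synonymous versus nonsynonymous distinction behind these counts can be sketched in Python as follows. This is an illustration rather than the procedure used in the study: the function names are hypothetical, and the sketch assumes a single-base substitution inside an in-frame codon read on the coding strand. The codon-to-amino-acid assignments are those of the standard genetic code, which the bacterial code (NCBI translation table 11) shares apart from its alternative start codons. The percentages are derived here from the counts quoted above and are not stated in the text.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_snp(ref_codon: str, position: int, alt_base: str) -> str:
    """Classify a substitution at codon position 0, 1, or 2."""
    alt_codon = ref_codon[:position] + alt_base + ref_codon[position + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "nonsynonymous"


# (total SNPs, nonsynonymous SNPs) as quoted in the text.
SNP_COUNTS = {
    "Mu50": (22888, 6447),
    "N315": (22160, 6088),
    "MW2": (19599, 5390),
    "ATCC 12228": (10297, 2579),
}


def ns_percent(total: int, nonsynonymous: int) -> float:
    """Percentage of SNPs that change the amino acid, to one decimal place."""
    return round(100 * nonsynonymous / total, 1)


for strain, (total, ns) in SNP_COUNTS.items():
    print(f"{strain}: {ns_percent(total, ns)}% nonsynonymous")
```

From the quoted counts, roughly 27 to 28% of SNPs are nonsynonymous in the three S. aureus comparisons and about 25% in the S. epidermidis comparison.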
Although much is known about staphylococcal virulence, very little is known about the metabolism of staphylococci. Previous studies on the metabolism and physiology of these organisms have been limited, but the complete genome sequence has allowed for an increased understanding of the basic biology of these species. In addition to the previously identified pathways for the synthesis of various amino acids, we have identified pathways (Fig. 4) for the synthesis of the amino acids leucine, valine, aspartate, isoleucine, glycine, and methionine. Pathways that would enable growth on a range of simple and complex sugars via the glycolytic pathway, the phosphate pathway, and the tricarboxylic acid cycle were also identified. The mevalonate pathway, required for the synthesis of isopentenyl-PP, essential for cell wall biosynthesis, as well as menaquinones and ubiquinones, needed for electron transport, were also identified.
S. aureus is primarily an inhabitant of mucous membranes, and S. epidermidis is primarily an inhabitant of the skin surface; in both environments the organisms are likely to encounter osmotic stress. With respect to transport, S. aureus and S. epidermidis possess seven and eight predicted sodium ion/proton exchangers, respectively. Both organisms are well adapted for osmotic stress, with six transport systems for proline, glycine betaine, or other probable osmoprotectants (Fig. 4). Other transporters related to osmoregulation include the MscL and MscS mechanosensitive ion channels and two Trk potassium ion channels.
Probably the major difference between S. aureus and S. epidermidis in terms of transport is the absence from S. epidermidis of three PTS sugar transporters, for mannitol, sorbitol, and pentitols, and an ABC family maltose transporter. Both species have a variety of transporters for inorganic cations and anions (Fig. 4), but iron acquisition appears to be a particular priority. Six complete or partial ferric iron ABC uptake systems, four additional orphan ferric iron binding proteins, and two ferrous iron FeoB uptake systems were identified in S. aureus. In comparison, there are three complete or partial ferric iron ABC uptake systems, two additional orphan ferric iron binding proteins, and two ferrous iron FeoB uptake systems in S. epidermidis. Both staphylococcal species include a large number of predicted drug efflux systems, including the previously described NorA fluoroquinolone transporter and a probable ortholog of the Lactococcus lactis LmrP multidrug efflux protein.
Genome-wide analysis of the six staphylococcal genomes revealed that approximately 11 and 7% of the total ORFs are predicted to encode cell surface proteins (Table 4) and secreted virulence factors (Table 2), respectively. Many surface proteins with essential roles in host colonization, biofilm formation, and evasion of host defense mechanisms have a common C-terminal LPXTG/NPQTN cell wall attachment motif and companion sortase processing enzymes (encoded by srtA and -B), which are conserved in all gram-positive bacterial pathogens (33, 44, 47). Our analysis of the LPXTG/NPQTN surface proteins revealed that only the accumulation-associated protein (encoded by aap) and most members of the Sdr gene family (sdrCDEFG) are functional homologs in both species. Those unique to each species likely reflect key differences in host tissue specificity and multifactorial adherence mechanisms used by S. aureus and S. epidermidis. S. epidermidis ATCC 12228 encodes a novel third sortase (encoded by srtC) not found in other staphylococci and most closely related to sortases of L. lactis and Streptococcus suis. Many of the secreted proteins (Table 2) have roles in multiple mechanisms for invasion of host tissue and evasion of host defense systems. The relative abundance of virulence factors in S. aureus compared to S. epidermidis reflects the propensity of S. aureus to cause fulminant and sometimes life-threatening infections, as opposed to the more subacute or chronic infections caused by S. epidermidis. For example, members of the enterotoxin and exotoxin (53) families (Tables 1 and 2) which function as superantigens and inducers of a proinflammatory cytokine response are unique to S. aureus and have not been identified in characterized isolates of S. epidermidis.
The most likely candidate for a bona fide virulence factor in S. epidermidis is the family of small cytokine-stimulating peptides (22 to 44 amino acids in length) previously identified as PSM (38) (Table 2). Members of the PSM family are present in other staphylococci, including S. aureus, but our analysis has revealed that they are more numerous in S. epidermidis, where they appear to have expanded as a result of gene duplication within the νSeγ genome island (Fig. 2).
Expression of staphylococcal virulence factors and cell surface adhesion proteins is regulated by two previously identified regulatory loci, the accessory gene regulator locus (agrABCD) (36) and the staphylococcal accessory regulator family (sarA, etc.) (8), which respond to environmental or host stimuli through a quorum-sensing mechanism to coordinate adherence, tissue breakdown, and further invasion. Our analysis has identified 15 additional two-component regulatory systems (Supplemental Table 7) that are similar to agr and conserved in both S. aureus and S. epidermidis, which is surprising considering the differences in adhesins and virulence factors expressed by these species. Identification of possible functional homologs of the agr locus in the genomes of Clostridium acetobutylicum, Enterococcus faecium, Enterococcus faecalis, Lactobacillus plantarum, and Listeria monocytogenes suggests a conservation of regulatory and quorum-sensing mechanisms among the low-GC-content gram-positive pathogens.
Our comparison of the S. epidermidis genomes revealed that a key difference between the biofilm-nonproducing ATCC 12228 type strain and the biofilm-producing RP62a is the presence of the intercellular adhesion locus (icaABCD) and the cell wall-associated biofilm protein (Bap) or Bap homologous protein (Bhp). The ica locus, which encodes the polysaccharide intercellular adhesin protein with a key role in biofilm formation and bacterial accumulation on host surfaces, is present in biofilm-associated isolates, such as S. epidermidis RP62a, but is frequently absent in commensal isolates, such as ATCC 12228 (54). Bap was previously identified in bovine mastitis S. aureus isolates, where it has key roles in adherence to polystyrene surfaces, intercellular adhesion, and biofilm formation (12, 13, 49), but has not been found in human clinical S. aureus isolates. However, Bhp was identified in S. epidermidis RP62a, where it may have a function similar to that of the Bap homolog. Homologs of Bap and Bhp were identified in other bacteria, including the enterococcal surface protein, or Esp, in Enterococcus faecalis, where it plays a similar essential role in biofilm formation (46). The Esp, Bap, and Bhp surface proteins may play functional roles in biofilm formation among mixed populations of these bacteria, as documented by the recent transfer of the vancomycin conjugative transposon, Tn1546, from clinical isolates of Enterococcus faecalis to S. aureus (52).
The most significant observation from our study was evidence for gene transfer between the staphylococci and bacilli. The cap operon, encoding the polyglutamate capsule, a major virulence factor in B. anthracis, has integrated in the genomes of both S. epidermidis RP62a and ATCC 12228, likely as a result of plasmid-mediated gene transfer between the two genera. Evidence of active gene transfer and movement of mobile elements between the staphylococci and other low-GC-content gram-positive bacteria is suggestive not only of continued evolution of virulence and resistance in S. aureus but also the transition of S. epidermidis from a commensal pathogen to a more aggressive opportunistic pathogen through the acquisition of additional virulence factors. Evidence of S. epidermidis strains producing enterotoxin C (49) indicates that this gene movement has already occurred and leads us to propose the genome sequencing of additional S. epidermidis clinical isolates to examine the role of gene transfer in the evolution of staphylococcal virulence.
This work was supported in part by NIH grants U-01AI45667 and R01-AI43567 (S. R. Gill).
We thank Michael Heaney, Susan Lo, Michael Holmes, Vadim Sapiro, Robert Strausberg, Owen White, William C. Nelson, Jeremy D. Peterson, and Tanja Davidson at TIGR for support with various aspects of this project and also Brian Wilkinson (Illinois State University) for providing the S. aureus COL isolate used for sequencing.
†Supplemental material for this article may be found at http://jb.asm.org/.