Strains of Staphylococcus aureus, an important human pathogen, display up to 20% variability in their genome sequence, and most sequence information is available for human clinical isolates that have not been subjected to genetic analysis of virulence attributes. S. aureus strain Newman, which was also isolated from a human infection, displays robust virulence properties in animal models of disease and has already been extensively analyzed for its molecular traits of staphylococcal pathogenesis. We report here the complete genome sequence of S. aureus Newman, which carries four integrated prophages, as well as two large pathogenicity islands. In agreement with the view that S. aureus Newman prophages contribute important properties to pathogenesis, fewer virulence factors are found outside of the prophages than for the highly virulent strain MW2. The absence of drug resistance genes reflects the general antibiotic-susceptible phenotype of S. aureus Newman. Phylogenetic analyses reveal clonal relationships between the staphylococcal strains Newman, COL, NCTC8325, and USA300 and a greater evolutionary distance to strains MRSA252, MW2, MSSA476, N315, Mu50, JH1, JH9, and RF122. However, polymorphism analysis of two large pathogenicity islands distributed among these strains shows that the two islands were acquired independently from the evolutionary pathway of the chromosomal backbones of staphylococcal genomes. Prophages and pathogenicity islands play central roles in S. aureus virulence and evolution.
Staphylococcus aureus is a human pathogen that causes both nosocomial and community-acquired infections. The emergence of strains resistant to many antibiotics (methicillin-resistant S. aureus [MRSA]) and of highly virulent community-acquired MRSA that can cause fatal infections such as necrotizing pneumonia is of considerable concern even in countries with well-developed health surveillance systems (24, 30). In order to study mechanisms of staphylococcal antibiotic resistance and virulence, whole genome sequences of several different S. aureus strains have been determined. MRSA strains N315 and Mu50 were the first staphylococcal genomes to be sequenced (18), which were followed by nine additional strains (1, 5, 9, 10, 13, 14). All staphylococcal genomes are approximately 2.8 Mbp in size with a relatively low G+C content. Comparative analysis revealed that most regions of the staphylococcal genome are well conserved, whereas several large sequence blocks display high variability. S. aureus strains likely acquired these genomic islands horizontally and, at least initially, their integration into the genome must have required dedicated DNA recombination (integrase) genes. Furthermore, variable blocks of genome sequence frequently carry virulence and antibiotic resistance determinants that aid in the development of staphylococcal diseases. Variable regions can be classified as prophages, pathogenicity islands, or staphylococcal cassette chromosomes. The overall combination of variable sequence elements and the encoded spectrum of virulence properties varies from strain to strain and appears to be reflective of the overall large spectrum of clinical disease manifestations in humans (1, 2).
S. aureus strain Newman was isolated in 1952 from a human infection (6) and has been used extensively in animal models of staphylococcal disease due to its robust virulence phenotypes. Thirty genes that are required for staphylococcal pathogenesis were identified in S. aureus Newman after a screen of 1,736 bursa aurealis mutants with transposon insertions in different genes. Both well-characterized virulence genes and genes with unknown function were shown to be involved in the pathogenesis of staphylococcal infections (4). An additional benefit of systematic insertional mutagenesis is the identification of genes that are dispensable for staphylococcal growth under laboratory conditions. Subsequent work identified four prophages, NM1 to NM4, in the genome of strain Newman. Indeed, six paralogous groups of virulence determinants that were identified via bursa aurealis mutagenesis are encoded by these prophages (3). S. aureus Newman variants that lacked either NM3 alone, the three prophages NM1, NM2, and NM4, or all four prophages (NM1 to NM4) displayed dramatic reductions in their ability to form organ-specific abscesses after intravenous infection of mice, suggesting that the prophages NM1 to NM4 play important roles during the pathogenesis of staphylococcal infections.
To further unravel the molecular mechanisms of physiology and pathogenesis in S. aureus Newman, all of its genes must be known. We report here the complete genome sequence of S. aureus Newman. In contrast to other staphylococcal strains, which carry some virulence genes in mobile pathogenicity islands or genomic islets, the virulence determinants of S. aureus strain Newman are conspicuously concentrated in prophages (2). Strain Newman carries a similar combination of the major pathogenicity islands, νSaα and νSaβ, as S. aureus strains COL, NCTC8325, and USA300. In contrast to hospital-acquired MRSA, S. aureus Newman harbors only a small number of insertion sequences (IS) and lacks known antibiotic resistance determinants.
The whole-genome sequence was determined as described previously (1, 18). Shotgun sequencing was carried out by Hitachi High-Tech Fielding Co. (Tokyo, Japan) and Takara Bio, Inc. (Otsu, Japan). Sequences were also assembled as described previously. We have entered the whole-genome sequence of Newman in the DNA Database of Japan, with accession number AP009351.
Open reading frames and structural RNAs were determined and annotated as described previously (1, 18). Briefly, open reading frames were initially extracted with the Genome GAMBLER program (Xanagen, Kawasaki, Japan), which is based on the GLIMMER and rbsfinder software. The predicted open reading frames were then individually reviewed with GAMBLER. For annotation, we searched a nonredundant protein database with the determined open reading frames by using BLAST software. tRNA and tmRNA genes were identified by tRNAscan-SE (22) and with web-based software (http://www.indiana.edu/~tmrna/), respectively. G+C contents were plotted on a genome map by using Insilico molecular cloning software (Insilico Biology, Yokohama, Japan). This software was also used for comparative genome analysis among strains.
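As a rough illustration of the initial extraction step, the sketch below scans both strands of a sequence for open reading frames. It is not GLIMMER or rbsfinder, which rely on interpolated Markov models and ribosome-binding-site evidence, respectively; it is only a naive start-to-stop scan. The function names, the start codon set, and the `min_codons` threshold are assumptions made for this example.

```python
# Naive ORF scan: illustration only, not the GLIMMER/rbsfinder pipeline.
START_CODONS = {"ATG", "GTG", "TTG"}   # assumed bacterial start codons
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end) tuples, one per ORF.

    Coordinates are 0-based, half-open, and relative to the scanned strand.
    Each ORF runs from the first start codon seen after the previous stop
    to the next in-frame stop codon (stop codon included).
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None and codon in START_CODONS:
                    orf_start = i
                elif orf_start is not None and codon in STOP_CODONS:
                    # Keep the ORF only if it is long enough (stop included).
                    if (i + 3 - orf_start) // 3 >= min_codons:
                        orfs.append((strand, orf_start, i + 3))
                    orf_start = None
    return orfs
```

A real gene finder would additionally score each candidate and adjust start sites; candidates from a scan like this one would still need the manual review described above.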
Sequences of S. aureus strains MW2 and N315 (accession numbers are BA000033 and BA000018, respectively) were used for whole-genome comparative analysis with strain Newman. The genome sequences of strains Mu50 (BA000017) (18), NCTC8325 (CP000253) (10), COL (CP000046) (9), MRSA252 (BX571856) (14), MSSA476 (BX571857) (14), RF122 (AJ938182) (13), USA300 (CP000255) (5), JH1 (CP000736) (A. Copeland et al., unpublished data), and JH9 (CP000703) (Copeland et al., unpublished) were also used for comparison among pathogenicity islands νSaα and νSaβ.
From the genome sequences of 10 different S. aureus strains, the nucleotide sequences of carbamate kinase, shikimate dehydrogenase, glycerol kinase, guanylate kinase, phosphate acetyltransferase, triosephosphate isomerase, and acetyl coenzyme A acetyltransferase were used for multilocus sequence typing (MLST) analysis (7). Nucleotide sequences were combined and then aligned with CLUSTAL X (26). For display of the phylogenic tree, results from the CLUSTAL X calculation were visualized with TreeView (25).
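The combination step can be sketched as follows. This is a minimal example, assuming the seven loci are joined in a fixed order for every strain and that the sequences being compared are already aligned to equal length; it does not reproduce the CLUSTAL X alignment or the tree calculation. The locus abbreviations are the conventional gene names for the seven enzymes listed above, and the function names are hypothetical.

```python
# Conventional gene names for the seven MLST loci, in a fixed (assumed) order.
LOCI = ("arcC", "aroE", "glpF", "gmk", "pta", "tpi", "yqiL")


def concatenate_loci(alleles):
    """Join one strain's locus sequences in the fixed LOCI order.

    `alleles` maps each locus name to that strain's nucleotide sequence.
    """
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise KeyError(f"missing loci: {missing}")
    return "".join(alleles[locus] for locus in LOCI)


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        differing += x != y
    return differing / compared if compared else 0.0
```

Using the same locus order for every strain keeps each alignment column tied to the same gene position, so the pairwise distances reflect allelic differences rather than ordering artifacts.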
The whole-genome sequence of S. aureus Newman was determined as described previously (1, 17). Briefly, shotgun cloning of S. aureus strain Newman genomic DNA allowed for DNA sequencing of random fragments and data assembly into contiguous genome segments (contigs). Sequence gaps between assembled contigs were closed by a series of PCRs, using primers based on the sequences already determined. We encountered assembly difficulties for genome segments surrounding prophages due to the high degree of sequence homology between phages, in particular for NM1 and NM2 (a nearly 40-kb identical sequence). This obstacle was overcome by deriving DNA sequences from NM1 or NM2 phage particles that had been isolated separately after mitomycin treatment of strain Newman (3). Indeed, the sequences of NM1 and NM2 showed clear differences only in their attachment core sequences and in the integrases that recognize these attachment sequences; they were nearly identical elsewhere. The clear differences in the attachment sites and the integrase sequences strongly supported the conclusion that NM1 and NM2 are distinct prophages. This was confirmed by identifying, in the contiguous sequences obtained upon shotgun assembly, the unique chromosomal locus where each phage had inserted, along with the different integrase sequences of NM1 and NM2. In conjunction with sequences for phage integration sites, this eventually permitted assembly of a circular chromosomal DNA sequence. The length of the S. aureus Newman chromosome is 2,878,897 bp, and it encodes 2,614 open reading frames (Table 1). Plasmid sequences were not identified in our experiments involving S. aureus Newman.
We compared the Newman genome with other S. aureus chromosomes thus far sequenced (Table 1). Although the strains are categorized as a single species, S. aureus, the chromosomes from different strains had unique features. The G+C contents did not vary drastically; however, the lengths of chromosomes differed by more than 5% when the longest chromosome, from strain JH9, was compared to the shortest, from RF122, with lengths ranging from 2.74 to 2.91 Mbp. Chromosomes were classified into two groups according to the number of rRNA genes. Importantly, the ribosomal gene numbers did not correlate with genomic island subtypes, as seen in SCCmec and νSa islands (see below). The number of genomic islands also varied widely among strains. At least one prophage was found in each genome, and Newman has as many as four, which is the maximum number thus far identified in a genome. There was also wide variability among strains in the other classes of islands and IS that they possess, indicating that these genetic elements play key roles in conferring chromosomal diversity to S. aureus strains.
Figure 1 displays a circular map of S. aureus Newman chromosomal DNA. Genomic islands including prophages and pathogenicity islands are shown as green lines. The integration of four different prophages is unique to strain Newman; for example, S. aureus MW2 and N315 harbor either two or only one prophage, respectively (Fig. 2). Two IS1181 insertion sequences and ten remnants of IS were found in strain Newman. One of the IS1181 was inserted into the corresponding site of IS1181-6 in strain N315 (17). Major transposons such as Tn554 were not found in the S. aureus Newman chromosome, in contrast to strain N315 with many IS, as well as five copies of Tn554 (Table 1). SCCmec (16) was not found in the chromosome of strain Newman, a finding in agreement with the observation that this strain is susceptible to methicillin and other β-lactam antibiotics (data not shown).
Open reading frames are indicated in the second (forward strand) and third (reverse strand) circles as either red (virulence determinants) or blue (others) bars. The orientation of open reading frames showed a clear bias according to the direction of replication fork movement, which is in agreement with a tendency seen in other strains previously sequenced (1, 18). The G+C skew value distribution was also asymmetrical across the axis connecting the replication origin and termination sites. G+C contents tended to be relatively high not only in the loci where structural RNA genes were concentrated but also where genomic islands were located, a finding in agreement with the general hypothesis that horizontal gene transfer causes acquisition of the genomic islands.
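The two quantities plotted on such a map can be sketched in a few lines. The example below assumes the commonly used definition of G+C skew, (G - C)/(G + C), computed in non-overlapping windows along a linear representation of the chromosome; it is not the Insilico software used here, and the window size and function names are hypothetical.

```python
def gc_content(window):
    """Fraction of G and C bases in an uppercase DNA string."""
    if not window:
        return 0.0
    return (window.count("G") + window.count("C")) / len(window)


def gc_skew(window):
    """G+C skew, (G - C) / (G + C); 0.0 if the window has no G or C."""
    g = window.count("G")
    c = window.count("C")
    return (g - c) / (g + c) if g + c else 0.0


def sliding_windows(seq, size, step):
    """Yield (start, window) for fixed-size windows; a short tail is dropped."""
    for start in range(0, len(seq) - size + 1, step):
        yield start, seq[start:start + size]


def skew_profile(seq, size=1000, step=1000):
    """Return (start, G+C content, G+C skew) for each window along seq."""
    return [(start, gc_content(w), gc_skew(w))
            for start, w in sliding_windows(seq, size, step)]
```

In a profile of this kind, a change in the sign of the skew along the chromosome is what produces the asymmetry across the origin-terminus axis, while windows whose G+C content departs from the genome average flag candidate horizontally acquired regions.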
The term νSa refers to nonphage and non-SCC genomic islands that are exclusively present in S. aureus, often (but not always) encode virulence determinants, are inserted at a specific locus in the chromosome, and are associated with either an intact or a remnant DNA recombinase gene (1). The presence of a DNA recombinase in νSa genomic islands also supports the hypothesis that staphylococcal pathogenicity islands are acquired by horizontal gene transfer. Because the islands are allelic among different strains, designating them by their structures or genetic content may create multiple island names that differ from strain to strain, even though the islands are inserted in identical loci of S. aureus chromosomes. We therefore propose that the term νSa does not designate an island with a specific structure or a particular genetic content in a strain, but rather a locus where the island is inserted in the S. aureus chromosome. Hence, a term indicating a specific pathogenicity island such as “SaPI” should be used as well as “νSa.”
Among all of the S. aureus strains sequenced thus far, two major pathogenicity islands νSaα and νSaβ are present in strain Newman, and these islands appear to be allelic among different strains (2) (see also Fig. 3 and 4). The previously identified νSaγ, which has been shown to be present in all sequenced S. aureus strains, encoding exfoliative toxin and exotoxins (9), was also found in strain Newman. A fourth class of νSa island, νSa4, was located downstream of NM3. This island was inserted at the site corresponding to that in the chromosome of strain N315, where a νSa4 island coding for 20 open reading frames, including three superantigen genes, is present. Unlike the νSa4 in N315, the one found in strain Newman lacks known virulence determinants and encodes only an integrase and three functionally unknown proteins. Therefore, the νSa4 island in strain Newman probably lacks the features of a pathogenicity island and is structurally similar to the νSa4 island in strain MW2. The difference between νSa4 in strain Newman or strain MW2 and that in N315 also indicates that νSa4 shows polymorphism among strains, as do νSaα and νSaβ.
Figure 2 shows a comparison of the S. aureus Newman genome with those of strains MW2 and N315. Depiction of homologous regions as dots or lines revealed that the overall size of the chromosome and the order of its genes are conserved among all three strains. Major homology gaps are caused by the insertion of four prophages into the genome of strain Newman. Similar to the insertion of NM3 in strain Newman, S. aureus MW2 and N315 also carry prophages in the hlb gene (encoding beta-hemolysin); however, sequence homology between these phages is low, and differences are readily detectable in the plots in Fig. 2. SCCmec elements are found in strains N315 and MW2; however, this element is absent in S. aureus Newman. The observed gap size is larger in S. aureus N315 than in strain MW2, since SCCmec in strain N315 carries not only β-lactam resistance but also determinants for resistance to other antibiotics, whereas MW2 possesses only the β-lactam resistance gene mecA (23). Additional sequence gaps between staphylococcal strains are mainly due to differences in pathogenicity islands. The gap at 0.45 Mbp (Fig. 2A) is due to differences in pathogenicity island νSaα between Newman and MW2, whereas the gap at 1.9 Mbp in Fig. 2B reflects the differences in the νSaβ pathogenicity island between strains Newman and N315. These data suggest that S. aureus Newman νSaα is similar to νSaα in N315 but not to νSaα in MW2. Further, S. aureus Newman νSaβ is similar to νSaβ in MW2 but not to νSaβ in N315.
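The principle behind such a dot-plot comparison can be sketched with exact k-mer matches. This is only an illustration under simplifying assumptions (forward strand only, exact matches, a hypothetical word size `k`); it is not the comparative analysis software used for Fig. 2, and the function names are invented for this example.

```python
from collections import defaultdict


def kmer_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def dot_plot_matches(seq_x, seq_y, k=12):
    """Return sorted (x, y) positions where both sequences share a k-mer.

    Plotting these points gives a dot plot: conserved gene order appears
    as a diagonal, and an insertion present in only one sequence appears
    as a break followed by an offset diagonal.
    """
    index = kmer_index(seq_y, k)
    matches = []
    for x in range(len(seq_x) - k + 1):
        for y in index.get(seq_x[x:x + k], ()):
            matches.append((x, y))
    return sorted(matches)
```

For two chromosomes that differ by an inserted element, the offset between the diagonals before and after the break corresponds to the length of the insertion, which is how prophage and SCCmec insertions show up as gaps in plots of this type.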
In S. aureus Newman, prophage NM3 is inserted into hlb, and similar insertions have been observed for hlb-converting phages of S. aureus strains N315, MW2, Mu50, NCTC8325, MSSA476, MRSA252, USA300, JH1, and JH9. The genetic content of hlb-converting phages is, however, variable, especially with regard to virulence determinants. For example, hlb-converting phages of S. aureus N315 and Newman carry genes for staphylococcal complement inhibitor and chemotaxis inhibitory protein (31); the latter is absent in the S. aureus MW2 prophage. In contrast, the MW2 hlb-converting phage carries genes for enterotoxins K2 and Q that are not found in phages of strains N315 and Newman. Despite these differences, the integrase gene of NM3 and those of related phages are virtually identical, suggesting that all hlb-converting phages evolved from a common ancestor. Other prophages of strain Newman, NM1, NM2, and NM4, are absent in S. aureus MW2 and N315 (Fig. 2). However, NM1 is inserted at the same integration site as φ11 in S. aureus NCTC8325 (15), and φ11 and NM1 harbor the same integrase gene. The integrase of NM4 is identical to that of L54a in S. aureus COL and, similarly, L54a and NM4 insert into the same locus (geh). Thus, it seems highly likely that site-specific integration of phages into the staphylococcal genome is associated with different classes of integrase genes.
Table 2 summarizes major virulence-related genes found in strain Newman compared to strains MW2 and N315. Twelve additional virulence-related genes that belong to six paralogous groups identified in previous studies (3, 4) were encoded by the four Newman prophages, in addition to genes in other loci of the chromosome. The functionally unknown phage genes were conspicuously present in Newman, whereas most of them were absent in prophages of strains MW2 and N315, suggesting that the phage genes play an important role in the virulence of strain Newman.
Most of the exoenzymes and adhesins were commonly present in the three strains shown in Table 2, although the lipase gene geh was truncated by the insertion of NM4, and extra genes were present in the spl serine protease cluster in νSaβ in strain Newman. It should also be noted that the fibronectin-binding proteins encoded by the fnbA and fnbB genes were present in strain Newman, but they lack C-terminal cell wall sorting signals (27), as reported previously (11). The three strains in Table 2 share the same sets of hemolysin components and LukDE leukocidin genes. However, unlike S. aureus N315 and MW2, genes for staphylococcal superantigens were not found in the genome of strain Newman, with the single exception of enterotoxin A (sea), which is encoded by NM3. Since other sequenced S. aureus strains normally harbor at least two enterotoxin genes in their chromosomes (data not shown), strain Newman is distinctive in possessing a smaller number of enterotoxin genes. In S. aureus MW2, genes for enterotoxin H (seh) and collagen-binding protein (cna) are located within genome islets (1), and the Panton-Valentine leukocidin component genes, whose product is responsible for fatal outcomes in humans such as necrotizing pneumonia (20), are located in a prophage; however, these virulence genes are not present in S. aureus Newman.
Prophages, staphylococcal cassette chromosomes, and pathogenicity islands (8, 21) are categorized as genomic islands and carry characteristic integrase genes (DNA recombinases) (1). Unlike most other genomic islands, the pathogenicity islands νSaα and νSaβ are present in all S. aureus genomes sequenced thus far; however, νSaα and νSaβ harbor only remnants of their integrase genes. It therefore seems reasonable to assume that νSaα and νSaβ are no longer mobile and that these pathogenicity islands must have played a major role in the evolution of this pathogen (2). For example, even though the genome of Staphylococcus epidermidis is highly homologous to that of S. aureus, the genomic islands νSaα and νSaβ are absent from the genome of this and every other coagulase-negative staphylococcus sequenced (9, 19, 29, 33). These findings are in agreement with a general hypothesis that acquisition of νSaα and νSaβ into a primordial staphylococcal genome may have been associated with subsequent evolution of S. aureus as a major human pathogen.
νSaα and νSaβ display polymorphisms among strains and can be classified into three to four groups based on their structural differences and HsdS subtypes generated in each of the two islands. Tandem arrays of exotoxin and lipoprotein genes are characteristic features of νSaα in Newman (Fig. 3). νSaβ carries genes responsible for lantibiotic biosynthesis in addition to genes for leukocidin components (lukD and lukE) and a serine protease gene (spl) cluster. Compared to other strains, νSaα for Newman belongs to the same type as νSaα in strains N315, Mu50, NCTC8325, COL, JH1, JH9, and USA300 (type I in Fig. 3A), whereas νSaβ belongs to the same type as νSaβ in strains NCTC8325, COL, MW2, MSSA476, and RF122 (type II in Fig. 3B). Therefore, the combination of νSaα and νSaβ in S. aureus Newman (blue and brown ellipses in Fig. 4A and B, respectively) is the same as in strains NCTC8325, COL, and USA300 (see also Table 1). It is noteworthy that different types of νSaα and νSaβ correlate with sequence variation in hsdS, whose product determines the sequence specificity of DNA methylation and restriction via staphylococcal restriction-modification systems (17). Figure 4, however, also shows that the type distribution of the two major pathogenicity islands in the strains does not correlate with phylogenic relationships among strains when a phylogenic tree image is drawn based on allelic distribution of seven essential housekeeping genes used for MLST analysis (7). In addition, the distribution of νSaα types among strains differs from that of νSaβ, suggesting that the two pathogenicity islands have been acquired by the strains independently of each other and differently from housekeeping genes that have presumably evolved in a vertical fashion.
Due to their general distribution in S. aureus strains and absence in other staphylococci, the two major pathogenicity islands are considered to play important roles in virulence for their human hosts. The molecular mechanisms that implement such putative strategies are, however, still unknown, and future work will need to unravel how pathogenicity islands are involved in staphylococcal virulence during host infection.
Following the first sequencing of S. aureus N315 (18), 11 additional S. aureus genomes have been determined and deposited into the databases. Here we add the whole genome sequence of S. aureus Newman to this rapidly growing list. Genome sequencing projects for multiple isolates of a bacterial pathogen are of considerable scientific value because the generated data reveal not only gene content but also conservation and variability between different strains and their associated human or animal diseases. Staphylococcal diversity is mainly due to polymorphisms that occur in genomic islands, which also carry many virulence and antibiotic resistance determinants. Nevertheless, some genes, such as the staphylocoagulase gene, are located outside of genomic islands and are known to be polymorphic (32). One can add to this list certain combinations of virulence genes, for example, seh (enterotoxin H) and cna (collagen-binding protein), which are present only in certain types of S. aureus strains. A hallmark of the S. aureus classification is the ability of these microbes to ferment mannitol and to produce characteristic proteins such as DNase, coagulase, and protein A. S. aureus strains differ from one another in virulence and drug resistance features that are carried in or outside of genomic islands.
Previous work (3, 4) revealed virulence genes or candidate virulence genes within four prophages that have integrated into the genome of S. aureus Newman. Our determination of the whole genome sequence for strain Newman showed that many virulence-related genes are encoded by prophages. One superantigen, staphylococcal enterotoxin A (sea), is located in NM3; however, unlike other staphylococcal strains, additional superantigen genes were not found. Furthermore, S. aureus Newman carries a small pathogenicity island that lacks known virulence genes. We also failed to identify the collagen adhesin gene that is present in strains MW2, MRSA252, and MSSA476. Therefore, it is likely that virulence caused by strain Newman largely relies on prophages, in addition to the contribution by other virulence determinants present in all S. aureus strains, and the nonprophage regions of the strain Newman genome seem to form the basic backbone of pathogenic S. aureus. While en bloc transfer of virulence genes via prophages and pathogenicity islands appears to be important for S. aureus acquisition of virulence properties, stepwise incorporation of additional genes and/or mutations may play an additional role in the evolution of clones with similar, yet discretely different strategies for the pathogenesis of human disease.
As shown in Fig. 4, analysis of two major pathogenicity islands in 12 different S. aureus genome sequences revealed that these strains do not always share the same combinations of νSaα and νSaβ classes. Moreover, the classes do not correlate with the phylogenic relationships based on the allelic distribution of seven housekeeping genes upon MLST analysis (7). This clearly shows that these two pathogenicity islands were horizontally acquired and must have evolved independently of S. aureus genomes, whereas housekeeping genes are considered to evolve in a vertical fashion. Interestingly, sequences of hsdS gene products that determine the site specificity of methylation and restriction in restriction-modification systems vary depending on the type of pathogenicity island that encodes them. The reasons why modification subunits of the R-M system are present in νSaα and νSaβ and have sequence variations remain unknown. One possible explanation is that each sequence variant of a pathogenicity island requires its own distinct restriction-modification site, as determined by HsdS: since protection of self-DNA by the modification system depends on sequence-specific methylation of DNA, sequence diversity in genomic islands should coincide with the methylation site determined by HsdS. DNA methylation of pathogenicity islands may further influence expression of the virulence genes and thereby affect the pathogenesis of infectious diseases caused by this organism. Recent studies have revealed that type I R-M system activity and modification site specificity are related to changes in the surface antigenic protein in Mycoplasma pulmonis, depending on the organism's infection sites (12, 28). This suggests that the R-M system in S. aureus may also play a direct role in virulence.
Some of the genes located within the major pathogenicity islands, νSaα and νSaβ, are presumed to be involved in virulence. However, their molecular contributions to pathogenicity are still unclear. It should also be noted that the presence of a gene does not guarantee its expression. To reveal the mechanisms of virulence further, microarray experiments could be used to examine the expression of these genes.
The overall spectrum and individual combinations of virulence genes, as they are diversely encoded by different genomic islands, appear to be the major factor in determining clinical symptoms after S. aureus infection and may even dictate the severity of diseases caused by this pathogen. Together with an analysis of transposon insertion mutants (4), our work here may provide experimental strategies for better understanding the pathogenicity and physiology of S. aureus.
This study was supported by a Grant-in-Aid for 21st Century COE, a Grant-in-Aid for Scientific Research on Priority Areas (no. 13226114), and a Grant-in-Aid for Scientific Research B (no. 14370097) from the Ministry of Education, Science, Sports, Culture, and Technology of Japan.
Published ahead of print on 19 October 2007.