Overall comparison of coa genes among 10 serotypes.
We aimed to compare the coa genes of all 10 staphylocoagulase serotypes that have been identified using specific neutralizing antibodies. When we started this study, the nucleotide sequences of four types of coa genes had already been reported. The nucleotide sequences of the type I coa of S. aureus BB, the type II coa of S. aureus 213, and the type III coa of S. aureus 8325-4 had been reported individually (21), and the nucleotide sequences of the type II, III, and VII coa genes were available from the whole genome sequences of S. aureus N315, S. aureus NCTC 8325, and S. aureus ). We therefore determined the nucleotide sequences of coa in six serotypes (types IV, V, VI, VIII, IX, and X) using the corresponding staphylocoagulase reference strains. In addition to the previously reported nucleotide sequence of the type I coa of S. aureus strain BB, we also determined the nucleotide sequence of coa in the type I reference strain S. aureus 104 to investigate the phylogenetic relations between coa and other chromosomal regions.
Table shows the characteristics of coa in the 10 serotypes. The genes varied considerably in length, from 1,902 bp (type VII) to 2,280 bp (type VIII), with an average of 2,034 bp. The number of 81-bp tandem repeats located at the 3′ terminus also varied, from five (NCTC 8325, MW2, and 19) to nine (Ku). Differences in the nucleotide sequences and in the number of the 81-bp tandem repeats have been reported and used as epidemiological markers (42). When we assigned alleles defined by nucleotide differences in the 81-bp repeat units that constitute the tandem repeat, 36 of 63 were new alleles (V3 to G5 in Table ). The average GC content was 35.7%, which was slightly higher than that of the whole genome sequence of S. aureus N315 (29%) (28). Table shows the overall identities of the nucleotide sequences and deduced amino acid sequences of the 10 coa genes. The nucleotide identity was 77.1% on average, and the amino acid identities and similarities were 75.3% and 79.8%, respectively. The highest amino acid identity was 88.4% (type IV versus type IX), and the lowest was 69.5% (type V versus type VII).
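The summary statistics reported above (GC content, pairwise identity, and the mean over all serotype pairs) can be illustrated with a minimal sketch. The function names and the toy sequences are hypothetical, and the identity convention used here (identical positions divided by columns with no gap in either sequence) is an assumption, since the convention used to obtain the reported values is not stated.

```python
from itertools import combinations


def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) of a nucleotide sequence."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return 100.0 * sum(base in "GC" for base in counted) / len(counted)


def percent_identity(a: str, b: str, gap: str = "-") -> float:
    """Percent identical positions between two aligned sequences.

    Assumption: columns with a gap in either sequence are skipped.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        matches += x == y
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def mean_pairwise_identity(aligned: dict[str, str]) -> float:
    """Average percent identity over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy alignment, not real coa sequences.
    toy = {"typeA": "ATGC-A", "typeB": "ATGA-A", "typeC": "ATTA-A"}
    print(gc_content("ATGC"))
    print(mean_pairwise_identity(toy))
```

With 10 serotypes, the mean is taken over 45 unordered pairs; a different treatment of gapped columns would shift the values slightly.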
Comparison of 10 staphylocoagulase genes
Nucleotide and amino acid identities among all serotypes
Nucleotide sequences of the same serotype were highly conserved. The identities among the three type II coa genes of strains 213, N315, and Mu50 were 99.9 to 100%. The two type III coa genes of 8325 and COL were identical. The identities between the two type IV coa genes of Stp-28 and MRSA252-2 and between the two type VII coa genes of MW2 and MSSA476 were 97.2% and 99.9%, respectively, although the number of tandem repeats differed between the type IV genes: Stp-28 possessed six repeats, whereas MRSA252-2 possessed four.
It has been reported that staphylocoagulase can bind fibrinogen in addition to prothrombin (6). Cheung et al. purified a fibrinogen-binding protein using a fibrinogen column and designated it FbpA (9). However, after determining the nucleotide sequence of the gene (fbpA), they found that its structure was very similar to that of coa and reported that it should belong to the coa family. When we compared the deduced amino acid sequence of FbpA with those of the 10 staphylocoagulase serotypes, we found that FbpA is nearly identical to type X staphylocoagulase, with an identity of 98.0%. The nucleotide identity between fbpA and type X coa
Structural comparison of staphylocoagulases among 10 serotypes.
Figure shows the multiple alignment of the deduced amino acid sequences of the staphylocoagulases of all 10 serotypes. We found that all deduced amino acid sequences were composed of six regions, as previously reported by Fridrich et al. (12). The six regions are as follows: (i) a 26-amino-acid signal sequence, (ii) the N-terminal D1 region, which contains the prothrombin-activating site and the prothrombin-binding sites, (iii) the D2 region, which is required for high-affinity binding of prothrombin through pro-exosite I, (iv) a highly conserved central region, (v) a 27-amino-acid repeat region, and (vi) a 5-amino-acid C-terminal sequence.
FIG. 1. Deduced amino acid sequences of 10 serotypes of staphylocoagulase compared using the CLUSTAL X program. The corresponding nucleotide sequences were determined using DNA fragments of the following strains: 104 (type I coa), N315 (type II), NCTC 8325 (type ...
The N-terminal 26 amino acids of the primary translation products, which constitute the signal sequence, were identical among all serotypes. The first seven amino acids of the secreted mature forms of staphylocoagulase, including the two amino acids required for prothrombin activation, were also identical among all serotypes. Thereafter, however, the amino acid sequences diverged for up to about 300 amino acids. The mean amino acid identity and similarity of the D1 regions, excluding the first seven amino acids, among all serotypes were 52.8% and 62.9%, respectively. Those of the D2 regions were 60.2% and 69.3%, respectively. Although the amino acid identities in the D1 and D2 regions were rather low, the amino acid residues that might interact with the surface of prothrombin were highly conserved. Interestingly, we found three pairs of serotypes whose amino acid identities in the D2 region were rather high (nearly 90%): type I and type X (87.1%), type II and type III (88.2%), and type IV and type IX (88.9%).
The central regions, located between the D2 region and the tandem repeats, were more conserved, with average amino acid identity and similarity of 86.7% and 88.0%, respectively. Three strains had insertions that were not carried by the other types: 18 amino acids (GTQGKIVGRSKYPTMEQH) in the type V staphylocoagulase of strain no. 55, 18 amino acids (TIQGVTAEGPKYPTMEQH) in the type VIII staphylocoagulase of strain Ku, and 11 amino acids (SVTLPSITGES) in the type X staphylocoagulase of strain 19.
Tandem repeats composed of 27 amino acids were located at the C terminus in all staphylocoagulases we investigated. The number of tandem repeat units varied, but the 27-amino-acid sequences were relatively conserved. The average amino acid identity among the 10 serotypes was 92.9%. The C-terminal 5-amino-acid sequences were identical among all serotypes.
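Because each 27-amino-acid repeat corresponds to one 81-bp unit, a repeat region can be split into units and each distinct unit sequence treated as an allele. The following is a minimal sketch under stated assumptions: the repeat region has already been excised and its length is an exact multiple of the unit length, and the running integer labels are hypothetical stand-ins, not the published allele names.

```python
REPEAT_UNIT_BP = 81  # one unit encodes 27 amino acids (27 codons x 3 bp)


def split_repeat_units(region: str, unit_len: int = REPEAT_UNIT_BP) -> list[str]:
    """Split an excised tandem-repeat region into consecutive fixed-length units."""
    if len(region) % unit_len != 0:
        raise ValueError("region length is not a multiple of the unit length")
    return [region[i:i + unit_len] for i in range(0, len(region), unit_len)]


def assign_alleles(units_by_strain: dict[str, list[str]]):
    """Label each distinct unit sequence with an integer in order of first appearance.

    Returns (profiles, labels): profiles maps strain -> list of allele labels,
    labels maps unit sequence -> label. The labels are hypothetical.
    """
    labels: dict[str, int] = {}
    profiles: dict[str, list[int]] = {}
    for strain, units in units_by_strain.items():
        profiles[strain] = [labels.setdefault(u.upper(), len(labels) + 1) for u in units]
    return profiles, labels


if __name__ == "__main__":
    # Toy example with 3-bp "units" for readability, not real repeat sequences.
    toy = {
        "strain1": split_repeat_units("AAACCCAAA", unit_len=3),
        "strain2": split_repeat_units("CCCGGG", unit_len=3),
    }
    profiles, labels = assign_alleles(toy)
    print(profiles)
    print(len(labels), "distinct alleles")
```

The length of each profile gives the number of repeats in that strain, and the number of entries in `labels` gives the number of distinct alleles across all strains.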
Structural relation to vWbp.
S. aureus produces another blood coagulation-causing protein, von Willebrand factor-binding protein (vWbp). Its overall structure is very similar to that of staphylocoagulase, and it is classified into the same zymogen activator and adhesion protein family as staphylocoagulase (3). The deduced amino acid sequences of two regions of vWbp, called D1 and D2, showed homology to the D1 and D2 regions of type I staphylocoagulase, with identities of 30.7% and 28.3%, respectively. Interestingly, vWbp is allelic, too (4). Among the seven S. aureus strains whose genome sequences have been published (2), the gene (vwb) is not always intact. It was truncated in three strains, NCTC 8325 (staphylocoagulase type III), MW2 (staphylocoagulase type VII), and MRSA252-2 (staphylocoagulase type IV). When we compared the vwb genes in the four other strains, two alleles were found. The first allele was found in two strains, COL (staphylocoagulase type III) and MSSA476 (staphylocoagulase type VII), and the second was found in N315 (staphylocoagulase type II) and Mu50 (staphylocoagulase type II). The vwb gene in strain Newman (3) and the two truncated vwb genes in NCTC 8325 and MW2 belonged to the first allele. The amino acid identity and similarity of vWbp between the two alleles, represented by S. aureus N315 and S. aureus COL, were 70.9% and 76.3%, respectively. The conservation between them is rather low in D1, D2, and the central region, with identities of 56.9%, 53.9%, and 59.7% and similarities of 66.2%, 58.6%, and 70.1%, respectively, whereas it is very high in the vWb binding site and the C-terminal region, with identities of 96.2% and 96.0% and similarities of 96.2% and 97.6%, respectively. The regions showing low homology are similar to those of staphylocoagulase.
coa flanking regions are highly homologous.
Determination of the entire genome sequences of all seven S. aureus strains showed that coa is located at approximately 1 o'clock on the chromosome. To investigate whether all coa genes are located at the same position, we determined the nucleotide sequences of the coa flanking regions in strains whose genomes had not been sequenced. The strategy and results are summarized in Fig. . In S. aureus N315, coa is located between two open reading frames (ORFs), SA0221 (encoding a hypothetical protein) and SA0223 (encoding an acetyl-coenzyme A acetyltransferase homologue). Using the primer sets indicated in Fig. , we amplified by PCR two DNA fragments, spanning the upstream and downstream regions of coa, respectively, from chromosomal DNA of the seven reference strains and sequenced them. All 10 strains carried two ORFs corresponding to SA0221 and SA0223, located upstream and downstream of coa. Nucleotide identities of the regions corresponding to the two ORFs are shown in Fig. . They were highly conserved, with average nucleotide identities of 98.4% and 99.0%, respectively. In addition, the noncoding regions flanking coa were well conserved (Fig. ). Our data showed that the regions flanking coa are nearly identical, whereas the coa sequences located between them are very diverse, especially in the D1 and D2 regions and the 81-bp repeat region.
FIG. 2. Organization of the region in and around coa and comparison of the nucleotide sequences of coa and surrounding regions. ORFs are shown by bold arrows. The locations of the primers used in this study are indicated by short arrows. The numbers in the boxes ...
Phylogenetic relationship among coa of 10 serotypes.
To investigate whether allelic differences in coa correlate with nucleotide differences in chromosomal regions other than coa, we determined the nucleotide sequences of the seven housekeeping genes used for multilocus sequence typing (MLST). The sequence types (STs) of the 10 strains, each carrying a different serotype of coa, all differed from one another (Table ). We investigated the phylogenetic relations among the 10 strains using the nucleotide sequences of these seven genes. We found that the 10 strains were roughly divided into two groups (Fig. ). One group comprised five strains carrying type I, IV, V, VIII, and IX coa (group 1), and the other comprised five strains carrying type II, III, VI, VII, and X coa (group 2). When we created a phylogenetic tree using the nucleotide sequences of the upstream and downstream regions of coa, their relative relations were similar to those of the housekeeping genes (Fig. ).
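As a rough illustration of how strains can be compared across several housekeeping loci, the sketch below concatenates per-locus aligned sequences and computes pairwise p-distances (the proportion of differing sites). This is a simplification and not the method used here, since the trees in this study were built by maximum likelihood. The locus names, strain names, and sequences are hypothetical toy values.

```python
from itertools import combinations


def concatenate_loci(loci_by_strain: dict[str, dict[str, str]], order: list[str]) -> dict[str, str]:
    """Join each strain's aligned locus sequences in a fixed locus order."""
    return {
        strain: "".join(loci[name].upper() for name in order)
        for strain, loci in loci_by_strain.items()
    }


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(concatenated: dict[str, str]) -> dict[tuple[str, str], float]:
    """p-distance for every unordered strain pair, keyed by sorted name pair."""
    return {
        (s, t): p_distance(concatenated[s], concatenated[t])
        for s, t in combinations(sorted(concatenated), 2)
    }


if __name__ == "__main__":
    # Toy data with two hypothetical loci; a real analysis would use seven.
    toy = {
        "A": {"locus1": "AAAA", "locus2": "CCCC"},
        "B": {"locus1": "AAAT", "locus2": "CCCC"},
        "C": {"locus1": "TTTT", "locus2": "CCGG"},
    }
    dist = distance_matrix(concatenate_loci(toy, ["locus1", "locus2"]))
    for pair, d in dist.items():
        print(pair, d)
```

Strains that fall into the same group would be expected to show smaller distances to each other than to strains of the other group; the same matrix computed for coa itself could then be compared against it.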
FIG. 3. Maximum likelihood trees constructed using GCG paupsearch and visualized with Treeview software. Nucleotide sequences used for comparison were as follows: A, nucleotide sequences of coa other than the repeat region and C-terminal sequence; B, regions surrounding ...
In contrast, the phylogenetic relation among the 10 coa shown in Fig. was not similar to those of housekeeping genes or the flanking region shown in Fig. . The most striking difference was the position of the type I staphylocoagulase of strain 104. It was most distantly related to the other four types of coa (types IV, V, VIII, and IX), which constituted group 2 in tree A, whereas it belonged to the same group 1 together with four strains carrying types IV, V, VIII, and IX coa in the trees of Fig. . The phylogenetic relation of type III coa in strain NCTC 8325 did not agree with the phylogenetic relation of noncoding regions. In tree B, the region in NCTC 8325 was related to that in MW2 (type VII), more closely than to those in N315 (type II) and Stp-12 (type VI), whereas in trees A and C, the corresponding regions of NCTC 8325 were more closely related to those in N315 than to those in MW2.
Correlations among serotypes of coa and genetic backgrounds.
To determine the correlations between coa alleles and the genetic backgrounds of the strains, we investigated the strains carrying the same serotype of coa but with different genetic backgrounds and the strains carrying different types of coa but with identical genetic backgrounds as determined by ribotyping.
The nucleotide sequences of type IV coa in six S. aureus strains (three of them belonged to ribotype 4 and three of them belonged to ribotypes 47, 48, and 49) were highly homologous, with nucleotide identities of more than 99.9%. The difference in genotypes inferred by ribotyping was also confirmed by MLST. Three ribotype 4 strains, 85/2082, 85/3907, and 86/961, belonged to CC8, and three strains of different ribotypes, MR108, 93/H44, and 85/2232, belonged to CC30. These data suggested that the same coa types are located in S. aureus strains of different genotypes.
Furthermore, we determined the nucleotide sequences of type III coa in three strains that belonged to ribotype 4 and CC8. The coa sequences of the three strains, NCTC 10442, 61/6219, and 86/4372, were highly homologous, with nucleotide identities of more than 99.9%. These data suggested that different types of coa are located in S. aureus strains of the same genetic background, ribotype 4 and CC8.