Staphylococcus aureus pathogenesis depends on a specialized protein secretion system, ESX-1, that delivers a range of virulence factors to assist infectivity. We report the characterization of two such factors, EsxA and EsxB, small acidic dimeric proteins carrying a distinctive WXG motif. EsxA crystallized in triclinic and monoclinic forms and high-resolution structures were determined. The asymmetric unit of each crystal form is a dimer. The EsxA subunit forms an elongated cylindrical structure created from side-by-side α-helices linked with a hairpin bend formed by the WXG motif. Approximately 25% of the solvent accessible surface area of each subunit is involved in interactions, predominantly hydrophobic, with the partner subunit. Secondary structure predictions suggest that EsxB displays a similar structure. The WXG motif helps to create a shallow cleft at each end of the dimer, forming a short β-sheet-like feature with an N-terminal segment of the partner subunit. Structural and sequence comparisons, exploiting biological data on related proteins found in Mycobacterium tuberculosis, suggest that this family of proteins may contribute to pathogenesis by transporting protein cargo through the ESX-1 system, exploiting a C-terminal secretion signal, and/or by acting as adaptor proteins to facilitate interactions with host receptor proteins.
The commensal Gram-positive bacterium Staphylococcus aureus lives on human skin and in nostrils. Under certain conditions it represents a serious problem in hospitals, especially for immunocompromised and surgery patients,1 and the number of deaths attributed to S. aureus infection is comparable to that of acquired immune deficiency syndrome.2 Increasing levels of antibiotic resistance contribute significantly to this statistic and create an urgent need for the development of new treatments.3 Thankfully, current developments in this area look promising with new drug candidates progressing through clinical trials.4 Nevertheless, it would be highly desirable to have access to a vaccine to complement new drug treatments, and an understanding of the molecular determinants of virulence has the potential to underpin such a development.5
The ability of S. aureus to adhere to the surface of human cells is key to infection, subsequently leading to cell invasion and/or destruction.6 Pathogenesis is dependent upon the secretion of a wide range of exoproteins and virulence factors,7 such as fibronectin-binding proteins.8,9 To extrude proteins, Gram-positive bacteria use a general secretion (Sec) pathway, which exploits an N-terminal signal sequence to assist translocation,10 and a twin-arginine translocation (Tat) pathway.7 Of particular interest in our research is a third secretion system discovered in Mycobacteria and directly involved in pathogenesis of Mycobacterium tuberculosis11 and S. aureus.12
Two virulence factors of M. tuberculosis, termed ESAT-6 (Early secreted antigenic target-6 kDa) and CFP-10 (Culture Filtrate Protein-10 kDa), elicit a strong T-cell immune response and stimulate the production of γ-interferon.13,14 These proteins were identified as the products of two open reading frames (esat-6 and cfp-10) in the region of difference (RD1 or ESAT-6 locus), a segment of DNA present in virulent M. bovis but absent from the BCG (Bacille Calmette-Guérin) strain used as a live vaccine and from avirulent M. microti strains.15-17 The re-introduction of RD1 into M. bovis BCG restores ESAT-6 and CFP-10 expression, and reinstates virulence and immunogenicity.17-19 The RD1 segment contains additional genes flanking esat-6 and cfp-10 encoding snm1 (Rv3870), snm2 (Rv3871) and snm4 (Rv3877).20-22 These three, membrane anchored proteins together help form the non-Sec dependent ESX-1 (ESAT-6 secretion) system and are required for virulence.22 Two of the proteins, snm1 and snm2, carry FtsK/SpoIIIE domains (FSD),21 a distinctive protein sub-structure often associated with ATPase activity. CFP-10 and ESAT-6 are representative of a protein family, observed in many Gram-positive bacteria, all comprising approximately 100 amino acids and carrying a Trp-X-Gly signature motif.21 Strikingly, there are similarities also in flanking genes and encoded proteins and it was suggested,23 and later proven21 that this locus encodes a secretion system. Similar loci are present in S. aureus,12 and termed the ESX-1 secretion system (Ess). Here, genes encoding an ESAT-6-like protein, EsxA (Ess extracellular A) and the CFP-10-like protein EsxB (Ess extracellular B) are found. The flanking genes encode four membrane-bound proteins, EsaA, EssA, EssB, and EssC of which the last three are conserved in Mycobacteria and required for secretion of S. aureus EsxA (SaEsxA) and EsxB (SaEsxB).12 EssC is homologous to snm1 and snm2 proteins of M. tuberculosis.12
We set out to establish if accurate structural data could provide insight into the biological role of these proteins and now describe the production and characterization of recombinant SaEsxA and SaEsxB together with the high-resolution structure of SaEsxA determined in two crystal forms. Sequence and structural comparisons, in conjunction with previous studies on the related M. tuberculosis virulence factors, allow us to describe structural features that appear important for the integrity of the structure of this protein family and to propose functional contributions to pathogenesis.
Recombinant expression systems to produce SaEsxA (Mwt ~ 11.0 kDa, 97 residues) and SaEsxB (Mwt ~ 11.5 kDa, 104 residues) were prepared and both proteins purified to homogeneity with a good yield of approximately 8 mg L−1 of bacterial culture. SDS-PAGE (sodium dodecyl sulfate polyacrylamide gel electrophoresis) and matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry were used to confirm the high level of purity and integrity of the products (data not shown). Gel filtration was applied as the final purification step and indicated that each protein formed a homodimer, of approximate molecular weight 23 kDa with no evidence for any monomer in solution. The high stability of such a quaternary structure is evident by the observation that dimer bands were even present in the SDS-PAGE gels (data not shown).
The application of fluorescence, circular dichroism, and nuclear magnetic resonance spectroscopy methods indicate that ESAT-6 in isolation is a monomeric molten globule protein, CFP-10 a monomeric and unstructured protein, but when mixed together they associate tightly to form a stable heterodimeric complex.24,25 The NMR derived structure of the CFP-10/ESAT-6 complex has been reported.26 Based on these observations it was previously accepted that SaEsxA and SaEsxB would form a heterodimeric assembly.12 However, in our case we were unable to find any evidence for a SaEsxA:SaEsxB heterodimer following incubation of the two proteins together under similar conditions employed for the M. tuberculosis proteins. We note the recent report that under the specific conditions of yeast two-hybrid experiments CFP-10 and ESAT-6 form homodimers,27 an observation not described by other investigators.
Crystallization trials of homodimeric SaEsxA and SaEsxB were carried out individually, and also with a SaEsxA:SaEsxB mixture. SaEsxB did not crystallize but two highly ordered crystal forms of SaEsxA were obtained. Triclinic crystal form I and monoclinic form II crystals diffracted to 1.4 Å and 1.9 Å resolution, respectively. The form II crystals were obtained from the SaEsxA:SaEsxB mixture. The structure of SaEsxA form I was solved by multi-wavelength anomalous dispersion (MAD) methods exploiting the signal from a selenomethionine (SeMet) derivative. This model was then used in molecular replacement (MR) calculations to determine the structure of form II. Refinement statistics are presented in Table 1 and indicate that the analyses have produced acceptable high-resolution models with all residues displaying φ/ψ combinations within the allowed regions of a Ramachandran plot.
Both crystal forms contain a dimer per asymmetric unit with subunits labeled A and B. Residues 4-87 in subunit A of both crystal forms are ordered, while in subunit B the electron density is well defined for residues 4-92 of form I and for all but the C-terminal residue of form II. Two additional N-terminal residues (Ser-His) are observed in subunit B of form II. These are remnants of the proteolysis carried out to remove the histidine-tag.
Several prominent features in electron and difference density maps were observed for both crystal forms. These corresponded to the major peaks also seen in anomalous difference Fouriers calculated with data from crystal form I (data not shown). On the basis of difference density and anomalous difference density peak heights, the presence of 80 mM zinc acetate and 100 mM sodium cacodylate in the crystallization medium, coordination geometry,28 and successful refinement these peaks were assigned as Zn2+ ions in both crystal forms and a single cacodylate in form I. Seven Zn2+ ions were located in form II, five in form I and we note that these five binding sites, on the surface of the homodimer, are conserved between the two forms (data not shown). The binding of Zn2+ ions is mainly achieved using glutamate and aspartate side chains, with several water molecules also involved. In form II, Lys73 NZ from both subunits coordinate a Zn2+. Lysine is not a common ligand for cation binding but there is precedent for such an interaction in high resolution structures.28 In form I, a cacodylate ion interacts with Lys73 of subunit A and one of the Zn2+ ions (data not shown).
The superposition of subunits within each crystal form yields an r.m.s.d. (root-mean-square deviation) of 0.3 Å for 85 Cα atoms (data not shown). Superposition of subunit A of form I with subunit A or B of form II results in an r.m.s.d. of 0.7 Å (data not shown). Such a high degree of structural similarity means that it is only necessary to detail one structure and we concentrate on form II since it contains the more ordered C-terminus.
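The r.m.s.d. values quoted above are the square root of the mean squared distance between paired Cα atoms after superposition. A minimal sketch of that final step follows, assuming the two coordinate sets are already superposed and matched atom by atom; the function name and the toy coordinates are hypothetical and are not taken from the deposited models.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired lists of (x, y, z) points.

    Assumes the sets are already superposed; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Hypothetical toy coordinates in Å: every atom is displaced by 0.3 Å along y.
subunit_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
subunit_b = [(0.0, 0.3, 0.0), (3.8, 0.3, 0.0), (7.6, 0.3, 0.0)]
print(round(rmsd(subunit_a, subunit_b), 2))
```

In practice a least-squares fit would be applied first to bring the two subunits into a common frame; that step is omitted here for brevity.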
SaEsxA is a small acidic protein of 97 amino acids (theoretical pI 4.6), and crystallized as a dimer in both crystal forms. The SaEsxA monomer is composed of two (α1, α2) side-by-side anti-parallel helices folded to create an elongated elliptical cylinder (Fig. 1A) of approximate dimensions 15 Å x 24 Å x 80 Å. The SaEsxA primary structure displays a heptad repeat where the first and fourth positions comprise hydrophobic residues, and the other positions are mainly hydrophilic. This pattern is recognized as a signature for interlocking α-helices.29,30
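The heptad pattern described above, with hydrophobic residues at the first and fourth positions, can be checked with a simple scan over the seven possible registers. The sketch below is illustrative only: the hydrophobic residue set is an assumption, the function names are hypothetical, and the test sequence is an idealized toy repeat rather than the SaEsxA sequence.

```python
# Residues treated as hydrophobic; this choice is an assumption for illustration.
HYDROPHOBIC = set("AILMFVWY")

def heptad_score(seq, offset=0):
    """Fraction of first (a) and fourth (d) heptad positions that are hydrophobic."""
    positions = [i for i in range(len(seq)) if (i - offset) % 7 in (0, 3)]
    if not positions:
        return 0.0
    hits = sum(1 for i in positions if seq[i] in HYDROPHOBIC)
    return hits / len(positions)

def best_register(seq):
    """Return (offset, score) for the register with the highest a/d hydrophobic fraction."""
    scores = [(heptad_score(seq, off), off) for off in range(7)]
    best_score = max(score for score, _ in scores)
    # Prefer the lowest offset when several registers tie.
    best_offset = min(off for score, off in scores if score == best_score)
    return best_offset, best_score

# Hypothetical idealized repeat: L at positions a and d, charged residues elsewhere.
toy = "LEELKKK" * 3
print(best_register(toy))
```

A register whose score approaches 1.0 over a long stretch is consistent with, but does not prove, an interlocking helical arrangement.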
Approximately 85% of the residues contribute to the two sections of secondary structure. A flexible N-terminal segment, lying across α2, leads to Pro8 which then initiates α1. Ten turns of helix extend to Trp43 and then a short hairpin bend links onto α2, which has thirteen turns of helix (Gly45 to Asn93). This second helix is distorted at turn four by Pro60 (Fig. 1A). The Cα atoms of α1 and α2 prior to Pro60 are separated by approximately 9 Å. After the distortion imposed by Pro60 the distance increases to 12 Å. The monomer is slightly curved with one side presenting a concave hydrophobic surface while the other displays a polar, acidic surface (Fig. 1B, C).
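The turn counts given above can be cross-checked against ideal α-helix geometry. The sketch below assumes the textbook values of 3.6 residues per turn and a rise of 1.5 Å per residue; the function name is hypothetical, and the real helices, particularly α2 with its Pro60 distortion, will deviate from these ideal figures.

```python
RESIDUES_PER_TURN = 3.6    # ideal alpha-helix (assumed)
RISE_PER_RESIDUE_A = 1.5   # Å per residue along the helix axis (assumed)

def helix_summary(first_residue, last_residue):
    """Return (residue count, number of turns, axial length in Å) for an ideal helix."""
    n = last_residue - first_residue + 1
    return n, n / RESIDUES_PER_TURN, n * RISE_PER_RESIDUE_A

# Helix boundaries as stated in the text: alpha1 Pro8-Trp43, alpha2 Gly45-Asn93.
for name, first, last in (("alpha1", 8, 43), ("alpha2", 45, 93)):
    n, turns, length = helix_summary(first, last)
    print(f"{name}: {n} residues, {turns:.1f} turns, {length:.1f} A")
```

Under these assumptions α1 spans 36 residues (10.0 turns) and α2 spans 49 residues (about 13.6 turns, roughly 74 Å), in line with the ten and thirteen turns reported and with the long dimension of the monomer.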
The dimerization of SaEsxA (Fig. 2) is dominated by hydrophobic interactions with approximately 75% of the interface constructed by non-polar residues. Each monomer has a dimerization interface area of ~1830 Å2, calculated using the Protein Interfaces, Surfaces and Assemblies service at the European Bioinformatics Institute (http://www.ebi.ac.uk/msd-srv/prot_int/pistart.html).31 This interface uses approximately 25% of the solvent accessible surface area of a single subunit. The few residues that form polar interactions (five hydrogen bonds and two salt bridges) directly implicated in dimerization are concentrated at the C- and N-termini and the hairpin bend. At the N-terminal section, the main chains of Met3 and Lys5 form a short β-sheet-like interface using two hydrogen bonds with the main chain of residues Asn42 and Glu44 of the neighboring subunit hairpin region (LysA5 N and MetA3 O with AsnB42 O and GluB44 N respectively are shown in Fig. 3). Tyr18 OH and Gln36 NE2 of both subunits form two hydrogen bonds with each other and salt bridges between Lys14 and Glu38 are also observed. GlnA75 OE1 accepts a hydrogen bond donated from ArgB50 NH2. In addition to these stabilizing interactions formed between the amino acids, the carboxylate side chains of GluA10 and GluB38 are involved in a Zn2+-mediated interaction (data not shown).
The hairpin bend between α1 and α2 contains the WXG motif (Trp43-Glu44-Gly45, Fig. 1A, 3). The side chain of Trp43 is positioned between Phe48 on the same polypeptide and Lys5 of the partner subunit. The indole moiety also forms van der Waals interactions with Ile39 of the same subunit and with Met6, Ile11, Thr79 and Val83 from the partner to stabilize one end of the SaEsxA dimer (data not shown). In most of the 226 putative EsxA/B sequences identified in the Pfam database32 (PF06013) the central residue of the tripeptide motif is large and hydrophilic. In 28% of the sequences the central residue is an acidic aspartate or glutamate and in 11% asparagine or glutamine. A basic residue (arginine, histidine or lysine) occupies 12% and serine 20% of the central positions. The notable exceptions are all but one of the sequences derived from Mycobacteria species where the motif is strictly conserved as Trp-Gly-Gly. In subunit A the carboxylate side chain of Glu44 is 2.8 Å from the carbonyl of Ala40 suggesting that protonation has occurred to allow a hydrogen bond to form (data not shown). This is plausible given the acidic conditions under which crystals were obtained (pH ~ 5.5). In subunit B, Glu44 adopts a different conformation and interacts with a Zn2+ ion, which in turn is linked to a symmetry related molecule (data not shown). Gly45 completes the tight hairpin turn into α2.
The residues that coordinate Zn2+ are Glu9, Glu10, Asp23, Glu38, Glu44, Glu52, Glu70, Asp81, Glu85 and Lys73 (data not shown). Four of these are conserved in EsxA/B sequences. Glu9 and Glu10 are conserved in 44 % and 33 % of the sequences respectively. Glu70 and Asp81 are conserved in about 25 % of the sequences. These residues do not create specific metal ion binding environments on the EsxA dimer but rather participate with symmetry related molecules in the crystal lattice to bind the ions.
A sequence alignment (MUSCLE)33 of SaEsxA with ESAT-6, CFP-10 and SaEsxB is presented in Fig. 4. SaEsxA shares 12 % sequence identity with ESAT-6, 14 % with CFP-10 and 13 % with SaEsxB. SaEsxB shares 8% sequence identity with ESAT-6 and 13% with CFP-10. Such a low level of sequence conservation immediately suggests that caution must be exercised in comparisons. In this, we are helped by access to structural data and despite the low sequence conservation the SaEsxA and CFP-10/ESAT-6 structures are similar. However, there are also important differences.
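Pairwise identity figures such as those above are derived from aligned sequences by counting matching columns. The following sketch shows one common convention, in which columns containing a gap in either sequence are excluded from the denominator; the text does not state which convention was used, so this choice is an assumption, and the function name and aligned strings are hypothetical toy examples rather than rows of Fig. 4.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumed convention)
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0

# Hypothetical toy alignment: 4 identical residues out of 6 ungapped columns.
row_1 = "MKV-LAE"
row_2 = "MRVQLSE"
print(round(percent_identity(row_1, row_2), 1))
```

Other denominators, such as the length of the shorter sequence or the full alignment length, give somewhat different percentages, which matters when identities are as low as those reported here.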
A representative conformer in the deposited ensemble of NMR structures for the CFP-10/ESAT-6 heterodimer26 is model 9 and a superposition of 80 Cα atoms of one subunit of SaEsxA with each subunit of the CFP-10/ESAT-6 heterodimer gives an r.m.s.d of 2.0 Å. One overlay, of SaEsxA with ESAT-6 is depicted in Fig. 5. SaEsxA shares a similar helix-hairpin-helix topology with the CFP-10/ESAT-6 heterodimer. The main differences between these structures lie at the C-terminus and an extra turn in α1 prior to the hairpin in ESAT-6. In contrast to both CFP-10 and ESAT-6, which display a flexible, poorly defined C-terminus, the C-terminal segment of SaEsxA retains a helical structure to Asn93.
Secondary structure prediction of SaEsxB (PSIPRED)34 suggested, with a high level of confidence, two α-helices between residues Lys13 and Ile42, Gly49 (of the WXG motif) and Gln97 (data not shown). Despite the low level of sequence homology between SaEsxA and SaEsxB this would strongly suggest similar structures. It is not common to find proline embedded in an α-helical structure and a distinctive feature of SaEsxA is the presence of Pro60, which creates a distortion in α2. In SaEsxB Pro72 is predicted to occur in α2 and may well produce a distortion in that protein although at a different position, two turns of helix further along compared to SaEsxA. There are no proline residues in CFP-10/ESAT-6 and the helices appear more regular.
The low level of sequence conservation in the EsxA/EsxB family and the complication of not knowing which are homo- or heterodimers renders it difficult to definitively assign structural features related to dimerization. However, inspection of the 226 EsxA/EsxB sequences in Pfam family PF06013 identified that SaEsxA Pro60 is conserved in 40% of the entries (data not shown). This residue and the specific distortion it imparts to α2 may represent a signature for homodimer formation. Experimental data on quaternary structure and the ability to be transported through the ESX-1 system on a range of orthologues will be required to confirm or refute this suggestion.
Mutational analysis of M. tuberculosis ESAT-6 has identified several residues required for targeting, secretion, complex formation with CFP-10 and immunogenicity.35 These residues are, in general, structurally conserved in SaEsxA, marked on the sequence alignment presented in Fig. 4 and on the molecular structures in Fig. 5. A total of sixteen residues were mutated, three at the N-terminus, three within α1, two in the hairpin region and eight in α2. A major conclusion from that study is that changes likely to perturb the dimer structure have a pronounced effect on virulence.
Virulence was abolished in ESAT-6 Leu28Ala, Leu29Ala, Trp43Arg, Gly45Thr, Val90Arg and Phe94Gln mutants. These mutations can be divided into three groups according to the proposed structural contributions of the specific residues. Leu28 and Leu29 are in the middle of α1, buried at the dimer interface and important contributors to it. The corresponding residues in SaEsxA are Ile28 and Leu29, and with hydrophobic side chains conserved they are likely to play a similar role in oligomerization and maintenance of the helical structure of SaEsxA. Trp43 and Gly45 of the WXG motif are conserved in SaEsxA and positioned on the hairpin loop linking α1 and α2. Mutations such as these would be predicted to disrupt the dimer. ESAT-6 Val90 and Phe94 are conserved as Leu90 and Phe94 in SaEsxA. These residues are part of the C-terminal sequence important for secretion and will be discussed later.
In SaEsxA, Phe55, Gln56, Ala63, and Gln67 align with four of the mutated residues Gln55, Gln56, Asn66, and Asn67. In ESAT-6 these polar residues are all exposed on the surface of the dimeric assembly. In SaEsxA, Gln56 and Gln67 are likewise solvent exposed but the others, Phe55 and Ala63 are buried and contribute hydrophobic interactions to stabilize the dimer. The difference between the two proteins here may be a result of the distortion to SaEsxA α2 caused by Pro60, a distortion that is absent from ESAT-6. Three other mutations were made to ESAT-6 (Gln4Leu, Ala14Arg, Met83Ile). These changes introduced amino acids that are conserved in other ESAT-6 family members and it is perhaps not surprising that these mutations had no apparent effect on ESAT-6 activity.35 In SaEsxA the corresponding residues are Ile4, Lys14, and Val83. Of these only Lys14 appears to be significant from a structural viewpoint; the aliphatic part of the side chain contributing to the hydrophobic interactions that stabilize the dimer and the amine group a salt bridge with Glu38 of the partner subunit as described earlier.
CFP-10 interacts with the C-terminal domain of the FtsK/SpoIIIE ATP-dependent transporter protein snm2.22 The conjugation of the seven C-terminal residues of M. tuberculosis CFP-10 to Saccharomyces cerevisiae ubiquitin and heterologous expression led to secretion of the yeast protein into the culture, indicating that these residues comprise a secretion signal. The level of protein secreted was, however, reduced significantly, indicating that other factors contribute to the recognition and translocation mechanism.36 Site-directed mutagenesis targeting these seven residues identified that mutation of CFP-10 Leu94, Met98, Gly99 or Phe100 to alanine abolishes interaction with snm2 and prevents secretion whilst similar alterations to ESAT-6 have no effect.36 The conclusion therefore is that a key determinant for secretion resides on the C-terminus of only one component and this explains the requirement for heterodimerization of CFP-10/ESAT-6 in M. tuberculosis.
The sequence alignment shown in Fig. 4 indicates a significant degree of conservation at the C-terminus of CFP-10, SaEsxA and SaEsxB. CFP-10 Leu94 and Gly99 are strictly conserved in the S. aureus proteins and Phe100 conservatively replaced by leucine. Met98 in CFP-10 is replaced by phenylalanine and glutamine in SaEsxA and SaEsxB respectively. It appears that both SaEsxA and SaEsxB carry the C-terminus translocation signal, which means that each protein is transported and acts independently as a virulence factor with no requirement for heterodimerization. It may be significant that the C-terminus of SaEsxA adopts a helical conformation (Fig. 3), a structural feature that may be important in the molecular recognition of EsxA by the transport machinery.
Under acidic conditions such as those found in phagosomes, ESAT-6 can dissociate from CFP-10 and bind, with high specificity, to liposomes containing cholesterol and phosphatidylcholine.37 ESAT-6 can also disrupt artificial lipid bilayers but not when complexed to CFP-10,38 and this property appears linked to a concomitant increase in α-helical content.25 It has been suggested that the complex with CFP-10 is required to facilitate secretion with regulation of ESAT-6 cytolytic activity provided by the requirement for it to be released from the partner subunit. Renshaw et al. did not observe any cytolytic activity in fluorescence based binding assays;26 however, the pH used in the experiments may not have been sufficiently acidic to promote subunit dissociation, and hence the lytic activity of ESAT-6 may have been masked. It remains to be shown if either SaEsxA or SaEsxB displays similar properties.
Dimeric CFP-10/ESAT-6 proteins are secreted to the culture supernatant and are not found in the cytosol, nor are they associated with membrane fractions.26 This suggests to us that this protein family is not a component of the transporter complex but may fulfill other roles. For example, the homo (in the case of S. aureus proteins) or CFP-10/ESAT-6 heterodimers carry a C-terminal segment that interacts with the transport system. Since these proteins are secreted the weight of evidence suggests they are more likely to be cargo, a transport module or chaperone to assist export of protein by the outer membrane ESX-1 secretion system.
Our study has revealed that SaEsxA forms a homodimeric four-helix bundle and that SaEsxB forms a homodimer likely of similar structure. The polar surface of the SaEsxA dimer suggests it is unlikely to be a membrane pore forming protein. The ProKnow server (http://proknow.mbi.ucla.edu) was used to seek clues to function, based on structure and sequence. The highest gene ontology score for molecular function was for a structural molecule and for a biological process of protein folding. Further analysis, by searching nearest structural neighbors against the Protein Data Bank using the DALI server39 and structure analysis using the Profunc server,40 identified the closest structural homologues of the SaEsxA dimer as vinculin,41 the talin rods that bind to vinculin,42 and paxillin-binding focal adhesion proteins43 (data not shown). These proteins are involved in the maintenance of cell structure integrity and adhesion events. The SaEsxA monomer also gives a structural match to the BtuB porin-binding domain of the colicin E3 receptor.44 In BtuB, hydrophobic residues displayed on a hairpin bend bind the hydrophobic core of the membrane bound porin.
Indeed, the structural features of SaEsxA, the tightly bound helical section with a hydrophilic surface, and the extended C-terminus containing exposed hydrophobic residues suggest that EsxA, or indeed the ESAT-6 family could be adaptor proteins, capable of bridging bacterial proteins to host cell surface receptors, that facilitate localization of virulence factors and exotoxins or indeed any secreted factor that contributes to establishing an infection.
We have shown that SaEsxA binds divalent cations and these were assigned as Zn2+ as discussed above. This observation may simply reflect the presence of these cations in the crystallization medium. We note that some of the liganding residues are, to a degree conserved in EsxA/EsxB family members and an association with cations may facilitate interactions with other metal binding proteins to support an adaptor role. This is an area that warrants further investigation.
The M. tuberculosis genome is predicted to encode more than 20 paired CFP-10/ESAT-6 paralogs, of which several have been found in culture supernatant.36 This suggests that the numerous gene duplication events that have occurred have given M. tuberculosis a high degree of redundancy with respect to these particular proteins. The formation of heterodimers with a CFP-10 type protein carrying the secretion signal provides a mechanism whereby a diverse range of proteins might be assisted through the ESX-1 system. The C-terminal sequences of some CFP-10 paralogs lack the Met-X-Phe tripeptide important for secretion in M. tuberculosis and for these we suggest that heterodimer formation with paralogs carrying the correct sequence is required to ensure interaction with the ESX-1 secretion system or that separate ESX-1-like systems drive their secretion. A plethora of secretion chaperones such as these would provide a highly specialized mechanism of translocating different virulence factors and proteins required to establish an infection or if acting as an adaptor system they would provide diversity in terms of recognition for host cell targets.
The WXG motif and nearby interactions between the subunits result in a shallow cleft positioned close to the important helical C-terminal signal sequence in SaEsxA. The strict conservation of this motif suggests an important role. This tripeptide is involved in dimer formation and the distinctive hairpin bend structure that is observed. In certain pathogenic bacteria, assembly of the bacterial flagellum is dependent on small, acidic, dimeric polypeptides such as FliS, FlgN and FliT that function as export chaperones to transport proteins across the outer membrane. These proteins display similarities to the chaperones that support the delivery of cytotoxic effector proteins into eukaryotic cells by the bacterial type III secretion system. The structure of one such chaperone, FliS, reveals a compact four-helix bundle that binds its target protein in an extended conformation.45 We speculate that EsxA might function in similar fashion and that the shallow cleft at the WXG motif may represent a peptide recognition feature by which cargo proteins might be acquired for transport. One possible model is that the N-terminus of the partner subunit could adopt or be forced into a different conformation, allowing partner proteins to bind at the WXG motif.
The next and important stage of research in this area is to identify exactly which substrates are exported, investigate interactions with EsxA/B and the mechanism of transport. With the new structural information presented here we would then be well placed to further dissect the ESX-1 system.
Chemicals of the highest quality available were sourced from Sigma-Aldrich and VWR International except where stated otherwise. The genes encoding S. aureus EsxA and EsxB were obtained by PCR amplification from genomic DNA (Strain SA 113 - ATCC 35556). Oligonucleotide primers for EsxA (forward, 5′ ACG CAT ATG GCA ATG ATT AAG ATG AGT CCA 3′ and reverse, 5′ ACG CTC GAG TTA TTG CAA ACC GAA ATT ATT 3′) and EsxB (forward, 5′ ACG CAT ATG GGT GGA TAT AAA GGG ATT AAA 3′ and reverse, 5′ ACG CTC GAG TCA TGG GTT CAC CCT ATC AAG 3′) were designed to introduce unique restriction sites for NdeI and XhoI, respectively (shown in bold). The PCR products were blunt-end ligated into pUC18 (SureClone; GE Healthcare) then subcloned into a modified pET-15b vector (Novagen) which produces a histidine-tagged protein with a Tobacco Etch Virus (TEV) cleavage site. Native protein was obtained from expression in E. coli BL21 (DE3) GOLD cells (Novagen). Cells were grown in LB supplemented with 100 μg ml−1 carbenicillin. Expression was induced at 25 °C with the addition of 1 mM isopropyl-β-D-thiogalactopyranoside, and growth continued for a further 8 hours. Selenomethionine derivative SaEsxA was prepared using the methionine auxotroph strain E. coli B834. Cells were grown in minimal media supplemented with normal amino acids, except that L-methionine was replaced with L-SeMet using an established protocol.46
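The primer design described above can be checked programmatically. The sketch below locates the NdeI (CATATG) and XhoI (CTCGAG) recognition sequences in the four primers quoted in the text and reverse-complements the reverse primers to show the stop codon that precedes the XhoI site on the coding strand. It assumes that all four primers are read 5′ to 3′ from left to right; the dictionary keys and function names are hypothetical.

```python
NDEI = "CATATG"   # NdeI recognition sequence
XHOI = "CTCGAG"   # XhoI recognition sequence

# Primer sequences copied from the text, spaces removed; assumed to read 5'->3'.
PRIMERS = {
    "EsxA forward": "ACG CAT ATG GCA ATG ATT AAG ATG AGT CCA".replace(" ", ""),
    "EsxA reverse": "ACG CTC GAG TTA TTG CAA ACC GAA ATT ATT".replace(" ", ""),
    "EsxB forward": "ACG CAT ATG GGT GGA TAT AAA GGG ATT AAA".replace(" ", ""),
    "EsxB reverse": "ACG CTC GAG TCA TGG GTT CAC CCT ATC AAG".replace(" ", ""),
}

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def site_offset(seq, site):
    """Zero-based index of the first occurrence of a restriction site, or -1 if absent."""
    return seq.find(site)

for name, seq in PRIMERS.items():
    site = NDEI if "forward" in name else XHOI
    print(name, "site at index", site_offset(seq, site))

# On the coding strand, the stop codon sits immediately upstream of the XhoI site.
print(reverse_complement(PRIMERS["EsxA reverse"]))
print(reverse_complement(PRIMERS["EsxB reverse"]))
```

Each site begins at index 3, directly after the three-base ACG overhang, and the NdeI site supplies the ATG start codon for the forward primers.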
Cells were harvested by centrifugation, resuspended in buffer A, 100 mM Tris-HCl (pH 7.5), 5 mM benzamidine, 5 mM lysozyme, 100 units DNase, 5 mM phenylmethylsulphonyl fluoride, and left on ice for 15 minutes before lysis in a One-Shot cell disruptor (Constant Cell Disruption Systems) at 33 kpsi. The insoluble debris was separated by centrifugation (53,000 g, 10 min, 4 °C), the supernatant passed through a 0.2 μm filter then applied to a 5 ml HisTrap HP column (Amersham Biosciences) that had been pre-charged with Ni2+ and eluted with a linear imidazole gradient. Fractions were analyzed by SDS-PAGE, and those that contained a high level of purified EsxA or EsxB were pooled and dialyzed against buffer A. The proteins were then incubated with a His-tagged TEV protease at 30 °C for 4 hours to remove the His-tag, and the sample passed through a HisTrap HP column to remove the cleaved components, the protease and uncleaved sample. The cleaved protein retains a Gly-Ser-His sequence prior to the initiating Met1. The flow-through was collected, and passed through a size exclusion chromatography column equilibrated with buffer A (HiLoad 16/60, Superdex 75 Prep grade, Amersham Biosciences). This column had previously been calibrated with molecular weight standards: blue dextran (> 2,000 kDa), bovine serum albumin (67 kDa), carbonic anhydrase (29.5 kDa) and cytochrome c (12.5 kDa, GE Healthcare; data not shown).
The EsxA and EsxB containing fractions were pooled, buffer exchanged into 100 mM sodium cacodylate (pH 5.5), then concentrated to 10 mg ml−1 (5 kDa centrifugal concentrator, Vivascience). Protein concentration was determined spectrophotometrically using a theoretical extinction coefficient of 6990 M−1cm−1.47 The full incorporation of SeMet and purity of the sample were confirmed by matrix assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF) and SDS-PAGE.
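The spectrophotometric concentration estimate follows the Beer-Lambert law, A = ε c l. The sketch below is an illustration under stated assumptions rather than the procedure of the cited reference: it uses widely quoted per-residue absorbances at 280 nm (5500 for tryptophan, 1490 for tyrosine, 125 for cystine, in M−1cm−1), and notes that one tryptophan plus one tyrosine sums to the 6990 M−1cm−1 quoted above. The residue counts are an inference from that sum, and the absorbance reading is a hypothetical value.

```python
# Per-residue molar absorbances at 280 nm in M^-1 cm^-1 (assumed, commonly used values).
EPS_TRP, EPS_TYR, EPS_CYSTINE = 5500, 1490, 125

def extinction_coefficient(n_trp, n_tyr, n_cystine=0):
    """Theoretical molar extinction coefficient at 280 nm from residue counts."""
    return n_trp * EPS_TRP + n_tyr * EPS_TYR + n_cystine * EPS_CYSTINE

def concentration_mg_per_ml(a280, epsilon, molar_mass_da, path_cm=1.0):
    """Beer-Lambert conversion of an absorbance reading to mg/ml (numerically equal to g/L)."""
    molar = a280 / (epsilon * path_cm)   # mol/L
    return molar * molar_mass_da         # g/L

epsilon = extinction_coefficient(n_trp=1, n_tyr=1)   # inferred counts, not stated in the text
print(epsilon)

# Hypothetical reading on a diluted sample, with the ~11.0 kDa SaEsxA monomer mass.
print(round(concentration_mg_per_ml(0.699, epsilon, 11000), 2))
```

At the 10 mg ml−1 working concentration the undiluted absorbance would be far above the linear range of most instruments, so readings of this kind are normally taken on diluted aliquots.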
Two crystal forms were obtained for SaEsxA. Triclinic blocks (form I) with dimensions 0.3 × 0.2 × 0.1 mm, were grown at 20 °C by vapor diffusion in hanging drops assembled from 2 μl of protein solution and 2 μl of reservoir (0.1M sodium MES (2-morpholinoethanesulfonic acid) pH 5.5, 80 mM zinc acetate, 8 % w/v polyethylene glycol 800). SeMet labeled protein crystals were grown under similar conditions, except that the drops were maintained at 4 °C. The second crystal form of SaEsxA was obtained during attempts to form a EsxA:EsxB complex following methods applied to prepare the heterodimeric M. tuberculosis CFP-10/ESAT-6.25,26 SaEsxA and SaEsxB were mixed in equimolar ratio and incubated for 30 minutes prior to setting up crystallization trials. The crystallization conditions were similar to those described above, leading to formation of monoclinic crystals. Crystals of both forms appeared after several days.
These crystals were cryo-protected using 20% PEG 300 then cooled in a stream of gaseous nitrogen at −173 °C. The form I crystals were characterized in-house on a Rigaku Micromax 007 X-ray generator equipped with an R-AXIS IV++ image plate, then stored under nitrogen. Multi-wavelength anomalous dispersion (MAD) diffraction data were measured from a single SeMet-labeled crystal on ID29 with an ADSC Q315 charge-coupled device detector at the European Synchrotron Radiation Facility, Grenoble, France. The data were processed using MOSFLM48 and SCALA49, which are part of the Collaborative Computational Project number 4 (CCP4) suite of programs,50 and relevant statistics are shown in Table 1. Data for the form II crystals were measured in-house then integrated and scaled using DENZO and SCALEPACK.51
The structure of SeMet-substituted SaEsxA was solved using MAD data, collected from a single form I crystal, with the program SHELX.52 A strong anomalous dispersion signal was observed to 2.0 Å resolution, and applying this resolution limit allowed the location of two selenium positions out of the six expected for a dimer in the asymmetric unit. Two metal ion sites were also identified, as determined after structure solution, but these were not included in calculation of the first phase set. The phase set was used with the automated electron density map interpretation software Arp-Warp,53 employing default settings, and phases were extended to 1.6 Å resolution. The figure-of-merit increased from 0.5 to 0.8. However, only about one-third of the structure was modeled, mainly a single helix as poly-alanine. Areas of the map for which no model had been constructed were near the metal ion binding sites, and the electron density for the divalent cations and cacodylate may have complicated automated map interpretation and subsequent assignment of sequence to structure. Nevertheless, the resulting high resolution electron density map was of excellent quality and the remainder of the model was built using COOT.54 Refinement by maximum-likelihood restrained least-squares methods was then carried out against the remote (λ3) dataset in Refmac555 and the resolution extended to 1.4 Å. These calculations were interspersed with inspection of electron density and difference Fourier maps calculated with the CCP4 suite and incorporation of water molecules and ions. Metal ions were assigned as Zn2+, together with a cacodylate. We cannot rule out the possibility that another divalent cation such as Ni2+ is present, but the concentration of Zn2+ in the crystallization mixture (80 mM) gives us confidence in the assignment.
A subset of data (5%) was used for R-free calculations to assist decisions during refinement. Strict non-crystallographic symmetry restraints and isotropic B-factor refinement were used during the initial rounds of refinement. Once the model was complete these non-crystallographic symmetry restraints were removed and anisotropic B-factor refinement was carried out. The inclusion of anisotropic B-factors produced a reduction in R-work/R-free from 22/25% to 16/21%, improved the quality of the electron density map and served to reduce difference density features near the metal ions. The stereochemistry of the model was inspected with PROCHECK.56 The structure of form II was solved by molecular replacement (MR) using the refined form I structure as the search model in PHASER.57 The structure was refined to 1.9 Å resolution, combining least-squares calculations in Refmac5 with inspection of electron density and difference density maps to guide model manipulation and water and cation identification in COOT. The resolution to which data were available did not justify the use of anisotropic B-factor refinement. Molecular images were prepared using PyMOL‡ and structural superpositions performed using SSM.58
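The R-work and R-free statistics quoted above share one definition, R = Σ||Fobs| − |Fcalc|| / Σ|Fobs|, evaluated over the working reflections and over the 5% held out of refinement respectively. The sketch below illustrates the idea only. It is not how Refmac5 or the CCP4 tools select or flag free reflections, it assumes the calculated amplitudes are already on the same scale as the observed ones, and the function names are illustrative.

```python
import random


def split_free_set(n_reflections, free_fraction=0.05, seed=0):
    """Randomly partition reflection indices into working and free sets."""
    if not 0.0 < free_fraction < 1.0:
        raise ValueError("free_fraction must lie between 0 and 1")
    rng = random.Random(seed)
    indices = list(range(n_reflections))
    rng.shuffle(indices)
    n_free = round(n_reflections * free_fraction)
    free = set(indices[:n_free])
    work = [i for i in range(n_reflections) if i not in free]
    return work, sorted(free)


def r_factor(f_obs, f_calc):
    """R = sum(| |Fobs| - |Fcalc| |) / sum(|Fobs|).

    Assumes f_calc has already been scaled to f_obs.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("amplitude lists must have equal length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, work, free):
    """Evaluate the same residual separately on the two index sets."""
    r_work = r_factor([f_obs[i] for i in work], [f_calc[i] for i in work])
    r_free = r_factor([f_obs[i] for i in free], [f_calc[i] for i in free])
    return r_work, r_free
```

Because the free reflections never contribute to the refinement target, a drop in R-free alongside R-work, as reported here on introducing anisotropic B-factors, indicates that the additional parameters improved the model rather than merely fitting noise.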
Atomic coordinates and structure factors have been deposited in the Protein Data Bank, www.pdb.org, with codes 2vs0 and 2vrz for crystal forms I and II respectively.
Funded by the Biotechnology and Biological Sciences Research Council (UK), [Structural Proteomics of Rational Targets] and The Wellcome Trust [grant numbers 082596 and 083481]. We are grateful to David Norman and Tracy Palmer for advice and to the European Synchrotron Radiation Facility for beam time and excellent staff support.
‡DeLano, W.L. (2002) The PyMOL Molecular Graphics System, DeLano Scientific, San Carlos, CA, USA. http://pymol.sourceforge.net/