Non-ribosomal peptide synthetases (NRPS) and polyketide synthases (PKS) found in bacteria, fungi and plants utilize two different types of thioesterases for the production of highly active biological compounds1, 2. Type I thioesterases (TEI) catalyze the release step from the assembly line3 of the final product where it is transported from one reaction center to the next as a thioester linked to a 4′-phosphopantetheine cofactor (4′-PP) that is covalently attached to thiolation (T) domains4-9. The second enzyme involved in the synthesis of these secondary metabolites, the type II thioesterase (TEII), is a crucial repair enzyme for the regeneration of functional 4′-PP cofactors of holo T-domains of NRPS and PKS systems11-13. Mispriming of 4′-PP cofactors by acetyl- and short chain acyl-residues interrupts the biosynthetic system. This repair reaction is very important, since roughly 80% of coenzyme A (CoA), the precursor of the 4′-phosphopantetheine cofactor, is acetylated in bacteria14. Here we report the first three-dimensional structure of a type II thioesterase free and in complex with a T domain. Comparison with structures of TEI enzymes3, 15 shows the basis for substrate selectivity and the different modes of interaction of TEII and TEI enzymes with T domains. In addition, we show that the TEII enzyme exists in several conformations of which only one is selected upon interaction with its native substrate, a modified holo-T domain.
The cyclic lipoheptapeptide surfactin is one of the most potent biosurfactants showing antibacterial and antiviral activity2,3,16. It is synthesized by a complex of three large subunits that consist of either three modules (subunits SrfA-A and SrfA-B) or one module (subunit SrfA-C) (Fig. 1) with each module being responsible for the addition of one amino acid2,3. The transport of the growing chain between individual modules is achieved by small ~80 amino acid long T domains that interact with the aminoacyl-forming activation A-domain and both the up- and downstream peptide-bond-forming condensation C-domains4-10. In addition, some of the T domains of the surfactin synthetase have to interact with epimerization domains, located at the C-terminus of subunits SrfA-A and SrfA-B, or with the covalently linked type I thioesterase3. The type I thioesterase catalyzes the macrolactone formation between Leu(7) and the β-hydroxy fatty acid to release the mature surfactin. Modifications blocking the reactive thiol group of the 4′-PP cofactor attached to any T domain can occur with small molecules present in the cell (acetylation, succinylation and modification with fatty acids) and pose a significant challenge for the organism to keep the assembly line running.
The surveillance and repair tasks for the surfactin assembly line are carried out by the stand-alone surfactin type II thioesterase (SrfTEII)11-13. The importance of this enzyme has been demonstrated by genetic deletions that reduced the production of surfactin by 84%17. Due to the large variety of acylation modifications and the fact that the SrfTEII has to be able to interact with all seven T domains of the entire assembly line, this TEII has to be - in contrast to the type I thioesterase at the end of the last module - rather non-specific. At the same time, premature cleavage of the correct growing peptide chain has to be avoided. In addition to this repair function, the SrfTEII might also be responsible for loading the β-hydroxy fatty acid onto the first C-domain.
SrfTEII shows the typical α/β-hydrolase fold, with a central 7-stranded β-sheet surrounded by 8 helices (Fig. 2). Comparison with the structure of the in cis-acting SrfTEI reveals a similar overall architecture3, but with significant differences (Supplementary Fig. 5). The most obvious alteration is the N-terminal truncation of α-helix II (Asp60 - Leu67). Further modifications include the insertion of an additional helix (helix VII; Cys193 - Trp200) between the active site residues Asp189 and His216 and repositioning of the helix-turn-helix motif canopying the active site in SrfTEI. This lid is moved from its central position observed in the SrfTEI crystal structure3 towards the β-strands VI and VII in SrfTEII. Consequently, the active site residues in the SrfTEII structure (Ser86, Asp189, and His216)11,12 are only partially covered by a short loop (Gln138 - Ala144) and become more accessible compared to SrfTEI. The space surrounding the active site is further enlarged by a kink in one of the helices of the helix-turn-helix motif.
Many of the amide proton resonances of the SrfTEII enzyme show significant line broadening or even two separate peaks, suggesting the existence of multiple conformations (Supplementary Fig. 2). Recently we had described the existence of multiple conformations in the TycC3-T domain5 (third module of the tyrocidine C subunit, a substrate for SrfTEII) and had suggested that this conformational exchange is an important driving force for the selective interaction with other domains of the NRPS system4-10.
NMR titration experiments18 of 15N-enriched SrfTEII were performed with CoA, myristoyl-CoA, apo- and holo-TycC3-T domains. The resulting chemical shift perturbations (Supplementary Fig. 3) revealed interaction with the holo form and virtually no interaction with the apo form in agreement with previous experiments in which the chemical shift differences of the different T domain forms were analyzed5. Interestingly, myristoyl-CoA also binds to SrfTEII, suggesting recognition of this β-hydroxy-fatty acid. This might support the previously assumed second function of SrfTEII as the starter enzyme, responsible for loading the β-hydroxy-fatty acid onto the first C-domain16. A comparison of the interaction interface of the SrfTEII to the in cis T recognition of EntF T-TEI15 (Frueh et al., this issue) reveals that in particular the sequentially diverse helix-turn-helix motif and its connecting loops are important for selecting the specific substrate.
To obtain more detailed insight into the interaction between the SrfTEII and the TycC3-T domain we carried out further NMR titration experiments with the acetyl-holo-form of the T domain and the inactive point mutation Ser86Ala of SrfTEII (Supplementary Fig. 4). Based on the magnitude of the chemical shift differences the acetyl-holo-T domain shows a significantly stronger interaction with SrfTEII than the holo-T domain, again in agreement with previous experiments5. To determine the relative orientation of both domains, we measured NOEs between selectively protonated methyl (Ile, Leu, Val) and aromatic (Phe) groups of SrfTEII and amide protons of the TycC3-T domain in otherwise perdeuterated proteins19, 20. These NOEs and the chemical shift perturbations allowed us to calculate the structure of the complex between the TEII enzyme and the T domain (Fig. 3b). Previous investigations had revealed that the holo-form of the TycC3 T domain exists in two different conformations that differ in the relative orientation of helix αII and the positioning of the 4′-PP cofactor5. They also demonstrated that interaction with the SrfTEII enzyme selects the more open conformation, the H-state5. Interestingly, modification of the 4′-PP cofactor with an acetyl group also shifts the equilibrium and locks the T domain in the H-state. Consequently, we used the H-state of TycC3-T in our structure calculations. The interface between both proteins involves helix αII, parts of helix αI and the C-terminus of the T domain, in agreement with the results reported previously5. The interface of the SrfTEII enzyme includes the N-terminal loop, the loops between β-strand II and α-helix I (Phe19 - Gly23), between β-strand IV and α-helix III (Gly88 - Met90), β-strand VI and α-helix VII (Asp189, Asp190) and the loop N-terminal to α-helix VIII (Met217, Phe218 and Gln222) as well as the helix-turn-helix “lid” region located between β-strands V and VI (Fig. 3). 
This arrangement positions the thiol-group of the 4′-PP cofactor in close proximity to the active site residues of the SrfTEII. The rest of the cofactor is surrounded by hydrophobic amino acids (Fig. 3)6. The helix-turn-helix motif is involved in the recognition of helix αII of the T domain, with close contacts to the peptide sequence surrounding the active site Ser45, as suggested previously21 and observed as well in the EntF T-TEI interaction (Frueh et al., this issue).
The titration experiments further demonstrated that interaction with the acetyl-holo-T domain selects one of the conformations of the thioesterase as indicated by the disappearance of the double peaks in the HSQC spectrum of the complex (Supplementary Fig. 4). Mapping the sites of the double peaks onto the structure of the TEII reveals that amino acids in the helix-turn-helix motif, loops surrounding the active site, residues located at the N-termini of the first and last helix are affected (Supplementary Figs. 1, 4). Unfortunately, the amino acid sequences that show double peaks are separated by stretches that display only one chemical shift value, making an unambiguous assignment to one particular conformation impossible. Thus, the structure of the SrfTEII represents an intermediate between two distinct conformations. The identity of the regions showing this conformational exchange combined with the selection of one state by binding of the T-domain, however, allows us to build a model of the exchange process. Most likely the two different states represent an open and a closed conformation of the enzyme (Supplementary Fig. 7). A similar plasticity of the lid region has been observed in the crystal structure of the SrfTEI enzyme in which two molecules with different conformations are located in the asymmetric unit3. The main difference between both molecules is the position of the lid, with one conformation representing a more open and the other a more closed state. Since both molecules, however, also form a non-native dimer in the crystal structure, the significance of this observation was not clear. The detection of an equilibrium of distinct conformations in the SrfTEII enzyme shifted towards one conformation by the interaction with a modified holo-T domain is the first confirmation of such a conformational exchange process in solution. Similar exchange processes are also observed in the structure of the EntF T-TEI di-domain (Frueh et al. this issue).
Inspection of the complex structure further reveals the structural basis for the recognition of short acyl groups attached to the 4′-PP thiol. Comparison of the active sites of the TEII and TEI enzymes shows that, despite the wide opening of the active site of TEII, the space available to accommodate a group attached to the 4′-PP cofactor is rather limited (Fig. 3, Supplementary Figs. 6 and 8). The open conformation of the SrfTEI enzyme is characterized by a pronounced active site cavity with a volume of 630 Å3. Modeling of a heptapeptide based on residual electron density has shown that this cavity is large enough to accommodate the entire peptide and to enforce a conformation that allows cyclization. In contrast to the deep and bowl-shaped cavity of the TEI enzyme, the active site of SrfTEII is embedded in a shallow groove that can accommodate only small acyl substituents on the 4′-PP cofactor (Supplementary Fig. 8). This specificity of TEIIs for small acyl substrates was previously demonstrated in kinetic studies with TycF, the TEII of the tyrocidine synthetase13, and with SrfTEII22. To further investigate this selectivity we have performed titration experiments with 15N-labeled SrfTEII and unlabeled TycC3-T loaded with either a single amino acid (Ala) or a tri-peptide (Phe-Pro-Phe). The titration with the Ala-loaded T-domain showed results very similar to the titration experiments with the acetyl-holo-T domain, with the active site and surrounding residues showing chemical shift changes or significant line broadening. The titration with the tri-peptide loaded T-domain, in contrast, resulted in only minor chemical shift differences, limited to parts remote from the active site (Fig. 4). The titration results also indicate differences in the dynamic behavior between acetyl-holo-T and Ala-holo-T on the one hand and holo-T and peptide-loaded holo-T on the other. 
While titrations with the first group indicate the formation of a stable complex by selection of one of the conformations represented by double peaks, titrations with the second group show a less pronounced selection of one conformation and mainly result in limited chemical shift differences.
In addition to shape and volume of the active site, the interaction of the enzymes with the 4′-PP cofactor might also differ. While in the structure of the SrfTEI the 4′-PP cofactor is almost completely surrounded by a channel formed by the TEI3, it seems to be sandwiched between the SrfTEII and the T domain in the in trans complex, demonstrating again the wider opening of the active site.
The structures reported here and their comparison with TEI enzymes show how modulation of the conserved thioesterase fold is used to change the function of the enzyme from one that recognizes the final product of the assembly line to one with a shallow but easily accessible active site that provides a rather unspecific but indispensable repair function.
Recently crystal structure determinations of type I fatty acid synthetases (FAS)6,23 have revealed the complex interaction between individual domains in these multi-enzymatic assemblies. These structures underscore the central role that the T domains and different orientations of their cofactors play in the iterative substrate shuttling between active sites.
The C-terminal His6-tagged wild type SrfTEII and the inactive Ser86Ala mutant proteins were heterologously expressed in E. coli and purified by Ni-chelation affinity chromatography. All isotope enriched protein samples were produced in supplemented M9 minimal media with selectively labeled carbon and nitrogen sources. NMR spectra24-29 for backbone and side chain resonance assignment and structure calculation were recorded on Bruker Avance800 and Avance900 spectrometers. All NMR titration experiments18 of labeled protein samples with small molecules and unlabeled proteins were performed on a Bruker Avance700 spectrometer. Bruker XWINNMR or Topspin 1.3 was used for data processing and UCSF SPARKY 3.111 for resonance assignment and NOE integration. The structure of SrfTEII was calculated on the basis of 2442 NOE upper distance limits and constraints for 92 hydrogen bonds in regular secondary structure elements using the simulated annealing program CYANA 2.130,31. The 20 conformers with lowest target function values were energy-refined in explicit water using the RECOORD scripts32 and the CNS 1.1 protocol33 to represent the solution structure of SrfTEII. All distance constraint violations were smaller than 0.2 Å. The in trans complex structure was calculated using the CNS 1.1 simulated annealing protocol. The contact surfaces of both proteins were identified by NMR titration experiments of labeled protein samples with the corresponding unlabeled interacting protein. Constraints to describe the relative domain orientation were obtained from a 3D-15N-NOESY-TROSY spectrum of a complex of 15N-labeled, perdeuterated holo-TycC3-T domain and fully perdeuterated, selectively Phe, Ile, Leu and Val protonated SrfTEII19,20. All structural figures were prepared using UCSF CHIMERA 1.247034 from the Computer Graphics Laboratory, University of California, San Francisco.
We thank Matthias Strieker2,4 for editing the manuscript and Birgit Schaefer1 for her great help and support in sample preparation. We thank Chi Scientific Inc. (Maynard, MA, USA) for the fast and high quality supply of the substrate peptides. The research was funded by the research grant BE-19/11 of the Deutsche Forschungsgemeinschaft (FB, MAM), an enclosed fellowship (AK), the Centre for Biomolecular Magnetic Resonance at the University Frankfurt (BMRZ) and the Cluster of Excellence Frankfurt (Macromolecular Complexes). AK thanks the Human Frontier Science Program Organization for a long-term fellowship awarded in Apr 2007.
Both structures are deposited at the RCSB Protein Data Bank with the accession codes 2RON (structure of SrfTEII) and 2K2Q (complex structure of SrfTEII and H-state TycC3-PCP).
The SrfTEII protein (residues 1-241) and its inactive S86A mutant were heterologously expressed in the E. coli strain BL21 (DE3) using a pET-expression vector system containing a C-terminal His6-tag. They were purified by Ni-chelation affinity chromatography and subsequent gel filtration chromatography using a Pharmacia Superdex-75 column. The purity of all protein samples was validated by SDS PAGE analysis. All labeled samples of the SrfTEII protein were produced in supplemented M9 media with stable isotope enriched glucose (13C or 13C/2H; Cambridge Isotopes Laboratories, Andover, USA) as the only carbon source and 15N ammonium chloride (Cambridge Isotopes Laboratories, Andover, USA) as the nitrogen source. For the preparation of fully perdeuterated samples the aqueous solvent was replaced by D2O and perdeuterated glucose was used as the carbon source. The stable isotope enriched (2H, 15N, 12C) and unlabeled samples of the TycC3-T domain (1-87) were expressed and purified as described previously5.
All NMR spectra for backbone and side chain resonance assignment and structure calculation of the SrfTEII protein were recorded on an Avance800 or an Avance900 spectrometer equipped with a 5 mm triple resonance, z-gradient cryogenic probe at 298 K. The resonance assignment was based on standard triple-resonance experiments, following standard protocols24-29. An additional 4-dimensional 15N/15N-resolved NOESY spectrum of 15N-labeled and 70% perdeuterated SrfTEII was recorded to verify the sequential resonance assignment and to define unambiguous amide-amide based distance constraints. DSS (4, 4-dimethyl-4-silapentane-1-sulfonate) was used as an internal chemical shift reference. All 15N-HSQC based NMR titration experiments18 were performed on an Avance700 spectrometer at 296 K. The spectral width was set to 12.5 ppm in the proton dimension and 35 ppm in the nitrogen dimension. A total of 2048 points in the direct and 1024 points in the indirect dimension were collected for all HSQC spectra. The TROSY-based 15N-edited NOESY experiments to measure interface distance constraints in the complex of Ser86Ala SrfTEII with the acetyl-holo-TycC3-T domain (1:1 ratio) and wild type SrfTEII with the holo-TycC3-T domain (1:2.5 ratio) were recorded on an Avance 900 with a mixing time of 182 ms. XWINNMR 3.1 or Topspin 1.3 (Bruker) were used for processing and SPARKY 3.111 (T. D. Goddard and D. G. Kneller, SPARKY 3, University of California, San Francisco) for resonance assignment and NOE peak integration. The SrfTEII protein could be assigned to 92% completeness of all 227 non-proline backbone resonances and 91.5% of all resonances.
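The HSQC acquisition parameters quoted above can be turned into an approximate digital resolution before zero filling. The following is a minimal sketch under stated assumptions: the 15N carrier frequency is estimated from the nominal 700 MHz proton frequency using an approximate gyromagnetic ratio, and the quoted point counts are taken as the number of points spanning each spectral width (the text does not say whether they are real or complex points). The function name is illustrative only.

```python
# Approximate ratio of the 15N to 1H gyromagnetic ratios (magnitude); an assumption
# used only to estimate the 15N carrier frequency from the proton frequency.
GAMMA_RATIO_15N_1H = 0.1014


def hz_per_point(spectral_width_ppm, carrier_mhz, n_points):
    """Spectral width converted to Hz, divided by the number of points."""
    sweep_width_hz = spectral_width_ppm * carrier_mhz  # 1 ppm corresponds to carrier_mhz Hz
    return sweep_width_hz / n_points


proton_mhz = 700.0                                  # nominal 1H frequency (Avance700)
nitrogen_mhz = proton_mhz * GAMMA_RATIO_15N_1H      # estimated 15N frequency

res_h = hz_per_point(12.5, proton_mhz, 2048)        # direct (1H) dimension
res_n = hz_per_point(35.0, nitrogen_mhz, 1024)      # indirect (15N) dimension

print(f"1H:  {res_h:.2f} Hz/point")
print(f"15N: {res_n:.2f} Hz/point")
```

With these assumptions both dimensions are sampled at a few Hz per point, which gives a rough lower bound on the size of the shift changes and peak doublings that the titration spectra can resolve.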
Structures of SrfTEII were calculated by simulated annealing in torsion angle space with the program CYANA 2.130,31. Refinements in explicit water (TIP3P) were performed using the RECOORD scripts32 and the CNS 1.1 protocol33. In all calculations 210 unambiguously identified backbone amide to amide contacts from a 4D 15N/15N-resolved NOESY experiment27 were applied with upper distance bounds of 6.5 Å. The secondary structure was defined by 184 distance constraints for 92 backbone hydrogen-bonds identified on the basis of unambiguous NOEs in a 3D-15N-resolved NOESY spectrum and by 278 torsion angle constraints derived from chemical shift values with the program TALOS35. Furthermore 2442 distance constraints were generated based on 7802 NOESY cross peaks from the same 3D-15N-resolved NOESY and three 3D-13C-resolved NOESY spectra recorded for aliphatic, methyl and aromatic side chains separately. An initial structure was calculated from 2851 manually assigned NOEs and the torsion angle constraints with CYANA 2.131. The calculation yielded a structural bundle (20 out of 100 calculated structures, sorted by lowest energy) with a precision (RMSD) of 3.8 Å and an accuracy (target function) of 1.9 Å2. The CYANA script with automatic NOE assignment was used with the resulting 1706 distance constraints of the previous calculation and additional 94 distance constraints obtained from manually assigned CH-NOEs selective for aromatic side chains. The interpretation of 4951 ambiguous NOEs resulted in an additional 642 distance constraints and in a structural bundle with an RMSD of 1.37 Å for all heavy atoms and an averaged target function value of 4.9 Å2 for the 20 structures with lowest CYANA target function values out of 100 calculated conformers. 
The final set of 20 out of 150 calculated structures does not show distance or van-der-Waals violations larger than 0.20 Å, no angle violations > 2.6° and 75% of all dihedral angles are located in most favored regions and additional 24.8% in additionally allowed regions. Energy-refinement in explicit water (TIP3P) was performed using the RECOORD scripts32 and the CNS 1.1 protocol33, after transforming the CYANA derived structures into the CNS/XPLOR format. Pockets in SrfTEI3 and SrfTEII were identified and their sizes computed using the CASTp algorithm37,38 with a probe radius of 2.0 Å.
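The violation criterion quoted above (no distance violation larger than 0.20 Å) can be illustrated with a small check. This is a sketch only, not the analysis performed by CYANA or CNS: the restraint labels, upper bounds and measured distances below are invented for illustration, and only the 0.20 Å threshold is taken from the text.

```python
def find_violations(restraints, threshold=0.20):
    """Return (label, excess in Å) for restraints whose measured distance
    exceeds the upper bound by more than the threshold."""
    violations = []
    for label, upper_bound, measured in restraints:
        excess = measured - upper_bound
        if excess > threshold:
            violations.append((label, round(excess, 2)))
    return violations


# Hypothetical restraints: (label, upper bound in Å, distance in one conformer in Å)
example_restraints = [
    ("restraint_1", 6.5, 6.10),   # satisfied
    ("restraint_2", 5.0, 5.15),   # exceeded by 0.15 Å, below the threshold
    ("restraint_3", 4.0, 4.35),   # exceeded by 0.35 Å, reported
]

violations = find_violations(example_restraints)
print(violations)
```

A conformer passes the criterion when the returned list is empty; applying the same check to every member of a bundle gives the ensemble-level statement.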
Chemical shift perturbations were measured in 15N-HSQC based NMR spectra for titration experiments18 of 15N-labeled holo- and acetyl-holo-TycC3-T domain with unlabeled SrfTEII and vice versa. The chemical shift perturbations and the line shapes of all titration experiments were analyzed using MestReC 220.127.116.116. It has been already demonstrated that the H-state of TycC3-T domain is recognized as a substrate by the SrfTEII5. Interaction surfaces of the carrier protein and the thioesterase were identified by the titration experiments and the structures of both proteins were used to calculate the structure of the enzymatically active complex. Constraints to describe the relative domain orientation were obtained from a 3D-15N-resolved NOESY-TROSY spectrum of a 15N-labeled completely perdeuterated holo-TycC3-T domain and fully perdeuterated selectively Phe, Ile, Leu and Val protonated SrfTEII19,20. Nine unambiguous and 17 additional ambiguous constraints between TycC3-T domain amide protons and SrfTEII FILV-side chain protons were identified. Distance constraints based on these NOEs were applied to the structural calculation of the in trans di-domain complex using the CNS 1.1 protocol. All structural figures were prepared using UCSF CHIMERA 1.247034.
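A common way to condense the 1H and 15N shift changes from such 15N-HSQC titrations into one value per residue is a weighted Euclidean distance. The sketch below shows that general technique; the exact formula and weighting used for the analysis reported here are not stated, so the 15N weighting factor of 0.2, the cutoff, the residue labels and all peak positions are assumptions made for illustration.

```python
import math


def combined_shift_perturbation(delta_h_ppm, delta_n_ppm, n_weight=0.2):
    """Weighted 1H/15N shift change in ppm; n_weight is an assumed scaling factor."""
    return math.sqrt(delta_h_ppm ** 2 + (n_weight * delta_n_ppm) ** 2)


def perturbed_residues(free, bound, cutoff_ppm):
    """Return {residue: perturbation} for residues at or above the cutoff.

    free and bound map residue labels to (1H ppm, 15N ppm) peak positions.
    """
    hits = {}
    for residue, (h_free, n_free) in free.items():
        if residue not in bound:
            continue  # no peak in the bound spectrum, e.g. broadened beyond detection
        h_bound, n_bound = bound[residue]
        csp = combined_shift_perturbation(h_bound - h_free, n_bound - n_free)
        if csp >= cutoff_ppm:
            hits[residue] = round(csp, 3)
    return hits


# Hypothetical peak lists (not data from this work)
free_peaks = {"res_A": (8.20, 120.0), "res_B": (7.95, 115.3), "res_C": (8.61, 124.8)}
bound_peaks = {"res_A": (8.23, 120.4), "res_B": (7.95, 115.3)}

hits = perturbed_residues(free_peaks, bound_peaks, cutoff_ppm=0.05)
print(hits)
```

Residues whose peaks disappear on binding are skipped here rather than scored; in practice line broadening of this kind is itself informative and would be tracked separately.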
The in vitro modification of unlabeled apo-TycC3-T domain was carried out in 2 ml reaction mixtures containing 0.25 mM apo-TycC3-T domain, 0.5 mM acetyl-CoA (SIGMA), 20 μM Sfp and 5 mM MgCl2, buffered in 100 mM sodium phosphate at pH 8.0, for 45 min at room temperature. The reaction mixture was subsequently purified by desalting (Econo-pac 10 DG (BIORAD) desalting column) and concentrated using an Amicon Ultra 4 Ultracell - 5k (MILLIPORE) filter device with a molecular weight cut off (MWCO) of 5 kDa.
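The stated concentrations translate into absolute amounts per 2 ml reaction as in the short calculation below. It is a worked arithmetic sketch using only the concentrations and volume given above; the variable and function names are illustrative.

```python
def amount_nmol(concentration_mM, volume_ml):
    """Amount of substance in nmol (mM x ml gives µmol; x 1000 gives nmol)."""
    return concentration_mM * volume_ml * 1000.0


REACTION_VOLUME_ML = 2.0

# Final concentrations in mM, as given for the reaction mixture
components_mM = {
    "apo-TycC3-T": 0.25,
    "acetyl-CoA": 0.5,
    "Sfp": 0.020,      # 20 µM
    "MgCl2": 5.0,
}

amounts = {name: amount_nmol(conc, REACTION_VOLUME_ML) for name, conc in components_mM.items()}

# Molar excess of acetyl-CoA over the T domain, and substrate-to-enzyme ratio
coa_excess = amounts["acetyl-CoA"] / amounts["apo-TycC3-T"]
substrate_per_sfp = amounts["apo-TycC3-T"] / amounts["Sfp"]

for name, nmol in amounts.items():
    print(f"{name}: {nmol:.0f} nmol")
print(f"acetyl-CoA : T domain = {coa_excess:.1f} : 1")
print(f"T domain : Sfp = {substrate_per_sfp:.1f} : 1")
```

The reaction thus uses a twofold molar excess of acetyl-CoA over the T domain, with Sfp present in sub-stoichiometric, catalytic amounts.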
The general procedure for the synthesis of peptidyl-amino CoA substrates was based on a synthesis described previously39, 40. Briefly, 10 mmol of amino-CoA41, 15 mmol of PyBOP, and 40 mmol of potassium carbonate were added to 10 mmol of the Boc-protected peptide. The solids were subsequently dissolved in a 1:1 THF/water (500 mL total) mixture and allowed to stir at room temperature overnight. The reaction mixture was directly purified by preparative high-performance liquid chromatography (HPLC) using a single injection on a Phenomenex C18 250 × 21.2 mm, 10 μm, 100 Å column, eluting with a gradient of 0 to 60 % acetonitrile containing 0.1 % trifluoroacetic acid (TFA) over 30 min at a flow rate of 10 mL/min while monitoring at 260 nm. The identities of the Boc-protected peptidyl-amino CoA substrates were verified by high-performance liquid chromatography-mass spectrometry (HPLC-MS) and matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) MS. Next, cleavage of the Boc-protecting group was carried out by dissolving the Boc-protected peptidyl-amino CoA in a 95:2.5:2.5 mixture of TFA, trifluoroethanol, and water and allowing this mixture to stir at room temperature for 2 h. The deprotected peptidyl-amino CoA substrates were purified by preparative HPLC using the same conditions as described above. HPLC-MS and MALDI-TOF MS were used to confirm the identity of the peptidyl-amino CoA substrates.
Full Methods and any associated references are available in the online version of the paper at www.nature.com/nature.