Edited by Harold Scheraga
Microcin J25 (MccJ25) is a 21 amino acid (aa) ribosomally synthesized antimicrobial peptide with an unusual structure in which the eight N-terminal residues form a covalently cyclized macrolactam ring through which the remaining 13 aa tail is fed. An open question is the extent of sequence space that can occupy such an extraordinary, highly constrained peptide fold. To begin answering this question, here we have undertaken a computational redesign of the MccJ25 peptide using a two-stage sequence selection procedure based on both energy minimization and fold specificity. Eight of the most highly ranked sequences from the design algorithm, each of which contained two or three amino acid substitutions, were expressed in Escherichia coli and tested for production and antimicrobial activity. Six of the eight variants were successfully produced by E.coli at production levels comparable with that of the wild-type peptide. Of these six variants, three retain detectable antimicrobial activity, although this activity is reduced relative to wild-type MccJ25. The results here build upon previous findings that even rigid, constrained structures like the lasso architecture are amenable to redesign. Furthermore, this work provides evidence that a large amount of amino acid variation is tolerated by the lasso peptide fold.
Lasso peptides, also known as lariat peptides, are a unique class of cyclic peptides in which a linear portion of the peptide is fed through a macrocycle forming a knotted structure. Lasso peptides are further classified depending on the nature of the peptide cyclization. Class I lasso peptides are cyclized via the formation of two disulfide bonds and an isopeptide bond, while class II lasso peptides utilize only an isopeptide bond between the N-terminus of the peptide and either a glutamic acid or aspartic acid sidechain for cyclization (Fig. 1) (Rebuffat et al., 2004). The most well-studied class II lasso peptide is the antimicrobial peptide microcin J25 (MccJ25), which was isolated from an infant fecal strain of Escherichia coli (Salomon and Farias, 1992; Rebuffat et al., 2004; Vincent and Morero, 2009). Other examples of class II lasso peptides include the antimicrobial peptides lariatin (Iwatsuki et al., 2006) and capistruin (Knappe et al., 2008) and the endothelin B receptor antagonist RES 701-1 (Katahira et al., 1995). More recently, the structure of the peptide BI 32169, which functions as a glucagon receptor antagonist, was elucidated and found to be a lasso peptide (Knappe et al., 2010). BI 32169 is unique among lasso peptides because it contains an isopeptide bond and only a single disulfide linkage, departing from the canonical class I structure (Knappe et al., 2010). One consequence of the knotted topology of lasso peptides is that these molecules are quite constrained. For example, the ‘tail’ portion of MccJ25, amino acids 9–21, is firmly locked into place within the macrolactam ring formed by residues 1–8 due to the presence of two aromatic residues in positions 19 and 20 of the peptide (Fig. 1) (Bayro et al., 2003; Rosengren et al., 2003; Wilson et al., 2003). 
This highly constrained structure endows the peptide with tremendous stability to both thermal and chemical denaturation: MccJ25 retains antimicrobial function after autoclaving or treatment in 8 M urea (Salomon and Farias, 1992; Blond et al., 1999).
Given the highly constrained nature of the class II lasso peptide fold, we are interested in how much of sequence space such a fold can accommodate. Because there are currently only a handful of structurally confirmed examples of class II lasso peptides, a simple comparison of known lasso peptide sequences does not provide much information. An alternative route to examine which sequences can be tolerated by the lasso fold is to generate variants of a known lasso peptide and examine these variants for retention of structure and function. Of the known class II lasso peptides, MccJ25 is the best candidate for such studies since it has an easily assayable antimicrobial activity (Salomon and Farias, 1992; Portrait et al., 1999; Clarke and Campopiano, 2007; Cheung et al., 2010; Pan et al., 2010). Mass spectrometry can be used to determine whether the peptide is cyclized, since the formation of the isopeptide bond results in the loss of a water molecule; the resulting cyclic peptide therefore has a mass 18 Da lower than the corresponding linear peptide. The biosynthesis of MccJ25 is reasonably well understood; four genes are required for the synthesis, maturation and export of the peptide (Fig. 2) (Solbiati et al., 1996). The MccJ25 precursor, encoded by the gene mcjA, is 58 aa. The mcjB and mcjC genes encode the maturation proteins that cleave the mcjA gene product and cyclize it into its lasso conformation (Clarke and Campopiano, 2007; Duquesne et al., 2007), while the mcjD gene encodes an ABC transporter that pumps mature MccJ25 out of the cell.
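The mass check described above can be illustrated with a short calculation. The following is a minimal sketch, not the analysis used in this work: it assumes the mature MccJ25 sequence GGAGHVPEYFVGIGTPISFYG (not quoted in the text above) and standard monoisotopic residue masses, and the function names are hypothetical.

```python
# Monoisotopic residue masses (Da) for the amino acids present in MccJ25.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "I": 113.08406, "E": 129.04259,
    "H": 137.05891, "F": 147.06841, "Y": 163.06333,
}
WATER = 18.01056  # monoisotopic mass of H2O (Da)

# Assumption: mature 21 aa MccJ25 sequence, one-letter code.
MCCJ25 = "GGAGHVPEYFVGIGTPISFYG"


def linear_mass(seq):
    """Mass of the uncyclized peptide: residues plus one water for the termini."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def lasso_mass(seq):
    """Mass after isopeptide bond formation, which releases one water."""
    return linear_mass(seq) - WATER


if __name__ == "__main__":
    print(f"linear: {linear_mass(MCCJ25):.3f} Da")
    print(f"cyclic: {lasso_mass(MCCJ25):.3f} Da")
```

Note that a mass 18 Da below the linear value shows only that the isopeptide bond has formed; it does not by itself distinguish a threaded lasso from an unthreaded cyclic peptide.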
Pavlova et al. have taken the first step toward understanding the extent of sequence space that can be populated by the lasso fold by making a set of nearly all possible MccJ25 variants with a single amino acid change (Pavlova et al., 2008). Each single amino acid variant was tested for production and export from E.coli. In addition, the variants that were produced were tested for the ability to inhibit RNA polymerase (RNAP), the protein target of MccJ25 (Delgado et al., 2001; Yuzenkova et al., 2002; Mukhopadhyay et al., 2004; Semenova et al., 2005), and finally for antimicrobial activity. Of the 381 single amino acid variants, 64% were successfully produced and exported, 41% were active in vitro (inhibition of RNAP) and 18% retained in vivo antimicrobial function. An alanine-scanning study of another lasso peptide, capistruin, revealed that most single alanine substitutions and even several double alanine substitutions were tolerated by the lasso structure (Knappe et al., 2009). Here we take a different approach in which we have carried out a computational redesign of MccJ25 and have tested the redesigned variants, carrying multiple amino acid substitutions, for production and function. The computational method consists of two stages. The first stage is the sequence selection stage, which is based on a quadratic assignment-like model that is solved to global optimality as a reformulated equivalent integer linear optimization model and which produces a rank-ordered list of amino acid sequences that will fold into the given template (Floudas, 1995; Klepeis et al., 2003a–c, 2004; Fung et al., 2007). The second stage calculates a fold specificity of each sequence from stage 1 (Klepeis et al., 2003a–c, 2004; Fung et al., 2008) and ranks the sequences based upon how well each folds into the template structure.
All plasmids and primers used in this study are listed in Table I. The pTUC202 (Solbiati et al., 1996) and pJP3 (Pan et al., 2010) plasmids were previously described. Briefly, the pTUC202 plasmid harbors the native gene cluster for the production of MccJ25 while pJP3 contains an engineered mcjABCD gene cluster in which the mcjA gene is placed under the control of an IPTG-inducible promoter (Fig. 2). The pJP4 plasmid is identical to pJP3 except that it lacks the mcjA gene. Both pJP3 and pJP4 are derivatives of the commercial vector pQE-60 (Qiagen). To construct each plasmid encoding a mutant mcjA gene, the mcjA signal sequence was first amplified from pTUC202 using primers mcjA signal F and mcjA signal R. The mcjA mature sequence with the designed mutations was synthesized by primer overlap PCR using the forward and reverse primers for each mutant listed in Table I. The amplified signal sequence was then overlapped with the mature sequence in a second PCR step to generate a reconstituted mutant mcjA gene.
Next, the amplified mutant mcjA fragment was digested with BsaI and HindIII and ligated into pJP4 digested with NcoI and HindIII, placing the mutant mcjA gene after the phage T5 promoter and generating the plasmids pJP13–pJP15 and pWC1–pWC5 encoding the MccJ25 variants mut1–mut8 (Table I).
Plasmids harboring a mutant (pJP13–15, pWC1–5) or wild-type (pJP3) mcjA were transformed into the E.coli strain DH5α. The transformants were grown in Luria-Bertani (LB) broth. At an OD600 of 0.5, 1 mM IPTG was added to induce MccJ25 production. Following 20 h of induction, 1–2 ml of culture supernatant was obtained by centrifugation at 14 000 rpm for 1 min. The culture supernatants were then heated at 100°C for 5 min and stored at 4°C until needed.
MccJ25 variant activity was assessed using a zone of inhibition assay described previously (Pan et al., 2010). Briefly, an M63 minimal media plate was overlaid with M63 soft agar inoculated with 10⁷ colony-forming units per milliliter of exponentially growing Salmonella enterica subsp. enterica serovar Newport. This organism is colloquially referred to as Salmonella newport. After solidification, 5 μl of prepared supernatant was spotted onto the overlay. The plate was incubated overnight at 37°C and analyzed for zones of inhibited growth.
Culture supernatant was prepared as described in the previous section, but on a larger scale (10–15 ml). The supernatant was extracted with two volumes of n-butanol. The organic phase was separated and dried via rotary evaporation. The residue was then re-dissolved in water (1 ml).
Reverse-phase high-performance liquid chromatography (HPLC) using an Agilent 1200 Series LC system with analytical scale Zorbax C18 column (300SB-C18, 9.4 × 250 mm) was used to purify the peptide from the extract. Ultrapure water (0.1% trifluoroacetic acid) and acetonitrile (0.1% trifluoroacetic acid) were used to create a gradient from 10 to 50% acetonitrile in 20 min, rising to 90% acetonitrile in 5 min after which 90% acetonitrile was applied for 5 min. For variants with antimicrobial activity, fractions were collected and tested using the zone of inhibition assay. The active fraction was evaporated and re-dissolved in water. For variants with no activity, putative peaks were collected. The identity of each exported variant was confirmed by electrospray ionization mass spectrometry.
Culture supernatant was extracted with two volumes of n-butanol. A 1 ml sample of the organic extract was dried using a speed-vac and re-dissolved in 200 μl of water. The Agilent LC system was used to analyze this sample, using the same gradient as described in the methods section for purification. The area of the peak corresponding to each variant was used to quantitatively evaluate the amount of peptide produced.
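As an illustration of how peak areas translate into a relative production level, the sketch below integrates a chromatogram peak with the trapezoidal rule and divides by the wild-type area. It is only a sketch under stated assumptions: the function names, the integration window and the flat baseline are hypothetical, and the instrument software may integrate peaks differently.

```python
def peak_area(times, signal, start, end, baseline=0.0):
    """Trapezoidal area above a flat baseline for segments lying within [start, end].

    times  -- retention times in minutes, ascending
    signal -- detector response at each time point
    """
    area = 0.0
    points = list(zip(times, signal))
    for (t0, y0), (t1, y1) in zip(points, points[1:]):
        if t0 >= start and t1 <= end:
            area += 0.5 * ((y0 - baseline) + (y1 - baseline)) * (t1 - t0)
    return area


def relative_production(variant_area, wild_type_area):
    """Production of a variant expressed as a fold of the wild-type peak area."""
    return variant_area / wild_type_area


if __name__ == "__main__":
    # Hypothetical traces, for illustration only.
    t = [15.0, 15.5, 16.0, 16.5, 17.0]
    wild_type = [0.0, 10.0, 20.0, 10.0, 0.0]
    variant = [0.0, 5.0, 10.0, 5.0, 0.0]
    wt_area = peak_area(t, wild_type, 15.0, 17.0)
    var_area = peak_area(t, variant, 15.0, 17.0)
    print(relative_production(var_area, wt_area))
```

Comparing areas in this way presumes that the variant and the wild-type peptide have similar absorbance at the detection wavelength, which may not hold when aromatic residues are added or removed.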
Culture supernatants were serially diluted in LB broth and a zone of inhibition assay as described above was performed using the dilution samples. The maximum dilution at which an inhibition zone was observed was used to determine the relative antimicrobial activity of the MccJ25 derivatives.
The de novo design framework (Klepeis et al., 2003a–c, 2004; Rajgaria et al., 2006, 2008; Fung et al., 2008) consists of (a) the design inputs component, (b) stage 1 which is the sequence selection stage and produces a rank-ordered list of amino acid sequences that fold into the postulated rigid or flexible backbone template and (c) stage 2 which validates the predicted sequences by fold specificity calculations. The aforementioned components are detailed below.
The original sequence selection method was first developed by Klepeis et al. (2003a–c, 2004). It selects and ranks amino acid sequences according to their energies in the design template using a novel integer linear programming (ILP) model that can be solved rigorously using branch and bound techniques (Floudas, 1995). The performance of branch and bound techniques can be significantly enhanced by introducing reformulation linearization techniques: appropriate constraints are multiplied by bounded non-negative factors (such as the binary assignment variables), and each resulting product of two original variables is replaced by a new variable, which yields tighter relaxations. The method was later improved by the use of a more computationally efficient sequence selection model for a single template structure and by the development of models that can address flexible template structures (Fung et al., 2007). Two types of sequence selection models were developed: (a) the weighted average model and (b) the distance bin model (Fung et al., 2007).
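For orientation, the general shape of such a model can be written out as follows. This is a sketch in generic notation rather than the exact formulation of the cited work: $y_i^j$ is a binary variable equal to 1 when amino acid $j$ occupies position $i$, $E_{ik}^{jl}$ is the pairwise energy of amino acid $j$ at position $i$ and amino acid $l$ at position $k$, and $w_{ik}^{jl}$ is the new variable that replaces the product $y_i^j y_k^l$. All symbol names are assumptions made here.

```latex
\begin{align}
\min_{y,\,w}\quad
  & \sum_{i=1}^{n}\sum_{j=1}^{m_i}\sum_{k=i+1}^{n}\sum_{l=1}^{m_k}
    E_{ik}^{jl}\, w_{ik}^{jl} \\
\text{s.t.}\quad
  & \sum_{j=1}^{m_i} y_i^{j} = 1
    && \forall\, i \\
  & \sum_{j=1}^{m_i} w_{ik}^{jl} = y_k^{l}
    && \forall\, i,\; k > i,\; l \\
  & \sum_{l=1}^{m_k} w_{ik}^{jl} = y_i^{j}
    && \forall\, i,\; k > i,\; j \\
  & y_i^{j},\; w_{ik}^{jl} \in \{0, 1\}
\end{align}
```

The first constraint places exactly one amino acid at each position. The next two are obtained by multiplying that constraint by $y_k^l$ (respectively $y_i^j$) and substituting $w_{ik}^{jl}$ for each product, which is the reformulation linearization step described above.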
Fold specificity calculations are used to rank the sequences from the sequence selection stage based upon a measure of how well the sequence folds into the design template. The fold specificity calculations can take on one of the two approaches: the ASTRO-FOLD approach (Klepeis and Floudas, 1999, 2002, 2003a,b, 2005; Klepeis et al., 1999, 2002, 2003a,b; Monnigmann and Floudas, 2005; Floudas et al., 2006; McAllister et al., 2006; Rajgaria et al., 2006, 2008) and the Tinker/CYANA based approach (Guntert et al., 1997; Guntert, 2004; Fung et al., 2008). The first approach employs the deterministic global optimization-based protein structure prediction framework of ASTRO-FOLD to generate two sets of conformational ensembles: one in which the protein is constrained to a region around the flexible backbone template (set temp) and the other in which the protein is allowed to fold freely (set total), while maintaining the secondary structure. The probability for the amino acid sequence to assume the target fold is then calculated from the energies of these two ensembles based on the Boltzmann distribution. The second approach (Fung et al., 2008) introduces an approximate method for fold specificity calculation which is computationally efficient. First, a flexible template is defined based on the upper and lower bounds on both the distances between α-carbons and the backbone dihedral angles between residues. An ensemble of hundreds of random conformers is then generated within the confines of the flexible template using the CYANA 2.1 software package for NMR structure refinement (Guntert et al., 1997; Guntert, 2004). CYANA 2.1 is then used to perform annealing calculations that simulate a rapid heating of the protein followed by a slow cooling in which high temperature torsion dynamics and annealing torsion dynamics are performed. Violations of van der Waals radii and of the flexible template are minimized, minimizing the energy of the target structures. 
For each structure in the ensemble, local minimizations are then performed by the TINKER package (Ren and Ponder, 2003) as directed by gradients in the fully atomistic force field AMBER (Cornell et al., 1995). The specificity of each mutant sequence to the target fold is then calculated relative to the native sequence using the Boltzmann distribution from statistical mechanics. To calculate the relative factor for specificity, the set native is defined as the set of all data points from the native sequence that are below upper bounds on the energy and RMSD of each conformer, and the set novel is defined as the set of all data points from the novel sequence that meet the same criterion. The bounds are calculated by first determining the mean and standard deviation of both RMSD and AMBER energies. The upper bound on RMSD is one and a half standard deviations above the mean, while the upper bound on energy is two and a half standard deviations above the mean. The factor for specificity, f_specificity, is then calculated using Boltzmann probabilities, where β = 1/(k_B T), k_B is the Boltzmann constant and T is the temperature.
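A form of the specificity factor consistent with the definitions above is a ratio of Boltzmann-weighted sums over the two sets, f_specificity = Σ_{i∈novel} exp(−βE_i) / Σ_{i∈native} exp(−βE_i). The Python sketch below implements that ratio under several assumptions that are not confirmed by the text: the bounds are computed from the native ensemble and applied to both sets, energies are in kcal/mol, the temperature is 298.15 K, and all function names are hypothetical.

```python
import math
from statistics import mean, stdev

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def upper_bounds(conformers):
    """Upper bounds on RMSD (mean + 1.5 sd) and energy (mean + 2.5 sd).

    conformers -- list of (rmsd, energy) tuples, one per minimized structure
    """
    rmsds = [r for r, _ in conformers]
    energies = [e for _, e in conformers]
    return (mean(rmsds) + 1.5 * stdev(rmsds),
            mean(energies) + 2.5 * stdev(energies))


def boltzmann_sum(conformers, bounds, beta, e_ref):
    """Sum of exp(-beta * (E - e_ref)) over conformers within both bounds."""
    rmsd_max, e_max = bounds
    return sum(math.exp(-beta * (e - e_ref))
               for r, e in conformers
               if r <= rmsd_max and e <= e_max)


def fold_specificity(native, novel, temperature=298.15):
    """Ratio of Boltzmann sums, novel over native (assumed form)."""
    beta = 1.0 / (K_B * temperature)
    bounds = upper_bounds(native)  # assumption: bounds taken from the native set
    # Shifting by a common reference energy avoids overflow and cancels in the ratio.
    e_ref = min(e for _, e in native + novel)
    return (boltzmann_sum(novel, bounds, beta, e_ref)
            / boltzmann_sum(native, bounds, beta, e_ref))
```

Under this form, a value above 1 indicates that the novel sequence populates the template region more favorably than the native sequence does, and a value below 1 indicates the opposite.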
The first step in our computational design was to decide upon the set of amino acid substitutions tolerated at each residue throughout MccJ25. The study by Pavlova et al. on single amino acid variants of MccJ25 (Pavlova et al., 2008) had not yet been published when we initiated our design, so we relied instead on a set of heuristics to constrain the peptide design. The structure of MccJ25 was examined and residues were classified as core or surface residues depending on their solvent-accessible surface area. Generally, the core residues were allowed to vary to a set of eight large or hydrophobic residues, while surface residues were allowed to vary to a larger set of 15 amino acids (Fig. 3) excluding only the charged amino acids and cysteine. We took into account that amino acids such as glycine and proline often play important structural roles, with glycine endowing the peptide backbone with flexibility and proline constraining the peptide backbone. Glycine appears six times in MccJ25, at positions 1, 2, 4, 12, 14 and 21, so these positions were not allowed to vary from glycine in our design. Likewise, the two proline positions, Pro-7 and Pro-16, were fixed. The macrolactam ring of MccJ25 is formed between the N-terminus of Gly-1 and the sidechain of Glu-8, so the Glu-8 position was not allowed to vary in the design. Finally, His-5 was only allowed to vary to Arg or Lys in order to conserve the positive charge in the macrolactam ring. It is noteworthy that His-5 has been demonstrated to play a role in import of MccJ25 to bacterial cells via its interaction with the inner membrane protein SbmA (de Cristobal et al., 2006). The allowed amino acid variations for the computational design are summarized in Fig. 3. The total number of possible variants in this library is on the order of 3.2 × 10¹¹. In order to reduce the complexity of the library, we restricted the results to a maximum of three amino acid substitutions.
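The effect of capping the number of substitutions can be illustrated with a short counting sketch. The per-position set sizes below are an assumption, since the actual sets are given in Fig. 3: nine fixed positions (six Gly, two Pro, Glu-8), His-5 with three options, and a hypothetical split of the remaining eleven positions into seven with eight options and four with fifteen options, which is one split that reproduces a library of about 3.2 × 10¹¹. The sketch also assumes that the native residue is one of the options at every position.

```python
from math import prod


def library_size(options):
    """Total number of sequences when every position varies independently."""
    return prod(options)


def variants_up_to(options, max_subs):
    """Count sequences that differ from the native at no more than max_subs positions.

    options -- number of allowed amino acids per position, native included,
               so a position with n options has n - 1 possible substitutions
    """
    # counts[k] holds the number of sequences with exactly k substitutions.
    counts = [1] + [0] * max_subs
    for n in options:
        alternatives = n - 1
        # Iterate downwards so each position is used at most once per sequence.
        for k in range(max_subs, 0, -1):
            counts[k] += counts[k - 1] * alternatives
    return sum(counts)


# Hypothetical per-position option counts for the 21 residues (see assumptions above).
OPTIONS = [8] * 7 + [15] * 4 + [3] + [1] * 9

if __name__ == "__main__":
    print(f"full library: {library_size(OPTIONS):.2e}")
    print(f"at most three substitutions: {variants_up_to(OPTIONS, 3)}")
```

Under these assumed counts, the three-substitution cap reduces the search space from roughly 3.2 × 10¹¹ sequences to the order of 10⁵.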
The computational design was carried out using the de novo protein design framework presented earlier. The computations for the generation of 1000 sequences in a ranked order list based on their energies (i.e. stage 1 which solves the ILP model to global optimality) took only a few CPU seconds using the CPLEX 9.0 software package (IBM ILOG Software) on a Pentium IV 3.2 GHz processor. The computations for the re-ranking of the predicted sequences based on the fold specificity calculations are more demanding and were performed on a Beowulf cluster. Variants were ranked based on two metrics: energy from sequence selection and fold specificity. Based on both of these metrics, we selected a panel of eight variants to construct and test for both production and antimicrobial activity.
The sequences of the eight MccJ25 peptide variants, labeled mut1 to mut8, are given in Table II. Of these eight variants, seven contain three amino acid substitutions while the eighth variant contains two substitutions. It is noteworthy that the design algorithm selects residues that are found both in the macrolactam ring portion of MccJ25 (residues 1–8) and the threaded tail portion of the peptide (residues 9–21). However, the variants generated by the design algorithm tend to contain more amino acid substitutions in the ring portion of the peptide relative to the tail. In particular, Val-6 of the peptide appears to be a ‘hotspot’ for the design algorithm as seven of the eight variants have an amino acid substitution in this position.
To test the in vivo production and antimicrobial activity of the eight selected MccJ25 variants, the genes encoding mut1–8 were assembled from oligonucleotides and inserted into an engineered MccJ25 gene cluster, in place of the wild-type mcjA gene, which was transcribed from a strong viral T5 promoter (Fig. 2) (Pan et al., 2010). The mcjBCD operon in this engineered cluster remained driven by its natural pmcjBCD promoter. Production of the variants was carried out in E.coli DH5α transformed with the plasmids (pJP13–15 and pWC1–5, Table I) harboring the engineered gene clusters. To determine whether these lasso peptide variants were produced and exported by the cells, we subjected the butanolic extracts of the culture supernatants to analytical HPLC and examined the traces for peaks near the retention time of authentic MccJ25, about 15.8 min. Remarkably, six of the variants, mut1–mut3 and mut5–mut7, were identified by HPLC analysis (Table III). Subsequent mass spectrometry confirmed that these peaks represented cyclized peptides corresponding to the predicted sequences. It should be noted that an alternative cyclized peptide, the ‘unthreaded lariat’, is topologically possible, but previous work has demonstrated that this topology of MccJ25 is not functional in vitro or in vivo (Wilson et al., 2003). Thus, we assume that successful cyclization of the peptide and observation of antimicrobial activity imply that the peptide is in a lasso or threaded lariat topology. We cannot preclude the possibility that the inactive variants (mut2, mut3 and mut7) are present in an unthreaded lariat topology, though it is unlikely that such a topology would be correctly exported by the McjD transporter. Each of the six produced variants was represented by a well-isolated peak in the HPLC chromatogram; however, there was no detectable peak for mut4 or mut8 (Fig. 4). The retention times of the six produced variants range from 15.8 to 18.3 min.
We integrated the area under the peak on the HPLC chromatogram for wild-type MccJ25 and each of the MccJ25 variants to obtain an estimate of the relative production level of the peptides (Table III). The production level of the variants was shown to range from 0.5- to 1.5-fold relative to that of wild-type. Thus, we found that six of the eight MccJ25 variants tested were capable of being synthesized, processed into the mature lasso fold and exported out of the cells into the culture medium, at a production level that is close to or even exceeds that of the wild-type peptide.
The MccJ25 variants were evaluated for their antimicrobial activities by the zone of inhibition assay. Culture supernatants of E.coli DH5α producing mut1–mut8 were spotted on plates overlaid with soft agar that was inoculated with MccJ25-sensitive strain Salmonella newport. After overnight incubation, zones of inhibition were examined. The variants mut1, mut5 and mut6 exhibited antimicrobial activity, but the zones of inhibition formed by the variants were significantly smaller than the zone formed by authentic MccJ25 (Fig. 5). This suggested that the three variants were not as efficacious as wild-type MccJ25. To quantify this suspected decrease in efficacy, we performed serial dilution of the culture supernatants followed by spotting of the dilutions on S.newport plates. These experiments demonstrated quantitatively that the antimicrobial activities of the three functional variants are between 128- and 256-fold lower than that of wild-type (Table IV).
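The way a dilution series yields a fold difference in activity can be sketched as follows. The values 128 and 256 are powers of two, which would be consistent with a twofold dilution series, but the dilution step is an assumption here; the data and function names are hypothetical.

```python
def highest_active_dilution(results):
    """Largest dilution factor at which a zone of inhibition was observed.

    results -- dict mapping dilution factor (1 = undiluted) to True/False
    Returns None when no dilution showed a zone.
    """
    active = [factor for factor, zone in results.items() if zone]
    return max(active) if active else None


def fold_reduction(wild_type_results, variant_results):
    """Fold decrease in activity of a variant relative to wild type."""
    wt = highest_active_dilution(wild_type_results)
    var = highest_active_dilution(variant_results)
    if wt is None or var is None:
        return None
    return wt / var


if __name__ == "__main__":
    # Hypothetical twofold series: 1, 2, 4, ..., 1024.
    factors = [2 ** i for i in range(11)]
    wild_type = {f: f <= 512 for f in factors}  # zones seen up to 1:512
    variant = {f: f <= 4 for f in factors}      # zones seen up to 1:4
    print(fold_reduction(wild_type, variant))
```

With a twofold series the endpoint is resolved only to within a factor of two, so fold reductions obtained this way are best read as ranges.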
Here we have demonstrated that the highly constrained lasso topology of the antimicrobial peptide MccJ25 can be computationally redesigned. Two levels of experimental verification were performed. First, we tested whether the redesigned variants were produced and correctly exported using HPLC analysis of the culture supernatants. This test also implicitly interrogates whether the peptide is correctly cyclized into its lasso conformation, since uncyclized peptide likely would not be exported by McjD. Second, we tested the variants for retention of antimicrobial activity against the MccJ25-sensitive strain Salmonella newport. This test of MccJ25 function is multifaceted; successful killing of this bacterium requires recognition and internalization of the antibiotic by the outer membrane protein FhuA (Salomon and Farias, 1993), transport of the MccJ25 variant across the cytoplasmic membrane by SbmA (de Cristobal et al., 2006) and recognition of the peptide by RNAP. It should be noted that an alternative mode of action for MccJ25 has been proposed in which the antibiotic causes an increase in reactive oxygen species production via disruption of the respiratory chain (Bellomio et al., 2007). It is possible that our variants kill S.newport by this mode of action rather than by the canonical killing mechanism of RNAP inhibition. Six of the eight redesigned MccJ25 variants tested in this work were successfully produced and exported by E.coli, while three of these six were able to kill S.newport, albeit not as efficaciously as the wild-type peptide.
The finding that the mut1–mut3 and mut5–mut7 peptides were produced demonstrates that the computational redesign algorithm can be successful even on a very stable, highly constrained structure like the lasso architecture. Moreover, this result provides additional information regarding the question of the extent of amino acid sequence space that the lasso fold can occupy. Though the structure of MccJ25 is quite rigid, it can accommodate multiple amino acid substitutions. Of the six variants that are produced, five contain three amino acid substitutions while the sixth has two substitutions. In the case of the triple variants, one in every seven amino acids in the peptide has been changed while still retaining the ability to fold into the lasso conformation. Some of the recurring amino acid changes in the variants that are produced and exported are quite non-conservative, such as the A3Q substitution that occurs in mut3 and mut5–mut7. The Val-6 position is replaced by either tryptophan or tyrosine, both much larger amino acids, in all of the exported variants. In general, our design variants have amino acid changes that result in increases in sidechain volume (Table II). Future redesign efforts of this peptide could focus on the amino acids A3 and V6, both of which are frequently changed in our designed peptides. These results build on previous observations (Pavlova et al., 2008; Knappe et al., 2009) that the sequence space that can be occupied by lasso peptides may actually be quite large and diverse. Fig. 6 shows a sequence alignment of the four structurally confirmed lasso peptides and the six exported variants from this study. The fact that the aa substitutions we observe in our successful redesigns are largely not present in known lasso peptides (with the exception of A3Q and V6Y, which are found in lariatin) affirms the use of the redesign technique in probing the set of sequences that can adopt the lasso architecture.
The computational redesign algorithm functions solely at the structural level to determine a sequence that will fit a given structure or set of structures. In other words, the algorithm does not directly take into account the function of the peptide or protein being designed. Thus, it is fairly remarkable that three of the eight redesigned MccJ25 variants retain some antimicrobial activity. As mentioned above, Pavlova et al. published a study in which a near-complete set of single amino acid variants of MccJ25 was tested sequentially for production and export, for inhibition of RNAP and for antimicrobial activity against E.coli or Shigella flexneri (Pavlova et al., 2008). Most of the amino acid substitutions we find in our computational redesign of MccJ25 are tolerated substitutions according to this study. One possible exception is the V6Y substitution we observe in the mut5 peptide, which is both exported and has antimicrobial activity. Pavlova et al. report that the V6Y single variant is competent in export, but falls below the threshold (20% of wild-type activity) for inhibition of RNAP (Pavlova et al., 2008). Since variants that fell below this threshold were not tested for antimicrobial activity, it is possible that the V6Y single variant has weak antimicrobial activity on par with the weak activity we observed for mut5. It is also possible that mut5 is an antibiotic that functions via inhibition of respiration. An alternative explanation is that the V6Y substitution does cripple MccJ25, but one of the other aa substitutions in mut5 (A3Q and T15Q) plays a compensatory role and restores some function to the variant. At the opposite end of the spectrum, we also found a variant, mut4, that includes three aa substitutions (A3Q, V6W and V11L) that are each ‘tolerated’ according to Pavlova et al. but is not even exported.
Lasso peptides represent an intriguing architecture for future drug design (Everts, 2010), and the stability and protease resistance (Rebuffat et al., 2004) of these peptides eliminate some major limitations to using peptides as drugs. The first step toward the goal of using these molecules as drugs is an understanding of which sequences can be tolerated in a lasso fold. We demonstrate here that computational redesign is a valuable technique toward this end. Our group is also complementing these computational approaches with purely experimental approaches based on high-throughput screening of random libraries of MccJ25 for antimicrobial activity to determine more sequences that can exist in the lasso fold.
This work was supported by the National Science Foundation (CBET-0952875 to A.J.L. and CTS-0426691 to C.A.F.), the National Institutes of Health (R01GM52032 to C.A.F.) and Princeton University (startup funds to A.J.L.).