Methods for site-specific modification of proteins are in high demand. Reactions that yield bioconjugates should be quantitative, site-specific, and versatile with respect to nature and size of the biological/chemical targets involved, require minimal modification of the target, display acceptable kinetics under physiological conditions, and be orthogonal to other labeling methods. Sortase-mediated transpeptidation reactions meet these criteria. Here we describe the expression and purification conditions for two orthogonal sortase A enzymes and provide the protocol that allows functionalization of any given protein at its C-terminus or for select proteins at an internal site. Sortase-mediated reactions take only a few minutes, but reaction times can be extended to increase yields.
One of the goals of protein engineering is the installation of desirable features, template-encoded or otherwise, on proteins that naturally lack them. The ability to confer different functionalities onto a protein of interest enables a broad array of applications. Attachment of a fluorophore to a protein allows its use in live-cell microscopy, while generation of a fusion between an antibody and a payload of interest, such as a toxin or an antigen, can find use in therapeutics and vaccine development, respectively.
Several strategies based on genetic, chemical, enzymatic, or chemo-enzymatic methods equip proteins with functional groups. Genetic engineering is the method of choice when modification at a precise site is required. However, the effect of the genetically appended sequence on expression, folding, and function of the final product is difficult to predict. Some proteins are simply refractory to the construction of functional fusions by standard genetic means 1. The range of modifications that can be applied to a protein as a fusion product is limited in the first instance to those that are template-encoded. Chemical modifications of proteins are more versatile but lack precision, as they usually target exposed cysteine or lysine residues. Moreover, because the reactions often call for non-physiological reaction conditions (pH, reducing conditions, ionic composition), chemical damage of the target protein can occur. While enzymatic methods can overcome some of these drawbacks and afford site-specific protein modification, they often require one of the following: genetic installation of sizable (20–40 kDa) catalytically active protein domains, such as the O6-alkylguanine-DNA alkyltransferases (SNAP or CLIP-based technology 2,3) or haloalkane dehalogenases (halo-tag technology 4), onto the protein substrate; installation of the 15-amino acid BirA acceptor peptide 5, the use of which is limited to biotin and its synthetically demanding chemical derivatives; engineering of a 13-amino acid acceptor peptide for lipoic acid ligase 6, with the limitation that it primarily accepts lipid substrates and therefore mutant screens are required to incorporate new functionalities; the use of the formylglycine-generating enzyme that converts a cysteine residue within the context of a LCTPSR sequence (aldehyde tag) into formylglycine that can be used in oxime ligations 7,8; or exploiting the enzyme phosphopantetheinyl transferase to conjugate CoA-derived molecules to a specific 11-amino acid sequence 9.
Sortase-mediated transpeptidation reactions are a versatile complement to these protein modification strategies 10,11,12,13 and predominantly rely on the use of modified peptides, readily accessible by solid phase peptide synthesis using commercially available building blocks. Using sortases, we achieve labeling with a precision similar to that afforded by genetic fusions, and moreover gain ready access to protein derivatives that are unattainable genetically.
Gram-positive bacteria display proteins at their surface to enable them to acquire nutrients, evade host immunity, or adhere to sites of infection 14. Sortases comprise a family of membrane-associated transpeptidases that anchor those proteins to the cell wall 11,15. The different members of the sortase family can be divided into four subfamilies, based on their distinct primary sequence and substrates 16. While sortases A accept a large number of protein substrates, the sortases of B-D type have more specialized functions and therefore fewer substrates. In Staphylococcus aureus, proteins targeted to the bacterial surface display a conserved sortase recognition motif: Leu-Pro-Xxx-Thr-Gly (LPXTG, where X is any amino acid and Gly cannot be a free carboxylate), at or near their C-terminus. Upon recognition, sortase A cleaves between the threonine and glycine residues to form an acyl-enzyme intermediate. The active site cysteine of sortase A bonds with the carbonyl of the threonine residue of the target protein. This intermediate is then resolved by nucleophilic attack by the free amino group of the cell wall precursor lipid II. This lipid II-linked protein conjugate is incorporated during cell wall synthesis and consequently the protein is displayed at the surface 17 (Fig. 1).
Sortase-mediated reactions are applicable to any protein of interest, provided it comprises a LPXTG motif as the sortase target, or a suitably exposed Gly residue to serve as the incoming nucleophile. Both modifications (LPXTG, Gly) can be introduced using standard molecular cloning protocols. Also, sortases A are easily expressed in soluble recombinant form and in excellent yield in E. coli (see “Expression and production of sortases”). The natural nucleophile, lipid II, can be replaced by any peptide with an oligoglycine (Gly1-5) at the N-terminus. In turn, the peptides can be decorated with any molecule accessible through chemical synthesis (e.g., fluorophores, biotin, cross-linkers, lipids, carbohydrates, nucleic acids) 1,10,18,19,20,21,22,23, provided a free N-terminal Gly remains available on the peptide used as incoming nucleophile. Thus, incubation of sortase, LPXTG-containing protein, and nucleophile leads to the covalent attachment of that nucleophile to the protein of interest, in a site-specific manner. Because the oligoglycine peptide that serves as the nucleophile is functionalized beforehand, the chemical reaction conditions used to incorporate the functional group inflict damage on neither sortase nor the protein substrate, as long as the modified oligoglycine peptide remains in solution once added to the sortase reaction. Proteins can also serve as nucleophiles, provided they display a suitably exposed (stretch of) glycine(s) at their N-terminus (Gly1-5). Such modification allows the proteins to be N-terminally labeled with functionalized peptides 24,25 or to form protein-protein adducts 1,26,27.
Relying on a common mechanistic principle, sortagging affords ready access to a wealth of site-specific modifications: C-terminal 10,19,20, internal loop regions 1,28, N-terminal 24,25, and formation of cyclized (poly)peptides 29,30,31. We have mainly used sortases A derived from Staphylococcus aureus and Streptococcus pyogenes. Versions of Staphylococcus aureus sortase A with improved kcat values 32, as well as mutant versions that do not require Ca2+ ions 33, have been reported and further extend the range of reaction conditions and applications. Streptococcus pyogenes sortase A accepts dialanine (poly)peptides as nucleophiles 34. The possibility of using two orthogonal sortases increases the versatility of these labeling reactions, as one can attach two different labels to one and the same molecule of choice 24.
More than 50 different substrates, including peptides 21,30, soluble proteins 10,20,28,29, membrane proteins displayed at the cell surface 10,35, M13 bacteriophage 36, budding influenza virus 37, antibodies 22,26,27, bacterial toxins 1,24, and pre-assembled complexes 1,24, have yielded to sortase labeling. We have yet to encounter a protein that could not be labeled using sortase. One initial limitation of using sortase A from S. aureus was its obligate Ca2+-dependent activity 10. The presence of calcium in the reaction buffer precludes the use of phosphate-based buffers. Sortase A from S. pyogenes 34 and a mutant form of S. aureus sortase A (E105K/E108A 33) are Ca2+-independent and thus circumvent this limitation.
Not every lab is equipped to perform peptide synthesis, and commercial vendors provide such services. To assist those interested in synthesizing their own peptides, we have included protocols that describe the synthesis of probes of general utility (biotin and fluorophores). The synthesis requires minimal specialized equipment (reaction vessels for peptide synthesis) and involves reactions readily executed in a laboratory outfitted for biochemical work (fume hoods, appropriate organic waste disposal, lyophilizer).
The components of any sortagging reaction are: sortase, substrate, and nucleophile. Thus, we have divided this protocol into four sections: “Engineering substrates for sortagging”, “Expression and production of sortases”, “Peptide synthesis”, and “C-terminal sortagging reactions”.
Substrates equipped with the LPXTG recognition motif for S. aureus, or LPXTA for S. pyogenes can be engineered using standard molecular cloning protocols. Although any amino acid can precede the Thr residue, a glutamic acid is often used because it is commonly found in the natural sortase A substrates 38. The sortase recognition sequence is normally engineered at the C-terminus of the protein to be modified, with the G or A residue in amide linkage, followed by an affinity purification handle (e.g., His6) that is lost upon reaction (Fig. 2). The efficiency of the sortase-mediated reaction depends on the flexibility and accessibility of the region comprising the sortase A-recognition motif. Thus, if the C-terminus of the protein is known to be hidden (in the absence of a known structure, a failure to yield to sortase labeling would be a clear indication of lack of accessibility), we recommend engineering a flexible linker composed of (Gly4Ser)n preceding the LPETG/A sequence. The length of such linkers needs to be tested empirically for each protein of interest. Critical step. Identify the first amino acid of the protein substrate. Gly or Ala residues may cause cyclization as a competing side-reaction during labeling 29.
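The construct layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `build_sortase_substrate` and `cyclization_risk` are hypothetical, the input is taken to be the amino acid sequence of the protein as it will exist after purification, and the number of linker repeats still has to be tested empirically for each protein.

```python
import re

HIS6 = "HHHHHH"  # affinity handle that is lost upon the sortase reaction


def build_sortase_substrate(protein_seq, motif="LPETG", linker_repeats=0, tag=HIS6):
    """Return protein + (Gly4Ser)n linker + sortase motif + C-terminal tag.

    motif must look like LPXTG (S. aureus) or LPXTA (S. pyogenes).
    The tag after the motif keeps the final Gly/Ala in amide linkage.
    """
    if not re.fullmatch(r"LP[A-Z]T[GA]", motif):
        raise ValueError("motif must match LPXTG or LPXTA")
    if not tag:
        raise ValueError("at least one residue must follow the motif")
    linker = "GGGGS" * linker_repeats  # (Gly4Ser)n, n chosen empirically
    return protein_seq + linker + motif + tag


def cyclization_risk(protein_seq):
    """True if the first residue is Gly or Ala, which may compete as a nucleophile."""
    return protein_seq[:1] in ("G", "A")


# Hypothetical example sequence, for illustration only.
construct = build_sortase_substrate("MKTAYIAK", motif="LPETG", linker_repeats=2)
print(construct)
print("cyclization risk:", cyclization_risk("MKTAYIAK"))
```

The check on the first residue mirrors the critical step above; it flags only the N-terminal Gly/Ala case and says nothing about accessibility of the C-terminus.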
Site-specific modification of an internal solvent-exposed region in the protein of interest is a particular case of C-terminal labeling. As long as the LPXTG/A motif is introduced in an unstructured segment of the substrate protein, sortase can recognize it 28. An LPXTG motif that is highly structured is usually a poor sortase substrate 10,28. Flexibility can be ensured through installation of a specific protease cleavage site, immediately downstream of the sortase motif 1. Upon cleavage, the newly exposed C-terminus including the LPETG/A motif is likely to be unstructured. Accessibility to protease cleavage is a useful indicator for successful execution of a subsequent sortase reaction. Depending on the sequence of the protein, we rely on established site-directed mutagenesis strategies to engineer or to insert a LPXTG/A motif at the intended site. Since sortase cleaves the protein at the site of recognition, it is likely that the two halves of the protein will separate upon sortagging unless otherwise stabilized, for example through a disulfide bond 1 or for topological reasons 28. Thus, selecting an internal location for a sortase site in a loop region constrained by a disulfide bond might minimize the risk of such disintegration (Fig. 3). Critical step. Identifying the adequate protease to use has to be tested empirically to ensure that the protease does not cleave elsewhere within the protein. Trypsin and Factor Xa are examples of proteases used for this purpose 1.
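Before testing a protease at the bench, a sequence-level screen can list where it might cut. The sketch below covers trypsin only and assumes the common simplification that trypsin cleaves on the C-terminal side of Lys or Arg unless the next residue is Pro. The function names and example sequences are hypothetical. A site listed here may well be inaccessible in the folded protein, so this is a first filter, not a substitute for the empirical test.

```python
import re


def trypsin_sites(seq):
    """1-based positions of K/R residues after which trypsin may cleave.

    Simplified rule (an assumption): cut after K or R when the following
    residue is not P. A K/R at the very C-terminus is ignored.
    """
    return [m.start() + 1 for m in re.finditer(r"[KR](?=[^P])", seq)]


def off_target_sites(seq, engineered_pos):
    """Candidate trypsin sites other than the one engineered after the sortase motif."""
    return [p for p in trypsin_sites(seq) if p != engineered_pos]


# Hypothetical loop sequence: LPETG followed by an Arg at position 9.
seq = "MKALPETGRSW"
print(trypsin_sites(seq))        # candidate sites, including the engineered one
print(off_target_sites(seq, 9))  # sites that would need attention
```

A non-empty result from `off_target_sites` suggests either choosing a more selective protease such as Factor Xa or confirming experimentally that the extra sites are protected in the folded protein.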
We commonly use three different sortases: the Ca2+-dependent sortase A from S. aureus, its mutant version (E105K/E108A 33), and sortase A from S. pyogenes; the latter two are Ca2+-independent. Because sortase is a membrane protein in Gram-positive bacteria, we use versions where the transmembrane domain has been eliminated and replaced with a hexahistidine purification tag. Two soluble versions exist for S. aureus wild-type sortase A, with an N-terminal deletion of either 25 amino acids 17 or 59 amino acids. The enzymatic activity of both versions is identical 39, but the molecular weight is different. This is a useful trait to exploit in those cases where the molecular weights of the protein to be labeled and of the sortase to be used are similar. In addition, we equipped the Δ59 truncated version with a thrombin cleavage site that releases the His tag upon digestion. This not only facilitates further downstream purification but also increases the mobility of sortase in SDS-PAGE gels, allowing a clear-cut distinction between sortase and substrate if so required.
The following standard procedure for protein expression and purification yields approximately 40 mg L−1 S. aureus sortase A and 10 mg L−1 S. pyogenes sortase A.
We here describe the manual synthesis of several peptides that can be used in reactions mediated by sortase A from S. aureus, and that therefore contain glycine residues at the N-terminus. For reactions using sortase A from S. pyogenes, an alanine-based peptide is used as a nucleophile. Its synthesis follows the same protocol except for the use of Fmoc-Ala-OH in place of Fmoc-Gly-OH. Perform two or three repeated couplings of Fmoc-Ala-OH to obtain the di- and tri-alanine sequence, respectively. Also, Fmoc-Lys(biotin)-OH, Fmoc-Lys(5-TAMRA)-OH, and other conjugates can be purchased as premade building blocks. These building blocks are more expensive, but reduce the time required for synthesis and ease purification of the desired final product.
Monitor peptide couplings by performing a Kaiser test 40.
In-solution coupling of NHS esters to (partially) protected peptides provides a route to couple base/acid-labile and/or more costly dyes (such as Alexa Fluor 647) or other suitable precursors to a GGG-containing peptide.
An alternative for the installation of chemical groups of interest onto a peptide is to use a cysteine-maleimide reaction. The procedure is similar to that of the GGGK NHS ester peptide, except the lysine is replaced by a cysteine and a maleimide-derived dye/probe is substituted for the NHS ester.
Note: Under the appropriate conditions, the maleimide will react with the cysteine exclusively and therefore fully deprotected peptides can be used.
Note: Prolonged storage of the probe in aqueous solution can result in hydrolysis of the maleimide. The resulting ring-opened product is still an excellent probe for the sortase reaction, but may cause difficulty during ion-exchange purification of the sortagged proteins. It is therefore crucial that probes containing maleimide-coupled functional groups are stored as lyophilized powders.
A successful sortase reaction often results in the formation of the acyl-enzyme intermediate if the oligoglycine probe is omitted. Small quantities of hydrolysis product (loss of (His)6 or epitope tag) may occur upon cleavage of the sortase motif in the absence of added nucleophile. The acyl-enzyme intermediate usually survives reducing SDS-PAGE in detectable amounts. A full sortase reaction often yields a reaction product of mobility distinct from that of the input substrate and the hydrolysis product. The ability to distinguish the various intermediates critically depends on the molecular weight of the anticipated products and the gel systems used to analyze them.
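To anticipate which bands to look for, the sequences of the expected species can be derived from the substrate and nucleophile sequences. The following is a minimal sketch, assuming single-letter sequences and that the last LPXTG/A occurrence in the substrate is the engineered site. The function name `sortase_species` and the example sequences are hypothetical. The acyl-enzyme intermediate is not listed as a sequence: it corresponds to the hydrolysis-product fragment linked through its threonine to the sortase cysteine.

```python
import re


def sortase_species(substrate, nucleophile):
    """Sequences of the species expected around a sortase reaction.

    Sortase cleaves between Thr and Gly/Ala of the motif; the fragment
    ending in Thr is either hydrolyzed or joined to the nucleophile.
    """
    matches = list(re.finditer(r"LP[A-Z]T(?=[GA])", substrate))
    if not matches:
        raise ValueError("no LPXTG/A motif found in substrate")
    cut = matches[-1].end()  # index just after the Thr residue
    acyl_part = substrate[:cut]
    return {
        "input_substrate": substrate,
        "hydrolysis_product": acyl_part,          # lost the Gly/Ala + tag
        "leaving_group": substrate[cut:],         # Gly/Ala + tag (e.g. His6)
        "labeled_product": acyl_part + nucleophile,
    }


# Hypothetical substrate with LPETG-G-His6 and a GGGK probe backbone.
for name, seq in sortase_species("MSKGEELPETGGHHHHHH", "GGGK").items():
    print(f"{name:20s} {len(seq):3d} residues  {seq}")
```

Residue counts give only a rough guide to gel mobility; any dye or biotin on the probe adds mass that is not captured by the sequence alone.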
Overview: Because the protein substrate is constructed with a His6 handle downstream of the LPXTG/A motif (see section “Engineering substrates for sortagging”), the protein substrate molecules that were not labeled will retain the His6 tag upon reaction. The sortases are themselves tagged with His6. Thus, a convenient strategy to separate the labeled product from sortase and unreacted substrate is the use of affinity chromatography (Ni-NTA beads), followed by HPLC or by a desalting column to remove the unreacted peptide nucleophile. Critical step. Check compatibility of the incorporated functionality with Ni-NTA purification. We have noted that some functional groups, including acylhydrazones, bind to the resin.
The exact reaction conditions for protein labeling need to be determined empirically for each protein substrate. The range provided has yielded acceptable results in the majority of the cases. To achieve maximal levels of labeling, it is helpful to titrate the sortase, substrate, and probe concentrations relative to one another, to vary the time of labeling, and the temperature of reaction.
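A titration of this kind can be laid out as a grid before pipetting. The sketch below is illustrative only: the function names are hypothetical, and every concentration, time, and temperature in the example is a placeholder chosen for demonstration, not a value recommended by this protocol.

```python
from itertools import product


def titration_grid(sortase_uM, substrate_uM, probe_uM, times_min, temps_C):
    """All combinations of the given conditions, one dict per reaction."""
    return [
        {"sortase_uM": s, "substrate_uM": p, "probe_uM": n,
         "time_min": t, "temp_C": c}
        for s, p, n, t, c in product(sortase_uM, substrate_uM, probe_uM,
                                     times_min, temps_C)
    ]


def stock_volume_uL(final_conc, stock_conc, final_volume_uL):
    """Volume of stock needed (C1*V1 = C2*V2); concentrations in the same unit."""
    return final_conc * final_volume_uL / stock_conc


# Placeholder values only (assumptions for illustration).
grid = titration_grid(
    sortase_uM=[5, 20],
    substrate_uM=[50],
    probe_uM=[500, 1000],
    times_min=[30, 120],
    temps_C=[25],
)
print(len(grid), "reactions")
for rxn in grid:
    # Example: probe volume from a hypothetical 10 mM (10000 uM) stock, 50 uL reaction.
    vol = stock_volume_uL(rxn["probe_uM"], 10000, 50)
    print(rxn, f"probe stock: {vol:.2f} uL")
```

Varying one or two parameters at a time keeps the number of reactions manageable; a full grid grows multiplicatively with each added condition.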
| Problem | Possible reason | Solution |
|---|---|---|
| No sortase expression | Wrong antibiotic selection | Confirm you used the right antibiotic |
| | No induction | Make sure you are using BL21(DE3). Prepare a fresh solution of IPTG |
| No protein labeling | No acyl-intermediate is being formed | (see below) |
| | Not enough nucleophile added to the reaction | Reaction conditions have to be determined ad hoc. Increase the amount of nucleophile to 10 mM and/or decrease the amount of substrate |
| | pH of the reaction buffer not compatible with sortase activity | Ensure that the pH of the reaction buffer is neutral. Check the pH of the stock solutions. Probe solutions tend to have a low pH due to residual traces of TFA. Multiple rounds of lyophilization and/or neutralization with buffer or aq. NaHCO3 solve this issue |
| | Proteolysis of sortase | Verify the integrity of sortase upon reaction by Coomassie staining or anti-His blot. The amount of sortase before and after reaction should be equal. If not, consider the presence of a contaminating protease, which most probably co-purified with the protein of interest. We recommend further purification of the protein substrate. |
| No acyl-intermediate is observed in the substrate-sortase control | The substrate is not correctly engineered | Confirm that the LPXTG/A motif is in frame and that no stop codons exist upstream of this sequence. Ensure that one or a few amino acids are present downstream of the Gly/Ala residues (Gly/Ala in LPXTG/A must be in amide linkage) |
| | The LPXTG/A motif is not well exposed or it is designed in a conformationally constrained region of the protein | Extend the C-terminus of the protein by introducing a linker (Gly4Ser)2 immediately upstream of the sortase motif |
| | Sortase is inactive | Test the preparation of sortase using GFP.LPETG/A.His6 as the substrate |
| | Not enough sortase added to the reaction | Sortase concentration has to be titrated for each substrate to be labeled. Increase the sortase concentration and/or decrease the amount of protein to be labeled |
| Detection of a circular form of the protein when attempting to label the C-terminus. The circular versions of such sortase substrates often show more rapid migration on SDS-PAGE than their linear counterparts. | The N- and C-termini of the protein are in close proximity and the N-terminal amino acid of the protein (Gly or Ala) acts as a nucleophile | Confirm that the first amino acid in the protein of interest is not a Gly or Ala. If it is, mutate it to Ser, for example. If the first amino acid has to be a glycine or an alanine, design a thrombin cleavage site (LVPR) immediately upstream of the first amino acid and proceed to digestion with thrombin upon sortase labeling, to expose the Gly or Ala residues. It is possible that a Lys residue close to the N-terminus can serve as the incoming nucleophile for circularization. Consider this possibility if replacement of N-terminal Gly or Ala fails to suppress circularization. |
| Detection of a white fluffy precipitate during the sortase-labeling reaction | Ca2+ precipitate | Do not use phosphate buffers |
| Protein is precipitating during the labeling reaction | High concentration of protein, especially when attempting protein-protein fusions | Optimize reaction temperature and incubation time so that less protein may be used to achieve the same reaction yield without precipitation |
| | The protein of interest precipitates at high temperatures | Perform the labeling reaction at RT and extend the reaction time |
| No labeling when attempting modification at an internal site even when the site is nicked by a protease | The region is not flexible or accessible | Extend the sequence of the protein by a few amino acids. The number of amino acids required must be determined empirically. |