The ability to pursue NMR studies of large proteins and biological complexes has steadily improved with advances in technology. NMR investigations of multi-domain proteins have benefited in particular from the development of segmental labeling techniques that permit selective observation of a single domain out of several. Segmental labeling requires a protein ligation method to covalently attach an isotopically labeled protein domain to another, unlabeled one for selective NMR detection. Multiple ligation techniques have been demonstrated in the literature, primarily native chemical ligation, expressed protein ligation (EPL), protein trans-splicing (PTS) and chemical modification (Dawson et al. 1994; Severinov and Muir 1998; Muralidharan and Muir 2006; Nilsson et al. 2003). Among these, EPL and PTS have been the most successful for segmental labeling of proteins for NMR study (Muona et al. 2010; Skrisovska et al. 2010).
Despite the appeal of using ligases and transpeptidases for segmental labeling, the use of enzymes for this particular purpose has not yet been demonstrated (Skrisovska et al. 2010; Iwai and Züger 2007; Kobashigawa et al. 2009). A key advantage of enzymatic protein ligation is that the reaction can be conducted at near-physiological temperature and pH. Furthermore, enzymatic ligation avoids the need for cysteines in the substrates and the associated concerns regarding redox chemistry, both of which are integral to the more commonly employed EPL and PTS methods. Consequently, enzymatic ligation is potentially applicable to a wider range of proteins. In general, the ability to ligate proteins using relatively inert amino acid sequences presents greater opportunities for studies of multi-domain proteins by structural biologists.
Sortase A is a transpeptidase found in the Gram-positive bacterium S. aureus. In 2004, it was demonstrated that the sortase enzyme can be used to ligate two proteins together in vitro (Mao et al. 2004). Sortase ligations are specific because the enzyme requires an ‘LPXTG’ sequence motif at the end of the N-terminal chain as well as at least one glycine at the N-terminus of the C-terminal chain (Fig. 1).
Fig. 1 Scheme of protein ligation by sortase A enzyme. The consensus sortase residues required at the end of the N-domain are shown (in red) followed by an ‘R’ group (typically His6, in black). Sortase-specific glycine residue(s) required at …
Sortase-mediated ligation was used here to segmentally label MecA, a two-domain B. subtilis protein that proteolytically regulates bacterial competence by targeting a transcription factor for ATP-dependent hydrolysis. Structural studies of MecA have focused primarily upon the individual domains, as evidenced by the relatively recent X-ray study of the C-terminal domain of this protein (Wang et al. 2009). Detailed NMR analysis of the entire MecA protein would benefit from segmental labeling of specific domains to reduce resonance overlap. Another issue concerns protein dimerization, since both the intact protein and its individual domains form homodimers (Persuh et al. 1999). Line broadening concomitant with dimerization can further contribute to resonance congestion. Segmental labeling of MecA would therefore enhance NMR studies of this multi-domain protein.
The overall feasibility of this sortase-mediated segmental labeling approach relies upon several factors. First, the two domains of interest must be connected by a linker sequence of suitable length (five or more residues). Second, the five-residue sortase consensus sequence, ‘LPXTG’, must exist somewhere within this linker. Third, an N-terminal glycine cap must be present at the beginning of the C-terminal domain. Finally, all of these residues must be sterically accessible to sortase (Fig. 1).
To determine whether sortase-mediated ligation would be feasible for segmental labeling of MecA, the amino acid sequence and domain organization of this protein were examined. The two domains are connected via a linker region corresponding to residues 83–93 (Persuh et al. 1999). The last five residues of this linker, LPIPE, were targeted as a potential sortase ligation site since mutation of residues P92 and E93 to T and G, respectively, results in the sortase-compatible sequence, LPITG. The MecA P92T/E93G double mutant represents the starting point for use of sortase-mediated ligation for MecA segmental labeling. Consequently, this particular MecA double mutant (‘MecAmut’), as well as several slightly modified forms of the individual domains, were generated for this study. One domain modification is the addition of a C-terminal hexa-histidine sequence to the MecA N-domain (‘MecAN-His’). A second is the addition of a second glycine at the N-terminus of the MecA C-domain (‘MecAC2G’). Both of these modifications have been documented to improve ligation yields (Mao et al. 2004; Parthasarathy et al. 2007; Clow et al. 2008; Kobashigawa et al. 2009). The amino acid sequences of these various protein constructs are provided in Supplementary Information (Figure S1).
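The sequence requirements described above can be checked mechanically. The following is a minimal sketch, not the procedure used in this study: it compares a five-residue window against the LPXTG consensus, lists the substitutions needed to match it, and counts glycines at the N-terminus of a partner sequence. The function names are illustrative, and positions are reported relative to the window rather than in MecA numbering. Only the LPIPE window and the LPITG target are taken from the text.

```python
# Sketch: compare a five-residue linker window with the sortase A consensus.
# All function names are illustrative and do not come from the paper.

CONSENSUS = "LPXTG"  # 'X' stands for any residue


def substitutions_to_motif(window, consensus=CONSENSUS):
    """Return (position, old, new) tuples needed to match the consensus.

    Positions are 1-based and relative to the start of the window.
    """
    if len(window) != len(consensus):
        raise ValueError("window and consensus must have the same length")
    changes = []
    for pos, (have, want) in enumerate(zip(window, consensus), start=1):
        if want != "X" and have != want:
            changes.append((pos, have, want))
    return changes


def apply_substitutions(window, changes):
    """Apply (position, old, new) substitutions to a window."""
    residues = list(window)
    for pos, _old, new in changes:
        residues[pos - 1] = new
    return "".join(residues)


def n_terminal_glycines(sequence):
    """Count consecutive glycines at the N-terminus of a sequence."""
    return len(sequence) - len(sequence.lstrip("G"))


changes = substitutions_to_motif("LPIPE")
mutated = apply_substitutions("LPIPE", changes)
print(changes)
print(mutated)
```

For the LPIPE window, the sketch reports two substitutions, at window positions 4 and 5, which is the double mutation described in the text. `n_terminal_glycines` could be applied in the same way to a C-domain construct to confirm a single or double glycine cap.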
A preliminary 1 ml test ligation was conducted by incubating unlabeled protein domain substrates with sortase at room temperature for 24 h. Based upon previous studies (Lu et al. 2007), the final reaction component concentrations were 12.4 μM MecAN-His, 42.2 μM MecAC and 50 μM sortase in a buffer of 50 mM Tris, 150 mM NaCl, 5 mM CaCl2 and 2 mM BME at pH 7.5. The reaction was conducted within a 6–8 kDa MWCO dialysis bag with continuous dialysis against the same reaction buffer (Pritz et al. 2007). Reaction samples removed at 0, 2, 4, 6 and 24 h were analyzed by SDS-PAGE.
Over time, a new protein band appeared and increased in intensity (Fig. 2). To verify the identity of this band, the gel was submitted for peptide sequencing analysis by mass spectrometry. The MecAmut and new gel bands were excised and digested with trypsin. MS/MS sequencing of the resultant tryptic peptide fragments indicated that the sequences of both proteins correlated well with native MecA (Eismann et al. 2008). Since the fragments spanned the N- and C-terminal domains, the unknown band corresponds to a ligated product composed of both domains (Fig. 2).
Fig. 2 Confirmation of successful sortase ligation by SDS-PAGE and mass spectrometry. The SDS-PAGE gel is shown on the left. The left lane corresponds to double mutant MecAmut (band ‘A’) while the remaining lanes correspond to the ligation mixture …
Following the demonstration that sortase ligation of the N- and C-terminal domains produces full-length MecAmut, conditions for a large-scale sortase ligation reaction were optimized. Initial reactions performed in batch mode at 37°C resulted in low product yields. Dialysis of the reaction was then employed at room temperature to favor formation of product. At the lower temperature, equilibrium product formation was higher and stable for up to 5 days, with maximal product observed via SDS-PAGE by 72 h. Under room temperature and dialysis conditions, the sortase A concentration was varied from 2 to 50 μM, with the optimal concentration determined to be 5 μM. MecAN-His:MecAC concentration ratios ranging from 1:1 to 1:5 were subsequently tested. In general, saturating the reaction with high concentrations of C-terminal protein relative to N-terminal protein results in more efficient ligations with minimal hydrolysis (Mao et al. 2004). Finally, formation of MecA ligation product was further improved when the C-terminal domain was capped with a double rather than a single glycine at its N-terminus (Parthasarathy et al. 2007). This ligation product, ‘MecAmut2G’, contains an extra glycine within the linker relative to MecAmut (Figure S1).
To generate a segmentally labeled form of MecA in which only the N-terminal domain is 15N-labeled, 15N-labeled MecAN-His and unlabeled MecAC2G substrates were mixed and incubated with sortase in a buffer of 50 mM Tris, 150 mM NaCl, 10 mM CaCl2 and 2 mM BME at pH 7.5. The final concentrations of reaction components were 5 μM sortase, 15 μM 15N-MecAN-His and 51 μM unlabeled MecAC2G in 26 ml. The reaction was conducted within a 6–8 kDa MWCO dialysis bag that was incubated for 72–96 h at 20–22°C with daily reaction buffer exchange and monitoring via SDS-PAGE (Figure S2).
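As a rough illustration of the reaction stoichiometry, the sketch below converts the stated concentrations and the 26 ml volume into molar amounts and computes the C-domain to N-domain ratio. The concentrations and volume are taken from the text. The variable and function names are illustrative only.

```python
# Sketch: molar amounts and substrate ratio for the 26 ml labeling reaction.
# Concentrations and volume come from the text; names are illustrative.


def nanomoles(conc_um, volume_ml):
    """Amount in nmol for a concentration in micromolar and a volume in ml."""
    return conc_um * volume_ml  # (umol/L) * (mL) = nmol


VOLUME_ML = 26.0
conc_um = {"sortase": 5.0, "15N-MecAN-His": 15.0, "MecAC2G": 51.0}

amounts_nmol = {name: nanomoles(c, VOLUME_ML) for name, c in conc_um.items()}
c_to_n_ratio = conc_um["MecAC2G"] / conc_um["15N-MecAN-His"]

for name, nmol in amounts_nmol.items():
    print(f"{name}: {nmol:.0f} nmol")
print(f"MecAN-His:MecAC2G = 1:{c_to_n_ratio:.1f}")
```

On these numbers, the labeled N-domain is the limiting substrate (about 390 nmol), and the roughly 1:3.4 substrate ratio lies within the 1:1 to 1:5 range that was tested during optimization.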
Once product formation reached steady state (approximately 72 h), the reaction was quenched by addition of EDTA (20 mM final concentration; Parthasarathy et al. 2007), followed by overnight dialysis at 8°C against a buffer of 50 mM Tris, 150 mM NaCl and 5 mM EDTA at pH 7.5. The mixture was concentrated via ultrafiltration (Millipore Ultra-free 3 kDa) and purified by size-exclusion FPLC (Superdex 75). Fractions containing MecAmut were verified by SDS-PAGE (Figure S3) before they were pooled and concentrated via ultrafiltration. The crude yield of conjugate, based upon SDS-PAGE and FPLC peak integration (Figure S3), is estimated to be 60%. The NMR sample corresponded to 0.5 mM protein (quantitated with Bradford reagent, Bio-Rad) in a 90%:10% H2O:D2O, 25 mM sodium phosphate, 5 mM BME, 5 mM EDTA, pH 7.5 buffer. The final quantity of purified, segmentally 15N-labeled MecAN-MecAC conjugate obtained was 4 mg. This corresponds to a final yield of 40%, calculated relative to the limiting reagent, 15N-MecAN-His.
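The yield figures can be cross-checked with simple arithmetic. The sketch below is a consistency check under stated assumptions, not a calculation reported by the authors. It takes the limiting substrate amount (15 μM in 26 ml), the 40% final yield, the 4 mg of product and the 0.5 mM sample concentration from the text. The molecular weight of the conjugate is not given in this section, so the value computed here is only the one implied by those numbers. The purification recovery assumes that the 60% crude and 40% final yields are both expressed relative to the same limiting substrate.

```python
# Sketch: back-of-the-envelope check of the reported yields.
# Inputs are from the text; the derived quantities are implied values only.

LIMITING_CONC_UM = 15.0   # 15N-MecAN-His concentration
VOLUME_ML = 26.0          # reaction volume
FINAL_YIELD = 0.40        # purified yield relative to limiting substrate
CRUDE_YIELD = 0.60        # crude yield estimated from gel and FPLC
PRODUCT_MASS_MG = 4.0     # purified conjugate
NMR_CONC_MM = 0.5         # NMR sample concentration

limiting_nmol = LIMITING_CONC_UM * VOLUME_ML       # (umol/L) * mL = nmol
product_nmol = FINAL_YIELD * limiting_nmol

# mg / nmol = 1e6 g/mol, so multiply by 1000 to express the result in kDa.
implied_mw_kda = PRODUCT_MASS_MG / product_nmol * 1000.0

# nmol / mM = microliters
nmr_volume_ul = product_nmol / NMR_CONC_MM

# Fraction of crude product recovered after purification (assumption above).
purification_recovery = FINAL_YIELD / CRUDE_YIELD

print(f"limiting substrate: {limiting_nmol:.0f} nmol")
print(f"purified product:   {product_nmol:.0f} nmol")
print(f"implied MW:         {implied_mw_kda:.1f} kDa")
print(f"sample volume:      {nmr_volume_ul:.0f} uL at {NMR_CONC_MM} mM")
print(f"recovery:           {purification_recovery:.0%}")
```

Under these assumptions the numbers are mutually consistent: about 156 nmol of product at 4 mg implies a molecular weight of roughly 26 kDa, enough material for a single NMR sample of about 0.3 ml at 0.5 mM.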
Sortase-mediated ligations of protein-peptide or peptide-peptide systems have maximal efficiencies approaching 70% (Pritz et al. 2007; Mao et al. 2004). Reported protein–protein ligation efficiencies have varied from low (Mao et al. 2004) to 90% (Kobashigawa et al. 2009). For the latter, 90% represents an apparent ligation efficiency that is based upon qualitative disappearance of the N-terminal domain substrate band on SDS-PAGE (Kobashigawa et al. 2009). For our study, we chose to report traditional product yields because gel band intensities can decrease for various reasons, including degradation as well as ligation progress. Based upon the amount of conjugate protein obtained relative to the amount of limiting protein substrate, the crude and final purified yields are 60 and 40%, respectively, for our system. These yields may also be affected by the dimeric properties of both MecA protein domains.
In general, the final recovery of sortase-mediated ligation product will depend upon linker accessibility as well as efficiency of product purification. Based upon these results, we believe that sortase A represents a viable alternative for segmental labeling of multi-domain proteins.
One-dimensional proton NMR spectra were recorded from the segmentally labeled MecA protein using various edit/filter options. In Figure S4, effects of segmental labeling are evident from observation of only a subset of 15N-coupled amide proton resonances in the 15N-edited spectrum, relative to those recorded in a reference, unedited/unfiltered spectrum. The number and pattern of observable 15N-edited proton resonances are also distinct from those of the 14N-coupled amide proton resonances observed in the 15N-filtered spectrum (Figure S4).
Comparison of 2D 1H-15N HSQC spectra recorded from the native and double mutant forms of two-domain MecA indicates that both proteins adopt the same global fold, with only a very small number of observed resonance shifts (less than 5%, Figure S5). Comparison of the 2D 1H-15N HSQC spectra of the segmentally labeled 15N-MecAN-MecAC (MecAmut2G) conjugate and fully 15N-labeled MecAmut indicates that the conjugate yields far fewer resonance peaks, with minimal differences between the N-terminal domain resonance patterns of the two (Fig. 3).
Fig. 3 2D 1H-15N HSQC spectra of fully 15N-labeled MecA double mutant (MecAmut) (top left) and N-terminally labeled 15N-MecAN-MecAC (MecAmut2G) protein conjugate (top right). Overlay of the two spectra is shown at bottom (black: fully labeled protein, …)
These results represent the first application of sortase-mediated ligation for segmental labeling of a multi-domain protein. This enzymatic approach involves micromolar protein domain concentrations and near-physiological conditions. Although the native and ligated MecA proteins differ by three amino acids within the linker region, the small NMR differences detected indicate that this change does not significantly affect the global structure of the segmentally labeled protein. Truncation mutants at the N-terminus of the C-terminal domain of MecA that encompass the LPIPE target site suggest that this linker region is quite flexible, as judged by strong resonance intensities and predominantly random-coil chemical shifts in the corresponding 2D 1H-15N HSQC spectra (Cavanagh et al. unpublished data). The NMR results provide no evidence for any significant changes in inter-domain contacts in MecA arising from the ligation process.
Segmental isotopic labeling via sortase-mediated ligation has significant implications for improving the spectral quality and simplifying analyses of multi-domain proteins for structural biology studies by NMR or other methods. This methodology can be employed for example to support NMR experiments designed to detect inter-domain NOEs from proteins with different domain isotope labeling patterns, and to facilitate characterization of individual domain dynamics via NMR spin relaxation studies.