A simple, general method to selectively introduce metal ion binding sites into polypeptides would greatly facilitate the engineering of catalytic and redox active sites, radioisotope binding sites, structural elements and spectroscopic probes into proteins.1 For example, recently it was shown that the genetic introduction of (2,2'-bipyridin-5-yl)alanine (Bpy-Ala) into the DNA binding protein catabolite activator protein generated a Cu2+ dependent oxidative DNA cleaving agent.2 Here we report that an amino acid derivative (HQ-Ala, 1) of the metal ion chelating group 8-hydroxyquinoline, which forms highly stable complexes with most transition metal ions and some lanthanides,3,4 can be genetically encoded in E. coli in response to the amber codon, TAG. Moreover, we show that addition of Zn2+ to HQ-Ala containing proteins results in the formation of both a fluorescent probe and of a heavy metal binding site for SAD phasing in protein crystallographic structure determination.
2-Amino-3-(8-hydroxyquinolin-3-yl)propanoic acid (HQ-Ala, 1) was synthesized in three steps starting from 3-methylquinolin-8-yl acetate5 (2) (Scheme 1). Bromination of 2 by NBS and AIBN afforded the bromomethyl quinoline intermediate, to which diethyl acetamidomalonate was added. Subsequent acidic decarboxylation and hydrolysis afforded HQ-Ala in racemic form. To genetically encode HQ-Ala in E. coli, an orthogonal Methanococcus jannaschii amber suppressor tRNA (MjtRNA)/tyrosyl-tRNA synthetase (MjTyrRS) pair was used.6 To alter the amino acid specificity of MjTyrRS, a mutant library2a was generated by randomizing (NNK) nine active site residues (Y32, L65, H70, F108, Q109, Q155, D158, I159 and L162). The library was then subjected to alternating rounds of positive and negative selection. In the positive selection, cell survival depends on suppression of an amber mutation (D112) in the chloramphenicol acetyltransferase gene in the presence of the unnatural amino acid. In the negative selection, which is carried out in the absence of the unnatural amino acid, suppression of amber mutations (Q2, D44 and G65) in the gene for the toxic protein barnase kills the cell, so only clones that do not charge the tRNA with an endogenous amino acid survive.
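The theoretical size of an NNK library over nine positions can be worked out directly from the genetic code. The following is a minimal sketch, not part of the original protocol; the variable names are illustrative, and the numbers are upper bounds on sequence space rather than the size of the library actually screened.

```python
from itertools import product

# NNK degenerate codon: N = any base, K = G or T.
NNK_CODONS = ["".join(c) for c in product("ACGT", "ACGT", "GT")]

# Standard genetic code, with bases ordered T, C, A, G at each codon position.
_ORDER = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(_ORDER)
    for j, b in enumerate(_ORDER)
    for k, c in enumerate(_ORDER)
}

# Amino acids and stop codons reachable with a single NNK codon.
AMINO_ACIDS_ENCODED = sorted({CODON_TABLE[c] for c in NNK_CODONS} - {"*"})
STOP_CODONS_ENCODED = sorted(c for c in NNK_CODONS if CODON_TABLE[c] == "*")

N_POSITIONS = 9  # number of randomized active site residues stated in the text
DNA_DIVERSITY = len(NNK_CODONS) ** N_POSITIONS
PROTEIN_DIVERSITY = len(AMINO_ACIDS_ENCODED) ** N_POSITIONS

if __name__ == "__main__":
    print("codons per position:", len(NNK_CODONS))
    print("amino acids per position:", len(AMINO_ACIDS_ENCODED))
    print("stop codons per position:", STOP_CODONS_ENCODED)
    print("theoretical DNA diversity:", DNA_DIVERSITY)
    print("theoretical protein diversity:", PROTEIN_DIVERSITY)
```

Each NNK position offers 32 codons that cover all 20 canonical amino acids, with the amber codon TAG as the only stop. Nine positions therefore give 32^9 (about 3.5 × 10^13) DNA sequences and 20^9 (about 5.1 × 10^11) protein sequences.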
After three rounds of positive and two rounds of negative selection, 15 clones were isolated which survived on chloramphenicol only in the presence of HQ-Ala. DNA sequencing revealed eight unique mutants with HQ-3D4 corresponding to the most common sequence (Table 1). To determine the efficiency and fidelity with which the selected MjTyrRS incorporates HQ-Ala into proteins, a Z-domain protein with an amber codon at position 7 was expressed in E. coli in the presence of the MjtRNA/HQ-3D4 pair. Expression in glycerol minimal medium7 produced 1.5 mg/L full-length protein in the presence of 1 mM HQ-Ala, while no full-length protein was detected by SDS-PAGE analysis in the absence of HQ-Ala (Figure 1a). MALDI-TOF mass spectrometry (MS) analysis of the purified protein confirmed the incorporation of HQ-Ala (Figure 1b).
To examine whether binding of metal ions to HQ-Ala in proteins would create a site specific fluorescent reporter8, the HQ-Ala Z-domain mutant (10 μM) was titrated with Zn2+ and fluorescence from the HQ-Zn2+ complex was measured (Figure 2). The mutant protein was not fluorescent in the absence of Zn2+ when excited at 400 nm, but became fluorescent when Zn2+ was added; the fluorescence increased with increasing concentration of Zn2+. Thus the genetic incorporation of HQ-Ala into proteins should prove useful for the generation of biological metal ion sensors, and local fluorescent probes of protein structures, dynamics and ligand binding.
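A titration of this kind is commonly interpreted with a binding isotherm. The sketch below assumes a single 1:1 Zn2+ site per protein and a fluorescence signal proportional to the fraction of occupied sites; both the stoichiometry and the dissociation constant are assumptions made for illustration and are not values reported here.

```python
import math


def fraction_bound(protein_total, metal_total, kd):
    """Fraction of protein sites occupied by metal for 1:1 binding.

    All three arguments share the same concentration unit (for example uM).
    The exact quadratic solution is used so that depletion of free metal
    is handled correctly when the protein concentration is comparable to kd.
    """
    if protein_total <= 0 or metal_total < 0 or kd <= 0:
        raise ValueError("concentrations must be positive (metal may be zero)")
    s = protein_total + metal_total + kd
    complex_conc = (s - math.sqrt(s * s - 4.0 * protein_total * metal_total)) / 2.0
    return complex_conc / protein_total


if __name__ == "__main__":
    PROTEIN_UM = 10.0        # protein concentration used in the titration
    HYPOTHETICAL_KD_UM = 1.0  # placeholder only, not a measured value
    for zn_um in (0.0, 2.0, 5.0, 10.0, 20.0, 50.0, 100.0):
        theta = fraction_bound(PROTEIN_UM, zn_um, HYPOTHETICAL_KD_UM)
        print(f"[Zn2+] = {zn_um:6.1f} uM  fraction bound = {theta:.3f}")
```

Fitting `kd` to measured intensities would additionally require a scale factor and background term, which are omitted here.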
Next we examined whether HQ-Ala could be used to introduce heavy metal ion binding sites into proteins for crystallographic structure determination. To solve the structure of macromolecules by X-ray crystallography, both the amplitude and the phase angle of the diffraction pattern have to be determined. Without a reasonable homology model or ultra high resolution data, experimental phasing methods, such as single and multi-wavelength anomalous diffraction ((S/M)AD), and single or multiple isomorphous replacement ((S/M)IR) have to be used to calculate initial phases. All of these methods require site-specific placement of heavy atom(s) into the crystal. Currently, recombinant protein expression is widely used to incorporate selenium atoms into proteins by replacement of methionine residues in the sequence, allowing SeMet (S/M)AD experiments to be performed with tunable X-ray sources.9 However, the introduction of too few selenium atoms, or of disordered ones, is problematic for phase determination. The incorporation of selenomethionine can also adversely affect the yield and solubility of the recombinant protein. An alternative approach is to soak the protein crystal in solutions containing heavy atoms; however, the likelihood of obtaining a site-specifically bound heavy atom at high occupancy is low. It would be ideal to be able to rapidly engineer metal ion binding sites into proteins that bind heavy atoms with high occupancy.
To this end, TM0665 (O-acetylserine sulfhydrylase), a test protein from the Thermotoga maritima Structural Genomics program10, was used to examine the utility of HQ-Ala incorporation for SAD phasing. A His-tagged TM0665 Phe22 to HQ-Ala mutant was expressed and purified by Ni2+ affinity chromatography. Crystallization of the HQ-Ala TM0665 mutant was accomplished by the hanging-drop vapor diffusion method. Upon incubation with Zn2+, the HQ-Ala TM0665 mutant was fluorescent both in solution and as crystals, which indicates successful metal ion coordination. The structure of TM0665 was determined from data collected at beamline 5.0.2 of the Advanced Light Source at 100 K using a SAD data collection strategy with an energy corresponding to the maximum value of δf″ as determined from a fluorescence scan (wavelength of 1.2815 Å). In all, 120° of data were collected in two 30° segments using an inverse beam strategy in 5° wedges to a maximum resolution of 2.1 Å (Table 2). Crystals of TM0665 belong to the space group P42212 with cell dimensions a = b = 135.31 Å, c = 74.93 Å, α = 90°, β = 90°, and γ = 90°, with two molecules in each asymmetric unit.
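The collection wavelength and unit cell quoted above can be converted into more familiar quantities with a few lines of arithmetic. This is an illustrative sketch only; the function names are hypothetical, and the count of eight symmetry operators is the general-position multiplicity of a primitive tetragonal space group with point group 422.

```python
# Product of Planck's constant and the speed of light, in keV * Angstrom.
HC_KEV_ANGSTROM = 12.39841984


def wavelength_to_energy_kev(wavelength_angstrom):
    """Photon energy in keV for an X-ray wavelength given in Angstrom."""
    if wavelength_angstrom <= 0:
        raise ValueError("wavelength must be positive")
    return HC_KEV_ANGSTROM / wavelength_angstrom


def orthogonal_cell_volume(a, b, c):
    """Unit cell volume in cubic Angstrom when all cell angles are 90 degrees."""
    return a * b * c


# Values taken from the text.
WAVELENGTH_A = 1.2815
CELL_A, CELL_B, CELL_C = 135.31, 135.31, 74.93
MOLECULES_PER_ASU = 2

# Assumption stated in the lead-in: eight general positions per cell.
SYMMETRY_OPERATORS = 8

ENERGY_KEV = wavelength_to_energy_kev(WAVELENGTH_A)
CELL_VOLUME_A3 = orthogonal_cell_volume(CELL_A, CELL_B, CELL_C)
MOLECULES_PER_CELL = SYMMETRY_OPERATORS * MOLECULES_PER_ASU

if __name__ == "__main__":
    print(f"photon energy: {ENERGY_KEV:.4f} keV")
    print(f"cell volume: {CELL_VOLUME_A3:.0f} A^3")
    print(f"molecules per unit cell: {MOLECULES_PER_CELL}")
```

The wavelength corresponds to a photon energy of about 9.675 keV, just above the zinc K absorption edge, and the cell volume is roughly 1.37 × 10^6 Å^3.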
SAD phasing was carried out with the program SOLVE in PHENIX (http://www.phenix-online.org/). The peak wavelength data were used within the resolution range 2.1 to 20 Å. The anomalous signal was high, as would be expected from an almost fully occupied, stable metal site; the average ratio of Bijvoet pairs (<|ΔF|>/<F>) was 5.3%. This, combined with the fact that there were only two zinc atoms in the asymmetric unit, enabled both direct methods and Patterson-based substructure determination programs such as SHELX and SOLVE to easily locate the positions of the atoms, a feat that could also be performed by manual interpretation of the Patterson map. The initial mean figure of merit (FOM) was 0.34. The electron density map was further improved by the RESOLVE program (FOM = 0.66). A clearly interpretable electron density map was calculated using FFT in the CCP4 suite11. The electron density of Zn2+ was evident at the 5-σ level. The initial model was built with an automatic model-building program, TEXTAL12,13. The structure was rebuilt using XtalView and refined with the program REFMAC11,14. Atomic coordinates and structure factors of TM0665_HQ-Ala have been deposited in the Protein Data Bank (PDB code: XXX).
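For orientation, the size of the anomalous signal expected from a given number of scatterers is often estimated with the Hendrickson approximation, <|ΔF|>/<|F|> ≈ sqrt(2 N_A / N_P) · f″ / Z_eff, where N_A is the number of anomalous scatterers, N_P the number of non-hydrogen protein atoms and Z_eff ≈ 6.7. The sketch below implements that estimate; it is not the calculation performed by SOLVE, and the atom count and f″ value in the usage section are placeholders rather than values from this work. A ratio measured from data also contains measurement error, so it need not match the estimate.

```python
import math

# Effective number of electrons of an average non-hydrogen protein atom.
Z_EFF = 6.7


def expected_bijvoet_ratio(n_anomalous, n_protein_atoms, f_double_prime, z_eff=Z_EFF):
    """Approximate <|dF|>/<|F|> at zero scattering angle.

    n_anomalous      number of anomalous scatterers in the asymmetric unit
    n_protein_atoms  number of non-hydrogen protein atoms in the asymmetric unit
    f_double_prime   imaginary anomalous scattering factor, in electrons
    """
    if n_anomalous <= 0 or n_protein_atoms <= 0 or z_eff <= 0:
        raise ValueError("atom counts and z_eff must be positive")
    return math.sqrt(2.0 * n_anomalous / n_protein_atoms) * f_double_prime / z_eff


if __name__ == "__main__":
    # Hypothetical inputs for illustration only.
    HYPOTHETICAL_PROTEIN_ATOMS = 4500
    HYPOTHETICAL_F_DOUBLE_PRIME = 3.9
    ratio = expected_bijvoet_ratio(2, HYPOTHETICAL_PROTEIN_ATOMS, HYPOTHETICAL_F_DOUBLE_PRIME)
    print(f"estimated Bijvoet ratio: {100.0 * ratio:.1f}%")
```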
The structure of TM0665 HQ-Ala can be readily superimposed on that of the wild type protein. There is no significant difference between wild type TM0665 and the HQ-Ala mutant (root mean squared deviation (rmsd) of Cα atoms is 0.48 Å), except for the replacement of phenylalanine by HQ-Ala at position 22. The HQ-Ala residue is located on the surface of one subunit and near the dimer interface of the two subunits in the asymmetric unit. The side chain of HQ-Ala is exposed to the solvent and chelates Zn2+ through Zn-N (2.1 Å) and Zn-O (2.0 Å) interactions. No other residue interacts directly with Zn2+, indicating that the HQ-Ala group alone is sufficient to provide a well-ordered metal binding site.
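The rmsd quoted above is the square root of the mean squared distance between equivalent Cα atoms. A minimal sketch of that formula follows; it assumes the two coordinate lists are already paired atom by atom and already superimposed (no fitting is performed), and the function name is illustrative.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean squared deviation between two paired lists of (x, y, z) points.

    The structures are assumed to be superimposed beforehand; this function
    only evaluates the deviation and does not optimize the overlay.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```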
In summary, a metal-chelating amino acid was site-specifically incorporated into proteins in E. coli with high efficiency and specificity in response to the amber codon. The ability to selectively add metal ions to protein crystals supplements other powerful techniques for structure determination and facilitates the introduction of well-ordered heavy metals into proteins with high occupancy for phasing techniques such as (S/M)IR and (S/M)AD experiments. This method should be particularly useful for proteins that are not suitable for selenomethionine phasing. Because HQ complexed with Zn2+ or Mg2+ is fluorescent, proteins with HQ-Ala can also be used as fluorescence-based sensors in vitro and in vivo.15 In addition, this method would facilitate the de novo design of metalloproteins with other novel structures and functions.