The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.
Structural genomics; Phosphatase; NYSGXRC; X-ray crystallography
The haloacid dehalogenase enzyme superfamily (HADSF) is largely composed of phosphatases, which, relative to members of other phosphatase families, have been particularly successful at adaptation to novel biological functions. Herein, we examine the structural basis for divergence of function in two bacterial homologs: 2-keto-3-deoxy-D-manno-octulosonate 8-phosphate phosphohydrolase (KDO8P phosphatase, KDO8PP) and 2-keto-3-deoxy-9-O-phosphonononic acid phosphohydrolase (KDN9P phosphatase, KDN9PP). KDO8PP and KDN9PP catalyze the final step of KDO and KDN synthesis, respectively, prior to transfer to CMP to form the activated sugar nucleotide. KDO8PP and KDN9PP orthologs derived from an evolutionarily diverse collection of bacterial species were subjected to steady-state kinetic analysis to determine their specificities towards catalyzed KDO8P and KDN9P hydrolysis. Although each enzyme was more active with its biological substrate, the degree of selectivity (as defined by the ratio of kcat/Km for KDO8P vs. KDN9P) varied significantly. High-resolution X-ray structure determination of Haemophilus influenzae KDO8PP bound to KDO/VO3− and Bacteroides thetaiotaomicron KDN9PP bound to KDN/VO3− revealed the substrate-binding residues. Structures of the KDO8PP and KDN9PP orthologs were also determined to reveal the differences in active site structure that underlie the variation in substrate preference. Bioinformatic analysis was carried out to define the sequence divergence among KDN9PP and KDO8PP orthologs. The KDN9PP orthologs were found to exist as single domain proteins or fused with the pathway nucleotidyl transferases; fusion of KDO8PP with the transferase is rare. The KDO8PP and KDN9PP orthologs share a stringently conserved Arg residue, which forms a salt bridge with the substrate carboxylate group. The split of the KDN9PP lineage from the KDO8PP orthologs is easily tracked by the acquisition of a Glu/Lys pair that supports KDN9P binding.
Moreover, independently evolved lineages of KDO8PP orthologs exist, separated by diffuse active-site sequence boundaries. We infer high tolerance of the KDO8PP catalytic platform to amino acid replacements that, in turn, influence substrate specificity changes and thereby facilitate divergence of biological function.
2-keto-3-deoxyoctulosonic acid (KDO); 2-keto-3-deoxynononic acid (KDN); phosphohydrolases; enzyme evolution; ortholog boundaries
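The selectivity measure used in the abstract above (the ratio of kcat/Km for KDO8P vs. KDN9P) can be illustrated with a minimal sketch. The function name and the numerical values are hypothetical and chosen only for illustration; they are not measurements from the study.

```python
def selectivity(kcat_km_first: float, kcat_km_second: float) -> float:
    """Return the ratio of two specificity constants (kcat/Km), both in M^-1 s^-1."""
    if kcat_km_first <= 0 or kcat_km_second <= 0:
        raise ValueError("specificity constants must be positive")
    return kcat_km_first / kcat_km_second


# Hypothetical specificity constants (M^-1 s^-1) for one enzyme acting on
# two substrates; these numbers are illustrative assumptions only.
example_kdo8p = 2.0e5
example_kdn9p = 4.0e3

ratio = selectivity(example_kdo8p, example_kdn9p)
print(f"KDO8P vs. KDN9P selectivity: {ratio:.0f}-fold")
```

A ratio well above 1 would indicate a preference for KDO8P, a ratio well below 1 a preference for KDN9P, and a ratio near 1 a promiscuous enzyme.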
The nuclear pore complex, composed of proteins termed nucleoporins (Nups), is responsible for nucleocytoplasmic transport in eukaryotes. Nuclear pore complexes (NPCs) form an annular structure composed of the nuclear ring, cytoplasmic ring, a membrane ring, and two inner rings. Nup192 is a major component of the NPC’s inner ring. We report the crystal structure of Saccharomyces cerevisiae Nup192 residues 2–960 [ScNup192(2–960)], which adopts an α-helical fold with three domains (i.e., D1, D2, and D3). Small angle X-ray scattering and electron microscopy (EM) studies reveal that ScNup192(2–960) could undergo long-range transition between “open” and “closed” conformations. We obtained a structural model of full-length ScNup192 based on EM, the structure of ScNup192(2–960), and homology modeling. Evolutionary analyses using the ScNup192(2–960) structure suggest that NPCs and vesicle-coating complexes are descended from a common membrane-coating ancestral complex. We show that suppression of Nup192 expression leads to compromised nuclear transport and hypothesize a role for Nup192 in modulating the permeability of the NPC central channel.
Rhodospirillum rubrum produces 5-methylthioadenosine (MTA) from S-adenosylmethionine (SAM) in polyamine biosynthesis; however, R. rubrum lacks the classical methionine salvage pathway. Instead, MTA is converted to 5-methylthio-D-ribose 1-phosphate (MTR 1-P) and adenine; MTR 1-P is isomerized to 1-methylthio-D-xylulose 5-phosphate (MTXu 5-P) and reductively dethiomethylated to 1-deoxy-D-xylulose 5-phosphate (DXP), an intermediate in the nonmevalonate isoprenoid pathway (Erb et al. Nature Chem. Biol., in press). Dethiomethylation, a novel route to DXP, is catalyzed by MTXu 5-P methylsulfurylase. An active site Cys displaces the enolate of DXP from MTXu 5-P, generating a methyl disulfide intermediate.
The nuclear pore complex (NPC), embedded in the nuclear envelope, is a large, dynamic molecular assembly that facilitates exchange of macromolecules between the nucleus and cytoplasm. The yeast NPC is an eight-fold symmetric annular structure composed of ~456 polypeptide chains contributed by ~30 distinct proteins termed nucleoporins (Nups). Nup116, identified only in fungi, plays a central role in both protein import and mRNA export through the NPC. Nup116 is a modular protein with N-terminal “FG” repeats containing a Gle2p-binding sequence motif (GLEBS motif) and an NPC targeting domain at its C-terminus. We report the crystal structure of the NPC targeting domain of Candida glabrata Nup116, consisting of residues 882-1034 [CgNup116(882-1034)], at 1.94 Å resolution. The X-ray structure of CgNup116(882-1034) is consistent with the molecular envelope determined in solution by Small Angle X-ray Scattering (SAXS). Structural similarities of CgNup116(882-1034) with homologous domains from Saccharomyces cerevisiae Nup116, S. cerevisiae Nup145N, and human Nup98 are discussed.
Nuclear Pore Complex; Nup116; Nup98; Nup100; Nup145; mRNA export; structural genomics
LigI from Sphingomonas paucimobilis catalyzes the reversible hydrolysis of 2-pyrone-4,6-dicarboxylate (PDC) to 4-oxalomesaconate (OMA) and 4-carboxy-2-hydroxymuconate (CHM) in the degradation of lignin. This protein is a member of the amidohydrolase superfamily of enzymes. The protein was expressed in E. coli and then purified to homogeneity. The purified recombinant enzyme does not contain bound metal ions and the addition of metal chelators or divalent metal ions to the assay mixtures does not affect the rate of product formation. This is the first enzyme from the amidohydrolase superfamily that does not require a divalent metal ion for catalytic activity. The kinetic constants for the hydrolysis of PDC are 340 s−1 and 9.8 × 10^6 M−1 s−1 for kcat and kcat/Km, respectively. The pH dependence of the kinetic constants suggests that a single active site residue must be deprotonated for the hydrolysis of PDC. The site of nucleophilic attack was determined by conducting the hydrolysis of PDC in 18O-labeled water and subsequent 13C NMR analysis. The crystal structures of wild-type LigI and the D248A mutant in the presence of the reaction product were determined to a resolution of 1.9 Å. The C-8 and C-11 carboxylate groups of PDC are coordinated within the active site via ion pair interactions with Arg-130 and Arg-124, respectively. The hydrolytic water molecule is activated by a proton transfer to Asp-248. The carbonyl group of the lactone substrate is activated by electrostatic interactions with His-180, His-31 and His-33.
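The LigI abstract reports kcat and kcat/Km for PDC hydrolysis but not Km itself. As a minimal sketch, and assuming the second constant is read as 9.8 × 10^6 M−1 s−1, the implied Km follows from dividing kcat by kcat/Km. The function name is illustrative, and the derived Km is simple arithmetic rather than a value stated in the source.

```python
def km_from_constants(kcat_per_s: float, kcat_over_km: float) -> float:
    """Return Km in molar units from kcat (s^-1) and kcat/Km (M^-1 s^-1)."""
    if kcat_over_km <= 0:
        raise ValueError("kcat/Km must be positive")
    return kcat_per_s / kcat_over_km


# Values quoted in the abstract for LigI acting on PDC.
KCAT_PDC = 340.0            # s^-1
KCAT_OVER_KM_PDC = 9.8e6    # M^-1 s^-1 (superscript assumed: 10^6)

km_molar = km_from_constants(KCAT_PDC, KCAT_OVER_KM_PDC)
km_micromolar = km_molar * 1e6
print(f"Implied Km for PDC: {km_micromolar:.1f} uM")
```

Under these assumptions the implied Km is in the tens of micromolar.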
Four proteins from NCBI cog1816, previously annotated as adenosine deaminases, have been subjected to structural and functional characterization. Pa0148 (Pseudomonas aeruginosa PAO1), AAur1117 (Arthrobacter aurescens TC1), Sgx9403e, and Sgx9403g have been purified and their substrate profiles determined. Adenosine is not a substrate for any of these enzymes. All of these proteins will deaminate adenine to produce hypoxanthine with values of kcat/Km that exceed 10^5 M−1 s−1. These enzymes will also accept 6-chloropurine, 6-methoxypurine, N-6-methyladenine, and 2,6-diaminopurine as alternate substrates. X-ray structures of Pa0148 and AAur1117 have been determined and reveal nearly identical distorted (β/α)8-barrels with a single zinc ion that is characteristic of members of the amidohydrolase superfamily. Structures of Pa0148 with adenine, 6-chloropurine and hypoxanthine were also determined, thereby permitting identification of the residues responsible for coordinating the substrate and product.
Nuclear pore complexes (NPCs), responsible for the nucleo-cytoplasmic exchange of proteins and nucleic acids, are dynamic macromolecular assemblies forming an eight-fold symmetric co-axial ring structure. Yeast (Saccharomyces cerevisiae) NPCs are made up of at least 456 polypeptide chains of ~30 distinct sequences. Many of these components (nucleoporins, Nups) share similar structural motifs and form stable subcomplexes. We have determined a high-resolution crystal structure of the C-terminal domain of yeast Nup133 (ScNup133), a component of the heptameric Nup84 subcomplex. Expression tests yielded ScNup133(944-1157) that produced crystals diffracting to 1.9Å resolution.
ScNup133(944-1157) adopts essentially an all α-helical fold, with a short two stranded β-sheet at the C-terminus. The 11 α-helices of ScNup133(944-1157) form a compact fold. In contrast, the previously determined structure of human Nup133(934-1156) bound to a fragment of human Nup107 has its constituent α-helices arranged in two globular blocks. These differences may reflect structural divergence among homologous nucleoporins.
Nuclear Pore Complex; Nup133; structural genomics
Adenine deaminase (ADE) catalyzes the conversion of adenine to hypoxanthine and ammonia. The enzyme isolated from Escherichia coli using standard expression conditions had low activity for the deamination of adenine (kcat = 2.0 s−1; kcat/Km = 2.5 × 10^3 M−1 s−1). However, when iron was sequestered with a metal chelator and the growth medium was supplemented with Mn2+ prior to induction, the purified enzyme was substantially more active for the deamination of adenine with values of kcat and kcat/Km of 200 s−1 and 5 × 10^5 M−1 s−1, respectively. The apo-enzyme was prepared and reconstituted with Fe2+, Zn2+, or Mn2+. In each case, two enzyme-equivalents of metal were necessary for reconstitution of the deaminase activity. This work provides the first example of any member within the deaminase sub-family of the amidohydrolase superfamily (AHS) to utilize a binuclear metal center for the catalysis of a deamination reaction. [FeII/FeII]-ADE was oxidized to [FeIII/FeIII]-ADE with ferricyanide with inactivation of the deaminase activity. Reducing [FeIII/FeIII]-ADE with dithionite restored the deaminase activity and thus the di-ferrous form of the enzyme is essential for catalytic activity. No spin-coupling between metal ions was evident by EPR or Mössbauer spectroscopies. The three-dimensional structure of adenine deaminase from Agrobacterium tumefaciens (Atu4426) was determined by X-ray crystallography at 2.2 Å resolution and adenine was modeled into the active site based on homology to other members of the amidohydrolase superfamily. Based on the model of the adenine-ADE complex and subsequent mutagenesis experiments, the roles for each of the highly conserved residues were proposed. Solvent isotope effects, pH rate profiles and solvent viscosity were utilized to propose a chemical reaction mechanism and the identity of the rate limiting steps.
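To make "substantially more active" concrete, the two sets of ADE constants quoted above can be compared directly. This is a minimal arithmetic sketch, assuming the kcat/Km values are read as 2.5 × 10^3 and 5 × 10^5 M−1 s−1; the dictionary keys and the helper name are illustrative, and the implied Km values are derived here rather than reported in the abstract.

```python
def fold_change(after: float, before: float) -> float:
    """Return how many times larger `after` is than `before`."""
    if before <= 0:
        raise ValueError("reference value must be positive")
    return after / before


# Constants quoted in the abstract (kcat in s^-1, kcat/Km in M^-1 s^-1).
standard_prep = {"kcat": 2.0, "kcat_over_km": 2.5e3}
mn_supplemented_prep = {"kcat": 200.0, "kcat_over_km": 5.0e5}

kcat_fold = fold_change(mn_supplemented_prep["kcat"], standard_prep["kcat"])
efficiency_fold = fold_change(
    mn_supplemented_prep["kcat_over_km"], standard_prep["kcat_over_km"]
)

# Km implied by each pair of constants (derived, not stated in the source).
km_standard_uM = standard_prep["kcat"] / standard_prep["kcat_over_km"] * 1e6
km_mn_uM = (
    mn_supplemented_prep["kcat"] / mn_supplemented_prep["kcat_over_km"] * 1e6
)

print(f"kcat increase: {kcat_fold:.0f}-fold")
print(f"kcat/Km increase: {efficiency_fold:.0f}-fold")
print(f"Implied Km: {km_standard_uM:.0f} uM (standard), {km_mn_uM:.0f} uM (Mn-supplemented)")
```

On these numbers the gain in kcat/Km is larger than the gain in kcat, which implies a modest decrease in the apparent Km as well.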
Two enzymes of unknown function from the amidohydrolase superfamily were discovered to catalyze the deamination of N-6-methyladenine to hypoxanthine and methylamine. The methylation of adenine in bacterial DNA is a common modification for the protection of host DNA against restriction endonucleases. The enzyme from Bacillus halodurans, Bh0637, catalyzes the deamination of N-6-methyladenine with a kcat of 185 s−1 and a kcat/Km of 2.5 × 10^6 M−1 s−1. Bh0637 catalyzes the deamination of N-6-methyladenine two orders of magnitude faster than adenine. A comparative model of Bh0637 was computed using the three-dimensional structure of Atu4426 (PDB code: 3NQB) as a structural template and computational docking was used to rationalize the preferential utilization of N-6-methyladenine over adenine. This is the first identification of an N-6-methyladenine deaminase (6-MAD).
Fatty acyl-AMP ligase (FAAL) is a new member of a family of adenylate-forming enzymes that were recently discovered in Mycobacterium tuberculosis (Mtb). They are similar in sequence to fatty acyl-CoA ligases (FACLs). However, while FACLs perform a two-step catalytic reaction, AMP ligation followed by CoA ligation using ATP and CoA as cofactors, FAALs produce only the acyl adenylate and are unable to perform the second step. We report X-ray crystal structures of full length FAALs from E. coli (EcFAAL) and Legionella pneumophila (LpFAAL) bound to acyl adenylate, determined at resolution limits of 3.0 and 1.85 Å, respectively. The structures share a larger N-terminal domain and a smaller C-terminal domain, which together resemble the previously determined structures of FAAL and FACL proteins. Our two structures occur in quite different conformations. EcFAAL adopts the adenylate forming conformation typical of FACLs, whereas LpFAAL exhibits a unique intermediate conformation. Both EcFAAL and LpFAAL have insertion motifs that distinguish them from the FACLs. Structures of EcFAAL and LpFAAL reveal detailed interactions between this insertion motif and the interdomain hinge region and with the C-terminal domain. We suggest that the insertion motifs support sufficient interdomain motions to allow substrate binding and product release during acyl adenylate formation, whereas they preclude CoA binding thereby preventing CoA ligation.
Fatty acyl-AMP ligase; Fatty acyl-CoA ligase; X-ray structure; AMP; CoA
Magnesium transporter; cytosolic domain; x-ray structure
The X-ray structure of a putative BenF-like (gene name: PFL1329) protein from Pseudomonas fluorescens Pf-5 (PflBenF) has been determined at 2.6Å resolution. X-ray crystallography revealed a canonical 18-stranded β-barrel fold that forms a central pore with a diameter of ∼4.6Å, which is consistent with the size and physicochemical properties of the presumed aromatic acid substrate, benzoate. Detailed comparisons with the previously-determined structure of Pseudomonas aeruginosa OpdK, a vanillate influx channel, revealed an arginine-rich aromatic acid selectivity filter of nearly identical structure composed of seven highly conserved residues Arg∼Asp∼Arg∼Arg∼Ser∼Asp∼Arg (R∼D∼R∼R∼S∼D∼R sequence motif, where ∼ denotes intervening residues) that define the narrowest part of the pore.
BenF-like; substrate specific porin; OprD superfamily; OprD subfamily; OpdK subfamily; benzoate; Pseudomonas; integral membrane protein
A strategy for increasing the efficiency of protein crystallization/structure determination with mass spectrometry has been developed. This approach combines insights from limited proteolysis/mass spectrometry and crystallization via in situ proteolysis. The procedure seeks to identify protease-resistant polypeptide chain segments from purified proteins on the time-scale of crystal formation, and subsequently to crystallize the target protein in the presence of the optimal protease at the right relative concentration. We report our experience with ten proteins of unknown structure, two of which yielded high-resolution X-ray structures. The advantage of this approach comes from its ability to select only those structure determination candidates that are likely to benefit from application of in situ proteolysis, using conditions most likely to result in formation of a stable proteolytic digestion product suitable for crystallization.
Mass spectrometry; in situ proteolysis; crystallization; x-ray crystallography
Two uncharacterized enzymes from the amidohydrolase superfamily belonging to cog1228 were cloned, expressed and purified to homogeneity. The two proteins, Sgx9260c (gi|44242006) and Sgx9260b (gi|44479596), were derived from environmental DNA samples originating from the Sargasso Sea. The catalytic function and substrate profiles for Sgx9260c and Sgx9260b were determined using a comprehensive library of dipeptides and N-acyl derivatives of L-amino acids. Sgx9260c catalyzes the hydrolysis of Gly-L-Pro, L-Ala-L-Pro and N-acyl derivatives of L-Pro. The best substrate identified to date is N-acetyl-L-Pro with a value of kcat/Km of 3 × 10^5 M−1 s−1. Sgx9260b catalyzes the hydrolysis of L-hydrophobic L-Pro dipeptides and N-acyl derivatives of L-Pro. The best substrate identified to date is N-propionyl-L-Pro with a value of kcat/Km of 1 × 10^5 M−1 s−1. Three dimensional structures of both proteins were determined by X-ray diffraction methods (PDB codes: 3MKV and 3FEQ). These proteins fold as distorted (β/α)8-barrels with two divalent cations in the active site. The structure of Sgx9260c was also determined as a complex with the N-methyl phosphonate derivative of L-Pro (PDB code: 3N2C). In this structure the phosphonate moiety bridges the binuclear metal center and one oxygen atom interacts with His-140. The α-carboxylate of the inhibitor interacts with Tyr-231. The proline side chain occupies a small substrate binding cavity formed by residues contributed from the loop that follows β-strand 7 within the (β/α)8-barrel. A total of 38 other proteins from cog1228 are predicted to have the same substrate profile based on conservation of the substrate binding residues. The structure of an evolutionarily related protein, Cc2672 from Caulobacter crescentus, was determined as a complex with the N-methyl phosphonate derivative of L-arginine (PDB code: 3MTW).
Nuclear Pore Complex; Nup145; Nup145N; structural genomics; autoproteolysis
Two previously uncharacterized proteins have been identified that efficiently catalyze the deamination of isoxanthopterin and pterin-6-carboxylate. The genes encoding these two enzymes, NYSGXRC-9339a (gi|44585104) and NYSGXRC-9236b (gi|44611670), were first identified from DNA isolated from the Sargasso Sea as part of the Global Ocean Sampling Project. The genes were synthesized, and the proteins were subsequently expressed and purified. The X-ray structure of Sgx9339a was determined at 2.7 Å resolution (PDB code: 2PAJ). This protein folds as a distorted (β/α)8-barrel and contains a single zinc ion in the active site. These enzymes are members of the amidohydrolase superfamily and belong to cog0402 within the clusters of orthologous groups (COG). Enzymes in cog0402 have previously been shown to catalyze the deamination of guanine, cytosine, S-adenosyl homocysteine, and 8-oxoguanine. A small compound library of pteridines, purines, and pyrimidines was used to probe catalytic activity. The only substrates identified in this search were isoxanthopterin and pterin-6-carboxylate. The kinetic constants for the deamination of isoxanthopterin with Sgx9339a were determined to be 1.0 s−1, 8.0 μM, and 1.3 × 10^5 M−1 s−1 for kcat, Km, and kcat/Km, respectively. The active site of Sgx9339a most closely resembles the active site for 8-oxoguanine deaminase (PDB code: 2UZ9). A model for substrate recognition of isoxanthopterin by Sgx9339a was proposed based upon the binding of guanine and xanthine in the active site of guanine deaminase. Residues critical for substrate binding appear to be conserved glutamine and tyrosine residues that hydrogen bond with the carbonyl oxygen at C4, a conserved threonine residue that hydrogen bonds with N5, and another conserved threonine residue that hydrogen bonds with the carbonyl group at C7. These conserved active site residues were used to identify 24 other genes which are predicted to deaminate isoxanthopterin.
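The three constants reported for Sgx9339a (kcat = 1.0 s−1, Km = 8.0 μM, kcat/Km of about 1.3 × 10^5 M−1 s−1) can be related through the standard Michaelis-Menten rate law. The sketch below assumes simple Michaelis-Menten behavior, which the abstract does not state explicitly; the function name and the chosen substrate concentrations are illustrative.

```python
def turnover_rate(substrate_molar: float, kcat: float, km_molar: float) -> float:
    """Michaelis-Menten turnover per enzyme (s^-1): kcat * [S] / (Km + [S])."""
    if substrate_molar < 0:
        raise ValueError("substrate concentration cannot be negative")
    return kcat * substrate_molar / (km_molar + substrate_molar)


KCAT = 1.0          # s^-1, from the abstract
KM = 8.0e-6         # M (8.0 uM), from the abstract

# kcat/Km computed from the two constants, for comparison with the
# rounded value quoted in the abstract.
specificity_constant = KCAT / KM

# Illustrative substrate concentrations (assumptions, not from the source).
rate_at_km = turnover_rate(KM, KCAT, KM)             # half of kcat
low_substrate = 0.01 * KM
rate_at_low = turnover_rate(low_substrate, KCAT, KM)
linear_estimate = specificity_constant * low_substrate

print(f"kcat/Km = {specificity_constant:.3g} M^-1 s^-1")
print(f"rate at [S] = Km: {rate_at_km:.2f} s^-1")
print(f"rate at [S] = 0.01 Km: {rate_at_low:.4f} s^-1 (linear estimate {linear_estimate:.4f})")
```

At substrate concentrations well below Km the rate is nearly proportional to (kcat/Km)[S], which is why kcat/Km is the usual figure for comparing substrates.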
Plants and microorganisms reduce environmental inorganic nitrogen to ammonium, which then enters various metabolic pathways solely via conversion of 2-oxoglutarate (2OG) to glutamate and glutamine. Cellular 2OG concentrations increase during nitrogen starvation. We recently identified a novel family of 2OG-sensing proteins – the nitrogen regulatory protein NrpR – that bind DNA and repress transcription of nitrogen assimilation genes. We used X-ray crystallography to determine the structure of the NrpR regulatory domain. We identified the NrpR 2OG-binding cleft and show that residues predicted to interact directly with 2OG are conserved among diverse classes of 2OG-binding proteins. We show that high levels of 2OG inhibit NrpR's ability to bind DNA. Electron microscopy analyses document that NrpR adopts different quaternary structures in its inhibited 2OG-bound state compared with its active apo state. Our results indicate that upon 2OG release, NrpR re-positions its DNA-binding domains correctly for optimal interaction with DNA, thereby enabling gene repression.
Nitrogen assimilation; Nitrogen regulation; Electron microscopy; X-ray crystallography; 2-oxoglutarate (2OG); α-ketoglutarate; transcriptional regulation
An enzyme from Pseudomonas aeruginosa, Pa0142 (gi|9945972), has been identified for the first time that is able to catalyze the deamination of 8-oxoguanine (8-oxoG) to uric acid. 8-Oxoguanine is formed by the oxidation of guanine residues within DNA by reactive oxygen species and this lesion results in G:C to T:A transversions. The value of kcat/Km for the deamination of 8-oxoG by Pa0142 at pH 8.0 and 30 °C is 2.0 × 10^4 M−1 s−1. This enzyme can also catalyze the deamination of isocytosine and guanine at rates that are approximately an order of magnitude slower. The three-dimensional structure of a homologous enzyme (gi|44264246) from the Sargasso Sea has been determined by x-ray diffraction methods to a resolution of 2.2 Å (PDB code: 3h4u). The enzyme folds as a (β/α)8-barrel and it is a member of the amidohydrolase superfamily with a single zinc in the active site. This enzyme catalyzes the deamination of 8-oxoG with a value of kcat/Km of 2.7 × 10^5 M−1 s−1. Computational docking of potential high energy intermediates for the deamination reaction to the x-ray crystal structure suggests that the active site binding of 8-oxoG is facilitated by hydrogen bond interactions from a conserved glutamine that follows β-strand 1 with O6, a conserved tyrosine that follows β-strand 2 with N7, and a conserved cysteine residue that follows β-strand 4 with O8. A bioinformatic analysis of available protein sequences suggests that approximately 200 other bacteria possess an enzyme capable of catalyzing the deamination of 8-oxoG.
The structure of an uncharacterized member of the enolase superfamily from Oceanobacillus iheyensis (GI: 23100298; IMG locus tag Ob2843; PDB Code 2OQY) was determined by the New York SGX Research Center for Structural Genomics (NYSGXRC). The structure contained two Mg2+ ions located 10.4 Å from one another, with one located in the canonical position in the (β/α)7β-barrel domain (although the ligand at the end of the fifth β-strand is His, unprecedented in structurally characterized members of the superfamily); the second is located in a novel site within the capping domain. In silico docking of a library of mono- and diacid sugars to the active site predicted a diacid sugar as a likely substrate. Activity screening of a physical library of acid sugars identified galactarate as the substrate (kcat = 6.8 s−1, KM = 620 μM; kcat/KM = 1.1 × 10^4 M−1 s−1), allowing functional assignment of Ob2843 as galactarate dehydratase (GalrD-II). The structure of a complex of the catalytically impaired Y90F mutant with Mg2+ and galactarate allowed identification of a Tyr 164-Arg 162 dyad as the base that initiates the reaction by abstraction of the α-proton and Tyr 90 as the acid that facilitates departure of the β-OH leaving group. The enzyme product is 2-keto-3-deoxy-D-threo-4,5-dihydroxyadipate, the enantiomer of the product obtained in the GalrD reaction catalyzed by a previously characterized bifunctional L-talarate/galactarate dehydratase (TalrD/GalrD). On the basis of the different active site structures and different regiochemistries, we recognize that these functions represent an example of apparent, not actual, convergent evolution of function. The structure of GalrD-II and its active site architecture allow identification of the seventh functionally and structurally characterized subgroup in the enolase superfamily.
This study provides an additional example that an integrated sequence/structure-based strategy employing computational approaches is a viable approach for directing functional assignment of unknown enzymes discovered in genome projects.
Microsporidia are protists that have been reported to cause infections in both vertebrates and invertebrates. They have emerged as human pathogens, particularly in patients who are immunosuppressed, and cases of gastrointestinal infection, encephalitis, keratitis, sinusitis, myositis and disseminated infection are well described in the literature. While benzimidazoles are active against many species of Microsporidia, these drugs do not have significant activity against Enterocytozoon bieneusi. Fumagillin and its analogues have been demonstrated to have activity in vitro and in animal models of microsporidiosis and human infections due to E. bieneusi. Fumagillin and its analogues inhibit methionine aminopeptidase type 2. Encephalitozoon cuniculi MetAP2 (EcMetAP2) was cloned and expressed as an active enzyme using a baculovirus system. The crystal structure of EcMetAP2 was determined with and without the bound inhibitors fumagillin and TNP470. This structure classifies EcMetAP2 as a member of the MetAP2c family. The EcMetAP2 structure was used to generate a homology model of the E. bieneusi MetAP2. Comparison of microsporidian MetAP2 structures with human MetAP2 provides insights into the design of inhibitors that might exhibit specificity for microsporidian MetAP2.
Microsporidia; X-ray crystal structure; Methionine Aminopeptidase; Encephalitozoon cuniculi; Enterocytozoon bieneusi; therapeutics
Two proteins from the amidohydrolase superfamily of enzymes were cloned, expressed and purified to homogeneity. The first protein, Cc0300, was from Caulobacter crescentus CB-15, while the second one (Sgx9355e) was derived from an environmental DNA sequence originally isolated from the Sargasso Sea (gi| 44371129). The catalytic functions and the substrate profiles for the two enzymes were determined with the aid of combinatorial dipeptide libraries. Both enzymes were shown to catalyze the hydrolysis of L-Xaa-L-Xaa dipeptides where the amino acid at the N-terminus was relatively unimportant. These enzymes were specific for hydrophobic amino acids at the C-terminus. With Cc0300, substrates terminating in isoleucine, leucine, phenylalanine, tyrosine, valine, methionine, and tryptophan were hydrolyzed. The same specificity was observed with Sgx9355e but this protein was also able to hydrolyze peptides terminating in threonine. Both enzymes were able to hydrolyze N-acetyl and N-formyl derivatives of the hydrophobic amino acids and tripeptides. The best substrates identified for Cc0300 were L-Ala-L-Leu with values of kcat and kcat/Km of 37 s−1 and 1.1 × 10^5 M−1 s−1, respectively, and N-formyl-L-Tyr with values of kcat and kcat/Km of 33 s−1 and 3.9 × 10^5 M−1 s−1, respectively. The best substrate identified for Sgx9355e was L-Ala-L-Phe with values of kcat and kcat/Km of 0.41 s−1 and 5.8 × 10^3 M−1 s−1, respectively. The three-dimensional structure of Sgx9355e was determined to a resolution of 2.33 Å with L-methionine bound in the active site. The α-carboxylate of the methionine is ion-paired to His-237 and also hydrogen bonded to the backbone amide groups of Val-201 and Leu-202. The α-amino group of the bound methionine interacts with Asp-328. The structural determinants for substrate recognition were identified and compared with other enzymes in this superfamily that hydrolyze dipeptides with different specificities.
The substrate profiles for two proteins from Caulobacter crescentus CB15 (Cc2672 and Cc3125) and one protein (Sgx9359b) derived from a DNA sequence (gi| 44368820) isolated from the Sargasso Sea were determined using combinatorial libraries of dipeptides and N-acyl derivatives of amino acids. These proteins are members of the amidohydrolase superfamily and are currently misannotated in NCBI as catalyzing the hydrolysis of L-Xaa-L-Pro dipeptides. Cc2672 was shown to catalyze the hydrolysis of L-Xaa-L-Arg/Lys dipeptides and the N-acetyl and N-formyl derivatives of lysine and arginine. This enzyme will also hydrolyze longer peptides that terminate in either lysine or arginine. The N-methyl phosphonate derivative of L-lysine was a potent competitive inhibitor of Cc2672 with a Ki value of 120 nM. Cc3125 was shown to catalyze the hydrolysis of L-Xaa-L-Arg/Lys dipeptides but will not hydrolyze tripeptides or the N-formyl and N-acetyl derivatives of lysine or arginine. The substrate profile for Sgx9359b is similar to that of Cc2672 except that compounds with a C-terminal lysine are not recognized as substrates. The x-ray structure of Sgx9359b was determined to a resolution of 2.3 Å. The protein folds as a (β/α)8-barrel and self associates to form a homo-octamer. The active site is composed of a binuclear metal center similar to that found in phosphotriesterase and dihydroorotase. In one crystal form, arginine was bound adventitiously to the eight active sites within the octamer. The orientation of the arginine in the active site identified the structural determinants for recognition of the α-carboxylate and the positively charged side chains of arginine containing substrates. This information was used to identify 18 other bacterial sequences that possess identical or similar substrate profiles.
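The Ki of 120 nM reported for the N-methyl phosphonate derivative of L-lysine can be put in context with the standard relation for competitive inhibition, in which the apparent Km is scaled by (1 + [I]/Ki). The sketch below is a minimal illustration under that textbook model; the inhibitor concentrations are hypothetical, and the abstract does not give a substrate Km for Cc2672, so only the scaling factor is computed.

```python
KI_MOLAR = 120e-9  # Ki for the inhibitor of Cc2672, from the abstract (120 nM)


def apparent_km_factor(inhibitor_molar: float, ki_molar: float = KI_MOLAR) -> float:
    """Factor by which a competitive inhibitor raises the apparent Km: 1 + [I]/Ki."""
    if inhibitor_molar < 0 or ki_molar <= 0:
        raise ValueError("concentrations must be non-negative and Ki positive")
    return 1.0 + inhibitor_molar / ki_molar


# Hypothetical inhibitor concentrations chosen for illustration only.
for inhibitor_nM in (0.0, 120.0, 1200.0):
    factor = apparent_km_factor(inhibitor_nM * 1e-9)
    print(f"[I] = {inhibitor_nM:6.0f} nM -> apparent Km is {factor:.1f} x Km")
```

In this model kcat is unchanged, so the inhibition can be overcome by raising the substrate concentration.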
The mechanistically diverse enolase superfamily is a paradigm for elucidating Nature’s strategies for divergent evolution of enzyme function. Each of the different reactions catalyzed by members of the superfamily is initiated by abstraction of the α-proton of a carboxylate substrate that is coordinated to an essential Mg2+. The muconate lactonizing enzyme (MLE) from Pseudomonas putida, a member of a family that catalyzes the syn-cycloisomerization of cis,cis-muconate to (4S)-muconolactone in the β-ketoadipate pathway, has provided critical insights into the structural bases for evolution of function within the superfamily. A second, divergent family of homologous MLEs that catalyzes anti-cycloisomerization has been identified. Structures of members of both families liganded with the common (4S)-muconolactone product (syn, Pseudomonas fluorescens, GI:70731221; anti, Mycobacterium smegmatis, GI:118470554) document that the conserved Lys at the end of the second β-strand in the (β/α)7β-barrel domain serves as the acid catalyst in both reactions. The different stereochemical courses (syn and anti) result from different structural strategies for determining substrate specificity: although the distal carboxylate group of the cis,cis-muconate substrate attacks the same face of the proximal double bond, opposite faces of the resulting enolate anion intermediate are presented to the conserved Lys acid catalyst. The discovery of two families of homologous, but stereochemically distinct, MLEs likely provides an example of “pseudoconvergent” evolution of the same function from different homologous progenitors within the enolase superfamily, in which different spatial arrangements of active site functional groups and substrate specificity determinants support catalysis of the same reaction.
To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members. To date, 20 unique amidohydrolase and 41 unique enolase structures have been determined, increasing the fraction of sequences in the two superfamilies that can be modeled based on at least 30% sequence identity from 45% to 73%. We present case studies of proteins related to uronate isomerase (an amidohydrolase superfamily member) and mandelate racemase (an enolase superfamily member), to illustrate how this structure-focused approach can be used to generate hypotheses about sequence–structure–function relationships.
Electronic supplementary material
The online version of this article (doi:10.1007/s10969-008-9056-5) contains supplementary material, which is available to authorized users.
Amidohydrolase and enolase superfamilies; Structural genomics; Structure annotation; Target selection
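The coverage figure in the abstract above depends on whether a sequence shares at least 30% identity with a protein of known structure. The sketch below shows one simple way such a threshold might be applied to a pre-computed pairwise alignment. It assumes identity is counted over columns where neither sequence has a gap, which is only one of several conventions, and the toy sequences are invented for illustration; it is not the target-selection procedure used in the study.

```python
IDENTITY_THRESHOLD = 30.0  # percent, the cutoff quoted in the abstract


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns that contain a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Toy aligned sequences (hypothetical, not real superfamily members).
target_seq = "MKV-LAGTDE"
template_seq = "MRVALSG-DQ"

identity = percent_identity(target_seq, template_seq)
can_model = identity >= IDENTITY_THRESHOLD
print(f"identity = {identity:.1f}%, modelable at the 30% cutoff: {can_model}")
```

Real pipelines usually also consider alignment length and coverage of the target, since a high identity over a short aligned region says little about the rest of the protein.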