Despite a multitude of recent technical breakthroughs speeding high-resolution structural analysis of biological macromolecules, production of sufficient quantities of well-behaved, active protein continues to represent the rate-limiting step in many structure determination efforts. These challenges are only amplified when considered in the context of ongoing structural genomics efforts, which are now contending with multi-domain eukaryotic proteins, secreted proteins, and ever-larger macromolecular assemblies. Exciting new developments in eukaryotic expression platforms, including insect- and mammalian-based systems, promise enhanced opportunities for structural approaches to some of the most important biological problems. Development and implementation of automated eukaryotic expression techniques promise to significantly improve production of materials for structural, functional, and biomedical research applications.
Thermobifida fusca o-succinylbenzoate synthase (OSBS), a member of the enolase superfamily that catalyzes a step in menaquinone biosynthesis, shares 22% and 28% amino acid sequence identity with two previously characterized OSBS enzymes from Escherichia coli and Amycolatopsis sp. T-1-60, respectively. These values are considerably lower than typical sequence identities among homologous proteins that have the same function. To determine how such divergent enzymes catalyze the same reaction, we solved the structure of T. fusca OSBS and identified amino acids that are important for ligand binding. We discovered significant differences in structure and conformational flexibility between T. fusca OSBS and other members of the enolase superfamily. In particular, the 20s loop, a flexible loop in the active site that permits ligand binding and release in most enolase superfamily proteins, has a four-amino acid deletion and is well ordered in T. fusca OSBS. Instead, flexibility of a different region allows the substrate to enter from the other side of the active site. T. fusca OSBS was more tolerant of mutations at residues that were critical for activity in E. coli OSBS. Also, replacing active site amino acids found in one protein with the amino acids that occur at the same place in the other protein reduces catalytic efficiency. Thus, the extraordinary divergence between these proteins does not appear to reflect a higher tolerance of mutations. Instead, large deletions outside the active site were accompanied by alteration of active site size and electrostatic interactions, resulting in small but significant differences in ligand binding.
The New York SGX Research Center for Structural Genomics (NYSGXRC) of the NIGMS Protein Structure Initiative (PSI) has applied its high-throughput X-ray crystallographic structure determination platform to systematic studies of all human protein phosphatases and protein phosphatases from biomedically-relevant pathogens. To date, the NYSGXRC has determined structures of 21 distinct protein phosphatases: 14 from human, 2 from mouse, 2 from the pathogen Toxoplasma gondii, 1 from Trypanosoma brucei, the parasite responsible for African sleeping sickness, and 2 from the principal mosquito vector of malaria in Africa, Anopheles gambiae. These structures provide insights into both normal and pathophysiologic processes, including transcriptional regulation, regulation of major signaling pathways, neural development, and type 1 diabetes. In conjunction with the contributions of other international structural genomics consortia, these efforts promise to provide an unprecedented database and materials repository for structure-guided experimental and computational discovery of inhibitors for all classes of protein phosphatases.
Structural genomics; Phosphatase; NYSGXRC; X-ray crystallography
The haloacid dehalogenase enzyme superfamily (HADSF) is largely composed of phosphatases, which, relative to members of other phosphatase families, have been particularly successful at adaptation to novel biological functions. Herein, we examine the structural basis for divergence of function in two bacterial homologs: 2-keto-3-deoxy-D-manno-octulosonate 8-phosphate phosphohydrolase (KDO8P phosphatase, KDO8PP) and 2-keto-3-deoxy-9-O-phosphonononic acid phosphohydrolase (KDN9P phosphatase, KDN9PP). KDO8PP and KDN9PP catalyze the final step of KDO and KDN synthesis, respectively, prior to transfer to CMP to form the activated sugar nucleotide. KDO8PP and KDN9PP orthologs derived from an evolutionarily diverse collection of bacterial species were subjected to steady-state kinetic analysis to determine their specificities towards catalyzed KDO8P and KDN9P hydrolysis. Although each enzyme was more active with its biological substrate, the degree of selectivity (as defined by the ratio of kcat/Km for KDO8P vs. KDN9P) varied significantly. High-resolution X-ray structure determination of Haemophilus influenzae KDO8PP bound to KDO/VO3− and Bacteroides thetaiotaomicron KDN9PP bound to KDN/VO3− revealed the substrate-binding residues. Structures of the KDO8PP and KDN9PP orthologs were also determined to reveal the differences in active site structure that underlie the variation in substrate preference. Bioinformatic analysis was carried out to define the sequence divergence among KDN9PP and KDO8PP orthologs. The KDN9PP orthologs were found to exist as single domain proteins or fused with the pathway nucleotidyl transferases; fusion of KDO8PP with the transferase is rare. The KDO8PP and KDN9PP orthologs share a stringently conserved Arg residue, which forms a salt bridge with the substrate carboxylate group. The split of the KDN9PP lineage from the KDO8PP orthologs is easily tracked by the acquisition of a Glu/Lys pair that supports KDN9P binding.
Moreover, independently evolved lineages of KDO8PP orthologs exist, separated by diffuse active-site sequence boundaries. We infer high tolerance of the KDO8PP catalytic platform to amino acid replacements that, in turn, influence substrate specificity changes and thereby facilitate divergence of biological function.
2-keto-3-deoxyoctulosonic acid (KDO); 2-keto-3-deoxynononic acid (KDN); phosphohydrolases; enzyme evolution; ortholog boundaries
The nuclear pore complex, composed of proteins termed nucleoporins (Nups), is responsible for nucleocytoplasmic transport in eukaryotes. Nuclear pore complexes (NPCs) form an annular structure composed of the nuclear ring, cytoplasmic ring, a membrane ring, and two inner rings. Nup192 is a major component of the NPC’s inner ring. We report the crystal structure of Saccharomyces cerevisiae Nup192 residues 2–960 [ScNup192(2–960)], which adopts an α-helical fold with three domains (i.e., D1, D2, and D3). Small angle X-ray scattering and electron microscopy (EM) studies reveal that ScNup192(2–960) could undergo long-range transition between “open” and “closed” conformations. We obtained a structural model of full-length ScNup192 based on EM, the structure of ScNup192(2–960), and homology modeling. Evolutionary analyses using the ScNup192(2–960) structure suggest that NPCs and vesicle-coating complexes are descended from a common membrane-coating ancestral complex. We show that suppression of Nup192 expression leads to compromised nuclear transport and hypothesize a role for Nup192 in modulating the permeability of the NPC central channel.
Alanine racemase from O. oeni exists as a dimer in the crystal structure. Both monomers contribute to the two active sites present, one for each monomer.
The crystal structure of alanine racemase from Oenococcus oeni has been determined at 1.7 Å resolution using the single-wavelength anomalous dispersion (SAD) method and selenium-labelled protein. The protein exists as a symmetric dimer in the crystal, with both protomers contributing to the two active sites. Pyridoxal 5′-phosphate, a cofactor, is bound to each monomer and forms a Schiff base with Lys39. Structural comparison of alanine racemase from O. oeni (Alr) with homologous family members revealed similar domain organization and cofactor binding.
alanine racemase; Oenococcus oeni; pyridoxal 5′-phosphate
The periplasmic glucose-binding protein from T. maritima consists of two domains with the ligand β-D-glucose buried between them. The two domains adopt a closed conformation.
ABC transport systems have been characterized in organisms ranging from bacteria to humans. In most bacterial systems, the periplasmic component is the primary determinant of specificity of the transport complex as a whole. Here, the X-ray crystal structure of a periplasmic glucose-binding protein (GBP) from Thermotoga maritima determined at 2.4 Å resolution is reported. The molecule consists of two similar α/β domains connected by a three-stranded hinge region. In the current structure, a ligand (β-D-glucose) is buried between the two domains, which have adopted a closed conformation. Details of the substrate-binding sites revealed features that determine substrate specificity. In toto, ten residues from both domains form eight hydrogen bonds to the bound sugar and four aromatic residues (two from each domain) stabilize the substrate through stacking interactions.
glucose-binding proteins; ABC transporters; hinge motion; closed conformation
Rhodospirillum rubrum produces 5-methylthioadenosine (MTA) from S-adenosylmethionine (SAM) in polyamine biosynthesis; however, R. rubrum lacks the classical methionine salvage pathway. Instead, MTA is converted to 5-methylthio-D-ribose 1-phosphate (MTR 1-P) and adenine; MTR 1-P is isomerized to 1-methylthio-D-xylulose 5-phosphate (MTXu 5-P) and reductively dethiomethylated to 1-deoxy-D-xylulose 5-phosphate (DXP), an intermediate in the nonmevalonate isoprenoid pathway (Erb et al. Nature Chem. Biol., in press). Dethiomethylation, a novel route to DXP, is catalyzed by MTXu 5-P methylsulfurylase. An active site Cys displaces the enolate of DXP from MTXu 5-P, generating a methyl disulfide intermediate.
The nuclear pore complex (NPC), embedded in the nuclear envelope, is a large, dynamic molecular assembly that facilitates exchange of macromolecules between the nucleus and cytoplasm. The yeast NPC is an eight-fold symmetric annular structure composed of ~456 polypeptide chains contributed by ~30 distinct proteins termed nucleoporins (Nups). Nup116, identified only in fungi, plays a central role in both protein import and mRNA export through the NPC. Nup116 is a modular protein with N-terminal “FG” repeats containing a Gle2p-binding sequence motif (GLEBS motif) and an NPC targeting domain at its C-terminus. We report the crystal structure of the NPC targeting domain of Candida glabrata Nup116, consisting of residues 882-1034 [CgNup116(882-1034)], at 1.94 Å resolution. The X-ray structure of CgNup116(882-1034) is consistent with the molecular envelope determined in solution by small-angle X-ray scattering (SAXS). Structural similarities of CgNup116(882-1034) with homologous domains from Saccharomyces cerevisiae Nup116, S. cerevisiae Nup145N, and human Nup98 are discussed.
Nuclear Pore Complex; Nup116; Nup98; Nup100; Nup145; mRNA export; structural genomics
LigI from Sphingomonas paucimobilis catalyzes the reversible hydrolysis of 2-pyrone-4,6-dicarboxylate (PDC) to 4-oxalomesaconate (OMA) and 4-carboxy-2-hydroxymuconate (CHM) in the degradation of lignin. This protein is a member of the amidohydrolase superfamily of enzymes. The protein was expressed in E. coli and then purified to homogeneity. The purified recombinant enzyme does not contain bound metal ions, and the addition of metal chelators or divalent metal ions to the assay mixtures does not affect the rate of product formation. This is the first enzyme from the amidohydrolase superfamily that does not require a divalent metal ion for catalytic activity. The kinetic constants for the hydrolysis of PDC are kcat = 340 s−1 and kcat/Km = 9.8 × 10^6 M−1 s−1. The pH dependence of the kinetic constants suggests that a single active site residue must be deprotonated for the hydrolysis of PDC. The site of nucleophilic attack was determined by conducting the hydrolysis of PDC in 18O-labeled water and subsequent 13C NMR analysis. The crystal structures of wild-type LigI and the D248A mutant in the presence of the reaction product were determined to a resolution of 1.9 Å. The C-8 and C-11 carboxylate groups of PDC are coordinated within the active site via ion pair interactions with Arg-130 and Arg-124, respectively. The hydrolytic water molecule is activated by a proton transfer to Asp-248. The carbonyl group of the lactone substrate is activated by electrostatic interactions with His-180, His-31 and His-33.
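The two kinetic constants reported for LigI together imply an apparent Km, because Km = kcat / (kcat/Km). The following is a minimal Python sketch of that arithmetic, assuming simple Michaelis-Menten kinetics and reading the reported kcat/Km as 9.8 × 10^6 M−1 s−1; the function name `km_from_constants` is illustrative and does not come from the original work.

```python
def km_from_constants(kcat_per_s: float, kcat_over_km_per_M_s: float) -> float:
    """Return Km in molar units from kcat (s^-1) and kcat/Km (M^-1 s^-1)."""
    if kcat_over_km_per_M_s <= 0:
        raise ValueError("kcat/Km must be positive")
    return kcat_per_s / kcat_over_km_per_M_s


# Values taken from the LigI abstract (hydrolysis of PDC).
# Assumption: kcat/Km is 9.8e6 M^-1 s^-1.
km_ligi = km_from_constants(340.0, 9.8e6)

print(f"Km ~ {km_ligi * 1e6:.0f} uM")  # prints "Km ~ 35 uM"
```

Under these assumptions the implied Km is in the tens of micromolar; the abstract itself does not state Km, so this value is derived rather than reported.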
lysostaphin peptidase; LytM; glycyl-glycine or glycyl-alanine; latent form
The araBAD operon encodes three different enzymes required for catabolism of L-arabinose, which is one of the most abundant monosaccharides in nature. L-ribulokinase, encoded by the araB gene, catalyses conversion of L-ribulose to L-ribulose-5-phosphate, the second step in the catabolic pathway. Unlike other kinases, ribulokinase exhibits diversity in substrate selectivity and catalyses phosphorylation of all four 2-ketopentose sugars with comparable kcat values. To understand ribulokinase recognition and phosphorylation of a diverse set of substrates, we have determined the X-ray structure of ribulokinase from Bacillus halodurans bound to L-ribulose and investigated its substrate and ATP co-factor binding properties. The polypeptide chain is folded into two domains, one small and the other large, with a deep cleft in between. By analogy with related sugar kinases, we identified GGLPQK (residues 447-452) as the ATP binding motif within the smaller domain. L-ribulose binds in the cleft between the two domains via hydrogen bonds with the sidechains of highly conserved Trp126, Lys208, Asp274, and Glu329 and the main chain nitrogen of Ala96. The interaction of L-ribulokinase with L-ribulose reveals versatile structural features that help explain recognition of various 2-ketopentose substrates and competitive inhibition by L-erythrulose. Comparison of our structure with those of other sugar kinases revealed conformational variations that suggest domain-domain closure movements are responsible for establishing the observed active site environment.
Crystal structure; ribulokinase; ribulose; araBAD; araB; arabinose catabolism
Four proteins from NCBI cog1816, previously annotated as adenosine deaminases, have been subjected to structural and functional characterization. Pa0148 (Pseudomonas aeruginosa PAO1), AAur1117 (Arthrobacter aurescens TC1), Sgx9403e, and Sgx9403g have been purified and their substrate profiles determined. Adenosine is not a substrate for any of these enzymes. All of these proteins deaminate adenine to produce hypoxanthine with values of kcat/Km that exceed 10^5 M−1 s−1. These enzymes also accept 6-chloropurine, 6-methoxypurine, N-6-methyladenine, and 2,6-diaminopurine as alternate substrates. X-ray structures of Pa0148 and AAur1117 have been determined and reveal nearly identical distorted (β/α)8-barrels with a single zinc ion that is characteristic of members of the amidohydrolase superfamily. Structures of Pa0148 with adenine, 6-chloropurine and hypoxanthine were also determined, thereby permitting identification of the residues responsible for coordinating the substrate and product.
Nova onconeural antigens are neuron-specific RNA-binding proteins implicated in paraneoplastic opsoclonus-myoclonus-ataxia (POMA) syndrome. Nova harbors three K-homology (KH) motifs implicated in alternate splicing regulation of genes involved in inhibitory synaptic transmission. We report the crystal structure of the first two KH domains (KH1/2) of Nova-1 bound to an in vitro selected RNA hairpin, containing a UCAG-UCAC high-affinity binding site. Sequence-specific intermolecular contacts in the complex involve KH1 and the second UCAC repeat, with the RNA scaffold buttressed by interactions between repeats. While the canonical RNA-binding surface of KH2 in the above complex engages in protein-protein interactions in the crystalline state, the individual KH2 domain can sequence-specifically target the UCAC RNA element in solution. The observed anti-parallel alignment of KH1 and KH2 domains in the crystal structure of the complex generates a scaffold that could facilitate target pre-mRNA looping upon Nova binding, thereby potentially explaining Nova’s functional role in splicing regulation.
Nuclear pore complexes (NPCs), responsible for the nucleo-cytoplasmic exchange of proteins and nucleic acids, are dynamic macromolecular assemblies forming an eight-fold symmetric co-axial ring structure. Yeast (Saccharomyces cerevisiae) NPCs are made up of at least 456 polypeptide chains of ~30 distinct sequences. Many of these components (nucleoporins, Nups) share similar structural motifs and form stable subcomplexes. We have determined a high-resolution crystal structure of the C-terminal domain of yeast Nup133 (ScNup133), a component of the heptameric Nup84 subcomplex. Expression tests yielded ScNup133(944-1157) that produced crystals diffracting to 1.9Å resolution.
ScNup133(944-1157) adopts essentially an all α-helical fold, with a short two-stranded β-sheet at the C-terminus. The 11 α-helices of ScNup133(944-1157) form a compact fold. In contrast, the previously determined structure of human Nup133(934-1156) bound to a fragment of human Nup107 has its constituent α-helices arranged in two globular blocks. These differences may reflect structural divergence among homologous nucleoporins.
Nuclear Pore Complex; Nup133; structural genomics
Adenine deaminase (ADE) catalyzes the conversion of adenine to hypoxanthine and ammonia. The enzyme isolated from Escherichia coli using standard expression conditions had low activity for the deamination of adenine (kcat = 2.0 s−1; kcat/Km = 2.5 × 10^3 M−1 s−1). However, when iron was sequestered with a metal chelator and the growth medium was supplemented with Mn2+ prior to induction, the purified enzyme was substantially more active for the deamination of adenine, with values of kcat and kcat/Km of 200 s−1 and 5 × 10^5 M−1 s−1, respectively. The apo-enzyme was prepared and reconstituted with Fe2+, Zn2+, or Mn2+. In each case, two enzyme-equivalents of metal were necessary for reconstitution of the deaminase activity. This work provides the first example of any member within the deaminase sub-family of the amidohydrolase superfamily (AHS) to utilize a binuclear metal center for the catalysis of a deamination reaction. [FeII/FeII]-ADE was oxidized to [FeIII/FeIII]-ADE with ferricyanide, with inactivation of the deaminase activity. Reducing [FeIII/FeIII]-ADE with dithionite restored the deaminase activity, and thus the di-ferrous form of the enzyme is essential for catalytic activity. No spin-coupling between metal ions was evident by EPR or Mössbauer spectroscopy. The three-dimensional structure of adenine deaminase from Agrobacterium tumefaciens (Atu4426) was determined by X-ray crystallography at 2.2 Å resolution, and adenine was modeled into the active site based on homology to other members of the amidohydrolase superfamily. Based on the model of the adenine-ADE complex and subsequent mutagenesis experiments, roles for each of the highly conserved residues were proposed. Solvent isotope effects, pH rate profiles and solvent viscosity were utilized to propose a chemical reaction mechanism and the identity of the rate-limiting steps.
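The size of the activity gain for the Mn2+-supplemented adenine deaminase preparation follows directly from the two pairs of constants quoted above. The sketch below works through that comparison in Python, assuming Michaelis-Menten kinetics and reading the reported kcat/Km values as 2.5 × 10^3 and 5 × 10^5 M−1 s−1; the class and variable names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticConstants:
    kcat: float          # turnover number, s^-1
    kcat_over_km: float  # catalytic efficiency, M^-1 s^-1

    @property
    def km(self) -> float:
        """Apparent Km in molar units, derived as kcat / (kcat/Km)."""
        return self.kcat / self.kcat_over_km


# Values from the ADE abstract; the exponents are an assumed reading.
standard = KineticConstants(kcat=2.0, kcat_over_km=2.5e3)
mn_supplemented = KineticConstants(kcat=200.0, kcat_over_km=5e5)

kcat_fold = mn_supplemented.kcat / standard.kcat
efficiency_fold = mn_supplemented.kcat_over_km / standard.kcat_over_km

print(f"kcat increase: {kcat_fold:.0f}-fold")            # 100-fold
print(f"kcat/Km increase: {efficiency_fold:.0f}-fold")   # 200-fold
print(f"Km (standard): {standard.km * 1e3:.1f} mM")      # 0.8 mM
print(f"Km (Mn-supplemented): {mn_supplemented.km * 1e3:.1f} mM")  # 0.4 mM
```

On this reading, most of the improvement comes from turnover (kcat), with only a twofold change in the derived Km; the Km values are inferred here and are not stated in the abstract.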
Two enzymes of unknown function from the amidohydrolase superfamily were discovered to catalyze the deamination of N-6-methyladenine to hypoxanthine and methylamine. The methylation of adenine in bacterial DNA is a common modification for the protection of host DNA against restriction endonucleases. The enzyme from Bacillus halodurans, Bh0637, catalyzes the deamination of N-6-methyladenine with a kcat of 185 s−1 and a kcat/Km of 2.5 × 10^6 M−1 s−1. Bh0637 catalyzes the deamination of N-6-methyladenine two orders of magnitude faster than that of adenine. A comparative model of Bh0637 was computed using the three-dimensional structure of Atu4426 (PDB code: 3NQB) as a structural template, and computational docking was used to rationalize the preferential utilization of N-6-methyladenine over adenine. This is the first identification of an N-6-methyladenine deaminase (6-MAD).
Fatty acyl-AMP ligase (FAAL) is a new member of a family of adenylate-forming enzymes that were recently discovered in Mycobacterium tuberculosis (Mtb). They are similar in sequence to fatty acyl-CoA ligases (FACLs). However, while FACLs perform a two-step catalytic reaction, AMP ligation followed by CoA ligation using ATP and CoA as cofactors, FAALs produce only the acyl adenylate and are unable to perform the second step. We report X-ray crystal structures of full length FAALs from E. coli (EcFAAL) and Legionella pneumophila (LpFAAL) bound to acyl adenylate, determined at resolution limits of 3.0 and 1.85 Å, respectively. The structures share a larger N-terminal domain and a smaller C-terminal domain, which together resemble the previously determined structures of FAAL and FACL proteins. Our two structures occur in quite different conformations. EcFAAL adopts the adenylate forming conformation typical of FACLs, whereas LpFAAL exhibits a unique intermediate conformation. Both EcFAAL and LpFAAL have insertion motifs that distinguish them from the FACLs. Structures of EcFAAL and LpFAAL reveal detailed interactions between this insertion motif and the interdomain hinge region and with the C-terminal domain. We suggest that the insertion motifs support sufficient interdomain motions to allow substrate binding and product release during acyl adenylate formation, whereas they preclude CoA binding thereby preventing CoA ligation.
Fatty acyl-AMP ligase; Fatty acyl-CoA ligase; X-ray structure; AMP; CoA
Imidazolonepropionase (HutI) (imidazolone-5-propanoate hydrolase; EC 3.5.2.7) is a member of the amidohydrolase superfamily and catalyzes the conversion of imidazolone-5-propanoate to N-formimino-L-glutamate in the histidine degradation pathway. We have determined the three-dimensional crystal structures of HutI from A. tumefaciens (At-HutI) and an environmental sample from the Sargasso Sea Ocean Going Survey (Es-HutI) bound to the product [N-formimino-L-glutamate (NIG)] and an inhibitor [3-(2,5-dioxoimidazolidin-4-yl)-propionic acid (DIP)], respectively. In both structures the active site is contained within each monomer, and its organization displays the landmark feature of the amidohydrolase superfamily, showing a metal ligand (iron), four histidines and one aspartic acid. A catalytic mechanism involving His265 is proposed based on the inhibitor-bound structure. This mechanism is applicable to all HutI enzymes.
AHS; amidohydrolases; NIG; DIP; At-HutI; Es-HutI
Magnesium transporter; cytosolic domain; x-ray structure
Bacillus cereus Hemolysin BL enterotoxin, a ternary complex of three proteins, is the causative agent of food poisoning and requires all three components for virulence. The X-ray structure of the binding domain of HBL suggests that it may form a pore similar to other soluble channel forming proteins. A putative pathway of pore formation is discussed.
HBL-B; Hemolysin; pore-formation; β-hairpin
The X-ray structure of a putative BenF-like (gene name: PFL1329) protein from Pseudomonas fluorescens Pf-5 (PflBenF) has been determined at 2.6Å resolution. X-ray crystallography revealed a canonical 18-stranded β-barrel fold that forms a central pore with a diameter of ∼4.6Å, which is consistent with the size and physicochemical properties of the presumed aromatic acid substrate, benzoate. Detailed comparisons with the previously-determined structure of Pseudomonas aeruginosa OpdK, a vanillate influx channel, revealed an arginine-rich aromatic acid selectivity filter of nearly identical structure composed of seven highly conserved residues Arg∼Asp∼Arg∼Arg∼Ser∼Asp∼Arg (R∼D∼R∼R∼S∼D∼R sequence motif, where ∼ denotes intervening residues) that define the narrowest part of the pore.
BenF-like; substrate specific porin; OprD superfamily; OprD subfamily; OpdK subfamily; benzoate; Pseudomonas; integral membrane protein
A strategy for increasing the efficiency of protein crystallization/structure determination with mass spectrometry has been developed. This approach combines insights from limited proteolysis/mass spectrometry and crystallization via in situ proteolysis. The procedure seeks to identify protease-resistant polypeptide chain segments from purified proteins on the time-scale of crystal formation, and subsequently to crystallize the target protein in the presence of the optimal protease at the right relative concentration. We report our experience with ten proteins of unknown structure, two of which yielded high-resolution X-ray structures. The advantage of this approach comes from its ability to select only those structure determination candidates that are likely to benefit from application of in situ proteolysis, using conditions most likely to result in formation of a stable proteolytic digestion product suitable for crystallization.
Mass spectrometry; in situ proteolysis; crystallization; x-ray crystallography
Two uncharacterized enzymes from the amidohydrolase superfamily belonging to cog1228 were cloned, expressed and purified to homogeneity. The two proteins, Sgx9260c (gi|44242006) and Sgx9260b (gi|44479596), were derived from environmental DNA samples originating from the Sargasso Sea. The catalytic function and substrate profiles for Sgx9260c and Sgx9260b were determined using a comprehensive library of dipeptides and N-acyl derivatives of L-amino acids. Sgx9260c catalyzes the hydrolysis of Gly-L-Pro, L-Ala-L-Pro and N-acyl derivatives of L-Pro. The best substrate identified to date is N-acetyl-L-Pro with a value of kcat/Km of 3 × 10^5 M−1 s−1. Sgx9260b catalyzes the hydrolysis of L-hydrophobic L-Pro dipeptides and N-acyl derivatives of L-Pro. The best substrate identified to date is N-propionyl-L-Pro with a value of kcat/Km of 1 × 10^5 M−1 s−1. Three-dimensional structures of both proteins were determined by X-ray diffraction methods (PDB codes: 3MKV and 3FEQ). These proteins fold as distorted (β/α)8-barrels with two divalent cations in the active site. The structure of Sgx9260c was also determined as a complex with the N-methyl phosphonate derivative of L-Pro (PDB code: 3N2C). In this structure the phosphonate moiety bridges the binuclear metal center and one oxygen atom interacts with His-140. The α-carboxylate of the inhibitor interacts with Tyr-231. The proline side chain occupies a small substrate binding cavity formed by residues contributed from the loop that follows β-strand 7 within the (β/α)8-barrel. A total of 38 other proteins from cog1228 are predicted to have the same substrate profile based on conservation of the substrate binding residues. The structure of an evolutionarily related protein, Cc2672 from Caulobacter crescentus, was determined as a complex with the N-methyl phosphonate derivative of L-arginine (PDB code: 3MTW).