Spatially selective heteronuclear multiple-quantum coherence (SS HMQC) NMR spectroscopy was devised for solution studies of proteins. Due to ‘time-staggered’ acquisition of free induction decays (FIDs) in different slices, SS HMQC allows one to employ long delays for longitudinal nuclear spin relaxation at high repetition rates for the acquisition of the FIDs. To also achieve high intrinsic sensitivity, SS HMQC was implemented by combining a single spatially selective 1H excitation pulse with non-selective 1H 180° pulses. High-quality spectra could be obtained within 66 seconds for a 7.6 kDa uniformly 13C,15N-labeled protein, and within 45 and 90 seconds for, respectively, two uniformly 2H,13C,15N-labeled but isoleucine, leucine and valine methyl group protonated proteins with molecular weights of 7.5 and 43 kDa.
rapid data acquisition; spatially selective NMR; time staggered data acquisition; flip-back pulses; HMQC
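The timing argument behind time-staggered acquisition can be illustrated with a minimal sketch. It assumes an idealized round-robin scheme in which one slice is excited per scan, so that each slice recovers while the others are being sampled. The function name, the slice count, and the scan time are hypothetical and are not taken from the study.

```python
def staggered_timing(n_slices: int, scan_time_s: float) -> tuple[float, float]:
    """Return (FID repetition rate in Hz, time between excitations of one slice in s).

    Idealized assumption: one FID is acquired every `scan_time_s`, cycling
    through `n_slices` slices, so a given slice is re-excited only once
    every n_slices * scan_time_s seconds.
    """
    if n_slices < 1 or scan_time_s <= 0:
        raise ValueError("need n_slices >= 1 and scan_time_s > 0")
    repetition_rate_hz = 1.0 / scan_time_s
    recovery_per_slice_s = n_slices * scan_time_s
    return repetition_rate_hz, recovery_per_slice_s


# Hypothetical numbers: 4 slices, one FID every 0.25 s.
rate, recovery = staggered_timing(n_slices=4, scan_time_s=0.25)
print(f"{rate:.1f} FIDs/s, {recovery:.2f} s between excitations of the same slice")
```

In this simplified picture the repetition rate is set by the scan time alone, while the recovery time available to each slice grows with the number of slices.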
The 500 kDa protein plectin is essential for the cytoskeletal organization of most mammalian cells and it is up-regulated in some types of cancer. Here, we report nearly complete sequence-specific polypeptide backbone, 13Cβ and methyl group resonance assignments for 24 kDa human plectin(4403-4606) containing the C-terminal plectin repeat domain 6.
cytoskeletal linker protein; HCPIN; plakin repeat domain; plectin repeat domain; selective isotope labeling; structural genomics
BH3 peptides are key mediators of apoptosis and have served as the lead structures for the development of anticancer therapeutics. Previously, we reported the application of a simple cysteine-based side chain cross-linking chemistry to NoxaBH3 peptides that led to the generation of the cross-linked NoxaBH3 peptides with increased cell permeability and higher inhibitory activity against Mcl-1 (Muppidi, A.; Doi, K.; Edwardraja, S.; Drake, E. J.; Gulick, A. M.; Wang, H.-G.; Lin, Q. J. Am. Chem. Soc. 2012, 134, 14734). To deliver cross-linked NoxaBH3 peptides selectively into cancer cells for enhanced efficacy and reduced systemic toxicity, here we report the conjugation of the NoxaBH3 peptides with the extracellular ubiquitin, a recently identified endogenous ligand for CXCR4, a chemokine receptor overexpressed in cancer cells. The resulting ubiquitin-NoxaBH3 peptide conjugates showed increased inhibitory activity against Mcl-1 and selective killing of the CXCR4-expressing cancer cells. The successful delivery of the NoxaBH3 peptides by ubiquitin into cancer cells suggests that the ubiquitin/CXCR4 axis may serve as a general route for the targeted delivery of anticancer agents.
A high-quality structure of the 68-residue protein CD1104B from Clostridium difficile strain 630 exhibits a distinct all α-helical fold. The structure presented here is the first representative of bacterial protein domain family PF14203 (currently 180 members) of unknown function (DUF4319) and reveals that the side chains of the only two strictly conserved residues (Glu 8 and Lys 48) form a salt bridge. Moreover, these two residues are located in the vicinity of the largest surface cleft, which is predicted to contribute to a surface area involved in protein-protein interactions. This, along with its coding in transposon CTn4, suggests that CD1104B (and very likely all members of Pfam PF14203) functions by interacting with other proteins required for the transfer of transposons between different bacterial species.
CD1104B; PF14203; DUF4319; Transposon; Structural Genomics
High-quality NMR structures of the C-terminal domain comprising residues 484-537 of the 537-residue protein Bacterial chlorophyll subunit B (BchB) from Chlorobium tepidum and residues 9-61 of 61-residue Asr4154 from Nostoc sp. (strain PCC 7120) exhibit a mixed α/β fold comprised of three α-helices and a small β-sheet packed against the second α-helix. These two proteins share 29% sequence similarity and their structures are globally quite similar. The structures of BchB(484-537) and Asr4154(9-61) are the first representative structures for the large protein family (Pfam) PF08369, a family of unknown function currently containing 610 members in bacteria and eukaryotes. Furthermore, BchB(484-537) complements the structural coverage of the dark-operating protochlorophyllide oxidoreductase (DPOR).
BchB; DPOR; Asr4154; PF08369; PCP-red; structural genomics
Bacterial species in the Enterobacteriaceae typically contain multiple paralogues of a small domain of unknown function (DUF1471) from a family of conserved proteins also known as YhcN or BhsA/McbA. Proteins containing DUF1471 may have a single or three copies of this domain. Representatives of this family have been demonstrated to play roles in several cellular processes including stress response, biofilm formation, and pathogenesis. We have conducted NMR and X-ray crystallographic studies of four DUF1471 domains from Salmonella representing three different paralogous DUF1471 subfamilies: SrfN, YahO, and SssB/YdgH (two of its three DUF1471 domains: the N-terminal domain I (residues 21–91), and the C-terminal domain III (residues 244–314)). Notably, SrfN has been shown to have a role in intracellular infection by Salmonella Typhimurium. These domains share less than 35% pairwise sequence identity. Structures of all four domains show a mixed α+β fold that is most similar to that of bacterial lipoprotein RcsF. However, all four DUF1471 sequences lack the redox sensitive cysteine residues essential for RcsF activity in a phospho-relay pathway, suggesting that DUF1471 domains perform a different function(s). SrfN forms a dimer in contrast to YahO and SssB domains I and III, which are monomers in solution. A putative binding site for oxyanions such as phosphate and sulfate was identified in SrfN, and an interaction between the SrfN dimer and sulfated polysaccharides was demonstrated, suggesting a direct role for this DUF1471 domain at the host-pathogen interface.
A high-quality NMR solution structure is presented for protein hMcl-1(171–327), which comprises residues 171–327 of the human anti-apoptotic protein Mcl-1 (hMcl-1). Since this construct contains the three Bcl-2 homology (BH) sequence motifs which participate in forming a binding site for inhibitors of hMcl-1, it is deemed to be crucial for structure-based design of novel anti-cancer drugs blocking the Mcl-1-related anti-apoptotic pathway. While the coordinates of an NMR solution structure for a corresponding construct of the mouse homologue (mMcl-1) are publicly available, our structure is the first atomic resolution structure reported for the ‘apo form’ of the human protein. Comparison of the two structures reveals that hMcl-1(171–327) exhibits a somewhat wider ligand/inhibitor binding groove as well as a different charge distribution within the BH3 binding groove. These findings strongly suggest that the availability of the human structure is of critical importance to support future design of cancer drugs.
Rational drug design relies on three-dimensional structures of biological macromolecules, especially proteins. Structural genomics high-throughput (HTP) structure determination platforms established by the NIH Protein Structure Initiative are uniquely suited to provide these structures. NMR plays a critical role since (i) many important protein targets do not form single crystals required for X-ray diffraction and (ii) NMR can provide valuable structural and dynamic information on proteins and their drug complexes that cannot be obtained with X-ray crystallography. In this article, recent advances of NMR driven by structural genomics projects are reviewed. These advances promise that future pharmaceutical discovery and design of drugs can increasingly rely on protocols for rapid and accurate NMR structure determination.
protein NMR; structural genomics; structural proteomics; drug discovery; protein interaction networks; structural bioinformatics
A high-quality NMR structure of the helicase associated (HA) domain comprising residues 627–691 of the 753-residue protein BVU_0683 from Bacteroides vulgatus exhibits an all α-helical fold. The structure presented here is the first representative for the large protein domain family PF03457 (currently 742 members) of HA domains. Comparison with structurally similar proteins supports the hypothesis that HA domains bind to DNA and that binding specificity varies greatly within the family of HA domains constituting PF03457.
A6KY75_BACV8; BVU_0683; PF03457; Helicase associated domain; Structural genomics; SANT domain
Identification of a non-invasive technique to assess embryo implantation potential in assisted reproduction would greatly increase success rates and lead more efficiently to single embryo transfer. Early studies suggested metabonomic analysis of spent culture media could improve embryo selection. The goal of this study is to assess if embryo implantation can be predicted based on proton nuclear magnetic resonance (1H NMR) profiles of spent embryo culture media from patients undergoing transfer of multiple embryos on cycle day 3.
We conducted a retrospective study in an academic assisted reproduction technology (ART) program and analyzed the data in a university research center. Two hundred twenty-eight spent culture media samples originating from 108 patients were individually analyzed. Specifically, five distinct sets (1 to 5) of different types of spent media samples (volume ~14 μL) from embryos that resulted in clinical pregnancy (positive heart rate at 6 weeks gestation) (n1 = 29; n2 = 19; n3 = 9; n4 = 12; n5 = 33; ntotal = 102) and from embryos that did not implant (n1 = 28; n2 = 29; n3 = 18; n4 = 15; n5 = 36; ntotal = 126) were collected on day 3 of embryo growth. The media samples were profiled using 1H NMR spectroscopy, and the NMR profiles of sets 1 to 5 were subject to standard uni- and multi-variate data analyses in order to evaluate potential correlation of profiles with implantation success.
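As a quick consistency check on the sample counts given above, the per-set numbers can be summed. This is only an arithmetic sketch; the variable names are illustrative and not part of the study.

```python
# Per-set sample counts (sets 1 to 5) as stated in the text.
pregnant = [29, 19, 9, 12, 33]        # embryos that resulted in clinical pregnancy
not_implanted = [28, 29, 18, 15, 36]  # embryos that did not implant

total_pregnant = sum(pregnant)
total_not_implanted = sum(not_implanted)
total_samples = total_pregnant + total_not_implanted

# Number of samples in each media set (both outcome classes combined).
per_set = [p + q for p, q in zip(pregnant, not_implanted)]

print(total_pregnant, total_not_implanted, total_samples)
print(per_set)
```

The class totals of 102 and 126 add up to the 228 samples stated for the study.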
For set 1 of the media samples, a borderline class separation of NMR profiles was obtained by use of principal component analysis (PCA) and logistic regression. This tentative class separation could not be repeated and validated in any of the other media sets 2 to 5.
Despite the rigorous technical approach, 1H NMR based profiling of spent culture media cannot predict success of implantation for day 3 human embryos.
Electronic supplementary material
The online version of this article (doi:10.1007/s10815-012-9877-9) contains supplementary material, which is available to authorized users.
ART; 1H NMR; Metabonomics; IVF; Implantation success
The yeast mitochondrial protein Sdh5 is required for the covalent attachment of flavin adenine dinucleotide (FAD) to protein Sdh1, a subunit of the hetero-tetrameric enzyme succinate dehydrogenase (SDH). The NMR structure of Sdh5 represents the first eukaryotic structure of the Pfam family PF03937 and reveals a conserved surface region, which likely represents the Sdh1-Sdh5 interaction interface. Point mutations in this region result in the loss of covalent flavinylation of Sdh1. Moreover, backbone chemical shift perturbation measurements showed that Sdh5 does not bind FAD in vitro, indicating that it does not function as a simple cofactor transporter in vivo.
Protein design tests our understanding of protein stability and structure. Successful design methods should allow the exploration of sequence space not found in nature. However, when redesigning naturally occurring protein structures, most fixed-backbone design algorithms return amino acid sequences that share strong sequence identity with wild-type sequences, especially in the protein core. This behavior places a restriction on the functional space that can be explored and is not consistent with observations from nature, where sequences of low identity have similar structures. Here, we allow backbone flexibility during design to mutate every position in the core (38 residues) of a four-helix bundle protein. Only small perturbations to the backbone, 1-2 Å, were needed to entirely mutate the core. The redesigned protein, DRNN, is exceptionally stable (melting point > 140 °C). NMR and X-ray crystal structures show that the side chains and backbone were accurately modeled (all-atom RMSD = 1.3 Å).
Computational Protein Design; de novo Protein Design; Flexible Backbone Protein Design
High-quality NMR structures of the homo-dimeric proteins Bvu3908 (69 residues per monomeric unit) from Bacteroides vulgatus and Bt2368 (74 residues) from Bacteroides thetaiotaomicron reveal the presence of winged helix-turn-helix (wHTH) motifs mediating tight complex formation. Such homo-dimer formation by winged HTH motifs is otherwise found only in two DNA-binding proteins with known structure: the C-terminal wHTH domain of transcriptional activator FadR from E. coli and protein TubR from B. thuringiensis, which is involved in plasmid DNA segregation. However, the relative orientation of the wHTH motifs is different and residues involved in DNA-binding are not conserved in Bvu3908 and Bt2368. Hence, the proteins of the present study are not very likely to bind DNA, but are likely to exhibit a function that has thus far not been ascribed to homo-dimers formed by winged HTH motifs. The structures of Bvu3908 and Bt2368 are the first atomic resolution structures for Pfam family PF10771, a family of unknown function (DUF2582) currently containing 128 members.
Bvu3908; Bt2368; PF10771; DUF2582; Winged helix-turn-helix; Structural genomics
De novo proteins provide a unique opportunity for investigating the structure-function relationships of metalloproteins in a minimal, well-defined, and controlled scaffold. Herein, we describe the rational programming of function in a de novo designed di-iron carboxylate protein from the due ferri family. Originally created to catalyze O2-dependent, two-electron oxidation of hydroquinones, the protein was reprogrammed to catalyze the selective N-hydroxylation of arylamines by remodeling the substrate access cavity and introducing a critical third His ligand to the metal binding cavity. Additional second- and third-shell modifications were required to stabilize the His ligand in the core of the protein. These changes resulted in at least a 10^6-fold increase in the relative rates of the two reactions. This result highlights the potential for using de novo proteins as scaffolds for future investigations of geometric and electronic factors that influence the catalytic tuning of di-iron active sites.
de novo design; metalloproteins; di-iron proteins; four-helix bundle; oxidase
The protein family (Pfam) PF04536 is a broadly conserved domain family of unknown function (DUF477), with more than 1,350 members in prokaryotic and eukaryotic proteins. High-quality NMR structures of the N-terminal domain comprising residues 41–180 of the 684-residue protein CG2496 from Corynebacterium glutamicum and the N-terminal domain comprising residues 35–182 of the 435-residue protein PG0361 from Porphyromonas gingivalis both exhibit an α/β fold comprised of a four-stranded β-sheet, three α-helices packed against one side of the sheet, and a fourth α-helix attached to the other side. In spite of low sequence similarity (18%) assessed by structure-based sequence alignment, the two structures are globally quite similar. However, moderate structural differences are observed for the relative orientation of two of the four helices. Comparison with known protein structures reveals that the α/β architecture of CG2496(41–180) and PG0361(35–182) has previously not been characterized. Moreover, calculation of surface charge potential and identification of surface clefts indicate that the two domains very likely have different functions.
CG2496; PG0361; CgR26A; PgR37A; PF04536; DUF477; Structural genomics
The protocols currently used for protein structure determination by NMR depend on the determination of a large number of upper distance limits for proton-proton pairs. Typically, this task is performed manually by an experienced researcher rather than automatically by using a specific computer program. To assess whether it is indeed possible to generate in a fully automated manner NMR structures adequate for deposition in the Protein Data Bank, we gathered ten experimental datasets with unassigned NOESY peak lists for various proteins of unknown structure, computed structures for each of them using different, fully automatic programs, and compared the results to each other and to the manually solved reference structures that were not available at the time the data were provided. This constitutes a stringent “blind” assessment similar to the CASP and CAPRI initiatives. This study demonstrates the feasibility of routine, fully automated protein structure determination by NMR.
Computationally designing protein-protein interactions with high affinity and desired orientation is a challenging task. Incorporating metal-binding sites at the target interface may be one approach for increasing affinity and specifying the binding mode, thereby improving robustness of designed interactions for use as tools in basic research as well as in applications from biotechnology to medicine. Here we describe a Rosetta-based approach for the rational design of a protein monomer to form a zinc-mediated, symmetric homodimer. Our metal interface design, named MID1 (NESG target ID OR37), forms a tight dimer in the presence of zinc (MID1-zinc) with a dissociation constant <30 nM. Without zinc the dissociation constant is 4 μM. The crystal structure of MID1-zinc shows good overall agreement with the computational model, but only three out of four designed histidines coordinate zinc. However, a histidine-to-glutamate point mutation resulted in four-coordination of zinc, and the resulting metal binding site and dimer orientation closely match the computational model (Cα RMSD = 1.4 Å).
computational protein interface design; protein-protein interaction; metal; zinc; cobalt; homodimer; de novo
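To put the two dissociation constants reported for MID1 on an energy scale, the following sketch converts them to standard binding free energies using the textbook relation ΔG = RT ln(Kd / 1 M). The temperature of 298.15 K and the helper name are assumptions for illustration; the abstract does not state the measurement temperature, and 30 nM is only an upper bound.

```python
import math

R_KCAL_PER_MOL_K = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Standard binding free energy in kcal/mol: dG = RT ln(Kd / 1 M).

    The default temperature is an assumption, not a value from the text.
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


kd_with_zinc = 30e-9    # upper bound from the text (< 30 nM)
kd_without_zinc = 4e-6  # from the text (4 uM)

# Negative value: zinc makes dimerization more favorable.
ddg = binding_free_energy(kd_with_zinc) - binding_free_energy(kd_without_zinc)
print(f"zinc stabilizes the dimer by at least {-ddg:.1f} kcal/mol")
```

Because the zinc-bound dissociation constant is an upper bound, the computed difference is a lower bound on the stabilization under the assumed temperature.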
The soluble monomeric domain of lipoprotein YxeF from the Gram-positive bacterium B. subtilis was selected by the Northeast Structural Genomics Consortium (NESG) as a target of a biomedical theme project focusing on the structure determination of the soluble domains of bacterial lipoproteins. The solution NMR structure of YxeF reveals a calycin fold and distant homology with the lipocalin Blc from the Gram-negative bacterium E. coli. In particular, the characteristic β-barrel, which is open to the solvent at one end, is extremely well conserved in YxeF with respect to Blc. The identification of YxeF as the first lipocalin homologue occurring in a Gram-positive bacterium suggests that lipocalins emerged before the evolutionary divergence of Gram-positive and Gram-negative bacteria. Since YxeF is devoid of the α-helix that packs in all lipocalins with known structure against the β-barrel to form a second hydrophobic core, we propose to introduce a new lipocalin sub-family named ‘slim lipocalins’, with YxeF and the other members of Pfam family PF11631 to which YxeF belongs constituting the first representatives. The results presented here exemplify the impact of structural genomics to enhance our understanding of biology and to generate new biological hypotheses.
We show that 1H NMR based metabonomics of serum allows the diagnosis of early stage I/II epithelial ovarian cancer (EOC) required for successful treatment. Because patient specimens are highly precious, we conducted an exploratory study using a micro-flow probe requiring only 20 μL serum. By use of logistic regression on principal components (PCs) of the NMR profiles, we built a 4-variable model for early stage EOC prediction (training set: 69 EOC specimens, 84 healthy controls; test set: 40 EOC, 44 controls) with operating characteristics estimated for the test set at 80% specificity [95% confidence interval (CI): 65% to 90%], 63% sensitivity (95% CI: 46% to 77%), and an area under the receiver operating characteristic curve (AUC) of 0.796. Independent validation (50 EOC, 50 controls) of the model yielded 95% specificity (95% CI: 86% to 99.5%), 68% sensitivity (95% CI: 53% to 80%) and an AUC of 0.949. A test on cancer type specificity showed that women with renal cell carcinoma were not incorrectly diagnosed with EOC, indicating that metabonomics bears significant potential for cancer type-specific diagnosis. Our model can potentially be applied for women at high risk for EOC, and our study promises to contribute to developing a screening protocol for the general population.
Ovarian Cancer; Early Stage Detection; Metabonomics; Cancer-type Specificity; NMR; Micro-flow Probe; Principal Component Analysis; Predictive Statistical Model
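The operating characteristics quoted for the test set can be illustrated with a small sketch that turns classification counts into a proportion and a 95% confidence interval. Two assumptions are made here: the counts (35 of 44 controls and 25 of 40 EOC specimens classified correctly) are back-calculated from the rounded percentages and are not given in the abstract, and the Wilson score interval is used as one common choice, since the abstract does not state which interval method was applied.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0 or not 0 <= successes <= total:
        raise ValueError("need total > 0 and 0 <= successes <= total")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half, center + half


# Assumed counts, consistent with the rounded 80% specificity and 63% sensitivity.
for label, correct, total in [("specificity", 35, 44), ("sensitivity", 25, 40)]:
    low, high = wilson_interval(correct, total)
    print(f"{label}: {correct / total:.1%} (95% CI {low:.1%} to {high:.1%})")
```

Intervals from this sketch need not coincide exactly with the published ones, because a different interval method or different underlying counts may have been used.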
Recording of four-dimensional (4D) spectra for proteins in the solid state has opened new avenues to obtain virtually complete resonance assignments and three-dimensional (3D) structures of proteins. As in solution state NMR, the sampling of three indirect dimensions leads per se to long minimal measurement time. Furthermore, artifact suppression in solid state NMR relies primarily on radio-frequency pulse phase cycling. For an n-step phase cycle, the minimal measurement times of both 3D and 4D spectra are increased n times. To tackle the associated ‘sampling problem’ and to avoid sampling limited data acquisition, solid state G-Matrix Fourier Transform (SS GFT) projection NMR is introduced to rapidly acquire 3D and 4D spectral information. Specifically, (4,3)D (HA)CANCOCX and (3,2)D (HACA)NCOCX were implemented and recorded for the 6 kDa protein GB1 within about 10% of the time required for acquiring the conventional congeners with the same maximal evolution times and spectral widths in the indirect dimensions. Spectral analysis was complemented by comparative analysis of expected spectral congestion in conventional and GFT NMR experiments, demonstrating that high spectral resolution of the GFT NMR experiments enables one to efficiently obtain nearly complete resonance assignments even for large proteins.
Magic-angle spinning; Chemical shift assignments; GB1; Correlation spectroscopy
The New York Consortium on Membrane Protein Structure (NYCOMPS) was formed to accelerate the acquisition of structural information on membrane proteins by applying a structural genomics approach. NYCOMPS comprises a bioinformatics group, a centralized facility operating a high-throughput cloning and screening pipeline, a set of associated wet labs that perform high-level protein production and structure determination by X-ray crystallography and NMR, and a set of investigators focused on methods development. In the first three years of operation, the NYCOMPS pipeline has so far produced and screened 7,250 expression constructs for 8,045 target proteins. Approximately 600 of these verified targets were scaled up to levels required for structural studies, so far yielding 24 membrane protein crystals. Here we describe the overall structure of NYCOMPS and provide details on the high-throughput pipeline.
Membrane proteins; Structural genomics; High throughput; NMR; X-ray
We describe a computational protocol, called DDMI, for redesigning scaffold proteins to bind to a specified region on a target protein. The DDMI protocol is implemented within the Rosetta molecular modeling program and uses rigid-body docking, sequence design, and gradient-based minimization of backbone and side chain torsion angles to design low energy interfaces between the scaffold and target protein. Iterative rounds of sequence design and conformational optimization were needed to produce models that have calculated binding energies that are similar to binding energies calculated for native complexes. We also show that additional conformation sampling with molecular dynamics can be iterated with sequence design to further lower the computed energy of the designed complexes. To experimentally test the DDMI protocol we redesigned the human hyperplastic discs protein to bind to the kinase domain of p21-activated kinase 1 (PAK1). Six designs were experimentally characterized. Two of the designs aggregated and were not characterized further. Of the remaining four designs, three bound to PAK1 with affinities tighter than 350 μM. The tightest binding design, named Spider Roll, bound with an affinity of 100 μM. NMR-based structure prediction of Spider Roll based on backbone and 13Cβ chemical shifts using the program CS-ROSETTA indicated that the architecture of human hyperplastic discs protein is preserved. Mutagenesis studies confirmed that Spider Roll binds the target patch on PAK1. Additionally, Spider Roll binds to full length PAK1 in its activated state, but does not bind PAK1 when it forms an auto-inhibited conformation that blocks the Spider Roll target site. Subsequent NMR characterization of the binding of Spider Roll to PAK1 revealed a comparably small binding ‘on-rate’ constant (<< 10^5 M^-1 s^-1). The ability to rationally design the site of novel protein-protein interactions is an important step towards creating new proteins that are useful as therapeutics or molecular probes.
Computational protein design; protein-protein interactions; protein docking; Rosetta molecular modeling program; NMR; CS-ROSETTA
VPA0419; yiiS; PFAM 04175; structural genomics; GFT NMR
analytical methods; clean absorption mode; GFT projection NMR; NMR spectroscopy; resolution enhancement