Results 1-25 (35)
1.  Trendspotting in the Protein Data Bank 
FEBS letters  2013;587(8):1036-1045.
The Protein Data Bank (PDB) was established in 1971 as a repository for the three-dimensional structures of biological macromolecules. Since then, more than 85,000 biological macromolecule structures have been determined and made available in the PDB archive. Through analysis of the corpus of data, it is possible to identify trends that can be used to inform us about the future of structural biology and to plan the best ways to improve the management of the ever-growing amount of PDB data.
PMCID: PMC4068610  PMID: 23337870
2.  The Future of the Protein Data Bank 
Biopolymers  2012;99(3):218-222.
The Worldwide Protein Data Bank (wwPDB) is the international collaboration that manages the deposition, processing and distribution of the PDB archive. The wwPDB’s mission is to maintain a single archive of macromolecular structural data that are freely and publicly available to the global community. Its members [RCSB PDB (USA), PDBe (Europe), PDBj (Japan), and BMRB (USA)] host data-deposition sites and mirror the PDB ftp archive. To support future developments in structural biology, the wwPDB partners are addressing organizational, scientific, and technical challenges.
PMCID: PMC3684242  PMID: 23023942
Protein Data Bank; structural biology; archive
3.  Chemical annotation of small and peptide-like molecules at the Protein Data Bank 
Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality.
PMCID: PMC3843158  PMID: 24291661
4.  The Nucleic Acid Database: new features and capabilities 
Nucleic Acids Research  2013;42(D1):D114-D122.
The Nucleic Acid Database (NDB) is a web portal providing access to information about 3D nucleic acid structures and their complexes. In addition to primary data, the NDB contains derived geometric data, classifications of structures and motifs, standards for describing nucleic acid features, as well as tools and software for the analysis of nucleic acids. A variety of search capabilities are available, as are many different types of reports. This article describes the recent redesign of the NDB Web site with special emphasis on new RNA-derived data and annotations and their implementation and integration into the search capabilities.
PMCID: PMC3964972  PMID: 24185695
5.  The Protein Structure Initiative Structural Biology Knowledgebase Technology Portal: A Structural Biology Web Resource 
The Technology Portal of the Protein Structure Initiative Structural Biology Knowledgebase (PSI SBKB) is a web resource providing information about methods and tools that can be used to relieve bottlenecks in many areas of protein production and structural biology research. Several useful features are available on the web site, including multiple ways to search the database of over 250 technological advances, a link to videos of methods on YouTube, and access to a technology forum where scientists can connect, ask questions, get news, and develop collaborations. The Technology Portal is a component of the PSI SBKB, which presents integrated genomic, structural, and functional information for all protein sequence targets selected by the Protein Structure Initiative. Created in collaboration with the Nature Publishing Group, the SBKB offers an array of resources for structural biologists, such as a research library, editorials about new research advances, a featured biological system each month, and a Functional Sleuth for searching protein structures of unknown function. An overview of the various features and examples of user searches highlight the information, tools, and avenues for scientific interaction available through the Technology Portal.
PMCID: PMC3588887  PMID: 22527514
Database; Protein; Protein Production; Structural Biology; Structural Genomics; Technology
6.  The RCSB Protein Data Bank: new resources for research and education 
Nucleic Acids Research  2012;41(D1):D475-D482.
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) develops tools and resources that provide a structural view of biology for research and education. The RCSB PDB web site uses the curated 3D macromolecular data contained in the PDB archive to offer unique methods to access, report and visualize data. Recent activities have focused on improving methods for simple and complex searches of PDB data, creating specialized access to chemical component data and providing domain-based structural alignments. New educational resources are offered at the PDB-101 educational view of the main web site such as Author Profiles that display a researcher’s PDB entries in a timeline. To promote different kinds of access to the RCSB PDB, Web Services have been expanded, and an RCSB PDB Mobile application for the iPhone/iPad has been released. These improvements enable new opportunities for analyzing and understanding structure data.
PMCID: PMC3531086  PMID: 23193259
7.  The Protein Data Bank at 40: Reflecting on the Past to Prepare for the Future 
A symposium celebrating the 40th anniversary of the Protein Data Bank archive (PDB), organized by the Worldwide Protein Data Bank, was held at Cold Spring Harbor Laboratory (CSHL) October 28–30, 2011. PDB40’s distinguished speakers highlighted four decades of innovation in structural biology, from the early era of structural determination to future directions for the field.
PMCID: PMC3501388  PMID: 22404998
8.  E. coli trp repressor forms a domain-swapped array in aqueous alcohol 
Structure (London, England : 1993)  2004;12(6):1099-1108.
The E. coli trp repressor (trpR) homodimer recognizes its palindromic DNA-binding site through a pair of flexible helix-turn-helix (HTH) motifs displayed on an intertwined helical core. Flexible N-terminal arms mediate association between dimers bound to tandem DNA sites. The 2.5 Å X-ray structure of trpR crystallized in 30% (v/v) isopropanol reveals a substantial conformational rearrangement of HTH motifs and N-terminal arms, with the protein appearing in the unusual form of an ordered 3D domain-swapped supramolecular array. Small angle X-ray scattering measurements show that the self-association properties of trpR in solution are fundamentally altered by isopropanol.
PMCID: PMC3228604  PMID: 15274929
9.  The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods 
The Protein Structure Initiative’s Structural Biology Knowledgebase (SBKB) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples on how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI’s high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology.
PMCID: PMC3123456  PMID: 21472436
Protein; Protein production; Structural biology; Structural databases; Structural genomics; Theoretical models
10.  Quality assurance for the query and distribution systems of the RCSB Protein Data Bank 
The RCSB Protein Data Bank (RCSB PDB) is a key online resource for structural biology and related scientific disciplines. The website is used on average by 165 000 unique visitors per month, and more than 2000 other websites link to it. The amount and complexity of PDB data as well as the expectations on its usage are growing rapidly. Therefore, ensuring the reliability and robustness of the RCSB PDB query and distribution systems is crucially important and increasingly challenging. This article describes quality assurance for the RCSB PDB website at several distinct levels, including: (i) hardware redundancy and failover, (ii) testing protocols for weekly database updates, (iii) testing and release procedures for major software updates and (iv) miscellaneous monitoring and troubleshooting tools and practices. As such it provides suggestions for how other websites might be operated.
PMCID: PMC3056270  PMID: 21382834
11.  The RCSB Protein Data Bank: redesigned web site and web services 
Nucleic Acids Research  2010;39(Database issue):D392-D401.
The RCSB Protein Data Bank (RCSB PDB) web site has been redesigned to increase usability and to cater to a larger and more diverse user base. This article describes key enhancements and new features that fall into the following categories: (i) query and analysis tools for chemical structure searching, query refinement, tabulation and export of query results; (ii) web site customization and new structure alerts; (iii) pair-wise and representative protein structure alignments; (iv) visualization of large assemblies; (v) integration of structural data with the open access literature and binding affinity data; and (vi) web services and web widgets to facilitate integration of PDB data and tools with other resources. These improvements enable a range of new possibilities to analyze and understand structure data. The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education.
PMCID: PMC3013649  PMID: 21036868
12.  Promoting a structural view of biology for varied audiences: an overview of RCSB PDB resources and experiences 
Journal of Applied Crystallography  2010;43(Pt 5):1224-1229.
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) serves a community of users with diverse backgrounds and interests. In addition to processing, archiving and distributing structural data, it also develops educational resources and materials to enable people to utilize PDB data and to further a structural view of biology.
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) supports scientific research and education worldwide by providing an essential resource of information on biomolecular structures. In addition to serving as a deposition, data-processing and distribution center for PDB data, the RCSB PDB offers resources and online materials that different audiences can use to customize their structural biology instruction. These include resources for general audiences that present macromolecular structure in the context of a biological theme, method-based materials for researchers who take a more traditional approach to the presentation of structural science, and materials that mix theme-based and method-based approaches for educators and students. Through these efforts the RCSB PDB aims to enable optimal use of structural data by researchers, educators and students designing and understanding experiments in biology, chemistry and medicine, and by general users making informed decisions about their life and health.
PMCID: PMC2943739  PMID: 20877496
Protein Data Bank; crystallographic education; macromolecular structures; biological crystallography
13.  Signatures of protein-DNA recognition in free DNA binding sites 
Journal of molecular biology  2009;386(4):1054-1065.
One obstacle to achieving complete understanding of the principles underlying sequence-dependent recognition of DNA is the paucity of structural data for DNA recognition sequences in their free (unbound) state. Here we carried out crystallization screening of 50 DNA duplexes containing cognate protein binding sites and obtained new crystal structures of free DNA binding sites for three distinct modes of DNA recognition: anti-parallel beta strands (MetR), helix-turn-helix motif + hinge helices (PurR), and zinc fingers (Zif268). Structural changes between free and protein-bound DNA are manifested differently in each case. The new DNA structures reveal that distinctive sequence-dependent DNA geometry dominates recognition by MetR, protein-induced bending of DNA dictates recognition by PurR, and deformability of DNA along the A-B continuum is important in recognition by Zif268. Together, our findings show that crystal structures of free DNA binding sites provide new information about the nature of protein-DNA interactions and thus lend insights towards a structural code for DNA recognition.
PMCID: PMC2753591  PMID: 19244617
DNA structure; transcription factors; indirect readout; protein-DNA interactions; gene regulation
14.  Catabolite activator protein (CAP): DNA binding and transcription activation 
Recent structures of Escherichia coli catabolite activator protein (CAP) in complex with DNA, and in complex with RNA polymerase α subunit C-terminal domain (αCTD) and DNA, have yielded insights into how CAP binds DNA and activates transcription. Comparison of multiple structures of CAP-DNA complexes has revealed contributions of direct readout and indirect readout to DNA binding by CAP. The structure of the CAP-αCTD-DNA complex has provided the first structural description of interactions between a transcription activator and its functional target within the general transcription machinery. Using the structure of the CAP-αCTD-DNA complex, the structure of an RNAP-DNA complex, and restraints from biophysical, biochemical, and genetic experiments, it has been possible to construct detailed three-dimensional models of intact Class I and Class II transcription activation complexes.
PMCID: PMC2765107  PMID: 15102444
catabolite activator protein (CAP); cAMP receptor protein (CRP); RNA polymerase; σ70; promoter; DNA binding; DNA bending; transcription activation
15.  Outcome of a Workshop on Applications of Protein Models in Biomedical Research 
We describe the proceedings and conclusions from a “Workshop on Applications of Protein Models in Biomedical Research” that was held at University of California at San Francisco on 11 and 12 July, 2008. At the workshop, international scientists involved with structure modeling explored (i) how models are currently used in biomedical research, (ii) what the requirements and challenges for different applications are, and (iii) how the interaction between the computational and experimental research communities could be strengthened to advance the field.
PMCID: PMC2739730  PMID: 19217386
16.  The Protein Model Portal 
Structural Genomics has been successful in determining the structures of many unique proteins in a high throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionarily related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available online and from the PSI Structural Genomics Knowledgebase.
PMCID: PMC2704613  PMID: 19037750
Protein model portal; PSI structural genomics knowledgebase; Comparative protein structure modeling; Homology modeling; Model database
17.  The protein structure initiative structural genomics knowledgebase 
Nucleic Acids Research  2008;37(Database issue):D365-D368.
The Protein Structure Initiative Structural Genomics Knowledgebase (PSI SGKB) has been created to turn the products of the PSI structural genomics effort into knowledge that can be used by the biological research community to understand living systems and disease. This resource provides central access to structures in the Protein Data Bank (PDB), along with functional annotations, associated homology models, worldwide protein target tracking information, available protocols and the potential to obtain DNA materials for many of the targets. It also offers the ability to search all of the structural and methodological publications and the innovative technologies that were catalyzed by the PSI's high-throughput research efforts. In collaboration with the Nature Publishing Group, the PSI SGKB provides a research library, editorials about new research advances, news and an events calendar to present a broader view of structural biology and structural genomics. By making these resources freely available, the PSI SGKB serves as a bridge to connect the structural biology and the greater biomedical communities.
PMCID: PMC2686438  PMID: 19010965
18.  Representation of viruses in the remediated PDB archive 
A new data model for PDB entries of viruses and other biological assemblies with regular noncrystallographic symmetry is described.
A new scheme has been devised to represent viruses and other biological assemblies with regular noncrystallographic symmetry in the Protein Data Bank (PDB). The scheme describes existing and anticipated PDB entries of this type using generalized descriptions of deposited and experimental coordinate frames, symmetry and frame transformations. A simplified notation has been adopted to express the symmetry generation of assemblies from deposited coordinates and matrix operations describing the required point, helical or crystallographic symmetry. Complete correct information for building full assemblies, subassemblies and crystal asymmetric units of all virus entries is now available in the remediated PDB archive.
PMCID: PMC2677383  PMID: 18645236
virus structures; Protein Data Bank; database integration; uniform curation; point symmetry; helical symmetry; biological assemblies
19.  BioMagResBank (BMRB) as a partner in the Worldwide Protein Data Bank (wwPDB): new policies affecting biomolecular NMR depositions 
Journal of Biomolecular Nmr  2008;40(3):153-155.
We describe the role of the BioMagResBank (BMRB) within the Worldwide Protein Data Bank (wwPDB) and recent policies affecting the deposition of biomolecular NMR data. All PDB depositions of structures based on NMR data must now be accompanied by experimental restraints. A scheme has been devised that allows depositors to specify a representative structure and to define residues within that structure found experimentally to be largely unstructured. The BMRB now accepts coordinate sets representing three-dimensional structural models based on experimental NMR data of molecules of biological interest that fall outside the guidelines of the Protein Data Bank (i.e., the molecule is a peptide with 23 or fewer residues, a polynucleotide with 3 or fewer residues, a polysaccharide with 3 or fewer sugar residues, or a natural product), provided that the coordinates are accompanied by representation of the covalent structure of the molecule (atom connectivity), assigned NMR chemical shifts, and the structural restraints used in generating the model. The BMRB now contains an archive of NMR data for metabolites and other small molecules found in biological systems.
PMCID: PMC2268728  PMID: 18288446
Archived NMR data; Metabolomics; NMR structure; Structural restraints; Unstructured regions
Journal of molecular biology  2006;357(1):173-183.
The catabolite activator protein (CAP) bends DNA in the CAP-DNA complex, typically introducing a sharp DNA kink, with a roll angle of ∼40° and a twist angle of ∼20°, between positions 6 and 7 of the DNA half-site, 5′-A1A2A3T4G5T6G7A8T9C10T11-3′ (“primary kink”). In previous work, we showed that CAP recognizes the nucleotide immediately 5′ to the primary-kink site, T6, through an “indirect-readout” mechanism involving sequence effects on energetics of primary-kink formation. In this work, to understand further this example of indirect readout, we have determined crystal structures of CAP-DNA complexes containing each possible nucleotide at position 6. The structures show that CAP can introduce a DNA kink at the primary-kink site with any nucleotide at position 6. The DNA kink is sharp with the consensus pyrimidine-purine step T6G7 and the nonconsensus pyrimidine-purine step C6G7 (roll angles of ∼42°, twist angles of ∼16°), but is much less sharp with the nonconsensus purine-purine steps A6G7 and G6G7 (roll angles of ∼20°, twist angles of ∼17°). We infer that CAP discriminates between consensus and non-consensus pyrimidine-purine steps at positions 6-7 solely based on differences in the energetics of DNA deformation, but that CAP discriminates between the consensus pyrimidine-purine step and non-consensus purine-purine steps at positions 6-7 both based on differences in the energetics of DNA deformation and based on qualitative differences in DNA deformation. The structures further show that CAP can achieve a similar, ∼46° per DNA half-site, overall DNA bend through a sharp DNA kink, a less sharp DNA kink, or a smooth DNA bend. 
Analysis of these and other crystal structures of CAP-DNA complexes indicates that there is a large, ∼28° per DNA half-site, out-of-plane component of CAP-induced DNA bending in structures not constrained by end-to-end DNA lattice interactions and that lattice contacts involving CAP tend to involve residues in or near biologically functional surfaces.
PMCID: PMC1479893  PMID: 16427082
catabolite activator protein (CAP); cAMP receptor protein (CRP); protein-DNA interaction; protein-induced DNA bending; indirect readout
21.  The RCSB PDB information portal for structural genomics 
Nucleic Acids Research  2005;34(Database issue):D302-D305.
The RCSB Protein Data Bank (PDB) offers online tools, summary reports and target information related to the worldwide structural genomics initiatives from its portal. There are currently three components to this site: Structural Genomics Initiatives contains information and links on each structural genomics site, including progress reports, target lists, target status, targets in the PDB and level of sequence redundancy; Targets provides combined target information, protocols and other data associated with protein structure determination; and Structures offers an assessment of the progress of structural genomics based on the functional coverage of the human genome by PDB structures, structural genomics targets and homology models. Functional coverage can be examined according to enzyme classification, gene ontology (biological process, cell component and molecular function) and disease.
PMCID: PMC1347482  PMID: 16381872
22.  RNA conformational classes 
Nucleic Acids Research  2004;32(5):1666-1677.
RNA exhibits a large diversity of conformations. Three thousand nucleotides of 23S and 5S ribosomal RNA from a structure of the large ribosomal subunit were analyzed in order to classify their conformations. Fourier averaging of the six 3D distributions of torsion angles and analyses of the resulting pseudo electron maps, followed by clustering of the preferred combinations of torsion angles were performed on this dataset. Eighteen non-A-type conformations and 14 A-RNA related conformations were discovered and their torsion angles were determined; their Cartesian coordinates are available.
PMCID: PMC390331  PMID: 15016910
23.  The distribution and query systems of the RCSB Protein Data Bank 
Nucleic Acids Research  2004;32(Database issue):D223-D225.
The Protein Data Bank (PDB) is the primary source of information on the 3D structure of biological macromolecules. The PDB’s mandate is to disseminate this information in the most usable form and as widely as possible. The current query and distribution system is described and an alpha version of the future re-engineered system introduced.
PMCID: PMC308830  PMID: 14681399
24.  Using electrostatic potentials to predict DNA-binding sites on DNA-binding proteins 
Nucleic Acids Research  2003;31(24):7189-7198.
A method to detect DNA-binding sites on the surface of a protein structure is important for functional annotation. This work describes the analysis of residue patches on the surface of DNA-binding proteins and the development of a method of predicting DNA-binding sites using a single feature of these surface patches. Surface patches and the DNA-binding sites were initially analysed for accessibility, electrostatic potential, residue propensity, hydrophobicity and residue conservation. From this, it was observed that the DNA-binding sites were, in general, amongst the top 10% of patches with the largest positive electrostatic scores. This knowledge led to the development of a prediction method in which patches of surface residues were selected such that they excluded residues with negative electrostatic scores. This method was used to make predictions for a data set of 56 non-homologous DNA-binding proteins. Correct predictions were made for 68% of the data set.
PMCID: PMC291864  PMID: 14654694
25.  The Protein Data Bank and structural genomics 
Nucleic Acids Research  2003;31(1):489-491.
The Protein Data Bank (PDB) continues to be actively involved in various aspects of the informatics of structural genomics projects: developing and maintaining the Target Registration Database (TargetDB), organizing data dictionaries that will define the specification for the exchange and deposition of data with the structural genomics centers and creating software tools to capture data from standard structure determination applications.
PMCID: PMC165515  PMID: 12520059