Results 1-25 (39)

1.  Data to knowledge: how to get meaning from your result 
IUCrJ  2015;2(Pt 1):45-58.
This paper presents a variety of techniques and technologies aimed at the transformation of crystallographic data into information and knowledge.
Structural and functional studies require the development of sophisticated ‘Big Data’ technologies and software to increase the knowledge derived and ensure reproducibility of the data. This paper presents summaries of the Structural Biology Knowledge Base, the VIPERdb Virus Structure Database, evaluation of homology modeling by the Protein Model Portal, the ProSMART tool for conformation-independent structure comparison, the LabDB ‘super’ laboratory information management system and the Cambridge Structural Database. These techniques and technologies represent important tools for the transformation of crystallographic data into knowledge and information, in an effort to address the problem of non-reproducibility of experimental results.
PMCID: PMC4285880  PMID: 25610627
meaning from data; big data; databases; knowledge bases; data deposition
2.  The Future of the Protein Data Bank 
Biopolymers  2012;99(3):218-222.
The Worldwide Protein Data Bank (wwPDB) is the international collaboration that manages the deposition, processing and distribution of the PDB archive. The wwPDB’s mission is to maintain a single archive of macromolecular structural data that are freely and publicly available to the global community. Its members [RCSB PDB (USA), PDBe (Europe), PDBj (Japan), and BMRB (USA)] host data-deposition sites and mirror the PDB ftp archive. To support future developments in structural biology, the wwPDB partners are addressing organizational, scientific, and technical challenges.
PMCID: PMC3684242  PMID: 23023942
Protein Data Bank; structural biology; archive
3.  Recommendations of the wwPDB NMR Validation Task Force 
Structure (London, England : 1993)  2013;21(9):10.1016/j.str.2013.07.021.
As methods for analysis of biomolecular structure and dynamics using nuclear magnetic resonance spectroscopy (NMR) continue to advance, the resulting 3D structures, chemical shifts, and other NMR data are broadly impacting biology, chemistry, and medicine. Structure model assessment is a critical area of NMR methods development, and is an essential component of the process of making these structures accessible and useful to the wider scientific community. For these reasons, the Worldwide Protein Data Bank (wwPDB) has convened an NMR Validation Task Force (NMR-VTF) to work with the wwPDB partners in developing metrics and policies for biomolecular NMR data harvesting, structure representation, and structure quality assessment. This paper summarizes the recommendations of the NMR-VTF, and lays the groundwork for future work in developing standards and metrics for biomolecular NMR structure quality assessment.
PMCID: PMC3884077  PMID: 24010715
4.  RCSB PDB Mobile: iOS and Android mobile apps to provide data access and visualization to the RCSB Protein Data Bank 
Bioinformatics  2014;31(1):126-127.
Summary: The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) resource provides tools for query, analysis and visualization of the 3D structures in the PDB archive. As the mobile Web is starting to surpass desktop and laptop usage, scientists and educators are beginning to integrate mobile devices into their research and teaching. In response, we have developed the RCSB PDB Mobile app for the iOS and Android mobile platforms to enable fast and convenient access to RCSB PDB data and services. Using the app, users from the general public to expert researchers can quickly search and visualize biomolecules, and add personal annotations via the RCSB PDB’s integrated MyPDB service.
Availability and implementation: RCSB PDB Mobile is freely available from the Apple App Store and Google Play.
PMCID: PMC4271143  PMID: 25183487
5.  The Protein Data Bank archive as an open data resource 
The Protein Data Bank archive was established in 1971, and recently celebrated its 40th anniversary (Berman et al. in Structure 20:391, 2012). An analysis of interrelationships of the science, technology and community leads to further insights into how this resource evolved into one of the oldest and most widely used open-access data resources in biology.
PMCID: PMC4196035  PMID: 25062767
Protein Data Bank; Protein structure; Biomacromolecules; Data archive
6.  Trendspotting in the Protein Data Bank 
FEBS letters  2013;587(8):1036-1045.
The Protein Data Bank (PDB) was established in 1971 as a repository for the three dimensional structures of biological macromolecules. Since then, more than 85,000 biological macromolecule structures have been determined and made available in the PDB archive. Through analysis of the corpus of data, it is possible to identify trends that can be used to inform us about the future of structural biology and to plan the best ways to improve the management of the ever-growing amount of PDB data.
PMCID: PMC4068610  PMID: 23337870
7.  Chemical annotation of small and peptide-like molecules at the Protein Data Bank 
Over the past decade, the number of polymers and their complexes with small molecules in the Protein Data Bank archive (PDB) has continued to increase significantly. To support scientific advancements and ensure the best quality and completeness of the data files over the next 10 years and beyond, the Worldwide PDB partnership that manages the PDB archive is developing a new deposition and annotation system. This system focuses on efficient data capture across all supported experimental methods. The new deposition and annotation system is composed of four major modules that together support all of the processing requirements for a PDB entry. In this article, we describe one such module called the Chemical Component Annotation Tool. This tool uses information from both the Chemical Component Dictionary and Biologically Interesting molecule Reference Dictionary to aid in annotation. Benchmark studies have shown that the Chemical Component Annotation Tool provides significant improvements in processing efficiency and data quality.
PMCID: PMC3843158  PMID: 24291661
8.  The Nucleic Acid Database: new features and capabilities 
Nucleic Acids Research  2013;42(Database issue):D114-D122.
The Nucleic Acid Database (NDB) is a web portal providing access to information about 3D nucleic acid structures and their complexes. In addition to primary data, the NDB contains derived geometric data, classifications of structures and motifs, standards for describing nucleic acid features, as well as tools and software for the analysis of nucleic acids. A variety of search capabilities are available, as are many different types of reports. This article describes the recent redesign of the NDB Web site with special emphasis on new RNA-derived data and annotations and their implementation and integration into the search capabilities.
PMCID: PMC3964972  PMID: 24185695
9.  Catabolite activator protein (CAP): DNA binding and transcription activation 
Recent structures of Escherichia coli catabolite activator protein (CAP) in complex with DNA, and in complex with RNA polymerase α subunit C-terminal domain (αCTD) and DNA, have yielded insights into how CAP binds DNA and activates transcription. Comparison of multiple structures of CAP-DNA complexes has revealed contributions of direct readout and indirect readout to DNA binding by CAP. The structure of the CAP-αCTD-DNA complex has provided the first structural description of interactions between a transcription activator and its functional target within the general transcription machinery. Using the structure of the CAP-αCTD-DNA complex, the structure of an RNAP-DNA complex, and restraints from biophysical, biochemical, and genetic experiments, it has been possible to construct detailed three-dimensional models of intact Class I and Class II transcription activation complexes.
PMCID: PMC2765107  PMID: 15102444
catabolite activator protein (CAP); cAMP receptor protein (CRP); RNA polymerase; σ70; promoter; DNA binding; DNA bending; transcription activation
10.  The Protein Structure Initiative Structural Biology Knowledgebase Technology Portal: A Structural Biology Web Resource 
The Technology Portal of the Protein Structure Initiative Structural Biology Knowledgebase (PSI SBKB) is a web resource providing information about methods and tools that can be used to relieve bottlenecks in many areas of protein production and structural biology research. Several useful features are available on the web site, including multiple ways to search the database of over 250 technological advances, a link to videos of methods on YouTube, and access to a technology forum where scientists can connect, ask questions, get news, and develop collaborations. The Technology Portal is a component of the PSI SBKB, which presents integrated genomic, structural, and functional information for all protein sequence targets selected by the Protein Structure Initiative. Created in collaboration with the Nature Publishing Group, the SBKB offers an array of resources for structural biologists, such as a research library, editorials about new research advances, a featured biological system each month, and a Functional Sleuth for searching protein structures of unknown function. An overview of the various features and examples of user searches highlight the information, tools, and avenues for scientific interaction available through the Technology Portal.
PMCID: PMC3588887  PMID: 22527514
Database; Protein; Protein Production; Structural Biology; Structural Genomics; Technology
Electron cryo-microscopy (cryoEM) is a rapidly maturing methodology in structural biology, which now enables the determination of 3D structures of molecules, macromolecular complexes and cellular components at resolutions as high as 3.5 Å, bridging the gap between light microscopy and X-ray crystallography/NMR. In recent years structures of many complex molecular machines have been visualized using this method. Single particle reconstruction, the most widely used technique in cryoEM, has recently demonstrated the capability of producing structures at resolutions approaching those of X-ray crystallography, with over a dozen structures at better than 5 Å resolution published to date. This method represents a significant new source of experimental data for molecular modeling and simulation studies. CryoEM-derived maps and models are archived through joint deposition services to the EM Data Bank (EMDB) and Protein Data Bank (PDB), respectively. CryoEM maps are now being routinely produced over the 3-30 Å resolution range, and a number of computational groups are developing software for building coordinate models based on these data and developing validation techniques to better assess map and model accuracy. In this workshop we will present the results of the first cryoEM modeling challenge, in which computational groups were asked to apply their tools to a selected set of published cryoEM structures. We will also compare the results of the various applied methods, and discuss the current state of the art and how we can most productively move forward.
PMCID: PMC3617577  PMID: 21121065
12.  The RCSB Protein Data Bank: new resources for research and education 
Nucleic Acids Research  2012;41(Database issue):D475-D482.
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) develops tools and resources that provide a structural view of biology for research and education. The RCSB PDB web site uses the curated 3D macromolecular data contained in the PDB archive to offer unique methods to access, report and visualize data. Recent activities have focused on improving methods for simple and complex searches of PDB data, creating specialized access to chemical component data and providing domain-based structural alignments. New educational resources are offered at the PDB-101 educational view of the main web site such as Author Profiles that display a researcher’s PDB entries in a timeline. To promote different kinds of access to the RCSB PDB, Web Services have been expanded, and an RCSB PDB Mobile application for the iPhone/iPad has been released. These improvements enable new opportunities for analyzing and understanding structure data.
PMCID: PMC3531086  PMID: 23193259
13.  The Protein Data Bank at 40: Reflecting on the Past to Prepare for the Future 
A symposium celebrating the 40th anniversary of the Protein Data Bank archive (PDB), organized by the Worldwide Protein Data Bank, was held at Cold Spring Harbor Laboratory (CSHL) October 28–30, 2011. PDB40’s distinguished speakers highlighted four decades of innovation in structural biology, from the early era of structural determination to future directions for the field.
PMCID: PMC3501388  PMID: 22404998
14.  Outcome of the First Electron Microscopy Validation Task Force Meeting 
Structure (London, England : 1993)  2012;20(2):205-214.
This Meeting Review describes the proceedings and conclusions from the inaugural meeting of the Electron Microscopy Validation Task Force organized by the Unified Data Resource for 3DEM and held at Rutgers University in New Brunswick, NJ on September 28 and 29, 2010. At the workshop, a group of scientists involved in collecting electron microscopy data, using the data to determine three-dimensional electron microscopy (3DEM) density maps, and building molecular models into the maps explored how to assess maps, models, and other data that are deposited into the Electron Microscopy Data Bank and Protein Data Bank public data archives. The specific recommendations resulting from the workshop aim to increase the impact of 3DEM in biology and medicine.
PMCID: PMC3328769  PMID: 22325770
15.  E. coli trp repressor forms a domain-swapped array in aqueous alcohol 
Structure (London, England : 1993)  2004;12(6):1099-1108.
The E. coli trp repressor (trpR) homodimer recognizes its palindromic DNA-binding site through a pair of flexible helix-turn-helix (HTH) motifs displayed on an intertwined helical core. Flexible N-terminal arms mediate association between dimers bound to tandem DNA sites. The 2.5 Å X-ray structure of trpR crystallized in 30% (v/v) isopropanol reveals a substantial conformational rearrangement of HTH motifs and N-terminal arms, with the protein appearing in the unusual form of an ordered 3D domain-swapped supramolecular array. Small angle X-ray scattering measurements show that the self-association properties of trpR in solution are fundamentally altered by isopropanol.
PMCID: PMC3228604  PMID: 15274929
16.  The Structural Biology Knowledgebase: a portal to protein structures, sequences, functions, and methods 
The Protein Structure Initiative’s Structural Biology Knowledgebase (SBKB) is an open web resource designed to turn the products of the structural genomics and structural biology efforts into knowledge that can be used by the biological community to understand living systems and disease. Here we will present examples of how to use the SBKB to enable biological research. For example, a protein sequence or Protein Data Bank (PDB) structure ID search will provide a list of related protein structures in the PDB, associated biological descriptions (annotations), homology models, structural genomics protein target status, experimental protocols, and the ability to order available DNA clones from the PSI:Biology-Materials Repository. A text search will find publication and technology reports resulting from the PSI’s high-throughput research efforts. Web tools that aid in research, including a system that accepts protein structure requests from the community, will also be described. Created in collaboration with the Nature Publishing Group, the Structural Biology Knowledgebase monthly update also provides a research library, editorials about new research advances, news, and an events calendar to present a broader view of structural genomics and structural biology.
PMCID: PMC3123456  PMID: 21472436
Protein; Protein production; Structural biology; Structural databases; Structural genomics; Theoretical models
17.  Quality assurance for the query and distribution systems of the RCSB Protein Data Bank 
The RCSB Protein Data Bank (RCSB PDB) is a key online resource for structural biology and related scientific disciplines. The website is used on average by 165 000 unique visitors per month, and more than 2000 other websites link to it. The amount and complexity of PDB data as well as the expectations on its usage are growing rapidly. Therefore, ensuring the reliability and robustness of the RCSB PDB query and distribution systems is crucially important and increasingly challenging. This article describes quality assurance for the RCSB PDB website at several distinct levels, including: (i) hardware redundancy and failover, (ii) testing protocols for weekly database updates, (iii) testing and release procedures for major software updates and (iv) miscellaneous monitoring and troubleshooting tools and practices. As such it provides suggestions for how other websites might be operated.
PMCID: PMC3056270  PMID: 21382834
18.  The RCSB Protein Data Bank: redesigned web site and web services 
Nucleic Acids Research  2010;39(Database issue):D392-D401.
The RCSB Protein Data Bank (RCSB PDB) web site has been redesigned to increase usability and to cater to a larger and more diverse user base. This article describes key enhancements and new features that fall into the following categories: (i) query and analysis tools for chemical structure searching, query refinement, tabulation and export of query results; (ii) web site customization and new structure alerts; (iii) pair-wise and representative protein structure alignments; (iv) visualization of large assemblies; (v) integration of structural data with the open access literature and binding affinity data; and (vi) web services and web widgets to facilitate integration of PDB data and tools with other resources. These improvements enable a range of new possibilities to analyze and understand structure data. The next generation of the RCSB PDB web site, as described here, provides a rich resource for research and education.
PMCID: PMC3013649  PMID: 21036868
19. unified data resource for CryoEM 
Nucleic Acids Research  2010;39(Database issue):D456-D464.
Cryo-electron microscopy reconstruction methods are uniquely able to reveal structures of many important macromolecules and macromolecular complexes. The unified data resource for cryoEM, a joint effort of the Protein Data Bank in Europe (PDBe), the Research Collaboratory for Structural Bioinformatics (RCSB) and the National Center for Macromolecular Imaging (NCMI), is a global ‘one-stop shop’ resource for deposition and retrieval of cryoEM maps, models and associated metadata. The resource unifies public access to the two major archives containing EM-based structural data: EM Data Bank (EMDB) and Protein Data Bank (PDB), and facilitates use of EM structural data of macromolecules and macromolecular complexes by the wider scientific community.
PMCID: PMC3013769  PMID: 20935055
20.  Promoting a structural view of biology for varied audiences: an overview of RCSB PDB resources and experiences 
Journal of Applied Crystallography  2010;43(Pt 5):1224-1229.
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) serves a community of users with diverse backgrounds and interests. In addition to processing, archiving and distributing structural data, it also develops educational resources and materials to enable people to utilize PDB data and to further a structural view of biology.
The Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB PDB) supports scientific research and education worldwide by providing an essential resource of information on biomolecular structures. In addition to serving as a deposition, data-processing and distribution center for PDB data, the RCSB PDB offers resources and online materials that different audiences can use to customize their structural biology instruction. These include resources for general audiences that present macromolecular structure in the context of a biological theme, method-based materials for researchers who take a more traditional approach to the presentation of structural science, and materials that mix theme-based and method-based approaches for educators and students. Through these efforts the RCSB PDB aims to enable optimal use of structural data by researchers, educators and students designing and understanding experiments in biology, chemistry and medicine, and by general users making informed decisions about their life and health.
PMCID: PMC2943739  PMID: 20877496
Protein Data Bank; crystallographic education; macromolecular structures; biological crystallography
21.  Signatures of protein-DNA recognition in free DNA binding sites 
Journal of molecular biology  2009;386(4):1054-1065.
One obstacle to achieving complete understanding of the principles underlying sequence-dependent recognition of DNA is the paucity of structural data for DNA recognition sequences in their free (unbound) state. Here we carried out crystallization screening of 50 DNA duplexes containing cognate protein binding sites and obtained new crystal structures of free DNA binding sites for three distinct modes of DNA recognition: anti-parallel beta strands (MetR), helix-turn-helix motif + hinge helices (PurR), and zinc fingers (Zif268). Structural changes between free and protein-bound DNA are manifested differently in each case. The new DNA structures reveal that distinctive sequence-dependent DNA geometry dominates recognition by MetR, protein-induced bending of DNA dictates recognition by PurR, and deformability of DNA along the A-B continuum is important in recognition by Zif268. Together, our findings show that crystal structures of free DNA binding sites provide new information about the nature of protein-DNA interactions and thus lend insights towards a structural code for DNA recognition.
PMCID: PMC2753591  PMID: 19244617
DNA structure; transcription factors; indirect readout; protein-DNA interactions; gene regulation
22.  Outcome of a Workshop on Applications of Protein Models in Biomedical Research 
We describe the proceedings and conclusions from a “Workshop on Applications of Protein Models in Biomedical Research” that was held at University of California at San Francisco on 11 and 12 July, 2008. At the workshop, international scientists involved with structure modeling explored (i) how models are currently used in biomedical research, (ii) what the requirements and challenges for different applications are, and (iii) how the interaction between the computational and experimental research communities could be strengthened to advance the field.
PMCID: PMC2739730  PMID: 19217386
23.  The Protein Model Portal 
Structural Genomics has been successful in determining the structures of many unique proteins in a high throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionarily related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available online and from the PSI Structural Genomics Knowledgebase.
PMCID: PMC2704613  PMID: 19037750
Protein model portal; PSI structural genomics knowledgebase; Comparative protein structure modeling; Homology modeling; Model database
24.  The protein structure initiative structural genomics knowledgebase 
Nucleic Acids Research  2008;37(Database issue):D365-D368.
The Protein Structure Initiative Structural Genomics Knowledgebase (PSI SGKB) has been created to turn the products of the PSI structural genomics effort into knowledge that can be used by the biological research community to understand living systems and disease. This resource provides central access to structures in the Protein Data Bank (PDB), along with functional annotations, associated homology models, worldwide protein target tracking information, available protocols and the potential to obtain DNA materials for many of the targets. It also offers the ability to search all of the structural and methodological publications and the innovative technologies that were catalyzed by the PSI's high-throughput research efforts. In collaboration with the Nature Publishing Group, the PSI SGKB provides a research library, editorials about new research advances, news and an events calendar to present a broader view of structural biology and structural genomics. By making these resources freely available, the PSI SGKB serves as a bridge to connect the structural biology and the greater biomedical communities.
PMCID: PMC2686438  PMID: 19010965
25.  Representation of viruses in the remediated PDB archive 
A new data model for PDB entries of viruses and other biological assemblies with regular noncrystallographic symmetry is described.
A new scheme has been devised to represent viruses and other biological assemblies with regular noncrystallographic symmetry in the Protein Data Bank (PDB). The scheme describes existing and anticipated PDB entries of this type using generalized descriptions of deposited and experimental coordinate frames, symmetry and frame transformations. A simplified notation has been adopted to express the symmetry generation of assemblies from deposited coordinates and matrix operations describing the required point, helical or crystallographic symmetry. Complete correct information for building full assemblies, subassemblies and crystal asymmetric units of all virus entries is now available in the remediated PDB archive.
PMCID: PMC2677383  PMID: 18645236
virus structures; Protein Data Bank; database integration; uniform curation; point symmetry; helical symmetry; biological assemblies
