Thoughts about the decisions made in designing macromolecular X-ray crystallography experiments at synchrotron beamlines are presented.
The measurement of X-ray diffraction data from macromolecular crystals for the purpose of structure determination is the convergence of two processes: the preparation of diffraction-quality crystal samples on the one hand and the construction and optimization of an X-ray beamline and end station on the other. Like sample preparation, a macromolecular crystallography beamline is geared to obtaining the best possible diffraction measurements from crystals provided by the synchrotron user. This paper describes the thoughts behind an experiment that fully exploits both the sample and the beamline and how these map into everyday decisions that users can and should make when visiting a beamline with their most precious crystals.
macromolecular crystallography; microcrystallography; X-ray beamlines; synchrotron radiation
A system for the automatic reduction of single- and multi-position macromolecular crystallography data is presented.
The development of automated high-intensity macromolecular crystallography (MX) beamlines at synchrotron facilities has resulted in a remarkable increase in sample throughput. Developments in X-ray detector technology now mean that complete X-ray diffraction datasets can be collected in less than one minute. Such high-speed collection, and the volumes of data that it produces, often make it difficult for even the most experienced users to cope with the deluge. However, the careful reduction of data during experimental sessions is often necessary for the success of a particular project or as an aid in decision making for subsequent experiments. Automated data reduction pipelines provide a fast and reliable alternative to user-initiated processing at the beamline. In order to provide such a pipeline for the MX user community of the European Synchrotron Radiation Facility (ESRF), a system for the rapid automatic processing of MX diffraction data from single and multiple positions on a single or multiple crystals has been developed. Standard integration and data analysis programs have been incorporated into the ESRF data collection, storage and computing environment, with the final results stored and displayed in an intuitive manner in the ISPyB (information system for protein crystallography beamlines) database, from which they are also available for download. In some cases, experimental phase information can be automatically determined from the processed data. Here, the system is described in detail.
automation; data processing; macromolecular crystallography; computer programs
MxCuBE is a beamline control environment optimized for the needs of macromolecular crystallography. This paper describes the design of the software and the features that MxCuBE currently provides.
The design and features of a beamline control software system for macromolecular crystallography (MX) experiments developed at the European Synchrotron Radiation Facility (ESRF) are described. This system, MxCuBE, allows users to interact simply with beamline hardware components and provides automated routines for common tasks in the operation of a synchrotron beamline dedicated to experiments in MX. Additional functionality is provided through intuitive interfaces that enable the assessment of the diffraction characteristics of samples, experiment planning, automatic data collection and the on-line collection and analysis of X-ray emission spectra. The software can be run in a tandem client-server mode that allows for remote control, and relevant experimental parameters and results are automatically logged in a relational database, ISPyB. MxCuBE is modular, flexible and extensible and is currently deployed on eight macromolecular crystallography beamlines at the ESRF. Additionally, the software is installed at MAX-lab beamline I911-3 and at BESSY beamline BL14.1.
automation; macromolecular crystallography; synchrotron beamline control; graphical user interface
A mail-in data collection system at SPring-8, comprising a web application and automated beamline operation, has been developed.
A mail-in data collection system makes it possible for beamline users to collect diffraction data without visiting a synchrotron facility. In the mail-in system at SPring-8, users first pack crystals into sample trays and send the trays to SPring-8 via a courier service. The user then specifies measurement conditions and checks the diffraction images via the Internet. The user can also collect diffraction data using an automated sample-changer robot and beamline control software. For distant users there is a newly developed data management system, D-Cha. D-Cha provides a graphical user interface that enables the user to specify the experimental conditions for samples and to check and download the diffraction images using a web browser. This system is now in routine operation and is contributing to high-throughput beamline operation.
mail-in data collection; high-throughput data collection; beamline automation; web application; database system
The Computational Crystallography Toolbox (cctbx) is a flexible software platform that has been used to develop high-throughput crystal-screening tools for both synchrotron sources and X-ray free-electron lasers. Plans for data-processing and visualization applications are discussed, and the benefits and limitations of using graphics-processing units are evaluated.
Current pixel-array detectors produce diffraction images at extreme data rates (of up to 2 TB h⁻¹) that make severe demands on computational resources. New multiprocessing frameworks are required to achieve rapid data analysis, as it is important to be able to inspect the data quickly in order to guide the experiment in real time. By utilizing readily available web-serving tools that interact with the Python scripting language, it was possible to implement a high-throughput Bragg-spot analyzer (cctbx.spotfinder) that is presently in use at numerous synchrotron-radiation beamlines. Similarly, Python interoperability enabled the production of a new data-reduction package (cctbx.xfel) for serial femtosecond crystallography experiments at the Linac Coherent Light Source (LCLS). Future data-reduction efforts will need to focus on specialized problems such as the treatment of diffraction spots on interleaved lattices arising from multi-crystal specimens. In these challenging cases, accurate modeling of close-lying Bragg spots could benefit from the high-performance computing capabilities of graphics-processing units.
data processing; reusable code; multiprocessing; cctbx
The macromolecular crystallography experiment lends itself perfectly to high-throughput technologies. The initial steps including the expression, purification and crystallization of protein crystals, along with some of the later steps involving data processing and structure determination have all been automated to the point where some of the last remaining bottlenecks in the process have been crystal mounting, crystal screening and data collection. At the Stanford Synchrotron Radiation Laboratory (SSRL), a National User Facility which provides extremely brilliant X-ray photon beams for use in materials science, environmental science and structural biology research, the incorporation of advanced robotics has enabled crystals to be screened in a true high-throughput fashion, thus dramatically accelerating the final steps. Up to 288 frozen crystals can be mounted by the beamline robot (the Stanford Automated Mounter, or SAM) and screened for diffraction quality in a matter of hours without intervention. The best quality crystals can then be remounted for the collection of complete X-ray diffraction data sets. Furthermore, the entire screening and data collection experiment can be controlled from the experimenter’s home laboratory by means of advanced software tools that enable network-based control of the highly automated beamlines.
protein crystallography; cryocrystallography; high-throughput screening; robotics; remote access
Hardware and software solutions for MX data-collection strategies using the EMBL/ESRF miniaturized multi-axis goniometer head are presented.
Most macromolecular crystallography (MX) diffraction experiments at synchrotrons use a single-axis goniometer. This markedly contrasts with small-molecule crystallography, in which the majority of the diffraction data are collected using multi-axis goniometers. A novel miniaturized κ-goniometer head, the MK3, has been developed to allow macromolecular crystals to be aligned. It is available on the majority of the structural biology beamlines at the ESRF, as well as elsewhere. In addition, the Strategy for the Alignment of Crystals (STAC) software package has been developed to facilitate the use of the MK3 and other similar devices. Use of the MK3 and STAC is streamlined by their incorporation into online analysis tools such as EDNA. The current use of STAC and MK3 on the MX beamlines at the ESRF is discussed. It is shown that the alignment of macromolecular crystals can result in improved diffraction data quality compared with data obtained from randomly aligned crystals.
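The geometric core of such an alignment, rotating an axis identified by indexing onto the spindle axis, can be sketched as follows. This is an illustrative fragment, not the STAC code; translating the rotation into kappa and phi motor angles is a further step that is omitted here, and the crystal axis is invented.

```python
import numpy as np

def alignment_rotation(v, target):
    """Rotation matrix that brings unit vector v onto unit vector target
    (vector-to-vector form of Rodrigues' rotation formula)."""
    v = v / np.linalg.norm(v)
    t = target / np.linalg.norm(target)
    w = np.cross(v, t)                      # rotation axis (unnormalized)
    s2 = w @ w                              # sin^2 of the rotation angle
    c = v @ t                               # cos of the rotation angle
    if s2 < 1e-24:                          # already aligned (anti-parallel case omitted)
        return np.eye(3)
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]])
    return np.eye(3) + K + K @ K * ((1.0 - c) / s2)

# Hypothetical crystal axis from indexing, to be brought onto the spindle (z axis)
crystal_axis = np.array([0.3, 0.4, 0.866])
R = alignment_rotation(crystal_axis, np.array([0.0, 0.0, 1.0]))
print(np.round(R @ (crystal_axis / np.linalg.norm(crystal_axis)), 6))
```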
kappa goniometer; crystal alignment; data-collection strategies
A powerful and easy-to-use workflow environment has been developed at the ESRF for combining experiment control with online data analysis on synchrotron beamlines. This tool makes it possible to automate complex experiments without expertise in instrumentation control or programming; instead, users access defined beamline services.
The automation of beam delivery, sample handling and data analysis, together with increasing photon flux, diminishing focal spot size and the appearance of fast-readout detectors on synchrotron beamlines, have changed the way that many macromolecular crystallography experiments are planned and executed. Screening for the best diffracting crystal, or even the best diffracting part of a selected crystal, has been enabled by the development of microfocus beams, precise goniometers and fast-readout detectors that all require rapid feedback from the initial processing of images in order to be effective. All of these advances require the coupling of data feedback to the experimental control system and depend on immediate online data-analysis results during the experiment. To facilitate this, a Data Analysis WorkBench (DAWB) for the flexible creation of complex automated protocols has been developed. Here, example workflows designed and implemented using DAWB are presented for enhanced multi-step crystal characterizations, experiments involving crystal reorientation with kappa goniometers, crystal-burning experiments for empirically determining the radiation sensitivity of a crystal system and the application of mesh scans to find the best location of a crystal to obtain the highest diffraction quality. Beamline users interact with the prepared workflows through a specific brick within the beamline-control GUI MxCuBE.
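The decision logic of a mesh scan is simple to sketch: collect a quick diffraction snapshot at each grid position, score it (for example by spot count) and move to the best-scoring position for full data collection. The grid size and scores below are synthetic stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical mesh-scan scores: spot counts from a snapshot at each grid position
grid = rng.poisson(5, size=(10, 12)).astype(float)
grid[6, 4] = 80.0                     # a well diffracting region of the crystal
grid[6, 5] = 72.0

# Pick the best-scoring position for full data collection
best = np.unravel_index(np.argmax(grid), grid.shape)
print(f"best position (row, col): {tuple(map(int, best))}, score: {grid[best]:.0f}")
```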
workflows; automation; data processing; macromolecular crystallography; experimental protocols; characterization; reorientation; radiation damage
The Statistical Analysis System (SAS) is the most comprehensive statistical analysis software package in the world. It offers data analysis for almost all experiments under various statistical models. Each analysis is performed using a particular subroutine, called a procedure (PROC). For example, PROC ANOVA performs analysis of variances. PROC QTL is a user-defined SAS procedure for mapping quantitative trait loci (QTL). It allows users to perform QTL mapping for continuous and discrete traits within the SAS platform. Users of PROC QTL are able to take advantage of all existing features offered by the general SAS software, for example, data management and graphical treatment. The current version of PROC QTL can perform QTL mapping for all line crossing experiments using maximum likelihood (ML), least square (LS), iteratively reweighted least square (IRLS), Fisher scoring (FISHER), Bayesian (BAYES), and empirical Bayes (EBAYES) methods.
Repeated measurement of the same diffraction image makes it possible to judge the performance of a data-collection facility.
The accuracy of X-ray diffraction data depends on the properties of the crystalline sample and on the performance of the data-collection facility (synchrotron beamline elements, goniostat, detector etc.). However, it is difficult to evaluate the level of performance of the experimental setup from the quality of data sets collected in rotation mode, as various crystal properties such as mosaicity, non-uniformity and radiation damage affect the measured intensities. A multiple-image experiment, in which several analogous diffraction frames are recorded consecutively at the same crystal orientation, allows minimization of the influence of the sample properties. A series of 100 diffraction images of a thaumatin crystal were measured on the SBC beamline 19BM at the APS (Argonne National Laboratory). The obtained data were analyzed in the context of the performance of the data-collection facility. An objective way to estimate the uncertainties of individual reflections was achieved by analyzing the behavior of reflection intensities in the series of analogous diffraction images. The multiple-image experiment is found to be a simple and adequate method to separate the random errors from the systematic errors in the data, which helps in judging the performance of a data-collection facility. In particular, displaying the intensity as a function of the frame number allows evaluation of the stability of the beam, the beamline elements and the detector with minimal influence of the crystal properties. Such an experiment permits evaluation of the highest possible data quality potentially achievable at the particular beamline.
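The analysis described above can be illustrated with synthetic numbers: repeated frames of the same reflections separate the per-reflection random scatter from a systematic trend such as beam decay. The intensities, drift model and noise model below are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_refl = 100, 500
true_I = rng.gamma(2.0, 200.0, size=n_refl)      # hypothetical true intensities
drift = 1.0 - 0.001 * np.arange(n_frames)        # slow systematic beam decay
I = drift[:, None] * true_I[None, :] + rng.normal(0.0, np.sqrt(true_I), (n_frames, n_refl))

# Random error: per-reflection scatter across the analogous frames
sigma_random = I.std(axis=0, ddof=1)

# Systematic behaviour: mean intensity as a function of frame number
frame_mean = I.mean(axis=1) / I.mean()
print(f"relative intensity, first vs last frame: {frame_mean[0]:.3f} vs {frame_mean[-1]:.3f}")
```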
diffraction data precision; signal-to-noise ratio; measurement uncertainty; beamline performance
The DICHROWEB web server enables on-line analyses of circular dichroism (CD) spectroscopic data, providing calculated secondary structure content and graphical analyses comparing calculated structures and experimental data. The server is located at http://www.cryst.bbk.ac.uk/cdweb and may be accessed via a password-limited user ID, available upon completion of a registration form. The server facilitates analyses using five popular algorithms and (currently) seven different reference databases by accepting data in a user-friendly manner in a wide range of formats, including those output by both commercial CD instruments and synchrotron radiation-based circular dichroism beamlines, as well as those produced by spectral processing software packages. It produces as output calculated secondary structures, a goodness-of-fit parameter for the analyses, and tabular and graphical displays of experimental, calculated and difference spectra. The web pages associated with the server provide information on CD spectroscopic methods and terms, literature references and aids for interpreting the analysis results.
A fully automated high-throughput solution X-ray scattering data collection system, developed for protein structure studies at beamline 4-2 of the Stanford Synchrotron Radiation Lightsource, is described.
A fully automated high-throughput solution X-ray scattering data collection system has been developed for protein structure studies at beamline 4-2 of the Stanford Synchrotron Radiation Lightsource. It is composed of a thin-wall quartz capillary cell, a syringe needle assembly on an XYZ positioning arm for sample delivery, a water-cooled sample rack and a computer-controlled fluid dispenser. It is controlled by a specifically developed software component built into the standard beamline control program Blu-Ice/DCS. The integrated system is intuitive and very simple to use, and enables experimenters to customize data collection strategy in a timely fashion in concert with an automated data processing program. The system also allows spectrophotometric determination of protein concentration for each sample aliquot in the beam via an in situ UV absorption spectrometer. A single set of solution scattering measurements requires a 20–30 µl sample aliquot and typically takes 3.5 min, including an extensive capillary cleaning cycle. Over 98.5% of measurements are valid and free from artefacts commonly caused by air-bubble contamination. The sample changer, which is compact and light, facilitates effortless switching with other sample-handling devices required for other types of non-crystalline X-ray scattering experiments.
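The spectrophotometric concentration determination mentioned above reduces to the Beer-Lambert law. The extinction coefficient, molecular weight, path length and absorbance below are hypothetical values chosen for illustration.

```python
def protein_conc_mg_ml(a280, path_cm, ext_coeff, mw):
    """Beer-Lambert: c = A / (epsilon * l) in molar units, converted to mg/ml via MW."""
    return a280 / (ext_coeff * path_cm) * mw

# Hypothetical sample: epsilon = 43824 M^-1 cm^-1, MW = 66430 Da, 0.15 cm capillary path
c = protein_conc_mg_ml(0.52, 0.15, 43824.0, 66430.0)
print(f"concentration: {c:.2f} mg/ml")
```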
proteomics; structural genomics; X-ray scattering; laboratory automation; SAXS
The Blu-Ice GUI and Distributed Control System (DCS) developed in the Macromolecular Crystallography Group at the Stanford Synchrotron Radiation Laboratory has been optimized, extended and enhanced to suit the specific needs of the SAXS endstation at the SIBYLS beamline at the Advanced Light Source. The customizations reported here provide one potential route for other SAXS beamlines in need of robust and efficient beamline control software.
Biological small-angle X-ray scattering (SAXS) provides powerful complementary data for macromolecular crystallography (MX) by defining shape, conformation and assembly in solution. Although SAXS is in principle the highest throughput technique for structural biology, data collection is limited in practice by current data collection software. Here the adaptation of beamline control software, historically developed for MX beamlines, for the efficient operation and high-throughput data collection at synchrotron SAXS beamlines is reported. The Blu-Ice GUI and Distributed Control System (DCS) developed in the Macromolecular Crystallography Group at the Stanford Synchrotron Radiation Laboratory has been optimized, extended and enhanced to suit the specific needs of the biological SAXS endstation at the SIBYLS beamline at the Advanced Light Source. The customizations reported here provide a potential route for other SAXS beamlines in need of robust and efficient beamline control software. As a great deal of effort and optimization has gone into crystallographic software, the adaptation and extension of crystallographic software may prove to be a general strategy to provide advanced SAXS software for the synchrotron community. In this way effort can be put into optimizing features for SAXS rather than reproducing those that have already been successfully implemented for the crystallographic community.
SAXS; software; beamline; control system; Blu-Ice; DCS; SIBYLS; GUI
Gene expression microarrays are a prominent experimental tool in functional genomics that has opened the way to a global, systems-level understanding of transcriptional networks. Experiments that apply this technology typically generate overwhelming volumes of data, unprecedented in biological research. The task of mining meaningful biological knowledge from the raw data is therefore a major challenge in bioinformatics. Of special need are integrative packages that provide biologist users with an advanced yet easy-to-use set of algorithms that together cover the whole range of steps in microarray data analysis.
Here we present the EXPANDER 2.0 (EXPression ANalyzer and DisplayER) software package. EXPANDER 2.0 is an integrative package for the analysis of gene expression data, designed as a 'one-stop shop' tool that implements various data analysis algorithms ranging from the initial steps of normalization and filtering, through clustering and biclustering, to high-level functional enrichment analysis that points to biological processes that are active in the examined conditions, and to promoter cis-regulatory elements analysis that elucidates transcription factors that control the observed transcriptional response. EXPANDER is available with pre-compiled functional Gene Ontology (GO) and promoter sequence-derived data files for yeast, worm, fly, rat, mouse and human, supporting high-level analysis applied to data obtained from these six organisms.
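The functional enrichment step described above is commonly based on an upper-tail hypergeometric test; a minimal Python sketch follows. EXPANDER itself is a stand-alone package, and the gene counts here are invented for illustration.

```python
from math import comb

def enrichment_p(k, n, K, N):
    """P(X >= k): upper-tail hypergeometric test. A cluster of n genes contains k
    genes annotated to a functional category covering K of the N genes overall."""
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1)) / comb(N, n)

# Hypothetical case: 15 of 50 clustered genes fall in a category of 200 out of 6000 genes
p = enrichment_p(15, 50, 200, 6000)
print(f"enrichment p-value: {p:.2e}")
```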
EXPANDER's integrated capabilities and its built-in support of multiple organisms make it a very powerful tool for the analysis of microarray data. The package is freely available for academic users at
The collection of absorption and Raman spectroscopic data correlated with X-ray diffraction data allows investigators to understand the atomic structure as well as the electronic and vibrational characteristics of their samples, to identify transiently formed intermediates and to explore mechanistic questions. Raman spectroscopy instrumentation at beamline X26-C at the NSLS is currently available to the general user population.
Three-dimensional structures derived from X-ray diffraction of protein crystals provide a wealth of information. Features and interactions important for the function of macromolecules can be deduced and catalytic mechanisms postulated. Still, many questions can remain, for example regarding metal oxidation states and the interpretation of ‘mystery density’, i.e. ambiguous or unknown features within the electron density maps, especially at ∼2 Å resolutions typical of most macromolecular structures. Beamline X26-C at the National Synchrotron Light Source (NSLS), Brookhaven National Laboratory (BNL), provides researchers with the opportunity not only to determine the atomic structure of their samples but also to explore the electronic and vibrational characteristics of the sample before, during and after X-ray diffraction data collection. When samples are maintained under cryo-conditions, an opportunity to promote and follow photochemical reactions in situ as a function of X-ray exposure is also provided. Plans are in place to further expand the capabilities at beamline X26-C and to develop beamlines at NSLS-II, currently under construction at BNL, which will provide users access to a wide array of complementary spectroscopic methods in addition to high-quality X-ray diffraction data.
Raman; single-crystal spectroscopy; X-ray diffraction
Two software plugins are presented for the Mac OS X operating system that allow rapid and convenient visualization of Protein Data Bank files and X-ray diffraction images directly within the file browser, without the need for full-featured applications.
In structural biology, the management of large numbers of Protein Data Bank (PDB) files and raw X-ray diffraction images often presents a major organizational problem. Existing software packages that manipulate these file types were not designed for such file-management tasks, which are typically encountered when browsing through a folder of hundreds of X-ray images with the aim of rapidly inspecting the diffraction quality of a data set. To solve this problem, a useful functionality of the Macintosh operating system (OS X) has been exploited that allows custom visualization plugins to be attached to certain file types. Software plugins have been developed for diffraction images and PDB files, which in many scenarios can save considerable time and effort. The direct visualization of diffraction images and PDB structures in the file browser can be used to identify key files of interest simply by scrolling through a list of files.
PDB visualization; X-ray diffraction images; data management
The ultimate goal of synchrotron data collection is to obtain the best possible data from the best available crystals, and the combination of automation and remote access at Stanford Synchrotron Radiation Lightsource (SSRL) has revolutionized the way in which scientists achieve this goal. This has also seen a change in the way novice crystallographers are trained in the use of the beamlines, and a wide range of remote tools and hands-on workshops are now offered by SSRL to facilitate the education of the next generation of protein crystallographers.
For the past five years, the Structural Molecular Biology group at the Stanford Synchrotron Radiation Lightsource (SSRL) has provided general users of the facility with fully remote access to the macromolecular crystallography beamlines. This was made possible by implementing fully automated beamlines with a flexible control system and an intuitive user interface, and by the development of the robust and efficient Stanford automated mounting robotic sample-changing system. The ability to control a synchrotron beamline remotely from the comfort of the home laboratory has set a new paradigm for the collection of high-quality X-ray diffraction data and has fostered new collaborative research, whereby a number of remote users from different institutions can be connected at the same time to the SSRL beamlines. The use of remote access has revolutionized the way in which scientists interact with synchrotron beamlines and collect diffraction data, and has also triggered a shift in the way crystallography students are introduced to synchrotron data collection and trained in the best methods for collecting high-quality data. SSRL provides expert crystallographic and engineering staff, state-of-the-art crystallography beamlines, and a number of accessible tools to facilitate data collection and in-house remote training, and encourages the use of these facilities for education, training, outreach and collaborative research.
protein crystallography; high-throughput screening; robotics; remote access; crystallographic education and training; outreach
This paper describes FieldTrip, an open source software package that we developed for the analysis of MEG, EEG, and other electrophysiological data. The software is implemented as a MATLAB toolbox and includes a complete set of consistent and user-friendly high-level functions that allow experimental neuroscientists to analyze experimental data. It includes algorithms for simple and advanced analysis, such as time-frequency analysis using multitapers, source reconstruction using dipoles, distributed sources and beamformers, connectivity analysis, and nonparametric statistical permutation tests at the channel and source level. The implementation as a toolbox allows the user to perform elaborate and structured analyses of large data sets using the MATLAB command line and batch scripting. Furthermore, users and developers can easily extend the functionality and implement new algorithms. The modular design facilitates the reuse in other software packages.
The BL-17A macromolecular crystallography beamline at the Photon Factory was updated to improve the accuracy of diffraction experiments conducted using tiny crystals.
BL-17A is a macromolecular crystallography beamline dedicated to diffraction experiments conducted using micro-crystals and structure determination studies using a lower energy X-ray beam. In these experiments, highly accurate diffraction intensity measurements are critically important. Since this beamline was constructed, the beamline apparatus has been improved in several ways to enable the collection of accurate diffraction data. The stability of the beam intensities at the sample position was recently improved by modifying the monochromator. The diffractometer has also been improved. A new detector table was installed to prevent distortions in the diffractometer’s base during repositioning of the detector. A new pinhole system and an on-axis viewing system were installed to improve the X-ray beam profile at the sample position and the centering of tiny crystal samples.
macromolecular crystallography; beamline
An X-ray mini-beam of 8 × 6 µm cross-section was used to collect diffraction data from protein microcrystals with volumes as small as 150–300 µm³. The benefits of the mini-beam for experiments with small crystals and with large inhomogeneous crystals are investigated.
A simple apparatus for achieving beam sizes in the range 5–10 µm on a synchrotron beamline was implemented in combination with a larger 125 × 25 µm focus. The resulting beam had sufficient flux for crystallographic data collection from samples smaller than 10 × 10 × 10 µm. Sample data were collected representing three different scenarios: (i) a complete 2.0 Å data set from a single strongly diffracting microcrystal, (ii) a complete and redundant 1.94 Å data set obtained by merging data from six microcrystals and (iii) a complete 2.24 Å data set from a needle-shaped crystal with less than 12 × 10 µm cross-section and average diffracting power. The resulting data were of high quality, leading to well refined structures with good electron-density maps. The signal-to-noise ratios for data collected from small crystals with the mini-beam were significantly higher than for equivalent data collected from the same crystal with a 125 × 25 µm beam. Relative to this large beam, use of the mini-beam also resulted in lower refined crystal mosaicities. The mini-beam proved to be advantageous for inhomogeneous large crystals, where better ordered regions could be selected by the smaller beam.
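The signal-to-noise benefit of matching the beam to a small crystal follows from Poisson counting statistics: the diffraction signal is fixed by the illuminated crystal volume, while the background scales roughly with the illuminated area of surrounding material. The count values below are invented; only the beam cross-sections come from the text.

```python
import math

def signal_to_noise(signal, background):
    """Poisson counting statistics: S/N = S / sqrt(S + B)."""
    return signal / math.sqrt(signal + background)

signal = 1000.0                                  # hypothetical Bragg signal (counts)
area_mini, area_large = 8 * 6, 125 * 25          # beam cross-sections in um^2
bg_large = 20000.0                               # hypothetical background, large beam
bg_mini = bg_large * area_mini / area_large      # background scaled by beam area

snr_large = signal_to_noise(signal, bg_large)
snr_mini = signal_to_noise(signal, bg_mini)
print(f"S/N large beam: {snr_large:.1f}, mini-beam: {snr_mini:.1f}")
```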
mini-beam; microbeam; microcrystals; microdiffraction; high mosaicity; inhomogeneous crystal; signal-to-noise; crystal segment; beam divergence; streaky spots
EMEGS (electromagnetic encephalography software) is a MATLAB toolbox designed to provide novice as well as expert users in the field of neuroscience with a variety of functions to perform analysis of EEG and MEG data. The software consists of a set of graphical interfaces devoted to preprocessing, analysis, and visualization of electromagnetic data. Moreover, it can be extended using a plug-in interface. Here, an overview of the capabilities of the toolbox is provided, together with a simple tutorial for both a standard ERP analysis and a time-frequency analysis. Latest features and future directions of the software development are presented in the final section.
We describe a collection of standardized image processing protocols for electron microscopy single-particle analysis using the XMIPP software package. These protocols allow the entire processing workflow to be performed, starting from digitized micrographs up to the final refinement and evaluation of 3D models. A particular emphasis has been placed on the treatment of structurally heterogeneous data through maximum-likelihood refinements and self-organizing maps as well as the generation of initial 3D models for such data sets through random conical tilt reconstruction methods. All protocols presented have been implemented as stand-alone, executable Python scripts, for which a dedicated graphical user interface has been developed. Thereby, they may provide novice users with a convenient tool to quickly obtain useful results with minimal effort in learning about the details of this comprehensive package. Examples of applications are presented for a negative stain random conical tilt data set on the hexameric helicase G40P and for a structurally heterogeneous data set on 70S Escherichia coli ribosomes embedded in vitrified ice.
Motivation: Quantitative real-time polymerase chain reaction (qPCR) is routinely used for RNA expression profiling, validation of microarray hybridization data and clinical diagnostic assays. Although numerous statistical tools are available in the public domain for the analysis of microarray experiments, this is not the case for qPCR. Proprietary software is typically provided by instrument manufacturers, but these solutions are not amenable to the tandem analysis of multiple assays. This is problematic when an experiment involves more than a simple comparison between a control and treatment sample, or when many qPCR datasets are to be analyzed in a high-throughput facility.
Results: We have developed HTqPCR, a package for the R statistical computing environment, to enable the processing and analysis of qPCR data across multiple conditions and replicates.
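HTqPCR itself is an R/Bioconductor package; the core arithmetic of relative quantification that such analyses build on, the 2^-ΔΔCt method, can nevertheless be sketched in a few lines. The Ct values below are invented for illustration.

```python
def ddct_fold_change(ct_target_treat, ct_ref_treat, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression by the 2^-ddCt method: normalize the target gene's Ct
    to a reference gene in each condition, then compare conditions."""
    d_treat = ct_target_treat - ct_ref_treat
    d_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2.0 ** -(d_treat - d_ctrl)

# Hypothetical Ct values: target gene vs a housekeeping reference
fold = ddct_fold_change(22.0, 18.0, 25.0, 18.0)
print(f"fold change: {fold:.1f}")
```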
Availability: HTqPCR and user documentation can be obtained through Bioconductor or at http://www.ebi.ac.uk/bertone/software.
Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogeneous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping).
We transitioned our proteogenomic mapping information system from a traditional entity-relationship design to one based on Ultra-Structure. Our system integrates tandem mass spectrum data, genomic annotation sets, and spectrum/peptide mappings, all within a small, general framework implemented within a standard relational database system. General software procedures driven by user-modifiable rules can perform tasks such as logical deduction and location-based computations. The system is not tied specifically to proteogenomic research, but is rather designed to accommodate virtually any kind of biological research.
We find Ultra-Structure offers substantial benefits for biological information systems, the largest being the integration of diverse information sources into a common framework. This facilitates systems biology research by integrating data from disparate high-throughput techniques. It also enables us to readily incorporate new data types, sources, and domain knowledge with no change to the database structure or associated computer code. Ultra-Structure may be a significant step towards solving the hard problem of data management and integration in the systems biology era.
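The rules-as-data idea at the heart of Ultra-Structure can be caricatured in a few lines of Python: behaviour is changed by editing rule rows, not code. This sketch is far simpler than a real Ultra-Structure system, and the rule fields are invented.

```python
# Rules stored as data rows; a generic engine interprets them.
rules = [
    {"if_type": "peptide", "action": "map_to_genome"},
    {"if_type": "spectrum", "action": "match_peptides"},
]

def apply_rules(record, rule_table):
    """Generic procedure: which actions fire depends only on the rule rows."""
    return [r["action"] for r in rule_table if r["if_type"] == record["type"]]

actions = apply_rules({"type": "peptide", "id": 1}, rules)
print(actions)
```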
This article gives an overview of techniques and procedures for efficient data collection at synchrotron sources.
Modern synchrotron beamlines offer instrumentation of unprecedented quality, which in turn encourages increasingly marginal experiments, and for these, as much as ever, the ultimate success of data collection depends on the experience, but especially the care, of the experimenter. A representative set of difficult cases has been encountered at the Structural Genomics Consortium, a worldwide structural genomics initiative whose Oxford site currently deposits three novel human structures per month. Achieving this target relies heavily on frequent visits to the Diamond Light Source, and the variety of crystal systems still demands customized data collection, diligent checks and careful planning of each experiment. Here, an overview is presented of the techniques and procedures that have been refined over the years and that are considered synchrotron best practice.
data collection; data-collection strategy; structural genomics