Collagen peptides from dinosaur fossil bones millions of years old have been sequenced by microcapillary LC/MS/MS on ion trap and Orbitrap mass spectrometers, and the sequences can be used for phylogenetic analyses. The challenge is twofold: only trace amounts of detectable proteinaceous material are present in extracts of exceptionally well-preserved 68-million-year-old Tyrannosaurus rex and 80-million-year-old Brachylophosaurus fossil bone, and protein databases do not contain sequences from extinct species. To overcome these obstacles, several purification/enrichment steps are necessary, and predicted protein databases were created to help uncover novel sequences. From fewer than 100 amino acids of Type I collagen, the phylogeny of dinosaurs can be accurately assessed across ~20 extant organisms, including critical species such as reptiles and birds, some of which were also sequenced by LC/MS/MS. Several different phylogenetic algorithms all lead to a common phylogenetic signature: dinosaurs are evolutionarily closer to birds than to any other organism. A 160,000–600,000-year-old extinct mastodon fossil bone was also studied by mass spectrometry. In the absence of preserved DNA, peptide sequences acquired by mass spectrometry are a viable, and often the only, option for molecularly characterizing ancient species and their evolutionary relationships. These results represent the first example of molecular sequence and phylogenetic analysis from any multimillion-year-old species.
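As an illustration of the distance-based reasoning behind such phylogenetic placement, the sketch below computes p-distances between aligned sequence fragments and reports the nearest reference taxon. The sequences and taxa are hypothetical placeholders, not the published collagen data, and real analyses use substitution models and tree-building algorithms rather than a nearest-neighbor lookup:

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing positions between two aligned sequences:
    the simplest distance measure used in distance-based phylogenetics."""
    assert len(seq_a) == len(seq_b), "sequences must be aligned"
    return sum(a != b for a, b in zip(seq_a, seq_b)) / len(seq_a)

def nearest_taxon(query, references):
    """Return the reference taxon with the smallest p-distance to the query."""
    return min(references, key=lambda name: p_distance(query, references[name]))

# Hypothetical aligned collagen-like fragments (placeholders, not real data):
refs = {
    "chicken":   "GPAGPQGPRGDKGE",
    "alligator": "GPAGPQGSRGDQAE",
    "frog":      "GAAGPQGSRADQAE",
}
dino = "GPAGPQGPRGDQGE"
print(nearest_taxon(dino, refs))  # prints "chicken"
```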
To effectively monitor protein phosphorylation events governing signaling cascades, we have developed a mass spectrometry–based methodology enabling the simultaneous quantification of tyrosine phosphorylation of specific residues on dozens of key proteins in a time-resolved manner. We have recently applied this technique to quantify temporal profiles of tyrosine phosphorylation following insulin stimulation of 3T3-L1 adipocytes. In this analysis, 122 tyrosine phosphorylation sites on 89 proteins were identified. Insulin treatment caused a change of at least 1.3-fold in tyrosine phosphorylation on 89 of these sites. Among the responsive sites, 20 were previously known to be tyrosine phosphorylated upon insulin treatment, including sites on the insulin receptor and IRS-1. The remaining 69 responsive sites have not previously been shown to be altered by insulin treatment. We have now extended this analysis to identify and quantify additional phosphorylation sites in the network, including serine/threonine phosphorylation sites on Erk substrates. Currently, we are applying this technology to probe the mechanisms underlying insulin resistance in both cell-culture systems and in vivo. By quantifying tyrosine phosphorylation–mediated signaling networks under normal and insulin-resistant conditions, we will identify the key signaling nodes associated with this pathological condition. These nodes can then be targeted with next-generation therapeutics to restore insulin sensitivity. This work combines novel mass spectrometry methodology, quantitative phenotypic measurements, and computational modeling. Through this combination, this project will yield novel insights into the systems-level regulation of insulin sensitivity and resistance.
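A minimal sketch of the fold-change criterion described above, counting a site as responsive when its phosphorylation changes at least 1.3-fold in either direction; the site names and ratios are hypothetical, not the study's data:

```python
def classify_responsive_sites(fold_changes, threshold=1.3):
    """Keep phosphosites whose stimulated/basal intensity ratio changes
    by at least `threshold`-fold, up or down."""
    return {site: r for site, r in fold_changes.items()
            if r >= threshold or r <= 1.0 / threshold}

# Hypothetical stimulated/basal ratios for four sites:
ratios = {"IR_Y1162": 5.2, "IRS1_Y612": 3.8, "GAB1_Y627": 1.1, "SHC1_Y317": 0.6}
print(sorted(classify_responsive_sites(ratios)))
```

Note that the threshold is applied symmetrically: a ratio of 0.6 (suppression) is responsive because it falls below 1/1.3.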
Cell behavioral functions are controlled by biomolecular networks that translate stimulatory cues (e.g., ligand/receptor binding interactions, mechanical stresses, pathogen infection, and other environmental insults) into intra-cellular signals which regulate transcriptional, metabolic, and cytoskeletal processes that affect proximal and ultimate cell responses. While there is a growing body of work enhancing our understanding of how intracellular signals are generated by stimulatory cues, an exceptionally difficult challenge at the present time is to understand how these signals operate in an integrated manner to govern cell phenotypic behavior. We are addressing this question via a combination of quantitative, dynamic protein-centric experimental manipulations and measurements with a spectrum of computational mining and modeling approaches. A major emphasis is on ascertaining how effects of prospective therapeutics might be usefully predicted, with complementary endeavors to elucidate the logic of interactions among the various molecular components and pathways as an information-processing circuit. This talk will present an overview of our perspective and approach, along with a specific example describing work aimed at understanding integrated operation of signaling pathways governing T-cell activation.
The ability to systematically discover the molecular basis of diverse diseases would revolutionize the design of therapeutic strategies. High-throughput technologies, including transcriptional profiling and phosphoproteomics, provide valuable windows onto these cellular processes. However, each of these methods detects only a small fraction of the changes that occur in a cell. As a result, these experiments are often difficult to interpret and sometimes become little more than lists of genes. I will present a novel, integrative approach to analyzing diverse data that frequently reveals a coherent, mechanistic view of cellular response, including many components that are invisible to each assay.
The glycomics field is currently focused on the structural characterization of released glycans. Typically, glycans are released from glycoproteins and then analyzed by MS either in their native state or following chemical modification. However, this procedure results in a complete loss of information about the glycoproteins from which the glycans originated. To analyze glycoprotein expression, lectin affinity chromatography (LAC) has been employed to separate glycopeptides from nonglycosylated species, thus simplifying further analysis by MS/MS. Unfortunately, the specificity of the lectin does not facilitate the global isolation of glycoproteins or glycopeptides bearing a diverse population of glycan structures. The goal of this work was to develop methodologies to identify the glycan compositions and glycosylation sites of glycoproteins in complex mixtures. This glycoproteomic approach characterizes intact glycopeptides rather than analyzing the glycans and peptides separately, which would lose the information on which glycan structures are present on each glycoprotein. The first step in this approach was the development of a normal-phase liquid chromatography (NPLC)-based glycopeptide enrichment procedure. NPLC is routinely used for glycan separations, and here we show that glycopeptides can be effectively separated from their nonglycosylated counterparts using this approach. Deglycosylation of the enriched glycopeptides followed by RPLC-MS/MS analysis allowed the identification of 70 unique glycopeptides mapping to 37 unique glycoproteins at a 1% false-discovery rate. The serum glycoproteins identified in these experiments span a greater than four-order-of-magnitude range in concentration, indicating that the NPLC method is highly effective at enriching low-abundance glycopeptides.
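A 1% false-discovery rate of the kind reported above is commonly estimated with a target-decoy search strategy; the sketch below shows that idea under the standard decoys/targets approximation (the scoring scheme and data are hypothetical, not the exact procedure used here):

```python
def decoy_fdr(psms):
    """Standard target-decoy FDR estimate: (#decoy hits) / (#target hits)."""
    decoys = sum(1 for label, _ in psms if label == "decoy")
    targets = sum(1 for label, _ in psms if label == "target")
    return decoys / targets if targets else 0.0

def filter_at_fdr(psms, max_fdr=0.01):
    """Sort peptide-spectrum matches by score and keep the largest
    score-ranked prefix whose estimated FDR stays at or below max_fdr;
    return the surviving target identifications."""
    ranked = sorted(psms, key=lambda p: p[1], reverse=True)
    accepted = []
    for i in range(1, len(ranked) + 1):
        if decoy_fdr(ranked[:i]) <= max_fdr:
            accepted = ranked[:i]
    return [p for p in accepted if p[0] == "target"]

# Hypothetical (label, score) matches; the low-scoring decoy is excluded:
psms = [("target", 10.0), ("target", 9.0), ("target", 8.0), ("decoy", 2.0)]
print(len(filter_at_fdr(psms)))  # prints 3
```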
A new paradigm for cancer biomarker discovery is proposed. Detecting and analyzing the glycans that decorate glycoproteins may provide more specific detection of cancer than examining the proteins themselves. Proteins are often modified by the attachment of sugar units called glycans during the normal course of protein production; it is estimated that over 50% of all human proteins are modified in this way. Many studies demonstrate that the glycans produced in cancer cells differ significantly from those in normal cells, and aberrant glycosylation is observed in many types of disease. In our laboratory, we harvest glycoproteins from patient serum and extract their glycans to determine whether glycosylation has changed in cancer patients compared with healthy individuals. This glycan assay is used to discover biomarkers for the diagnosis of cancer.
We have results from several pilot studies that strongly support the concept of using the glycan assay to detect ovarian, breast, and prostate cancer. Glycans are readily observed when harvested from serum. We developed a method to analyze both cancer cell line supernatant and patient serum to identify aberrant glycosylation in the shed glycoproteins. This method involves separating the oligosaccharides into three fractions of differing solvent polarity and then examining the mass profile of each fraction. We are refining this technique to make it more robust and sensitive for glycan analysis. Matrix-assisted laser desorption/ionization and electrospray ionization are used in conjunction with high-accuracy mass spectrometry for marker discovery. Tandem MS, specifically infrared multiphoton dissociation and collision-induced dissociation, is used to obtain structural information. Results show that specific types of anionic oligosaccharides and complex N-linked oligosaccharides are upregulated in cancer patients.
Cancer is a major public health burden in the United States and other developed countries. Currently, one in four deaths in the United States is due to cancer. Because current methodologies cannot reliably detect the disease early, attempts are being made to identify molecular biomarkers for the disease. The biomarkers with the greatest potential for improving risk assessment, early detection/diagnosis, and prognosis of cancer will be those that are truly representative of the biological characteristics of cells destined to develop into cancer or that are typically expressed by cells at the earliest stages of the disease. In this presentation, approaches developed in the burgeoning fields of glycomics and glycoproteomics, which allow confident biometric measurements that convey molecular information about the biological conditions associated with cancer, will be presented and discussed. The inherent complexity of a glycome, which refers to the identity of all glycans associated with cells, tissues, organisms, etc., is substantial, stimulating the development of new methodologies. Several MS-based approaches allowing high-sensitivity characterization of a glycome and a glycoproteome will be discussed, including solid-phase and on-line glycan permethylation, multiplexing comparative glycomic mapping (MC-GlycoMAP) through stable-isotope quantification, and multidimensional chromatography for glycoprotein enrichment and fractionation. The utility of these approaches for defining glycan and glycoprotein biomarkers present in the blood serum of breast, prostate, and liver cancer patients will be presented and discussed.
The term lipidomics first appears in papers indexed by PubMed in 2003 and has been defined as the large-scale analysis of lipid molecular species in a tissue, cell, or subcellular fraction. Mass spectrometry (MS) has been a key tool in lipidology since the 1950s, though until the advent of electrospray ionization (ESI), most MS methods required volatile analytes, usually prepared by chemical derivatization in which some molecular information is lost. ESI-MS enables analysis of intact molecular species and, with liquid chromatography and/or MS/MS and MS/MS/MS, may provide all the information required to fully characterize isomeric species. Unlike most other fractions of the metabolome, lipids are defined by a broad chemical property: they are those biomolecules that dissolve in non-polar solvents. The heterogeneity in structures and concentrations is vast. For instance, glycerolipids are generally the most abundant components, specifically triacylglycerols, which exist as nearly pure droplets in adipocytes, and phospholipids, which constitute the bulk of cell membranes. At much lower concentrations are the well-studied eicosanoids (20 carbons) and the emerging docosanoids (22 carbons), both of which tend to be short-lived, potent signaling molecules active at picomolar concentrations. Because of this heterogeneity of chemical structures, the MS response of any particular molecular form can range over many orders of magnitude, depending on ESI conditions and the MS modes used. Calibration is problematic because commercial standards are available for only a relatively small subset of the potentially thousands of lipids that may be revealed in an analysis. A strategy to obviate this issue focuses on peak-by-peak comparison of signals from differently treated samples to characterize "fold changes." As the availability of standards improves, quantitative lipid profiles within a sample become possible. We end with a brief consideration of the software tools necessary for high-throughput lipidomics.
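The peak-by-peak fold-change strategy can be sketched as a simple m/z-matched intensity comparison; the m/z values, intensities, and tolerance below are illustrative assumptions, not real lipid spectra:

```python
def fold_changes(sample_a, sample_b, mz_tol=0.02):
    """Match peaks between two samples by m/z (within mz_tol) and report
    the intensity fold change of each matched peak (B relative to A)."""
    changes = {}
    for mz_a, int_a in sample_a.items():
        for mz_b, int_b in sample_b.items():
            if abs(mz_a - mz_b) <= mz_tol:
                changes[mz_a] = int_b / int_a
                break  # take the first close-enough match
    return changes

# Illustrative peak lists (m/z -> intensity):
control = {760.58: 1.0e5, 782.57: 2.0e5, 810.60: 5.0e4}
treated = {760.59: 3.0e5, 782.57: 2.1e5}  # 810.60 absent after treatment
print(fold_changes(control, treated))
```

Unmatched peaks (here, 810.60) are simply dropped; a production pipeline would also have to handle missing values and retention-time alignment.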
The analysis of lipids by mass spectrometry has made significant advances over the past several decades, largely due to the development of electrospray ionization and matrix-assisted laser desorption/ionization (MALDI). These two ionization techniques generate abundant gas-phase ions, corresponding to the molecular weight of the lipid, for all known lipids: in positive-ion mode after attachment of a proton, alkali metal ion, or ammonium ion, or in negative-ion mode after loss of a proton. These molecular ions can be collisionally activated to undergo collision-induced decomposition and thereby generate structural information. Specific examples of the analysis of common lipids and advances in the separation of neutral lipids by normal-phase and reversed-phase LC/MS and LC/MS/MS techniques will be provided. Understanding which lipids are present within cells can now include information on unique molecular species of lipids. For example, molecular species of phospholipids carry information on fatty acyl esterification as well as the polar head group. While the pathways of lipid biosynthesis are known in detail, virtually nothing is known about annotating these pathways at the level of molecular species. A separate advance in lipid analysis has been the use of mass spectrometry to detect lipids as they exist within the tissue environment. This technique of mass-spectrometric imaging uses MALDI to generate mass spectral data that can be reconstructed into an image of the distribution of specific lipids. If the instrument employed includes a high-resolution time-of-flight sector preceded by a collision cell, one can carry out tandem mass spectrometry experiments at high resolution and deduce the chemical structures of those lipids even when they are present within tissues.
The corona discharge used to generate positive and negative ions under conventional atmospheric pressure chemical ionization (APCI) conditions also provides a source of low-energy gas-phase electrons, thought to arise by displacement of electrons from the nitrogen sheath gas. Therefore, suitable analytes can undergo electron capture in the gas phase in a manner similar to that observed in gas chromatography/electron capture negative chemical ionization/mass spectrometry (MS). This technique, named electron capture APCI mass spectrometry (ECAPCI/MS), provides a two-order-of-magnitude increase in sensitivity compared with conventional APCI methodology.1 It is a simple procedure to tag arachidonic acid (AA)- and linoleic acid (LA)-derived oxidized lipids with an electron-capturing group such as the pentafluorobenzyl (PFB) moiety before analysis. PFB derivatives have previously been used as electron-capturing derivatives because they undergo dissociative electron capture in the gas phase to generate negative ions through loss of a PFB radical. A similar process occurs under ECAPCI conditions. By monitoring the negative ions that are formed, extremely high sensitivity can be obtained for PFB derivatives of oxidized lipids derived from AA and LA. A combination of stable isotope dilution methodology and chiral LC-ECAPCI/MS makes it possible to resolve and quantify complex mixtures of regioisomeric and enantiomeric oxidized lipids.2 Supported by NIH grants RO1CA95586 and P30ES013508.
It is anticipated that plasma should contain signatures of all human proteins (at some time during life). Unfortunately, most are at low concentrations relative to the classical plasma proteins, and it is therefore difficult to characterize many proteins in a single analysis. This talk will focus on the development of new mass spectrometry (MS)-based technologies aimed at characterizing complex mixtures of proteins. When a packet of mixed ions in a buffer gas is exposed to an electric field, the ions separate according to differences in their mobilities. This separation is the basis for the long-standing analytical technique of ion mobility spectrometry (IMS). In the last ten years, the development of new ionization sources, reliable computational approaches for calculating mobilities for trial geometries, and efficient instrument designs coupling IMS with mass spectrometry and liquid chromatography has led to many new IMS applications. This talk will focus on recent progress in developing multidimensional IMS-MS and IMS-IMS instrumentation. The latter approach, IMS-IMS, has been developed during the last two years and is analogous in many ways to MS/MS; however, components are resolved based on differences in mobilities rather than m/z ratios. The instrumentation offers advantages in selectivity, capacity, sensitivity, and dynamic range compared to IMS alone.
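The mobility separation underlying IMS follows from ions drifting at a velocity proportional to the field, v = K·E, so the drift time through a region of length L is t = L/(K·E). A sketch, with illustrative (not instrument-specific) numbers:

```python
def drift_time_ms(mobility, length_cm, field_v_per_cm):
    """Drift time (ms) of an ion of mobility K (cm^2 V^-1 s^-1) through a
    drift region of length L under uniform field E: v = K*E, t = L/(K*E)."""
    return length_cm / (mobility * field_v_per_cm) * 1000.0

# Two ions with different mobilities separate in arrival time
# (values are illustrative only):
slow = drift_time_ms(1800.0, 100.0, 12.0)
fast = drift_time_ms(2000.0, 100.0, 12.0)
print(round(slow, 2), round(fast, 2))  # the lower-mobility ion arrives later
```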
Mass spectrometry is currently at the forefront of technologies for mapping protein-protein interactions, as it is a highly sensitive technique that enables the rapid identification of proteins from a variety of biological samples. When used in combination with affinity purification and/or chemical cross-linking, whole or targeted protein interaction networks can be elucidated. Several methods have recently been introduced that display increased specificity and a reduced occurrence of false positives. In the future, information gained from human protein interaction studies could lead to the discovery of novel pathway associations and therapeutic targets.
G protein-coupled receptors (GPCRs), which are normally embedded in hydrophobic membrane environments, are typically challenging to isolate and maintain in an active state. Surface plasmon resonance (SPR) can be used to identify suitable conditions for extracting GPCRs from membranes while preserving the receptors’ abilities to bind conformational antibodies, native ligands, and small molecules. Using both traditional Biacore platforms and array technologies, we have developed methods to screen solubilization conditions (e.g., detergent, salts, and pH) to maximize the yield and activity of GPCRs. In addition, we are using SPR to help develop affinity purification protocols for GPCRs. The biosensor allows us to determine which ligands are most suitable for resin coupling, identify the conditions required to elute bound receptor, and track the activity of the eluted fractions. Together, these SPR-based optimization methods permit us to study ligand-binding mechanisms and their effects on receptor conformation. Optimizing the activity and purification protocols for GPCRs provides an important first step towards structural analysis of these receptors and can provide a more complete understanding of their functions.
Fragment-based lead discovery is a complementary approach to high-throughput screening in which small molecule fragments (MW < 300 Da) with affinities in the hundreds-of-micromolar to millimolar range are identified as binders to drug targets and, in combination with X-ray crystallography, can be developed into potent inhibitors. Relative to traditional high-throughput screening, fragments have the advantages that larger areas of medicinal chemistry space can be covered, the small compounds often reveal novel binding modes, and fragments often make highly efficient interactions with target proteins. Technologies like X-ray crystallography and NMR have been used for fragment screening but are time- and protein-intensive and do not readily yield binding-constant information. Recent advances in surface plasmon resonance (SPR)-based optical biosensor technology have allowed the technology to be applied to studies of small-molecule/protein interactions. By combining the latest methods in biosensor operation with the standard methodologies of high-throughput screening, we have developed a high-throughput procedure for hit identification from fragment libraries for entry into lead-generation chemistry. Key features are the ability to screen and verify binding of thousands of compounds in a short time, the large dynamic range of the assay (100 pM to 5 mM), and the low amount of protein required to complete a fragment-screening study (<0.5 mg protein from assay development through hit validation). Details of the experimental design and case examples will be presented.
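The wide dynamic range of such an assay reflects the underlying binding equilibrium; assuming a simple 1:1 Langmuir isotherm, the occupancy of an immobilized target at a given fragment concentration can be sketched as follows (the Kd and concentrations are hypothetical):

```python
def fraction_bound(conc, kd):
    """Equilibrium occupancy of a 1:1 binding site at analyte
    concentration `conc` (same units as `kd`): C / (C + Kd)."""
    return conc / (conc + kd)

# A weak fragment with a hypothetical Kd of 500 uM is half-saturating
# at C = Kd and approaches saturation well above it:
for c in (5e-5, 5e-4, 5e-3):
    print(round(fraction_bound(c, 5e-4), 3))
```

This is why a screen must span concentrations bracketing the expected Kd range: occupancy moves from near zero to near saturation over roughly two orders of magnitude around Kd.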
Core facility personnel are research professionals who provide services in high-end technologies to academic, government, and industry researchers. Often the services are provided for a fee. What is the appropriate recognition of these services in publications? Publications are important for core facilities, as they document the skill and prowess of the core, and can be vital for professional advancement of core staff. Authorship should be provided if there has been a significant intellectual contribution to the project, including: experimental design, analysis and interpretation of data, critical input, original ideas, work significantly beyond the call of duty, writing sections of the manuscript or grant proposal, issuing final approval for that section, and assuming primary responsibility for the content of that section. Skilled technical contributions, on the other hand, should be acknowledged as such. Where is the line between technical contribution and significant intellectual contribution? What is considered a simple service and what is legitimately considered an intellectual contribution? Does a new approach, a new analytical method designed for a specific project, deserve authorship? Criteria could also include whether or not a project could have succeeded under the same set of circumstances if a given co-author were absent from the project. Are quantity and quality of contribution key factors? We will refer to such documents as the recommendations for authorship by the International Committee of Medical Journal Editors (http://www.icmje.org/index.html#authorship) and others. We intend to provide general guidelines for how to acknowledge expert core facility services in publications, and several examples involving various technologies will be presented for discussion.
The economically important dry bean plant Phaseolus vulgaris is a host for the rust fungus Uromyces appendiculatus. Resistance genes have been defined and have been used to protect the bean crop, but the proteins governing resistance are not well-resolved, especially compared to model plant-pathogen systems. To characterize the nature of resistance, we have used high-throughput tandem mass spectrometry to detect and analyze more than 3000 proteins from infected bean leaves. By statistically comparing the amounts of proteins detected in a single plant variety that is either susceptible to infection or resistant, depending on the fungal strains introduced, we have distinguished resistance from susceptibility at a proteomic level. Several other plant proteomic responses, some of which may favor the pathogen, also change during the course of infection. These results provide a basic foundation for understanding the proteomics of disease responses for a major crop plant.
Rhizomania, caused by Beet necrotic yellow vein virus (BNYVV), is a devastating viral pathogen of sugar beet. There are limited sources of resistance against the virus, and resistance-breaking isolates are becoming increasingly problematic worldwide. Developing more effective disease-control strategies starts with gaining a better understanding of the basis for resistance and the mechanism of disease. Multidimensional liquid chromatography was employed to examine proteins differentially expressed in nearly isogenic lines of sugar beet either resistant or susceptible to BNYVV infection. Protein expression was temporally regulated, and in total, 7.4 and 11% of the detected proteome was affected by BNYVV challenge in the resistant and susceptible genotype, respectively. Sixty-five of the proteins induced or repressed by the virus were identified by MALDI-TOF tandem mass spectrometry, and expression of key defense- and disease-related proteins was further verified using qualitative reverse transcriptase polymerase chain reaction. The proteomic data suggest involvement of classic systemic resistance components in Rz1-mediated resistance and of phytohormones in hairy root symptom development. Movement between cells may be linked to disease severity. Recent efforts have focused on developing effective protein-protein interaction techniques (pull-down assays, Far Westerns, yeast 2-hybrid) for determining the host factors that interact with viral movement proteins.
A method is presented for deducing the amino acid sequences of proteins from non-genomically sequenced bacteria using only MS and MS/MS techniques. The method combines multiple, non-overlapping proteomic identifications from homologous sequence regions of proteins from genomically sequenced bacterial strains. This "composite" sequence proteomic analysis (CSPA) involves "bottom-up" proteomic identification of protein biomarkers from non-genomically sequenced Campylobacter species/strains by combining multiple, non-overlapping sequence regions from genomically sequenced Campylobacter species/strains. The genomically sequenced strains may be phylogenetically proximate or distant to the non-genomically sequenced strain. Composite sequences were confirmed by both MS and MS/MS analysis; in addition, gene sequencing was used to confirm their correctness. The composite sequence obtained can be utilized for: (1) protein molecular weight-based algorithms for pathogen identification by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS); (2) strain-specific biomarkers for analysis by "top-down" proteomic techniques; (3) peptide-centric databases used for bacterial microorganism identification. CSPA can be used to identify the full amino acid sequence of protein biomarkers of emerging (i.e., non-genomically sequenced) bacterial strains without either genetically sequencing the biomarker genes or attempting full de novo MS/MS sequencing. Finally, CSPA is discussed with respect to whether lateral (horizontal) gene transfer across Campylobacter species is responsible for the non-overlapping homologous sequence regions observed.
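The core assembly step of CSPA, stitching non-overlapping homologous regions into one composite sequence, can be sketched as follows; the coordinates and fragments are hypothetical, and the real workflow derives the regions from MS/MS identifications against reference strains:

```python
def composite_sequence(regions, length):
    """Assemble a composite protein sequence from non-overlapping regions
    matched to homologous reference proteins; `regions` maps (start, end)
    0-based half-open intervals to sequence fragments, and unmatched
    positions are reported as 'X'."""
    seq = ["X"] * length
    for (start, end), fragment in sorted(regions.items()):
        assert end - start == len(fragment), "interval/fragment mismatch"
        seq[start:end] = fragment
    return "".join(seq)

# Hypothetical fragments from two reference strains covering parts of a
# 12-residue biomarker:
print(composite_sequence({(0, 5): "MKTAY", (8, 12): "GLDW"}, 12))
# prints MKTAYXXXGLDW
```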
Our research centers on the molecular engineering of DNA as a nanoscale material for real-world applications. In particular, our group focuses on three directions: DNA nanobarcodes, DNA gels, and DNA nanoparticles. By taking advantage of the remarkable chemical, physical, and biological properties of DNA and by utilizing a myriad of DNA-manipulating enzymes, we have created, on a bulk scale with high yield, DNA dendrimers, DNA-based addressable molecules and materials, DNA-based nanobarcode systems, DNA hydrogels, DNA liposomes, and DNA-organized Au nanoparticles. New properties and applications are expected from DNA-based materials. For this conference, I will discuss how DNA gels can be used in the life sciences and beyond. In particular, I will present two enabling platform technologies: the first is for drug delivery—a DNA gel that can deliver any living cells; the second is for protein production—a DNA gel that can produce proteins without any living cells. I will further discuss the future impact of these potentially disruptive DNA gel technologies.
Iterative search is the most efficient strategy for lead optimization in terms of minimizing the number of real molecules needed to identify the best lead opportunities from a huge diversity pool, but its application during the critical hit-to-lead phase is hampered by the complex infrastructure of high-throughput initiatives. Here we seek to illustrate the feasibility of a more elegant solution.
Although high-throughput assays are often conducted with only a few thousand molecules of the target and a ligand, the macro-scale logistics of compound management incur high losses and demand that large amounts of each test molecule be maintained, requiring extra effort in chemistry and protein production. Numerically efficient but diversity-deficient high-throughput chemical procedures are often used to compensate. Unfortunately, these are usually much less amenable to iterative strategies at a point in the process where the basis of ultimate success is to drive rationally toward the greatest chemical diversity to identify the best possible lead series.
The presentation will describe a no-loss, microfluidics-based, closed-loop iterative strategy for screening, hit validation, and early lead optimization. In addition, tools that allow benchtop integration of the chemistry, biology, and informatics, with the capability to overcome at low cost the logistical and diversity limitations of the current paradigms, will be described. Early results show that syntheses and assays that normally require hours can be available in minutes and that, under the control of suitable software, the biology of chemically diverse compounds can be explored.
Monoclonal antibodies and other therapeutic recombinant proteins must be well characterized to establish the molecular and biological properties important in ensuring safety and efficacy. Due to cellular and manufacturing processes used during production and storage, these proteins are often a mixture of the product expected from the DNA sequence and variants generated from post-translational modification, enzymatic processing, and degradation. Regulatory agencies require that these product variants be characterized, categorized as product-related substances or product-related impurities, and monitored to ensure product quality and consistency. Thorough physicochemical characterization involves the use of multiple orthogonal analytical methods to assess specific properties such as purity, potency, and stability. An overview of Genentech’s physicochemical characterization program, examples of the analytical challenges, case studies of the impact of specific chemical modifications, and regulatory implications will be presented.
The details of protein structures begin with the amino acid sequence, but include co- and post-translational modifications and associations with other proteins and components of the biological milieu in which the protein is resident. The details are often dynamic, and multiple forms of the protein are frequently present simultaneously. These refinements are critical because they convey specific biophysical and biochemical properties to the protein and its assemblies and may affect turnover and transport. While it is important to identify the members of the proteomes at the level of organisms, tissues, cells, and even subcellular compartments, full understanding of function requires that each of the structures be defined as fully as possible. This part of the analysis is challenging, demanding, and constantly exciting to the imagination, since it necessitates the creative utilization of multiple mass spectrometric techniques, specialized approaches for sample isolation, derivatization, and/or degradation, and the use of complementary analytical techniques, e.g., multistage separations, atomic force microscopy, bioassays. We apply combinations of approaches to study congenital and sporadic diseases that involve phenomena such as protein misfolding, cardiovascular oxidative stress, variations in glycosylation, and infection. The rationale and results from a few recent and ongoing structural adventures will be presented.
Modern protein characterization has been greatly enhanced by the plethora of new analytical tools available to the protein chemist. Mass spectrometry (MS) has been at the forefront of this advancement, as new modes of analyte ionization and introduction, detection and measurement, and value-added fragmentation processes have allowed proteins to be measured more sensitively with deeper understanding of their structure and function. Examples of how many of these new MS-based tools are applied by our laboratory to generate structural information for intact proteins will be discussed. This will include: the measurement of binding stoichiometry of noncovalent protein-ligand, protein-protein, and protein-metal complexes; the elucidation of ligand binding sites by electron capture dissociation with high-resolution Fourier transform ion cyclotron resonance and electron transfer dissociation with linear ion trap MS; MS-ion mobility measurements to differentiate gas-phase/solution-phase protein conformations; new ambient modes of desorption/ionization for top-down MS; and methodologies to address supramolecular assemblies in excess of 1 MDa. Measurement of protein molecular weight is near routine using mass spectrometry. A next frontier to conquer is the capture of information critical to decipher the functional role of proteins. New mass spectrometry–based technologies potentially can address this goal.
Molecular detection procedures for pathogen-specific DNA and RNA have transformed clinical medicine. Once the sole purview of large, specialized laboratories, molecular diagnostic procedures are now done in smaller hospital laboratories and in point-of-care situations. The enabling technology in this transformation is real-time PCR. This presentation will review some real-time technologies and discuss how they have transformed clinical testing, laboratory design, and time to result.
New DNA sequencing technologies present practical challenges for implementation in core facility environments. This workshop will present the experiences of core facilities that are currently offering these new technologies. Implementation of next-generation sequencing platforms as shared resources enables cost-effective access and broad-based use of these high-impact emerging technologies.
The recent introduction of new DNA sequencing technologies presents an exceptional opportunity for novel and creative applications with the potential for breakthrough discoveries. To support such research efforts, we have installed an Illumina Solexa Genome Analyzer platform as an academic core facility shared research resource. We have established sample-handling methods and informatics tools to build robust processing pipelines in support of these new technologies. Our DNA sequencing and genotyping core laboratory provides sample-preparation and data-generation services, and, in collaboration with the gene expression and informatics core facilities, provides both project consultation and analysis support for a wide range of possible applications, including whole-genome re-sequencing, amplicon resequencing, mutation detection, SNP genotyping, small RNA profiling, and genome-wide measurements of protein–nucleic acid interactions. Implementation of next-generation sequencing platforms as shared resources with multi-disciplinary core facility support enables cost-effective access and broad-based use of these emerging technologies.
Next-generation DNA sequencing technology has clearly altered the path of gene-based research. The cost efficiency and production capacity of new systems like 454, Solexa, and SOLiD open research opportunities that would have been practically impossible only two or three years ago. Our first leap into nextGen DNA sequencing with 454 seems an obvious choice today, but three years ago we had several primary considerations: (1) Where would we find the initial funding for the instrument? (2) Could we convince our research community that a new way of DNA sequencing would be reliable and give them the results they expect? (3) What special expertise would we need in order to successfully implement this technology? (4) Were there special considerations for sample processing and locating the instrument? (5) How would nextGen impact our existing Sanger sequencing process and customers? and (6) How would we manage the data and deliver them to the end user? We will report on our experience implementing 454 DNA sequencing technology at the University of Florida, Interdisciplinary Center for Biotechnology Research. We will also discuss our decision to further develop our nextGen capacity by acquiring additional new technology.
Core facilities are becoming more and more commonplace within research institutions. They are a cost-efficient way to deal with limited resources, especially when dealing with technologies that employ costly equipment requiring specialized expertise and maintenance, such as cytometry. This tutorial will examine the many important aspects of establishing and running a cytometry-based core facility. New core managers (or maybe even a few seasoned managers) will benefit from this comprehensive review of the factors that need consideration in order to have a successful core facility. Topics will include: assessing user needs, instrument selection/evaluation, financial planning, facilities planning, facility operations, and evaluating success/failure. Participants will be able to apply these principles to their own situations in order to better manage their own facility.
We describe the adaptation of a hybrid quadrupole linear ion trap (QLT)-orbitrap mass spectrometer to accommodate electron transfer ion/ion reactions (ETD) for peptide and protein characterization. ETD-inducing anions are generated and injected by a variety of schemes, i.e., from dual AP sources or via chemical ionization. Independent of how the anions were generated and injected, the subsequent ion/ion reactions are conducted within the QLT; after this, the product ions are passed to the orbitrap for m/z analysis. With this arrangement, mass accuracies are typically measured to within 1 ppm at a resolving power of ~60,000. Using large peptides and intact proteins, we demonstrate such capabilities will accelerate our ability to interrogate high-mass species. A key benefit from this increased mass accuracy is the confidence with which fragment ions can be identified. Finally, we describe other new technological capabilities afforded by the implementation of ETD on the hybrid orbitrap mass spectrometer. For example, we have designed a decision tree–based, data-dependent method that automatically selects whether to perform CAD or ETD, based on the precursor charge, mass, and abundance, and whether to perform mass analysis at high or low resolving power. For complex mixture analysis, this method nearly doubles the probability that any given MS/MS event will lead to a successful identification.
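The decision tree–based acquisition logic described above can be sketched as a simple rule function. This is a hypothetical illustration only: the charge and m/z cutoffs below are assumptions chosen for demonstration, not the published decision tree.

```python
# Illustrative sketch of a data-dependent decision tree that selects a
# fragmentation mode per precursor ion. The cutoff values are assumptions
# for demonstration, not the published parameters.

def choose_activation(charge: int, mz: float) -> str:
    """Return 'CAD' or 'ETD' for a precursor with the given charge and m/z."""
    if charge == 2:
        # Doubly charged (typically tryptic) peptides generally fragment well by CAD
        return "CAD"
    if charge >= 3 and mz < 1000.0:
        # Higher charge density favors efficient electron transfer dissociation
        return "ETD"
    return "CAD"
```

In a real implementation, precursor abundance would additionally gate whether high- or low-resolving-power mass analysis is performed.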
One challenge for the utilization of ECD and ETD is the requirement for at least doubly charged precursor ions. We have explored the utility of using divalent metal adducts as charge carriers in electrospray ionization (ESI) of acidic and neutral species, including acidic peptides, oligosaccharides, and small-molecule metabolites. Notably, we observed complete sulfate loss in ECD of protonated sulfated peptides. By contrast, sulfate groups were retained in ECD of metal-adducted peptides and oligosaccharides, thereby allowing their localization. Furthermore, extensive sugar cross-ring cleavage, which provides carbohydrate linkage information, was observed in ECD of metal-adducted oligosaccharides. ECD product ion spectra are complementary to those produced from infrared multi-photon dissociation of the same metal-adducted species. In addition, different bond cleavage patterns compared to collisional activation were seen in ECD of metal-adducted phosphate-containing metabolites. These data present exciting prospects for extending the radical ion chemistry of ECD and ETD to the structural analysis of glycomes and metabolomes. Another concern of ECD/ETD for peptide analysis is that positive-mode ESI of tryptic peptides generally produces doubly protonated species, which predominantly yield z ion series due to retention of charge at the basic C-terminal residue. Because z ions are initially radical species, they have been proposed to be more susceptible to secondary fragmentation, and lower abundance has been reported as compared to c ions. Furthermore, it has been demonstrated that doubly protonated precursor ions exhibit limited fragmentation in conventional (i.e., without ion activation or heating) ETD. We performed ECD of the doubly and triply protonated forms of 128 peptides from trypsin, chymotrypsin, and Glu-C digestion of standard proteins. Triply protonated peptides provided an increase (26%) in peptide sequence coverage independent of the enzyme used.
ECD of tryptic peptides, in both charge states, resulted in higher sequence coverage compared to peptides from chymotrypsin and Glu-C digestion.
It is now rare to see an instrument that does not have some type of computing device handling its operational and data-collection tasks. These complex and expensive instruments generate so much data so rapidly that they can require high-speed networks and network data storage. The instrument computer has become as difficult to support, and requires as much specialized knowledge to service, as the instrument itself. Unless a comprehensive IT strategy is developed for instrument computing, inefficiencies, downtime, and security risks can greatly increase the cost of running the facility and slow research.
Stakeholders that support instrumentation have vested interests in how best to service the instrument and its computer. These perspectives place different emphases on individual components of an instrument system. We have found that the problems experienced in instrument computing arise in the support gaps between these differing viewpoints.
We have created a comprehensive lab information technology strategy that combines these different viewpoints and methodologies. We did this through an incremental growth strategy and an all-volunteer approach. We examined different aspects of lab computing support and then tested tools, infrastructure configurations, and organizational approaches to create a scalable, flexible, and unified support philosophy.
Some interesting results have been achieved with this support strategy. IS/IT has begun to treat lab computing as a partnered discipline instead of a series of rogue computers on their desktop networks. Support costs have dropped considerably at the research sites that have embraced this philosophy. Additionally, instrument computers are now showing signs of unified configurations with respect to anti-virus, patch levels, and other important criteria.
In conclusion, we believe that developing a unified support vision for your lab computing environment will allow you to realize large-scale net gains in productivity, cost reduction, and security.
Homogeneous, or “closed-tube,” PCR methods require no processing or automation for analysis. PCR product melting analysis is a convenient, rapid, and closed-tube method without a contamination risk. Whereas electrophoresis identifies products by size, melting curves identify amplification products by melting temperature (Tm) and curve shape.
High-resolution melting analysis enables closed-tube mutation scanning and genotyping without labeled probes or real-time PCR.1–4 In many cases, sequencing becomes obsolete. Both mutation scanning and unlabeled probe genotyping can be derived from the same melting curve with saturating dsDNA dyes. Melting of unlabeled probes at low temperature is complemented by amplicon melting at high temperature, and the analysis requires less than 5 min. In addition, a new method of dsDNA dye genotyping with “snap-back primers” will be introduced.
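The core computation behind melting analysis can be illustrated with a minimal sketch: melting temperatures (Tm) appear as peaks in the negative derivative -dF/dT of the fluorescence curve, with an unlabeled probe melting at low temperature and the amplicon at high temperature. The logistic melt transitions below are synthetic stand-ins for real data, which would also require smoothing.

```python
import math

def melting_peaks(temps, fluor, min_height=0.05):
    """Return temperatures at local maxima of the -dF/dT melting derivative."""
    n = len(temps)
    d = [0.0] * n
    # central-difference negative derivative of fluorescence vs. temperature
    for i in range(1, n - 1):
        d[i] = -(fluor[i + 1] - fluor[i - 1]) / (temps[i + 1] - temps[i - 1])
    return [temps[i] for i in range(2, n - 2)
            if d[i] > min_height and d[i] > d[i - 1] and d[i] > d[i + 1]]

# Synthetic two-transition curve: unlabeled probe melts near 65 C,
# amplicon melts near 85 C (each modeled as a logistic transition).
t = [50.0 + 0.1 * i for i in range(451)]
f = [1 / (1 + math.exp(x - 65.0)) + 1 / (1 + math.exp(x - 85.0)) for x in t]
print(melting_peaks(t, f))  # two peaks, near 65 and 85
```

A single derivative curve thus reports both the probe Tm (genotyping) and the amplicon Tm (scanning) from one closed-tube melt.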
In this talk we will discuss the utility of combining spectral library searching with protein sequence searching for interpretation of peptide MS/MS spectra. Spectral library searching has been an indispensable tool for identifying unknown compounds in other areas of mass spectrometry for more than 25 years. The benefits of applying this technique to peptide MS/MS spectra have been recently validated. They include higher sensitivity and more robust scoring (i.e., small changes in spectrum give small changes in score). These are attributed to the use of consensus peak intensities, collected from many replicate spectra during the library-building process, and peak annotations during scoring. This method works well for discrete instrument classes because today’s spectra are highly reproducible and similarities between replicates are predictably correlated with signal-to-noise. The limitation of this method is that it requires a comprehensive spectral library with respect to the sample being studied. Since this is not always available, sequence searching can provide the additional search space. We have combined the Open Mass Spectrometry Search Algorithm (OMSSA) from NCBI with routines from NIST’s MS Search 2.0 (NISTMS) to generate a pipeline that maximizes the strengths of both methods. To combine the two programs, we used an FDR calculation of a test dataset to derive a relationship model between OMSSA’s E-value and NISTMS’s native dot-product-based scoring metric (NIST Score). Final FDR thresholds can be approximated using a target-decoy strategy. All matches are separately scaled by count and relative database size according to the engine and search library from which they were obtained.
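The target-decoy strategy mentioned above can be sketched as follows. This is a generic illustration of target-decoy FDR thresholding, not the specific OMSSA/NISTMS calibration model; the PSM scores are synthetic, and higher scores are assumed better.

```python
# Sketch of target-decoy FDR thresholding: scan PSMs from best to worst
# score and return the most permissive cutoff whose estimated FDR
# (decoy hits / target hits above the cutoff) stays within max_fdr.

def fdr_threshold(psms, max_fdr=0.01):
    """psms: list of (score, is_decoy) pairs from a concatenated search."""
    best = None
    targets = decoys = 0
    for score, is_decoy in sorted(psms, key=lambda p: -p[0]):
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            best = score
    return best

psms = [(10.0, False), (9.0, False), (8.0, False),
        (7.0, True), (6.0, False), (5.0, True)]
print(fdr_threshold(psms, max_fdr=0.25))  # → 6.0
```

In the combined pipeline, a threshold derived this way on a test dataset anchors the mapping between the two engines' native scoring scales.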
Comparing database search identification algorithms is necessary, but how can one do so fairly? A proper comparison should ensure that algorithms are compared on equal footing, “with all other things being equal.” This ideal, however, is rarely achieved. In this talk, we will examine a recent publication of a comparison among MyriMatch, Sequest, and X!Tandem. Potential sources of error included spectral file format disparities, algorithm configuration conventions, sequence refinement features, raw identification output format differences, score threshold determination, and more. For any algorithm comparison to be valid, these differences should be minimized to the greatest extent possible.
Most tandem mass spectrometry database search algorithms perform a restrictive search that takes into account only a few types of post-translational modifications (PTMs) and ignores all others. We describe two open-source unrestrictive PTM search algorithms—MS-Alignment and Spectral Networks—that search for all types of PTMs at once in a blind mode (i.e., without knowing in advance which PTMs exist in nature). The key idea of spectral alignment is to represent the spectrum of a peptide with parent mass M as a 0–1 sequence of length M that contains 1 in position m for every mass m of a spectral peak, and 0 elsewhere (the masses in the spectrum are assumed to be integers). In this representation, a PTM with positive shift D > 0 is simply an insertion of D zeroes in the sequence, whereas a PTM with negative shift D < 0 is a deletion of D zeroes. Thus, a comparison of two spectra becomes a comparison of two strings in a 0–1 alphabet with insertions and deletions allowed—the classical edit distance problem in bioinformatics. While MS-Alignment capitalizes on spectral alignment to identify modified variants of peptides in a database, Spectral Networks directly align unidentified spectra to discover PTMs and highly modified peptides. Applying these approaches to the analysis of in-lens proteins led to the identification of hundreds of PTMs, doubled the number of known modification sites, and found evidence for several putative novel modifications. MS-Alignment and Spectral Networks are freely available as open-source packages and Web services at http://proteomics.bioporjects.org.
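The 0-1 encoding at the heart of spectral alignment can be made concrete with a short sketch. The peak masses below are arbitrary illustrative integers; the point is that a PTM mass shift moves every peak beyond the modification site, which in the binary-string view is exactly an insertion or deletion of zeroes.

```python
# Sketch of the 0-1 spectrum representation used by spectral alignment:
# a peak at integer mass m sets position m of the string to 1, so a PTM
# with shift D > 0 inserts D zeroes and D < 0 deletes D zeroes, reducing
# spectrum comparison to edit distance on 0/1 strings.

def encode(peak_masses, parent_mass):
    """Binary encoding of a spectrum with integer peak masses."""
    s = [0] * parent_mass
    for m in peak_masses:
        s[m] = 1
    return s

def apply_ptm(peak_masses, site_mass, delta):
    """Shift all peaks beyond the modification site by delta Da."""
    return [m + delta if m > site_mass else m for m in peak_masses]

peaks = [57, 128, 242]                 # illustrative fragment masses
print(apply_ptm(peaks, 100, 80))       # +80 Da shift (e.g., phosphorylation)
```

Aligning the encoded strings with insertions/deletions allowed then reveals both the position and the mass of an unanticipated modification.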
VisiGen Biotechnologies, Inc., is developing a sequencing platform that will enable low-cost, comprehensive genome analysis ($1000 human genome in a day). We are engineering polymerase and nucleotides to act together as direct molecular sensors of DNA sequence information during DNA replication. Single-molecule detection, fluorescent molecule chemistry, computational biochemistry, and biomolecule engineering and purification are being combined to create this novel platform. The molecules are engineered to maximally FRET during nucleotide insertion. As a nucleotide enters the polymerase’s active site, energy transfers from the excited donor fluorophore associated with the polymerase to the acceptor fluorophore bonded to the nucleotide’s gamma-phosphate, thereby stimulating the emission and detection of a base-type-specific signature. Donor fluorescence is equally informative, because it undergoes anti-correlated intensity changes throughout the incorporation cycle. As the acceptor-tagged PPi is released from the polymerase, the distance between it and the donor fluorophore increases, causing the intensity of the acceptor’s fluorescence to decrease and that of the donor’s to simultaneously increase. After a spFRET event, the donor’s emission returns to its original state and is ready to undergo a similar intensity oscillation cycle with the next acceptor-tagged nucleotide. In this way, the donor fluorophore acts as a punctuation mark between incorporation events. Furthermore, because the acceptor fluorophore is naturally removed during nucleotide incorporation, VisiGen’s strategy enables a real-time approach to DNA sequencing. The technology is scalable: these nanosequencing machines are monitored in massively parallel arrays to produce a sequencing platform that will be capable of collecting sequence data at 1 million bases per sec.
We will report a method for the rapid assembly of micro- and nano-particles conjugated with DNA into high-density arrays with near-perfect order and no background. The usage of these arrays of single DNA molecules or molecular clones from fragmented genomic DNA can dramatically increase the throughput and imaging efficiency of the sequencing process so that genome sequencing can be performed with a single miniaturized device. We will also report our recent progress in the development of an automated system with integrated microfluidics and high-speed fluorescence imaging, and the chemistry for DNA sequencing with these arrays.
Applications of genomics to health care require genotyping technologies with higher sensitivity, improved selectivity, faster turnaround, and lower cost. Similar pressures for cost reduction in DNA sequencing are driving the development of single-molecule DNA analysis methods. Nanopore-based single-molecule schemes are emerging as viable candidates for high-throughput DNA detection, for both sequencing and genotyping applications.
Here we report on preliminary work towards the development of solid-state nanopore-based force spectroscopy for genotyping and single nucleotide polymorphism detection by purely electronic means. We have developed the ability to fabricate solid-state pores at the 2–10 nm scale by TEM, opening up exciting new possibilities for expansion of the force spectroscopy work demonstrated on proteinaceous alpha-hemolysin pores. We will present the force-spectroscopy scheme to be employed for clinical genotyping applications, by means of electrophoretic manipulation of charged molecules within the pore. Briefly, we insert in the pore, from the cis-side, an ssDNA probe anchored to a nanoparticle or protein such that a specific sequence protrudes from the trans-side. The DNA analyte is hybridized to the ssDNA probe of specific sequence and forced to dissociate by a reversed applied electric force, which tends to withdraw the probe from the pore. We present preliminary data which allow us to distinguish ssDNA oligomers with single-base specificity by kinetic analysis of the lifetime under force of the duplex formed with a specific probe.
There are many techniques for 3D imaging of biological specimens, such as confocal, two-photon, and wide-field fluorescence microscopy, CAT scan, MRI, and optical coherence tomography. Due to the differences in resolution, depth of field, and field of view, it is often difficult to compare images from the relatively high-resolution microscopy methods to the lower-resolution, high-volume imaging methods. Correlating images across these platforms could be very powerful in relating organ- or organism-level information to cellular-level processes. The system reported here is based on previously reported systems and has several advantages for 3D imaging of large objects, such as the ability to utilize reflection, transmission, and fluorescence imaging. It has higher resolution than NMR or CAT scans, and substantially lower cost. The specimen is rotated 360° in the object plane of the microscope. Single images of objects smaller than the depth of field are collected with a CCD camera at each rotational step, while objects larger than the depth of field are collected as Z-stacks at each rotational step and projected using a wavelet transform function. Images are captured in reflection, transmission, or fluorescence modes. Two sample images illustrate these modes [images not reproduced]: a reflected-light image of a deer fly (Chrysops pikei), and a reconstructed volume generated from a transmitted-light series of a Drosophila (D. melanogaster) larva. These modes can be combined to generate greater dimensional data for the specimen. Custom image registration and reconstruction software was developed for our imaging conditions. Large single objects such as Drosophila larvae were imaged, as well as epithelial cells expressing GFP-Histone B growing inside a capillary tube that simulates a vessel.
There are several different modes of light microscopy that are used to image fluorescence in biological specimens, including wide-field, 3D deconvolution, spot scanning confocal, and spinning disk confocal. There has been a great deal of interest and some confusion about which of the available methods is “better.” In many biological specimens, image contrast is degraded by background fluorescence arising from out-of-focus parts of the specimen. The best mode of microscopy for a particular specimen would remove the out-of-focus fluorescence while maintaining high efficiency of photon collection, image contrast, and signal-to-noise ratio. The goal of the experiments that will be presented is to establish guidelines for choosing the most appropriate method of microscopy for a given biological specimen. The approach is to compare the efficiency of photon collection, the image contrast, and the signal-to-noise ratio achieved by the different methods at equivalent illumination, using a specimen in which the amount of out-of-focus background is adjustable over the range encountered in typical biological specimens. We compared spot scanning confocal, spinning disk confocal, and wide-field/deconvolution microscopes and found that the ratio of out-of-focus background to in-focus signal can be used to predict which method of microscopy will provide the most useful image. This analysis provides quantitative evidence that can be used to support a researcher’s choice of microscope for imaging a particular specimen, and defines an upper limit to the useful range of each of the imaging methods.
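The predictive use of the background-to-signal ratio could be captured in a helper like the following. This is a hypothetical sketch: the cutoff values are illustrative assumptions, not the measured limits from the study.

```python
# Hypothetical decision helper based on the idea that the ratio of
# out-of-focus background (B) to in-focus signal (S) predicts the most
# useful imaging mode. The cutoffs below are illustrative assumptions.

def suggest_mode(background, signal):
    """Suggest a microscopy mode from the B/S ratio (illustrative only)."""
    ratio = background / signal
    if ratio < 1.0:
        return "wide-field/deconvolution"   # best photon efficiency at low B/S
    if ratio < 10.0:
        return "spinning disk confocal"     # moderate background rejection
    return "spot scanning confocal"         # strongest optical sectioning
```

The real guidelines would substitute the empirically determined upper limit of each method's useful range for these placeholder cutoffs.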
Exocytosis and endocytosis have been examined during oscillatory growth in pollen tubes of lily and tobacco. For exocytosis, we have employed three markers: (1) changes in apical cell wall thickness using Nomarski differential interference contrast optics; (2) changes in cell wall fluorescence in cells stained with propidium iodide; and (3) changes in cell surface fluorescence in cells expressing pectin methyl esterase linked to GFP (PME-GFP). For endocytosis, we have used the styryl dye, FM4-64. Quantitative analysis of the images indicates that both processes oscillate during pollen tube growth, with exocytosis leading and endocytosis following the increase in growth rate. Phase analysis reveals that exocytosis leads growth by approximately −120 degrees, while endocytosis follows growth exhibiting a lag of +15 degrees at the apical membrane surface, which becomes progressively more delayed in distal regions. Surprisingly, exocytosis is not closely correlated with changes in intracellular calcium, whereas endocytosis is. Imaging further indicates that these two processes are spatially separated. Confocal microscopy of cells expressing PME-GFP indicates staining in the endoplasmic reticulum, the Golgi dictyosomes, and at the apical cell surface, but not appreciably in the inverted cone, suggesting that vesicles destined for secretion do not pass through this region of the cell. By contrast, FM4-64 selectively stains the inverted cone, suggesting that this aggregation of vesicles represents endocytotic rather than exocytotic activity.
Supported by NSF Grant #0516852.
Laser capture microdissection (LCM) is being used to investigate plant transcriptomes. RNA has been extracted and amplified from multiple tissues and cell types of maize, including the developmentally important shoot apical meristem (SAM). RNA samples extracted from LCM-collected SAMs and from intact seedlings were hybridized to cDNA microarrays. Over 5000 cDNAs (~13% of total) were differentially expressed (P < 0.0001) between SAMs and intact seedlings. Transcripts that hybridized to 62 retrotransposon-related cDNAs were also substantially up-regulated in the SAM. More than 260,000 and 280,000 ESTs were generated using 454 Life Sciences sequencing technology from LCM-collected SAMs of the inbreds B73 and Mo17, respectively. Consistent with the microarray results, approximately 14% of the 454-SAM ESTs were retrotransposon related. The 454-ESTs were used to annotate >25,000 maize genomic sequences (MAGIs) and also included 400 expressed transcripts for which homologous sequences have not yet been identified in any other species. These results indicate that coupling of LCM and 454 sequencing technologies facilitates the discovery of rare, possibly cell-type-specific transcripts. A computational pipeline that uses the POLYBAYES polymorphism detection system was adapted for 454 ESTs and used to detect single nucleotide polymorphisms (SNPs) between the two inbred lines. Over 36,000 putative SNPs were detected within ~10,000 unique B73 MAGIs. Stringent post-processing reduced this number to >7000 putative SNPs. Over 85% (94/110) of a sample of these putative SNPs were successfully validated by Sanger sequencing. Subsequently, 1045 SNPs were validated and genetically mapped using Sequenom mass spectrometry–based technology. These results demonstrate that 454-based transcriptome sequencing is an excellent method for the high-throughput acquisition of gene-associated SNPs.
The Genome Sequencer FLX is an advanced and flexible next-generation sequencing system that provides genetic analysis solutions in a wide variety of applications. This talk will outline the system capability in whole genome de novo sequencing, resequencing, full-length cDNA sequencing, ultra-deep SNP analysis of tumor samples, viral sequencing, metagenomics, BAC sequencing, and large eukaryotic genome sequencing. GS FLX, which combines high accuracy and long read length with easy laboratory operation, data management, bioinformatics, and rapid turnaround time, is ideally suited for biological core facilities offering production service as well as scientific collaborations.
The Illumina Genome Analyzer is a high throughput DNA sequencing platform that routinely generates more than a billion bases of high quality sequence information from a single run. We will show examples of how the instrument is being used for a large variety of applications in genome biology including eukaryotic and prokaryotic resequencing, SNP discovery, gene expression analysis, ChIP-SEQ, genome-wide mapping of DNA methylation sites, and miRNA discovery and analysis.
Microarrays provide a powerful tool for multiplex testing in gene expression, genotyping, and copy number variation studies. Recent collaborations, such as the MicroArray Quality Control (MAQC) project, have demonstrated that microarray technology can meet the performance criteria required for clinical applications. In addition, several studies have identified sets of DNA genotypes or RNA expression patterns that act as biomarkers for clinical traits, including drug metabolism and disease progression. These developments suggest microarray cores focusing on research applications may soon be requested to test clinical specimens to improve patient care.
In this section, we will present examples of microarray use in clinical applications. We will discuss the challenges presented by clinical specimens, such as blood and formalin-fixed paraffin-embedded tissue. We will also review laboratory features, such as turnaround time and batch sizes, that differ for microarrays in research and clinical use. Special attention will be focused on the need for well-documented quality systems such as CLIA certification, which is typical of clinical laboratories.
Have you ever wondered whether the President of the United States knows anything about science? Or wanted to tell your members of Congress just what you think? Ever wished there was something you could do to increase the NIH budget or reduce the burden of regulation? Sometimes the greatest influences on science come from inside the Beltway of our nation’s capital, rather than from inside the lab. From research funding to rules and regulations to policies that affect training of scientists or target specific areas of research, what goes on in Washington can make a major difference in the everyday lives of researchers. Is your voice being heard? What are the lobbyists representing scientists saying about you? How can you find the tools that you need to be an effective science advocate? And what are the hot topics, current or on the horizon, that could impact you or your research facility? This session is your opportunity to get answers to all of these questions and more!
Developments in mass spectrometry methodology, instrumentation, and software have rapidly enabled in-depth characterization of complex mixtures, from protein identification and modification analysis to quantitation studies. The amount of data produced in these studies has changed the paradigm of data analysis to a point where manual verification of results is generally not practical. In addition, the user-friendliness of instrumentation and software has made these technologies available to groups with no formal background in mass spectrometric instrumentation and analysis. These factors have combined to cause a variety of problems within the proteomics community, where studies were being published but the reliability of the experimental results could not be assessed independently.
Key representatives within the proteomics community met, and a set of community-agreed publication guidelines was formulated.1 The sponsoring journal, Molecular and Cellular Proteomics, is employing these guidelines for publication, but many authors are struggling to create manuscripts that are compliant with these instructions.
This presentation is a tutorial that discusses the guidelines in terms of why given pieces of information are required and provides examples of formats that would satisfy the requirements.
ProteinProspector has been on the Internet since 1995. It is heavily used, particularly for PMF-based protein identification. It also allows analysis of MS/MS data, but until recently the public version could perform this only one spectrum at a time. However, a newer version of ProteinProspector has recently been made available to the public over the Web or for in-house installation in labs worldwide.
The new ProteinProspector version permits the interrogation of multiple LC/MS/MS files together in one search, and the results can be viewed combined or kept separated. Thus, for example, the results from analysis of a whole gel lane can be combined into a single report, while retaining information about which band on the gel each protein was identified in. The program reports peptide identifications with expectation values, allows removal of replicate peptide identifications from the lists, and indicates the number of occurrences of any sequence in the database searched. In the protein reports, homologous proteins will be grouped, with the number of unique, identifying sequences reported.
For covalent modifications, a two-step database search approach can be employed. First, the proteins present are identified with high confidence using strict search parameters; then, using the list of accession numbers identified in the initial search, the data can be searched for a wide variety of specified covalent modifications. Similarly, unexpected or novel modifications can be detected in the second search by allowing for any mass modifications within a user-definable mass window on selected residues or on any amino acid. These search results are also reported with expectation values.
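The two-step workflow above can be sketched generically. This is a schematic illustration, not ProteinProspector's actual implementation: `search` stands in for a real engine, and database entries are simplified to (accession, sequence) pairs.

```python
# Sketch of a two-step modification search: pass 1 identifies proteins
# with strict parameters against the full database; pass 2 re-searches
# only the pass-1 accessions with wide modification tolerances.

def two_step_search(spectra, database, search):
    first_pass = search(spectra, database, wide_mods=False)
    accessions = {acc for acc, _ in first_pass}
    subset = [entry for entry in database if entry[0] in accessions]
    return search(spectra, subset, wide_mods=True)

# Toy engine: matches a "spectrum" (here just a peptide string) by substring.
def toy_search(spectra, entries, wide_mods):
    return [(acc, seq) for acc, seq in entries
            if any(p in seq for p in spectra)]

db = [("P1", "ELVISLIVESK"), ("P2", "NOMATCHHERE")]
print(two_step_search(["LIVES"], db, toy_search))
```

Restricting the second pass to confidently identified proteins is what makes a wide-open modification window computationally tractable.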
In addition to qualitative analysis, Prospector supports a wide variety of quantitative measurements, such as ICAT, iTRAQ, and SILAC.
This work was supported by NIH NCRR 001614 and the Vincent J. Coates Foundation.