The most abundant proteins in our cells are there to generate mechanical forces, and measurement of these forces has just become possible.
Drosophila melanogaster has one of the best characterized metazoan genomes in terms of functionally annotated regulatory elements. To explore how these elements contribute to gene regulation in the context of gene regulatory networks, we need convenient tools to identify the proteins that bind to them. Here, we present the development and validation of a highly automated protein-DNA interaction detection method, enabling the high-throughput yeast one-hybrid-based screening of DNA elements versus an array of full-length, sequence-verified clones containing 647 (over 85%) of predicted Drosophila transcription factors (TFs). Using six well-characterized regulatory elements (82 bp – 1 kb), we identified 33 TF-DNA interactions, of which 27 are novel. To simultaneously validate these interactions and locate the binding sites of the involved TFs, we implemented a novel microfluidics-based approach that enables us to conduct hundreds of gel shift-like assays at once, thus allowing the retrieval of DNA occupancy data for each TF throughout the respective target DNA elements. Finally, we biologically validate several interactions and specifically identify two novel regulators of sine oculis gene expression and hence eye development.
We describe a method for fluorescent in situ identification of individual mRNA molecules, allowing quantitative and accurate measurements of allele-specific transcripts that differ by only a few nucleotides, in single cells. By using a combination of allele-specific and non-allele-specific probe libraries, we achieve over 95% detection accuracy. We investigate the allele-specific stochastic expression of the pluripotency factor Nanog in murine embryonic stem cells.
Current tandem mass spectral libraries for lipid annotations in metabolomics are limited in size and diversity. We provide a freely available, computer-generated in silico tandem mass spectral library of 212,516 MS/MS spectra covering 119,200 compounds from 26 lipid compound classes, including phospholipids, glycerolipids, bacterial lipoglycans and plant glycolipids. Platform independence is shown by using tandem mass spectra from 40 different mass spectrometer types including low-resolution and high-resolution instruments.
We report an in vitro selection strategy to identify RNA sequences that mediate cap-independent translation initiation. This method entails the mRNA display of trillions of genomic fragments, selection for translation initiation, and high-throughput deep sequencing. We identified >12,000 translation enhancing elements (TEEs) in the human genome, generated a high-resolution map of human TEE bearing regions (TBRs), and validated the function of a subset of sequences in vitro and in cells.
Human Genome; Internal Ribosomal Entry Sites (IRES); mRNA Display
Transcriptional enhancers are a primary mechanism by which tissue-specific gene expression is achieved. Despite the importance of these regulatory elements in development, responses to environmental stresses, and disease, testing enhancer activity in animals remains tedious, with a minority of enhancers having been characterized. Here, we have developed ‘enhancer-FACS-Seq’ (eFS) technology for highly parallel identification of active, tissue-specific enhancers in Drosophila embryos. Analysis of enhancers identified by eFS to be active in mesodermal tissues revealed enriched DNA binding site motifs of known and putative, novel mesodermal transcription factors (TFs). Naïve Bayes classifiers using TF binding site motifs accurately predicted mesodermal enhancer activity. Application of eFS to other cell types and organisms should accelerate the cataloging of enhancers and understanding how transcriptional regulation is encoded within them.
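The classifier idea in the abstract above can be sketched as a Bernoulli naive Bayes model over motif presence and absence. This is a minimal illustration only: the motif labels, the toy training set, and the Laplace smoothing constant `alpha` are assumptions, not details taken from the eFS study.

```python
import math

def train_bernoulli_nb(samples, labels, alpha=1.0):
    """Fit per-class motif frequencies.

    samples: list of sets, each holding the motif names found in one sequence.
    labels:  list of bools (True = active enhancer in the tissue of interest).
    alpha:   Laplace smoothing constant (an assumption of this sketch).
    """
    motifs = sorted(set().union(*samples))
    model = {"motifs": motifs, "prior": {}, "p": {}}
    for cls in (True, False):
        rows = [s for s, y in zip(samples, labels) if y == cls]
        model["prior"][cls] = len(rows) / len(samples)
        # Smoothed probability that a motif is present in a sequence of this class.
        model["p"][cls] = {
            m: (sum(m in s for s in rows) + alpha) / (len(rows) + 2 * alpha)
            for m in motifs
        }
    return model

def log_odds(model, sample):
    """Log posterior odds (active vs. inactive) for one sequence's motif set."""
    score = math.log(model["prior"][True]) - math.log(model["prior"][False])
    for m in model["motifs"]:
        p_t, p_f = model["p"][True][m], model["p"][False][m]
        if m in sample:
            score += math.log(p_t) - math.log(p_f)
        else:
            score += math.log(1 - p_t) - math.log(1 - p_f)
    return score

# Hypothetical toy data: motif labels "A", "B", "C" are placeholders.
train = [{"A", "B"}, {"A"}, {"A", "B"}, {"C"}, {"C"}, {"B", "C"}]
active = [True, True, True, False, False, False]
model = train_bernoulli_nb(train, active)
```

A positive `log_odds` value favors the active class; the independence assumption between motifs is what makes the model "naive".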
We tested whether Transcription Activator-Like Effectors (TALEs) can mediate repression and activation of endogenous enhancers in the Drosophila genome. TALE-repressors (TALERs) targeting each of the five even-skipped (eve) “stripe” enhancers generated repression specifically of the focal stripes. TALE-activators (TALEAs) targeting the eve promoter or eve enhancers caused increased expression primarily in cells normally activated by the promoter or targeted enhancer, respectively. The phenotypic effects of TALER and TALEA expression in larvae and adults are consistent with the observed modulations of eve expression. In these assays, the Hairy repression domain did not exhibit previously described long-range transcriptional repression activity. The precise effects of the TALEAs support the view that repression acts in a dominant fashion on transcriptional activators and that the activity state of an enhancer influences TALE binding or the ability of VP16 to enhance transcription. TALEs thus provide a novel tool for detection and functional modulation of transcriptional enhancers in their native genomic context.
Near-infrared fluorescent proteins are in high demand for in vivo imaging. We developed four spectrally distinct fluorescent proteins, iRFP670, iRFP682, iRFP702, and iRFP720, from bacterial phytochromes. iRFPs exhibit high brightness in mammalian cells and tissues and are suitable for long-term studies. iRFP670 and iRFP720 enable two-color imaging in living cells and mice using standard approaches. Five iRFPs including previously engineered iRFP713 allow multicolor imaging in living mice with spectral unmixing.
We have designed β-strand peptides (BPs) that stabilize integral membrane proteins (IMPs). BPs self-assemble in solution as filaments and become restructured upon association with IMPs; the resulting IMP/BP complexes resist aggregation when diluted in detergent-free buffer and can be examined as stable, single particles with low detergent background by electron microscopy. This enables clear visualization of a spectrum of flexible conformations in the highly dynamic ATP-binding cassette (ABC) transporter MsbA.
We show that the difficulties in imaging the dynamics of protein expression in live bacterial cells can be overcome using fluorescent sensors based on Spinach, an RNA that activates the fluorescence of a small-molecule fluorophore. These RNAs selectively bind target proteins, and exhibit fluorescence increases that enable protein expression to be imaged in living cells. These sensors provide a general strategy to image protein expression in single bacteria in real-time.
Affinity purification coupled with mass spectrometry (AP-MS) is now a widely used approach for the identification of protein-protein interactions. However, for any given protein of interest, determining which of the identified polypeptides represent bona fide interactors versus those that are background contaminants (e.g. proteins that interact with the solid-phase support, affinity reagent or epitope tag) is a challenging task. While the standard approach is to identify nonspecific interactions using one or more negative controls, most small-scale AP-MS studies do not capture a complete, accurate background protein set. Fortunately, negative controls are largely bait-independent. Hence, aggregating negative controls from multiple AP-MS studies can increase coverage and improve the characterization of background associated with a given experimental protocol. Here we present the Contaminant Repository for Affinity Purification (the CRAPome) and describe the use of this resource to score protein-protein interactions. The repository (currently available for Homo sapiens and Saccharomyces cerevisiae) and computational tools are freely available online at www.crapome.org.
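The benefit of pooling negative controls can be illustrated with a small sketch that builds a background profile from several control runs and compares a bait run against it. The function names, the spectral-count input format, the pseudocount, and the protein labels are all assumptions made for illustration; this is not the scoring scheme implemented by the CRAPome tools.

```python
def background_profile(controls):
    """Summarize how often, and how strongly, each protein appears in controls.

    controls: list of dicts mapping protein name -> spectral count,
              one dict per negative-control purification.
    """
    proteins = set().union(*(c.keys() for c in controls))
    n = len(controls)
    profile = {}
    for p in proteins:
        counts = [c.get(p, 0) for c in controls]
        profile[p] = {
            "frequency": sum(x > 0 for x in counts) / n,  # fraction of controls containing p
            "mean_count": sum(counts) / n,                # average spectral count in controls
        }
    return profile

def fold_over_background(bait_run, profile, pseudocount=1.0):
    """Ratio of bait spectral count to mean control count (pseudocount avoids /0)."""
    return {
        p: (count + pseudocount)
        / (profile.get(p, {"mean_count": 0.0})["mean_count"] + pseudocount)
        for p, count in bait_run.items()
    }

# Hypothetical counts: "protA" behaves like a sticky contaminant, "protC" does not.
controls = [{"protA": 10, "protB": 4}, {"protA": 8}, {"protA": 12, "protB": 2}]
profile = background_profile(controls)
scores = fold_over_background({"protA": 10, "protC": 9}, profile)
```

Adding more control runs to `controls` refines both the frequency and the mean for each background protein, which is the point of aggregating across studies.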
CRISPR-Cas systems have been used with single-guide RNAs for accurate gene disruption and conversion in multiple biological systems. Here we report the use of the endonuclease Cas9 to target genomic sequences in the C. elegans germline, utilizing single-guide RNAs that are expressed from a U6 small nuclear RNA promoter. Our results demonstrate that targeted, heritable genetic alterations can be achieved in C. elegans, providing a convenient and effective approach for generating loss-of-function mutants.
In mass spectrometry-based proteomics, data-independent acquisition (DIA) strategies have the ability to acquire a single dataset useful for identification and quantification of detectable peptides in a complex mixture. Despite this, DIA is often overlooked due to noisier data resulting from a typical five- to ten-fold reduction in precursor selectivity compared to data-dependent acquisition or selected reaction monitoring. We demonstrate a multiplexing technique which improves precursor selectivity five-fold.
Data Independent Acquisition; Q-Exactive; Multiplexing; Targeted Proteomics; Shotgun Proteomics
Local anesthetics are effective in suppressing pain sensation, but most of these compounds act non-selectively, inhibiting the activity of all neurons. Moreover, their actions abate slowly, preventing precise spatial and temporal control of nociception. We have developed a photoisomerizable molecule named QAQ (Quaternary ammonium – Azobenzene – Quaternary ammonium) that enables rapid and selective optical control of nociception. QAQ is membrane-impermeant and it has no effect on most cells, but it infiltrates pain-sensing neurons through endogenous ion channels that are activated by noxious stimuli, primarily TRPV1. After QAQ accumulates intracellularly, it blocks voltage-gated ion channels in the trans but not the cis form. QAQ enables reversible optical silencing of mouse nociceptive neuron firing without exogenous gene expression and can serve as a light-sensitive analgesic in rats in vivo. Moreover, because intracellular QAQ accumulation is a consequence of nociceptive ion channel activity, QAQ-mediated photosensitization provides a new platform for understanding signaling mechanisms in acute and chronic pain.
Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks – mammalian cell population identification to determine if automated algorithms can reproduce expert manual gating, and sample classification to determine if analysis pipelines can identify characteristics that correlate with external variables (e.g., clinical outcome). This analysis presents the results of the first of these challenges. Several methods performed well compared to manual gating or external variables using statistical performance measures, suggesting that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.
Live-cell imaging of mRNA yields important insights into gene expression, but it has generally been limited to the labeling of one RNA species and has never been used to count single mRNAs over time in yeast. We demonstrate a two-color imaging system with single-molecule resolution using MS2 and PP7 RNA labeling. We use this methodology to measure intrinsic noise in mRNA levels and RNA polymerase II kinetics at a single gene.
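One common way to estimate intrinsic noise from two equivalent reporters measured in the same cells is the dual-reporter estimator, in which the squared intrinsic noise is the mean squared difference between the two channels divided by twice the product of their means. The abstract does not state which estimator was used, so the sketch below is an assumption, and the per-cell counts are made up.

```python
def intrinsic_noise_sq(a, b):
    """Dual-reporter estimate of squared intrinsic noise.

    a, b: per-cell mRNA counts for the two reporters (same cell order).
    Returns <(a - b)^2> / (2 <a> <b>).
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    mean_sq_diff = sum((x - y) ** 2 for x, y in zip(a, b)) / n
    return mean_sq_diff / (2 * mean_a * mean_b)

# Hypothetical counts for two cells.
noise_sq = intrinsic_noise_sq([10, 20], [20, 10])
```

When the two channels agree in every cell the estimate is zero, since all remaining variation is then shared (extrinsic).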
Natural proteins often rely on the disulfide bond to covalently link side chains. Here we genetically introduce a new type of covalent bond into proteins by enabling an unnatural amino acid to react with a proximal cysteine. We demonstrate the utility of this bond for enabling irreversible binding between an affibody and its protein substrate, capturing peptide-protein interactions in mammalian cells, and improving the photon output of fluorescent proteins.
Determining the long-range haplotypes in a diploid individual is a major technical challenge. Here we report a method of molecular haplotyping by directly imaging multiple polymorphic sites on individual human DNA molecules simultaneously. We demonstrate the utility of this technology by accurately determining the haplotypes consisting of up to 16 single-nucleotide polymorphisms in genomic regions up to 50 kilobases.
We present eXpress, a software package for highly efficient probabilistic assignment of ambiguously mapping sequenced fragments. eXpress uses a streaming algorithm with linear run time and constant memory use. It can determine abundances of sequenced molecules in real time, and can be applied to ChIP-seq, metagenomics and other large-scale sequencing data. We demonstrate its use on RNA-seq data, showing greater efficiency than other quantification methods.
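The streaming idea can be sketched as a single pass in which each fragment is split among its compatible targets in proportion to the current abundance estimates, and those estimates are updated immediately. This is a heavily simplified illustration, not the eXpress algorithm: it ignores fragment length, sequencing error models and any weighting schedule, and the function name and uniform `prior` are assumptions.

```python
def stream_assign(fragments, targets, prior=1.0):
    """One-pass fractional assignment of ambiguously mapping fragments.

    fragments: iterable of tuples, each listing the targets a fragment maps to.
    targets:   iterable of all target names.
    prior:     pseudo-count given to every target before any data is seen.
    Memory depends on the number of targets, not on the number of fragments.
    """
    weights = {t: prior for t in targets}
    for compatible in fragments:
        total = sum(weights[t] for t in compatible)
        # Compute all shares before updating, so the split uses one consistent state.
        shares = {t: weights[t] / total for t in compatible}
        for t, share in shares.items():
            weights[t] += share
    # Report assigned fragment mass with the prior removed.
    return {t: w - prior for t, w in weights.items()}

# Hypothetical stream: the third fragment maps to both targets.
stream = [("t1",), ("t1",), ("t1", "t2"), ("t2",)]
assigned = stream_assign(iter(stream), ["t1", "t2"])
```

Because `fragments` is consumed as an iterator, the input never needs to be held in memory, which mirrors the linear-time, constant-memory behavior described in the abstract.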
Tetrad analysis has been a gold standard genetic technique for several decades. Unfortunately, the manual nature of the process has relegated its application to small-scale studies and limited its integration with rapidly evolving DNA sequencing technologies. We have developed a rapid, high-throughput method, called Barcode Enabled Sequencing of Tetrads (BEST), that replaces the manual processes of isolating, disrupting and spacing tetrads. BEST uses a meiosis-specific GFP fusion protein to isolate tetrads by fluorescence-activated cell sorting and molecular barcodes that are read during genotyping to identify spores derived from the same tetrad. Maintaining tetrad information allows accurate inference of missing genetic markers and full genotypes of missing (and presumably nonviable) individuals. By removing the bottleneck of manual dissection, hundreds or even thousands of tetrads can be isolated in minutes. We demonstrate the approach in Saccharomyces cerevisiae, but BEST is readily transferable to microorganisms in which meiotic mapping is significantly more laborious.
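The value of keeping tetrad information can be illustrated as follows: spores are grouped by their shared barcode, and when only three of the four spores are recovered, the fourth genotype follows from 2:2 segregation at each marker. The data layout, allele labels "A"/"B" and function names are assumptions of this sketch, which also ignores gene conversion and genotyping errors.

```python
from collections import defaultdict

def group_by_barcode(spores):
    """Group genotyped spores into tetrads.

    spores: iterable of (barcode, genotype) pairs, where genotype is a string
            with one character ("A" or "B") per marker.
    """
    tetrads = defaultdict(list)
    for barcode, genotype in spores:
        tetrads[barcode].append(genotype)
    return dict(tetrads)

def infer_fourth_spore(three):
    """Infer the missing spore's genotype assuming 2:2 segregation per marker."""
    if len(three) != 3:
        raise ValueError("expected exactly three recovered spores")
    inferred = []
    for alleles in zip(*three):
        n_a = alleles.count("A")
        if n_a == 2:
            inferred.append("B")   # two A's seen, so the missing allele is B
        elif n_a == 1:
            inferred.append("A")   # two B's seen, so the missing allele is A
        else:
            inferred.append("?")   # 3:0 pattern is inconsistent with 2:2
    return "".join(inferred)

# Hypothetical barcodes and three-marker genotypes.
spores = [("bc1", "AAB"), ("bc1", "ABB"), ("bc1", "BAA"), ("bc2", "AB")]
tetrads = group_by_barcode(spores)
missing = infer_fourth_spore(tetrads["bc1"])
```

Markers flagged with "?" would need separate handling, for example as candidate gene conversion events or genotyping errors.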
RNA-Seq is an effective method to study the transcriptome, but can be difficult to apply to scarce or degraded RNA from fixed clinical samples, rare cell populations, or cadavers. Recent studies have proposed several methods for RNA-Seq of low-quality and/or low-quantity samples, but their relative merits have not been systematically analyzed. Here, we compare five such methods using metrics relevant to transcriptome annotation, transcript discovery, and gene expression. Using a single human RNA sample, we constructed and sequenced ten libraries with these methods and two control libraries. We find that the RNase H method performed best for low-quality RNA, and confirmed this with actual degraded samples. RNase H can even effectively replace oligo(dT)-based methods for standard RNA-Seq. SMART and NuGEN had distinct strengths for low-quantity RNA. Our analysis allows biologists to select the most suitable methods and provides a benchmark for future method development.
Newly developed scientific complementary metal–oxide–semiconductor (sCMOS) cameras have the potential to dramatically accelerate data acquisition in single-molecule switching nanoscopy (SMSN) while simultaneously increasing the effective quantum efficiency. However, sCMOS-intrinsic pixel-dependent readout noise substantially reduces the localization precision and introduces localization artifacts. Here we present algorithms that overcome these limitations and provide unbiased, precise localization of single molecules at the theoretical limit. In combination with a multi-emitter fitting algorithm, we demonstrate single-molecule localization super-resolution imaging at up to 32 reconstructed images/second (recorded at 1,600–3,200 camera frames/second) in both fixed and living cells.
High-throughput sequencing has opened numerous possibilities for the identification of regulatory RNA-binding events. Cross-linking and immunoprecipitation of Argonaute protein members can pinpoint microRNA target sites within tens of bases, but leaves the identity of the microRNA unresolved. A flexible computational framework that integrates sequence with cross-linking features reliably identifies the microRNA family involved in each binding event, considerably outperforms sequence-only approaches, and quantifies the prevalence of noncanonical binding modes.
The measurement of lifespan pervades aging research. Because lifespan results from complex interactions between genetic, environmental and stochastic factors, it varies widely even among isogenic individuals. The action of molecular mechanisms on lifespan is therefore visible only through their statistical effects on populations. Survival assays in C. elegans provided critical insights into evolutionarily conserved determinants of aging. To enable the rapid acquisition of survival curves at arbitrary statistical resolution, we developed a scalable imaging and analysis platform to observe nematodes over multiple weeks across square meters of agar surface at 8 μm resolution. The method generates a permanent visual record of individual deaths from which survival curves are constructed and validated, producing data consistent with the manual method for several mutants in both standard and stressful environments. Our approach allows rapid, detailed reverse-genetic and chemical screens for effects on survival and enables quantitative investigations into the statistical structure of aging.
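Given a record of individual death times, a survival curve can be built with the Kaplan-Meier estimator, a standard choice that also handles animals lost before death (censored). The abstract does not say which estimator the platform uses, so this is an assumed illustration with made-up times in arbitrary units.

```python
def kaplan_meier(times, observed):
    """Kaplan-Meier survival estimate.

    times:    time of death or of censoring for each individual.
    observed: True if the death was observed, False if the individual was censored.
    Returns a list of (time, surviving_fraction) at each time with a death.
    """
    events = sorted(zip(times, observed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Collect everything that happens at the same time point.
        while i < len(events) and events[i][0] == t:
            deaths += events[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve

# Hypothetical cohort of five animals; one is censored at time 12.
curve = kaplan_meier([10, 12, 12, 15, 20], [True, True, False, True, True])
```

Larger populations shrink the step size of the curve, which is what "arbitrary statistical resolution" requires in practice.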
Conventional Fourier-transform infrared (FTIR) microspectroscopic systems are limited by an inevitable trade-off between spatial resolution, acquisition time, signal-to-noise ratio (SNR) and sample coverage. We present an FTIR imaging approach that substantially extends current capabilities by combining multiple synchrotron beams with wide-field detection. This advance allows truly diffraction-limited high-resolution imaging over the entire mid-infrared spectrum with high chemical sensitivity and fast acquisition speed while maintaining high-quality SNR.