1.  Measuring error rates in genomic perturbation screens: gold standards for human functional genomics 
Molecular Systems Biology  2014;10(7):733.
Technological advancement has opened the door to systematic genetics in mammalian cells. Genome-scale loss-of-function screens can assay fitness defects induced by partial gene knockdown, using RNA interference, or complete gene knockout, using new CRISPR techniques. These screens can reveal the basic blueprint required for cellular proliferation. Moreover, comparing healthy to cancerous tissue can uncover genes that are essential only in the tumor; these genes are targets for the development of specific anticancer therapies. Unfortunately, progress in this field has been hampered by off-target effects of perturbation reagents and poorly quantified error rates in large-scale screens. To improve the quality of information derived from these screens, and to provide a framework for understanding the capabilities and limitations of CRISPR technology, we derive gold-standard reference sets of essential and nonessential genes, and provide a Bayesian classifier of gene essentiality that outperforms current methods on both RNAi and CRISPR screens. Our results indicate that CRISPR technology is more sensitive than RNAi and that both techniques have nontrivial false discovery rates that can be mitigated by rigorous analytical methods.
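The idea of a Bayesian classifier trained on gold-standard reference sets can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the model, so the sketch assumes that reagent-level log fold changes in each reference set are roughly Gaussian, and it scores a gene by summing per-reagent log2 likelihood ratios. All function names and the toy numbers are hypothetical.

```python
import math
from statistics import mean, stdev


def fit_gaussian(values):
    """Return (mean, standard deviation) of reference fold changes."""
    return mean(values), stdev(values)


def gaussian_logpdf(x, mu, sigma):
    """Natural-log density of a normal distribution at x."""
    return -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)


def log2_bayes_factor(fold_changes, essential_fit, nonessential_fit):
    """Sum of per-reagent log2 likelihood ratios (essential vs. nonessential).

    Assumption: reagents targeting the same gene are treated as independent.
    """
    total = 0.0
    for fc in fold_changes:
        ll_ess = gaussian_logpdf(fc, *essential_fit)
        ll_non = gaussian_logpdf(fc, *nonessential_fit)
        total += (ll_ess - ll_non) / math.log(2)  # convert nats to bits
    return total


# Toy log2 fold changes for reagents targeting the two reference sets
# (illustrative numbers only, not data from the paper).
essential_ref = [-3.1, -2.7, -2.9, -3.4, -2.5, -3.0]
nonessential_ref = [0.2, -0.1, 0.1, -0.3, 0.0, 0.3]

ess_fit = fit_gaussian(essential_ref)
non_fit = fit_gaussian(nonessential_ref)

# A gene whose reagents drop out scores positive; a neutral gene scores negative.
bf_dropout = log2_bayes_factor([-2.8, -3.2, -2.6], ess_fit, non_fit)
bf_neutral = log2_bayes_factor([0.1, -0.2, 0.05], ess_fit, non_fit)
print(f"dropout gene: {bf_dropout:.1f}, neutral gene: {bf_neutral:.1f}")
```

A positive score favors essentiality and a negative score favors nonessentiality. Holding out part of each reference set would then allow an estimate of the false discovery rate at any chosen score cutoff.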
2.  Finding the active genes in deep RNA-seq gene expression studies 
BMC Genomics  2013;14:778.
Early application of second-generation sequencing technologies to transcript quantitation (RNA-seq) hinted at a vast mammalian transcriptome, including transcripts from nearly all known genes, which might be fully measured only by ultradeep sequencing. Subsequent studies suggested that low-abundance transcripts might be the result of technical or biological noise rather than active transcripts; moreover, most RNA-seq experiments did not provide enough read depth to generate high-confidence estimates of gene expression for low-abundance transcripts. As a result, the community adopted several heuristics for RNA-seq analysis, most notably an arbitrary expression threshold of 0.3 to 1 FPKM for downstream analysis. However, advances in RNA-seq library preparation, sequencing technology, and informatic analysis have addressed many of the systemic sources of uncertainty and undermined the assumptions that drove the adoption of these heuristics. We provide an updated view of the accuracy and efficiency of RNA-seq experiments, using genomic data from large-scale studies like the ENCODE project to provide orthogonal information against which to validate our conclusions.
We show that a human cell’s transcriptome can be divided into active genes carrying out the work of the cell and other genes that are likely the by-products of biological or experimental noise. We use ENCODE data on chromatin state to show that ultralow-expression genes are predominantly associated with repressed chromatin; we provide a novel normalization metric, zFPKM, that identifies the threshold between active and background gene expression; and we show that this threshold is robust to experimental and analytical variations.
The zFPKM normalization method accurately separates the biologically relevant genes in a cell, which are associated with active promoters, from the ultralow-expression noisy genes that have repressed promoters. A read depth of twenty to thirty million mapped reads allows high-confidence quantitation of genes expressed at this threshold, providing important guidance for the design of RNA-seq studies of gene expression. Moreover, we offer an example for using extensive ENCODE chromatin state information to validate RNA-seq analysis pipelines.
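A z-score style normalization of this kind can be sketched as follows. The abstract does not give the zFPKM formula, so this is a rough illustration under stated assumptions, not the published method: log2(FPKM) values of active genes are assumed to be approximately Gaussian, the center is taken as a crude histogram mode, and the spread is estimated only from values to the right of that mode so that the low-expression shoulder does not inflate it. The function name `zfpkm`, the bin width, and the cutoff `ACTIVE_CUTOFF` are all hypothetical choices for the example.

```python
import math
from collections import Counter
from statistics import NormalDist

ACTIVE_CUTOFF = -3.0  # assumption for illustration; not stated in the abstract


def zfpkm(fpkm_values, bin_width=0.25):
    """Z-score log2(FPKM) against a Gaussian fitted to the main expression peak."""
    logs = [math.log2(v) for v in fpkm_values if v > 0]
    # Crude mode estimate: center of the most populated histogram bin.
    counts = Counter(math.floor(x / bin_width) for x in logs)
    mode_bin = max(counts, key=counts.get)
    mu = (mode_bin + 0.5) * bin_width
    # Spread from the right half only (mirrored half-Gaussian assumption).
    right = [x for x in logs if x > mu]
    sigma = math.sqrt(sum((x - mu) ** 2 for x in right) / len(right))
    # Genes with zero FPKM get -inf so they always fall below the cutoff.
    return [(math.log2(v) - mu) / sigma if v > 0 else float("-inf")
            for v in fpkm_values]


# Deterministic toy data: a main "active" peak and a low-expression shoulder.
active = [2 ** NormalDist(3, 1.5).inv_cdf((i + 0.5) / 2000) for i in range(2000)]
noise = [2 ** NormalDist(-4, 1.5).inv_cdf((i + 0.5) / 600) for i in range(600)]
fpkm = active + noise + [8.0, 0.05]  # last two entries are probe genes

z = zfpkm(fpkm)
is_active = [score >= ACTIVE_CUTOFF for score in z]
print(f"FPKM 8.0 active: {is_active[-2]}, FPKM 0.05 active: {is_active[-1]}")
```

In this toy example, the gene at 8 FPKM sits near the main peak and is called active, while the gene at 0.05 FPKM falls far below the cutoff. A production implementation would likely use a kernel density estimate rather than a histogram to locate the peak.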
3.  microRNA regulation of molecular networks mapped by global microRNA, mRNA, and protein expression in activated T-lymphocytes 
MicroRNAs (miRNAs) regulate specific immune mechanisms, but their genome-wide regulation of T-lymphocyte activation is largely unknown. We performed a multidimensional functional genomics analysis to integrate genome-wide differential mRNA, miRNA, and protein expression as a function of human T-lymphocyte activation and time. We surveyed expression of 420 human miRNAs in parallel with genome-wide mRNA expression. We identified a unique signature of 71 differentially expressed miRNAs, 57 of which were not previously known as regulators of immune activation. The majority of miRNAs are upregulated, whereas mRNA expression of their target genes is downregulated, and this downregulation is a function of binding by multiple miRNAs (combinatorial targeting). Our data reveal that consideration of this complex signature, rather than single miRNAs, is necessary to construct a full picture of miRNA-mediated regulation. Molecular network mapping of miRNA targets revealed the regulation of activation-induced immune signaling. In contrast, pathways populated by genes that are not miRNA targets are enriched for metabolism and biosynthesis. Finally, we specifically validated miR-155 (known) and miR-221 (novel in T-lymphocytes) using locked nucleic acid inhibitors. Inhibition of these 2 highly upregulated miRNAs in CD4+ T cells was shown to increase proliferation by removing suppression of 4 target genes linked to proliferation and survival. Thus, multiple lines of evidence link top functional networks directly to T-lymphocyte immunity, underlining the value of mapping global gene, protein, and miRNA expression.
4.  Human Cell Chips: Adapting DNA Microarray Spotting Technology to Cell-Based Imaging Assays 
PLoS ONE  2009;4(10):e7088.
Here we describe human spotted cell chips, a technology for determining cellular state across arrays of cells subjected to chemical or genetic perturbation. Cells are grown and treated under standard tissue culture conditions before being fixed and printed onto replicate glass slides, effectively decoupling the experimental conditions from the assay technique. Each slide is then probed using immunofluorescence or another optical reporter and assayed by automated microscopy. We show potential applications of the cell chip by assaying HeLa and A549 samples for changes in target protein abundance (of the dsRNA-activated protein kinase PKR), subcellular localization (nuclear translocation of NFκB), and activation state (phosphorylation of STAT1 and of the p38 and JNK stress kinases) in response to treatment by several chemical effectors (anisomycin, TNFα, and interferon), and we demonstrate scalability by printing a chip with ∼4,700 discrete samples of HeLa cells. Coupling this technology to high-throughput methods for culturing and treating cell lines could enable researchers to examine the impact of exogenous effectors on the same population of experimentally treated cells across multiple reporter targets, potentially representing a variety of molecular systems. The result would be a highly multiplexed dataset with minimized experimental variance and reduced reagent cost compared to alternative techniques. The ability to prepare and store chips also allows researchers to follow up on observations gleaned from initial screens with maximal repeatability.
