1.  Online GESS: prediction of miRNA-like off-target effects in large-scale RNAi screen data by seed region analysis 
BMC Bioinformatics  2014;15:192.
RNA interference (RNAi) is an effective and important tool used to study gene function. For large-scale screens, RNAi is used to systematically down-regulate genes of interest and analyze their roles in a biological process. However, RNAi is associated with off-target effects (OTEs), including microRNA (miRNA)-like OTEs. The contribution of reagent-specific OTEs to RNAi screen data sets can be significant. In addition, the post-screen validation process is time- and labor-intensive. Thus, the availability of robust approaches to identify candidate off-targeted transcripts would be beneficial.
Significant efforts have been made to eliminate false positive results attributable to sequence-specific OTEs associated with RNAi. These approaches have included improved algorithms for RNAi reagent design, incorporation of chemical modifications into siRNAs, and the use of various bioinformatics strategies to identify possible OTEs in screen results. Genome-wide Enrichment of Seed Sequence matches (GESS) was developed to identify potential off-targeted transcripts in large-scale screen data by seed-region analysis. Here, we introduce a user-friendly web application that provides researchers a relatively quick and easy way to perform GESS analysis on data from human or mouse cell-based screens using short interfering RNAs (siRNAs) or short hairpin RNAs (shRNAs), as well as for Drosophila screens using shRNAs. Online GESS relies on up-to-date transcript sequence annotations for human and mouse genes extracted from NCBI Reference Sequence (RefSeq) and Drosophila genes from FlyBase. The tool also accommodates analysis with user-provided reference sequence files.
Online GESS provides a straightforward user interface for genome-wide seed region analysis for human, mouse and Drosophila RNAi screen data. With the tool, users can either use a built-in database or provide a database of transcripts for analysis. This makes it possible to analyze RNAi data from any organism for which the user can provide transcript sequences.
PMCID: PMC4073188  PMID: 24934636
RNAi; Off-target effects; Data analysis; Seed region; miRNA; siRNA; shRNA; High-throughput screening
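A minimal Python sketch of the kind of seed-region matching described in the Online GESS entry above. It is not the GESS implementation: the 7-nucleotide seed at guide positions 2 to 8, the function names, and the toy sequences are assumptions made for illustration, and no statistical test of enrichment is included.

```python
# Illustrative only: seed length, seed position, and all names are assumptions.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA-alphabet sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def seed_match_site(guide, start=1, length=7):
    """Return the transcript site complementary to the guide-strand seed.

    `guide` is read 5'->3'; the default slice covers positions 2-8 (1-based).
    RNA input (U) is converted to the DNA alphabet (T) first.
    """
    seed = guide.upper().replace("U", "T")[start:start + length]
    return reverse_complement(seed)


def match_fraction(guides, transcript):
    """Fraction of guides whose seed match site occurs in the transcript."""
    if not guides:
        return 0.0
    transcript = transcript.upper().replace("U", "T")
    hits = sum(seed_match_site(g) in transcript for g in guides)
    return hits / len(guides)


# Hypothetical toy data: compare active and inactive reagents for one transcript.
active_guides = ["TACGATCGGATCCTAGCAT", "GACGATCGTTTTTTTTTTT"]
inactive_guides = ["TTTTTTTTTTTTTTTTTTT"]
toy_transcript = "GGGCGATCGTGGG"

print(match_fraction(active_guides, toy_transcript))
print(match_fraction(inactive_guides, toy_transcript))
```

A transcript whose seed match fraction is much higher among active reagents than among inactive ones would be a candidate off-targeted transcript; a real analysis would attach a significance test to that comparison.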
2.  Toward interoperable bioscience data 
Nature genetics  2012;44(2):121-126.
To make full use of research data, the bioscience community needs to adopt technologies and reward mechanisms that support interoperability and promote the growth of an open ‘data commoning’ culture. Here we describe the prerequisites for data commoning and present an established and growing ecosystem of solutions using the shared ‘Investigation-Study-Assay’ framework to support that vision.
PMCID: PMC3428019  PMID: 22281772
3.  Traditional Medicine Collection Tracking System (TM-CTS): A Database for Ethnobotanically-Driven Drug-Discovery Programs 
Journal of ethnopharmacology  2011;135(2):590-593.
Aim of the study.
Ethnobotanically-driven drug-discovery programs include data related to many aspects of the preparation of botanical medicines, from initial plant collection to chemical extraction and fractionation. The Traditional Medicine-Collection Tracking System (TM-CTS) was created to organize and store data of this type for an international collaborative project involving the systematic evaluation of commonly used Traditional Chinese Medicinal plants.
Materials and Methods.
The system was developed using domain-driven design techniques, and is implemented using Java, Hibernate, PostgreSQL, Business Intelligence and Reporting Tools (BIRT), and Apache Tomcat.
Results.
The TM-CTS relational database schema contains over 70 data types, comprising over 500 data fields. The system incorporates a number of unique features that are useful in the context of ethnobotanical projects such as support for information about botanical collection, method of processing, quality tests for plants with existing pharmacopoeia standards, chemical extraction and fractionation, and historical uses of the plants. The database also accommodates data provided in multiple languages and integration with a database system built to support high throughput screening based drug discovery efforts. It is accessed via a web-based application that provides extensive, multi-format reporting capabilities.
Conclusions.
This new database system was designed to support a project evaluating the bioactivity of Chinese medicinal plants. The software used to create the database is open source, freely available, and could potentially be applied to other ethnobotanically-driven natural product collection and drug-discovery programs.
PMCID: PMC3096074  PMID: 21420479
Database; Traditional Chinese Medicine; High Throughput Screening; Ethnobotany; Drug Discovery
4.  Developing a library of authenticated Traditional Chinese Medicinal (TCM) plants for systematic biological evaluation — Rationale, methods and preliminary results from a Sino-American collaboration 
Fitoterapia  2010;82(1):17-33.
While the popularity of and expenditures for herbal therapies (aka “ethnomedicines”) have increased globally in recent years, their efficacy, safety, mechanisms of action, potential as novel therapeutic agents, cost-effectiveness, or lack thereof, remain poorly defined and controversial. Moreover, published clinical trials evaluating the efficacy of herbal therapies have rightfully been criticized, post hoc, for their lack of quality assurance and reproducibility of study materials, as well as a lack of demonstration of plausible mechanisms and dosing effects. In short, clinical botanical investigations have suffered from the lack of a cohesive research strategy which draws on the expertise of all relevant specialties.
With this as background, US and Chinese co-investigators with expertise in Traditional Chinese Medicine (TCM), botany, chemistry and drug discovery, have jointly established a prototype library consisting of 202 authenticated medicinal plant and fungal species that collectively represent the therapeutic content of the majority of all commonly prescribed TCM herbal prescriptions. Currently housed at Harvard University, the library consists of duplicate or triplicate kilogram quantities of each authenticated and processed species, as well as “detanninized” extracts and sub-fractions of each mother extract. Each species has been collected at 2–3 sites, each separated geographically by hundreds of miles, with precise GPS documentation, and authenticated visually and chemically prior to testing for heavy metal and/or pesticide contamination. An explicit decision process has been developed whereby samples with the least contamination were selected to undergo ethanol extraction and HPLC sub-fractionation in preparation for high throughput screening across a broad array of biological targets including cancer biology targets. As envisioned, the subfractions in this artisan collection of authenticated medicinal plants will be tested for biological activity individually and in combinations (i.e., “complex mixtures”) consistent with traditional ethnomedical practice.
This manuscript summarizes the rationale, methods and preliminary “proof of principle” for the establishment of this prototype, authenticated medicinal plant library. It is hoped that these methods will foster scientific discoveries with therapeutic potential and enhance efforts to systematically evaluate commonly used herbal therapies worldwide.
PMCID: PMC3031246  PMID: 21108995
Herbal medicine; Library; Traditional Chinese; Ethnomedicine
5.  A High-Throughput, Cell-Based Screening Method for siRNA and Small Molecule Inhibitors of mTORC1 Signaling Using the In Cell Western Technique 
The mTORC1 pathway is a central regulator of cell growth, and defective mTORC1 regulation plays a causative role in a variety of human diseases, including cancer, tumor syndromes such as the tuberous sclerosis complex (TSC) and lymphangioleiomyomatosis (LAM), and metabolic diseases such as diabetes and obesity. Given the importance of mTORC1 signaling in these diseases, there has been significant interest in developing screening methods suitable for identifying inhibitors of mTORC1 activation. To this end, we have developed a high-throughput, cell-based assay for the detection of rpS6-phosphorylation as a measure of mTORC1 signaling. This assay takes advantage of the “In Cell Western” (ICW) technique using the Aerius infrared imaging system (LI-COR® Biosciences). The ICW procedure involves fixation and immunostaining of cells in a manner similar to standard immunofluorescence methods but takes advantage of secondary antibodies conjugated to infrared-excitable fluorophores for quantitative detection by the Aerius® scanner. In addition, the cells are stained with an infrared-excitable succinimidyl ester dye, which covalently modifies free amine groups in fixed cells and provides a quantitative measure of cell number. We present validation data and pilot screens in a 384-well format demonstrating that this assay provides a statistically robust method for both small molecule and siRNA screening approaches designed to identify inhibitors of mTORC1 signaling.
PMCID: PMC3096554  PMID: 20085456
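The In Cell Western entry above describes two per-well readouts: a phospho-rpS6 signal and a cell-number stain. A rough Python sketch of how such readouts might be combined follows. The per-well ratio and the use of the Z'-factor as the measure of assay robustness are assumptions; the abstract does not state which statistics were used, and all names and numbers here are hypothetical.

```python
# Illustrative only: the normalization and the quality metric are assumptions.
from statistics import mean, stdev


def normalize_to_cell_number(phospho_signal, cell_stain):
    """Divide each well's phospho-rpS6 signal by its cell-number stain signal."""
    return [p / c for p, c in zip(phospho_signal, cell_stain)]


def z_prime(positive_controls, negative_controls):
    """Z'-factor: 1 - 3 * (sd_pos + sd_neg) / |mean_pos - mean_neg|."""
    spread = stdev(positive_controls) + stdev(negative_controls)
    window = abs(mean(positive_controls) - mean(negative_controls))
    return 1 - 3 * spread / window


# Hypothetical control wells (arbitrary fluorescence units).
uninhibited = normalize_to_cell_number([900, 1000, 1100], [100, 100, 100])
inhibited = normalize_to_cell_number([90, 100, 110], [100, 100, 100])
print(z_prime(uninhibited, inhibited))
```

Values of the Z'-factor approaching 1 indicate well-separated controls; the metric is shown here only as one common way to quantify the separation between control populations.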
6.  Screensaver: an open source lab information management system (LIMS) for high throughput screening facilities 
BMC Bioinformatics  2010;11:260.
Shared-usage high throughput screening (HTS) facilities are becoming more common in academe as large-scale small molecule and genome-scale RNAi screening strategies are adopted for basic research purposes. These shared facilities require a unique informatics infrastructure that must not only provide access to and analysis of screening data, but must also manage the administrative and technical challenges associated with conducting numerous, interleaved screening efforts run by multiple independent research groups.
We have developed Screensaver, a free, open source, web-based lab information management system (LIMS), to address the informatics needs of our small molecule and RNAi screening facility. Screensaver supports the storage and comparison of screening data sets, as well as the management of information about screens, screeners, libraries, and laboratory work requests. To our knowledge, Screensaver is one of the first applications to support the storage and analysis of data from both genome-scale RNAi screening projects and small molecule screening projects.
The informatics and administrative needs of an HTS facility may be best managed by a single, integrated, web-accessible application such as Screensaver. Screensaver has proven useful in meeting the requirements of the ICCB-Longwood/NSRB Screening Facility at Harvard Medical School, and has provided similar benefits to other HTS facilities.
PMCID: PMC3001403  PMID: 20482787
7.  Statistical Methods for Analysis of High-Throughput RNA Interference Screens 
Nature methods  2009;6(8):569-575.
RNA interference (RNAi) has become a powerful technique for reverse genetics and drug discovery and, in both of these areas, large-scale high-throughput RNAi screens are commonly performed. The statistical techniques used to analyze these screens are frequently borrowed directly from small-molecule screening; however small-molecule and RNAi data characteristics differ in meaningful ways. We examine the similarities and differences between RNAi and small-molecule screens, highlighting particular characteristics of RNAi screen data that must be addressed during analysis. Additionally, we provide guidance on selection of analysis techniques in the context of a sample workflow.
PMCID: PMC2789971  PMID: 19644458
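As a small illustration of the kind of plate-level scoring that screen analysis workflows like the one discussed in the entry above involve, the Python sketch below computes robust z-scores from the median and the median absolute deviation (MAD). The abstract does not name specific methods, so the choice of this statistic, the function name, and the example values are assumptions.

```python
# Illustrative only: robust z-score is one common choice, assumed here.
from statistics import median


def robust_z_scores(values):
    """Score each value as (x - median) / (1.4826 * MAD).

    The constant 1.4826 scales the MAD to match the standard deviation
    for normally distributed data. Raises ValueError if the MAD is zero.
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        raise ValueError("MAD is zero; robust z-score is undefined")
    return [(v - center) / (1.4826 * mad) for v in values]


# Hypothetical plate readout with one strong outlier well.
scores = robust_z_scores([1.0, 2.0, 3.0, 4.0, 100.0])
print(scores)
```

Because the median and MAD are insensitive to a few extreme wells, strong hits inflate the scale estimate less than they would with a mean and standard deviation.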
8.  An Intermittent Live Cell Imaging Screen for siRNA Enhancers and Suppressors of a Kinesin-5 Inhibitor 
PLoS ONE  2009;4(10):e7339.
Kinesin-5 (also known as Eg5, KSP and Kif11) is required for assembly of a bipolar mitotic spindle. Small molecule inhibitors of Kinesin-5, developed as potential anti-cancer drugs, arrest cells in mitosis and promote apoptosis of cancer cells. We performed a genome-wide siRNA screen for enhancers and suppressors of a Kinesin-5 inhibitor in human cells to elucidate cellular responses, and thus identify factors that might predict drug sensitivity in cancers. Because the drug's actions play out over several days, we developed an intermittent imaging screen. Live HeLa cells expressing GFP-tagged histone H2B were imaged at 0, 24 and 48 hours after drug addition, and images were analyzed using open-source software that incorporates machine learning. This screen effectively identified siRNAs that caused increased mitotic arrest at low drug concentrations (enhancers), and vice versa (suppressors), and we report siRNAs that caused both effects. We then classified the effect of siRNAs for 15 genes where 3 or 4 out of 4 siRNA oligos tested were suppressors as assessed by time lapse imaging, and by testing for suppression of mitotic arrest in taxol and nocodazole. This identified 4 phenotypic classes of drug suppressors, which included known and novel genes. Our methodology should be applicable to other screens, and the suppressor and enhancer genes we identified may open new lines of research into mitosis and checkpoint biology.
PMCID: PMC2752188  PMID: 19802393
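The selection rule in the entry above (genes for which 3 or 4 of the 4 siRNA oligos tested were suppressors) can be sketched in a few lines of Python. The input layout, the function name, and the gene and oligo labels are hypothetical; only the 3-of-4 threshold comes from the abstract.

```python
# Illustrative only: data layout and names are assumptions.
from collections import Counter


def genes_with_consistent_suppression(calls, min_oligos=3):
    """Return genes with at least `min_oligos` oligos called as suppressors.

    `calls` is an iterable of (gene, oligo_id, is_suppressor) tuples.
    """
    counts = Counter(gene for gene, _oligo, is_sup in calls if is_sup)
    return sorted(gene for gene, n in counts.items() if n >= min_oligos)


# Hypothetical per-oligo calls for two genes, four oligos each.
example_calls = [
    ("GENE_A", "oligo1", True),
    ("GENE_A", "oligo2", True),
    ("GENE_A", "oligo3", True),
    ("GENE_A", "oligo4", False),
    ("GENE_B", "oligo1", True),
    ("GENE_B", "oligo2", False),
    ("GENE_B", "oligo3", False),
    ("GENE_B", "oligo4", True),
]
print(genes_with_consistent_suppression(example_calls))
```

Requiring agreement among several independent oligos for the same gene is a common guard against reagent-specific off-target effects.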
9.  The Pathway of US11-Dependent Degradation of MHC Class I Heavy Chains Involves a Ubiquitin-Conjugated Intermediate 
The Journal of Cell Biology  1999;147(1):45-58.
The human cytomegalovirus protein, US11, initiates the destruction of MHC class I heavy chains by targeting them for dislocation from the ER to the cytosol and subsequent degradation by the proteasome. We report the development of a permeabilized cell system that recapitulates US11-dependent degradation of class I heavy chains. We have used this system, in combination with experiments in intact cells, to identify and order intermediates in the US11-dependent degradation pathway. We find that heavy chains are ubiquitinated before they are degraded. Ubiquitination of the cytosolic tail of heavy chain is not required for its dislocation and degradation, suggesting that ubiquitination occurs after at least part of the heavy chain has been dislocated from the ER. Thus, ubiquitination of the heavy chain does not appear to be the signal to start dislocation. Ubiquitinated heavy chains are associated with membrane fractions, suggesting that ubiquitination occurs while the heavy chain is still bound to the ER membrane. Our results support a model in which US11 co-opts the quality control process by which the cell destroys misfolded ER proteins in order to specifically degrade MHC class I heavy chains.
PMCID: PMC2164983  PMID: 10508854
ubiquitin; US11; dislocation; endoplasmic reticulum; quality control
10.  Polyubiquitination Is Required for US11-dependent Movement of MHC Class I Heavy Chain from Endoplasmic Reticulum into Cytosol 
Molecular Biology of the Cell  2001;12(8):2546-2555.
The human cytomegalovirus protein US11 induces the dislocation of MHC class I heavy chains from the endoplasmic reticulum (ER) into the cytosol for degradation by the proteasome. With the use of a fractionated, permeabilized cell system, we find that US11 activity is needed only in the cell membranes and that additional cytosolic factors are required for heavy chain dislocation. We identify ubiquitin as one of the required cytosolic factors. Cytosol depleted of ubiquitin does not support heavy chain dislocation from the ER, and activity can be restored by adding back purified ubiquitin. Methylated-ubiquitin or a ubiquitin mutant lacking all lysine residues does not substitute for wild-type ubiquitin, suggesting that polyubiquitination is required for US11-dependent dislocation. We propose a new function for ubiquitin in which polyubiquitination prevents the lumenal domain of the MHC class I heavy chain from moving back into the ER lumen. A similar mechanism may be operating in the dislocation of misfolded proteins from the ER in the cellular quality control pathway.
PMCID: PMC58612  PMID: 11514634

