Related Articles

1.  Using iterative cluster merging with improved gap statistics to perform online phenotype discovery in the context of high-throughput RNAi screens 
BMC Bioinformatics  2008;9:264.
Background
The recent emergence of high-throughput automated image acquisition technologies has forever changed how cell biologists collect and analyze data. Historically, the interpretation of cellular phenotypes in different experimental conditions has been dependent upon the expert opinions of well-trained biologists. Such qualitative analysis is particularly effective in detecting subtle, but important, deviations in phenotypes. However, while the rapid and continuing development of automated microscope-based technologies now facilitates the acquisition of trillions of cells in thousands of diverse experimental conditions, such as in the context of RNA interference (RNAi) or small-molecule screens, the massive size of these datasets precludes human analysis. Thus, the development of automated methods which aim to identify novel and biologically relevant phenotypes online is one of the major challenges in high-throughput image-based screening. Ideally, phenotype discovery methods should be designed to utilize prior/existing information and tackle three challenging tasks, i.e. restoring pre-defined biologically meaningful phenotypes, differentiating novel phenotypes from known ones and clarifying novel phenotypes from each other. Arbitrarily extracted information causes biased analysis, while combining the complete existing datasets with each new image is intractable in high-throughput screens.
Results
Here we present the design and implementation of a novel and robust online phenotype discovery method with broad applicability that can be used in diverse experimental contexts, especially high-throughput RNAi screens. This method features phenotype modelling and iterative cluster merging using improved gap statistics. A Gaussian Mixture Model (GMM) is employed to estimate the distribution of each existing phenotype, and is then used as the reference distribution in gap statistics. This method is broadly applicable to a number of different types of image-based datasets derived from a wide spectrum of experimental conditions and is suitable for adaptively processing new images which are continuously added to existing datasets. Validations were carried out on different datasets, including a published RNAi screen using Drosophila embryos [Additional files 1, 2], a dataset for cell cycle phase identification using HeLa cells [Additional files 1, 3, 4] and a synthetic dataset using polygons; our method tackled the three aforementioned tasks effectively, with accuracies in the range of 85%–90%. When our method is implemented in the context of a Drosophila genome-scale RNAi image-based screen of cultured cells aimed at identifying the contribution of individual genes towards the regulation of cell shape, it efficiently discovers meaningful new phenotypes and provides novel biological insight. We also propose a two-step procedure to modify the novelty detection method based on one-class SVM, so that it can be used for online phenotype discovery. We compared the SVM-based method with our method under different conditions using various datasets, and our method consistently outperformed the SVM-based method in at least two of the three tasks by 2% to 5%. These results demonstrate that our method can be used to better identify novel phenotypes in image-based datasets from a wide range of conditions and organisms.
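The role of the reference distribution in a gap statistic can be illustrated with a minimal sketch. This is not the authors' implementation: it works on one-dimensional toy data, uses a tiny k-means as the clustering step, and fits a single Gaussian as the reference, whereas the abstract describes a GMM fitted to each existing phenotype. All function names are hypothetical.

```python
import math
import random
import statistics


def within_dispersion(points, labels, k):
    """W_k: sum over clusters of squared distances to the cluster mean."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if members:
            mu = statistics.fmean(members)
            total += sum((p - mu) ** 2 for p in members)
    return total


def kmeans_1d(points, k, iters=50):
    """Tiny deterministic 1-D k-means; centres start at evenly spaced quantiles."""
    ordered = sorted(points)
    centres = [ordered[int((i + 0.5) * len(ordered) / k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: (p - centres[c]) ** 2) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centres[c] = statistics.fmean(members)
    return labels


def gap(points, k, n_refs=20, seed=0):
    """Gap(k) = mean(log W_k of reference samples) - log W_k of the data.

    Assumption: the reference is one Gaussian fitted to the data, standing in
    for the per-phenotype GMM reference mentioned in the abstract.
    """
    rng = random.Random(seed)
    mu, sigma = statistics.fmean(points), statistics.pstdev(points)
    log_w = math.log(within_dispersion(points, kmeans_1d(points, k), k))
    ref_logs = []
    for _ in range(n_refs):
        ref = [rng.gauss(mu, sigma) for _ in points]
        ref_logs.append(math.log(within_dispersion(ref, kmeans_1d(ref, k), k)))
    return statistics.fmean(ref_logs) - log_w
```

On data drawn from two well-separated groups, the gap at k = 2 should clearly exceed the gap at k = 1, which is the kind of signal a merging procedure could use to decide that two clusters should stay separate.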
Conclusion
We demonstrate that our method can detect various novel phenotypes effectively in complex datasets. Experimental results also validate that our method performs consistently under different orders of image input, variations of starting conditions including the number and composition of existing phenotypes, and datasets from different screens. Our findings indicate that the proposed method is suitable for online phenotype discovery in diverse high-throughput image-based genetic and chemical screens.
doi:10.1186/1471-2105-9-264
PMCID: PMC2443381  PMID: 18534020
2.  Clustering phenotype populations by genome-wide RNAi and multiparametric imaging 
How to predict gene function from phenotypic cues is a longstanding question in biology. Using quantitative multiparametric imaging, RNAi-mediated cell phenotypes were measured on a genome-wide scale. On the basis of phenotypic ‘neighbourhoods’, we identified previously uncharacterized human genes as mediators of the DNA damage response pathway and the maintenance of genomic integrity. The phenotypic map is provided as an online resource at http://www.cellmorph.org for discovering further functional relationships for a broad spectrum of biological modules.
Genetic screens for phenotypic similarity have made key contributions to associating genes with biological processes. Aggregating genes by similarity of their loss-of-function phenotype has provided insights into signalling pathways that have a conserved function from Drosophila to human (Nusslein-Volhard and Wieschaus, 1980; Bier, 2005). Complex visual phenotypes, such as defects in pattern formation during development, greatly facilitated the classification of genes into pathways, and phenotypic similarities in many cases predicted molecular relationships. With RNA interference (RNAi), highly parallel phenotyping of loss-of-function effects in cultured cells has become feasible in many organisms whose genomes have been sequenced (Boutros and Ahringer, 2008). One of the current challenges is the computational categorization of visual phenotypes and the prediction of gene function and associated biological processes. With large parts of the genome still being uncharted territory, deriving functional information from large-scale phenotype analysis promises to uncover novel gene–gene relationships and to generate functional maps to explore cellular processes.
In this study, we developed an automated approach using RNAi-mediated cell phenotypes, multiparametric imaging and computational modelling to obtain functional information on previously uncharacterized genes. To generate broad, computer-readable phenotypic signatures, we measured the effect of RNAi-mediated knockdowns on changes of cell morphology in human cells on a genome-wide scale. First, the several million cells were stained for nuclear and cytoskeletal markers and then imaged using automated microscopy. On the basis of fluorescent markers, we established an automated image analysis to classify individual cells (Figure 1A). After cell segmentation for determining nuclei and cell boundaries (Figure 1C), we computed 51 cell descriptors that quantified intensities, shape characteristics and texture (Figure 1F). Individual cells were categorized into 1 of 10 classes, which included cells showing protrusion/elongation, cells in metaphase, large cells, condensed cells, cells with lamellipodia and cellular debris (Figure 1D and E). Each siRNA knockdown was summarized by a phenotypic profile and differences between RNAi knockdowns were quantified by the similarity between phenotypic profiles. We termed the vector of scores a phenoprint (Figure 3C) and defined the phenotypic distance between a pair of perturbations as the distance between their corresponding phenoprints.
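The idea of a phenotypic distance between phenoprints can be sketched in a few lines. The gene labels and score vectors below are invented placeholders, and Euclidean distance is an assumption; the summary above does not state which distance measure the authors used.

```python
import math

# Hypothetical phenoprints: one score vector per perturbation (values are made up).
phenoprints = {
    "gene_A": [0.9, 0.1, 0.0],
    "gene_B": [0.8, 0.2, 0.1],
    "gene_C": [0.0, 0.1, 0.9],
}


def phenotypic_distance(a, b):
    """Distance between two perturbations, taken here as Euclidean distance."""
    return math.dist(phenoprints[a], phenoprints[b])


def nearest_neighbours(query, n=1):
    """Return the n perturbations whose phenoprints lie closest to the query."""
    others = [g for g in phenoprints if g != query]
    return sorted(others, key=lambda g: phenotypic_distance(query, g))[:n]
```

Ranking perturbations by this distance is one simple way to list the phenotypic "neighbourhood" of a gene of interest.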
To visualize the distribution of all phenoprints, we plotted them in a genome-wide map as a two-dimensional representation of the phenotypic similarity relationships (Figure 3A). The complete data set and an interactive version of the phenotypic map are available at http://www.cellmorph.org. The map identified phenotypic ‘neighbourhoods’, which are characterized by cells with lamellipodia (WNK3, ANXA4), cells with prominent actin fibres (ODF2, SOD3), abundance of large cells (CA14), many elongated cells (SH2B2, ELMO2), decrease in cell number (TPX2, COPB1, COPA), increase in number of cells in metaphase (BLR1, CIB2) and combinations of phenotypes such as presence of large cells with protrusions and bright nuclei (PTPRZ1, RRM1; Figure 3B).
To test whether phenotypic similarity might serve as a predictor of gene function, we focused our further analysis on two clusters that contained genes associated with the DNA damage response (DDR) and genomic integrity (Figure 3A and C). The first phenotypic cluster included proteins with kinetochore-associated functions such as NUF2 (Figure 3B) and SGOL1. It also contained the centrosomal protein CEP164 that has been described as an important mediator of the DNA damage-activated signalling cascade (Sivasubramaniam et al, 2008) and the largely uncharacterized genes DONSON and SON. A second phenotypically distinct cluster included previously described components of the DDR pathway such as RRM1 (Figure 3A–C), CLSPN, PRIM2 and SETD8. Furthermore, this cluster contained the poorly characterized genes CADM1 and CD3EAP.
Cells activate a signalling cascade in response to DNA damage induced by exogenous and endogenous factors. Central are the kinases ATM and ATR as they serve as sensors of DNA damage and activators of further downstream kinases (Harper and Elledge, 2007; Cimprich and Cortez, 2008). To investigate whether DONSON, SON, CADM1 and CD3EAP, which were found in phenotypic ‘neighbourhoods’ of known DDR components, have a role in the DNA damage signalling pathway, we tested the effect of their depletion on the DDR on γ irradiation. As indicated by reduced CHEK1 phosphorylation, siRNA knockdown of DONSON, SON, CD3EAP or CADM1 resulted in impaired DDR signalling on γ irradiation. Furthermore, knockdown of DONSON or SON reduced phosphorylation of downstream effectors such as NBS1, CHEK1 and the histone variant H2AX on UVC irradiation. DONSON depletion also impaired recruitment of RPA2 onto chromatin and SON knockdown reduced RPA2 phosphorylation, indicating that DONSON and SON presumably act downstream of the activation of ATM. In agreement with their phenotypic profile, these results suggest that DONSON, SON, CADM1 and CD3EAP are important mediators of the DDR. Further experiments demonstrated that they are also required for the maintenance of genomic integrity.
In summary, we show that genes with similar phenotypic profiles tend to share similar functions. The power of our computational and experimental approach is demonstrated by the identification of novel signalling regulators whose phenotypic profiles were found in proximity to known biological modules. Therefore, we believe that such phenotypic maps can serve as a resource for functional discovery and characterization of unknown genes. Furthermore, such approaches are also applicable for other perturbation reagents, such as small molecules in drug discovery and development. One could also envision combined maps that contain both siRNAs and small molecules to predict target–small molecule relationships and potential side effects.
Genetic screens for phenotypic similarity have made key contributions to associating genes with biological processes. With RNA interference (RNAi), highly parallel phenotyping of loss-of-function effects in cells has become feasible. One of the current challenges however is the computational categorization of visual phenotypes and the prediction of biological function and processes. In this study, we describe a combined computational and experimental approach to discover novel gene functions and explore functional relationships. We performed a genome-wide RNAi screen in human cells and used quantitative descriptors derived from high-throughput imaging to generate multiparametric phenotypic profiles. We show that profiles predicted functions of genes by phenotypic similarity. Specifically, we examined several candidates including the largely uncharacterized gene DONSON, which shared phenotype similarity with known factors of DNA damage response (DDR) and genomic integrity. Experimental evidence supports that DONSON is a novel centrosomal protein required for DDR signalling and genomic integrity. Multiparametric phenotyping by automated imaging and computational annotation is a powerful method for functional discovery and mapping the landscape of phenotypic responses to cellular perturbations.
doi:10.1038/msb.2010.25
PMCID: PMC2913390  PMID: 20531400
DNA damage response signalling; massively parallel phenotyping; phenotype networks; RNAi screening
3.  Single-cell analysis of population context advances RNAi screening at multiple levels 
A large set of high-content RNAi screens investigating mammalian virus infection and multiple cellular activities is analysed to reveal the impact of population context on phenotypic variability and to identify indirect RNAi effects.
Cell population context determines phenotypes in RNAi screens of multiple cellular activities (including virus infection, cell size regulation, endocytosis, and lipid homeostasis), which can be accounted for by a combination of novel image analysis and multivariate statistical methods. Accounting for cell population context-mediated effects strongly changes the reproducibility and consistency of RNAi screens across cell lines as well as of siRNAs targeting the same gene. Such analyses can identify the perturbed regulation of population context dependent cell-to-cell variability, a novel perturbation phenotype. Overall, these methods advance the use of large-scale RNAi screening for a systems-level understanding of cellular processes.
Isogenic cells in culture show strong variability, which arises from dynamic adaptations to the microenvironment of individual cells. Here we study the influence of the cell population context, which determines a single cell's microenvironment, in image-based RNAi screens. We developed a comprehensive computational approach that employs Bayesian and multivariate methods at the single-cell level. We applied these methods to 45 RNA interference screens of various sizes, including 7 druggable genome and 2 genome-wide screens, analysing 17 different mammalian virus infections and four related cell physiological processes. Analysing cell-based screens at this depth reveals widespread RNAi-induced changes in the population context of individual cells leading to indirect RNAi effects, as well as perturbations of cell-to-cell variability regulators. We find that accounting for indirect effects improves the consistency between siRNAs targeted against the same gene, and between replicate RNAi screens performed in different cell lines, in different labs, and with different siRNA libraries. In an era where large-scale RNAi screens are increasingly performed to reach a systems-level understanding of cellular processes, we show that this is often improved by analyses that account for and incorporate the single-cell microenvironment.
doi:10.1038/msb.2012.9
PMCID: PMC3361004  PMID: 22531119
cell-to-cell variability; image analysis; population context; RNAi; virus infection
4.  RNAiDB and PhenoBlast: web tools for genome-wide phenotypic mapping projects 
Nucleic Acids Research  2004;32(Database issue):D406-D410.
RNA interference (RNAi) is being used in large-scale genomic studies as a rapid way to obtain in vivo functional information associated with specific genes. How best to archive and mine the complex data derived from these studies provides a series of challenges associated with both the methods used to elicit the RNAi response and the functional data gathered. RNAiDB (RNAi Database; http://www.rnai.org) has been created for the archival, distribution and analysis of phenotypic data from large-scale RNAi analyses in Caenorhabditis elegans. The database contains a compendium of publicly available data and provides information on experimental methods and phenotypic results, including raw data in the form of images and streaming time-lapse movies. Phenotypic summaries together with graphical displays of RNAi to gene mappings allow quick intuitive comparison of results from different RNAi assays and visualization of the gene product(s) potentially inhibited by each RNAi experiment based on multiple sequence analysis methods. RNAiDB can be searched using combinatorial queries and using the novel tool PhenoBlast, which ranks genes according to their overall phenotypic similarity. RNAiDB could serve as a model database for distributing and navigating in vivo functional information from large-scale systematic phenotypic analyses in different organisms.
doi:10.1093/nar/gkh110
PMCID: PMC308844  PMID: 14681444
5.  An image score inference system for RNAi genome-wide screening based on fuzzy mixture regression modeling 
With recent advances in fluorescence microscopy imaging techniques and methods of gene knockdown by RNA interference (RNAi), genome-scale high-content screening (HCS) has emerged as a powerful approach to systematically identify all parts of complex biological processes. However, a critical barrier to realizing this potential is the lack of efficient and robust methods for automating RNAi image analysis and quantitatively evaluating gene knockdown effects on huge volumes of HCS data. Facing such opportunities and challenges, we have begun investigating automatic methods towards the development of a fully automatic RNAi-HCS system. Particularly important are reliable approaches to cellular phenotype classification and image-based gene function estimation.
We have developed an HCS analysis platform that consists of two main components: fluorescence image analysis and image scoring. For image analysis, we used a two-step enhanced watershed method to extract cellular boundaries from HCS images. Segmented cells were classified into several predefined phenotypes based on morphological and appearance features. Using statistical characteristics of the identified phenotypes as a quantitative description of the image, a score is generated that reflects gene function. Our scoring model integrates fuzzy gene class estimation and single regression models. The final functional score of an image was derived using the weighted combination of the inference from several support vector-based regression models. We validated our phenotype classification method and scoring system on our cellular phenotype and gene database with expert ground truth labeling.
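The weighted combination of several regression models can be pictured roughly as follows. This is only an illustrative sketch: the two linear functions stand in for trained support vector regression models, the membership values stand in for fuzzy gene class estimates, and all names are hypothetical rather than taken from the paper.

```python
def normalise(memberships):
    """Scale non-negative fuzzy memberships so that they sum to 1."""
    total = sum(memberships)
    if total <= 0:
        raise ValueError("at least one membership must be positive")
    return [m / total for m in memberships]


def combined_score(features, memberships, models):
    """Weighted sum of per-class model predictions for one image."""
    weights = normalise(memberships)
    return sum(w * model(features) for w, model in zip(weights, models))


# Placeholder regressors (assumption): one simple function per gene class.
toy_models = [
    lambda f: 2.0 * f[0] + 1.0,
    lambda f: -1.0 * f[0] + 4.0,
]
```

With memberships of (3, 1), for example, the first model contributes three quarters of the final score and the second one quarter.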
We built a database of high-content, 3-channel, fluorescence microscopy images of Drosophila Kc167 cultured cells that were treated with RNAi to perturb gene function. The proposed informatics system for microscopy image analysis is tested on this database. Both of the two main components, automated phenotype classification and image scoring system, were evaluated. The robustness and efficiency of our system were validated in quantitatively predicting the biological relevance of genes.
doi:10.1016/j.jbi.2008.04.007
PMCID: PMC2763194  PMID: 18547870
High-content screening; Image score inference
6.  RNAi Screening: New Approaches, Understandings and Organisms 
RNA interference (RNAi) leads to sequence-specific knockdown of gene function. The approach can be used in large-scale screens to interrogate function in various model organisms and an increasing number of other species. Genome-scale RNAi screens are routinely performed in cultured or primary cells or in vivo in organisms such as C. elegans. High-throughput RNAi screening is benefitting from the development of sophisticated new instrumentation and software tools for collecting and analyzing data, including high-content image data. The results of large-scale RNAi screens have already proved useful, leading to new understandings of gene function relevant to topics such as infection, cancer, obesity and aging. Nevertheless, important caveats apply and should be taken into consideration when developing or interpreting RNAi screens. Some level of false discovery is inherent to high-throughput approaches and, specific to RNAi screens, false discovery due to off-target effects (OTEs) of RNAi reagents remains a problem. The need to improve our ability to use RNAi to elucidate gene function at large scale and in additional systems continues to be addressed through improved RNAi library design, development of innovative computational and analysis tools and other approaches.
doi:10.1002/wrna.110
PMCID: PMC3249004  PMID: 21953743
RNAi; high-throughput screens; high-content imaging; cell-based assays
7.  Systemic RNAi mediated gene silencing in the anhydrobiotic nematode Panagrolaimus superbus 
Background
Gene silencing by RNA interference (RNAi) is a powerful tool for functional genomics. Although RNAi was first described in Caenorhabditis elegans, several nematode species are unable to mount an RNAi response when exposed to exogenous double stranded RNA (dsRNA). These include the satellite model organisms Pristionchus pacificus and Oscheius tipulae. Available data also suggest that the RNAi pathway targeting exogenous dsRNA may not be fully functional in some animal parasitic nematodes. The genus Panagrolaimus contains bacterial feeding nematodes which occupy a diversity of niches ranging from polar, temperate and semi-arid soils to terrestrial mosses. Thus many Panagrolaimus species are adapted to tolerate freezing and desiccation and are excellent systems to study the molecular basis of environmental stress tolerance. We investigated whether Panagrolaimus is susceptible to RNAi to determine whether this nematode could be used in large scale RNAi studies in functional genomics.
Results
We studied two species: Panagrolaimus sp. PS1159 and Panagrolaimus superbus. Both nematode species displayed embryonic lethal RNAi phenotypes following ingestion of Escherichia coli expressing dsRNA for the C. elegans embryonic lethal genes Ce-lmn-1 and Ce-ran-4. Embryonic lethal RNAi phenotypes were also obtained in both species upon ingestion of dsRNA for the Panagrolaimus genes ef1b and rps-2. Single nematode RT-PCR showed that a significant reduction in mRNA transcript levels occurred for the target ef1b and rps-2 genes in RNAi treated Panagrolaimus sp. 1159 nematodes. Visible RNAi phenotypes were also observed when P. superbus was exposed to dsRNA for structural genes encoding contractile proteins. All RNAi phenotypes were highly penetrant, particularly in P. superbus.
Conclusion
This demonstration that Panagrolaimus is amenable to RNAi by feeding will allow the development of high throughput methods of RNAi screening for P. superbus. This greatly enhances the utility of this nematode as a model system for the study of the molecular biology of anhydrobiosis and cryobiosis and as a possible satellite model nematode for comparative and functional genomics. Our data also identify another nematode infraorder which is amenable to RNAi and provide additional information on the diversity of RNAi phenotypes in nematodes.
doi:10.1186/1471-2199-9-58
PMCID: PMC2453295  PMID: 18565215
8.  GenomeRNAi: a database for cell-based and in vivo RNAi phenotypes, 2013 update 
Nucleic Acids Research  2012;41(Database issue):D1021-D1026.
RNA interference (RNAi) represents a powerful method to systematically study loss-of-function phenotypes on a large scale with a wide variety of biological assays, constituting a rich source for the assignment of gene function. The GenomeRNAi database (http://www.genomernai.org) makes available RNAi phenotype data extracted from the literature for human and Drosophila. It also provides RNAi reagent information, along with an assessment as to their efficiency and specificity. This manuscript describes an update of the database previously featured in the NAR Database Issue. The new version has undergone a complete re-design of the user interface, providing an intuitive, flexible framework for additional functionalities. Screen information and gene-reagent-phenotype associations are now available for download. The integration with other resources has been improved by allowing in-links via GenomeRNAi screen IDs, or external gene or reagent identifiers. A distributed annotation system (DAS) server enables the visualization of the phenotypes and reagents in the context of a genome browser. We have added a page listing ‘frequent hitters’, i.e. genes that show a phenotype in many screens, which might guide on-going RNAi studies. Structured annotation guidelines have been established to facilitate consistent curation, and a submission template for direct submission by data producers is available for download.
doi:10.1093/nar/gks1170
PMCID: PMC3531141  PMID: 23193271
9.  RNA Interference in Schistosoma mansoni Schistosomula: Selectivity, Sensitivity and Operation for Larger-Scale Screening 
Background
The possible emergence of resistance to the only available drug for schistosomiasis spurs drug discovery that has been recently incentivized by the availability of improved transcriptome and genome sequence information. Transient RNAi has emerged as a straightforward and important technique to interrogate that information through decreased or loss of gene function and identify potential drug targets. To date, RNAi studies in schistosome stages infecting humans have focused on single (or up to 3) genes of interest. Therefore, in the context of standardizing larger RNAi screens, data are limited on the extent of possible off-targeting effects, gene-to-gene variability in RNAi efficiency and the operational capabilities and limits of RNAi.
Methodology/Principal Findings
We investigated in vitro the sensitivity and selectivity of RNAi using double-stranded (ds)RNA (approximately 500 bp) designed to target 11 Schistosoma mansoni genes that are expressed in different tissues; the gut, tegument and otherwise. Among the genes investigated were 5 that had been previously predicted to be essential for parasite survival. We employed mechanically transformed schistosomula that are relevant to parasitism in humans, amenable to screen automation and easier to obtain in greater numbers than adult parasites. The operational parameters investigated included defined culture media for optimal parasite maintenance, transfection strategy, time- and dose- dependency of RNAi, and dosing limits. Of 7 defined culture media tested, Basch Medium 169 was optimal for parasite maintenance. RNAi was best achieved by co-incubating parasites and dsRNA (standardized to 30 µg/ml for 6 days); electroporation provided no added benefit. RNAi, including interference of more than one transcript, was selective to the gene target(s) within the pools of transcripts representative of each tissue. Concentrations of dsRNA above 90 µg/ml were directly toxic. RNAi efficiency was transcript-dependent (from 40 to >75% knockdown relative to controls) and this may have contributed to the lack of obvious phenotypes observed, even after prolonged incubations of 3 weeks. Within minutes of their mechanical preparation from cercariae, schistosomula accumulated fluorescent macromolecules in the gut indicating that the gut is an important route through which RNAi is expedited in the developing parasite.
Conclusions
Transient RNAi operates gene-selectively in S. mansoni newly transformed schistosomula yet the sensitivity of individual gene targets varies. These findings and the operational parameters defined will facilitate larger RNAi screens.
Author Summary
RNA interference (RNAi) is a technique to selectively suppress mRNA of individual genes and, consequently, their cognate proteins. RNAi using double-stranded (ds) RNA has been used to interrogate the function of mainly single genes in the flatworm, Schistosoma mansoni, one of a number of schistosome species causing schistosomiasis. In consideration of large-scale screens to identify candidate drug targets, we examined the selectivity and sensitivity (the degree of suppression) of RNAi for 11 genes produced in different tissues of the parasite: the gut, tegument (surface) and otherwise. We used the schistosomulum stage prepared from infective cercariae larvae which are accessible in large numbers and adaptable to automated screening platforms. We found that RNAi suppresses transcripts selectively; however, the sensitivity of suppression varies (from 40% to >75%). No obvious changes in the parasite occurred post-RNAi, including after targeting the mRNA of genes that had been computationally predicted to be essential for survival. Additionally, we defined operational parameters to facilitate large-scale RNAi, including choice of culture medium, transfection strategy to deliver dsRNA, dose- and time-dependency, and dosing limits. Finally, using fluorescent probes, we show that the developing gut allows rapid entrance of dsRNA into the parasite to initiate RNAi.
doi:10.1371/journal.pntd.0000850
PMCID: PMC2957409  PMID: 20976050
10.  Identification of Neural Outgrowth Genes using Genome-Wide RNAi 
PLoS Genetics  2008;4(7):e1000111.
While genetic screens have identified many genes essential for neurite outgrowth, they have been limited in their ability to identify neural genes that also have earlier critical roles in the gastrula, or neural genes for which maternally contributed RNA compensates for gene mutations in the zygote. To address this, we developed methods to screen the Drosophila genome using RNA-interference (RNAi) on primary neural cells and present the results of the first full-genome RNAi screen in neurons. We used live-cell imaging and quantitative image analysis to characterize the morphological phenotypes of fluorescently labelled primary neurons and glia in response to RNAi-mediated gene knockdown. From the full genome screen, we focused our analysis on 104 evolutionarily conserved genes that, when downregulated by RNAi, produce morphological defects such as reduced axon extension, excessive branching, loss of fasciculation, and blebbing. To assist in the phenotypic analysis of the large data sets, we generated image analysis algorithms that could assess the statistical significance of the mutant phenotypes. The algorithms were essential for the analysis of the thousands of images generated by the screening process and will become a valuable tool for future genome-wide screens in primary neurons. Our analysis revealed unexpected, essential roles in neurite outgrowth for genes representing a wide range of functional categories including signalling molecules, enzymes, channels, receptors, and cytoskeletal proteins. We also found that genes known to be involved in protein and vesicle trafficking showed similar RNAi phenotypes. We confirmed phenotypes of the protein trafficking genes Sec61alpha and Ran GTPase using Drosophila embryo and mouse embryonic cerebral cortical neurons, respectively.
Collectively, our results showed that RNAi phenotypes in primary neural culture can parallel in vivo phenotypes, and the screening technique can be used to identify many new genes that have important functions in the nervous system.
Author Summary
Development and function of the brain requires the coordinated action of thousands of genes, and currently we understand the roles of only a small fraction of them. Recent advances in genomics, such as the sequencing of entire genomes and the discovery of RNA-interference as a means of testing the effects of gene loss, have opened up the possibility to systematically analyze the function of all known and predicted genes in an organism. Until now, this type of functional genomics approach has not been applied to the study of very complex cells, such as the brain's neurons, on a full-genome scale. In this work, we developed techniques to test all genes, one by one in a rapid manner, for their potential role in neuronal development using neurons isolated from fruit fly embryos. These results yielded a global perspective of what types of genes are necessary for brain development; importantly, they show that a large variety of genes can be studied in this way.
doi:10.1371/journal.pgen.1000111
PMCID: PMC2435276  PMID: 18604272
11.  The FLIGHT Drosophila RNAi database 
Fly  2010;4(4):344-348.
FLIGHT (http://flight.icr.ac.uk/) is an online resource compiling data from high-throughput Drosophila in vivo and in vitro RNAi screens. FLIGHT includes details of RNAi reagents and their predicted off-target effects, alongside RNAi screen hits, scores and phenotypes, including images from high-content screens. The latest release of FLIGHT is designed to enable users to upload, analyze, integrate and share their own RNAi screens. Users can perform multiple normalizations, view quality control plots, detect and assign screen hits and compare hits from multiple screens using a variety of methods including hierarchical clustering. FLIGHT integrates RNAi screen data with microarray gene expression as well as genomic annotations and genetic/physical interaction datasets to provide a single interface for RNAi screen analysis and datamining in Drosophila.
doi:10.4161/fly.4.4.13303
PMCID: PMC3174485  PMID: 20855970
RNAi; database; integration; bioinformatics; phenotype
12.  Online GESS: prediction of miRNA-like off-target effects in large-scale RNAi screen data by seed region analysis 
BMC Bioinformatics  2014;15:192.
Background
RNA interference (RNAi) is an effective and important tool used to study gene function. For large-scale screens, RNAi is used to systematically down-regulate genes of interest and analyze their roles in a biological process. However, RNAi is associated with off-target effects (OTEs), including microRNA (miRNA)-like OTEs. The contribution of reagent-specific OTEs to RNAi screen data sets can be significant. In addition, the post-screen validation process is time and labor intensive. Thus, the availability of robust approaches to identify candidate off-targeted transcripts would be beneficial.
Results
Significant efforts have been made to eliminate false positive results attributable to sequence-specific OTEs associated with RNAi. These approaches have included improved algorithms for RNAi reagent design, incorporation of chemical modifications into siRNAs, and the use of various bioinformatics strategies to identify possible OTEs in screen results. Genome-wide Enrichment of Seed Sequence matches (GESS) was developed to identify potential off-targeted transcripts in large-scale screen data by seed-region analysis. Here, we introduce a user-friendly web application that provides researchers a relatively quick and easy way to perform GESS analysis on data from human or mouse cell-based screens using short interfering RNAs (siRNAs) or short hairpin RNAs (shRNAs), as well as for Drosophila screens using shRNAs. Online GESS relies on up-to-date transcript sequence annotations for human and mouse genes extracted from NCBI Reference Sequence (RefSeq) and Drosophila genes from FlyBase. The tool also accommodates analysis with user-provided reference sequence files.
Conclusion
Online GESS provides a straightforward user interface for genome-wide seed region analysis for human, mouse and Drosophila RNAi screen data. With the tool, users can either use a built-in database or provide a database of transcripts for analysis. This makes it possible to analyze RNAi data from any organism for which the user can provide transcript sequences.
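The seed-region idea behind this kind of analysis can be illustrated with a minimal Python sketch. It is not the GESS implementation: it assumes the seed is nucleotides 2-8 of the guide strand (a common convention for miRNA-like targeting), and the function names and example sequences are hypothetical. It only reports which transcripts contain a perfect seed match; the enrichment statistics that GESS computes across a whole screen are not reproduced here.

```python
# Illustrative sketch only; names and the 2-8 seed window are assumptions.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Upper-case a sequence and convert DNA letters (T) to RNA (U)."""
    return seq.upper().replace("T", "U")


def seed_match_site(guide, seed_start=1, seed_length=7):
    """Return the transcript site complementary to the guide's seed.

    seed_start is 0-based, so the default covers guide positions 2-8.
    """
    seed = to_rna(guide)[seed_start:seed_start + seed_length]
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def count_seed_matches(guide, transcript):
    """Count (possibly overlapping) perfect seed matches in one transcript."""
    site = seed_match_site(guide)
    target = to_rna(transcript)
    return sum(
        1
        for i in range(len(target) - len(site) + 1)
        if target.startswith(site, i)
    )


def transcripts_with_matches(guide, transcripts):
    """Return the sorted names of transcripts with at least one seed match."""
    return sorted(
        name
        for name, seq in transcripts.items()
        if count_seed_matches(guide, seq) > 0
    )
```

A user-provided reference file, as mentioned above, would correspond to the `transcripts` dictionary mapping transcript names to sequences.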
doi:10.1186/1471-2105-15-192
PMCID: PMC4073188  PMID: 24934636
RNAi; Off-target effects; Data analysis; Seed region; miRNA; siRNA; shRNA; High-throughput screening
13.  Phenotype Recognition with Combined Features and Random Subspace Classifier Ensemble 
BMC Bioinformatics  2011;12:128.
Background
Automated, image-based high-content screening is a fundamental tool for discovery in biological science. Modern robotic fluorescence microscopes are able to capture thousands of images from massively parallel experiments such as RNA interference (RNAi) or small-molecule screens. As such, efficient computational methods are required for automatic cellular phenotype identification capable of dealing with large image data sets. In this paper we investigated an efficient method for the extraction of quantitative features from images by combining second-order statistics, or Haralick features, with the curvelet transform. A random subspace based classifier ensemble with a multilayer perceptron (MLP) as the base classifier was then exploited for classification. Haralick features estimate image properties related to second-order statistics based on the grey level co-occurrence matrix (GLCM), which has been extensively used for various image processing applications. The curvelet transform gives a sparser representation of the image than the wavelet transform, thus offering a description with higher time-frequency resolution and a high degree of directionality and anisotropy, which is particularly appropriate for many images rich in edges and curves. A combined feature description from Haralick features and the curvelet transform can further increase the accuracy of classification by exploiting their complementary information. We then investigate the applicability of the random subspace (RS) ensemble method for phenotype classification based on microscopy images. Each base classifier is trained on an RS-sampled subset of the original feature set, and the ensemble assigns a class label by majority voting.
Results
Experimental results on phenotype recognition from three benchmarking image sets, HeLa, CHO and RNAi, show the effectiveness of the proposed approach. The combined feature gives better classification accuracy than any individual one. The ensemble model produces better classification performance than the individual component neural networks. For the three image sets HeLa, CHO and RNAi, the Random Subspace Ensemble offers classification rates of 91.20%, 98.86% and 91.03% respectively, which contrast sharply with the published results of 84%, 93% and 82% from the multi-purpose image classifier WND-CHARM, which applied wavelet transforms and other feature extraction methods. We investigated the problem of estimating the ensemble parameters and found that a satisfactory performance improvement could be achieved with a relatively medium dimensionality of feature subsets and a small ensemble size.
Conclusions
The multiscale and multidirectional character of the curvelet transform suits the description of microscopy images very well. It is empirically demonstrated that the curvelet-based feature is clearly preferable to the wavelet-based feature for bioimage descriptions. The random subspace ensemble of MLPs performs much better than a number of commonly applied multi-class classifiers in the investigated application of phenotype recognition.
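The random subspace scheme described in this abstract (each base classifier sees a random subset of the features, and the ensemble decides by majority vote) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' method: to stay dependency-free it swaps the MLP base classifier for a nearest-centroid classifier, and all function names are hypothetical.

```python
# Illustrative sketch; a nearest-centroid classifier stands in for the MLP.
import random
from collections import Counter


def train_centroids(X, y, features):
    """Compute one centroid per class, restricted to the given feature indices."""
    sums = {}
    counts = Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, f in enumerate(features):
            acc[i] += row[f]
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def predict_centroid(centroids, row, features):
    """Return the class whose centroid is closest in the feature subspace."""
    def sq_dist(centroid):
        return sum((row[f] - centroid[i]) ** 2 for i, f in enumerate(features))

    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def train_random_subspace(X, y, n_members, subspace_size, seed=0):
    """Train n_members base classifiers, each on a random feature subset."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ensemble = []
    for _ in range(n_members):
        features = rng.sample(range(n_features), subspace_size)
        ensemble.append((features, train_centroids(X, y, features)))
    return ensemble


def predict_ensemble(ensemble, row):
    """Assign the class label that receives the most votes."""
    votes = Counter(
        predict_centroid(centroids, row, features)
        for features, centroids in ensemble
    )
    return votes.most_common(1)[0][0]
```

The two ensemble parameters discussed in the Results, the dimensionality of the feature subsets and the ensemble size, correspond to `subspace_size` and `n_members` here.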
doi:10.1186/1471-2105-12-128
PMCID: PMC3098787  PMID: 21529372
14.  RNAi screen of Salmonella invasion shows role of COPI in membrane targeting of cholesterol and Cdc42 
A genome wide RNAi screen identifies 72 host cell genes affecting S. Typhimurium entry, including actin regulators and COPI. This study implicates COPI-dependent cholesterol and sphingolipid localization as a common mechanism of infection by bacterial and viral pathogens.
Genome-scale RNAi screen identifies 72 host genes affecting S. Typhimurium host cell invasion. Step-specific follow-up assays assign the phenotypes to specific steps of the invasion process. COPI effects on host cell binding, ruffling and invasion were traced to a key role of COPI in membrane targeting of cholesterol, sphingolipids, Rac1 and Cdc42. This new role of COPI explains why COPI is required for host cell infection by numerous bacterial and viral pathogens.
Pathogens are not only a menace to public health, but they also provide excellent tools for probing host cell function. Thus, studying infection mechanisms has fueled progress in cell biology (Ridley et al, 1992; Welch et al, 1997). In the presented study, we have performed an RNAi screen to identify host cell genes required for Salmonella host cell invasion. This screen identified proteins known to contribute to Salmonella-induced actin rearrangements (e.g., Cdc42 and the Arp2/3 complex; reviewed in Schlumberger and Hardt, 2006) and vesicular traffic (e.g., Rab7) as well as unexpected hits, such as the COPI complex. COPI is a known organizer of Golgi-to-ER vesicle transport (Bethune et al, 2006; Beck et al, 2009). Here, we show that COPI is also involved in plasma membrane targeting of cholesterol, sphingolipids and the Rho GTPases Cdc42 and Rac1, essential host cell factors required for Salmonella invasion. This explains why COPI depletion inhibits infection by S. Typhimurium and illustrates how combining bacterial pathogenesis and systems approaches can promote cell biology.
Salmonella Typhimurium is a common food-borne pathogen and worldwide a major public health problem causing severe diarrhea. The pathogen uses the host's gut mucosa as a portal of entry and gut tissue invasion is a key event leading to the disease. This explains the intense interest from medicine and basic biology in the mechanism of Salmonella host cell invasion.
Tissue culture infection models have delineated a sequence of events leading to host cell invasion (Figure 1; Schlumberger and Hardt, 2006): (i) pathogen binding to the host cell surface; (ii) activation of a syringe-like apparatus (‘Type III secretion system 1', T1) of the bacterium and injection of a bacterial toxin cocktail into the host cell. These toxins include SopE, a key virulence factor triggering invasion (Hardt et al, 1998), which was analyzed in our study; (iii) toxin-triggered membrane ruffling. To a significant extent, this is facilitated by SopE-triggered activation of Cdc42 and Rac1 and subsequent actin polymerization at the site of infection; (iv) engulfment of the pathogen within a vesicular compartment (SCV) and (v) maturation of the SCV, a process driven by a second Type III secretion system (T2), which is expressed by the pathogen upon bacterial entry (Figure 1). This sequence of events mediates Salmonella invasion into the gut epithelium and illustrates that this pathogen can be used for probing mechanisms of host cell actin control, membrane biogenesis, vesicle formation and vesicular trafficking.
SopE is a key virulence factor of invasion and triggers the activation of Cdc42 and Rac1 and subsequent actin polymerization at the site of infection. We have employed a SopE-expressing S. Typhimurium strain and RNAi screening technology to identify host cell factors affecting invasion. First, we developed an automated fluorescence microscopy assay to quantify S. Typhimurium entry in a high-throughput format (Figure 1C). This assay was based on a GFP reporter expressed by the pathogen after invasion and maturation of the SCV. Using this assay, we screened a ‘druggable genome' siRNA library (6978 genes, 3 oligos each, 1 oligo per well) and identified 72 invasion hits. These included established regulators of the actin cytoskeleton (Cdc42, Arp2/3, Nap1; Schlumberger and Hardt, 2006), some of which have not been implicated so far in Salmonella entry (Pfn1, Cap1), as well as proteins not previously thought to influence infection (Atp1a1, Rbx1, COPI complex). Potentially, these hits could affect any step of the invasion process (Figure 1A).
In the second stage of the study, we have assigned each ‘invasion hit' to particular steps of the invasion process. For this purpose, we developed step-specific assays for Salmonella binding, injection, ruffling and membrane engulfment and re-screened the genes found as hits in the first screen (four siRNAs per gene). As expected, a significant number of ‘hits' affected binding to the host cell, others affected binding and ruffling (e.g., Pfn1, Itgβ5, Cap1), a few were specific for the ruffling step (e.g., Cdc42) and some affected SCV maturation, namely Rab7a, the trafficking protein Vps39 and the vacuolar proton pump Atp6ap2. Thus, our experimental strategy allowed mechanistic interpretation and linked novel hits to particular phenotypes, thus providing a basis for further studies (Figure 1).
COPI depletion impaired effector injection and ruffling. This was surprising, as the COPI complex was known to regulate retrograde Golgi-to-ER transport, but was not expected to affect pathogen interactions at the plasma membrane. Therefore, we have investigated the underlying mechanism. We have observed that COPI depletion entailed dramatic changes in the plasma membrane composition (Figure 6). Cholesterol and sphingolipids, which form domains (‘lipid rafts') in the plasma membrane, were depleted from the cell surface and redirected into a large vesicular compartment. The same was true for the Rho GTPases Rac1 and Cdc42. This strong decrease in the amount of cholesterol-enriched microdomains and Rho GTPases in the plasma membrane explained the observed defects in S. Typhimurium host cell invasion and assigned a novel role for COPI in controlling mammalian plasma membrane composition. It should be noted that other viral and bacterial pathogens do show a similar dependency on host cellular COPI and plasma membrane lipids. This includes notorious pathogens such as Staphylococcus aureus (Ramet et al, 2002; Potrich et al, 2009), Listeria monocytogenes (Seveau et al, 2004; Agaisse et al, 2005; Cheng et al, 2005; Gekara et al, 2005), Mycobacterium tuberculosis (Munoz et al, 2009), Chlamydia trachomatis (Elwell et al, 2008), influenza virus (Hao et al, 2008; Konig et al, 2010), hepatitis C virus (Tai et al, 2009; Popescu and Dubuisson, 2010) and the vesicular stomatitis virus (presented study) and suggests that COPI-mediated control of host cell plasma membrane composition might be of broad importance for pathogenesis. Future work will have to address whether this might offer starting points for developing anti-infective therapeutics with a very broad spectrum of activity.
The pathogen Salmonella Typhimurium is a common cause of diarrhea and invades the gut tissue by injecting a cocktail of virulence factors into epithelial cells, triggering actin rearrangements, membrane ruffling and pathogen entry. One of these factors is SopE, a G-nucleotide exchange factor for the host cellular Rho GTPases Rac1 and Cdc42. How SopE mediates cellular invasion is incompletely understood. Using genome-scale RNAi screening we identified 72 known and novel host cell proteins affecting SopE-mediated entry. Follow-up assays assigned these ‘hits' to particular steps of the invasion process; i.e., binding, effector injection, membrane ruffling, membrane closure and maturation of the Salmonella-containing vacuole. Depletion of the COPI complex revealed a unique effect on virulence factor injection and membrane ruffling. Both effects are attributable to mislocalization of cholesterol, sphingolipids, Rac1 and Cdc42 away from the plasma membrane into a large intracellular compartment. Equivalent results were obtained with the vesicular stomatitis virus. Therefore, COPI-facilitated maintenance of lipids may represent a novel, unifying mechanism essential for a wide range of pathogens, offering opportunities for designing new drugs.
doi:10.1038/msb.2011.7
PMCID: PMC3094068  PMID: 21407211
coatomer; HeLa; Salmonella; siRNA; systems biology
15.  Simultaneous analysis of large-scale RNAi screens for pathogen entry 
BMC Genomics  2014;15(1):1162.
Background
Large-scale RNAi screening has become an important technology for identifying genes involved in biological processes of interest. However, the quality of large-scale RNAi screening is often degraded by off-target effects. In order to find statistically significant effector genes for pathogen entry, we systematically analyzed entry pathways in human host cells for eight pathogens using image-based kinome-wide siRNA screens with siRNAs from three vendors. We propose a Parallel Mixed Model (PMM) approach that simultaneously analyzes several non-identical screens performed with the same RNAi libraries.
Results
We show that PMM gains statistical power for hit detection due to parallel screening. PMM allows incorporating siRNA weights that can be assigned according to available information on RNAi quality. Moreover, PMM is able to estimate a sharedness score that can be used to focus follow-up efforts on generic or specific gene regulators. By fitting a PMM model to our data, we found several novel hit genes for most of the pathogens studied.
Conclusions
Our results show that parallel RNAi screening can improve the results of individual screens. This is particularly interesting now that large-scale parallel datasets are becoming more and more publicly available. Our comprehensive siRNA dataset provides a public, freely available resource for further statistical and biological analyses in the high-content, high-throughput siRNA screening field.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-1162) contains supplementary material, which is available to authorized users.
doi:10.1186/1471-2164-15-1162
PMCID: PMC4326433  PMID: 25534632
High-throughput high-content RNAi screening; Pathogen entry; Linear mixed model; Hit detection
16.  An Integrative Genomic Approach to Uncover Molecular Mechanisms of Prokaryotic Traits 
PLoS Computational Biology  2006;2(11):e159.
With mounting availability of genomic and phenotypic databases, data integration and mining become increasingly challenging. While efforts have been put forward to analyze prokaryotic phenotypes, current computational technologies either lack high throughput capacity for genomic scale analysis, or are limited in their capability to integrate and mine data across different scales of biology. Consequently, simultaneous analysis of associations among genomes, phenotypes, and gene functions is prohibited. Here, we developed a high throughput computational approach, and demonstrated for the first time the feasibility of integrating large quantities of prokaryotic phenotypes along with genomic datasets for mining across multiple scales of biology (protein domains, pathways, molecular functions, and cellular processes). Applying this method over 59 fully sequenced prokaryotic species, we identified genetic basis and molecular mechanisms underlying the phenotypes in bacteria. We identified 3,711 significant correlations between 1,499 distinct Pfam and 63 phenotypes, with 2,650 correlations and 1,061 anti-correlations. Manual evaluation of a random sample of these significant correlations showed a minimal precision of 30% (95% confidence interval: 20%–42%; n = 50). We stratified the most significant 478 predictions and subjected 100 to manual evaluation, of which 60 were corroborated in the literature. We furthermore unveiled 10 significant correlations between phenotypes and KEGG pathways, eight of which were corroborated in the evaluation, and 309 significant correlations between phenotypes and 166 GO concepts evaluated using a random sample (minimal precision = 72%; 95% confidence interval: 60%–80%; n = 50). Additionally, we conducted a novel large-scale phenomic visualization analysis to provide insight into the modular nature of common molecular mechanisms spanning multiple biological scales and reused by related phenotypes (metaphenotypes). 
We propose that this method elucidates which classes of molecular mechanisms are associated with phenotypes or metaphenotypes and holds promise in facilitating a computable systems biology approach to genomic and biomedical research.
Synopsis
A key challenge of the post-genomic era is to conceive large-scale studies of genomes and observable characteristics of organisms (phenotypes) and to interpret the data thus produced. The goal of this “phenomic” study is to improve our understanding of complex biological systems in terms of their molecular underpinnings. In this paper, Liu and colleagues present comprehensive computational and novel visualization methods for discovering biological knowledge spanning multiple scales of biology. The authors were able to predict and visualize new knowledge between clusters of microbiological phenotypes and their molecular mechanisms. To their knowledge, this is the first time this has been done. More specifically, the method integrates microbiological data with genomic-scale data from protein family databases, gene ontology, and biological pathways. Conducted over 59 fully sequenced bacteria, and including significantly more phenotypes than previous studies of its kind, this study enables a “systems biology” view across different classifications of genes and processes. This represents advancement over previous techniques, which are either limited in biological scale or analytical breadth. Visualization of the networks generated by this technique shows the common biological modules shared by related phenotypes. The results of this experiment demonstrate that the fusion of clinical data with genomic information is able to elucidate, in high throughput, a massive number of biological processes underlying phenotypes.
doi:10.1371/journal.pcbi.0020159
PMCID: PMC1636675  PMID: 17112314
17.  Characterizing Protein Interactions Employing a Genome-Wide siRNA Cellular Phenotyping Screen 
PLoS Computational Biology  2014;10(9):e1003814.
Characterizing the activating and inhibiting effect of protein-protein interactions (PPI) is fundamental to gain insight into the complex signaling system of a human cell. A plethora of methods has been suggested to infer PPI from data on a large scale, but none of them is able to characterize the effect of this interaction. Here, we present a novel computational development that employs mitotic phenotypes of a genome-wide RNAi knockdown screen and enables identifying the activating and inhibiting effects of PPIs. Exemplarily, we applied our technique to a knockdown screen of HeLa cells cultivated at standard conditions. Using a machine learning approach, we obtained high accuracy (82% AUC of the receiver operating characteristics) by cross-validation using 6,870 known activating and inhibiting PPIs as gold standard. We predicted de novo unknown activating and inhibiting effects for 1,954 PPIs in HeLa cells covering the ten major signaling pathways of the Kyoto Encyclopedia of Genes and Genomes, and made these predictions publicly available in a database. We finally demonstrate that the predicted effects can be used to cluster knockdown genes of similar biological processes in coherent subgroups. The characterization of the activating or inhibiting effect of individual PPIs opens up new perspectives for the interpretation of large datasets of PPIs and thus considerably increases the value of PPIs as an integrated resource for studying the detailed function of signaling pathways of the cellular system of interest.
Author Summary
Mathematical models that aim to describe cellular signaling start by constructing an interaction network of effectors, mediators and their affected target proteins. Several developments have made it easier to put these links together. Besides the tedious assembly of knowledge from textbooks and research articles, experimental high-throughput methods were established, such as Yeast-2-Hybrid assays or Fluorescence Emission Resonance Transfer. However, these methods do not elucidate the effect of such interactions. We aimed to infer whether an interaction in a specific cellular context is activating or inhibiting. We used cellular phenotypes of a genome-wide RNAi knockdown screen of live cells to identify such activating and inhibiting effects of protein interactions. The rationale behind it is that activating protein interactions should lead to similar phenotypes when their respective genes are knocked down, whereas an inhibiting protein interaction should lead to dissimilar phenotypes. As an example, we applied our method to a phenotype screen of perturbed HeLa cells. Our predictions effectively reproduced textbook relationships between proteins or domains when comparing the predicted effects with pairs of effectors, receptors, kinases, phosphatases and of general signalling modules. The presented computational approach is generic and may also enable elucidating the effects of studied interactions in other cellular systems under more specific conditions.
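The rationale stated above (similar knockdown phenotypes suggest an activating interaction, dissimilar ones an inhibiting interaction) can be made concrete with a toy Python sketch. It is not the machine learning approach used in the study: it assumes that each knockdown is summarized by a numeric phenotype vector, uses Pearson correlation as a stand-in similarity measure, and treats strong anti-correlation as "dissimilar". The function names and the 0.5 threshold are hypothetical.

```python
# Toy illustration of the similarity rationale; not the published classifier.
import math


def pearson(a, b):
    """Pearson correlation between two equally long phenotype vectors."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    norm_a = math.sqrt(sum((x - mean_a) ** 2 for x in a))
    norm_b = math.sqrt(sum((y - mean_b) ** 2 for y in b))
    return cov / (norm_a * norm_b)


def naive_effect(phenotype_a, phenotype_b, threshold=0.5):
    """Label an interaction from the similarity of the two knockdown phenotypes."""
    r = pearson(phenotype_a, phenotype_b)
    if r >= threshold:
        return "activating"   # similar phenotypes
    if r <= -threshold:
        return "inhibiting"   # opposing phenotypes (assumed reading of "dissimilar")
    return "undetermined"
```

A trained classifier validated against known interactions, as the abstract describes, would replace the fixed threshold in any real analysis.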
doi:10.1371/journal.pcbi.1003814
PMCID: PMC4178005  PMID: 25255318
18.  web cellHTS2: A web-application for the analysis of high-throughput screening data 
BMC Bioinformatics  2010;11:185.
Background
The analysis of high-throughput screening data sets is an expanding field in bioinformatics. High-throughput screens by RNAi generate large primary data sets which need to be analyzed and annotated to identify relevant phenotypic hits. Large-scale RNAi screens are frequently used to identify novel factors that influence a broad range of cellular processes, including signaling pathway activity, cell proliferation, and host cell infection. Here, we present a web-based application utility for the end-to-end analysis of large cell-based screening experiments by cellHTS2.
Results
The software guides the user through the configuration steps that are required for the analysis of single or multi-channel experiments. The web-application provides options for various standardization and normalization methods, annotation of data sets and a comprehensive HTML report of the screening data analysis, including a ranked hit list. Sessions can be saved and restored for later re-analysis. The web frontend for the cellHTS2 R/Bioconductor package interacts with it through an R-server implementation that enables highly parallel analysis of screening data sets. web cellHTS2 further provides a file import and configuration module for common file formats.
Conclusions
The implemented web-application facilitates the analysis of high-throughput data sets and provides a user-friendly interface. web cellHTS2 is accessible online at http://web-cellHTS2.dkfz.de. A standalone version as a virtual appliance and source code for platforms supporting Java 1.5.0 can be downloaded from the web cellHTS2 page. web cellHTS2 is freely distributed under GPL.
doi:10.1186/1471-2105-11-185
PMCID: PMC3098057  PMID: 20385013
19.  Functional complementation of RNA interference mutants in trypanosomes 
BMC Biotechnology  2005;5:6.
Background
In many eukaryotic cells, double-stranded RNA (dsRNA) triggers RNA interference (RNAi), the specific degradation of RNA of homologous sequence. RNAi is now a major tool for reverse-genetics projects, including large-scale high-throughput screens. Recent reports have questioned the specificity of RNAi, raising problems in interpretation of RNAi-based experiments.
Results
Using the protozoan Trypanosoma brucei as a model, we designed a functional complementation assay to ascertain that phenotypic effect(s) observed upon RNAi were due to specific silencing of the targeted gene. This was applied to a cytoskeletal gene encoding the paraflagellar rod protein 2 (TbPFR2), whose product is essential for flagellar motility. We demonstrate the complementation of TbPFR2, silenced via dsRNA targeting its UTRs, through the expression of a tagged RNAi-resistant TbPFR2 encoding a protein that could be immunolocalized in the flagellum. Next, we performed a functional complementation of TbPFR2, silenced via dsRNA targeting its coding sequence, through heterologous expression of the TbPFR2 orthologue gene from Trypanosoma cruzi: the flagellum regained its motility.
Conclusions
This work shows that functional complementation experiments can be readily performed in order to ascertain that phenotypic effects observed upon RNAi experiments are indeed due to the specific silencing of the targeted gene. Further, the results described here are of particular interest when reverse genetics studies cannot be easily achieved in organisms not amenable to RNAi. In addition, our strategy should constitute a firm basis to elaborate functional-dissection studies of genes from other organisms.
doi:10.1186/1472-6750-5-6
PMCID: PMC549545  PMID: 15703078
20.  FLIGHT: database and tools for the integration and cross-correlation of large-scale RNAi phenotypic datasets 
Nucleic Acids Research  2005;34(Database issue):D479-D483.
FLIGHT is a new database designed to help researchers browse and cross-correlate data from large-scale RNAi studies. To date, the majority of these functional genomic screens have been carried out using Drosophila cell lines. These RNAi screens follow 100 years of classical Drosophila genetics, but have already revealed their potential by ascribing an impressive number of functions to known and novel genes. This has in turn given rise to a pressing need for tools to simplify the analysis of the large amount of phenotypic information generated. FLIGHT aims to do this by providing users with a gene-centric view of screen results and by making it possible to cluster phenotypic data to identify genes with related functions. Additionally, FLIGHT provides microarray expression data for many of the Drosophila cell lines commonly used in RNAi screens. This, together with information about cell lines, protocols and dsRNA primer sequences, is intended to help researchers design their own cell-based screens. Finally, although the current focus of FLIGHT is Drosophila, the database has been designed to facilitate the comparison of functional data across species and to help researchers working with other systems navigate their way through the fly genome.
doi:10.1093/nar/gkj038
PMCID: PMC1347401  PMID: 16381916
21.  Automatic Segmentation of High-Throughput RNAi Fluorescent Cellular Images 
High-throughput genome-wide RNA interference (RNAi) screening is emerging as an essential tool to assist biologists in understanding complex cellular processes. The large number of images produced in each study makes manual analysis intractable; hence, automatic cellular image analysis becomes an urgent need, where segmentation is the first and one of the most important steps. In this paper, a fully automatic method for segmentation of cells from genome-wide RNAi screening images is proposed. Nuclei are first extracted from the DNA channel by using a modified watershed algorithm. Cells are then extracted by modeling the interaction between them as well as combining both gradient and region information in the Actin and Rac channels. A new energy functional is formulated based on a novel interaction model for segmenting tightly clustered cells with significant intensity variance and specific phenotypes. The energy functional is minimized by using a multiphase level set method, which leads to a highly effective cell segmentation method. Promising experimental results demonstrate that automatic segmentation of high-throughput genome-wide multichannel screening can be achieved by using the proposed method, which may also be extended to other multichannel image segmentation problems.
doi:10.1109/TITB.2007.898006
PMCID: PMC2846541  PMID: 18270043
Fluorescent microscopy; high throughput; image segmentation; interaction model; level set; multichannel
22.  Unsupervised automated high throughput phenotyping of RNAi time-lapse movies 
BMC Bioinformatics  2013;14:292.
Background
Gene perturbation experiments in combination with fluorescence time-lapse cell imaging are a powerful tool in reverse genetics. High content applications require tools for the automated processing of the large amounts of data. These tools include in general several image processing steps, the extraction of morphological descriptors, and the grouping of cells into phenotype classes according to their descriptors. This phenotyping can be applied in a supervised or an unsupervised manner. Unsupervised methods are suitable for the discovery of formerly unknown phenotypes, which are expected to occur in high-throughput RNAi time-lapse screens.
Results
We developed an unsupervised phenotyping approach based on Hidden Markov Models (HMMs) with multivariate Gaussian emissions for the detection of knockdown-specific phenotypes in RNAi time-lapse movies. The automated detection of abnormal cell morphologies allows us to assign a phenotypic fingerprint to each gene knockdown. By applying our method to the Mitocheck database, we show that a phenotypic fingerprint is indicative of a gene’s function.
Conclusion
Our fully unsupervised HMM-based phenotyping is able to automatically identify cell morphologies that are specific for a certain knockdown. Beyond the identification of genes whose knockdown affects cell morphology, phenotypic fingerprints can be used to find modules of functionally related genes.
doi:10.1186/1471-2105-14-292
PMCID: PMC3851277  PMID: 24090185
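To make the idea in entry 22 concrete, the following is a minimal sketch of decoding a hidden state sequence from an HMM with Gaussian emissions. It is not the authors' implementation: the abstract describes multivariate Gaussian emissions, whereas this sketch assumes a single hypothetical morphological descriptor per time point (univariate emissions), two hypothetical states ("normal" and "abnormal"), and hand-picked parameters rather than parameters learned from data.

```python
import math

def log_gauss(x, mean, std):
    """Log density of a univariate Gaussian (simplification of the multivariate case)."""
    return -0.5 * math.log(2 * math.pi * std * std) - (x - mean) ** 2 / (2 * std * std)

def viterbi(obs, log_start, log_trans, means, stds):
    """Return the most likely hidden state path for a sequence of observations."""
    n_states = len(means)
    # Scores for the first observation.
    score = [log_start[s] + log_gauss(obs[0], means[s], stds[s]) for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at step t + 1
    for x in obs[1:]:
        prev, score, pointers = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            pointers.append(best)
            score.append(prev[best] + log_trans[best][s] + log_gauss(x, means[s], stds[s]))
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]

# Hypothetical toy parameters: state 0 = normal morphology, state 1 = abnormal.
means, stds = [0.0, 5.0], [1.0, 1.0]
log_start = [math.log(0.5), math.log(0.5)]
log_trans = [[math.log(0.9), math.log(0.1)],
             [math.log(0.1), math.log(0.9)]]

# One made-up descriptor value per frame of a time-lapse movie.
observations = [0.1, -0.2, 0.0, 5.1, 4.9, 5.2, 0.1]
path = viterbi(observations, log_start, log_trans, means, stds)
print(path)
```

In this toy setting the decoded path marks the frames in which the descriptor departs from the normal range; counting how often each state is visited per knockdown would be one simple, assumed way to build a fingerprint-like summary.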
23.  Extended Query Refinement for Medical Image Retrieval 
Journal of Digital Imaging  2007;21(3):280-289.
The impact of image pattern recognition on accessing large databases of medical images has recently been explored, and content-based image retrieval (CBIR) in medical applications (IRMA) is being researched. At present, however, the impact of image retrieval on diagnosis is limited, and practical applications are scarce. One reason is the lack of suitable mechanisms for query refinement, in particular, the ability to (1) restore previous session states, (2) combine individual queries by Boolean operators, and (3) provide continuous-valued query refinement. This paper presents a powerful user interface for CBIR that provides all three mechanisms for extended query refinement. The various mechanisms of man–machine interaction during a retrieval session are grouped into four classes: (1) output modules, (2) parameter modules, (3) transaction modules, and (4) process modules, all of which are controlled by detailed query logging. The query logging is linked to a relational database. Nested loops for interaction provide a maximum of flexibility within a minimum of complexity, as the entire data flow is still controlled within a single Web page. Our approach is implemented to support various modalities, orientations, and body regions using global features that model gray scale, texture, structure, and global shape characteristics. The resulting extended query refinement has a significant impact on medical CBIR applications.
doi:10.1007/s10278-007-9037-4
PMCID: PMC3043837  PMID: 17497197
Graphical user interface (GUI); web-based interface; query refinement; relevance feedback; usability
24.  A Computational model for compressed sensing RNAi cellular screening 
BMC Bioinformatics  2012;13:337.
Background
RNA interference (RNAi) has become an increasingly important and effective genetic tool to study the function of target genes by suppressing specific genes of interest. This systems approach helps identify signaling pathways and cellular phase types by tracking intensity and/or morphological changes of cells. The traditional RNAi screening scheme, in which one siRNA is designed to knock down one specific mRNA target, needs a large library of siRNAs and turns out to be time-consuming and expensive.
Results
In this paper, we propose a conceptual model, called compressed sensing RNAi (csRNAi), which employs a unique combination of groups of small interfering RNAs (siRNAs) to knock down a much larger number of genes. This strategy is based on the fact that one gene can be partially bound by several siRNAs and, conversely, one siRNA can bind to a few genes with distinct binding affinities. This model constructs a many-to-many correspondence between siRNAs and their targets, using far fewer siRNAs than mRNA targets compared with the conventional scheme. Mathematically, this problem involves an underdetermined system of equations (linear or nonlinear), which is ill-posed in general. However, the recently developed compressed sensing (CS) theory can solve this problem. We present a mathematical model to describe the csRNAi system based on both CS theory and biological concerns. To build this model, we first search for nucleotide motifs in a target gene set. Then we propose a machine-learning-based method to find effective siRNAs using novel features, such as image features and speech features, to describe an siRNA sequence. Numerical simulations show that we can reduce the siRNA library to one third of that in the conventional scheme. In addition, the features used to describe siRNAs substantially outperform existing ones.
Conclusions
This csRNAi system is very promising for saving both time and cost in large-scale RNAi screening experiments, which may benefit biological research on cellular processes and pathways.
doi:10.1186/1471-2105-13-337
PMCID: PMC3544734  PMID: 23270311
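The underdetermined system mentioned in entry 24 can be illustrated with a small sparse-recovery sketch. The abstract does not say which solver the authors use; orthogonal matching pursuit (OMP), a standard greedy compressed sensing algorithm, is swapped in here purely as an illustration. The matrix `A` (a hypothetical siRNA-by-gene affinity matrix with unit-norm columns), the readout vector `y`, and the sparsity level `k` are all made-up assumptions, not values from the paper.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def solve_linear(G, rhs):
    """Solve G z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(G)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * z[j] for j in range(i + 1, n))
        z[i] = (M[i][n] - s) / M[i][i]
    return z

def omp(A, y, k):
    """Greedy recovery of a k-sparse x with A x close to y (columns assumed unit norm)."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    support, coef, residual = [], [], list(y)
    for _ in range(k):
        # Pick the unused column most correlated with the current residual.
        candidates = [j for j in range(n) if j not in support]
        j_best = max(candidates, key=lambda j: abs(dot(cols[j], residual)))
        support.append(j_best)
        # Least squares on the selected columns via the normal equations.
        G = [[dot(cols[p], cols[q]) for q in support] for p in support]
        rhs = [dot(cols[p], y) for p in support]
        coef = solve_linear(G, rhs)
        residual = [y[i] - sum(c * cols[j][i] for c, j in zip(coef, support))
                    for i in range(m)]
    x = [0.0] * n
    for c, j in zip(coef, support):
        x[j] = c
    return x

# Toy system: 3 measurements (siRNA pools), 5 unknowns (gene effects), 2 nonzero effects.
r = 1 / math.sqrt(2)
A = [[1.0, 0.0, 0.0, r, 0.0],
     [0.0, 1.0, 0.0, r, r],
     [0.0, 0.0, 1.0, 0.0, r]]
y = [3.0, 0.0, 2.0]
x_hat = omp(A, y, k=2)
print(x_hat)
```

With three equations and five unknowns the system has infinitely many solutions; the sparsity assumption is what singles one out. Greedy recovery of this kind is not guaranteed to succeed for every matrix, so the example should be read as a demonstration of the principle only.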
25.  GenomeRNAi: a database for cell-based RNAi phenotypes 
Nucleic Acids Research  2006;35(Database issue):D492-D497.
RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasingly large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed in both Caenorhabditis elegans and Drosophila for a variety of phenotypes, and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays, from visible phenotypes to focused transcriptional readouts, and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at
doi:10.1093/nar/gkl906
PMCID: PMC1747177  PMID: 17135194
