Real-time quantitative PCR (qPCR) is still the gold-standard technique for gene-expression quantification. Recent technological advances in this method allow high-throughput gene-expression analysis without the limitations of sample space and reagent use. However, non-commercial, user-friendly software for the management and analysis of these data is not available.
The recently developed commercial microarrays allow for the drawing of standard curves of multiple assays using the same n-fold diluted samples. Data Analysis Gene (DAG) Expression software has been developed to perform high-throughput gene-expression data analysis using standard curves for relative quantification and one or multiple reference genes for sample normalization. We discuss the application of DAG Expression in the analysis of data from an experiment performed with Fluidigm technology, in which 48 genes and 115 samples were measured. Furthermore, the quality of our analysis was tested and compared with other available methods.
DAG Expression is freely available software that permits the automated analysis and visualization of high-throughput qPCR data. A detailed manual and a demo experiment are provided within the DAG Expression software at http://www.dagexpression.com/dage.zip.
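The standard-curve approach to relative quantification described above can be sketched as follows. This is a minimal illustration, not DAG Expression's actual code: the function names are hypothetical, and combining multiple reference genes through their geometric mean is an assumption (a common convention) rather than something the abstract states.

```python
import math


def fit_standard_curve(log10_quantities, cq_values):
    """Least-squares line Cq = slope * log10(quantity) + intercept."""
    n = len(log10_quantities)
    mean_x = sum(log10_quantities) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_quantities)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_quantities, cq_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def relative_quantity(cq, slope, intercept):
    """Invert the standard curve to turn a Cq value into a relative quantity."""
    return 10 ** ((cq - intercept) / slope)


def amplification_efficiency(slope):
    """Efficiency implied by the slope; 1.0 means perfect doubling per cycle."""
    return 10 ** (-1.0 / slope) - 1.0


def normalize(target_quantity, reference_quantities):
    """Divide by the geometric mean of the reference-gene quantities.

    Assumption: the geometric mean is used to combine reference genes.
    """
    log_mean = sum(math.log(q) for q in reference_quantities) / len(reference_quantities)
    return target_quantity / math.exp(log_mean)
```

In use, one curve would be fitted per assay from the serially diluted samples, each sample's Cq converted with `relative_quantity`, and the result divided by the reference-gene quantities measured in the same sample.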
How dermal papilla (DP) niche cells regulate hair follicle progenitors to control hair growth remains unclear. Using Tbx18Cre to target embryonic DP precursors, we ablate the transcription factor Sox2 early and efficiently, resulting in diminished hair shaft outgrowth. We find that DP niche expression of Sox2 controls the migration rate of differentiating hair shaft progenitors. Transcriptional profiling of Sox2 null DPs reveals increased Bmp6 and decreased Bmp inhibitor Sostdc1, a direct Sox2 transcriptional target. Subsequently, we identify upregulated Bmp signaling in knockout hair shaft progenitors and demonstrate that Bmps inhibit cell migration, an effect that can be attenuated by Sostdc1. A shorter and Sox2-negative hair type lacks Sostdc1 in the DP and shows reduced migration and increased Bmp activity of hair shaft progenitors. Collectively, our data identify Sox2 as a key regulator of hair growth that controls progenitor migration by fine-tuning Bmp-mediated mesenchymal-epithelial crosstalk.
Sox2; Dermal papilla; Stem cell niche; Mesenchymal-epithelial interactions; Hair growth; Hair follicle; Gene ablation; Bmp signaling
We assessed tissue macrophage gene expression in different mouse organs. Diversity in gene expression among different populations of macrophages was remarkable. Only a few hundred mRNA transcripts stood out as selectively expressed by macrophages over DCs and many of these were not present in all macrophages. Nonetheless, well-characterized surface markers, including MerTK and FcγR1 (CD64), along with a cluster of novel transcripts were distinctly and universally associated with mature tissue macrophages. TCEF3, C/EBPα, BACH1, and CREG-1 were among the top transcriptional regulators predicted to regulate these core macrophage-associated genes. Other transcription factor mRNAs were strongly associated with single macrophage populations. We further illustrate how these transcripts and the proteins they encode facilitate distinguishing macrophage versus DC identity of less characterized populations of mononuclear phagocytes.
With the widespread use of combination antiretroviral agents, the incidence of HIV-associated nephropathy has decreased. Currently, HIV-infected patients live much longer and often suffer from comorbidities such as diabetes mellitus. Recent epidemiological studies suggest that concurrent HIV infection and diabetes mellitus may have a synergistic effect on the incidence of chronic kidney disease. To address this, we determined whether HIV-1 transgene expression accelerates diabetic kidney injury using a diabetic HIV-1 transgenic (Tg26) murine model. Diabetes was initially induced with low-dose streptozotocin in both Tg26 and the wild-type mice on the C57BL/6 background, which is resistant to classic HIV-associated nephropathy. Although diabetic nephropathy is minimally observed on C57BL/6 background, diabetic Tg26 mice exhibited a significant increase in glomerular injury compared to non-diabetic Tg26 mice or diabetic wild-type mice. Validation of microarray gene expression analysis from isolated glomeruli showed a significant up-regulation of pro-inflammatory pathways in the diabetic Tg26 mice. Thus, our study found that expression of HIV-1 genes aggravates diabetic kidney disease.
Many signals must be integrated to maintain self-renewal and pluripotency in embryonic stem cells (ESCs) and to enable induced pluripotent stem cell (iPSC) reprogramming. However, the exact molecular regulatory mechanisms remain elusive. To unravel the essential internal and external signals required for sustaining the ESC state, we conducted a short hairpin (sh) RNA screen of 104 ESC-associated phosphoregulators. Depletion of one such molecule, aurora kinase A (Aurka), resulted in compromised self-renewal and consequent differentiation. By integrating global gene expression and computational analyses, we discovered that loss of Aurka leads to up-regulated p53 activity that triggers ESC differentiation. Specifically, Aurka regulates pluripotency through phosphorylation-mediated inhibition of p53-directed ectodermal and mesodermal gene expression. Phosphorylation of p53 not only impairs p53-induced ESC differentiation but also p53-mediated suppression of iPSC reprogramming. Our studies demonstrate an essential role for Aurka-p53 signaling in the regulation of self-renewal, differentiation, and somatic cell reprogramming.
Regulatory motifs are patterns of activation and inhibition that appear repeatedly in various signaling networks and that show specific regulatory properties. However, the network structures of regulatory motifs are highly diverse and complex, rendering their identification difficult. Here, we present RMOD, a web-based system for the identification of regulatory motifs and their properties in signaling networks. RMOD finds various network structures of regulatory motifs by compressing the signaling network and detecting the compressed forms of regulatory motifs. To apply it to large-scale signaling networks, it adopts a new subgraph search algorithm using a novel data structure called path-tree, which is a tree structure composed of isomorphic graphs of query regulatory motifs. This algorithm was evaluated using signaling networks of various sizes generated from the integration of various human signaling pathways, and its speed and scalability outperformed those of other algorithms. RMOD includes interactive analysis and auxiliary tools that make it possible to manage the whole process, from building the signaling network and query regulatory motifs to analyzing regulatory motifs with graphical illustrations and summarized descriptions. As a result, RMOD provides an integrated view of the regulatory motifs and the mechanisms underlying their activities within the signaling network. RMOD is freely accessible online at the following URL: http://pks.kaist.ac.kr/rmod.
High content studies that profile mouse and human embryonic stem cells (m/hESCs) using various genome-wide technologies such as transcriptomics and proteomics are constantly being published. However, efforts to integrate such data to obtain a global view of the molecular circuitry in m/hESCs are lagging behind. Here, we present an m/hESC-centered database called Embryonic Stem Cell Atlas from Pluripotency Evidence integrating data from many recent diverse high-throughput studies including chromatin immunoprecipitation followed by deep sequencing, genome-wide inhibitory RNA screens, gene expression microarrays or RNA-seq after knockdown (KD) or overexpression of critical factors, immunoprecipitation followed by mass spectrometry proteomics and phosphoproteomics. The database provides web-based interactive search and visualization tools that can be used to build subnetworks and to identify known and novel regulatory interactions across various regulatory layers. The web-interface also includes tools to predict the effects of combinatorial KDs by additive effects controlled by sliders, or through simulation software implemented in MATLAB. Overall, the Embryonic Stem Cell Atlas from Pluripotency Evidence database is a comprehensive resource for the stem cell systems biology community.
Database URL: http://www.maayanlab.net/ESCAPE
Autism spectrum disorders (ASD) are a group of related neurodevelopmental disorders with significant combined prevalence (~1%) and high heritability. Dozens of individually rare genes and loci associated with high risk for ASD have been identified, which overlap extensively with genes for intellectual disability (ID). However, studies indicate that there may be hundreds of genes that remain to be identified. The advent of inexpensive massively parallel nucleotide sequencing can reveal the genetic underpinnings of heritable complex diseases, including ASD and ID. However, whole exome sequencing (WES) and whole genome sequencing (WGS) provide an embarrassment of riches, where many candidate variants emerge. It has been argued that genetic variation for ASD and ID will cluster in genes involved in distinct pathways and protein complexes. For this reason, computational methods that prioritize candidate genes based on additional functional information such as protein-protein interactions or association with specific canonical or empirical pathways, or other attributes, can be useful. In this study we applied several supervised learning approaches to prioritize ASD or ID disease gene candidates based on curated lists of known ASD and ID disease genes. We implemented two network-based classifiers and one attribute-based classifier to show that we can rank and classify known, and predict new, genes for these neurodevelopmental disorders. We also show that ID and ASD share common pathways that perturb an overlapping synaptic regulatory subnetwork, and that features relating to neuronal phenotypes in mouse knockouts can help in classifying neurodevelopmental genes. Our methods can be applied broadly to other diseases, helping to prioritize newly identified genetic variation that emerges from disease gene discovery based on WES and WGS.
High-throughput sequencing; massively parallel sequencing; gene discovery; networks; pathways; neurodevelopmental disorders; classifiers; support vector machine
A number of key regulators of mouse embryonic stem (ES) cell identity, including the transcription factor Nanog, show strong expression fluctuations at the single cell level. The molecular basis for these fluctuations is unknown. Here we used a genetic complementation strategy to investigate expression changes during transient periods of Nanog downregulation. Employing an integrated approach that includes high-throughput single cell transcriptional profiling and mathematical modelling, we found that early molecular changes subsequent to Nanog loss are stochastic and reversible. However, analysis also revealed that Nanog loss severely compromises the self-sustaining feedback structure of the ES cell regulatory network. Consequently, these nascent changes soon become consolidated to committed fate decisions in the prolonged absence of Nanog. Consistent with this, we found that exogenous regulation of Nanog-dependent feedback control mechanisms produced a more homogeneous ES cell population. Taken together, our results indicate that Nanog-dependent feedback loops have a role in controlling both ES cell fate decisions and population variability.
With the widespread use of combination antiretroviral agents, the incidence of HIV-associated nephropathy has decreased. Currently, HIV-infected patients live much longer and often suffer from comorbidities such as diabetes mellitus. Recent epidemiological studies suggest that concurrent HIV infection and diabetes mellitus may have a synergistic effect on the incidence of chronic kidney disease. To address this, we determined whether HIV-1 transgene expression accelerates diabetic kidney injury using a diabetic HIV-1 transgenic (Tg26) murine model. Diabetes was initially induced with low-dose streptozotocin in both Tg26 and wild-type mice on a C57BL/6 background, which is resistant to classic HIV-associated nephropathy. Although diabetic nephropathy is minimally observed on the C57BL/6 background, diabetic Tg26 mice exhibited a significant increase in glomerular injury compared with nondiabetic Tg26 mice and diabetic wild-type mice. Validation of microarray gene expression analysis from isolated glomeruli showed a significant upregulation of proinflammatory pathways in diabetic Tg26 mice. Thus, our study found that expression of HIV-1 genes aggravates diabetic kidney disease.
diabetic nephropathy; glomerulopathy; HIV
Autism spectrum disorders (ASD) are believed to have genetic and environmental origins, yet in only a modest fraction of individuals can specific causes be identified [1,2]. To identify further genetic risk factors, we assess the role of de novo mutations in ASD by sequencing the exomes of ASD cases and their parents (n = 175 trios). Fewer than half of the cases (46.3%) carry a missense or nonsense de novo variant and the overall rate of mutation is only modestly higher than the expected rate. In contrast, there is significantly enriched connectivity among the proteins encoded by genes harboring de novo missense or nonsense mutations, and excess connectivity to prior ASD genes of major effect, suggesting a subset of observed events are relevant to ASD risk. The small increase in rate of de novo events, when taken together with the connections among the proteins themselves and to ASD, is consistent with an important but limited role for de novo point mutations, similar to that documented for de novo copy number variants. Genetic models incorporating these data suggest that the majority of observed de novo events are unconnected to ASD; those that do confer risk are distributed across many genes and are incompletely penetrant (i.e., not necessarily causal). Our results support polygenic models in which spontaneous coding mutations in any of a large number of genes increase risk by 5- to 20-fold. Despite the challenge posed by such models, results from de novo events and a large parallel case-control study provide strong evidence in favor of CHD8 and KATNAL2 as genuine autism risk factors.
computational biology; drug–drug networks; systems biology
Flow cytometry is a widely used technique for the analysis of cell populations in the study and diagnosis of human diseases. It yields large amounts of high-dimensional data, the analysis of which would clearly benefit from efficient computational approaches aiming at automated diagnosis and decision support. This article presents our analysis of flow cytometry data in the framework of the DREAM6/FlowCAP2 Molecular Classification of Acute Myeloid Leukemia (AML) Challenge, 2011. In the challenge, example data was provided for a set of 179 subjects, comprising healthy donors and 23 cases of AML. The participants were asked to provide predictions with respect to the condition of 180 patients in a test set. We extracted feature vectors from the data in terms of single marker statistics, including characteristic moments, median and interquartile range of the observed values. Subsequently, we applied Generalized Matrix Relevance Learning Vector Quantization (GMLVQ), a machine learning technique which extends standard LVQ by an adaptive distance measure. Our method achieved the best possible performance with respect to the diagnoses of test set patients. The extraction of features from the flow cytometry data is outlined in detail, the machine learning approach is discussed and classification results are presented. In addition, we illustrate how GMLVQ can provide deeper insight into the problem by allowing one to infer the relevance of specific markers and features for the diagnosis.
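The two ingredients named above, single-marker summary statistics and the adaptive GMLVQ distance, can be illustrated with a short sketch. The choice of moments (mean, standard deviation, skewness, kurtosis) and the function names are assumptions made for illustration; the abstract does not list the exact features used. The distance is the standard GMLVQ form, d(x, w) = (x - w)^T Ω^T Ω (x - w), with the matrix Ω supplied by hand here rather than learned.

```python
import statistics


def marker_features(values):
    """Summary statistics for one marker measured across all cells of a subject."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    # Standardized third and fourth moments (assumed choice of "moments").
    if sd > 0:
        skew = sum(((v - mean) / sd) ** 3 for v in values) / n
        kurt = sum(((v - mean) / sd) ** 4 for v in values) / n
    else:
        skew = kurt = 0.0
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return [mean, sd, skew, kurt, statistics.median(values), q3 - q1]


def gmlvq_distance(x, prototype, omega):
    """Squared distance after the linear map omega: |omega (x - prototype)|^2."""
    diff = [xi - wi for xi, wi in zip(x, prototype)]
    projected = [sum(o * d for o, d in zip(row, diff)) for row in omega]
    return sum(p * p for p in projected)
```

A subject's feature vector would be the concatenation of `marker_features` over all markers; in GMLVQ the diagonal of Ω^T Ω is then read as the relevance of each feature.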
Motivation: Genome-wide mRNA profiling provides a snapshot of the global state of cells under different conditions. However, mRNA levels do not provide direct understanding of upstream regulatory mechanisms. Here, we present a new approach called Expression2Kinases (X2K) to identify upstream regulators likely responsible for observed patterns in genome-wide gene expression. By integrating chromatin immuno-precipitation (ChIP)-seq/chip and position weight matrices (PWMs) data, protein–protein interactions and kinase–substrate phosphorylation reactions, we can better identify regulatory mechanisms upstream of genome-wide differences in gene expression. We validated X2K by applying it to recover drug targets of Food and Drug Administration (FDA)-approved drugs from drug perturbations followed by mRNA expression profiling; to map the regulatory landscape of 44 stem cells and their differentiating progeny; to profile upstream regulatory mechanisms of 327 breast cancer tumors; and to detect pathways from profiled hepatic stellate cells and hippocampal neurons. The X2K approach can advance our understanding of cell signaling and unravel drug mechanisms of action.
Availability: The software and source code are freely available at: http://www.maayanlab.net/X2K.
Supplementary information: Supplementary data are available at Bioinformatics online.
Oct4 is a well-known transcription factor that plays fundamental roles in stem cell self-renewal, pluripotency, and somatic cell reprogramming. However, limited information is available on Oct4-associated protein complexes and their intrinsic protein-protein interactions that dictate Oct4's critical regulatory activities. Here we employed an improved affinity purification approach combined with mass spectrometry to purify Oct4 protein complexes in mouse embryonic stem cells (mESCs), and discovered many novel Oct4 partners important for self-renewal and pluripotency of mESCs. Notably, we found that Oct4 is associated with multiple chromatin-modifying complexes with documented as well as newly proved functional significance in stem cell maintenance and somatic cell reprogramming. Our study establishes a solid biochemical basis for genetic and epigenetic regulation of stem cell pluripotency and provides a framework for exploring alternative factor-based reprogramming strategies.
Oct4; ESCs; pluripotency; self-renewal; epigenetic regulation
Cortical efferents growing in the same environment diverge early in development. The expression of particular transcription factors dictates the trajectories taken presumably by regulating responsiveness to guidance cues via cellular mechanisms that are not yet known. Here we show that cortical neurons that are dissociated and grown in culture maintain their cell-type specific identities defined by the expression of transcription factors. Using this model system we sought to identify and characterize mechanisms that are recruited to produce cell-type specific responses to Semaphorin 3A (Sema3A), a guidance cue that would be presented similarly to cortical axons in vivo. Axons from presumptive corticofugal neurons lacking the transcription factor Satb2 and expressing Ctip2 or Tbr1 respond far more robustly to Sema3A than those from presumptive callosal neurons expressing Satb2. Both populations of axons express similar levels of Sema3A receptors (Neuropilin-1, L1CAM and PlexinA4), but significantly, axons from neurons lacking Satb2 internalize more Sema3A and they do so via a raft-mediated endocytic pathway. We used an in silico approach to identify the endocytosis effector Flotillin-1 as a Sema3A signaling candidate. We tested the contributions of Flotillin-1 to Sema3A endocytosis and signaling, and show that raft-mediated Sema3A endocytosis is defined by and depends on the recruitment of Flotillin-1, which mediates LIMK activation, and regulates axon responsiveness to Sema3A in presumptive corticofugal axons.
axon guidance; growth cone; transcription factors; uptake; rafts; cytoskeleton
Kidney fibrosis is a common process that leads to the progression of kidney diseases. We used an integrated computational/experimental systems biology approach to identify upstream protein kinases that regulate gene expression changes in kidneys of HIV-1 transgenic mice (Tg26), which have both tubulo-interstitial fibrosis and glomerulosclerosis. We identified the homeo-domain interacting protein kinase 2 (HIPK2) as a key regulator of kidney fibrosis. HIPK2 was upregulated in kidneys of Tg26 and patients with various kidney diseases. HIV infection increased the protein level of HIPK2 by promoting oxidative stress, which inhibited SIAH1-mediated proteasomal degradation of HIPK2. HIPK2 induced apoptosis and expression of epithelial-mesenchymal trans-differentiation markers in kidney epithelial cells by activating p53, TGF-β/Smad3, and Wnt/Notch pathways. Knockout of HIPK2 improved renal function and attenuated proteinuria and kidney fibrosis in Tg26 as well as in other animal models of kidney fibrosis. We conclude that HIPK2 is a potential target for anti-fibrosis therapy.
HIPK2; tubular epithelial cells; HIV; fibrosis; systems biology
Large-scale collective behaviors such as synchronization and coordination spontaneously arise in many bacterial populations. With systems biology attempting to understand these phenomena, and synthetic biology opening up the possibility of engineering them for our own benefit, there is growing interest in how bacterial populations are best modeled. Here we introduce BSim, a highly flexible agent-based computational tool for analyzing the relationships between single-cell dynamics and population level features. BSim includes reference implementations of many bacterial traits to enable the quick development of new models partially built from existing ones. Unlike existing modeling tools, BSim fully considers spatial aspects of a model allowing for the description of intricate micro-scale structures, enabling the modeling of bacterial behavior in more realistic three-dimensional, complex environments. The new opportunities that BSim opens are illustrated through several diverse examples covering: spatial multicellular computing, modeling complex environments, population dynamics of the lac operon, and the synchronization of genetic oscillators. BSim is open source software that is freely available from http://bsim-bccs.sf.net and distributed under the Open Source Initiative (OSI) recognized MIT license. Developer documentation and a wide range of example simulations are also available from the website. BSim requires Java version 1.6 or higher.
Expression quantitative trait loci (eQTL) mapping is a widely used technique to uncover regulatory relationships between genes. A range of methodologies have been developed to map links between expression traits and genotypes. The DREAM (Dialogue on Reverse Engineering Assessments and Methods) initiative is a community project to objectively assess the relative performance of different computational approaches for solving specific systems biology problems. The goal of one of the DREAM5 challenges was to reverse-engineer genetic interaction networks from synthetic genetic variation and gene expression data, which simulates the problem of eQTL mapping. In this framework, we proposed an approach whose originality resides in the use of a combination of existing machine learning algorithms (committee). Although it was not the best performer, this method was by far the most precise on average. After the competition, we continued in this direction by evaluating other committees using the DREAM5 data and developed a method that relies on Random Forests and LASSO. It achieved a much higher average precision than the DREAM best performer at the cost of slightly lower average sensitivity.
The skeleton of complex systems can be represented as networks where vertices represent entities, and edges represent the relations between these entities. Often it is impossible, or expensive, to determine the network structure by experimental validation of the binary interactions between every vertex pair. It is usually more practical to infer the network from surrogate observations. Network inference is the process by which an underlying network of relations between entities is determined from indirect evidence. While many algorithms have been developed to infer networks from quantitative data, less attention has been paid to methods which infer networks from repeated co-occurrence of entities in related sets. This type of data is ubiquitous in the field of systems biology and in other areas of complex systems research. Hence, such methods would be of great utility and value.
Here we present a general method for network inference from repeated observations of sets of related entities. Given experimental observations of such sets, we infer the underlying network connecting these entities by generating an ensemble of networks consistent with the data. The frequency of occurrence of a given link throughout this ensemble is interpreted as the probability that the link is present in the underlying real network conditioned on the data. Exponential random graphs are used to generate and sample the ensemble of consistent networks, and we take an algorithmic approach to numerically execute the inference method. The effectiveness of the method is demonstrated on synthetic data before employing this inference approach to problems in systems biology and systems pharmacology, as well as to construct a co-authorship collaboration network. We predict direct protein-protein interactions from high-throughput mass-spectrometry proteomics, integrate data from ChIP-seq and loss-of-function/gain-of-function followed by expression data to infer a network of associations between pluripotency regulators, extract a network that connects 53 cancer drugs to each other and to 34 severe adverse events by mining the FDA’s Adverse Event Reporting System (AERS), and construct a co-authorship network that connects Mount Sinai School of Medicine investigators. The predicted networks and online software to create networks from entity-set libraries are provided online at http://www.maayanlab.net/S2N.
The network inference method presented here can be applied to resolve different types of networks in current systems biology and systems pharmacology as well as in other fields of research.
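The ensemble idea described above, where link frequency across many data-consistent networks is read as link probability, can be sketched in a deliberately simplified form. The sketch does not implement the exponential random graph sampler of the paper. It substitutes a much cruder rule, labeled here as an assumption: a network counts as "consistent" if the members of every observed set are connected to each other, and each sample draws one random tree per set. All function names are hypothetical.

```python
import random
from collections import Counter


def sample_consistent_network(entity_sets, rng):
    """Draw one network in which each observed set forms a connected tree.

    Simplifying assumption: consistency means every set is internally connected.
    """
    edges = set()
    for members in entity_sets:
        nodes = sorted(members)  # fixed order first, so results depend only on rng
        rng.shuffle(nodes)
        for i in range(1, len(nodes)):
            partner = nodes[rng.randrange(i)]  # attach to an already placed node
            edges.add(frozenset((nodes[i], partner)))
    return edges


def link_frequencies(entity_sets, n_samples=1000, seed=0):
    """Fraction of sampled networks that contain each link."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        counts.update(sample_consistent_network(entity_sets, rng))
    return {tuple(sorted(edge)): c / n_samples for edge, c in counts.items()}
```

With the observations `[{"A", "B"}, {"A", "B", "C"}]`, the link A-B appears in every sampled network because the two-member set forces it, whereas A-C and B-C appear only in some samples and so receive lower frequencies.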
Systems biology aims to build quantitative models to address unresolved issues in molecular biology. In order to describe the behavior of biological cells adequately, gene regulatory networks (GRNs) are intensively investigated. As the validity of models built for GRNs depends crucially on the kinetic rates, various methods have been developed to estimate these parameters from experimental data. For this purpose, it is favorable to choose the experimental conditions yielding maximal information. However, existing experimental design principles often rely on unfulfilled mathematical assumptions or become computationally demanding with growing model complexity. To solve this problem, we combined advanced methods for parameter and uncertainty estimation with experimental design considerations. As a showcase, we optimized three simulated GRNs in one of the challenges from the Dialogue for Reverse Engineering Assessments and Methods (DREAM). This article presents our approach, which was awarded the best performing procedure at the DREAM6 Estimation of Model Parameters challenge. For fast and reliable parameter estimation, local deterministic optimization of the likelihood was applied. We analyzed identifiability and precision of the estimates by calculating the profile likelihood. Furthermore, the profiles provided a way to uncover a selection of most informative experiments, from which the optimal one was chosen using additional criteria at every step of the design process. In conclusion, we provide a strategy for optimal experimental design and show its successful application on three highly nonlinear dynamic models. Although presented in the context of the GRNs to be inferred for the DREAM6 challenge, the approach is generic and applicable to most types of quantitative models in systems biology and other disciplines.
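The profile likelihood used above to assess identifiability and precision can be shown on a toy model. The sketch assumes a two-parameter decay y(t) = a * exp(-b * t) with known Gaussian noise, which is far simpler than the GRN models of the challenge and is chosen only because the nuisance parameter a can be re-optimized in closed form. The profile of b is the chi-square minimized over a for each fixed b, and the threshold 3.84 is the 95% quantile of the chi-square distribution with one degree of freedom. Function names are hypothetical.

```python
import math


def profile_chi2(b, times, observations, sigma):
    """Chi-square at fixed b after re-optimizing the amplitude a analytically."""
    basis = [math.exp(-b * t) for t in times]
    a_hat = (sum(y * f for y, f in zip(observations, basis))
             / sum(f * f for f in basis))
    chi2 = sum(((y - a_hat * f) / sigma) ** 2
               for y, f in zip(observations, basis))
    return chi2, a_hat


def profile_interval(b_grid, times, observations, sigma, threshold=3.84):
    """Grid-based confidence interval: all b whose profile lies within threshold."""
    profile = [profile_chi2(b, times, observations, sigma)[0] for b in b_grid]
    best = min(profile)
    inside = [b for b, c in zip(b_grid, profile) if c - best <= threshold]
    return min(inside), max(inside)
```

A parameter whose profile never rises above the threshold in one or both directions would be reported as practically non-identifiable, which is the situation that a well-chosen additional experiment is meant to resolve.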
Protein-protein, cell signaling, metabolic, and transcriptional interaction networks are useful for identifying connections between lists of experimentally identified genes/proteins. However, besides physical or co-expression interactions there are many ways in which pairs of genes, or their protein products, can be associated. By systematically incorporating knowledge on shared properties of genes from diverse sources to build functional association networks (FANs), researchers may be able to identify additional functional interactions between groups of genes that are not readily apparent.
Genes2FANs is a web-based tool and a database that utilizes 14 carefully constructed FANs and a large-scale protein-protein interaction (PPI) network to build subnetworks that connect lists of human and mouse genes. The FANs are created from mammalian gene set libraries where mouse genes are converted to their human orthologs. The tool takes as input a list of human or mouse Entrez gene symbols to produce a subnetwork and a ranked list of intermediate genes that are used to connect the query input list. In addition, users can enter any PubMed search term and then the system automatically converts the returned results to gene lists using GeneRIF. This gene list is then used as input to generate a subnetwork from the user’s PubMed query. As a case study, we applied Genes2FANs to connect disease genes from 90 well-studied disorders. We find an inverse correlation between the counts of links connecting disease genes through PPI and links connecting disease genes through FANs, separating diseases into two categories.
Genes2FANs is a useful tool for interpreting the relationships between gene/protein lists in the context of their various functions and networks. Combining functional association interactions with physical PPIs can be useful for revealing new biology and help form hypotheses for further experimentation. Our finding that disease genes in many cancers are mostly connected through PPIs whereas other complex diseases, such as autism and type-2 diabetes, are mostly connected through FANs without PPIs, can guide better strategies for disease gene discovery. Genes2FANs is available at:
Flow cytometry provides multi-dimensional data at the single-cell level. Such data contain information about the cellular heterogeneity of bulk samples, making it possible to correlate single-cell features with phenotypic properties of bulk tissues. Predicting phenotypes from single-cell measurements is a difficult challenge that has not been extensively studied. The 6th Dialogue for Reverse Engineering Assessments and Methods (DREAM6) invited the research community to develop solutions to a computational challenge: classifying acute myeloid leukemia (AML) positive patients and healthy donors using flow cytometry data. DREAM6 provided flow cytometry data for 359 normal and AML samples, and the class labels for half of the samples. Researchers were asked to predict the class labels of the remaining half. This paper describes one solution that was constructed by combining three algorithms: spanning-tree progression analysis of density-normalized events (SPADE), earth mover’s distance, and a nearest-neighbor classifier called Relief. This solution was among the top-performing methods that achieved 100% prediction accuracy.
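Two of the components named above, the earth mover's distance and nearest-neighbor classification, can be combined in a small sketch. It is restricted to a single marker, where the earth mover's distance between two samples equals the area between their empirical cumulative distribution functions. The SPADE step and the Relief classifier of the actual solution are not reproduced; a plain 1-nearest-neighbor rule stands in for them, and all names are hypothetical.

```python
def emd_1d(sample_a, sample_b):
    """Earth mover's distance between two one-dimensional samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    distance = 0.0
    i = j = 0
    for left, right in zip(points, points[1:]):
        # Advance to the number of values <= left in each sample.
        while i < len(a) and a[i] <= left:
            i += 1
        while j < len(b) and b[j] <= left:
            j += 1
        # Area between the two empirical CDFs over [left, right).
        distance += abs(i / len(a) - j / len(b)) * (right - left)
    return distance


def nearest_neighbor_label(query, labeled_samples):
    """Label of the training sample closest to the query under emd_1d."""
    _, label = min(labeled_samples, key=lambda item: emd_1d(query, item[0]))
    return label
```

For example, `emd_1d([0, 2], [1, 3])` evaluates to 1.0, the size of the shift between the two samples.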
Motivation: Network diagrams are commonly used to visualize biochemical pathways by displaying the relationships between genes, proteins, mRNAs, microRNAs, metabolites, regulatory DNA elements, diseases, viruses and drugs. While there are several currently available web-based pathway viewers, there is still room for improvement. To this end, we have developed a Flash-based network viewer (FNV) for the visualization of small to moderately sized biological networks and pathways.
Summary: Written in Adobe ActionScript 3.0, the viewer accepts simple Extensible Markup Language (XML) formatted input files to display pathways in vector graphics on any web page, providing flexible layout options and interactivity with the user through tool tips, hyperlinks and the ability to rearrange nodes on the screen. FNV was utilized as a component in several web-based systems, namely Genes2Networks, Lists2Networks, KEA, ChEA and PathwayGenerator. In addition, FNV can be used to embed pathways inside PDF files for the communication of pathways in soft publication materials.
Availability: FNV is available for use and download along with the supporting documentation and sample networks at http://www.maayanlab.net/FNV.