Big data is becoming ubiquitous in biology, and poses significant challenges in data analysis and interpretation. RNAi screening has become a workhorse of functional genomics, and has been applied, for example, to identify host factors involved in infection for a panel of different viruses. However, the analysis of data resulting from such screens is difficult, with often low overlap between hit lists, even when comparing screens targeting the same virus. This makes it a major challenge to select interesting candidates for further detailed, mechanistic experimental characterization.
To address this problem we propose an integrative bioinformatics pipeline that allows for a network based meta-analysis of viral high-throughput RNAi screens. Initially, we collate a human protein interaction network from various public repositories, which is then subjected to unsupervised clustering to determine functional modules. Modules that are significantly enriched with host dependency factors (HDFs) and/or host restriction factors (HRFs) are then filtered based on network topology and semantic similarity measures. Modules passing all these criteria are finally interpreted for their biological significance using enrichment analysis, and interesting candidate genes can be selected from the modules.
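The module enrichment step described above can be illustrated with a one-sided hypergeometric test. This is a minimal sketch, not the authors' pipeline: the abstract does not name the enrichment statistic, so the hypergeometric test, the function names (`hypergeom_enrichment_p`, `score_modules`) and the toy gene identifiers are all assumptions made for illustration. Multiple-testing correction and the topology and semantic-similarity filters are omitted.

```python
from math import comb


def hypergeom_enrichment_p(population_size, hits_in_population,
                           module_size, hits_in_module):
    """Upper-tail probability P(X >= hits_in_module) for a hypergeometric X.

    X counts hits when module_size genes are drawn without replacement
    from a population that contains hits_in_population hits.
    """
    upper = min(module_size, hits_in_population)
    total = comb(population_size, module_size)
    tail = sum(
        comb(hits_in_population, k)
        * comb(population_size - hits_in_population, module_size - k)
        for k in range(hits_in_module, upper + 1)
    )
    return tail / total


def score_modules(modules, hits, background):
    """Return {module name: enrichment p-value} (hypothetical helper).

    modules:    dict mapping a module name to a set of gene identifiers
    hits:       set of hit genes (e.g. HDFs and/or HRFs)
    background: set of all genes in the interaction network
    """
    hits = hits & background
    p_values = {}
    for name, genes in modules.items():
        genes = genes & background
        p_values[name] = hypergeom_enrichment_p(
            len(background), len(hits), len(genes), len(genes & hits)
        )
    return p_values


if __name__ == "__main__":
    # Toy data: 20 genes, 5 hits, two modules of 4 genes each.
    background = {f"g{i}" for i in range(20)}
    hits = {f"g{i}" for i in range(5)}
    modules = {
        "m1": {"g0", "g1", "g2", "g10"},    # contains 3 hits
        "m2": {"g11", "g12", "g13", "g14"},  # contains no hits
    }
    for name, p in sorted(score_modules(modules, hits, background).items()):
        print(name, round(p, 4))
```

In practice the p-values would be corrected for the number of modules tested before any module is passed on to the downstream filters.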
We apply our approach to seven screens targeting three different viruses, and compare results with other published meta-analyses of viral RNAi screens. We recover key hit genes, and identify additional candidates from the screens. While we demonstrate the application of the approach using viral RNAi data, the method is generally applicable to identify underlying mechanisms from hit lists derived from high-throughput experimental data, and to select a small number of most promising genes for further mechanistic studies.
Electronic supplementary material
The online version of this article (doi:10.1186/s13015-015-0035-7) contains supplementary material, which is available to authorized users.
Network analysis; RNAi screening; Virus-host interactions
Changes in the airway microbiome may be important in the pathophysiology of chronic lung disease in patients with cystic fibrosis. However, little is known about the microbiome in early cystic fibrosis lung disease and the relationship between the microbiomes from different niches in the upper and lower airways. Therefore, in this cross-sectional study, we examined the relationship between the microbiome in the upper (nose and throat) and lower (sputum) airways from children with cystic fibrosis using next generation sequencing. Our results demonstrate a significant difference in both α and β-diversity between the nose and the two other sampling sites. The nasal microbiome was characterized by a polymicrobial community while the throat and sputum communities were less diverse and dominated by a few operational taxonomic units. Moreover, sputum and throat microbiomes were closely related especially in patients with clinically stable lung disease. There was a high inter-individual variability in sputum samples primarily due to a decrease in evenness linked to increased abundance of potential respiratory pathogens such as Pseudomonas aeruginosa. Patients with chronic Pseudomonas aeruginosa infection exhibited a less diverse sputum microbiome. A high concordance was found between pediatric and adult sputum microbiomes except that Burkholderia was only observed in the adult cohort. These results indicate that an adult-like lower airways microbiome is established early in life and that throat swabs may be a good surrogate in clinically stable children with cystic fibrosis without chronic Pseudomonas aeruginosa infection in whom sputum sampling is often not feasible.
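The notions of α-diversity and evenness used above can be made concrete with a short calculation. This is a sketch under stated assumptions: the abstract does not say which indices were used, so the Shannon index and Pielou's evenness are chosen here as common examples, and the OTU counts are invented toy values.

```python
from math import log


def shannon_diversity(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


def pielou_evenness(counts):
    """Pielou's evenness J = H / ln(S), with S the number of observed taxa."""
    observed = sum(1 for c in counts if c > 0)
    if observed < 2:
        return 0.0  # evenness is not informative for a single taxon
    return shannon_diversity(counts) / log(observed)


if __name__ == "__main__":
    even_sample = [25, 25, 25, 25]     # toy polymicrobial community
    dominated_sample = [97, 1, 1, 1]   # toy community dominated by one OTU
    for label, sample in (("even", even_sample), ("dominated", dominated_sample)):
        print(label, shannon_diversity(sample), pielou_evenness(sample))
```

A sample dominated by a single taxon has the same richness as the even sample here but a much lower evenness, which is the pattern the abstract links to increased abundance of potential pathogens.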
Hepatitis C virus (HCV) is a major cause of chronic liver disease affecting around 130 million people worldwide. While great progress has been made to define the principal steps of the viral life cycle, detailed knowledge of how HCV interacts with its host cells is still limited. To overcome this limitation we conducted a comprehensive whole-virus RNA interference-based screen and identified 40 host dependency and 16 host restriction factors involved in HCV entry/replication or assembly/release. Of these factors, heterogeneous nuclear ribonucleoprotein K (HNRNPK) was found to suppress HCV particle production without affecting viral RNA replication. This suppression of virus production was specific to HCV, independent of assembly competence and genotype, and not found with the related Dengue virus. By using a knock-down rescue approach we identified the domains within HNRNPK required for suppression of HCV particle production. Importantly, HNRNPK was found to interact specifically with HCV RNA and this interaction was impaired by mutations that also reduced the ability to suppress HCV particle production. Finally, we found that in HCV-infected cells, subcellular distribution of HNRNPK was altered; the protein was recruited to sites in close proximity of lipid droplets and colocalized with core protein as well as HCV plus-strand RNA, which was not the case with HNRNPK variants unable to suppress HCV virion formation. These results suggest that HNRNPK might determine efficiency of HCV particle production by limiting the availability of viral RNA for incorporation into virions. This study adds a new function to HNRNPK that acts as a central hub in the replication cycle of multiple other viruses.
As obligate intracellular parasites with limited gene-coding capacity, viruses exploit host cell machineries for the sake of efficient replication and spread. Thus, identification of these cellular machineries and factors is necessary to understand how a given virus achieves efficient replication and eventually causes host cell damage. Hepatitis C virus (HCV) is an RNA virus replicating in the cytoplasm of hepatocytes. While viral proteins have been studied in great detail, our knowledge about how host cell factors are used by HCV for efficient replication and spread is still scarce. In the present study we conducted a comprehensive RNA-interference-based screen and identified 40 genes that promote the HCV lifecycle and 16 genes that suppress it. Follow-up studies revealed that one of these genes, the heterogeneous nuclear ribonucleoprotein K (HNRNPK), selectively suppresses production of infectious HCV particles. We mapped the domains of HNRNPK required for this suppression and demonstrate that this protein selectively binds to the HCV RNA genome. Based on the correlation between suppression of virus production, HCV RNA binding and recruitment to lipid droplets, we propose that HNRNPK might limit the amount of viral RNA genomes available for incorporation into virus particles. This study provides novel insights into the complexity of reactions that are involved in the formation of HCV virions.
Hepatitis C virus (HCV) predominantly infects human hepatocytes, although extrahepatic virus reservoirs are being discussed. Infection of cells is initiated via cell-free and direct cell-to-cell transmission routes. Cell type-specific determinants of HCV entry and RNA replication have been reported. Moreover, several host factors required for synthesis and secretion of lipoproteins from liver cells, in part expressed in tissue-specific fashion, have been implicated in HCV assembly. However, the minimal cell type-specific requirements for HCV assembly have remained elusive. Here we report that production of HCV trans-complemented particles (HCVTCP) from nonliver cells depends on ectopic expression of apolipoprotein E (ApoE). For efficient virus production by full-length HCV genomes, microRNA 122 (miR-122)-mediated enhancement of RNA replication is additionally required. Typical properties of cell culture-grown HCV (HCVcc) particles from ApoE-expressing nonliver cells are comparable to those of virions derived from human hepatoma cells, although specific infectivity of virions is modestly reduced. Thus, apolipoprotein B (ApoB), microsomal triglyceride transfer protein (MTTP), and apolipoprotein C1 (ApoC1), previously implicated in HCV assembly, are dispensable for production of infectious HCV. In the absence of ApoE, release of core protein from infected cells is reduced, and production of extracellular as well as intracellular infectivity is ablated. Since envelopment of capsids was not impaired, we conclude that ApoE acts after capsid envelopment but prior to secretion of infectious HCV. Remarkably, the lack of ApoE also abrogated direct HCV cell-to-cell transmission. These findings highlight ApoE as a host factor codetermining HCV tissue tropism due to its involvement in a late assembly step and viral cell-to-cell transmission.
Network inference deals with the reconstruction of molecular networks from experimental data. Given N molecular species, the challenge is to find the underlying network. Due to data limitations, this typically is an ill-posed problem, and requires the integration of prior biological knowledge or strong regularization. We here focus on the situation when time-resolved measurements of a system’s response after systematic perturbations are available.
We present a novel method to infer signaling networks from time-course perturbation data. We utilize dynamic Bayesian networks with probabilistic Boolean threshold functions to describe protein activation. The model posterior distribution is analyzed using evolutionary MCMC sampling and subsequent clustering, resulting in probability distributions over alternative networks. We evaluate our method on simulated data, and study its performance with respect to data set size and levels of noise. We then use our method to study EGF-mediated signaling in the ERBB pathway.
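To illustrate what a probabilistic Boolean threshold function might look like, the sketch below computes the activation probability of a node from its parents and performs one synchronous update under perturbation. It is an illustration only: the exact functional form, the `noise` parameter, the weights and the node names (`stimulus`, `receptor`, `kinase`) are assumptions and are not taken from the method itself; the Bayesian scoring and evolutionary MCMC sampling are not shown.

```python
def activation_probability(state, weights, threshold, noise=0.05):
    """Probability that a node is active at the next time point.

    state:     dict mapping node name -> 0 or 1 at the current time point
    weights:   dict mapping parent name -> signed weight (negative = inhibition)
    threshold: the node switches on when the weighted parent input reaches it
    noise:     assumed probability of deviating from the threshold rule
    """
    drive = sum(w * state[parent] for parent, w in weights.items())
    return 1.0 - noise if drive >= threshold else noise


def most_likely_step(state, network, perturbed=frozenset(), noise=0.05):
    """Synchronous update returning the most probable next state.

    network:   dict mapping node -> (weights, threshold); nodes that are
               absent from it (external inputs) keep their current value
    perturbed: nodes held at 0, mimicking a knockdown or inhibitor
    """
    nxt = dict(state)
    for node, (weights, threshold) in network.items():
        if node in perturbed:
            nxt[node] = 0
        else:
            p_active = activation_probability(state, weights, threshold, noise)
            nxt[node] = int(p_active > 0.5)
    return nxt


if __name__ == "__main__":
    # Hypothetical three-node cascade: stimulus -> receptor -> kinase.
    network = {
        "receptor": ({"stimulus": 1.0}, 1.0),
        "kinase": ({"receptor": 1.0}, 1.0),
    }
    state = {"stimulus": 1, "receptor": 0, "kinase": 0}
    for t in range(3):
        print(t, state)
        state = most_likely_step(state, network)
```

Holding `receptor` at 0 via `perturbed` keeps `kinase` off at every time point, which is the kind of time-resolved perturbation response that carries information about the underlying edges.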
Dynamic Probabilistic Threshold Networks is a new method to infer signaling networks from time-series perturbation data. It exploits the dynamic response of a system after external perturbation for network reconstruction. On simulated data, we show that the approach outperforms current state-of-the-art methods. On the ERBB data, our approach recovers a significant fraction of the known interactions, and predicts novel mechanisms in the ERBB pathway.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2105-15-250) contains supplementary material, which is available to authorized users.
As an RNA virus, hepatitis C virus (HCV) is able to rapidly acquire drug resistance, and for this reason the design of effective anti-HCV drugs is a real challenge. The HCV subgenomic replicon-containing cells are widely used for experimental studies of the HCV genome replication mechanisms, for drug testing in vitro and in studies of HCV drug resistance. The NS3/4A protease is essential for virus replication and, therefore, it is one of the most attractive targets for developing specific antiviral agents against HCV. We have developed a stochastic model of subgenomic HCV replicon replication, in which the emergence and selection of drug resistant mutant viral RNAs in replicon cells is taken into account. Incorporation into the model of key NS3 protease mutations leading to resistance to BILN-2061 (A156T, D168V, R155Q), VX-950 (A156S, A156T, T54A) and SCH 503034 (A156T, A156S, T54A) inhibitors allows us to describe the long term dynamics of the viral RNA suppression for various inhibitor concentrations. We theoretically showed that the observable difference between the viral RNA kinetics for different inhibitor concentrations can be explained by differences in the replication rate and inhibitor sensitivity of the mutant RNAs. The pre-existing mutants of the NS3 protease contribute more significantly to appearance of new resistant mutants during treatment with inhibitors than wild-type replicon. The model can be used to interpret the results of anti-HCV drug testing on replicon systems, as well as to estimate the efficacy of potential drugs and predict optimal schemes of their usage.
Virus infection-induced global protein synthesis suppression is linked to assembly of stress granules (SGs), cytosolic aggregates of stalled translation preinitiation complexes. To study long-term stress responses, we developed an imaging approach for extended observation and analysis of SG dynamics during persistent hepatitis C virus (HCV) infection. In combination with type 1 interferon, HCV infection induces highly dynamic assembly/disassembly of cytoplasmic SGs, concomitant with phases of active and stalled translation, delayed cell division, and prolonged cell survival. Double-stranded RNA (dsRNA), independent of viral replication, is sufficient to trigger these oscillations. Translation initiation factor eIF2α phosphorylation by protein kinase R mediates SG formation and translation arrest. This is antagonized by the upregulation of GADD34, the regulatory subunit of protein phosphatase 1 dephosphorylating eIF2α. Stress response oscillation is a general mechanism to prevent long-lasting translation repression and a conserved host cell reaction to multiple RNA viruses, which HCV may exploit to establish persistence.
Altered DNA methylation patterns represent an attractive mechanism for understanding the phenotypic changes associated with human aging. Several studies have described global and complex age-related methylation changes, but their structural and functional significance has remained largely unclear.
We have used transcriptome sequencing to characterize age-related gene expression changes in the human epidermis. The results revealed a significant set of 75 differentially expressed genes with a strong functional relationship to skin homeostasis. We then used whole-genome bisulfite sequencing to identify age-related methylation changes at single-base resolution. Data analysis revealed no global aberrations, but rather highly localized methylation changes, particularly in promoter and enhancer regions that were associated with altered transcriptional activity.
Our results suggest that the core developmental program of human skin is stably maintained through the aging process and that aging is associated with a limited destabilization of the epigenome at gene regulatory elements.
Aging; DNA methylation; Epidermis; Methylome sequencing; Transcriptome sequencing
Oligodendroglial tumors form a distinct subgroup of gliomas, characterized by a better response to treatment and prolonged overall survival. Most oligodendrogliomas and also some oligoastrocytomas are characterized by a unique and typical unbalanced translocation, der(1,19), resulting in a 1p/19q co-deletion. Candidate tumor suppressor genes targeted by these losses, CIC on 19q13.2 and FUBP1 on 1p31.1, were only recently discovered. We analyzed 17 oligodendrogliomas and oligoastrocytomas by applying a comprehensive approach consisting of RNA expression analysis, DNA sequencing of CIC, FUBP1, IDH1/2, and array CGH. We confirmed three different genetic subtypes in our samples: i) the “oligodendroglial” subtype with 1p/19q co-deletion in twelve out of 17 tumors; ii) the “astrocytic” subtype in three tumors; iii) the “other” subtype in two tumors. All twelve tumors with the 1p/19q co-deletion carried the most common IDH1 R132H mutation. In seven of these tumors, we found protein-disrupting point mutations in the remaining allele of CIC, four of which are novel. One of these tumors also had a deleterious mutation in FUBP1. Only by integrating RNA expression and array CGH data, were we able to discover an exon-spanning homozygous microdeletion within the remaining allele of CIC in an additional tumor with 1p/19q co-deletion. Therefore we propose that the mutation rate might be underestimated when looking at sequence variants alone. In conclusion, the high frequency and the spectrum of CIC mutations in our 1p/19q-codeleted tumor cohort support the hypothesis that CIC acts as a tumor suppressor in these tumors, whereas FUBP1 might play only a minor role.
Hepatitis C virus (HCV) infection develops into chronicity in 80% of all patients, characterized by persistent low-level replication. To understand how the virus establishes its tightly controlled intracellular RNA replication cycle, we developed the first detailed mathematical model of the initial dynamic phase of the intracellular HCV RNA replication. We therefore quantitatively measured viral RNA and protein translation upon synchronous delivery of viral genomes to host cells, and thoroughly validated the model using additional, independent experiments. Model analysis was used to predict the efficacy of different classes of inhibitors and identified sensitive substeps of replication that could be targeted by current and future therapeutics. A protective replication compartment proved to be essential for sustained RNA replication, balancing translation versus replication and thus effectively limiting RNA amplification. The model predicts that host factors involved in the formation of this compartment determine cellular permissiveness to HCV replication. In gene expression profiling, we identified several key processes potentially determining cellular HCV replication efficiency.
Hepatitis C is a severe disease and a prime cause for liver transplantation. Up to 3% of the world's population are chronically infected with its causative agent, the Hepatitis C virus (HCV). This capacity to establish persistent infection lasting for decades sets HCV apart from other plus-strand RNA viruses typically causing acute, self-limiting infections. A prerequisite for its capacity to persist is HCV's complex and tightly regulated intracellular replication strategy. In this study, we therefore wanted to develop a comprehensive understanding of the molecular processes governing HCV RNA replication in order to pinpoint the most vulnerable substeps in the viral life cycle. For that purpose, we used a combination of biological experiments and mathematical modeling. Using the model to study HCV's replication strategy, we recognized diverse but crucial roles for the membranous replication compartment of HCV in regulating RNA amplification. We further predict the existence of an essential limiting host factor (or function) required for establishing active RNA replication and thereby determining cellular permissiveness for HCV. Our model also proved valuable to understand and predict the effects of pharmacological inhibitors of HCV and might be a solid basis for the development of similar models for other plus-strand RNA viruses.
Perturbation experiments, for example using RNA interference (RNAi), offer an attractive way to elucidate gene function in a high-throughput fashion. The placement of hit genes in their functional context and the inference of underlying networks from such data, however, are challenging tasks. One of the problems in network inference is the exponential number of possible network topologies for a given number of genes. Here, we introduce a novel mathematical approach to address this question. We formulate network inference as a linear optimization problem, which can be solved efficiently even for large-scale systems. We use simulated data to evaluate our approach, and show improved performance in particular on larger networks over state-of-the-art methods. We achieve increased sensitivity and specificity, as well as a significant reduction in computing time. Furthermore, we show superior performance on noisy data. We then apply our approach to study the intracellular signaling of human primary naïve CD4+ T-cells, as well as ErbB signaling in trastuzumab-resistant breast cancer cells. In both cases, our approach recovers known interactions and points to additional relevant processes. In ErbB signaling, our results predict an important role of negative and positive feedback in controlling the cell cycle progression.
Viruses are extremely heterogeneous entities; the size and the nature of their genetic information, as well as the strategies employed to amplify and propagate their genomes, are highly variable. However, as obligatory intracellular parasites, replication of all viruses relies on the host cell. Having co-evolved with their host for several million years, viruses have developed very sophisticated strategies to hijack cellular factors that promote virus uptake, replication, and spread. Identification of host cell factors (HCFs) required for these processes is a major challenge for researchers, but it enables the identification of new, highly selective targets for antiviral therapeutics. To this end, the establishment of platforms enabling genome-wide high-throughput RNA interference (HT-RNAi) screens has led to the identification of several key factors involved in the viral life cycle. A number of genome-wide HT-RNAi screens have been performed for major human pathogens. These studies enable first inter-viral comparisons related to HCF requirements. Although several cellular functions appear to be uniformly required for the life cycle of most viruses tested (such as the proteasome and the Golgi-mediated secretory pathways), some factors, like the lipid kinase Phosphatidylinositol 4-kinase IIIα in the case of hepatitis C virus, are selectively required for individual viruses. However, despite the amount of data available, we are still far away from a comprehensive understanding of the interplay between viruses and host factors. Major limitations towards this goal are the low sensitivity and specificity of such screens, resulting in limited overlap between different screens performed with the same virus. This review focuses on how statistical and bioinformatic analysis methods applied to HT-RNAi screens can help overcome these issues, thus increasing the reliability and impact of such studies.
RNA interference; High-throughput; Cell population; Dependency factors; Bioinformatics; Human immunodeficiency virus; Hepatitis C virus; Dengue virus; Viral infection; Virus-host interactions
Oligodendroglioma poses a biological conundrum for malignant adult human gliomas: it is a tumor type that is universally incurable for patients, and yet, only a few of the human tumors have been established as cell populations in vitro or as intracranial xenografts in vivo. Their survival, thus, may emerge only within a specific environmental context. To determine the fate of human oligodendroglioma in an experimental model, we studied the development of an anaplastic tumor after intracranial implantation into enhanced green fluorescent protein (eGFP) positive NOD/SCID mice. Remarkably after nearly nine months, the tumor not only engrafted, but it also retained classic histological and genetic features of human oligodendroglioma, in particular cells with a clear cytoplasm, showing an infiltrative growth pattern, and harboring mutations of IDH1 (R132H) and of the tumor suppressor genes, FUBP1 and CIC. The xenografts were highly invasive, exhibiting a distinct migration and growth pattern around neurons, especially in the hippocampus, and following white matter tracts of the corpus callosum with tumor cells accumulating around established vasculature. Although tumors exhibited a high growth fraction in vivo, neither cells from the original patient tumor nor the xenograft exhibited significant growth in vitro over a six-month period. This glioma xenograft is the first to display a pure oligodendroglioma histology and expression of R132H. The unexpected property, that the cells fail to grow in vitro even after passage through the mouse, allows us to uniquely investigate the relationship of this oligodendroglioma with the in vivo microenvironment.
miRNA cluster miR-17-92 is known as oncomir-1 due to its potent oncogenic function. miR-17-92 is a polycistronic cluster that encodes 6 miRNAs, and can both facilitate and inhibit cell proliferation. Known targets of miRNAs encoded by this cluster are largely regulators of cell cycle progression and apoptosis. Here, we show that miRNAs encoded by this cluster and sharing the seed sequence of miR-17 exert their influence on one of the most essential cellular processes – endocytic trafficking. By mRNA expression analysis we identified that regulation of endocytic trafficking by miR-17 can potentially be achieved by targeting of a number of trafficking regulators. We have thoroughly validated TBC1D2/Armus, a GAP of Rab7 GTPase, as a novel target of miR-17. Our study reveals regulation of endocytic trafficking as a novel function of miR-17, which might act cooperatively with other functions of miR-17 and related miRNAs in health and disease.
Using a genome-wide screening approach, we have established the genetic requirements for proper telomere structure in Saccharomyces cerevisiae. We uncovered 112 genes, many of which have not previously been implicated in telomere function, that are required to form a fold-back structure at chromosome ends. Among other biological processes, lysine deacetylation, through the Rpd3L, Rpd3S, and Hda1 complexes, emerged as being a critical regulator of telomere structure. The telomeric-bound protein, Rif2, was also found to promote a telomere fold-back through the recruitment of Rpd3L to telomeres. In the absence of Rpd3 function, telomeres have an increased susceptibility to nucleolytic degradation, telomere loss, and the initiation of premature senescence, suggesting that an Rpd3-mediated structure may have protective functions. Together these data reveal that multiple genetic pathways may directly or indirectly impinge on telomere structure, thus broadening the potential targets available to manipulate telomere function.
Impaired telomere elongation eventually results in telomere dysfunction and can lead to diseases such as dyskeratosis congenita, which is associated with bone-marrow failure and pulmonary fibrosis. Cancer cells require continuous telomere maintenance to ensure continued cellular proliferation. Therefore the regulation of telomere function, both positively (in the case of dyskeratosis congenita) and negatively (for cancer), may be of therapeutic benefit. In this study we have used yeast to determine which genetic factors are important for a certain telomeric structure (the loop structure), which may help to maintain chromosome ends in a protected state. We found that multiple genetic factors and pathways affect telomere structure, ranging from metabolic signaling to specific telomere-binding proteins. We found that proper chromatin structure at the telomere is essential to maintain a telomere fold-back structure. Importantly, there was a strong correlation between telomere structure and function, as the mutants found in our screen (looping defective) were often associated with rapid senescence and telomere dysfunction phenotypes. We believe that, through the regulation of the various genetic pathways uncovered in our screen, one may be able to both positively and negatively influence telomere function.
Hepatitis C virus (HCV) is a major causative agent of chronic liver disease in humans. To gain insight into host factor requirements for HCV replication we performed a siRNA screen of the human kinome and identified 13 different kinases, including phosphatidylinositol-4 kinase III alpha (PI4KIIIα), as required for HCV replication. Consistent with elevated levels of the PI4KIIIα product phosphatidylinositol-4-phosphate (PI4P) detected in HCV infected cultured hepatocytes and liver tissue from chronic hepatitis C patients, the enzymatic activity of PI4KIIIα was critical for HCV replication. Viral nonstructural protein 5A (NS5A) was found to interact with PI4KIIIα and stimulate its kinase activity. The absence of PI4KIIIα activity induced a dramatic change in the ultrastructural morphology of the membranous HCV replication complex. Our analysis suggests that the direct activation of a lipid kinase by HCV NS5A contributes critically to the integrity of the membranous viral replication complex.
Hepatitis C virus (HCV) has infected around 160 million individuals. Current therapies have limited efficacy and are fraught with side effects. To identify cellular HCV dependency factors, possible therapeutic targets, we manipulated signaling cascades with pathway-specific inhibitors. Using this approach we identified the MAPK/ERK regulated, cytosolic, calcium-dependent, group IVA phospholipase A2 (PLA2G4A) as a novel HCV dependency factor. Inhibition of PLA2G4A activity reduced core protein abundance at lipid droplets, core envelopment and secretion of particles. Moreover, released particles displayed aberrant protein composition and were 100-fold less infectious. Exogenous addition of arachidonic acid, the cleavage product of PLA2G4A-catalyzed lipolysis, but not other related poly-unsaturated fatty acids, restored infectivity. Strikingly, production of infectious Dengue virus, a relative of HCV, was also dependent on PLA2G4A. These results highlight previously unrecognized parallels in the assembly pathways of these human pathogens, and define PLA2G4A-dependent lipolysis as a crucial prerequisite for production of highly infectious viral progeny.
The human genome encodes more than 30 phospholipase A2s. These enzymes cleave fatty acids at the C2 atom of phosphoglycerides and thus modulate membrane properties. Among all PLA2s only PLA2G4A, which is recruited to perinuclear membranes by Ca2+ and activated by extracellular stimuli via the mitogen activated protein kinase pathway, specifically cleaves lipids with arachidonic acid. Metabolism of arachidonic acid yields prostaglandins and leukotrienes, important lipid mediators of inflammation. We show that inhibition of PLA2G4A produces aberrant HCV particles and that infectivity is rescued by addition of arachidonic acid. Our results suggest that a specific lipid (arachidonic acid) is essential for production of highly infectious HCV progeny, likely by creating a membrane environment conducive for efficient incorporation of crucial host and viral factors into the lipid envelope of nascent particles. Strikingly, PLA2G4A is also essential for production of highly infectious Dengue Virus (DENV) particles but not for vesicular stomatitis virus (VSV). These observations argue that HCV and DENV, which unlike VSV produce particles at intracellular membranes, usurp a common host factor (PLA2G4A) for assembly of highly infectious progeny. These findings open new perspectives for antiviral intervention and highlight thus far unrecognized parallels in the assembly pathway of HCV and DENV.
Network inference deals with the reconstruction of biological networks from experimental data. A variety of different reverse engineering techniques are available; they differ in the underlying assumptions and mathematical models used. One common problem for all approaches stems from the complexity of the task, due to the combinatorial explosion of different network topologies for increasing network size. To handle this problem, constraints are frequently used, for example on the node degree, number of edges, or constraints on regulation functions between network components. We propose to exploit topological considerations in the inference of gene regulatory networks. Such systems are often controlled by a small number of hub genes, while most other genes have only limited influence on the network's dynamics. We model gene regulation using a Bayesian network with discrete, Boolean nodes. A hierarchical prior is employed to identify hub genes. The first layer of the prior is used to regularize weights on edges emanating from one specific node. A second prior on hyperparameters controls the magnitude of the former regularization for different nodes. The net effect is that central nodes tend to form in reconstructed networks. Network reconstruction is then performed by maximization of or sampling from the posterior distribution. We evaluate our approach on simulated and real experimental data, indicating that we can reconstruct main regulatory interactions from the data. We furthermore compare our approach to other state-of-the-art methods, showing superior performance in identifying hubs. Using a large publicly available dataset of over 800 cell cycle regulated genes, we are able to identify several main hub genes. Our method may thus provide a valuable tool to identify interesting candidate genes for further study. Furthermore, the approach presented may stimulate further developments in regularization methods for network reconstruction from data.
High-content, high-throughput RNA interference (RNAi) offers unprecedented possibilities to elucidate gene function and involvement in biological processes. Microscopy-based screening allows phenotypic observations at the level of individual cells. It was recently shown that a cell's population context significantly influences results. However, standard analysis methods for cellular screens do not currently take individual cell data into account unless this is important for the phenotype of interest, for example when studying cell morphology.
We present a method that normalizes and statistically scores microscopy-based RNAi screens, exploiting individual cell information of hundreds of cells per knockdown. Each cell's individual population context is employed in normalization. We present results on two infection screens for hepatitis C and dengue virus, both showing considerable effects on observed phenotypes due to population context. In addition, we show on a non-virus screen that these effects can also be found in RNAi data in the absence of any virus. Using our approach to normalize against these effects, we achieve improved performance in comparison to an analysis without this normalization and hit scoring strategy. Furthermore, our approach results in the identification of considerably more significantly enriched pathways in hepatitis C virus replication than a standard analysis approach.
Using a cell-based analysis and normalization for population context, we achieve improved sensitivity and specificity not only at the level of individual proteins, but especially at the pathway level. This leads to the identification of new host dependency factors of the hepatitis C and dengue viruses and higher reproducibility of results.
Autoimmune pancreatitis (AIP) is thought to be an immune-mediated inflammatory process, directed against the epithelial components of the pancreas.
In order to explore key targets of the inflammatory process we analysed the expression of proteins at the RNA and protein level using genomics and proteomics, immunohistochemistry, Western blot and immunoassay. An animal model of AIP with LP-BM5 murine leukemia virus infected mice was studied in parallel. RNA microarrays of pancreatic tissue from 12 patients with AIP were compared to those of 8 patients with non-AIP chronic pancreatitis (CP).
Expression profiling revealed 272 upregulated genes, including those encoding immunoglobulins, chemokines and their receptors, and 86 downregulated genes, including those for pancreatic proteases such as three trypsinogen isoforms. Protein profiling showed that the expression of trypsinogens and other pancreatic enzymes was greatly reduced. Immunohistochemistry demonstrated a near-loss of trypsin positive acinar cells, which was also confirmed by Western blotting. The serum of AIP patients contained high titres of autoantibodies against the trypsinogens PRSS1 and PRSS2, but not against PRSS3. In addition, there were autoantibodies against the trypsin inhibitor PSTI (the product of the SPINK1 gene). In the pancreas of AIP animals we found similar protein patterns and a reduction in trypsinogen.
These data indicate that the immune-mediated process characterizing AIP involves pancreatic acinar cells and their secretory enzymes such as trypsin isoforms. Demonstration of trypsinogen autoantibodies may be helpful for the diagnosis of AIP.
autoimmune pancreatitis; chronic pancreatitis; trypsinogen; proteomics; transcriptomics; autoantibody
Motivation: Detecting human proteins that are involved in virus entry and replication is facilitated by modern high-throughput RNAi screening technology. However, hit lists from different laboratories have shown little consistency. This may be caused not only by experimental discrepancies, but also by possibilities of data analysis that have not been fully explored. We wanted to improve the reliability of such screens by combining a population analysis of infected cells with an established dye intensity readout.
Results: Viral infection is mainly spread by cell–cell contacts, and clustering of infected cells can be observed during spreading of the infection in situ and in vivo. We employed this clustering feature to identify knockdowns that impair the infection efficiency of human hepatitis C virus. Images of knocked-down cells for 719 human kinase genes were analyzed with an established point pattern analysis method (Ripley's K-function) to detect knockdowns in which virally infected cells did not show any clustering and were therefore hindered from spreading the infection to neighboring cells. The results were compared with a statistical analysis using a common intensity readout of the GFP-expressing viruses and a luciferase-based secondary screen, yielding five promising host factors that may be suitable as targets for drug therapy.
Conclusion: We report an alternative analysis method for high-throughput imaging screens to detect host factors relevant to the infection efficiency of viruses. The method is generic and has the potential to be used for a large variety of different viruses and treatments screened by imaging techniques.
Supplementary information: Supplementary data are available at Bioinformatics online.
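The point pattern analysis mentioned in the Results can be illustrated with a minimal sketch of a naive Ripley's K estimator. The version below omits edge correction and assumes that infected-cell centroids are already available as (x, y) coordinates; the function name `ripley_k` and its parameters are hypothetical and are not taken from the published method.

```python
import math


def ripley_k(points, radius, area):
    """Naive Ripley's K estimate without edge correction (illustrative sketch).

    points: list of (x, y) centroids of infected cells (assumed input).
    radius: distance r at which K is evaluated.
    area:   area of the observation window, in squared coordinate units.
    """
    n = len(points)
    if n < 2:
        raise ValueError("at least two points are required")
    pairs_within = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct points that lie within radius r.
            if i != j and math.hypot(xi - xj, yi - yj) <= radius:
                pairs_within += 1
    # Normalize by the point density so that patterns of different size are comparable.
    return area * pairs_within / (n * (n - 1))
```

Under complete spatial randomness, K(r) is approximately pi * r^2. Values well above this reference indicate clustering at scale r, while values close to it are consistent with infected cells that do not cluster.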
Human immunodeficiency virus type 1 (HIV-1) group M viruses have achieved a global distribution, while HIV-1 group O viruses are endemic only in particular regions of Africa. Here, we evaluated biological characteristics of group O and group M viruses in ex vivo models of HIV-1 infection. The replicative capacity and ability to induce CD4 T-cell depletion of eight group O and seven group M primary isolates were monitored in cultures of human peripheral blood mononuclear cells and tonsil explants. Comparative and longitudinal infection studies revealed HIV-1 group-specific activity patterns: CCR5-using (R5) viruses from group M varied considerably in their replicative capacity but showed similar levels of cytopathicity. In contrast, R5 isolates from group O were relatively uniform in their replicative fitness but displayed a high and unprecedented variability in their potential to deplete CD4 T cells. Two R5 group O isolates were identified that cause massive depletion of CD4 T cells, to an extent comparable to CXCR4-using viruses and not documented for any R5 isolate from group M. Intergroup comparisons found a five- to eightfold lower replicative fitness of isolates from group O than for isolates from group M yet a similar overall intrinsic pathogenicity in tonsil cultures. This study establishes biological ex vivo characteristics of HIV-1 group O primary isolates. The current findings challenge the belief that a grossly reduced replicative fitness or inherently impaired cytopathicity of viruses from this group underlies their low global prevalence.
The reconstruction of gene regulatory networks from time series gene expression data is one of the most difficult problems in systems biology. This is due to several reasons, among them the combinatorial explosion of possible network topologies, limited information content of the experimental data with high levels of noise, and the complexity of gene regulation at the transcriptional, translational and post-translational levels. At the same time, quantitative, dynamic models, ideally with probability distributions over model topologies and parameters, are highly desirable.
We present a novel approach to infer such models from data, based on nonlinear differential equations, which we embed into a stochastic Bayesian framework. We thus address both the stochasticity of experimental data and the need for quantitative dynamic models. Furthermore, the Bayesian framework makes it easy to integrate prior knowledge into the inference process. Using stochastic sampling from the Bayesian posterior distribution, our approach can infer different likely network topologies and model parameters along with their respective probabilities from given data. We evaluate our approach on simulated data and the challenge #3 data from the DREAM 2 initiative. On the simulated data, we study effects of different levels of noise and dataset sizes. Results on real data show that the dynamics and main regulatory interactions are correctly reconstructed.
Our approach combines dynamic modeling using differential equations with a stochastic learning framework, thus bridging the gap between biophysical modeling and stochastic inference approaches. Results show that the method can reap the advantages of both worlds, and allows the reconstruction of biophysically accurate dynamic models from noisy data. In addition, the stochastic learning framework used permits the computation of probability distributions over models and model parameters, which holds interesting prospects for experimental design purposes.
Single-nucleotide polymorphism (SNP) analysis is a powerful tool for mapping and diagnosing disease-related alleles. Mutation analysis by polymerase-mediated single-base primer extension (minisequencing) can be massively parallelized using DNA microchips or flow cytometry with microspheres as solid support. By adding a unique oligonucleotide tag to the 5′ end of the minisequencing primer and attaching the complementary antitag to the array or bead surface, the assay can be ‘demultiplexed’. Such high-throughput scoring of SNPs requires a high level of primer multiplexing in order to analyze multiple loci in one assay, thus enabling inexpensive and fast polymorphism scoring. We present a computer program to automate the design process for the assay. Oligonucleotide primers for the reaction are automatically selected by the software, a unique DNA tag/antitag system is generated, and primers and DNA tags are paired automatically so as to avoid any cross-reactivity. We report results on a 45-plex genotyping assay, indicating that minisequencing can be adapted to be a powerful tool for high-throughput, massively parallel genotyping. The software is available to academic users on request.
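The tag/antitag idea can be illustrated with a minimal sketch. The abstract does not describe how the software scores cross-reactivity, so the perfect-match stretch criterion, the default threshold of 8 bases, and the names `reverse_complement` and `may_cross_react` are assumptions for illustration; a real design tool would more likely rely on thermodynamic criteria.

```python
# Translation table mapping each DNA base to its Watson-Crick partner.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, e.g. the antitag of a tag."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def may_cross_react(oligo_a, oligo_b, min_stretch=8):
    """Crude cross-reactivity check (hypothetical criterion, not the published one).

    Returns True if some stretch of at least min_stretch bases in oligo_a is
    perfectly complementary to a stretch in oligo_b.
    """
    a = oligo_a.upper()
    rc_b = reverse_complement(oligo_b)
    # A window of oligo_a that also occurs in the reverse complement of oligo_b
    # can base-pair perfectly with the corresponding stretch of oligo_b.
    return any(a[i:i + min_stretch] in rc_b
               for i in range(len(a) - min_stretch + 1))
```

A tag is expected to pair with its own antitag, so `may_cross_react(tag, reverse_complement(tag))` returns True for any tag of at least `min_stretch` bases. In a design loop, the same check could be applied to every other primer and antitag combination to reject pairings that would hybridize unintentionally.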