In the present study, an integrated hierarchical approach was applied to: (1) identify pathways associated with susceptibility to schizophrenia; (2) detect genes that may be potentially affected in these pathways since they contain an associated polymorphism; and (3) annotate the functional consequences of such single-nucleotide polymorphisms (SNPs) in the affected genes or their regulatory regions. The Global Test was applied to detect schizophrenia-associated pathways using discovery and replication datasets comprising 5,040 and 5,082 individuals of European ancestry, respectively. Information concerning functional gene-sets was retrieved from the Kyoto Encyclopedia of Genes and Genomes, Gene Ontology, and the Molecular Signatures Database. Fourteen of the gene-sets or pathways identified in the discovery dataset were confirmed in the replication dataset. These include functional processes involved in transcriptional regulation and gene expression, synapse organization, cell adhesion, and apoptosis. For two genes, i.e. CTCF and CACNB2, evidence for association with schizophrenia was available (at the gene-level) in both the discovery study and published data from the Psychiatric Genomics Consortium schizophrenia study. Furthermore, these genes mapped to four of the 14 presently identified pathways. Several of the SNPs assigned to CTCF and CACNB2 have potential functional consequences, and a gene in close proximity to CACNB2, i.e. ARL5B, was identified as a potential gene of interest. Application of the present hierarchical approach thus allowed: (1) identification of novel biological gene-sets or pathways with potential involvement in the etiology of schizophrenia, as well as replication of these findings in an independent cohort; (2) detection of genes of interest for future follow-up studies; and (3) the highlighting of novel genes in previously reported candidate regions for schizophrenia.
Large-scale genetic studies of complex diseases such as schizophrenia have identified a variety of susceptibility loci. Since many of the respective variants have only a weak influence on disease risk, pathophysiological interpretation of the results is problematic. Investigation of the joint effects of multiple functionally related genes or pathways increases the power to detect disease-related genes, and provides insights into the etiology of the disease in question. In the present study, an integrated hierarchical approach was applied to: (i) identify pathways associated with the complex neuropsychiatric disease schizophrenia; (ii) detect potentially affected genes in these pathways; and (iii) annotate the functional consequences of genetic markers in the affected genes or their regulatory regions. Two samples comprising >10,000 individuals of European ancestry as well as data from the Psychiatric Genomics Consortium schizophrenia study were examined. Pathways representing transcriptional regulation and gene expression, cell adhesion, apoptosis, and synapse organization showed significant association with schizophrenia. In particular, CTCF, CACNB2, and ARL5B, i.e. genes involved in chromatin modulation, calcium channel signaling and membrane transport, respectively, were highlighted as candidate genes for schizophrenia risk.
Resident human lamina propria immune cells serve as powerful effectors in host defense. Molecular events associated with the initiation of an intestinal inflammatory response in these cells are largely unknown. Here, we aimed to characterize phenotypic and functional changes induced in these cells at the onset of intestinal inflammation using a human intestinal organ culture model. In this model, healthy human colonic mucosa was depleted of epithelial cells by EDTA treatment. Following loss of the epithelial layer, expression of the inflammatory mediators IL1B, IL6, IL8, IL23A, TNFA, CXCL2, and the surface receptors CD14, TLR2, CD86, CD54 was rapidly induced in resident lamina propria cells in situ as determined by qRT-PCR and immunohistology. Gene microarray analysis of lamina propria cells obtained by laser-capture microdissection provided an overview of global changes in gene expression occurring during the initiation of an intestinal inflammatory response in these cells. Bioinformatic analysis gave insight into signalling pathways mediating this inflammatory response. Furthermore, comparison with published microarray datasets of inflamed mucosa in vivo (ulcerative colitis) revealed a significant overlap of differentially regulated genes, underlining the in vivo relevance of the organ culture model. In addition, genes not previously associated with intestinal inflammation were identified using this model. The organ culture model characterized here may be useful to study molecular mechanisms underlying the initiation of an intestinal inflammatory response in normal mucosa as well as potential alterations of this response in inflammatory bowel disease.
Pilocytic astrocytoma, the most common childhood brain tumor [1], is typically associated with mitogen-activated protein kinase (MAPK) pathway alterations [2]. Surgically inaccessible midline tumors are therapeutically challenging, showing sustained tendency for progression [3] and often becoming a chronic disease with substantial morbidities [4].
Here we describe whole-genome sequencing of 96 pilocytic astrocytomas, with matched RNA sequencing (n=73), conducted by the International Cancer Genome Consortium (ICGC) PedBrain Tumor Project. We identified recurrent activating mutations in FGFR1 and PTPN11 and novel NTRK2 fusion genes in non-cerebellar tumors. New BRAF activating changes were also observed. MAPK pathway alterations affected 100% of tumors analyzed, with no other significant mutations, indicating pilocytic astrocytoma as predominantly a single-pathway disease.
Notably, we identified the same FGFR1 mutations in a subset of H3F3A-mutated pediatric glioblastoma with additional alterations in NF1 [5]. Our findings thus identify new potential therapeutic targets in distinct subsets of pilocytic astrocytoma and childhood glioblastoma.
In systems biology, a mathematical description of signal transduction processes is used to gain a more detailed mechanistic understanding of cellular signaling networks. Such models typically depend on a number of parameters that have different influence on the model behavior. Local sensitivity analysis is able to identify parameters that have the largest effect on signaling strength. Bifurcation analysis shows on which parameters a qualitative model response depends. Most methods for model analysis are intrinsically univariate. They typically cannot consider combinations of parameters since the search space for such analysis would be too large. This limitation is important since activation of a signaling pathway often relies on multiple rather than on single factors. Here, we present a novel method for model analysis that overcomes this limitation. As input to a model defined by a system of ordinary differential equations, we consider parameters for initial chemical species concentrations. The model is used to simulate the system response, which is then classified into pre-defined classes (e.g., active or not active). This is combined with a scan of the parameter space. Parameter sets leading to a certain system response are subjected to a decision tree algorithm, which learns conditions that lead to this response. We compare our method to two alternative multivariate approaches to model analysis: analytical solution for steady states combined with a parameter scan, and direct Lyapunov exponent (DLE) analysis. We use three previously published models including a model for EGF receptor internalization and two apoptosis models to demonstrate the power of our approach. Our method reproduces critical parameter relations previously obtained by both steady-state and DLE analysis while being more generally applicable and substantially less computationally expensive. 
The method can be used as a general tool to predict multivariate control strategies for pathway activation and to suggest strategies for drug intervention.
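The workflow described above (parameter scan, simulation, classification of the response, decision tree learning) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the two-species toy model, its rate constants, the 0.5 activity threshold, and the helper names `simulate`, `build_tree` and `predict` are all assumptions made for the example, and a small hand-written CART-style tree stands in for whichever decision tree algorithm was actually used.

```python
from itertools import product

def simulate(a0, i0, k_bind=5.0, k_act=1.0, k_deg=0.5, dt=0.01, t_end=20.0):
    """Euler integration of a hypothetical toy model (not from the paper).

    Activator A and inhibitor I bind irreversibly; free A drives output X:
        dA/dt = dI/dt = -k_bind * A * I
        dX/dt = k_act * A * (1 - X) - k_deg * X
    Returns X at t_end for initial concentrations a0 and i0.
    """
    a, i, x = a0, i0, 0.0
    for _ in range(int(t_end / dt)):
        bind = k_bind * a * i
        dx = k_act * a * (1.0 - x) - k_deg * x
        a, i, x = a - bind * dt, i - bind * dt, x + dx * dt
    return x

def gini(labels):
    """Gini impurity of a list of boolean class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_split(rows, labels):
    """Return the (feature, threshold) with the lowest weighted Gini, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y for r, y in zip(rows, labels) if r[f] <= thr]
            right = [y for r, y in zip(rows, labels) if r[f] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if score < best_score - 1e-12:
                best, best_score = (f, thr), score
    return best

def build_tree(rows, labels, depth=3):
    """Greedy axis-aligned tree; leaves are booleans (majority class)."""
    split = best_split(rows, labels) if depth > 0 else None
    if split is None:
        return sum(labels) * 2 >= len(labels)
    f, thr = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= thr]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > thr]
    return (f, thr,
            build_tree([r for r, _ in left], [y for _, y in left], depth - 1),
            build_tree([r for r, _ in right], [y for _, y in right], depth - 1))

def predict(tree, row):
    """Follow the learned thresholds down to a leaf."""
    while isinstance(tree, tuple):
        f, thr, left, right = tree
        tree = left if row[f] <= thr else right
    return tree

# Scan the initial concentrations on a grid and classify each response.
grid = [0.25 * n for n in range(9)]
rows = list(product(grid, grid))                       # (a0, i0) pairs
labels = [simulate(a0, i0) > 0.5 for a0, i0 in rows]   # True means "active"
tree = build_tree(rows, labels, depth=4)
```

In this toy model the steady-state response exceeds 0.5 only when the initial activator concentration exceeds the initial inhibitor concentration by more than `k_deg / k_act`, a condition on a combination of two parameters. The learned tree approximates that relation with nested thresholds on `a0` and `i0`, which is the kind of multivariate condition a single-parameter scan would miss.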
High-grade soft tissue sarcomas are a heterogeneous, complex group of aggressive malignant tumors showing mesenchymal differentiation. Recently, soft tissue sarcomas have increasingly been classified on the basis of underlying genetic alterations; however, the role of aberrant DNA methylation in these tumors is not well understood and, consequently, the usefulness of methylation-based classification is unclear.
We used the Infinium HumanMethylation27 platform to profile DNA methylation in 80 primary, untreated high-grade soft tissue sarcomas, representing eight relevant subtypes, two non-neoplastic fat samples and 14 representative sarcoma cell lines. The primary samples were partitioned into seven stable clusters. A classification algorithm identified 216 CpG sites, mapping to 246 genes, showing different degrees of DNA methylation between these seven groups. The differences between the clusters were best represented by a set of eight CpG sites located in the genes SPEG, NNAT, FBLN2, PYROXD2, ZNF217, COL14A1, DMRT2 and CDKN2A. By integrating DNA methylation and mRNA expression data, we identified 27 genes showing negative and three genes showing positive correlation. Compared with non-neoplastic fat, NNAT showed DNA hypomethylation and inverse gene expression in myxoid liposarcomas, and DNA hypermethylation and inverse gene expression in dedifferentiated and pleomorphic liposarcomas. Recovery of NNAT in a hypermethylated myxoid liposarcoma cell line decreased cell migration and viability.
Our analysis represents the first comprehensive integration of DNA methylation and transcriptional data in primary high-grade soft tissue sarcomas. We propose novel biomarkers and genes relevant for pathogenesis, including NNAT as a potential tumor suppressor in myxoid liposarcomas.
Mutation is a fundamental process in tumorigenesis. However, the degree to which the rate of somatic mutation varies across the human genome and the mechanistic basis underlying this variation remain to be fully elucidated. Here, we performed a cross-cancer comparison of 402 whole genomes comprising a diverse set of childhood and adult tumors, including both solid and hematopoietic malignancies. Surprisingly, we found that the inactive X chromosome of many female cancer genomes accumulates on average twice and up to four times as many somatic mutations per megabase, as compared to the individual autosomes. Whole-genome sequencing of clonally expanded hematopoietic stem/progenitor cells (HSPCs) from healthy individuals and a premalignant myelodysplastic syndrome (MDS) sample revealed no X chromosome hypermutation. Our data suggest that hypermutation of the inactive X chromosome is an early and frequent feature of tumorigenesis resulting from DNA replication stress in aberrantly proliferating cells.
• X chromosome has up to 4× more mutations than the autosomes in female cancer genomes
• Hypermutations only affect the inactive X chromosome
• X hypermutation involves somatic point mutations and indels, but not germline mutations
• No X hypermutation is found in clonal expansions of normal or premalignant cells
A comparison of 402 cancer genomes identifies a surprisingly high level of somatic mutations in the inactive X chromosome of female cancer genomes. As hypermutability of the inactive X was not observed in clonal hematopoietic progenitor or preleukemic samples, it may be a contributing factor to tumorigenesis.
The emergence of high-throughput, next-generation sequencing technologies has dramatically altered the way we assess genomes in population genetics and in cancer genomics. Currently, there are four commonly used whole-genome sequencing platforms on the market: Illumina’s HiSeq2000, Life Technologies’ SOLiD 4 and its completely redesigned 5500xl SOLiD, and Complete Genomics’ technology. A number of earlier studies have compared a subset of those sequencing platforms or compared those platforms with Sanger sequencing, which is prohibitively expensive for whole genome studies. Here we present a detailed comparison of the performance of all currently available whole genome sequencing platforms, especially regarding their ability to call SNVs and to evenly cover the genome and specific genomic regions. Unlike earlier studies, we base our comparison on four different samples, allowing us to assess the between-sample variation of the platforms. We find a pronounced GC bias in GC-rich regions for Life Technologies’ platforms, with Complete Genomics performing best here, while we see the least bias in GC-poor regions for HiSeq2000 and 5500xl. HiSeq2000 gives the most uniform coverage and displays the least sample-to-sample variation. In contrast, Complete Genomics exhibits by far the smallest fraction of bases not covered, while the SOLiD platforms reveal remarkable shortcomings, especially in covering CpG islands. When comparing the performance of the four platforms for calling SNPs, HiSeq2000 and Complete Genomics achieve the highest sensitivity, while the SOLiD platforms show the lowest false positive rate. Finally, we find that integrating sequencing data from different platforms offers the potential to combine the strengths of different technologies. In summary, our results detail the strengths and weaknesses of all four whole-genome sequencing platforms. They indicate application areas that call for a specific sequencing platform and rule out others.
This helps to identify the proper sequencing platform for whole genome studies with different application scopes.
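The comparison criteria named above (sensitivity of variant calling, false calls, fraction of uncovered bases, uniformity of coverage) can be made concrete with a short sketch. The definitions below are assumptions chosen for illustration, since the abstract does not state how each quantity was computed: false calls are expressed as the fraction of a platform's calls that are absent from a reference call set, and uniformity as the coefficient of variation of per-position depth. The function names and the toy data are hypothetical.

```python
from statistics import mean, pstdev

def snv_metrics(called, reference):
    """Compare a set of called variants with a reference (truth) set.

    Returns (sensitivity, false_call_fraction), where sensitivity is the
    share of reference variants that were called and false_call_fraction
    is the share of calls not present in the reference.
    """
    true_pos = len(called & reference)
    sensitivity = true_pos / len(reference) if reference else 0.0
    false_call_fraction = (len(called) - true_pos) / len(called) if called else 0.0
    return sensitivity, false_call_fraction

def coverage_summary(depths):
    """Return (fraction of positions with zero depth, coefficient of variation)."""
    uncovered = sum(1 for d in depths if d == 0) / len(depths)
    cv = pstdev(depths) / mean(depths)
    return uncovered, cv

# Toy data: variants are (chromosome, position, ref allele, alt allele).
reference = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
             ("chr2", 40, "G", "A"), ("chr2", 900, "T", "C")}
platform_a = {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T"),
              ("chr2", 40, "G", "A"), ("chr3", 7, "A", "C")}
platform_b = {("chr1", 100, "A", "G"), ("chr2", 40, "G", "A")}

metrics_a = snv_metrics(platform_a, reference)
metrics_b = snv_metrics(platform_b, reference)
# Keeping only calls made by both platforms trades sensitivity for fewer false calls.
metrics_both = snv_metrics(platform_a & platform_b, reference)
coverage = coverage_summary([30, 30, 0, 60])
```

In this toy example the first platform is more sensitive but makes one false call, the second makes no false calls but misses half of the reference variants, and intersecting the two call sets is one simple way of combining data from different platforms.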
Medulloblastoma is an aggressively growing tumour, arising in the cerebellum or medulla/brain stem. It is the most common malignant brain tumour in children, and displays tremendous biological and clinical heterogeneity [1]. Despite recent treatment advances, approximately 40% of children experience tumour recurrence, and 30% will die from their disease. Those who survive often have a significantly reduced quality of life.
Four tumour subgroups with distinct clinical, biological and genetic profiles are currently discriminated [2,3]. WNT tumours, displaying activated wingless pathway signalling, carry a favourable prognosis under current treatment regimens [4]. SHH tumours show hedgehog pathway activation, and have an intermediate prognosis [2]. Group 3 & 4 tumours are molecularly less well-characterised, and also present the greatest clinical challenges [2,3,5]. The full repertoire of genetic events driving this distinction, however, remains unclear.
Here we describe an integrative deep-sequencing analysis of 125 tumour-normal pairs. Tetraploidy was identified as a frequent early event in Group 3 & 4 tumours, and a positive correlation between patient age and mutation rate was observed. Several recurrent mutations were identified, both in known medulloblastoma-related genes (CTNNB1, PTCH1, MLL2, SMARCA4) and in genes not previously linked to this tumour (DDX3X, CTDNEP1, KDM6A, TBR1), often in subgroup-specific patterns. RNA-sequencing confirmed these alterations, and revealed the expression of the first medulloblastoma fusion genes. Chromatin modifiers were frequently altered across all subgroups.
These findings enhance our understanding of the genomic complexity and heterogeneity underlying medulloblastoma, and provide several potential targets for new therapeutics, especially for Group 3 & 4 patients.
Malignant melanoma is a highly aggressive type of malignancy with considerable metastatic potential and frequent resistance to cytotoxic agents. BRAF mutant protein was recently recognized as a therapeutic target in metastatic melanoma. We present a newly developed U-BRAFV600 approach, a universal pyrosequencing-based assay for mutation detection within the activation segment in exon 15 of human BRAF. We identified 5 different BRAF mutations in a single assay analyzing 75 different formalin-fixed paraffin-embedded (FFPE) samples of cutaneous melanoma metastases from 29 patients. We found BRAF mutations in 21 of 29 metastases. All mutant variants were quantitatively detectable by the newly developed U-BRAFV600 assay. These results were confirmed by ultra-deep-sequencing validation (∼60,000-fold coverage). In contrast to all other BRAF state detection methods, the U-BRAFV600 assay is capable of automated quantitative identification of at least 36 previously published BRAF mutations. Provided that mutated cells make up at least 3% of a background of wild-type cells, the U-BRAFV600 assay design completely excludes false wild-type results. The corresponding algorithm for classification of BRAF-mutated variants is provided. The single-reaction assay and automated data analysis make our approach suitable for the assessment of large clinical sample sizes. Therefore, we suggest the U-BRAFV600 assay as a powerful sequencing-based diagnostic tool to automatically identify BRAF state as a prerequisite to targeted therapy.
Genomic rearrangements are thought to occur progressively during tumor development. Recent findings, however, suggest an alternative mechanism, involving massive chromosome rearrangements in a one-step catastrophic event termed chromothripsis. We report the whole-genome sequencing-based analysis of a Sonic-Hedgehog medulloblastoma (SHH-MB) brain tumor from a patient with a germline TP53 mutation (Li-Fraumeni syndrome), uncovering massive, complex chromosome rearrangements. Integrating TP53 status with microarray and deep sequencing-based DNA rearrangement data in additional patients reveals a striking association between TP53 mutation and chromothripsis in SHH-MBs. Analysis of additional tumor entities substantiates a link between TP53 mutation and chromothripsis, and indicates a context-specific role for p53 in catastrophic DNA rearrangements. Among these, we observed a strong association between somatic TP53 mutations and chromothripsis in acute myeloid leukemia. These findings connect p53 status and chromothripsis in specific tumor types, providing a genetic basis for understanding particularly aggressive subtypes of cancer.
Amyotrophic lateral sclerosis (ALS) is a fatal disorder of the motor neuron system with poor prognosis and marginal therapeutic options. Current clinical diagnostic criteria are based on electrophysiological examination and exclusion of other ALS-mimicking conditions. Neuroprotective treatments are, however, most promising in early disease stages. Identification of disease-specific CSF biomarkers and associated biochemical pathways is therefore highly relevant for monitoring disease progression and response to neuroprotective agents, and for enabling early inclusion of patients into clinical trials.
Methods and Findings
CSF from 35 patients with ALS diagnosed according to the revised El Escorial criteria and 23 age-matched controls was processed using paramagnetic bead chromatography for protein isolation and subsequently analyzed by MALDI-TOF mass spectrometry. CSF protein profiles were integrated into a Random Forest model constructed from 153 mass peaks. After reducing this peak set to the top 25%, a classifier was built which enabled prediction of ALS with high accuracy, sensitivity and specificity. Further analysis of the identified peptides resulted in a panel of five highly sensitive ALS biomarkers. Upregulation of secreted phosphoprotein 1 in ALS-CSF samples was confirmed by univariate analysis of ELISA and mass spectrometry data. Further quantitative validation of the five biomarkers was achieved in an 80-plex Multiple Reaction Monitoring mass spectrometry assay.
ALS classification based on the CSF biomarker panel proposed in this study could become a valuable predictive tool for early clinical risk stratification. Of the numerous CSF proteins identified, many have putative roles in ALS-related metabolic processes, particularly in chromogranin-mediated secretion signaling pathways. While a stand-alone clinical application of this classifier will only be possible after further validation and a multicenter trial, it could be readily used to complement current ALS diagnostics and might also provide new insights into the pathomechanisms of this disease in the future.
Mitochondria exist as a network of interconnected organelles undergoing constant fission and fusion. Current approaches to study mitochondrial morphology are limited by low data sampling coupled with manual identification and classification of complex morphological phenotypes. Here we propose an integrated mechanistic and data-driven modeling approach to analyze heterogeneous, quantified datasets and infer relations between mitochondrial morphology and apoptotic events. We initially performed high-content, multi-parametric measurements of mitochondrial morphological, apoptotic, and energetic states by high-resolution imaging of human breast carcinoma MCF-7 cells. Subsequently, decision tree-based analysis was used to automatically classify networked, fragmented, and swollen mitochondrial subpopulations, at the single-cell level and within cell populations. Our results revealed subtle but significant differences in morphology class distributions in response to various apoptotic stimuli. Furthermore, key mitochondrial functional parameters, including mitochondrial membrane potential and Bax activation, were measured under matched conditions. Data-driven fuzzy logic modeling was used to explore the non-linear relationships between mitochondrial morphology and apoptotic signaling, combining morphological and functional data in a single model. Modeling results are in accordance with previous studies, where Bax regulates mitochondrial fragmentation, and mitochondrial morphology influences mitochondrial membrane potential. In summary, we established and validated a platform for mitochondrial morphological and functional analysis that can be readily extended with additional datasets. We further discuss the benefits of a flexible systematic approach for elucidating specific and general relationships between mitochondrial morphology and apoptosis.
Autoimmune pancreatitis (AIP) is thought to be an immune-mediated inflammatory process, directed against the epithelial components of the pancreas.
In order to explore key targets of the inflammatory process we analysed gene expression at the RNA and protein level using genomics and proteomics, immunohistochemistry, Western blot and immunoassay. An animal model of AIP with LP-BM5 murine leukemia virus-infected mice was studied in parallel. RNA microarrays of pancreatic tissue from 12 patients with AIP were compared to those of 8 patients with non-AIP chronic pancreatitis (CP).
Expression profiling revealed 272 upregulated genes, including those encoding immunoglobulins, chemokines and their receptors, and 86 downregulated genes, including those for pancreatic proteases such as three trypsinogen isoforms. Protein profiling showed that the expression of trypsinogens and other pancreatic enzymes was greatly reduced. Immunohistochemistry demonstrated a near-loss of trypsin-positive acinar cells, which was also confirmed by Western blotting. The serum of AIP patients contained high titres of autoantibodies against the trypsinogens PRSS1 and PRSS2, but not against PRSS3. In addition, there were autoantibodies against the trypsin inhibitor PSTI (the product of the SPINK1 gene). In the pancreas of AIP animals we found similar protein patterns and a reduction in trypsinogen.
These data indicate that the immune-mediated process characterizing AIP involves pancreatic acinar cells and their secretory enzymes such as trypsin isoforms. Demonstration of trypsinogen autoantibodies may be helpful for the diagnosis of AIP.
autoimmune pancreatitis; chronic pancreatitis; trypsinogen; proteomics; transcriptomics; autoantibody
The β-amyloid precursor protein (APP) and the related β-amyloid precursor-like proteins (APLPs) undergo complex proteolytic processing giving rise to several fragments. Whereas it is well established that Aβ accumulation is a central trigger for Alzheimer's disease, the physiological role of APP family members and their diverse proteolytic products is still largely unknown. The secreted APPsα ectodomain has been shown to be involved in neuroprotection and synaptic plasticity. The γ-secretase-generated APP intracellular domain (AICD) functions as a transcriptional regulator in heterologous reporter assays although its role for endogenous gene regulation has remained controversial.
To gain further insight into the molecular changes associated with knockout phenotypes and to elucidate the physiological functions of APP family members including their proposed role as transcriptional regulators, we performed DNA microarray transcriptome profiling of prefrontal cortex of adult wild-type (WT), APP knockout (APP-/-), APLP2 knockout (APLP2-/-) and APPsα knockin mice (APPα/α) expressing solely the secreted APPsα ectodomain. Biological pathways affected by the lack of APP family members included neurogenesis, transcription, and kinase activity. Comparative analysis of transcriptome changes between mutant and wild-type mice, followed by qPCR validation, identified co-regulated gene sets. Interestingly, these included heat shock proteins and plasticity-related genes that were both down-regulated in knockout cortices. In contrast, we failed to detect significant differences in expression of previously proposed AICD target genes including Bace1, Kai1, Gsk3b, p53, Tip60, and Vglut2. Only Egfr was slightly up-regulated in APLP2-/- mice. Comparison of APP-/- and APPα/α with wild-type mice revealed a high proportion of co-regulated genes indicating an important role of the C-terminus for cellular signaling. Finally, comparison of APLP2-/- on different genetic backgrounds revealed that background-related transcriptome changes may dominate over changes due to the knockout of a single gene.
Shared transcriptome profiles corroborated closely related physiological functions of APP family members in the adult central nervous system. As expression of proposed AICD target genes was not altered in adult cortex, this may indicate that these genes are not affected by lack of APP under resting conditions or only in a small subset of cells.
Gastrointestinal stromal tumors (GIST) represent the most common mesenchymal tumors of the gastrointestinal tract. About 85% carry an activating mutation in the KIT or PDGFRA gene. Approximately 10% of GIST are so-called wild-type GIST (wt-GIST) without mutations in the hot spots. In the present study we evaluated appropriate reference genes for the expression analysis of formalin-fixed, paraffin-embedded and fresh frozen samples from gastrointestinal stromal tumors. We evaluated the gene expression of KIT as well as of the alternative receptor tyrosine kinase genes FLT3, CSF1R, PDGFRB, AXL and MET by qPCR. wt-GIST were compared to samples with mutations in KIT exon 9 and 11 and PDGFRA exon 18 in order to evaluate whether overexpression of these alternative RTKs might contribute to the pathogenesis of wt-GIST.
Gene expression variability of the pooled cDNA samples is much lower than that of the single reverse transcription cDNA syntheses. By combining the lowest variability values of fixed and fresh tissue, the genes POLR2A, PPIA, RPLP0 and TFRC were chosen for further analysis of the GIST samples. Overexpression of KIT compared to the corresponding normal tissue was detected in each GIST subgroup except in GIST with PDGFRA exon 18 mutation. Comparing our sample groups, no significant differences in the gene expression levels of FLT3, CSF1R and AXL were determined. An exception was the sample group with KIT exon 9 mutation, in which a significantly reduced expression of CSF1R, FLT3 and PDGFRB compared to the normal tissue was detected. GIST with mutations in KIT exon 9 and 11 and in PDGFRA exon 18 showed a significant PDGFRB downregulation.
As the variability of expression levels for the reference genes is very high between fresh frozen and formalin-fixed tissue, there is a strong need for validation in each tissue type. None of the alternative receptor tyrosine kinases analyzed is associated with the pathogenesis of wild-type or mutated GIST. It remains to be clarified whether an autocrine or paracrine mechanism by overexpression of receptor tyrosine kinase ligands is responsible for the tumorigenesis of wt-GIST.
In the past, molecular mechanisms that drive the initiation of an inflammatory response have been studied intensively. However, corresponding mechanisms that sustain the expression of inflammatory response genes and hence contribute to the establishment of chronic disorders remain poorly understood. Recently, we provided genetic evidence that signaling via the receptor for advanced glycation end products (Rage) drives the strength and maintenance of an inflammatory reaction. In order to decipher the mode of Rage function on gene transcription levels during inflammation, we applied global gene expression profiling on time-resolved samples of mouse back skin, which had been treated with the phorbol ester TPA, a potent inducer of skin inflammation.
Ranking of TPA-regulated genes according to their time average mean and peak expression and superimposition of data sets from wild-type (wt) and Rage-deficient mice revealed that Rage signaling is not essential for initial changes in TPA-induced transcription, but absolutely required for sustained alterations in transcript levels. Next, we used a data set of differentially expressed genes between TPA-treated wt and Rage-deficient skin and performed computational analysis of their proximal promoter regions. We found a highly significant enrichment for several transcription factor binding sites (TFBS) leading to the prediction that corresponding transcription factors, such as Sp1, Tcfap2, E2f, Myc and Egr, are regulated by Rage signaling. Accordingly, we could confirm aberrant expression and regulation of members of the E2f protein family in epidermal keratinocytes of Rage-deficient mice.
In summary, our data support the model that engagement of Rage converts a transient cellular stimulation into sustained cellular dysfunction and highlight a novel role of the Rb-E2f pathway in Rage-dependent inflammation during pathological conditions.
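Enrichment of a transcription factor binding site among the promoters of differentially expressed genes is commonly assessed with a one-sided hypergeometric test. The abstract does not say which statistic was used, so the sketch below is an assumption about one reasonable choice rather than a description of the authors' analysis; the function name and the example counts are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(k, n, K, N):
    """One-sided hypergeometric p-value, P(X >= k).

    N: promoters in the background set
    K: background promoters that contain the binding site
    n: promoters of differentially expressed genes
    k: of those n promoters, how many contain the binding site
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)

# Hypothetical counts: 40 of 200 differentially expressed genes carry the
# site in their proximal promoter, versus 1,000 of 10,000 background promoters.
p_value = hypergeom_enrichment_p(k=40, n=200, K=1000, N=10000)
```

When several binding sites are tested against the same gene list, the resulting p-values would normally also be corrected for multiple testing.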
Normalization of microarrays is a standard practice to account for and minimize effects which are not due to the controlled factors in an experiment. There is an overwhelming number of different methods that can be applied, none of which is ideally suited for all experimental designs. Thus, it is important to identify a normalization method appropriate for the experimental setup under consideration that is neither too lenient nor too stringent. The major aim is to derive optimal results from the underlying experiment. Comparisons of different normalization methods have already been conducted, none of which, to our knowledge, compared more than a handful of methods.
In the present study, 25 different ways of pre-processing Illumina Sentrix BeadChip array data are compared. Among others, methods provided by the BeadStudio software are taken into account. Looking at different statistical measures, we point out the ideal versus the actual observations. Additionally, we compare qRT-PCR measurements of transcripts from different ranges of expression intensities to the respective normalized values of the microarray data. Taking all the different measures together, the ideal method for our dataset is identified.
Pre-processing of microarray gene expression experiments has been shown to influence further downstream analysis to a great extent and thus has to be carefully chosen based on the design of the experiment. This study provides a recommendation for deciding which normalization method is best suited for a particular experimental setup.
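As a concrete instance of what a normalization method does, the sketch below implements quantile normalization, one widely used approach for expression arrays. It is shown purely as an illustration: the abstract does not list the 25 pre-processing variants that were compared, so this should not be read as one of them or as the recommended method. The function name and the toy intensities are hypothetical.

```python
def quantile_normalize(arrays):
    """Force every array to share the same intensity distribution.

    arrays: list of equal-length lists, one list of intensities per array.
    Each value is replaced by the mean, across arrays, of the values that
    hold the same rank. Ties are broken by position, a simplification.
    """
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    # Reference distribution: mean of the k-th smallest value across arrays.
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    normalized = []
    for a in arrays:
        order = sorted(range(n), key=lambda idx: a[idx])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Two toy arrays measuring the same three probes.
raw = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
normalized = quantile_normalize(raw)
```

After normalization both toy arrays contain the same set of values and differ only in which probe holds which rank. That makes the arrays directly comparable, but it can also remove genuine global differences between samples, the kind of overly stringent correction a method comparison needs to detect.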
Theme-driven cancer survival studies address whether the expression signature of genes related to a biological process can predict patient survival time. Although this should ideally be achieved by testing two separate null hypotheses, current methods treat both hypotheses as one. The first test should assess whether a geneset, independent of its composition, is associated with prognosis (frequently done with a survival test). The second test then verifies whether the theme of the geneset is relevant (usually done with an empirical test that compares the geneset of interest with random genesets). Current methods do not test this second null hypothesis because it has been assumed that the distribution of p-values for random genesets (when tested against the first null hypothesis) is uniform. Here we demonstrate that such an assumption is generally incorrect and consequently, such methods may erroneously associate the biology of a particular geneset with cancer prognosis.
To assess the impact of non-uniform distributions for random genesets in such studies, an automated theme-driven method was developed. This method empirically approximates the p-value distribution of sets of unrelated genes based on a permutation approach, and tests whether predefined sets of biologically related genes are associated with survival. A comparison with a published theme-driven approach revealed non-uniform distributions, suggesting that a significant problem exists with false positive rates in the original study. When applied to two public cancer datasets, our technique revealed novel ontological categories with prognostic power, including significant correlations of "fatty acid metabolism" with overall survival in breast cancer, and of "receptor mediated endocytosis", "brain development", "apical plasma membrane" and "MAPK signaling pathway" with overall survival in lung cancer.
Current methods of theme-driven survival studies assume uniformity of p-values for random genesets, which can lead to false conclusions. Our approach provides a method to correct for this pitfall, and provides a novel route to identifying higher-level biological themes and pathways with prognostic power in clinical microarray datasets.
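The second test described above, comparing a geneset of interest with random genesets, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the survival test is passed in as an arbitrary callable (here replaced by a toy stand-in), and the add-one correction in the empirical p-value is a common convention assumed here rather than taken from the study.

```python
import random
from typing import Callable, Sequence


def empirical_theme_pvalue(
    geneset: Sequence[str],
    background: Sequence[str],
    survival_pvalue: Callable[[Sequence[str]], float],
    n_random: int = 1000,
    seed: int = 0,
) -> float:
    """Fraction of random genesets that look at least as prognostic as `geneset`.

    `survival_pvalue` stands for the first test (geneset vs. survival); no
    uniformity of its p-values under random genesets is assumed.
    """
    rng = random.Random(seed)
    observed = survival_pvalue(geneset)
    hits = 0
    for _ in range(n_random):
        # Draw a random geneset of the same size from the background genes.
        random_set = rng.sample(list(background), len(geneset))
        if survival_pvalue(random_set) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero (assumption).
    return (hits + 1) / (n_random + 1)


def toy_survival_pvalue(genes: Sequence[str]) -> float:
    """Hypothetical stand-in for a real survival test on a geneset."""
    prognostic = {"G1", "G2", "G3"}  # made-up 'truly prognostic' genes
    n_prognostic = sum(g in prognostic for g in genes)
    return 0.5 ** (1 + n_prognostic)


all_genes = [f"G{i}" for i in range(1, 41)]
theme = ["G1", "G2", "G3"]
print(empirical_theme_pvalue(theme, all_genes, toy_survival_pvalue, n_random=200))
```

Because the reference distribution is estimated from random genesets rather than assumed to be uniform, a geneset is only reported when its theme adds prognostic information beyond what sets of unrelated genes achieve.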
Human herpesvirus 8 (HHV-8) is the etiologic agent of Kaposi's sarcoma and primary effusion lymphoma. Activation of the cellular transcription factor nuclear factor-kappa B (NF-κB) is essential for latent persistence of HHV-8, survival of HHV-8-infected cells, and disease progression. We used reverse-transfected cell microarrays (RTCM) as an unbiased systems biology approach to systematically analyze the effects of HHV-8 genes on the NF-κB signaling pathway. All HHV-8 genes individually (n = 86) and, additionally, all K and latent genes in pairwise combinations (n = 231) were investigated. Statistical analyses of more than 14,000 transfections identified ORF75 as a novel and confirmed K13 as a known HHV-8 activator of NF-κB. K13 and ORF75 showed cooperative NF-κB activation. Small interfering RNA-mediated knockdown of ORF75 expression demonstrated that this gene contributes significantly to NF-κB activation in HHV-8-infected cells. Furthermore, our approach confirmed K10.5 as an NF-κB inhibitor and newly identified K1 as an inhibitor of both K13- and ORF75-mediated NF-κB activation. All results obtained with RTCM were confirmed with classical transfection experiments. Our work describes the first successful application of RTCM for the systematic analysis of pathofunctions of genes of an infectious agent. With this approach, ORF75 and K1 were identified as novel HHV-8 regulatory molecules on the NF-κB signal transduction pathway. The genes identified may be involved in fine-tuning of the balance between latency and lytic replication, since this depends critically on the state of NF-κB activity.
MicroRNAs (miRNAs) play key roles in mammalian gene expression and several cellular processes, including differentiation, development, apoptosis and cancer pathomechanisms. Recently, the biological importance of primary cilia has been recognized in a number of human genetic diseases. Numerous disorders are related to cilia dysfunction, including polycystic kidney disease (PKD). Although the involvement of certain genes and transcriptional networks in PKD development has been shown, little is known about how they are regulated at the molecular level.
Given the emerging role of miRNAs in gene expression, we explored the possibility of miRNA-based regulation in PKD. Here, we analyzed the simultaneous expression changes of miRNAs and mRNAs by microarrays. A total of 935 genes, classified into 24 functional categories, were differentially regulated between PKD and control animals. In parallel, 30 miRNAs were differentially regulated in PKD rats; our results suggest that several miRNAs might be involved in regulating genetic switches in PKD. Furthermore, we describe newly detected miRNAs in the kidney, miR-31 and miR-217, which have not been reported previously. We determine functionally related gene sets, or pathways, to reveal the functional correlation between differentially expressed mRNAs and miRNAs.
We find that the functional patterns of predicted miRNA targets and differentially expressed mRNAs are similar. Our results suggest an important role of miRNAs in specific pathways underlying PKD.
Differences in MYCN/c-MYC target gene expression are associated with distinct neuroblastoma subtypes and clinical outcome.
Amplified MYCN oncogene resulting in deregulated MYCN transcriptional activity is observed in 20% of neuroblastomas and identifies a highly aggressive subtype. In MYCN single-copy neuroblastomas, elevated MYCN mRNA and protein levels are paradoxically associated with a more favorable clinical phenotype, including disseminated tumors that subsequently regress spontaneously (stage 4s-non-amplified). In this study, we asked whether distinct transcriptional MYCN or c-MYC activities are associated with specific neuroblastoma phenotypes.
We defined a core set of direct MYCN/c-MYC target genes by applying gene expression profiling and chromatin immunoprecipitation (ChIP, ChIP-chip) in neuroblastoma cells that allow conditional regulation of MYCN and c-MYC. Their transcript levels were analyzed in 251 primary neuroblastomas. Compared to localized-non-amplified neuroblastomas, MYCN/c-MYC target gene expression gradually increases from stage 4s-non-amplified through stage 4-non-amplified to MYCN amplified tumors. This was associated with MYCN activation in stage 4s-non-amplified and predominantly c-MYC activation in stage 4-non-amplified tumors. A defined set of MYCN/c-MYC target genes was induced in stage 4-non-amplified but not in stage 4s-non-amplified neuroblastomas. In line with this, high expression of a subset of MYCN/c-MYC target genes identifies a patient subtype with poor overall survival independent of the established risk markers amplified MYCN, disease stage, and age at diagnosis.
High MYCN/c-MYC target gene expression is a hallmark of malignant neuroblastoma progression, which is predominantly driven by c-MYC in stage 4-non-amplified tumors. In contrast, moderate MYCN function gain in stage 4s-non-amplified tumors induces only a restricted set of target genes that is still compatible with spontaneous regression.
The opportunistic food-borne gram-positive pathogen Listeria monocytogenes can exist as a free-living microorganism in the environment and grow in the cytoplasm of vertebrate and invertebrate cells following infection. The general stress response, controlled by the alternative sigma factor, σB, has an important role for bacterial survival both in the environment and during infection. We used quantitative real-time PCR analysis and immuno-blot analysis to examine σB expression during growth of L. monocytogenes EGD-e. Whole genome-based transcriptional profiling was used to identify σB-dependent genes at different growth phases.
We detected 105 σB-positively regulated genes and 111 genes which appeared to be under negative control of σB and validated 36 σB-positively regulated genes in vivo using a reporter gene fusion system.
Genes comprising the σB regulon encode solute transporters, novel cell-wall proteins, universal stress proteins and transcriptional regulators, and include those involved in osmoregulation, carbon metabolism, ribosome and envelope function, as well as virulence and niche-specific survival genes such as those involved in bile resistance and exclusion. Ten of the σB-positively regulated genes of L. monocytogenes are absent in L. innocua. A total of 75 σB-positively regulated listerial genes had homologs in B. subtilis, but only 33 have been previously described as being σB-regulated in B. subtilis, even though both species share a highly conserved σB-dependent consensus sequence. The low overlap of genes may reflect adaptation of these bacteria to their respective environmental conditions.
The most fatal and prevalent form of malaria is caused by the blood-borne pathogen Plasmodium falciparum (henceforth P.f). Annually, approximately three million people die of malaria. Despite the devastating global effect of P.f, the vast majority of its proteins have not been characterized experimentally. In this work, we provide computational insight into the modes of regulation of some important groups of P.f genes, namely components of the glycolytic pathway and genes involved in apicoplast metabolism. Glycolysis is a crucial pathway in the maintenance of the parasite, while the recently discovered apicoplast contains a range of metabolic pathways and housekeeping processes that differ radically from those of the host, which makes it an ideal target for drug therapy.
We were able to validate some of our findings against the available literature and therefore provide a basis for theoretical insight into the regulation of some genes that has not been characterized experimentally.