The gene set testing problem has become a focus of microarray data analysis. A gene set is a group of genes defined by a priori biological knowledge. Several statistical methods have been proposed to determine whether functional gene sets are differentially expressed (enrichment and/or deletion) across phenotypes. However, little attention has been given to analyzing the dependence structure among gene sets. In this study, we propose a novel statistical method of gene set association analysis that identifies significantly associated gene sets using the coefficient of intrinsic dependence. Simulation studies show that the proposed method outperforms conventional methods in detecting general forms of association, in terms of both type I error control and power. The correlation of intrinsic dependence has been applied to a breast cancer microarray dataset to quantify the unsupervised relationship between two sets of genes in the tumor and non-tumor samples. It was observed that the existence of gene-set association differed across various clinical cohorts. In addition, supervised learning was employed to illustrate how gene sets, in signal transduction pathways or subnetworks regulated by a set of transcription factors, can be discovered using microarray data. In conclusion, the coefficient of intrinsic dependence provides a powerful tool for detecting general types of association. Hence, it can be useful for associating gene sets using microarray expression data. By connecting relevant gene sets, our approach has the potential to reveal underlying associations by drawing a statistically relevant network in a given population, and it can also be used to complement conventional gene set analysis.
doi:10.1371/journal.pone.0058851
PMCID: PMC3597597
Wang, Lu | Motoi, Toru | Khanin, Raya | Olshen, Adam | Mertens, Fredrik | Bridge, Julia | Dal Cin, Paola | Antonescu, Cristina | Singer, Sam | Hameed, Meera | Bovee, Judith | Hogendoorn, Pancras C.W. | Socci, Nicholas | Ladanyi, Marc
Cancer gene fusions that encode a chimeric protein are often characterized by an intragenic discontinuity in the RNA expression levels of the exons that are 5′ or 3′ to the fusion point in one or both of the fusion partners due to differences in the levels of activation of their respective promoters. Based on this, we developed an unbiased, genome-wide bioinformatic screen for gene fusions using Affymetrix Exon array expression data. Using a training set of 46 samples with different known gene fusions, we developed a data analysis pipeline, the “Fusion Score (FS) model”, to score and rank genes for intragenic changes in expression. In a separate discovery set of 41 tumor samples with possible unknown gene fusions, the FS model generated a list of 552 candidate genes. The transcription factor gene NCOA2 was one of the candidates identified in a mesenchymal chondrosarcoma. A novel HEY1-NCOA2 fusion was identified by 5′ RACE, representing an in-frame fusion of HEY1 exon 4 to NCOA2 exon 13. RT-PCR or FISH evidence of this HEY1-NCOA2 fusion was present in all additional mesenchymal chondrosarcomas tested with a definitive histologic diagnosis and adequate material for analysis (n=9) but was absent in 15 samples of other subtypes of chondrosarcomas. We also identified a NUP107-LGR5 fusion in a dedifferentiated liposarcoma but analysis of 17 additional samples did not confirm it as a recurrent event in this sarcoma type. The novel HEY1-NCOA2 fusion appears to be the defining and diagnostic gene fusion in mesenchymal chondrosarcomas.
doi:10.1002/gcc.20937
PMCID: PMC3235801
PMID: 22034177
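The intragenic discontinuity idea in the abstract above can be illustrated with a minimal Python sketch. This is not the published Fusion Score model, whose details are not given here. It assumes a hypothetical list of per-exon expression values ordered 5′ to 3′ and simply looks for the split that maximizes the difference in mean expression between the two sides; the function name `best_breakpoint` and the example values are illustrative only.

```python
from statistics import mean


def best_breakpoint(exon_expr):
    """Return (index, score) for the split with the largest expression jump.

    exon_expr: per-exon expression values ordered 5' to 3' (hypothetical input).
    index i means the split falls between exon i-1 and exon i (0-based).
    score is the absolute difference between the mean expression of the
    exons on the 5' side and the mean expression of those on the 3' side.
    """
    if len(exon_expr) < 2:
        raise ValueError("need at least two exons")
    best_i, best_score = None, float("-inf")
    for i in range(1, len(exon_expr)):
        score = abs(mean(exon_expr[:i]) - mean(exon_expr[i:]))
        if score > best_score:
            best_i, best_score = i, score
    return best_i, best_score


# Made-up log2 expression values: low 5' exons, high 3' exons.
expr = [2.0, 2.1, 1.9, 2.0, 7.9, 8.1, 8.0]
split, score = best_breakpoint(expr)
print(split, round(score, 2))
```

A real screen would also need to account for probe-level noise and per-exon baseline differences across samples, which this sketch ignores.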
Differences between TLR-deficient mouse colonies occur from extended husbandry in isolation that are communicated to offspring by maternal transmission.
The intestinal microbiota contributes to the development of the immune system, and conversely, the immune system influences the composition of the microbiota. Toll-like receptors (TLRs) in the gut recognize bacterial ligands. Although TLR signaling represents a major arm of the innate immune system, the extent to which TLRs influence the composition of the intestinal microbiota remains unclear. We performed deep 16S ribosomal RNA sequencing to characterize the complex bacterial populations inhabiting the ileum and cecum of TLR- and MyD88-deficient mice. The microbiota of MyD88- and TLR-deficient mouse colonies differed markedly, with each colony harboring distinct and distinguishable bacterial populations in the small and large intestine. Comparison of MyD88-, TLR2-, TLR4-, TLR5-, and TLR9-deficient mice and their respective wild-type (WT) littermates demonstrated that the impact of TLR deficiency on the composition of the intestinal microbiota is minimal under homeostatic conditions and after recovery from antibiotic treatment. Thus, differences between TLR-deficient mouse colonies reflected long-term divergence of the microbiota after extended husbandry in isolation from each other. Long-term breeding of isolated mouse colonies results in changes of the intestinal microbiota that are communicated to offspring by maternal transmission, which account for marked compositional differences between WT and mutant mouse strains.
doi:10.1084/jem.20120504
PMCID: PMC3409501
PMID: 22826298
In the present study we examine the changes in the expression of genes of Lactococcus lactis subspecies cremoris MG1363 during growth in milk. To reveal which specific classes of genes (pathways, operons, regulons, COGs) are important, we performed a transcriptome time series experiment. Global analysis of gene expression over time showed that L. lactis adapted quickly to the environmental changes. Using upstream sequences of genes with correlated gene expression profiles, we uncovered a substantial number of putative DNA binding motifs that may be relevant for L. lactis fermentative growth in milk. All available novel and literature-derived data were integrated into network reconstruction building blocks, which were used to reconstruct and visualize the L. lactis gene regulatory network. This network enables easy mining in the chrono-transcriptomics data. A freely available website at http://milkts.molgenrug.nl gives full access to all transcriptome data, to the reconstructed network and to the individual network building blocks.
doi:10.1371/journal.pone.0053085
PMCID: PMC3547956
PMID: 23349698
Gemcitabine (2,2-difluorodeoxycytidine, dFdC) is a prodrug widely used for treating various carcinomas. Gemcitabine exerts its clinical effect by depleting the deoxyribonucleotide pools and incorporating its triphosphate metabolite (dFdC-TP) into DNA, thereby inhibiting DNA synthesis. This process blocks the cell cycle in the early S phase, eventually resulting in apoptosis. The incorporation of gemcitabine into DNA takes place in competition with the natural nucleotide dCTP. The synthesis cascades of dFdC-TP and dCTP compete indirectly for common resources in the race for DNA incorporation; in clinical studies dedicated to singling out mechanisms of resistance, ribonucleotide reductase (RR), deoxycytidine kinase (dCK), and human equilibrative nucleoside transporter 1 (hENT1) have been associated with the efficacy of gemcitabine through their roles in these cascades. However, the direct competition, which manifests itself as inhibitions between these cascades, remains to be quantified. We propose an algorithmic model of the gemcitabine mechanism of action, verified against independent experimental data. We performed in silico experiments under different virtual conditions, which would be difficult to realize in vivo, to evaluate the contribution of the inhibitory mechanisms to gemcitabine efficacy. In agreement with the experimental data, our model indicates that the inhibitions due to the association of dCTP with dCK and the association of the gemcitabine diphosphate metabolite (dFdC-DP) with RR play a key role in adjusting the efficacy. While the former tunes the catalysis of the rate-limiting first phosphorylation of dFdC, the latter is responsible for depletion of dCTP pools, thereby contributing to gemcitabine efficacy with a dependency on nucleoside transport efficiency.
Our simulations predict the existence of a continuum of non-efficacy to high-efficacy regimes, where the levels of dFdC-TP and dCTP are coupled in a complementary manner, which can explain the resistance to this drug in some patients.
doi:10.1371/journal.pone.0050176
PMCID: PMC3519828
PMID: 23239976
Quantitative predictions in computational life sciences are often based on regression models. The advent of machine learning has led to highly accurate regression models that have gained widespread acceptance. While there are statistical methods available to estimate the global performance of regression models on a test or training dataset, it is often not clear how well this performance transfers to other datasets or how reliable an individual prediction is, a fact that often reduces a user’s trust in a computational method. In analogy to the concept of an experimental error, we sketch how estimators for individual prediction errors can be used to provide confidence intervals for individual predictions. Two novel statistical methods, named CONFINE and CONFIVE, can estimate the reliability of an individual prediction based on the local properties of nearby training data. The methods can be applied equally to linear and non-linear regression methods with very little computational overhead. We compare our confidence estimators with other existing confidence and applicability domain estimators on two biologically relevant problems (MHC–peptide binding prediction and quantitative structure-activity relationship (QSAR)). Our results suggest that the proposed confidence estimators perform comparably to or better than previously proposed estimation methods. Given a sufficient amount of training data, the estimators exhibit error estimates of high quality. In addition, we observed that the quality of estimated confidence intervals is predictable. We discuss how confidence estimation is influenced by noise, the number of features, and the dataset size. Estimating the confidence in individual predictions in terms of error intervals represents an important step from plain, non-informative predictions towards transparent and interpretable predictions that will help to improve the acceptance of computational methods in the biological community.
doi:10.1371/journal.pone.0048723
PMCID: PMC3499506
PMID: 23166592
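The idea of estimating the reliability of a single prediction from nearby training data, as described in the abstract above, can be sketched as follows. This is an illustration only and not the published CONFINE or CONFIVE definitions, which the abstract does not spell out. It assumes that the local error is estimated as the mean squared residual of the k nearest training points (Euclidean distance) and that an interval is formed under a normal-error assumption; the names `local_error_estimate` and `interval` and all data are hypothetical.

```python
import math


def local_error_estimate(query, train_x, train_residuals, k=3):
    """Mean squared residual of the k training points nearest to `query`."""
    if k < 1 or k > len(train_x):
        raise ValueError("k must be between 1 and the number of training points")
    # Rank training points by Euclidean distance to the query.
    order = sorted(range(len(train_x)), key=lambda i: math.dist(query, train_x[i]))
    return sum(train_residuals[i] ** 2 for i in order[:k]) / k


def interval(prediction, mse, z=1.96):
    """Symmetric interval around a prediction, assuming normal errors."""
    half = z * math.sqrt(mse)
    return prediction - half, prediction + half


# Made-up training inputs and model residuals (observed minus predicted).
train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (6, 5)]
residuals = [0.1, -0.1, 0.1, 2.0, -2.0]

mse = local_error_estimate((0.2, 0.2), train_x, residuals, k=3)
low, high = interval(1.0, mse)
print(round(mse, 4), round(low, 3), round(high, 3))
```

In this toy data the model is accurate near the origin and poor near (5, 5), so the same prediction would receive a much wider interval if the query sat in the second region.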
Jenq, Robert R. | Ubeda, Carles | Taur, Ying | Menezes, Clarissa C. | Khanin, Raya | Dudakov, Jarrod A. | Liu, Chen | West, Mallory L. | Singer, Natalie V. | Equinda, Michele J. | Gobourne, Asia | Lipuma, Lauren | Young, Lauren F. | Smith, Odette M. | Ghosh, Arnab | Hanash, Alan M. | Goldberg, Jenna D. | Aoyama, Kazutoshi | Blazar, Bruce R. | Pamer, Eric G. | R.M. van den Brink, Marcel
GVHD is associated with significant shifts in the composition of the intestinal microbiota in human and mouse models; manipulating the microbiota can alter the severity of GVHD in mice.
Despite a growing understanding of the link between intestinal inflammation and resident gut microbes, longitudinal studies of human flora before initial onset of intestinal inflammation have not been reported. Here, we demonstrate in murine and human recipients of allogeneic bone marrow transplantation (BMT) that intestinal inflammation secondary to graft-versus-host disease (GVHD) is associated with major shifts in the composition of the intestinal microbiota. The microbiota, in turn, can modulate the severity of intestinal inflammation. In mouse models of GVHD, we observed loss of overall diversity and expansion of Lactobacillales and loss of Clostridiales. Eliminating Lactobacillales from the flora of mice before BMT aggravated GVHD, whereas reintroducing the predominant species of Lactobacillus mediated significant protection against GVHD. We then characterized gut flora of patients during onset of intestinal inflammation caused by GVHD and found patterns mirroring those in mice. We also identified increased microbial chaos early after allogeneic BMT as a potential risk factor for subsequent GVHD. Together, these data demonstrate regulation of flora by intestinal inflammation and suggest that flora manipulation may reduce intestinal inflammation and improve outcomes for allogeneic BMT recipients.
doi:10.1084/jem.20112408
PMCID: PMC3348096
PMID: 22547653
Predicting miRNAs is an arduous task because of the diversity of the precursors and the complexity of the enzyme processes. Although several prediction approaches have reached impressive performance, few of them achieve full-function recognition of the mature miRNA directly from candidate hairpins across species. Researchers therefore continue to seek a more powerful model that is closer to the biological recognition of miRNA structure. In this report, we describe a novel miRNA prediction algorithm, FOMmiR, which uses a fixed-order Markov model based on the secondary structural pattern. The model’s parameters were defined and evaluated on a training dataset containing 809 human pre-miRNAs and 6441 human pseudo-miRNA hairpins. FOMmiR reached 91% accuracy on the human dataset in 5-fold cross-validation. Moreover, on the independent test datasets, FOMmiR gave outstanding predictions in human and other species, including vertebrates, Drosophila, worms, viruses, and even plants, in comparison with well-known algorithms and models. In particular, FOMmiR was able not only to distinguish miRNA precursors from other hairpins but also to locate the position and strand of the mature miRNA. This study therefore provides a new generation of miRNA prediction algorithm that achieves full-function recognition of mature miRNAs directly from hairpin sequences. It also offers a new understanding of biological recognition based on the location of the strongest signal detected by FOMmiR, which might be closely associated with the enzyme cleavage mechanism during miRNA maturation.
doi:10.1371/journal.pone.0048236
PMCID: PMC3484136
PMID: 23118959
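A fixed-order Markov model of the kind named in the abstract above can be sketched in a few lines of Python. This is a generic illustration and not the FOMmiR implementation: it assumes structures are encoded as dot-bracket strings, uses add-one smoothing, and scores a candidate by the log-likelihood ratio between a model trained on positive examples and one trained on negative examples. The function names and the toy training strings are hypothetical.

```python
import math
from collections import defaultdict


def train_markov(sequences, order, alphabet):
    """Count symbol occurrences after each context of length `order`."""
    counts = defaultdict(lambda: defaultdict(int))
    for seq in sequences:
        for i in range(order, len(seq)):
            counts[seq[i - order:i]][seq[i]] += 1
    return {"order": order, "alphabet": alphabet, "counts": counts}


def log_likelihood(model, seq):
    """Log-probability of `seq` under the model, with add-one smoothing."""
    order, alphabet, counts = model["order"], model["alphabet"], model["counts"]
    total = 0.0
    for i in range(order, len(seq)):
        ctx = counts.get(seq[i - order:i], {})
        numerator = ctx.get(seq[i], 0) + 1
        denominator = sum(ctx.values()) + len(alphabet)
        total += math.log(numerator / denominator)
    return total


def llr_score(pos_model, neg_model, seq):
    """Positive values favor the positive (hairpin-like) model."""
    return log_likelihood(pos_model, seq) - log_likelihood(neg_model, seq)


# Toy dot-bracket strings standing in for real training structures.
alphabet = "()."
pos = train_markov(["((((....))))", "(((.....)))"], order=2, alphabet=alphabet)
neg = train_markov(["..(..)..(.).", ".(.).(.)...."], order=2, alphabet=alphabet)

score_hairpin = llr_score(pos, neg, "((((...))))")
score_other = llr_score(pos, neg, "..(.)..(.)..")
print(score_hairpin > 0, score_other < 0)
```

Locating the mature miRNA within a hairpin, as the abstract describes, would require scoring positions or windows rather than whole strings; that step is not attempted here.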
Lu, Chao | Ward, Patrick S. | Kapoor, Gurpreet S. | Rohle, Dan | Turcan, Sevin | Abdel-Wahab, Omar | Edwards, Christopher R. | Khanin, Raya | Figueroa, Maria E. | Melnick, Ari | Wellen, Kathryn E. | O’Rourke, Donald M. | Berger, Shelley L. | Chan, Timothy A. | Levine, Ross L. | Mellinghoff, Ingo K. | Thompson, Craig B.
Nature
2012;483(7390):474-478.
Recurrent mutations in isocitrate dehydrogenase 1 (IDH1) and IDH2 have been identified in gliomas, acute myeloid leukaemias (AML) and chondrosarcomas, and share a novel enzymatic property of producing 2-hydroxyglutarate (2HG) from α-ketoglutarate1-6. Here we report that 2HG-producing IDH mutants can prevent the histone demethylation that is required for lineage-specific progenitor cells to differentiate into terminally differentiated cells. In tumour samples from glioma patients, IDH mutations were associated with a distinct gene expression profile enriched for genes expressed in neural progenitor cells, and this was associated with increased histone methylation. To test whether the ability of IDH mutants to promote histone methylation contributes to a block in cell differentiation in non-transformed cells, we tested the effect of neomorphic IDH mutants on adipocyte differentiation in vitro. Introduction of either mutant IDH or cell-permeable 2HG was associated with repression of the inducible expression of lineage-specific differentiation genes and a block to differentiation. This correlated with a significant increase in repressive histone methylation marks without observable changes in promoter DNA methylation. Gliomas were found to have elevated levels of similar histone repressive marks. Stable transfection of a 2HG-producing mutant IDH into immortalized astrocytes resulted in progressive accumulation of histone methylation. Of the marks examined, increased H3K9 methylation reproducibly preceded a rise in DNA methylation as cells were passaged in culture. Furthermore, we found that the 2HG-inhibitable H3K9 demethylase KDM4C was induced during adipocyte differentiation, and that RNA-interference suppression of KDM4C was sufficient to block differentiation. Together these data demonstrate that 2HG can inhibit histone demethylation and that inhibition of histone demethylation can be sufficient to block the differentiation of non-transformed cells.
doi:10.1038/nature10860
PMCID: PMC3478770
PMID: 22343901
Prognostic models are often used to estimate the length of patient survival. The Cox proportional hazards model has traditionally been applied to assess the accuracy of prognostic models. However, it may be suboptimal because of its inflexibility in modeling the baseline survival function and when the proportional hazards assumption is violated. The aim of this study was to use internal validation to compare the predictive power of a flexible Royston-Parmar family of survival functions with the Cox proportional hazards model. We applied the Palliative Performance Scale on a dataset of 590 hospice patients at the time of hospice admission. The retrospective data were obtained from the Lifepath Hospice and Palliative Care center in Hillsborough County, Florida, USA. The criteria used to evaluate and compare the models' predictive performance were the explained variation statistic R², the scaled Brier score, and the discrimination slope. The explained variation statistic demonstrated that overall the Royston-Parmar family of survival functions provided a better fit (R² = 0.298; 95% CI: 0.236–0.358) than the Cox model (R² = 0.156; 95% CI: 0.111–0.203). The scaled Brier scores and discrimination slopes were consistently higher under the Royston-Parmar model. Researchers involved in prognosticating patient survival are encouraged to consider the Royston-Parmar model as an alternative to Cox.
doi:10.1371/journal.pone.0047804
PMCID: PMC3474724
PMID: 23082220
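Two of the performance measures named in the abstract above, the scaled Brier score and the discrimination slope, can be sketched for the simplest case. The sketch assumes a binary outcome observed for every patient at one fixed time horizon and ignores censoring, which a survival analysis such as the one described would have to handle. The scaling uses the Brier score of a non-informative model that predicts the observed event rate for everyone. The data are made up.

```python
def brier_score(y, p):
    """Mean squared difference between predicted probability and outcome."""
    return sum((pi - yi) ** 2 for yi, pi in zip(y, p)) / len(y)


def scaled_brier(y, p):
    """1 - Brier / Brier_max, where Brier_max comes from predicting the
    observed event rate for every subject (non-informative model)."""
    prevalence = sum(y) / len(y)
    brier_max = prevalence * (1 - prevalence)
    return 1 - brier_score(y, p) / brier_max


def discrimination_slope(y, p):
    """Mean predicted probability in events minus that in non-events."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    non_events = [pi for yi, pi in zip(y, p) if yi == 0]
    return sum(events) / len(events) - sum(non_events) / len(non_events)


# Made-up outcomes (1 = event by the horizon) and predicted probabilities.
y = [1, 1, 0, 0]
p = [0.8, 0.6, 0.3, 0.1]

print(round(brier_score(y, p), 3))           # 0.075
print(round(scaled_brier(y, p), 3))          # 0.7
print(round(discrimination_slope(y, p), 3))  # 0.5
```

Higher values of both the scaled Brier score and the discrimination slope indicate better performance, which is the direction of comparison reported in the abstract.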
It is unclear whether the new anti-catabolic agent denosumab represents a viable alternative to the widely used anti-catabolic agent pamidronate in the treatment of Multiple Myeloma (MM)-induced bone disease. This lack of clarity primarily stems from the lack of sufficient clinical investigations, which are costly and time consuming. However, in silico investigations require less time and expense, suggesting that they may be a useful complement to traditional clinical investigations. In this paper, we aim to (i) develop integrated computational models that are suitable for investigating the effects of pamidronate and denosumab on MM-induced bone disease and (ii) evaluate the responses to pamidronate and denosumab treatments using these integrated models. To achieve these goals, pharmacokinetic models of pamidronate and denosumab are first developed and then calibrated and validated using different clinical datasets. Next, the integrated computational models are developed by incorporating the simulated transient concentrations of pamidronate and denosumab and simulations of their actions on the MM-bone compartment into the previously proposed MM-bone model. These integrated models are further calibrated and validated by different clinical datasets so that they are suitable to be applied to investigate the responses to the pamidronate and denosumab treatments. Finally, these responses are evaluated by quantifying the bone volume, bone turnover, and MM-cell density. This evaluation identifies four denosumab regimes that potentially produce an overall improved bone-related response compared with the recommended pamidronate regime. This in silico investigation supports the idea that denosumab represents an appropriate alternative to pamidronate in the treatment of MM-induced bone disease.
doi:10.1371/journal.pone.0044868
PMCID: PMC3448612
PMID: 23028650
Ugras, Stacy | Brill, Elliott | Jacobsen, Anders | Hafner, Markus | Socci, Nicholas D. | DeCarolis, Penelope L. | Khanin, Raya | O'Connor, Rachael | Mihailovic, Aleksandra | Taylor, Barry S. | Sheridan, Robert | Gimble, Jeffrey M. | Viale, Agnes | Crago, Aimee | Antonescu, Cristina R. | Sander, Chris | Tuschl, Thomas | Singer, Samuel
Liposarcoma remains the most common mesenchymal cancer, with a mortality rate of 60% among patients with this disease. To address the present lack of therapeutic options, we embarked upon a study of microRNA (miRNA) expression alterations associated with liposarcomagenesis with the goal of exploiting differentially expressed miRNAs and the gene products they regulate as potential therapeutic targets. MicroRNA expression was profiled in samples of normal adipose tissue, well-differentiated liposarcoma, and dedifferentiated liposarcoma by both deep sequencing of small RNA libraries and hybridization-based Agilent microarrays. The expression profiles discriminated liposarcoma from normal adipose tissue and well-differentiated from dedifferentiated disease. We defined over 40 miRNAs that were dysregulated in dedifferentiated liposarcomas in both the sequencing and the microarray analysis. The upregulated miRNAs included two cancer-associated species (miR-21, miR-26a), and the downregulated miRNAs included two species that were highly abundant in adipose tissue (miR-143, miR-145). Restoring miR-143 expression in dedifferentiated liposarcoma cells inhibited proliferation, induced apoptosis, and decreased expression of BCL2, TOP2A, PRC1, and PLK1. The downregulation of PRC1 and its docking partner PLK1 suggests that miR-143 inhibits cytokinesis in these cells. In support of this idea, treatment with a PLK1 inhibitor potently induced G2/M growth arrest and apoptosis in liposarcoma cells. Taken together, our findings suggest that miR-143 re-expression vectors or selective agents directed at miR-143 or its targets may have therapeutic value in dedifferentiated liposarcoma.
doi:10.1158/0008-5472.CAN-11-0890
PMCID: PMC3165140
PMID: 21693658
Advances in proteomics technologies offer an unprecedented opportunity and valuable resources to understand how living organisms execute necessary functions at the systems level. However, little work has been done to date to utilize the highly accurate spatio-temporal dynamic proteome data generated by phosphoproteomics for mathematical modeling of complex cell signaling pathways. This work proposes a novel computational framework for developing mathematical models based on proteomic datasets. Using the MAP kinase pathway as the test system, we developed a mathematical model that includes the cytosolic and nuclear subsystems and applied a genetic algorithm to infer unknown model parameters. The robustness property of the mathematical model was used as a criterion to select the appropriate rate constants from the estimated candidates. Quantitative information regarding the absolute protein concentrations was used to refine the mathematical model. We have demonstrated that the incorporation of more experimental data could significantly enhance both the simulation accuracy and the robustness property of the proposed model. In addition, we used the MAP kinase pathway inhibited by phosphatases at different concentrations to predict the signal output under different cellular conditions. Our predictions are in good agreement with the experimental observations when the MAP kinase pathway was inhibited by the phosphatases PP2A and MKP3. The successful application of the proposed modeling framework to the MAP kinase pathway suggests that our method is very promising for developing accurate mathematical models and yielding insights into the regulatory mechanisms of complex cell signaling pathways.
doi:10.1371/journal.pone.0042230
PMCID: PMC3414524
PMID: 22905119
Purpose
Gastric cancer may be subdivided into three distinct subtypes (proximal, diffuse, and distal gastric cancer) based on histopathologic and anatomic criteria. Each subtype is associated with unique epidemiology. Our aim is to test the hypothesis that these distinct gastric cancer subtypes may also be distinguished by gene expression analysis.
Experimental Design
Patients with localized gastric adenocarcinoma being screened for a phase II preoperative clinical trial (NCI 5917) underwent endoscopic biopsy for fresh tumor procurement. Four to six targeted biopsies of the primary tumor were obtained. Macrodissection was performed to ensure >80% carcinoma in the sample. The HG-U133A GeneChip (Affymetrix) was used for cDNA expression analysis, and all arrays were processed and analyzed using the Bioconductor R-package.
Results
Between November 2003 and January 2006, 57 patients were screened to identify 36 patients with localized gastric cancer who had adequate RNA for expression analysis. Using supervised analysis, we built a classifier to distinguish the three gastric cancer subtypes, successfully classifying each into tightly grouped clusters. Leave-one-out cross validation error was 0.14, suggesting that >85% of samples were classified correctly. Gene set analysis with the False Discovery Rate set at 0.25 identified several pathways that were differentially regulated when comparing each gastric cancer subtype to adjacent normal stomach.
Conclusions
Subtypes of gastric cancer that have epidemiologic and histologic distinction are also distinguished by gene expression data. These preliminary data suggest a new classification of gastric cancer with implications for improving our understanding of disease biology and identification of unique molecular drivers for each gastric cancer subtype.
doi:10.1158/1078-0432.CCR-10-2203
PMCID: PMC3100216
PMID: 21430069
Gastric Cancer; cDNA Expression; classification; pathways
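The leave-one-out cross-validation error reported in the Results above can be illustrated with a minimal sketch. The classifier actually used in the study is not described here, so the sketch substitutes a simple nearest-centroid rule as a stand-in. The two-dimensional feature vectors are made up; only the subtype labels come from the abstract.

```python
import math


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to `query`."""
    groups = {}
    for x, label in zip(train_x, train_y):
        groups.setdefault(label, []).append(x)
    return min(groups, key=lambda label: math.dist(query, centroid(groups[label])))


def loocv_error(xs, ys):
    """Fraction of samples misclassified when each is held out in turn."""
    errors = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) != ys[i]:
            errors += 1
    return errors / len(xs)


# Made-up two-feature expression summaries for nine samples.
xs = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (10, 0), (10, 1), (11, 0)]
ys = ["proximal"] * 3 + ["diffuse"] * 3 + ["distal"] * 3

print(loocv_error(xs, ys))
```

With real expression data, any gene selection step would need to be repeated inside each held-out fold; selecting genes on all samples first would bias the error estimate downward.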
For many complex traits, single nucleotide polymorphisms (SNPs) identified from genome-wide association studies (GWAS) only explain a small percentage of heritability. Next generation sequencing technology makes it possible to explore unexplained heritability by identifying rare variants (RVs). Existing tests designed for RVs look for optimal strategies to combine information across multiple variants. Many of the tests have good power when the true underlying associations are either in the same direction or in opposite directions. We propose three tests for examining the association between a phenotype and RVs, where two of them jointly consider the common association across RVs and the individual deviations from the common effect. On one hand, similar to some of the best existing methods, the individual deviations are modeled as random effects to borrow information across multiple RVs. On the other hand, unlike the existing methods which pool individual effects towards zero, we pool them towards a possibly non-zero common effect by adding a pooled variant into the model. The common effect and the individual effects are jointly tested. We show through extensive simulations that at least one of the three tests proposed here is the most powerful or very close to being the most powerful in various settings of true models. This is appealing in practice because the direction and size of the true effects of the associated RVs are unknown. Researchers can apply the developed tests to improve power under a wide range of true models.
doi:10.1371/journal.pone.0032485
PMCID: PMC3309869
PMID: 22468164
Lim, Seunghwan | Bae, Eunjin | Kim, Hae-Suk | Kim, Tae-Aug | Byun, Kyunghee | Kim, Byungchul | Hong, Suntaek | Im, Jong Pil | Yun, Chohee | Lee, Bona | Lee, Bonghee | Park, Seok Hee | Letterio, John | Kim, Seong-Jin | Khanin, Raya
Transforming growth factor-β1 (TGF-β1) is an important anti-inflammatory cytokine that modulates and resolves inflammatory responses. Recent studies have demonstrated that inflammation enhances neoplastic risk and potentiates tumor progression. In the evolution of cancer, pro-inflammatory cytokines such as IL-1β must overcome the anti-inflammatory effects of TGF-β to boost pro-inflammatory responses in epithelial cells. Here we show that IL-1β or lipopolysaccharide (LPS) suppresses TGF-β-induced anti-inflammatory signaling in an NF-κB-independent manner. TRAF6, a key molecule in IL-1β signaling, mediates this suppressive effect through interaction with the type III TGF-β receptor (TβRIII), which is TGF-β-dependent and requires type I TGF-β receptor (TβRI) kinase activity. TβRI phosphorylates TβRIII at residue S829, which promotes the TRAF6/TβRIII interaction and consequent sequestration of TβRIII from the TβRII/TβRI complex. Our data indicate that IL-1β enhances the pro-inflammatory response by suppressing TGF-β signaling through TRAF6-mediated sequestration of TβRIII, which may be an important contributor to the early stages of tumor progression.
doi:10.1371/journal.pone.0032705
PMCID: PMC3299683
PMID: 22427868
The serine/threonine kinase LKB1 is a tumour suppressor that regulates multiple biological pathways, including cell cycle control, cell polarity and energy metabolism by direct phosphorylation of 14 different AMP-activated protein kinase (AMPK) family members. Although many downstream targets have been described, the regulation of LKB1 gene expression is still poorly understood. In this study, we performed a functional analysis of the human LKB1 upstream regulatory region. We used 200 base pair deletion constructs of the 5′-flanking region fused to a luciferase reporter to identify the core promoter. It encompasses nucleotides −345 to +52 relative to the transcription start site and coincides with a DNase I hypersensitive site. Based on extensive deletion and substitution mutant analysis of the LKB1 promoter, we identified four cis-acting elements which are critical for transcriptional activation. Using electrophoretic mobility shift assays as well as chromatin immunoprecipitations, we demonstrate that the transcription factors Sp1, NF-Y and two forkhead box O (FOXO) family members FOXO3 and FOXO4 bind to these elements. Overexpression of these factors significantly increased the LKB1 promoter activity. Conversely, small interfering RNAs directed against NF-Y alpha and the two FOXO proteins greatly reduced endogenous LKB1 expression and phosphorylation of LKB1's main substrate AMPK in three different cell lines. Taken together, these results demonstrate that Sp1, NF-Y and FOXO transcription factors are involved in the regulation of LKB1 transcription.
doi:10.1371/journal.pone.0032590
PMCID: PMC3295762
PMID: 22412893
A comparative analysis of genome-scale transcriptomic data of two types of skin cancers, melanoma and basal cell carcinoma in comparison with other cancer types, was conducted with the aim of identifying key regulatory factors that either cause or contribute to the aggressiveness of melanoma, while basal cell carcinoma generally remains a mild disease. Multiple cancer-related pathways such as cell proliferation, apoptosis, angiogenesis, cell invasion and metastasis, are considered, but our focus is on energy metabolism, cell invasion and metastasis pathways. Our findings include the following. (a) Both types of skin cancers use both glycolysis and increased oxidative phosphorylation (electron transfer chain) for their energy supply. (b) Advanced melanoma shows substantial up-regulation of key genes involved in fatty acid metabolism (β-oxidation) and oxidative phosphorylation, with aerobic metabolism being far more efficient than anaerobic glycolysis, providing a source of the energetics necessary to support the rapid growth of this cancer. (c) While advanced melanoma is similar to pancreatic cancer in terms of the activity level of genes involved in promoting cell invasion and metastasis, the main metastatic form of basal cell carcinoma is substantially reduced in this activity, partially explaining why this cancer type has been considered as far less aggressive. Our method of using comparative analyses of transcriptomic data of multiple cancer types focused on specific pathways provides a novel and highly effective approach to cancer studies in general.
doi:10.1371/journal.pone.0030750
PMCID: PMC3266277
PMID: 22295108
In previous work, we proposed a method for detecting differential gene expression (DGE) based on the change-point of an expression profile. This non-parametric change-point statistic (NPCPS) method gave promising results in both a simulation study and experiments on public datasets. However, its performance is still limited by low sensitivity to the right bound, and the statistical significance of the statistic has not been fully explored. To overcome the insensitivity to the right bound, we modified the original method by adding a weight function to the Dn statistic. A simulation study showed that the weighted change-point statistic method is significantly better than the original NPCPS in terms of ROC, false positive rate (FPR), and change-point estimate. The mean absolute error of the change-point estimated by the weighted method was 0.03, a reduction of more than 50% compared with the original 0.06, and the mean FPR was reduced by more than 55%. An experiment on microarray Dataset I yielded 3974 differentially expressed genes out of a total of 5293 genes; an experiment on microarray Dataset II yielded 9983 differentially expressed genes out of a total of 12576 genes. In summary, the method proposed here is an effective modification of the previous method, especially when only a small subset of cancer samples shows DGE.
doi:10.1371/journal.pone.0029860
PMCID: PMC3262809
PMID: 22276133
Alfred, Tamuno | Ben-Shlomo, Yoav | Cooper, Rachel | Hardy, Rebecca | Cooper, Cyrus | Deary, Ian J. | Gaunt, Tom R. | Gunnell, David | Harris, Sarah E. | Kumari, Meena | Martin, Richard M. | Sayer, Avan Aihie | Starr, John M. | Kuh, Diana | Day, Ian N. M. | Khanin, Raya
Background
Low muscle mass and function have been associated with poorer indicators of physical capability in older people, which are in turn associated with increased mortality rates. The growth hormone/insulin-like growth factor (GH/IGF) axis is involved in muscle function, and genetic variants in genes in the axis may influence measures of physical capability.
Methods
As part of the Healthy Ageing across the Life Course (HALCyon) programme, men and women from seven UK cohorts aged between 52 and 90 years old were genotyped for six polymorphisms: rs35767 (IGF1), rs7127900 (IGF2), rs2854744 (IGFBP3), rs2943641 (IRS1), rs2665802 (GH1) and the exon-3 deletion of GHR. The polymorphisms have previously been robustly associated with age-related traits or are potentially functional. Meta-analysis was used to pool within-study genotypic effects of the associations between the polymorphisms and four measures of physical capability: grip strength, timed walk or get up and go, chair rises and standing balance.
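As an illustration of pooling within-study genotypic effects, the following is a minimal sketch of a fixed-effect, inverse-variance meta-analysis of per-allele log odds ratios. The abstract does not state which pooling model was used, so the fixed-effect choice, the function names and the per-study numbers are assumptions made purely for illustration; they are not HALCyon data or results.

```python
import math

def pool_fixed_effect(log_ors, std_errs):
    """Inverse-variance (fixed-effect) pooling of per-study log odds ratios."""
    weights = [1.0 / se ** 2 for se in std_errs]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, log_ors)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return pooled, pooled_se

def summarize(pooled, pooled_se):
    """Return the pooled odds ratio, its 95% CI and a two-sided normal p-value."""
    z = pooled / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided p under N(0, 1)
    lower = math.exp(pooled - 1.96 * pooled_se)
    upper = math.exp(pooled + 1.96 * pooled_se)
    return math.exp(pooled), (lower, upper), p_value

# Hypothetical per-study estimates (assumed values, NOT data from the study).
study_log_ors = [math.log(0.85), math.log(0.95), math.log(0.90)]
study_ses = [0.10, 0.08, 0.12]

pooled, pooled_se = pool_fixed_effect(study_log_ors, study_ses)
odds_ratio, ci, p_value = summarize(pooled, pooled_se)
print(f"pooled OR = {odds_ratio:.2f}, "
      f"95% CI = {ci[0]:.2f}-{ci[1]:.2f}, p = {p_value:.3f}")
```

Studies with smaller standard errors receive larger weights, so the pooled estimate is dominated by the most precise cohorts. A random-effects model would additionally account for between-study heterogeneity.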
Results
Few important associations were observed among the several tests. We found evidence that rs2665802 in GH1 was associated with inability to balance for 5 s (pooled odds ratio per minor allele = 0.90, 95% CI: 0.82–0.98, p-value = 0.01, n = 10,748), after adjusting for age and sex. We found no evidence for other associations between the polymorphisms and physical capability traits.
Conclusion
Our findings do not provide evidence for a substantial influence of these common polymorphisms in the GH/IGF axis on objectively measured physical capability levels in older adults.
doi:10.1371/journal.pone.0029883
PMCID: PMC3254646
PMID: 22253814
The study of the dynamics of biological systems requires elucidation of the transitions between steady states. A “small perturbation” approach can provide important information on the “steady state” of a biological system. In our experiments, small perturbations were generated by applying a series of repeated small doses of ultraviolet radiation to a human keratinocyte cell line, HaCaT. The biological response was assessed by monitoring the gene expression profiles using cDNA microarrays. Repeated small doses (10 J/m²) of ultraviolet B (UVB) exposure modulated the expression profiles of two groups of genes in opposite directions. The genes that were up-regulated have functions mainly associated with anti-proliferation/anti-mitogenesis/apoptosis, and the genes that were down-regulated were mainly related to proliferation/mitogenesis/anti-apoptosis. For both groups of genes, repetition of the small doses of UVB caused an immediate response followed by relaxation between successive small perturbations. This cyclic pattern was suppressed when large doses (233 or 582.5 J/m²) of UVB were applied. Our method and results contribute to a foundation for computational systems biology, which implicitly uses the concept of steady state.
doi:10.1371/journal.pone.0029241
PMCID: PMC3240659
PMID: 22195030
How a living organism maintains its healthy equilibrium in response to continual exposure to potentially harmful chemicals is an important question in current biology. By transcriptomic analysis of zebrafish livers treated with various chemicals, we defined hubs as molecular pathways that are frequently perturbed by chemicals and have a high degree of functional connectivity to other pathways. Our network analysis revealed that these hubs were organized into two groups showing inverted functionality with respect to each other. Intriguingly, the inverted activity profiles in these two groups of hubs were observed to associate only with toxicopathological states and not with physiological changes. Furthermore, these inverted profiles were also present in rat, mouse, and human under certain toxicopathological conditions. Thus, toxicopathology-associated anti-correlated profiles in hubs indicate not only their potential use in diagnosis but also their potential for the development of systems-based therapeutics that modulate gene expression by chemical approaches in order to rewire the deregulated activities of hubs back to normal physiology.
doi:10.1371/journal.pone.0027819
PMCID: PMC3226580
PMID: 22140468
Integrin signaling regulates cell migration and plays a pivotal role in developmental processes and cancer metastasis. Integrin signaling has been studied extensively and much data is available on pathway components and interactions. Yet the data is fragmented and an integrated model is missing. We use a rule-based modeling approach to integrate available data and test biological hypotheses regarding the role of talin, Dok1 and PIPKI in integrin activation. The detailed biochemical characterization of integrin signaling provides us with measured values for most of the kinetic parameters. However, measurements are not fully accurate, and the cellular concentrations of signaling proteins are largely unknown and expected to vary substantially across different cellular conditions. By sampling model behaviors over the physiologically realistic parameter range, we find that the model exhibits only two different qualitative behaviors and that these depend mainly on the relative protein concentrations, which offers a powerful point of control to the cell. Our study highlights the need to characterize model behavior not for a single parameter optimum, but by identifying parameter sets that characterize different signaling modes.
doi:10.1371/journal.pone.0024808
PMCID: PMC3217926
PMID: 22110576
The availability of electronic health care records is unlocking the potential for novel studies on understanding and modeling disease co-morbidities based on both phenotypic and genetic data. Moreover, the emergence of increasingly reliable phenotypic data can aid further studies investigating the potential genetic links among diseases. The goal is to create a feedback loop where computational tools guide and facilitate research, leading to improved biological knowledge and clinical standards, which in turn should generate better data. We build and analyze disease interaction networks based on data collected from previous genetic association studies and patient medical histories, spanning over 12 years, acquired from a regional hospital. By exploring both individual and combined interactions between these two levels of disease data, we provide novel insight into the interplay between genetics and clinical realities. Our results show a marked difference between the well-defined structure of genetic relationships and the chaotic co-morbidity network, but also highlight clear interdependencies. We demonstrate the power of these dependencies by proposing a novel multi-relational link prediction method, showing that disease co-morbidity can enhance our currently limited knowledge of genetic association. Furthermore, our methods for integrated networks of diverse data are widely applicable and can provide novel advances for many problems in systems biology and personalized medicine.
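To make the idea of link prediction across two relation types concrete, here is a minimal sketch that ranks candidate genetic links between diseases using shared neighbours in both a genetic network and a co-morbidity network. This is not the method proposed in the paper, whose details the abstract does not give: the common-neighbours score, the mixing weight `alpha`, the function names and the placeholder diseases `D1` to `D4` are all assumptions for illustration.

```python
from itertools import combinations

def common_neighbors(graph, u, v):
    """Number of neighbours shared by u and v in an adjacency-set graph."""
    return len(graph.get(u, set()) & graph.get(v, set()))

def multi_relational_scores(genetic, comorbidity, alpha=0.5):
    """Rank disease pairs not yet linked in the genetic network.

    The score is a weighted sum of shared neighbours in the genetic network
    (weight alpha) and in the co-morbidity network (weight 1 - alpha).
    This scoring rule is an illustrative assumption.
    """
    nodes = sorted(set(genetic) | set(comorbidity))
    scores = {}
    for u, v in combinations(nodes, 2):
        if v in genetic.get(u, set()):
            continue  # already a known genetic link
        score = (alpha * common_neighbors(genetic, u, v)
                 + (1 - alpha) * common_neighbors(comorbidity, u, v))
        if score > 0:
            scores[(u, v)] = score
    return sorted(scores.items(), key=lambda item: -item[1])

# Hypothetical toy networks (placeholder disease names, not real data).
genetic = {"D1": {"D2"}, "D2": {"D1", "D3"}, "D3": {"D2"}, "D4": set()}
comorbidity = {"D1": {"D3", "D4"}, "D2": set(),
               "D3": {"D1", "D4"}, "D4": {"D1", "D3"}}

ranked = multi_relational_scores(genetic, comorbidity)
for pair, score in ranked:
    print(pair, score)
```

In this toy example the pair `D1`/`D3` ranks first because it is supported by a shared neighbour in both networks, which is the kind of interdependency the co-morbidity data is meant to contribute.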
doi:10.1371/journal.pone.0022670
PMCID: PMC3146471
PMID: 21829475
Background
To understand complex biological signalling mechanisms, mathematical modelling of signal transduction pathways has been applied successfully in the last few years. However, precise quantitative measurement of signal transduction events, such as activation-dependent phosphorylation of proteins, remains a bottleneck to this success.
Methodology/Principal Findings
We use multi-colour immunoprecipitation measured by flow cytometry (IP-FCM) to study signal transduction events with unrivalled precision. In this method, antibody-coupled latex beads capture the protein of interest from cellular lysates and are then stained with antibodies carrying different fluorescent labels to quantify the amount of the immunoprecipitated protein, of an interaction partner and of phosphorylation sites. The fluorescence signals are measured by FCM. Combining this procedure with beads containing defined amounts of a fluorophore allows absolute numbers of stained proteins, and not only relative values, to be retrieved. Using IP-FCM we derived multidimensional data on the membrane-proximal T-cell antigen receptor (TCR-CD3) signalling network, including the recruitment of the kinase ZAP70 to the TCR-CD3 and subsequent ZAP70 activation by phosphorylation in a murine T-cell hybridoma and in primary murine T cells. Counter-intuitively, these data showed that cell stimulation by pervanadate led to a transient decrease of the phospho-ZAP70/ZAP70 ratio at the TCR. A mechanistic mathematical model of the underlying processes demonstrated that an initial massive recruitment of non-phosphorylated ZAP70 was responsible for this behaviour. Further, the model predicted a temporal order of multisite phosphorylation of ZAP70 (with Y319 phosphorylation preceding phosphorylation at Y493) that we subsequently verified experimentally.
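The two quantitative ideas in this paragraph, converting fluorescence into absolute numbers with calibration beads and the drop in the phospho-ZAP70/ZAP70 ratio, can be sketched as follows. The sketch assumes a linear relation between signal and fluorophore amount and, for simplicity, a single calibration shared by both channels; all numbers and function names are hypothetical and are not measurements from the study.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def to_molecules(signal, slope, intercept):
    """Invert the calibration line: fluorescence signal -> molecules per bead."""
    return (signal - intercept) / slope

# Hypothetical calibration beads: known fluorophore molecules vs measured signal.
bead_molecules = [1000.0, 5000.0, 20000.0, 100000.0]
bead_signal = [12.0, 52.0, 202.0, 1002.0]
slope, intercept = fit_line(bead_molecules, bead_signal)

# Hypothetical signals before and after stimulation (assumed values).
total_before = to_molecules(22.0, slope, intercept)
phospho_before = to_molecules(12.0, slope, intercept)
total_after = to_molecules(102.0, slope, intercept)
phospho_after = to_molecules(22.0, slope, intercept)

ratio_before = phospho_before / total_before
ratio_after = phospho_after / total_after
print(f"ratio before = {ratio_before:.2f}, ratio after = {ratio_after:.2f}")
```

In these made-up numbers the amount of phosphorylated ZAP70 at the receptor doubles, yet the ratio falls because total bound ZAP70 grows fivefold. This is the same arithmetic by which a massive recruitment of non-phosphorylated ZAP70 can produce a transient decrease of the ratio.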
Conclusions/Significance
The quantitative data sets generated by IP-FCM are one order of magnitude more precise than Western blot data. This precision allowed us to gain unequalled insight into the dynamics of the TCR-CD3-ZAP70 signalling network.
doi:10.1371/journal.pone.0022928
PMCID: PMC3146539
PMID: 21829558