1.  eQTL networks unveil enriched mRNA master integrators downstream of complex disease-associated SNPs 
The causal and interplay mechanisms of Single Nucleotide Polymorphisms (SNPs) associated with complex diseases (complex disease SNPs) investigated in genome-wide association studies (GWAS) at the transcriptional level (mRNA) are poorly understood despite recent advancements such as discoveries reported in the Encyclopedia of DNA Elements (ENCODE) and Genotype-Tissue Expression (GTEx). Protein interaction network analyses have successfully improved our understanding of both single gene diseases (Mendelian diseases) and complex diseases. Whether the mRNAs downstream of complex disease genes are central or peripheral in the genetic information flow relating DNA to mRNA remains unclear and may be disease-specific. Using expression Quantitative Trait Loci (eQTL) that provide DNA to mRNA associations and network centrality metrics, we hypothesize that we can unveil the systems properties of information flow between SNPs and the transcriptomes of complex diseases. We compare different conditions such as naïve SNP assignments and stringent linkage disequilibrium (LD) free assignments for transcripts to remove confounders from LD. Additionally, we compare the results from eQTL networks between lymphoblastoid cell lines and liver tissue. Empirical permutation resampling (p<0.001) and theoretic Mann-Whitney U test (p<10^−30) statistics indicate that mRNAs corresponding to complex disease SNPs via eQTL associations are likely to be regulated by a larger number of SNPs than expected. We name this novel property mRNA hubness in eQTL networks, and further term mRNAs with high hubness as master integrators. mRNA master integrators receive and coordinate the perturbation signals from large numbers of polymorphisms and respond to the personal genetic architecture integratively.
This genetic signal integration contrasts with the mechanism underlying some Mendelian diseases, where a genetic polymorphism affecting a single protein hub produces a divergent signal that affects a large number of downstream proteins. Indeed, we verify that this property is independent of the hubness in protein networks for which these mRNAs are transcribed. Our findings provide novel insights into the pleiotropy of mRNAs targeted by complex disease polymorphisms and the architecture of the information flow between the genetic polymorphisms and transcriptomes of complex diseases.
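The notion of mRNA hubness and the empirical permutation test described in this abstract can be illustrated with a minimal sketch. It assumes that hubness is simply the number of distinct SNPs linked to an mRNA by eQTL associations and that the test statistic is the mean hubness of a set of mRNAs; the function names, the statistic, and the input format are illustrative assumptions, not the authors' implementation.

```python
import random
from collections import Counter


def mrna_hubness(eqtl_edges):
    """Count the distinct SNPs associated with each mRNA.

    eqtl_edges: iterable of (snp_id, mrna_id) pairs (hypothetical input format).
    """
    return Counter(mrna for _snp, mrna in set(eqtl_edges))


def permutation_p_value(hubness, disease_mrnas, n_perm=10000, seed=0):
    """One-sided empirical p-value for the mean hubness of disease mRNAs.

    Compares the observed mean against random mRNA sets of the same size
    drawn from all mRNAs present in the network.
    """
    rng = random.Random(seed)
    all_mrnas = sorted(hubness)
    targets = [m for m in all_mrnas if m in disease_mrnas]
    observed = sum(hubness[m] for m in targets) / len(targets)
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(all_mrnas, len(targets))
        if sum(hubness[m] for m in sample) / len(sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)
```

The add-one correction keeps the reported p-value above zero, so a result such as p<0.001 requires at least 1000 permutations.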
PMCID: PMC4684766  PMID: 26524128
translational bioinformatics; centrality; complex diseases; eQTL; SNP; Single Nucleotide Polymorphism; master integrator; computational genomics; genomics; transcriptome; mRNA; network biology; big data; computational biology; computational medicine; complex disease; genetics; systems biology; systems medicine; signal integration
2.  Towards a PBMC “virogram assay” for precision medicine: concordance between ex vivo and in vivo viral infection transcriptomes 
Understanding an individual patient's host response to viruses is key to designing optimal personalized therapy. Unsurprisingly, the individualized dynamic response of the human transcriptome to viruses is rarely studied in vivo because of the obvious limitations stemming from ethical considerations of clinical risk.
In this rhinovirus study, we first hypothesized that ex vivo human cells response to virus can serve as proxy for otherwise controversial in vivo human experimentation. We further hypothesized that the N-of-1-pathways framework, previously validated in cancer, can be effective in understanding the more subtle individual transcriptomic response to viral infection.
N-of-1-pathways computes a significance score for a given list of gene sets at the patient level, using merely the ‘omics profiles of two paired samples as input. We extracted the peripheral blood mononuclear cells (PBMC) of four human subjects, aliquoted in two paired samples, one subjected to ex vivo rhinovirus infection. Their dysregulated genes and pathways were then compared to those of nine human subjects before and after intranasal inoculation in vivo with rhinovirus. Additionally, we developed the Similarity Venn Diagram, a novel visualization method that goes beyond conventional overlap to show the similarity between two sets of qualitative measures.
We evaluated the individual N-of-1-pathways results using two established cohort-based methods: GSEA and enrichment of differentially expressed genes. Similarity Venn Diagrams and individual patient ROC curves illustrate and quantify that the in vivo dysregulation is recapitulated ex vivo both at the gene and pathway level (p-values≤0.004).
We established the first evidence that an interpretable dynamic transcriptome metric, conducted as an ex vivo assay for a single subject, has the potential to predict individualized response to infectious disease without the clinical risks otherwise associated with in vivo challenges. These results serve as foundational work for personalized “virograms”.
PMCID: PMC4951181  PMID: 25797143
personal transcriptome; rhinovirus; PBMC; genomic response; virogram; Similarity Venn Diagrams
3.  Evidence Suggesting that Discontinuous Dosing of ALK Kinase Inhibitors May Prolong Control of ALK+ Tumors 
Cancer research  2015;75(14):2916-2927.
The anaplastic lymphoma kinase ALK is chromosomally rearranged in a subset of certain cancers, including 2–7% non-small cell lung cancers (NSCLC) and ~70% of anaplastic large cell lymphomas (ALCL). The ALK kinase inhibitors crizotinib and ceritinib are approved for relapsed ALK+ NSCLC, but acquired resistance to these drugs limits median progression-free survival on average to ~10 months. Kinase domain mutations are detectable in 25–37% of resistant NSCLC samples, with activation of bypass signaling pathways detected frequently with or without concurrent ALK mutations. Here we report that, in contrast to NSCLC cells, drug resistant ALCL cells show no evidence of bypassing ALK by activating alternate signaling pathways. Instead, drug resistance selected in this setting reflects upregulation of ALK itself. Notably, in the absence of crizotinib or ceritinib, we found that increased ALK signaling rapidly arrested or killed cells, allowing a prolonged control of drug-resistant tumors in vivo with the administration of discontinuous rather than continuous regimens of drug dosing. Furthermore, even when drug resistance mutations were detected in the kinase domain, overexpression of the mutant ALK was toxic to tumor cells. We confirmed these findings derived from human ALCL cells in murine pro-B cells that were transformed to cytokine independence by ectopic expression of an activated NPM-ALK fusion oncoprotein. In summary, our results show how ALK activation functions as a double-edged sword for tumor cell viability, with potential therapeutic implications.
PMCID: PMC4506255  PMID: 26018086
Tyrosine-Kinase Inhibitors; ALK; Oncogene Overdose; Crizotinib; Ceritinib
4.  Analysis of aggregated cell–cell statistical distances within pathways unveils therapeutic-resistance mechanisms in circulating tumor cells 
Bioinformatics  2016;32(12):i80-i89.
Motivation: As ‘omics’ biotechnologies accelerate the capability to contrast a myriad of molecular measurements from a single cell, they also exacerbate current analytical limitations for detecting meaningful single-cell dysregulations. Moreover, mRNA expression alone lacks functional interpretation, limiting opportunities for translation of single-cell transcriptomic insights to precision medicine. Lastly, most single-cell RNA-sequencing analytic approaches are not designed to investigate small populations of cells such as circulating tumor cells shed from solid tumors and isolated from patient blood samples.
Results: In response to these characteristics and limitations in current single-cell RNA-sequencing methodology, we introduce an analytic framework that models transcriptome dynamics through the analysis of aggregated cell–cell statistical distances within biomolecular pathways. Cell–cell statistical distances are calculated from pathway mRNA fold changes between two cells. Within an elaborate case study of circulating tumor cells derived from prostate cancer patients, we develop analytic methods of aggregated distances to identify five differentially expressed pathways associated to therapeutic resistance. Our aggregation analyses perform comparably with Gene Set Enrichment Analysis and better than differentially expressed genes followed by gene set enrichment. However, these methods were not designed to inform on differential pathway expression for a single cell. As such, our framework culminates with the novel aggregation method, cell-centric statistics (CCS). CCS quantifies the effect size and significance of differentially expressed pathways for a single cell of interest. Improved rose plots of differentially expressed pathways in each cell highlight the utility of CCS for therapeutic decision-making.
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4908332  PMID: 27307648
5.  Metrics and tools for consistent cohort discovery and financial analyses post-transition to ICD-10-CM 
In the United States, International Classification of Disease Clinical Modification (ICD-9-CM, the ninth revision) diagnosis codes are commonly used to identify patient cohorts and to conduct financial analyses related to disease. In October 2015, the healthcare system of the United States will transition to ICD-10-CM (the tenth revision) diagnosis codes. One challenge posed to clinical researchers and other analysts is conducting diagnosis-related queries across datasets containing both coding schemes. Further, healthcare administrators will manage growth, trends, and strategic planning with these dually-coded datasets. The majority of the ICD-9-CM to ICD-10-CM translations are complex and nonreciprocal, creating convoluted representations and meanings. Similarly, mapping back from ICD-10-CM to ICD-9-CM is equally complex, yet different from mapping forward, as relationships are likewise nonreciprocal. Indeed, 10 of the 21 top clinical categories are complex as 78% of their diagnosis codes are labeled as “convoluted” by our analyses. Analysis and research related to external causes of morbidity, injury, and poisoning will face the greatest challenges due to 41 745 (90%) convolutions and a decrease in the number of codes. We created a web portal tool and translation tables to list all ICD-9-CM diagnosis codes related to the specific input of ICD-10-CM diagnosis codes and their level of complexity: “identity” (reciprocal), “class-to-subclass,” “subclass-to-class,” “convoluted,” or “no mapping.” These tools provide guidance on ambiguous and complex translations to reveal where reports or analyses may be challenging to impossible.
Web portal:
Tables annotated with levels of translation complexity:
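The five levels of translation complexity listed in this abstract can be sketched as a classification over a bipartite mapping between the two code sets. The sketch below is a simplified illustration, assuming the mapping is given as plain (source code, target code) pairs and using one plausible reading of each category. The function names and the placeholder codes in the usage note are hypothetical, and the rules are not taken from the authors' portal.

```python
from collections import defaultdict


def build_index(edges):
    """Index (source_code, target_code) pairs in both directions."""
    fwd, back = defaultdict(set), defaultdict(set)
    for src, dst in edges:
        fwd[src].add(dst)
        back[dst].add(src)
    return fwd, back


def classify(code, fwd, back):
    """Label the translation complexity of one source code (assumed rules)."""
    targets = fwd.get(code, set())
    if not targets:
        return "no mapping"
    # Every source code that shares at least one target with `code`.
    siblings = set().union(*(back[t] for t in targets))
    if siblings == {code}:
        # The targets map back only to this code.
        return "identity" if len(targets) == 1 else "class-to-subclass"
    if len(targets) == 1 and all(fwd[s] == targets for s in siblings):
        # Several source codes collapse into exactly one target.
        return "subclass-to-class"
    return "convoluted"
```

For example, with the placeholder pairs `[("s1", "t1"), ("s2", "t2"), ("s2", "t3")]`, `s1` is labeled an identity and `s2` a class-to-subclass translation; any entangled many-to-many pattern falls through to "convoluted".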
PMCID: PMC4457110  PMID: 25681260
ICD-10-CM; medical informatics; network patterns; patient cohort; financial analyses; ICD-9-CM
6.  The Complexity and Challenges of the ICD-9-CM to ICD-10-CM Transition in Emergency Departments 
Beginning October 2015, the Centers for Medicare & Medicaid Services (CMS) will require medical providers to utilize the vastly expanded ICD-10-CM system. Despite wide availability of information and mapping tools for the next generation of the ICD classification system, some of the challenges associated with transition from ICD-9-CM to ICD-10-CM are not well understood. To quantify the challenges faced by emergency physicians, we analyzed a subset of a 2010 Illinois Medicaid database of emergency department ICD-9-CM codes, seeking to determine the accuracy of existing mapping tools in order to better prepare emergency physicians for the change to the expanded ICD-10-CM system. We found that 27% of 1,830 codes represented convoluted multidirectional mappings. We then analyzed the convoluted transitions and found 8% of total visit encounters (23% of the convoluted transitions) were clinically incorrect. The ambiguity and inaccuracy of these mappings may impact the work flow associated with the translation process and affect the potential mapping between ICD codes and Current Procedural Terminology (CPT) codes, which determine physician reimbursement.
PMCID: PMC4430372  PMID: 25863652
Clinical informatics; Health informatics; ICD-10-CM; Billing; Reimbursement
7.  Wnt7a is a novel inducer of β-catenin-independent Tumor-Suppressive Cellular Senescence in Lung Cancer 
Oncogene  2015;34(42):5317-5328.
Cellular senescence is an initial barrier for carcinogenesis. However, the signaling mechanisms that trigger cellular senescence are incompletely understood, particularly in vivo. Here, we identify Wnt7a as a novel upstream inducer of cellular senescence. In two different mouse strains (C57Bl/6J and FVB/NJ) we show that the loss of Wnt7a is a major contributing factor for increased lung tumorigenesis owing to reduced cellular senescence, and not reduced apoptosis, or autophagy. Wnt7a null mice under de novo conditions and in both the strains display E-cadherin-to-N-cadherin switch, reduced expression of cellular senescence markers, and reduced expression of senescence-associated secretory phenotype, indicating a genetic predisposition of these mice to increased carcinogen-induced lung tumorigenesis. Interestingly, Wnt7a induced an alternate senescence pathway, which was independent of β-catenin, and distinct from that of classical oncogene-induced senescence mediated by the well-known p16INK4a and p19ARF pathways. Mechanistically, Wnt7a induced cellular senescence via inactivation of SKP2, an important alternate regulator of cellular senescence. Additionally, we identified Iloprost, a prostacyclin analog, which initiates downstream signaling cascades similar to that of Wnt7a, as a novel inducer of cellular senescence, presenting potential future clinical translational strategies. Thus, pro-senescence therapies using either Wnt7a or its mimic, Iloprost, might represent a new class of therapeutic treatments for lung cancer.
PMCID: PMC4558401  PMID: 25728679
Wnt7a; Senescence; SKP2; p27; beta-catenin independent
8.  Rethinking the role and impact of health information technology: informatics as an interventional discipline 
Recent advances in the adoption and use of health information technology (HIT) have had a dramatic impact on the practice of medicine. In many environments, this has led to the ability to achieve new efficiencies and levels of safety. In others, the impact has been less positive, and is associated with both: 1) workflow and user experience dissatisfaction; and 2) perceptions of missed opportunities relative to the use of computational tools to enable data-driven and precise clinical decision making. Simultaneously, the “pipeline” through which new diagnostic tools and therapeutic agents are being developed and brought to the point-of-care or population health is challenged in terms of both cost and timeliness. Given the confluence of these trends, it can be argued that now is the time to consider new ways in which HIT can be used to deliver health and wellness interventions comparable to traditional approaches (e.g., drugs, devices, diagnostics, and behavioral modifications). Doing so could serve to fulfill the promise of what has been recently promoted as “precision medicine” in a rapid and cost-effective manner. However, it will also require the health and life sciences community to embrace new modes of using HIT, wherein the use of technology becomes a primary intervention as opposed to enabler of more conventional approaches, a model that we refer to in this commentary as “interventional informatics”. Such a paradigm requires attention to critical issues, including: 1) the nature of the relationships between HIT vendors and healthcare innovators; 2) the formation and function of multidisciplinary teams consisting of technologists, informaticians, and clinical or scientific subject matter experts; and 3) the optimal design and execution of clinical studies that focus on HIT as the intervention of interest. 
Ultimately, the goal of an “interventional informatics” approach can and should be to substantially improve human health and wellness through the use of data-driven interventions at the point of care or at broader population levels. Achieving a vision of “interventional informatics” will require us to rethink how we study HIT tools in order to generate the necessary evidence base that can support and justify their use as a primary means of improving the human condition.
PMCID: PMC4812636  PMID: 27025583
Biomedical research; Informatics; Research design
9.  Challenges and remediation for Patient Safety Indicators in the transition to ICD-10-CM 
Reporting of hospital adverse events relies on Patient Safety Indicators (PSIs) using International Classification of Diseases, Ninth Edition, Clinical Modification (ICD-9-CM) codes. The US transition to ICD-10-CM in 2015 could result in erroneous comparisons of PSIs. Using the General Equivalent Mappings (GEMs), we compared the accuracy of ICD-9-CM coded PSIs against recommended ICD-10-CM codes from the Centers for Medicare & Medicaid Services (CMS). We further predict their impact in a cohort of 38 644 patients (1 446 581 visits and 399 hospitals). We compared the predicted results to the published PSI related ICD-10-CM diagnosis codes. We provide the first report of substantial hospital safety reporting errors with five direct comparisons from the 23 types of PSIs (transfusion and anesthesia related PSIs). One PSI was excluded from the comparison between code sets due to reorganization, while 15 additional PSIs were inaccurate to a lesser degree due to the complexity of the coding translation. The ICD-10-CM translations proposed by CMS pose impending risks for (1) comparing safety incidents, (2) inflating the number of PSIs, and (3) increasing the variability of calculations attributable to the abundance of coding system translations. Ethical organizations addressing ‘data-, process-, and system-focused’ improvements could be penalized using the new ICD-10-CM Agency for Healthcare Research and Quality PSIs because of apparent increases in PSIs bearing the same PSI identifier and label, yet calculated differently. Here we investigate which PSIs would reliably transition between ICD-9-CM and ICD-10-CM, and those at risk of under-reporting and over-reporting adverse events while the frequency of these adverse events remains unchanged.
PMCID: PMC4433358  PMID: 25186492
adverse events; patient safety indicators; ICD-10-cm; network topology; clinical informatics; hospitals
10.  A functional genomic model for predicting prognosis in idiopathic pulmonary fibrosis 
BMC Pulmonary Medicine  2015;15:147.
The course of disease for patients with idiopathic pulmonary fibrosis (IPF) is highly heterogeneous. Prognostic models rely on demographic and clinical characteristics and are not reproducible. Integrating data from genomic analyses may identify novel prognostic models and provide mechanistic insights into IPF.
Total RNA of peripheral blood mononuclear cells was subjected to microarray profiling in a training (45 IPF individuals) and two independent validation cohorts (21 IPF/10 controls, and 75 IPF individuals, respectively). To identify a gene set predictive of IPF prognosis, we incorporated genomic, clinical, and outcome data from the training cohort. Predictor genes were selected if all the following criteria were met: 1) Present in a gene co-expression module from Weighted Gene Co-expression Network Analysis (WGCNA) that correlated with pulmonary function (p < 0.05); 2) Differentially expressed between observed “good” vs. “poor” prognosis with fold change (FC) >1.5 and false discovery rate (FDR) < 2 %; and 3) Predictive of mortality (p < 0.05) in univariate Cox regression analysis. “Survival risk group prediction” was adopted to construct a functional genomic model that used the IPF prognostic predictor gene set to derive a prognostic index (PI) for each patient into either high or low risk for survival outcomes. Prediction accuracy was assessed with a repeated 10-fold cross-validation algorithm and independently assessed in two validation cohorts through multivariate Cox regression survival analysis.
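The three selection criteria above can be expressed as a single conjunctive filter. The sketch below is illustrative only: the dictionary keys are hypothetical, the fold change is assumed to be a positive ratio treated symmetrically (so 0.5 counts as a two-fold change), and the WGCNA, differential expression, and Cox regression results are assumed to have been computed elsewhere.

```python
def is_predictor_gene(gene):
    """Return True if a gene meets all three selection criteria.

    gene: dict with hypothetical keys
      module_p     p-value of the WGCNA module's correlation with pulmonary function
      fold_change  expression ratio, "good" vs. "poor" prognosis (must be > 0)
      fdr          false discovery rate of the differential expression test
      cox_p        p-value from univariate Cox regression on mortality
    """
    fc = gene["fold_change"]
    symmetric_fc = max(fc, 1.0 / fc)  # assumption: up- and down-regulation both count
    return (
        gene["module_p"] < 0.05
        and symmetric_fc > 1.5
        and gene["fdr"] < 0.02
        and gene["cox_p"] < 0.05
    )


def select_predictor_genes(genes):
    """Keep the names of genes passing every criterion."""
    return [name for name, stats in genes.items() if is_predictor_gene(stats)]
```

Because the criteria are combined with a logical AND, a gene that fails any single threshold is excluded regardless of how strongly it passes the others.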
A set of 118 IPF prognostic predictor genes was used to derive the functional genomic model and PI. In the training cohort, high-risk IPF patients predicted by PI had significantly shorter survival compared to those labeled as low-risk patients (log rank p < 0.001). The prediction accuracy was further validated in two independent cohorts (log rank p < 0.001 and 0.002). Functional pathway analysis revealed that the canonical pathways enriched with the IPF prognostic predictor gene set were involved in T-cell biology, including iCOS, T-cell receptor, and CD28 signaling.
Using supervised and unsupervised analyses, we identified a set of IPF prognostic predictor genes and derived a functional genomic model that predicted high and low-risk IPF patients with high accuracy. This genomic model may complement current prognostic tools to deliver more personalized care for IPF patients.
Electronic supplementary material
The online version of this article (doi:10.1186/s12890-015-0142-8) contains supplementary material, which is available to authorized users.
PMCID: PMC4654815  PMID: 26589497
Idiopathic pulmonary fibrosis (IPF); Peripheral blood mononuclear cells (PBMCs); Gene expression profiling; Functional genomic model; Prognosis prediction
11.  The Transition to ICD-10-CM: Challenges for Pediatric Practice 
Pediatrics  2014;134(1):31-36.
Diagnostic codes are used widely within health care for billing, quality assessment, and to measure clinical outcomes. The US health care system will transition to the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM), in October 2015. Little is known about how this transition will affect pediatric practices. The objective of this study was to examine how the transition to ICD-10-CM may result in ambiguity of clinical information and financial disruption for pediatricians.
Using a statewide data set from Illinois Medicaid specified for pediatricians, 2708 International Classification of Diseases, Ninth Revision, Clinical Modification, diagnosis codes were identified. Diagnosis codes were categorized into 1 of 5 categories: identity, class-to-subclass, subclass-to-class, convoluted, and no translation. The convoluted and high-cost diagnostic codes (n = 636) were analyzed for accuracy and categorized into “information loss,” “overlapping categories,” “inconsistent,” and “consistent.” Finally, reimbursement by Medicaid was calculated for each category.
Twenty-six percent of pediatric diagnosis codes are convoluted, which represents 21% of Illinois Medicaid pediatric patient encounters and 16% of reimbursement. The diagnosis codes represented by information loss (3.6%), overlapping categories (3.2%), and inconsistent (1.2%) represent 8% of Medicaid pediatric reimbursement.
The potential for financial disruption and administrative errors from 8% of reimbursement diagnosis codes necessitates special attention to these codes in preparing for the transition to ICD-10-CM for pediatric practices.
PMCID: PMC4531279  PMID: 24918217
ICD-9-CM; ICD-10-CM; diagnostic codes; health informatics; convolution
12.  Dynamic changes of RNA-sequencing expression for precision medicine: N-of-1-pathways Mahalanobis distance within pathways of single subjects predicts breast cancer survival 
Bioinformatics  2015;31(12):i293-i302.
Motivation: The conventional approach to personalized medicine relies on molecular data analytics across multiple patients. The path to precision medicine lies with molecular data analytics that can discover interpretable single-subject signals (N-of-1). We developed a global framework, N-of-1-pathways, for a mechanistic-anchored approach to single-subject gene expression data analysis. We previously employed a metric that could prioritize the statistical significance of a deregulated pathway in single subjects; however, it lacked quantitative interpretability (e.g. the equivalent of a gene expression fold-change).
Results: In this study, we extend our previous approach with the application of statistical Mahalanobis distance (MD) to quantify personal pathway-level deregulation. We demonstrate that this approach, N-of-1-pathways Paired Samples MD (N-OF-1-PATHWAYS-MD), detects deregulated pathways (empirical simulations) without inflating the false-positive rate, as shown in a study with biological replicates. Finally, we establish that N-OF-1-PATHWAYS-MD scores are biologically significant, clinically relevant and predictive of breast cancer survival (P < 0.05, n = 80 invasive carcinoma; TCGA RNA-sequences).
Conclusion: N-of-1-pathways MD provides a practical approach towards precision medicine. The method generates the magnitude and the biological significance of personal deregulated pathway results derived solely from the patient’s transcriptome. These pathways offer opportunities for deriving clinically actionable decisions that have the potential to complement the clinical interpretability of personal polymorphisms and mutations, whether acquired or inherited, obtained from DNA. In addition, the method can be applied to diseases in which DNA changes may not be relevant, and thus expands the ‘interpretable omics’ of single subjects (e.g. personalome).
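For readers unfamiliar with the Mahalanobis distance named in this abstract, the following minimal sketch computes the generic two-dimensional version for the genes of one pathway, treating each gene as a point (expression in sample 1, expression in sample 2) and measuring its distance from the pathway centroid. This is the textbook definition only; how N-OF-1-PATHWAYS-MD derives a signed pathway-level score from paired samples is not described in the abstract and is not reproduced here. The function name and input format are assumptions.

```python
import math


def mahalanobis_2d(points):
    """Mahalanobis distance of each 2-D point from the centroid of all points.

    points: list of (x, y) pairs, e.g. per-gene expression in two paired samples.
    Uses the sample covariance matrix (n - 1 denominator).
    """
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points to estimate covariance")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular (points are collinear)")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # (dx, dy) S^-1 (dx, dy)^T with the closed-form inverse of a 2x2 matrix.
        d2 = (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(math.sqrt(max(d2, 0.0)))
    return distances
```

Unlike a Euclidean distance, this measure rescales by the covariance of the two samples, so a gene is scored as far from the centroid only relative to the spread and correlation observed across the pathway.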
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4765863  PMID: 26072495
13.  ARTS: automated randomization of multiple traits for study design 
Bioinformatics  2014;30(11):1637-1639.
Summary: Collecting data from large studies on high-throughput platforms, such as microarray or next-generation sequencing, typically requires processing samples in batches. There are often systematic but unpredictable biases from batch-to-batch, so proper randomization of biologically relevant traits across batches is crucial for distinguishing true biological differences from experimental artifacts. When a large number of traits are biologically relevant, as is common for clinical studies of patients with varying sex, age, genotype and medical background, proper randomization can be extremely difficult to prepare by hand, especially because traits may affect biological inferences, such as differential expression, in a combinatorial manner. Here we present ARTS (automated randomization of multiple traits for study design), which aids researchers in study design by automatically optimizing batch assignment for any number of samples, any number of traits and any batch size.
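The batch-assignment problem described in this summary can be illustrated with a small random-search sketch. It is not the ARTS algorithm or its objective function: the imbalance score (for each trait level, the gap between the batches holding the most and the fewest samples of that level) and the repeated-shuffle search are simplifying assumptions chosen for brevity, and all names are hypothetical.

```python
import random
from collections import Counter


def imbalance(assignment, samples, traits, n_batches):
    """Sum, over every trait level, of (max - min) sample count across batches."""
    score = 0
    for trait in traits:
        per_batch = [Counter() for _ in range(n_batches)]
        for sample, batch in zip(samples, assignment):
            per_batch[batch][sample[trait]] += 1
        for level in {s[trait] for s in samples}:
            counts = [c[level] for c in per_batch]
            score += max(counts) - min(counts)
    return score


def randomize_batches(samples, traits, batch_size, n_tries=2000, seed=0):
    """Try many shuffled assignments and keep the most balanced one."""
    rng = random.Random(seed)
    n_batches = -(-len(samples) // batch_size)  # ceiling division
    base = [i // batch_size for i in range(len(samples))]
    best, best_score = None, None
    for _ in range(n_tries):
        candidate = base[:]
        rng.shuffle(candidate)
        score = imbalance(candidate, samples, traits, n_batches)
        if best_score is None or score < best_score:
            best, best_score = candidate, score
    return best, best_score
```

Each sample is a dictionary of trait values, and the returned list gives the batch index for each sample in input order. Random search scales poorly as the number of traits grows, which is the situation in which a dedicated optimizer becomes necessary.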
Availability and implementation: ARTS is implemented in Perl. It is also available in the Galaxy Tool Shed and can be used at the Galaxy installation hosted by the UIC Center for Research Informatics (CRI).
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC4029038  PMID: 24493035
14.  The Mitochondrial Cardiolipin Remodeling Enzyme Lysocardiolipin Acyltransferase Is a Novel Target in Pulmonary Fibrosis 
Rationale: Lysocardiolipin acyltransferase (LYCAT), a cardiolipin-remodeling enzyme regulating the 18:2 linoleic acid pattern of mammalian mitochondrial cardiolipin, is necessary for maintaining normal mitochondrial function and vascular development. We hypothesized that modulation of LYCAT expression in lung epithelium regulates development of pulmonary fibrosis.
Objectives: To define a role for LYCAT in human and murine models of pulmonary fibrosis.
Methods: We analyzed the correlation of LYCAT expression in peripheral blood mononuclear cells (PBMCs) with the outcomes of pulmonary functions and overall survival, and used the murine models to establish the role of LYCAT in fibrogenesis. We studied the LYCAT action on cardiolipin remodeling, mitochondrial reactive oxygen species generation, and apoptosis of alveolar epithelial cells under bleomycin challenge.
Measurements and Main Results: LYCAT expression was significantly altered in PBMCs and lung tissues from patients with idiopathic pulmonary fibrosis (IPF), which was confirmed in two preclinical murine models of IPF, bleomycin- and radiation-induced pulmonary fibrosis. LYCAT mRNA expression in PBMCs directly and significantly correlated with carbon monoxide diffusion capacity, pulmonary function outcomes, and overall survival. In both bleomycin- and radiation-induced pulmonary fibrosis murine models, hLYCAT overexpression reduced several indices of lung fibrosis, whereas down-regulation of native LYCAT expression by siRNA accentuated fibrogenesis. In vitro studies demonstrated that LYCAT modulated bleomycin-induced cardiolipin remodeling, mitochondrial membrane potential, reactive oxygen species generation, and apoptosis of alveolar epithelial cells, potential mechanisms of LYCAT-mediated lung protection.
Conclusions: This study is the first to identify modulation of LYCAT expression in fibrotic lungs and offers a novel therapeutic approach for ameliorating lung inflammation and pulmonary fibrosis.
PMCID: PMC4098083  PMID: 24779708
LYCAT; mitochondrial cardiolipin remodeling; bleomycin; IPF; apoptosis
15.  The Emergence of Genome-Based Drug Repositioning 
Science translational medicine  2011;3(96):96ps35.
In this issue of Science Translational Medicine, the Butte Research group provides a concrete example of how reinterpreting and comparing genome-wide metrics may allow us to effectively hypothesize which drugs from one disease indication can be used for another. Here we discuss the basis of this shift toward genomic computational integrative approaches that has precedence in scalar theories of biological information and is aptly warranted for exploitation in drug repurposing.
PMCID: PMC4262402  PMID: 21849663
16.  COPD Hospitalization Risk Increased with Distinct Patterns of Multiple Systems Comorbidities Unveiled by Network Modeling 
Earlier studies on hospitalization risk are largely based on regression models. To our knowledge, network modeling of multiple comorbidities is novel and inherently enables multidimensional scoring and unbiased feature reduction. Network modeling was conducted using an independent validation design starting from 38,695 patients, 1,446,581 visits, and 430 distinct clinical facilities/hospitals. Odds ratios (OR) were calculated for every pair of comorbidities using patient counts, and their tendency was compared with hospitalization rates and ED visits. Network topology analyses were performed, defining significant comorbidity associations as having OR ≥ 5 and a false discovery rate ≤ 10^−7. Four COPD-associated comorbidity sub-networks emerged, incorporating multiple clinical systems: (i) metabolic syndrome, (ii) substance abuse and mental disorder, (iii) pregnancy-associated conditions, and (iv) fall-related injury. The latter two have not been reported yet. Features prioritized from the network are predictive of hospitalizations in an independent set (p<0.004). Therefore, we suggest that network topology is a scalable and generalizable method predictive of hospitalization.
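The pairwise odds ratio used to define comorbidity associations can be sketched from patient counts in a 2×2 table. The example below is a minimal illustration under stated assumptions: each comorbidity is given as a set of patient identifiers, the function names are hypothetical, only the OR threshold is applied, and the false discovery rate filter mentioned in the abstract is omitted.

```python
from itertools import combinations


def odds_ratio(patients_a, patients_b, all_patients):
    """Odds ratio of co-occurrence for two comorbidities.

    Returns None when a zero cell makes the ratio undefined.
    """
    both = len(patients_a & patients_b)
    only_a = len(patients_a - patients_b)
    only_b = len(patients_b - patients_a)
    neither = len(all_patients - patients_a - patients_b)
    if only_a == 0 or only_b == 0:
        return None
    return (both * neither) / (only_a * only_b)


def comorbidity_edges(comorbidities, all_patients, min_or=5.0):
    """List the pairs whose odds ratio reaches the threshold, as network edges."""
    edges = []
    for (name_a, pts_a), (name_b, pts_b) in combinations(sorted(comorbidities.items()), 2):
        ratio = odds_ratio(pts_a, pts_b, all_patients)
        if ratio is not None and ratio >= min_or:
            edges.append((name_a, name_b, ratio))
    return edges
```

The resulting edge list is the input a network topology analysis would start from; in practice a significance test with multiple-testing correction would be applied before an edge is kept.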
PMCID: PMC4419951  PMID: 25954392
17.  Peripheral Blood Mononuclear Cell Gene Expression Profiles Predict Poor Outcome in Idiopathic Pulmonary Fibrosis 
Science translational medicine  2013;5(205):205ra136.
We aimed to identify peripheral blood mononuclear cell (PBMC) gene expression profiles predictive of poor outcomes in idiopathic pulmonary fibrosis (IPF) by performing microarray experiments of PBMCs in discovery and replication cohorts of IPF patients. Microarray analyses identified 52 genes associated with transplant-free survival (TFS) in the discovery cohort. Clustering the microarray samples of the replication cohort using the 52-gene outcome-predictive signature distinguished two patient groups with significant differences in TFS. We studied the pathways associated with TFS in each independent microarray cohort and identified decreased expression of “The costimulatory signal during T cell activation” Biocarta pathway and, in particular, the genes CD28, ICOS, LCK, and ITK, results confirmed by quantitative reverse transcription polymerase chain reaction (qRT-PCR). A proportional hazards model, including the qRT-PCR expression of CD28, ICOS, LCK, and ITK along with patient’s age, gender, and percent predicted forced vital capacity (FVC%), demonstrated an area under the receiver operating characteristic curve of 78.5% at 2.4 months for death and lung transplant prediction in the replication cohort. To evaluate the potential cellular source of CD28, ICOS, LCK, and ITK expression, we analyzed and found significant correlation of these genes with the PBMC percentage of CD4+CD28+ T cells in the replication cohort. Our results suggest that CD28, ICOS, LCK, and ITK are potential outcome biomarkers in IPF and should be further evaluated for patient prioritization for lung transplantation and stratification in drug studies.
PMCID: PMC4175518  PMID: 24089408
18.  Accelerating precision biology and medicine with computational biology and bioinformatics 
Genome Biology  2014;15:450.
A report on the 22nd Annual International Conference on Intelligent Systems for Molecular Biology, held in Boston, Massachusetts, USA, July 11-15, 2014.
PMCID: PMC4709972  PMID: 25316263
19.  Conquering computational challenges of omics data and post-ENCODE paradigms 
Genome Biology  2013;14(8):310.
A report on the 21st Annual International Conference on Intelligent Systems for Molecular Biology (ISMB) and 12th European Conference on Computational Biology (ECCB), held in Berlin, Germany, July 21-23, 2013.
PMCID: PMC4053832  PMID: 23998801
Epigenetic network; machine learning; next-generation sequencing; post-transcriptional modification; post-translational modification; regulation; statistic modeling; translational bioinformatics
20.  ‘N-of-1-pathways’ unveils personal deregulated mechanisms from a single pair of RNA-Seq samples: towards precision medicine 
The emergence of precision medicine allowed the incorporation of individual molecular data into patient care. Indeed, DNA sequencing predicts somatic mutations in individual patients. However, these genetic features overlook dynamic epigenetic and phenotypic response to therapy. Meanwhile, accurate personal transcriptome interpretation remains an unmet challenge. Further, N-of-1 (single-subject) efficacy trials are increasingly pursued, but are underpowered for molecular marker discovery.
‘N-of-1-pathways’ is a global framework relying on three principles: (i) the statistical universe is a single patient; (ii) significance is derived from geneset/biomodules powered by paired samples from the same patient; and (iii) similarity between genesets/biomodules assesses commonality and differences, within-study and cross-studies. Thus, patient gene-level profiles are transformed into deregulated pathways. From RNA-Seq of 55 lung adenocarcinoma patients, N-of-1-pathways predicts the deregulated pathways of each patient.
Cross-patient N-of-1-pathways obtains results comparable to those of conventional gene set enrichment analysis (GSEA) and differentially expressed gene (DEG) enrichment, validated in three external evaluations. Moreover, heatmap and star plots highlight both individual and shared mechanisms ranging from molecular to organ-systems levels (eg, DNA repair, signaling, immune response). Patients were ranked based on the similarity of their deregulated mechanisms to those of an independent gold standard, generating unsupervised clusters of diametric extreme survival phenotypes (p=0.03).
The N-of-1-pathways framework provides a robust statistical and relevant biological interpretation of individual disease-free survival that is often overlooked in conventional cross-patient studies. It enables mechanism-level classifiers with smaller cohorts as well as N-of-1 studies.
PMCID: PMC4215042  PMID: 25301808
N-of-1; Single Subject Design; Precision Medicine; Personalized Medicine; Personal Transcriptome; Geneset
21.  In Silico cancer cell versus stroma cellularity index computed from species-specific human and mouse transcriptome of xenograft models: towards accurate stroma targeting therapy assessment 
BMC Medical Genomics  2014;7(Suppl 1):S2.
The current state of the art for measuring stromal response to targeted therapy requires burdensome and rate limiting quantitative histology. Transcriptome measures are increasingly affordable and provide an opportunity for developing a stromal versus cancer ratio in xenograft models. In these models, human cancer cells are transplanted into mouse host tissues (stroma) and together coevolve into a tumour microenvironment. However, profiling the mouse or human component separately remains problematic. Indeed, laser capture microdissection is labour intensive. Moreover, gene expression using commercial microarrays introduces significant and underreported cross-species hybridization errors that are commonly overlooked by biologists.
We developed a customized dual-species array, H&M array, and performed cross-species and species-specific hybridization measurements. We validated a new methodology for establishing the stroma vs cancer ratio using transcriptomic data.
In the biological validation of the H&M array, cross-species hybridization of human and mouse probes was significantly reduced (4.5 and 9.4 fold reduction, respectively; p < 2×10^-16 for both, Mann-Whitney test). We confirmed the capability of the H&M array to determine the stromal-to-cancer-cell ratio based on the estimation of a cellularity index of mouse/human mRNA content in vitro. This new metric enables more efficient investigation of stroma-cancer cell interactions (e.g. cellularity), bypassing the labour-intensive requirements and biases of laser capture microdissection.
These results provide the initial evidence of improved and cost-efficient analytics for the investigation of the cancer cell microenvironment, using species-specific arrays designed for xenograft models.
PMCID: PMC4101338  PMID: 25079962
22.  Concordance of deregulated mechanisms unveiled in underpowered experiments: PTBP1 knockdown case study 
BMC Medical Genomics  2014;7(Suppl 1):S1.
Genome-wide transcriptome profiling generated by microarray and RNA-Seq often provides deregulated genes or pathways applicable only to larger cohorts. On the other hand, individualized interpretation of transcriptomes is increasingly pursued to improve diagnosis, prognosis, and patient treatment processes. Yet, robust and accurate methods based on a single paired sample remain an unmet challenge.
"N-of-1-pathways" translates gene expression data profiles into mechanism-level profiles on single pairs of samples (one p-value per geneset). It relies on three principles: i) statistical universe is a single paired sample, which serves as its own control; ii) statistics can be derived from multiple gene expression measures that share common biological mechanisms assimilated to genesets; iii) semantic similarity metric takes into account inter-mechanisms' relationships to better assess commonality and differences, within and cross study-samples (e.g. patients, cell-lines, tissues, etc.), which helps the interpretation of the underpinning biology.
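The idea of one p-value per geneset from a single paired sample can be sketched as follows. The abstract does not say which paired statistic the authors use, so this sketch substitutes an exact two-sided sign test as a stand-in. The function names `sign_test_p` and `geneset_p_values`, the gene names, and the expression values are all hypothetical.

```python
from math import comb


def sign_test_p(baseline, case):
    """Exact two-sided sign test on paired values (stand-in statistic)."""
    diffs = [c - b for b, c in zip(baseline, case) if c != b]  # drop ties
    n = len(diffs)
    if n == 0:
        return 1.0
    ups = sum(1 for d in diffs if d > 0)
    tail = min(ups, n - ups)
    # Probability of a split at least this extreme under a fair coin, both tails.
    p = 2 * sum(comb(n, i) for i in range(tail + 1)) / 2 ** n
    return min(1.0, p)


def geneset_p_values(baseline_expr, case_expr, genesets):
    """One p-value per geneset, using only genes measured in both samples."""
    p_values = {}
    for name, genes in genesets.items():
        shared = [g for g in genes if g in baseline_expr and g in case_expr]
        p_values[name] = sign_test_p(
            [baseline_expr[g] for g in shared],
            [case_expr[g] for g in shared],
        )
    return p_values


# Invented expression values for one paired sample (e.g. normal vs tumour).
baseline = {"g1": 1.0, "g2": 2.0, "g3": 3.0, "g4": 4.0, "g5": 5.0, "g6": 2.0}
case = {"g1": 2.0, "g2": 3.0, "g3": 4.0, "g4": 5.0, "g5": 6.0, "g6": 1.0}
genesets = {"all_up": ["g1", "g2", "g3", "g4", "g5"], "mixed": ["g1", "g6"]}
p_values = geneset_p_values(baseline, case, genesets)
```

Here the sample serves as its own control, as in principle (i), and the statistic pools the genes of a geneset, as in principle (ii). The semantic similarity step in principle (iii) is not covered.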
In the context of underpowered experiments, N-of-1-pathways predictions perform better than or comparably to those of GSEA and Differentially Expressed Genes enrichment (DEG enrichment), within and across datasets. N-of-1-pathways uncovered concordant PTBP1-dependent mechanisms across datasets (Odds-Ratios≥13, p-values≤1 × 10^-5), such as RNA splicing and cell cycle. In addition, it unveils tissue-specific mechanisms of alternatively transcribed PTBP1-dependent genesets. Furthermore, we demonstrate that GSEA and DEG Enrichment preclude accurate analysis on single paired samples.
N-of-1-pathways enables robust and biologically relevant mechanism-level classifiers with small cohorts and single paired samples, surpassing conventional methods. Further, it identifies unique sample/patient mechanisms, a requirement for precision medicine.
PMCID: PMC4101571  PMID: 25079003
23.  Role of FAM18B in diabetic retinopathy 
Molecular Vision  2014;20:1146-1159.
Genome-wide association studies have suggested an association between a previously uncharacterized gene, FAM18B, and diabetic retinopathy. This study explores the role of FAM18B in diabetic retinopathy. An improved understanding of FAM18B could yield important insights into the pathogenesis of this sight-threatening complication of diabetes mellitus.
Postmortem human eyes were examined with immunohistochemistry and immunofluorescence for the presence of FAM18B. Expression of FAM18B in primary human retinal microvascular endothelial cells (HRMECs) exposed to hyperglycemia, vascular endothelial growth factor (VEGF), or advanced glycation end products (AGEs) was determined with quantitative reverse-transcription PCR (qRT-PCR) and/or western blot. The role of FAM18B in regulating human retinal microvascular endothelial cell viability, migration, and endothelial tube formation was determined following RNAi-mediated knockdown of FAM18B. The presence of FAM18B was determined with qRT-PCR in CD34+/VEGFR2+ mononuclear cells isolated from a cohort of 17 diabetic subjects with and without diabetic retinopathy.
Immunohistochemistry and immunofluorescence demonstrated the presence of FAM18B in the human retina with prominent vascular staining. Hyperglycemia, VEGF, and AGEs downregulated the expression of FAM18B in HRMECs. RNAi-mediated knockdown of FAM18B in HRMECs contributed to enhanced migration and tube formation as well as exacerbating the hyperglycemia-induced decrease in HRMEC viability. The enhanced migration, tube formation, and decrease in the viability of HRMECs as a result of FAM18B downregulation was reversed with pyrrolidine dithiocarbamate (PDTC), a specific nuclear factor-kappa B (NF-κB) inhibitor. CD34+/VEGFR2+ mononuclear cells from subjects with proliferative diabetic retinopathy demonstrated significantly reduced mRNA expression of FAM18B compared to diabetic subjects without retinopathy.
FAM18B is expressed in the retina. Diabetic culture conditions decrease the expression of FAM18B in HRMECs. The downregulation of FAM18B by siRNA in HRMECs results in enhanced migration and tube formation, but also exacerbates the hyperglycemia-induced decrease in HRMEC viability. The pathogenic changes observed in HRMECs as a result of FAM18B downregulation were reversed with PDTC, a specific NF-κB inhibitor. This study is the first to demonstrate a potential role for FAM18B in the pathogenesis of diabetic retinopathy.
PMCID: PMC4124103  PMID: 25221423
24.  Breakthroughs in genomics data integration for predicting clinical outcome 
Journal of biomedical informatics  2012;45(6):1199-1201.
PMCID: PMC3632294  PMID: 23117078
25.  The rise of translational bioinformatics 
Genome Biology  2012;13(8):319.
A report on the 20th International Conference on Intelligent Systems for Molecular Biology (ISMB), held at Long Beach, California, USA, July 15-17, 2012.
PMCID: PMC3491366  PMID: 22943369
Biomarkers; complex diseases; computational medicine; drug repositioning; mechanism classifiers; next-generation sequencing; off-target mechanisms; translational bioinformatics
