1.  Enumerateblood – an R package to estimate the cellular composition of whole blood from Affymetrix Gene ST gene expression profiles 
BMC Genomics  2017;18:43.
Measuring genome-wide changes in transcript abundance in circulating peripheral whole blood is a useful way to study disease pathobiology and may help elucidate the molecular mechanisms of disease or discover useful disease biomarkers. The sensitivity and interpretability of analyses carried out in this complex tissue, however, are significantly affected by its dynamic cellular heterogeneity. It is therefore desirable to quantify this heterogeneity, either to account for it or to better model interactions that may be present between the abundance of certain transcripts, specific cell types and the indication under study. Accurate enumeration of the many component cell types that make up peripheral whole blood can further complicate the sample collection process, however, and result in additional costs. Many approaches have been developed to infer the composition of a sample from high-dimensional transcriptomic and, more recently, epigenetic data. These approaches rely on the availability of isolated expression profiles for the cell types to be enumerated. These profiles are platform-specific, suitable datasets are rare, and generating them is expensive. No such dataset exists on the Affymetrix Gene ST platform.
We present ‘Enumerateblood’, a freely-available and open source R package that exposes a multi-response Gaussian model capable of accurately predicting the composition of peripheral whole blood samples from Affymetrix Gene ST expression profiles, outperforming other current methods when applied to Gene ST data.
‘Enumerateblood’ significantly improves our ability to study disease pathobiology from whole blood gene expression assayed on the popular Affymetrix Gene ST platform by allowing a more complete study of the various components of this complex tissue without the need for additional data collection. Future use of the model may allow for novel insights to be generated from the ~400 Affymetrix Gene ST blood gene expression datasets currently available on the Gene Expression Omnibus (GEO) website.
Electronic supplementary material
The online version of this article (doi:10.1186/s12864-016-3460-1) contains supplementary material, which is available to authorized users.
PMCID: PMC5219701  PMID: 28061752
2.  SABRE: a method for assessing the stability of gene modules in complex tissues and subject populations 
BMC Bioinformatics  2016;17:460.
Gene network inference (GNI) algorithms can be used to identify sets of coordinately expressed genes, termed network modules, from whole transcriptome gene expression data. The identification of such modules has become a popular approach to systems biology, with important applications in translational research. Although diverse computational and statistical approaches have been devised to identify such modules, their performance behavior is still not fully understood, particularly in complex human tissues. Given human heterogeneity, one important question is how sensitive the outputs of these computational methods are to the input sample set, that is, their stability. A related question is how this sensitivity depends on the size of the sample set. We describe here the SABRE (Similarity Across Bootstrap RE-sampling) procedure for assessing the stability of gene network modules using a re-sampling strategy, introduce a novel criterion for identifying stable modules, and demonstrate the utility of this approach in a clinically-relevant cohort, using two different gene network module discovery algorithms.
The stability of modules increased as sample size increased and stable modules were more likely to be replicated in larger sets of samples. Random modules derived from permuted gene expression data were consistently unstable, as assessed by SABRE, and provide a useful baseline value for our proposed stability criterion. Gene module sets identified by different algorithms varied with respect to their stability, as assessed by SABRE. Finally, stable modules were more readily annotated in various curated gene set databases.
The SABRE procedure and proposed stability criterion may provide guidance when designing systems biology studies in complex human disease and tissues.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-016-1319-8) contains supplementary material, which is available to authorized users.
PMCID: PMC5109843  PMID: 27842512
Systems biology; Gene modules; Reproducibility; WGCNA; Bootstrap
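The re-sampling idea behind the SABRE abstract above can be illustrated with a minimal Python sketch. This is not the published SABRE procedure or its stability criterion: the Jaccard similarity measure, the toy module finder, and every function name here are assumptions chosen for illustration only.

```python
import random


def jaccard(a, b):
    """Jaccard similarity between two collections of gene names."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def toy_module_finder(samples, threshold=0.5):
    """Hypothetical stand-in for a module discovery algorithm.

    `samples` is a list of dicts mapping gene name -> expression value.
    Returns the set of genes whose mean expression exceeds `threshold`.
    """
    genes = samples[0].keys()
    return {g for g in genes
            if sum(s[g] for s in samples) / len(samples) > threshold}


def bootstrap_stability(samples, find_module, n_boot=200, seed=0):
    """Mean Jaccard similarity between the module found on the full
    sample set and the modules found on bootstrap resamples of it."""
    rng = random.Random(seed)
    reference = find_module(samples)
    scores = []
    for _ in range(n_boot):
        # Draw as many samples as the original set, with replacement.
        resample = [rng.choice(samples) for _ in samples]
        scores.append(jaccard(reference, find_module(resample)))
    return sum(scores) / len(scores)
```

A score near 1 means the module barely changes when the sample set is perturbed; a score near 0 means it is largely an artifact of the particular samples drawn. Running the same function on permuted data would give a baseline of the kind the abstract describes.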
3.  MicroRNA-21 is a potential link between non-alcoholic fatty liver disease and hepatocellular carcinoma via modulation of the HBP1-p53-Srebp1c pathway 
Gut  2015;65(11):1850-1860.
Non-alcoholic fatty liver disease (NAFLD) is a major risk factor for hepatocellular carcinoma (HCC). However, the mechanistic pathways that link both disorders are essentially unknown.
Our study was designed to investigate the role of microRNA-21 in the pathogenesis of NAFLD and its potential involvement in HCC.
Wildtype mice maintained on a high fat diet (HFD) received tail vein injections of microRNA-21-anti-sense oligonucleotide (ASO) or miR-21 mismatched ASO for 4 or 8 weeks. Livers were collected after that time period for lipid content and gene expression analysis. Human hepatoma HepG2 cells incubated with oleate were used to study the role of miR-21 in lipogenesis and analysed with Nile-Red staining. microRNA-21 function in carcinogenesis was determined by soft-agar colony formation, cell cycle analysis and xenograft tumour assay using HepG2 cells.
The expression of microRNA-21 was increased in the livers of HFD-treated mice and human HepG2 cells incubated with fatty acid. MicroRNA-21 knockdown in those mice and HepG2 cells impaired lipid accumulation and growth of xenograft tumour. Further studies revealed that Hbp1 was a novel target of microRNA-21 and a transcriptional activator of p53. It is well established that p53 is a tumour suppressor and an inhibitor of lipogenesis by inhibiting Srebp1c. As expected, microRNA-21 knockdown led to increased HBP1 and p53 and subsequently reduced lipogenesis and delayed G1/S transition, and the additional treatment of HBP1-siRNA antagonised the effect of microRNA-21-ASO, suggesting that HBP1 mediated the inhibitory effects of microRNA-21-ASO on both hepatic lipid accumulation and hepatocarcinogenesis. Mechanistically, microRNA-21 knockdown induced p53 transcription, which subsequently reduced expression of genes controlling lipogenesis and cell cycle transition. In contrast, the opposite result was observed with overexpression of microRNA-21, which prevented p53 transcription.
Our findings reveal a novel mechanism by which microRNA-21, in part, promotes hepatic lipid accumulation and cancer progression by interacting with the Hbp1-p53-Srebp1c pathway and suggest the potential therapeutic value of microRNA-21-ASO for both disorders.
PMCID: PMC4882277  PMID: 26282675
4.  Circulating biomarker responses to medical management vs. mechanical circulatory support in severe inotrope‐dependent acute heart failure 
ESC Heart Failure  2015;3(2):86-96.
Severe inotrope‐dependent acute heart failure (AHF) is associated with poor clinical outcomes. There are currently no well‐defined blood biomarkers of response to treatment that can guide management or identify recovery in this patient population. In the present study, we characterized the levels of novel and emerging circulating biomarkers of heart failure in patients with AHF over the first 30 days of medical management or mechanical circulatory support (MCS). We hypothesized that a shared plasma proteomic treatment response would be identifiable in both patient groups, representing reversal of the AHF phenotype.
Methods and results
Time course plasma samples of the first 30 days of therapy, obtained from patients managed medically (n = 8) or with implantable MCS (n = 5), underwent semi‐targeted and candidate biomarker analyses, using multiple reaction monitoring (MRM) mass spectrometry, antibody arrays, and enzyme‐linked immunosorbent assays. Differentially expressed proteins were identified using robust limma for MRM and antibody array data. Patients managed medically or with implantable MCS had a shared proteomic signature of six plasma proteins: circulating cardiotrophin 1, cardiac troponin T, clusterin, and dickkopf 1 increased, while levels of C‐reactive protein and growth differentiation factor 15 decreased in both groups over the 30 day time course.
We have characterized the temporal proteomic signature of clinical recovery in AHF patients managed medically or with MCS, over the first 30 days of treatment. Changes in biomarker expression over the time course of treatment may provide a basis for understanding the biological basis of AHF, potentially identifying novel markers and pathophysiologic mechanisms of recovery.
PMCID: PMC5063158  PMID: 27774271
Acute heart failure; Plasma proteomics; Biomarkers; Mechanical circulatory support; Ventricular assist device; Bioinformatics
5.  Individualized prediction of lung-function decline in chronic obstructive pulmonary disease 
The rate of lung-function decline in chronic obstructive pulmonary disease (COPD) varies substantially among individuals. We sought to develop and validate an individualized prediction model for forced expiratory volume at 1 second (FEV1) in current smokers with mild-to-moderate COPD.
Using data from a large long-term clinical trial (the Lung Health Study), we derived mixed-effects regression models to predict future FEV1 values over 11 years according to clinical traits. We modelled heterogeneity by allowing regression coefficients to vary across individuals. Two independent cohorts with COPD were used for validating the equations.
We used data from 5594 patients (mean age 48.4 yr, 63% men, mean baseline FEV1 2.75 L) to create the individualized prediction equations. There was significant between-individual variability in the rate of FEV1 decline, with the interval for the annual rate of decline that contained 95% of individuals being −124 to −15 mL/yr for smokers and −83 to 15 mL/yr for sustained quitters. Clinical variables in the final model explained 88% of variation around follow-up FEV1. The C statistic for predicting severity grades was 0.90. Prediction equations performed robustly in the 2 external data sets.
A substantial part of individual variation in FEV1 decline can be explained by easily measured clinical variables. The model developed in this work can be used for prediction of future lung health in patients with mild-to-moderate COPD.
Trial registration:
Lung Health Study: ClinicalTrials.gov, no. NCT00000568; Pan-Canadian Early Detection of Lung Cancer Study: ClinicalTrials.gov, no. NCT00751660
PMCID: PMC5047815  PMID: 27486205
6.  COPD Exacerbation Biomarkers Validated Using Multiple Reaction Monitoring Mass Spectrometry 
PLoS ONE  2016;11(8):e0161129.
Acute exacerbations of chronic obstructive pulmonary disease (AECOPD) result in considerable morbidity and mortality. However, there are no objective biomarkers to diagnose AECOPD.
We used multiple reaction monitoring mass spectrometry to quantify 129 distinct proteins in plasma samples from patients with COPD. This analytical approach was first performed in a biomarker cohort of patients hospitalized with AECOPD (Cohort A, n = 72). Proteins differentially expressed between AECOPD and convalescent states were chosen using a false discovery rate <0.01 and fold change >1.2. Protein selection and classifier building were performed using an elastic net logistic regression model. The performance of the biomarker panel was then tested in two independent AECOPD cohorts (Cohort B, n = 37, and Cohort C, n = 109) using leave-pair-out cross-validation methods.
Five proteins were identified distinguishing AECOPD and convalescent states in Cohort A. Biomarker scores derived from this model were significantly higher during AECOPD than in the convalescent state in the discovery cohort (p<0.001). The receiver operating characteristic cross-validation area under the curve (CV-AUC) statistic was 0.73 in Cohort A, while in the replication cohorts the CV-AUC was 0.77 for Cohort B and 0.79 for Cohort C.
A panel of five biomarkers shows promise in distinguishing AECOPD from convalescence and may provide the basis for a clinical blood test to diagnose AECOPD. Further validation in larger cohorts is necessary for future clinical translation.
PMCID: PMC4985129  PMID: 27525416
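The protein selection step in the abstract above (false discovery rate <0.01 and fold change >1.2) can be sketched in Python. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg adjustment below is an assumption, as is treating the fold-change threshold symmetrically (above 1.2 or below 1/1.2). All function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_proteins(names, p_values, fold_changes, fdr=0.01, min_fc=1.2):
    """Keep proteins whose adjusted p-value is below `fdr` and whose
    fold change exceeds `min_fc` in either direction (assumed symmetric)."""
    q_values = benjamini_hochberg(p_values)
    return [name
            for name, q, fc in zip(names, q_values, fold_changes)
            if q < fdr and (fc > min_fc or fc < 1.0 / min_fc)]
```

In the study, the proteins passing such a filter were then passed to an elastic net logistic regression for classifier building; that step is not shown here.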
7.  Inhibition of MicroRNA-24 Expression in Liver Prevents Hepatic Lipid Accumulation and Hyperlipidemia 
Hepatology (Baltimore, Md.)  2014;60(2):554-564.
The incidence of nonalcoholic fatty liver disease (NAFLD) and hyperlipidemia, with their associated risks of endstage liver and cardiovascular diseases, is increasing rapidly due to the prevalence of obesity. Although the mechanisms of NAFLD have been studied extensively, the underlying pathogenesis and the role of microRNAs in this process remain relatively unclear. MicroRNA (miRNA)-dependent posttranscriptional gene silencing is now recognized as a key element of lipid metabolism. Here we report that the expression of microRNA-24 (miR-24) is significantly increased in the livers of high-fat diet-treated mice and in isolated human hepatocytes incubated with fatty acid. Knockdown of miR-24 in those mice caused impaired hepatic lipid accumulation and reduced plasma triglycerides. Bioinformatic and in vitro and in vivo studies led us to identify insulin-induced gene 1 (Insig1), an inhibitor of lipogenesis, as a novel target of miR-24. Inhibition of endogenous miR-24 expression by way of miR-24 inhibitors led to up-regulation of Insig1, and subsequently decreased hepatic lipid accumulation. It is well established that liver-specific deletion of Insig1 leads to higher hepatic and plasma triglyceride levels by inhibiting the processing of sterol regulatory element-binding proteins (SREBPs), transcription factors that activate lipid synthesis. As expected, miR-24 knockdown prevented SREBP processing, and subsequent expression of lipogenic genes. In contrast, the opposite result was observed with overexpression of miR-24, which enhanced SREBP processing. Thus, our study defines a potentially critical role for deregulated expression of miR-24 in the development of fatty liver by way of targeting of Insig1.
Our findings show a novel mechanism by which miR-24 promotes hepatic lipid accumulation and hyperlipidemia by repressing Insig1, and suggest the use of miR-24 inhibitor as a potential therapeutic agent for NAFLD and/or atherosclerosis.
PMCID: PMC4809671  PMID: 24677249
8.  The Effect of Statins on Blood Gene Expression in COPD 
PLoS ONE  2015;10(10):e0140022.
COPD is currently the fourth leading cause of death worldwide. Statins are lipid lowering agents with documented cardiovascular benefits. Observational studies have shown that statins may have a beneficial role in COPD. The impact of statins on blood gene expression from COPD patients is largely unknown.
To identify a blood gene signature associated with statin use in COPD patients, and the pathways underpinning this signature that could explain any potential benefits in COPD.
Whole blood gene expression was measured on 168 statin users and 451 non-users from the ECLIPSE study using the Affymetrix Human Gene 1.1 ST microarray chips. Factor Analysis for Robust Microarray Summarization (FARMS) was used to process the expression data. Differential gene expression analysis was undertaken using the Linear Models for Microarray data (Limma) package adjusting for propensity score and surrogate variables. Similarity of the expression signal with published gene expression profiles was performed in ProfileChaser.
25 genes were differentially expressed between statin users and non-users at an FDR of 10%, including LDLR, CXCR2, SC4MOL, FAM108A1, IFI35, FRYL, ABCG1, MYLIP, and DHCR24. The 25 genes were significantly enriched in cholesterol homeostasis and metabolism pathways. The resulting gene signature showed correlation with Huntington’s disease, Parkinson’s disease and acute myeloid leukemia gene signatures.
The blood gene signature of statin use in COPD patients was enriched in cholesterol homeostasis pathways. Further studies are needed to delineate the role of these pathways in lung biology.
PMCID: PMC4604084  PMID: 26462087
9.  Mapping and direct valuation: do they give equivalent EQ-5D-5L index scores? 
Utility values of health states defined by health-related quality of life instruments can be derived from either direct valuation (‘valuation-derived’) or mapping (‘mapping-derived’). This study aimed to compare the utility-based EQ-5D-5L index scores derived from the two approaches as a means to validating the mapping function developed by van Hout et al for the EQ-5D-5L instrument.
This was an observational study of 269 breast cancer patients whose EQ-5D-5L index scores were derived from both methods. For comparing discriminatory ability and responsiveness to change, multivariable regression models were used to estimate the effect sizes of various health indicators on the index scores. Agreement and test-retest reliability were examined using intraclass correlation coefficient (ICC). Whenever appropriate, the 90 % confidence intervals (90 % CI) were compared to predefined equivalence margins.
The mean difference in and ICC between the valuation- and mapping-derived EQ-5D-5L index scores were 0.015 (90 % CI = 0.006 to 0.024) and 0.915, respectively. Discriminatory ability and responsiveness of the two indices were equivalent in 13 of 15 regression analyses. However, the mapping-derived index score was lower than the valuation-derived index score in patients experiencing extreme health problems, and the test-retest reliability of the former was lower than the latter, for example, their ICCs differed by 0.121 (90 % CI = 0.051 to 0.198) in patients who reported no change in performance status in the follow-up survey.
This study provided the first evidence supporting the validity of the mapping function for converting EQ-5D-5L profile data into a utility-based index score.
PMCID: PMC4595246  PMID: 26438167
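The comparison of 90 % confidence intervals against predefined equivalence margins, as described in the abstract above, reduces to a simple interval check. The sketch below is illustrative only: the abstract does not report the margins used, so the value 0.05 in the usage note is a hypothetical placeholder, and the function name is an assumption.

```python
def is_equivalent(ci_low, ci_high, margin):
    """Return True when the whole confidence interval lies strictly
    inside the symmetric equivalence region (-margin, +margin)."""
    if margin <= 0:
        raise ValueError("margin must be positive")
    return -margin < ci_low and ci_high < margin
```

For the reported mean difference (90 % CI = 0.006 to 0.024), `is_equivalent(0.006, 0.024, margin=0.05)` returns `True` under the hypothetical margin of 0.05. A 90 % interval is the usual choice here because checking it against the margins corresponds to two one-sided tests at the 5 % level.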
10.  Medication use by early-stage breast cancer survivors: a 1-year longitudinal study 
Supportive Care in Cancer  2015;24:1639-1647.
The aim of this study is to characterize the patterns of medication use by early-stage breast cancer (ESBC) survivors from diagnosis to 1 year post-chemotherapy.
A single-center longitudinal study was conducted with ESBC patients diagnosed between December 2011 and June 2014. Data on the medication use of individual patients were retrieved from prescription databases, supplemented by records from the National Electronic Health Records. The data covered the period from ESBC diagnosis to 1 year post-chemotherapy. Medication types were classified according to the World Health Organization’s Anatomical Therapeutic Chemical classification system, and a list of medications for chronic diseases was created by adapting a list of 20 chronic diseases provided by the U.S. Department of Health and Human Services.
Of the 107 patients involved in the study (mean age 51.1 ± 8.4 years; 78.5 % Chinese), 46.7 % manifested non-cancer comorbidities, of which hypertension (24.3 %) was the most prevalent, followed by hyperlipidemia (13.1 %) and diabetes (5.6 %). Calcium channel blockers (12.1 %) and lipid-modifying agents (11.2 %) were the most common chronic medication types used before chemotherapy, and their use persisted during chemotherapy (10.3 and 11.2 %, respectively) and after chemotherapy (11.2 and 13.1 %, respectively). Hormonal therapy was the predominant post-chemotherapy medication (77.6 %). A statistically significant increase (p < 0.0001) was observed in the mean number of chronic disease medication classes prescribed to patients between the pre-chemotherapy (0.53 ± 1.04) and chemotherapy (0.62 ± 1.08) periods and between the chemotherapy and post-chemotherapy (1.63 ± 1.35) periods.
There is an increasing trend of chronic medication usage in breast cancer survivors after cancer treatment. This study provides important insights into the design of medication management programs tailored to this population. Future studies should incorporate a control population to improve the interpretation of study results.
PMCID: PMC4766201  PMID: 26404861
Early-stage breast cancer; Medication use; Chemotherapy; Cancer survivor; Medication management
11.  Brain-derived neurotrophic factor genetic polymorphism (rs6265) is protective against chemotherapy-associated cognitive impairment in patients with early-stage breast cancer 
Neuro-Oncology  2015;18(2):244-251.
Brain-derived neurotrophic factor (BDNF), a neurotrophin that regulates neuronal function and development, is implicated in several neurodegenerative conditions. Preliminary data suggest that a reduction of BDNF concentrations may lead to postchemotherapy cognitive impairment. We hypothesized that a single nucleotide polymorphism (rs6265) of the BDNF gene may predispose patients to cognitive impairment. This study aimed to evaluate the effect of BDNF gene polymorphism on chemotherapy-associated cognitive impairment.
Overall, 145 patients receiving chemotherapy for early-stage breast cancer (mean age: 50.8 ± 8.8 y; 82.1% Chinese) were recruited. Patients' cognitive functions were assessed longitudinally using the validated Functional Assessment of Cancer Therapy–Cognitive Function (v.3) and an objective computerized tool, Headminder. Genotyping was performed using Sanger sequencing. Logistic regression was used to evaluate the association between BDNF Val66Met polymorphism and cognition after adjusting for ethnicity and clinically important covariates.
Of the 145 patients, 54 (37%) reported cognitive impairment postchemotherapy. The Met/Met genotype was associated with statistically significantly lower odds of developing cognitive impairment (odds ratio [OR] = 0.26; 95% CI: 0.08–0.92; P = .036). The Met carriers were less likely to experience impairment in the domains of verbal fluency (OR = 0.34; 95% CI: 0.12–0.90; P = .031) and multitasking ability (OR = 0.37; 95% CI: 0.15–0.91; P = .030) compared with the Val/Val homozygote. No associations were observed between Headminder and the BDNF Val66Met polymorphism.
This is the first study to provide evidence that carriers of the BDNF Met allele are protected against chemotherapy-associated cognitive impairment. Further studies are required to validate the findings.
PMCID: PMC4724179  PMID: 26289590
BDNF; breast cancer; cognition; genetics; rs6265
12.  MiR-494 Within an Oncogenic MicroRNA Megacluster Regulates G1/S Transition in Liver Tumorigenesis Through Suppression of MCC 
Hepatology (Baltimore, Md.)  2013;59(1):10.1002/hep.26662.
Hepatocellular carcinoma (HCC) is associated with poor survival for patients and few effective treatment options, raising the need for novel therapeutic strategies. MicroRNAs (miRNAs) play important roles in tumor development and show deregulated patterns of expression in HCC. Because of the liver’s unique affinity for small nucleic acids, miRNA based therapy has been proposed in the treatment of liver disease. There is thus an urgent need to identify and characterize aberrantly expressed miRNAs in HCC. In our study, we profiled miRNA expression changes in de novo liver tumors driven by MYC and/or RAS, two canonical oncogenes activated in a majority of human HCC. We identified an upregulated miRNA megacluster comprised of 53 miRNAs on mouse chromosome 12qF1 (human homolog 14q32). This miRNA megacluster is upregulated in all three transgenic liver models and in a subset of human HCCs. An unbiased functional analysis of all miRNAs within this cluster was performed.
We found that miR-494 is overexpressed in human HCC, and aids in transformation by regulating the G1/S cell cycle transition through targeting of the Mutated in Colorectal Cancer (MCC) tumor suppressor. miR-494 inhibition in human HCC cell lines decreases cellular transformation and anti-miR-494 treatment of primary MYC-driven liver tumor formation significantly diminishes tumor size. Our findings identify a new therapeutic target, miR-494, for the treatment of HCC.
PMCID: PMC3877416  PMID: 23913442
HCC; cancer; cell cycle; Dlk1-Dio3; miRNA therapy
13.  Mapping the Functional Assessment of Cancer Therapy - Breast (FACT-B) to the 5-level EuroQoL group’s 5-dimension questionnaire (EQ-5D-5L) utility index in a Multi-ethnic Asian population 
To develop an algorithm for mapping the Functional Assessment of Cancer Therapy – Breast (FACT-B) to the 5-level EuroQoL Group’s 5-dimension questionnaire (EQ-5D-5L) utility index.
A survey of 238 breast cancer patients in Singapore was conducted. Models using various regression methods with or without recognizing the upper boundary of utility values at 1 were fitted to predict the EQ-5D-5L utility index based on the five subscale scores of the FACT-B. Data from a follow-up survey of these patients were used to validate the results.
A model that maps the physical, emotional, functional well-being and the breast cancer concerns subscales of the FACT-B to the EQ-5D-5L utility index was derived. The social well-being subscale was not associated with the utility index. Although theoretical assumptions may not be valid, ordinary least squares outperformed other regression methods. The mean predicted utility index within each performance status level at follow-up deviated from the observed mean less than the minimally important difference of EQ-5D for cancer patients.
The mapping algorithm converts the FACT-B to the EQ-5D utility index. This enables oncologists, clinical researchers and policy makers to obtain a quantitative utility summary of a patient’s health status when only the FACT-B is assessed.
PMCID: PMC4267156  PMID: 25495840
Breast cancer; EQ-5D-5L; FACT-B; Health utility; Mapping; Quality of life
14.  Novel Multivariate Methods for Integration of Genomics and Proteomics Data: Applications in a Kidney Transplant Rejection Study 
Multi-omics research is a key ingredient of data-intensive life sciences research, permitting measurement of biological molecules at different functional levels in the same individual. For a complete picture at the biological systems level, appropriate statistical techniques must however be developed to integrate different ‘omics’ data sets (e.g., genomics and proteomics). We report here multivariate projection-based analysis approaches to genomics and proteomics data sets, using the case study of and applications to observations in kidney transplant patients who experienced an acute rejection event (n=20) versus non-rejecting controls (n=20). In these data sets, we show how these novel methodologies might serve as promising tools for dimension reduction and selection of relevant features for different analytical frameworks. Unsupervised analyses highlighted the importance of post-transplant time-of-rejection, while supervised analyses identified gene and protein signatures that together predicted rejection status with little time effect. The selected genes are part of biological pathways that are representative of immune responses. Gene enrichment profiles revealed increases in innate immune responses and neutrophil activities and a depletion of T lymphocyte related processes in rejection samples as compared to controls. In all, this article offers candidate biomarkers for future detection and monitoring of acute kidney transplant rejection, as well as ways forward for methodological advances to better harness multi-omics data sets.
PMCID: PMC4229708  PMID: 25387159
15.  Two-Stage, In Silico Deconvolution of the Lymphocyte Compartment of the Peripheral Whole Blood Transcriptome in the Context of Acute Kidney Allograft Rejection 
PLoS ONE  2014;9(4):e95224.
Acute rejection is a major complication of solid organ transplantation that prevents the long-term assimilation of the allograft. Various populations of lymphocytes are principal mediators of this process, infiltrating graft tissues and driving cell-mediated cytotoxicity. Understanding the lymphocyte-specific biology associated with rejection is therefore critical. Measuring genome-wide changes in transcript abundance in peripheral whole blood cells can deliver a comprehensive view of the status of the immune system. The heterogeneous nature of the tissue significantly affects the sensitivity and interpretability of traditional analyses, however. Experimental separation of cell types is an obvious solution, but is often impractical and, more worrying, may affect expression, leading to spurious results. Statistical deconvolution of the cell type-specific signal is an attractive alternative, but existing approaches still present some challenges, particularly in a clinical research setting. Obtaining time-matched sample composition to biologically interesting, phenotypically homogeneous cell sub-populations is costly and adds significant complexity to study design. We used a two-stage, in silico deconvolution approach that first predicts sample composition to biologically meaningful and homogeneous leukocyte sub-populations, and then performs cell type-specific differential expression analysis in these same sub-populations, from peripheral whole blood expression data. We applied this approach to a peripheral whole blood expression study of kidney allograft rejection. The patterns of differential composition uncovered are consistent with previous studies carried out using flow cytometry and provide a relevant biological context when interpreting cell type-specific differential expression results. We identified cell type-specific differential expression in a variety of leukocyte sub-populations at the time of rejection. 
The tissue-specificity of these differentially expressed probe-set lists is consistent with the originating tissue and their functional enrichment consistent with allograft rejection. Finally, we demonstrate that the strategy described here can be used to derive useful hypotheses by validating a cell type-specific ratio in an independent cohort using the nanoString nCounter assay.
PMCID: PMC3986379  PMID: 24733377
16.  Variation in RNA-Seq Transcriptome Profiles of Peripheral Whole Blood from Healthy Individuals with and without Globin Depletion 
PLoS ONE  2014;9(3):e91041.
The molecular profile of circulating blood can reflect physiological and pathological events occurring in other tissues and organs of the body and delivers a comprehensive view of the status of the immune system. Blood has been useful in studying the pathobiology of many diseases. It is accessible and easily collected making it ideally suited to the development of diagnostic biomarker tests. The blood transcriptome has a high complement of globin RNA that could potentially saturate next-generation sequencing platforms, masking lower abundance transcripts. Methods to deplete globin mRNA are available, but their effect has not been comprehensively studied in peripheral whole blood RNA-Seq data. In this study we aimed to assess technical variability associated with globin depletion in addition to assessing general technical variability in RNA-Seq from whole blood derived samples.
We compared technical and biological replicates having undergone globin depletion or not and found that the experimental globin depletion protocol employed removed approximately 80% of globin transcripts, improved the correlation of technical replicates, allowed for reliable detection of thousands of additional transcripts and generally increased transcript abundance measures. Differential expression analysis revealed thousands of genes significantly up-regulated as a result of globin depletion. In addition, globin depletion resulted in the down-regulation of genes involved in both iron and zinc metal ion bonding.
Globin depletion appears to meaningfully improve the quality of peripheral whole blood RNA-Seq data, and may improve our ability to detect true biological variation. Some concerns remain, however, key amongst them the significant reduction in RNA yields following globin depletion. More generally, our investigation of technical and biological variation with and without globin depletion finds that high-throughput sequencing by RNA-Seq is highly reproducible within a large dynamic range of detection and provides an accurate estimation of RNA concentration in peripheral whole blood. High-throughput sequencing is thus a promising technology for whole blood transcriptomics and biomarker discovery.
PMCID: PMC3946641  PMID: 24608128
17.  Longitudinal Analysis of Whole Blood Transcriptomes to Explore Molecular Signatures Associated With Acute Renal Allograft Rejection 
In this study, we explored a time course of peripheral whole blood transcriptomes from kidney transplantation patients who either experienced an acute rejection episode or did not in order to better delineate the immunological and biological processes measurable in blood leukocytes that are associated with acute renal allograft rejection. Using microarrays, we generated gene expression data from 24 acute rejectors and 24 nonrejectors. We filtered the data to obtain the most unambiguous and robustly expressing probe sets and selected a subset of patients with the clearest phenotype. We then performed a data-driven exploratory analysis using data reduction and differential gene expression analysis tools in order to reveal gene expression signatures associated with acute allograft rejection. Using a template-matching algorithm, we then expanded our analysis to include time course data, identifying genes whose expression is modulated leading up to acute rejection. We have identified molecular phenotypes associated with acute renal allograft rejection, including a significantly upregulated signature of neutrophil activation and accumulation following transplant surgery that is common to both acute rejectors and nonrejectors. Our analysis shows that this expression signature appears to stabilize over time in nonrejectors but persists in patients who go on to reject the transplanted organ. In addition, we describe an expression signature characteristic of lymphocyte activity and proliferation. This lymphocyte signature is significantly downregulated in both acute rejectors and nonrejectors following surgery; however, patients who go on to reject the organ show a persistent downregulation of this signature relative to the neutrophil signature.
PMCID: PMC3921155  PMID: 24526836
blood transcriptomics; microarray; kidney transplant rejection; peripheral whole blood; neutrophil to lymphocyte ratio
18.  Alteration of human blood cell transcriptome in uremia 
BMC Medical Genomics  2013;6:23.
End-stage renal failure is associated with profound changes in physiology and health, but the molecular causation of these pleomorphic effects termed “uremia” is poorly understood. The genomic changes of uremia were explored in a whole genome microarray case-control comparison of 95 subjects with end-stage renal failure (n = 75) or healthy controls (n = 20).
RNA was separated from blood drawn in PAXgene tubes and gene expression analyzed using Affymetrix Human Genome U133 Plus 2.0 arrays. Quality control and normalization were performed, and statistical significance was determined with multiple test corrections (qFDR). Biological interpretation was aided by knowledge mining using NIH DAVID, MetaCore and PubGene.
Over 9,000 genes were differentially expressed in uremic subjects compared to normal controls (fold change: -5.3 to +6.8), and more than 65% were lower in uremia. Changes appeared to be regulated through key gene networks involving cMYC, SP1, P53, AP1, NFkB, HNF4 alpha, HIF1A, c-Jun, STAT1, STAT3 and CREB1. Gene set enrichment analysis showed that mRNA processing and transport, protein transport, chaperone functions, the unfolded protein response and genes involved in tumor genesis were prominently lower in uremia, while insulin-like growth factor activity, neuroactive receptor interaction, the complement system, lipoprotein metabolism and lipid transport were higher in uremia. Pathways involving cytoskeletal remodeling, the clathrin-coated endosomal pathway, T-cell receptor signaling and CD28 pathways, and many immune and biological mechanisms were significantly down-regulated, while the ubiquitin pathway and certain others were up-regulated.
End-stage renal failure is associated with profound changes in human gene expression, which appear to be mediated through key transcription factors. Dialysis and primary kidney disease had minor effects on gene regulation, but uremia was the dominant influence in the changes observed. These data provide important insight into the changes in cellular biology and function, opportunities for biomarkers of disease progression and therapy, and potential targets for intervention in uremia.
PMCID: PMC3706221  PMID: 23809614
Gene expression profiling; Uremia; Chronic renal failure
19.  Computational Biomarker Pipeline from Discovery to Clinical Implementation: Plasma Proteomic Biomarkers for Cardiac Transplantation 
PLoS Computational Biology  2013;9(4):e1002963.
Recent technical advances in the field of quantitative proteomics have stimulated a large number of biomarker discovery studies of various diseases, providing avenues for new treatments and diagnostics. However, inherent challenges have limited the successful translation of candidate biomarkers into clinical use, thus highlighting the need for a robust analytical methodology to transition from biomarker discovery to clinical implementation. We have developed an end-to-end computational proteomic pipeline for biomarker studies. At the discovery stage, the pipeline emphasizes different aspects of experimental design, appropriate statistical methodologies, and quality assessment of results. At the validation stage, the pipeline focuses on the migration of the results to a platform appropriate for external validation, and the development of a classifier score based on corroborated protein biomarkers. At the last stage towards clinical implementation, the main aims are to develop and validate an assay suitable for clinical deployment, and to calibrate the biomarker classifier using the developed assay. The proposed pipeline was applied to a biomarker study in cardiac transplantation aimed at developing a minimally invasive clinical test to monitor acute rejection. Starting with an untargeted screening of the human plasma proteome, five candidate biomarker proteins were identified. Rejection-regulated proteins reflect cellular and humoral immune responses, acute phase inflammatory pathways, and lipid metabolism biological processes. A multiplex multiple reaction monitoring mass-spectrometry (MRM-MS) assay was developed for the five candidate biomarkers and validated by enzyme-linked immunosorbent assay (ELISA) and immunonephelometric assay (INA). A classifier score based on corroborated proteins demonstrated that the developed MRM-MS assay provides an appropriate methodology for an external validation, which is still in progress.
Plasma proteomic biomarkers of acute cardiac rejection may offer a relevant post-transplant monitoring tool to effectively guide clinical care. The proposed computational pipeline is highly applicable to a wide range of biomarker proteomic studies.
Author Summary
Novel proteomic technology has led to the generation of vast amounts of biological data and the identification of numerous potential biomarkers. However, computational approaches to translate this information into knowledge capable of impacting clinical care have been lagging. We propose a computational proteomic pipeline for biomarker studies that is founded on the combination of advanced statistical methodologies. We demonstrate our approach through the analysis of data obtained from heart transplant patients. Heart transplantation is the gold standard treatment for patients with end-stage heart failure, but is complicated by episodes of immune rejection that can adversely impact patient outcomes. Current rejection monitoring approaches are highly invasive, requiring a biopsy of the heart. This work aims to reduce the need for biopsies, and demonstrate the power and utility of computational approaches in proteomic biomarker discovery. Our work utilizes novel high-throughput proteomic technology combined with advanced statistical techniques to identify blood markers that guide the decision as to whether a biopsy is warranted, reduce the number of unnecessary biopsies, and ultimately diagnose the presence of rejection in heart transplant patients. Additionally, the proposed computational methodologies can be applied to a range of proteomic biomarker studies of various diseases and conditions.
PMCID: PMC3617196  PMID: 23592955
20.  A computational pipeline for the development of multi-marker bio-signature panels and ensemble classifiers 
BMC Bioinformatics  2012;13:326.
Biomarker panels derived separately from genomic and proteomic data and with a variety of computational methods have demonstrated promising classification performance in various diseases. An open question is how to create effective proteo-genomic panels. The framework of ensemble classifiers has been applied successfully in various analytical domains to combine classifiers so that the performance of the ensemble exceeds the performance of individual classifiers. Using blood-based diagnosis of acute renal allograft rejection as a case study, we address the following question in this paper: Can acute rejection classification performance be improved by combining individual genomic and proteomic classifiers in an ensemble?
The first part of the paper presents a computational biomarker development pipeline for genomic and proteomic data. The pipeline begins with data acquisition (e.g., from bio-samples to microarray data), quality control, statistical analysis and mining of the data, and finally various forms of validation. The pipeline ensures that the various classifiers to be combined later in an ensemble are diverse and adequate for clinical use. Five mRNA genomic and five proteomic classifiers were developed independently using single time-point blood samples from 11 acute-rejection and 22 non-rejection renal transplant patients. The second part of the paper examines five ensembles ranging in size from two to 10 individual classifiers. Performance of ensembles is characterized by area under the curve (AUC), sensitivity, and specificity, as derived from the probability of acute rejection for individual classifiers in the ensemble in combination with one of two aggregation methods: (1) Average Probability or (2) Vote Threshold. One ensemble demonstrated superior performance and was able to improve sensitivity and AUC beyond the best values observed for any of the individual classifiers in the ensemble, while staying within the range of observed specificity. The Vote Threshold aggregation method achieved improved sensitivity for all 5 ensembles, but typically at the cost of decreased specificity.
Proteo-genomic biomarker ensemble classifiers show promise in the diagnosis of acute renal allograft rejection and can improve classification performance beyond that of individual genomic or proteomic classifiers alone. Validation of our results in an international multicenter study is currently underway.
PMCID: PMC3575305  PMID: 23216969
Biomarkers; Computational; Pipeline; Genomics; Proteomics; Ensemble; Classification
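The two aggregation methods named in the abstract of entry 20 (Average Probability and Vote Threshold) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0.5 probability cutoff, and the `min_votes` parameter are all assumptions made for illustration, and the abstract does not state the thresholds actually used.

```python
from typing import Sequence


def average_probability(probs: Sequence[float], cutoff: float = 0.5) -> bool:
    """Call acute rejection when the mean of the individual classifier
    probabilities reaches the cutoff (cutoff value is an assumption)."""
    if not probs:
        raise ValueError("at least one classifier probability is required")
    return sum(probs) / len(probs) >= cutoff


def vote_threshold(probs: Sequence[float], cutoff: float = 0.5,
                   min_votes: int = 1) -> bool:
    """Call acute rejection when at least `min_votes` classifiers
    individually reach the cutoff (both parameters are assumptions)."""
    if not probs:
        raise ValueError("at least one classifier probability is required")
    votes = sum(p >= cutoff for p in probs)
    return votes >= min_votes


# Hypothetical probabilities from three classifiers for one patient.
patient = [0.9, 0.2, 0.3]
print(average_probability(patient))          # mean is below 0.5
print(vote_threshold(patient, min_votes=1))  # one classifier votes "rejection"
```

With a low `min_votes`, a single confident classifier is enough to flag a sample, which is one plausible reading of why a vote-based rule could raise sensitivity while lowering specificity, as the abstract reports.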
21.  Impact of CYP2D6, CYP3A5, CYP2C9 and CYP2C19 polymorphisms on tamoxifen pharmacokinetics in Asian breast cancer patients 
To investigate the impact of genetic polymorphisms in CYP2D6, CYP3A5, CYP2C9 and CYP2C19 on the pharmacokinetics of tamoxifen and its metabolites in Asian breast cancer patients.
A total of 165 Asian breast cancer patients receiving 20 mg tamoxifen daily and 228 healthy Asian subjects (Chinese, Malay and Indian; n = 76 each) were recruited. The steady-state plasma concentrations of tamoxifen and its metabolites were quantified using high-performance liquid chromatography. The CYP2D6 polymorphisms were genotyped using the INFINITI™ CYP450 2D6I assay, while the polymorphisms in CYP3A5, CYP2C9 and CYP2C19 were determined via direct sequencing.
The polymorphisms CYP2D6*5 and *10 were significantly associated with lower endoxifen and higher N-desmethyltamoxifen (NDM) concentrations. Patients who were *1/*1 carriers exhibited 2.4- to 2.6-fold higher endoxifen concentrations and 1.9- to 2.1-fold lower NDM concentrations than either *10/*10 or *5/*10 carriers (P < 0.001). Similarly, the endoxifen concentrations were found to be 1.8- to 2.6-times higher in *1/*5 or *1/*10 carriers compared with *10/*10 and *5/*10 carriers (P ≤ 0.001). Similar relationships were observed between the CYP2D6 polymorphisms and metabolic ratios of tamoxifen and its metabolites. No significant associations were observed with regard to the polymorphisms in CYP3A5, CYP2C9 and CYP2C19.
The present study in Asian breast cancer patients showed that CYP2D6*5/*10 and *10/*10 genotypes are associated with significantly lower concentrations of the active metabolite of tamoxifen, endoxifen. Identifying such patients before the start of treatment may be useful in optimizing therapy with tamoxifen. The role of CYP3A5, CYP2C9 and CYP2C19 seems to be minor.
PMCID: PMC3093079  PMID: 21480951
CYP2C19; CYP2D6; CYP3A5; pharmacogenetics; pharmacokinetics; tamoxifen
22.  White Blood Cell Differentials Enrich Whole Blood Expression Data in the Context of Acute Cardiac Allograft Rejection 
Acute cardiac allograft rejection is a serious complication of heart transplantation. Investigating molecular processes in whole blood via microarrays is a promising avenue of research in transplantation, particularly due to the non-invasive nature of blood sampling. However, whole blood is a complex tissue and the consequent heterogeneity in composition amongst samples is ignored in traditional microarray analysis. This complicates the biological interpretation of microarray data. Here we have applied a statistical deconvolution approach, cell-specific significance analysis of microarrays (csSAM), to whole blood samples from subjects either undergoing acute heart allograft rejection (AR) or not (NR). We identified eight differentially expressed probe-sets significantly correlated to monocytes (mapping to 6 genes, all down-regulated in ARs versus NRs) at a false discovery rate (FDR) ≤ 15%. None of the genes identified are present in a biomarker panel of acute heart rejection previously published by our group and discovered in the same data.
PMCID: PMC3329187  PMID: 22550401
microarray expression; cell-specific expression; deconvolution; heart; transplantation
23.  A microRNA-21 surge facilitates rapid cyclin D1 translation and cell cycle progression in mouse liver regeneration 
The Journal of Clinical Investigation  2012;122(3):1097-1108.
MicroRNA-21 (miR-21) is thought to be an oncomir because it promotes cancer cell proliferation, migration, and survival. miR-21 is also expressed in normal cells, but its physiological role is poorly understood. Recently, it has been found that miR-21 expression is rapidly induced in rodent hepatocytes during liver regeneration after two-thirds partial hepatectomy (2/3 PH). Here, we investigated the function of miR-21 in regenerating mouse hepatocytes by inhibiting it with an antisense oligonucleotide. To maintain normal hepatocyte viability and function, we antagonized the miR-21 surge induced by 2/3 PH while preserving baseline expression. We found that knockdown of miR-21 impaired progression of hepatocytes into S phase of the cell cycle, mainly through a decrease in levels of cyclin D1 protein, but not Ccnd1 mRNA. Mechanistically, we discovered that increased miR-21 expression facilitated cyclin D1 translation in the early phase of liver regeneration by relieving Akt1/mTOR complex 1 signaling (and thus eIF-4F–mediated translation initiation) from suppression by Rhob. Our findings reveal that miR-21 enables rapid hepatocyte proliferation during liver regeneration by accelerating cyclin D1 translation.
PMCID: PMC3287214  PMID: 22326957
24.  Endodontic photodynamic therapy ex vivo 
Journal of endodontics  2011;37(2):217-222.
To evaluate the anti-microbial effects of photodynamic therapy (PDT) on infected human teeth ex vivo.
Materials and Methods
Fifty-two freshly extracted teeth with pulpal necrosis and associated periradicular radiolucencies were obtained from 34 subjects. Twenty-six teeth with 49 canals received chemomechanical debridement (CMD) with 6% NaOCl and twenty-six teeth with 52 canals received CMD plus PDT. For PDT, root canal systems were incubated with methylene blue (MB) at a concentration of 50 µg/ml for 5 minutes followed by exposure to red light at 665 nm with an energy fluence of 30 J/cm². The contents of root canals were sampled by flushing the canals at baseline and following CMD alone or CMD+PDT and were serially diluted and cultured on blood agar. Survival fractions were calculated by counting colony-forming units (CFU). Partial characterization of root canal species at baseline and following CMD alone or CMD+PDT was performed using DNA probes to a panel of 39 endodontic species in the checkerboard assay.
The Mantel-Haenszel chi-square test for treatment effects demonstrated the better performance of CMD+PDT over CMD (P = 0.026). CMD+PDT significantly reduced the frequency of positive canals relative to CMD alone (P = 0.0003). Following CMD+PDT, 45 of 52 canals (86.5%) had no CFU as compared to 24 of 49 canals (49%) treated with CMD (canal flush samples). The CFU reductions were similar when teeth or canals were treated as independent entities. Post-treatment detection levels for all species were markedly lower for canals treated by CMD+PDT than for those treated by CMD alone. Bacterial species within dentinal tubules were detected in 17/22 (77.3%) and 15/29 (51.7%) of canals in the CMD and CMD+PDT groups, respectively (P = 0.034).
Data indicate that PDT significantly reduces residual bacteria within the root canal system, and that PDT, if further enhanced by technical improvements, holds substantial promise as an adjunct to CMD.
PMCID: PMC3034089  PMID: 21238805
Photodynamic therapy; methylene blue; endodontic disinfection; ex vivo
25.  Fate tracing of mature hepatocytes in mouse liver homeostasis and regeneration 
The Journal of Clinical Investigation  2011;121(12):4850-4860.
Recent evidence has contradicted the prevailing view that homeostasis and regeneration of the adult liver are mediated by self duplication of lineage-restricted hepatocytes and biliary epithelial cells. These new data suggest that liver progenitor cells do not function solely as a backup system in chronic liver injury; rather, they also produce hepatocytes after acute injury and are in fact the main source of new hepatocytes during normal hepatocyte turnover. In addition, other evidence suggests that hepatocytes are capable of lineage conversion, acting as precursors of biliary epithelial cells during biliary injury. To test these concepts, we generated a hepatocyte fate-tracing model based on timed and specific Cre recombinase expression and marker gene activation in all hepatocytes of adult Rosa26 reporter mice with an adenoassociated viral vector. We found that newly formed hepatocytes derived from preexisting hepatocytes in the normal liver and that liver progenitor cells contributed minimally to acute hepatocyte regeneration. Further, we found no evidence that biliary injury induced conversion of hepatocytes into biliary epithelial cells. These results therefore restore the previously prevailing paradigms of liver homeostasis and regeneration. In addition, our new vector system will be a valuable tool for timed, efficient, and specific loop out of floxed sequences in hepatocytes.
PMCID: PMC3226005  PMID: 22105172
