1.  A Simple and Robust Method for Partially Matched Samples Using the P-Values Pooling Approach 
Statistics in medicine  2013;32(19):3247-3259.
This paper focuses on statistical analyses in scenarios where some samples from the matched pairs design are missing, resulting in partially matched samples. Motivated by the idea of meta-analysis, we recast the partially matched samples as coming from two experimental designs, and propose a simple yet robust approach based on the weighted Z-test to integrate the p-values computed from these two designs. We show that the proposed approach achieves better operating characteristics in simulations and a case study, compared to existing methods for partially matched samples.
PMCID: PMC3717400  PMID: 23417968
Meta-analysis; Weighted Z-test; Microarray; False Discovery Rate
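As a rough illustration of the p-value pooling idea in entry 1, the sketch below combines one-sided p-values from two designs with a weighted Z-test. The function name `weighted_z_pool` is hypothetical, and the choice of weights (square roots of the sample sizes) is an assumption: the abstract does not say how the weights are chosen.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def weighted_z_pool(p_values, weights):
    """Pool one-sided p-values with a weighted Z-test.

    Hypothetical helper for illustration; each p-value must lie strictly
    between 0 and 1, otherwise inv_cdf raises StatisticsError.
    """
    if len(p_values) != len(weights):
        raise ValueError("p_values and weights must have the same length")
    # Map each p-value to a Z-score: Z_i = Phi^{-1}(1 - p_i).
    z_scores = [_STD_NORMAL.inv_cdf(1.0 - p) for p in p_values]
    # Weighted sum of Z-scores, rescaled to unit variance under the null.
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sqrt(sum(w * w for w in weights))
    pooled_z = numerator / denominator
    # Convert the pooled Z-score back to a one-sided p-value.
    return 1.0 - _STD_NORMAL.cdf(pooled_z)


# Assumed example: p = 0.01 from 20 complete pairs, p = 0.20 from 10
# unpaired samples, weighted by sqrt(n) (an assumption, not from the paper).
pooled = weighted_z_pool([0.01, 0.20], [sqrt(20), sqrt(10)])
print(f"pooled p-value: {pooled:.4f}")
```

With a single input the function returns that p-value unchanged (up to floating-point error), which is a quick sanity check on the transformation.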
2.  Dynamic Changes in Nucleosome Occupancy Are Not Predictive of Gene Expression Dynamics but Are Linked to Transcription and Chromatin Regulators 
Molecular and Cellular Biology  2012;32(9):1645-1653.
The response to stressful stimuli requires rapid, precise, and dynamic gene expression changes that must be coordinated across the genome. To gain insight into the temporal ordering of genome reorganization, we investigated dynamic relationships between changing nucleosome occupancy, transcription factor binding, and gene expression in Saccharomyces cerevisiae yeast responding to oxidative stress. We applied deep sequencing to nucleosomal DNA at six time points before and after hydrogen peroxide treatment and revealed many distinct dynamic patterns of nucleosome gain and loss. The timing of nucleosome repositioning was not predictive of the dynamics of downstream gene expression change but instead was linked to nucleosome position relative to transcription start sites and specific cis-regulatory elements. We measured genome-wide binding of the stress-activated transcription factor Msn2p over time and found that Msn2p binds different loci with different dynamics. Nucleosome eviction from Msn2p binding sites was common across the genome; however, we show that, contrary to expectation, nucleosome loss occurred after Msn2p binding and in fact required Msn2p. This negates the prevailing model that nucleosomes obscuring Msn2p sites regulate DNA access and must be lost before Msn2p can bind DNA. Together, these results highlight the complexities of stress-dependent chromatin changes and their effects on gene expression.
PMCID: PMC3347246  PMID: 22354995
3.  The Cavβ1a subunit regulates gene expression and suppresses myogenin in muscle progenitor cells 
The Journal of Cell Biology  2014;205(6):829-846.
Cavβ1a acts as a voltage-gated calcium channel-independent regulator of gene expression in muscle progenitor cells and is required for their normal expansion during myogenic development.
Voltage-gated calcium channel (Cav) β subunits are auxiliary subunits to Cavs. Recent reports show Cavβ subunits may enter the nucleus and suggest a role in transcriptional regulation, but the physiological relevance of this localization remains unclear. We sought to define the nuclear function of Cavβ in muscle progenitor cells (MPCs). We found that Cavβ1a is expressed in proliferating MPCs, before expression of the calcium conducting subunit Cav1.1, and enters the nucleus. Loss of Cavβ1a expression impaired MPC expansion in vitro and in vivo and caused widespread changes in global gene expression, including up-regulation of myogenin. Additionally, we found that Cavβ1a localizes to the promoter region of a number of genes, preferentially at noncanonical (NC) E-box sites. Cavβ1a binds to a region of the Myog promoter containing an NC E-box, suggesting a mechanism for inhibition of myogenin gene expression. This work indicates that Cavβ1a acts as a Cav-independent regulator of gene expression in MPCs, and is required for their normal expansion during myogenic development.
PMCID: PMC4068134  PMID: 24934157
4.  Propensity Score Method for Partially Matched Omics Studies 
Cancer Informatics  2014;13(Suppl 7):1-10.
This paper focuses on the problem of partially matched samples in the presence of confounders. We propose using propensity score matching to adjust for confounding factors for the subset of data with incomplete pairs, followed by integrating the P-values computed from the complete and incomplete paired samples, respectively. Several simulations and a case study on DNA methylation are considered to evaluate the operating characteristics of the proposed method.
PMCID: PMC4267441  PMID: 25535453
microarray; confounders; observational studies; full matching; regression
5.  DNA methylation profiling in the Carolina Breast Cancer Study defines cancer subclasses differing in clinicopathologic characteristics and survival 
Breast cancer is a heterogeneous disease, with several intrinsic subtypes differing by hormone receptor (HR) status, molecular profiles, and prognosis. However, the role of DNA methylation in breast cancer development and progression and its relationship with the intrinsic tumor subtypes are not fully understood.
A microarray targeting promoters of cancer-related genes was used to evaluate DNA methylation at 935 CpG sites in 517 breast tumors from the Carolina Breast Cancer Study, a population-based study of invasive breast cancer.
Consensus clustering using methylation (β) values for the 167 most variant CpG loci defined four clusters differing most distinctly in HR status, intrinsic subtype (luminal versus basal-like), and p53 mutation status. Supervised analyses for HR status, subtype, and p53 status identified 266 differentially methylated CpG loci with considerable overlap. Genes relatively hypermethylated in HR+, luminal A, or p53 wild-type breast cancers included FABP3, FGF2, FZD9, GAS7, HDAC9, HOXA11, MME, PAX6, POMC, PTGS2, RASSF1, RBP1, and SCGB3A1, whereas those more highly methylated in HR-, basal-like, or p53 mutant tumors included BCR, C4B, DAB2IP, MEST, RARA, SEPT5, TFF1, THY1, and SERPINA5. Clustering also defined a hypermethylated luminal-enriched tumor cluster 3 that gene ontology analysis revealed to be enriched for homeobox and other developmental genes (ASCL2, DLK1, EYA4, GAS7, HOXA5, HOXA9, HOXB13, IHH, IPF1, ISL1, PAX6, TBX1, SOX1, and SOX17). Although basal-enriched cluster 2 showed worse short-term survival, the luminal-enriched cluster 3 showed worse long-term survival but was not independently prognostic in multivariate Cox proportional hazard analysis, likely due to the mostly early stage cases in this dataset.
This study demonstrates that epigenetic patterns are strongly associated with HR status, subtype, and p53 mutation status and may show heterogeneity within tumor subclass. Among HR+ breast tumors, a subset exhibiting a gene signature characterized by hypermethylation of developmental genes and poorer clinicopathologic features may have prognostic value and requires further study. Genes differentially methylated between clinically important tumor subsets have roles in differentiation, development, and tumor growth and may be critical to establishing and maintaining tumor phenotypes and clinical outcomes.
Electronic supplementary material
The online version of this article (doi:10.1186/s13058-014-0450-6) contains supplementary material, which is available to authorized users.
PMCID: PMC4303129  PMID: 25287138
6.  Obesity increases tumor aggressiveness in a genetically engineered mouse model of serous ovarian cancer 
Gynecologic oncology  2014;133(1):90-97.
Obesity is associated with increased risk and worse outcomes for ovarian cancer. Thus, we examined the effects of obesity on ovarian cancer progression in a genetically engineered mouse model of serous ovarian cancer.
We utilized a unique serous ovarian cancer mouse model that specifically deletes the tumor suppressor genes, Brca1 and p53, and inactivates the retinoblastoma (Rb) proteins in adult ovarian surface epithelial cells, via injection of an adenoviral vector expressing Cre (AdCre) into the ovarian bursa cavity of adult female mice (KpB mouse model). KpB mice were fed either a high fat diet (HFD; 60% of calories derived from fat) or a low fat diet (LFD; 10% of calories derived from fat) to mimic diet-induced obesity. Tumors were isolated at 6 months after AdCre injection and evaluated histologically. Untargeted metabolomic and gene expression profiling was performed to assess differences in the ovarian tumors from obese versus non-obese KpB mice.
At sacrifice, mice on the HFD (obese) were twice the weight of mice on the LFD (non-obese) (51 g versus 31 g, p = 0.0003). Ovarian tumors were significantly larger in the obese versus non-obese mice (3.7 cm² versus 1.2 cm², p = 0.0065). Gene expression and metabolomic profiling indicated statistically significant differences between the ovarian tumors from the obese versus non-obese mice, including metabolically relevant pathways.
PMCID: PMC4090773  PMID: 24680597
Obesity; Ovarian cancer; Mouse model; Metabolomics; Genomics; Biomarkers
7.  Mutations in Isocitrate Dehydrogenase 1 and 2 Occur Frequently in Intrahepatic Cholangiocarcinomas and Share Hypermethylation Targets with Glioblastomas 
Oncogene  2012;32(25):3091-3100.
Mutations in the genes encoding isocitrate dehydrogenase, IDH1 and IDH2, have been reported in gliomas, myeloid leukemias, chondrosarcomas, and thyroid cancer. We discovered IDH1 and IDH2 mutations in 34 of 326 (10%) intrahepatic cholangiocarcinomas. Tumors with mutations in IDH1 or IDH2 had lower 5-hydroxymethylcytosine (5hmC) and higher 5-methylcytosine (5mC) levels, as well as increased dimethylation of histone H3K79. Mutations in IDH1 or IDH2 were associated with longer overall survival (p = 0.028) and were independently associated with a longer time to tumor recurrence after intrahepatic cholangiocarcinoma resection in multivariate analysis (p = 0.021). IDH1 and IDH2 mutations are significantly associated with increased levels of p53 in intrahepatic cholangiocarcinomas, but no mutations in the p53 gene were found, suggesting that mutations in IDH1 and IDH2 may cause a stress that leads to p53 activation. We identified 2,309 genes that were significantly hypermethylated in 19 cholangiocarcinomas with mutations in IDH1 or IDH2, compared with cholangiocarcinomas without these mutations. Hypermethylated CpG sites were significantly enriched in CpG shores and upstream of transcription start sites, suggesting a global regulation of transcriptional potential. Half of the hypermethylated genes overlapped with DNA hypermethylation in IDH1-mutant glioblastomas, suggesting the existence of a common set of genes whose expression may be affected by mutations in IDH1 or IDH2 in different types of tumors.
PMCID: PMC3500578  PMID: 22824796
DNA methylation; Epigenetics; Tumor metabolism
8.  A systematic assessment of normalization approaches for the Infinium 450K methylation platform 
Epigenetics  2013;9(2):318-329.
The Illumina Infinium HumanMethylation450 BeadChip has emerged as one of the most popular platforms for genome wide profiling of DNA methylation. While the technology is wide-spread, systematic technical biases are believed to be present in the data. For example, this array incorporates two different chemical assays, i.e., Type I and Type II probes, which exhibit different technical characteristics and potentially complicate the computational and statistical analysis. Several normalization methods have been introduced recently to adjust for possible biases. However, there is considerable debate within the field on which normalization procedure should be used and indeed whether normalization is even necessary. Yet despite the importance of the question, there has been little comprehensive comparison of normalization methods. We sought to systematically compare several popular normalization approaches using the Norwegian Mother and Child Cohort Study (MoBa) methylation data set and the technical replicates analyzed with it as a case study. We assessed both the reproducibility between technical replicates following normalization and the effect of normalization on association analysis. Results indicate that the raw data are already highly reproducible, some normalization approaches can slightly improve reproducibility, but other normalization approaches may introduce more variability into the data. Results also suggest that differences in association analysis after applying different normalizations are not large when the signal is strong, but when the signal is more modest, different normalizations can yield very different numbers of findings that meet a weaker statistical significance threshold. Overall, our work provides useful, objective assessment of the effectiveness of key normalization methods.
PMCID: PMC3962542  PMID: 24241353
association testing; cotinine exposure; genome wide methylation profiling; normalization; reproducibility
9.  The Role of Ect2 Nuclear RhoGEF Activity in Ovarian Cancer Cell Transformation 
Genes & Cancer  2013;4(11-12):460-475.
Ect2, a Rho guanine nucleotide exchange factor (RhoGEF), is atypical among RhoGEFs in its predominantly nuclear localization in interphase cells. One current model suggests that Ect2 mislocalization drives cellular transformation by promoting aberrant activation of cytoplasmic Rho family GTPase substrates. However, in ovarian cancers, where Ect2 is both amplified and overexpressed at the mRNA level, we observed that the protein is highly expressed and predominantly nuclear and that nuclear but not cytoplasmic Ect2 increases with advanced disease. Knockdown of Ect2 in ovarian cancer cell lines impaired their anchorage-independent growth without affecting their growth on plastic. Restoration of Ect2 expression rescued the anchorage-independent growth defect, but not if either the DH catalytic domain or the nuclear localization sequences of Ect2 were mutated. These results suggested a novel mechanism whereby Ect2 could drive transformation in ovarian cancer cells by acting as a RhoGEF specifically within the nucleus. Interestingly, Ect2 had an intrinsically distinct GTPase specificity profile in the nucleus versus the cytoplasm. Nuclear Ect2 bound preferentially to Rac1, while cytoplasmic Ect2 bound to RhoA but not Rac. Consistent with nuclear activation of endogenous Rac, Ect2 overexpression was sufficient to recruit Rac effectors to the nucleus, a process that required a functional Ect2 catalytic domain. Furthermore, expression of active nuclearly targeted Rac1 rescued the defect in transformed growth caused by Ect2 knockdown. Our work suggests a novel mechanism of Ect2-driven transformation, identifies subcellular localization as a regulator of GEF specificity, and implicates activation of nuclear Rac1 in cellular transformation.
PMCID: PMC3877668  PMID: 24386507
Ect2; RhoGEF; Rac; ovarian cancer
10.  Application of Multiplexed Kinase Inhibitor Beads to Study Kinome Adaptations in Drug-Resistant Leukemia 
PLoS ONE  2013;8(6):e66755.
Protein kinases play key roles in oncogenic signaling and are a major focus in the development of targeted cancer therapies. Imatinib, a BCR-Abl tyrosine kinase inhibitor, is a successful front-line treatment for chronic myelogenous leukemia (CML). However, resistance to imatinib may be acquired by BCR-Abl mutations or hyperactivation of Src family kinases such as Lyn. We have used multiplexed kinase inhibitor beads (MIBs) and quantitative mass spectrometry (MS) to compare kinase expression and activity in an imatinib-resistant (MYL-R) and -sensitive (MYL) cell model of CML. Using MIB/MS, expression and activity changes of over 150 kinases were quantitatively measured from various protein kinase families. Statistical analysis of experimental replicates assigned significance to 35 of these kinases, referred to as the MYL-R kinome profile. MIB/MS and immunoblotting confirmed the over-expression and activation of Lyn in MYL-R cells and identified additional kinases with increased (MEK, ERK, IKKα, PKCβ, NEK9) or decreased (Abl, Kit, JNK, ATM, Yes) abundance or activity. Inhibiting Lyn with dasatinib or by shRNA-mediated knockdown reduced the phosphorylation of MEK and IKKα. Because MYL-R cells showed elevated NF-κB signaling relative to MYL cells, as demonstrated by increased IκBα and IL-6 mRNA expression, we tested the effects of an IKK inhibitor (BAY 65-1942). MIB/MS and immunoblotting revealed that BAY 65-1942 increased MEK/ERK signaling and that this increase was prevented by co-treatment with a MEK inhibitor (AZD6244). Furthermore, the combined inhibition of MEK and IKKα resulted in reduced IL-6 mRNA expression, synergistic loss of cell viability and increased apoptosis. Thus, MIB/MS analysis identified MEK and IKKα as important downstream targets of Lyn, suggesting that co-targeting these kinases may provide a unique strategy to inhibit Lyn-dependent imatinib-resistant CML. 
These results demonstrate the utility of MIB/MS as a tool to identify dysregulated kinases and to interrogate kinome dynamics as cells respond to targeted kinase inhibition.
PMCID: PMC3691232  PMID: 23826126
11.  Dynamic Reprogramming of the Kinome In Response to Targeted MEK Inhibition In Triple Negative Breast Cancer 
Cell  2012;149(2):307-321.
Kinase inhibitors have limited success in cancer treatment because tumors circumvent their action. Using a quantitative proteomics approach, we assessed kinome activity in response to MEK inhibition in triple negative breast cancer (TNBC) cells and genetically engineered mice (GEMMs). MEK inhibition caused acute ERK activity loss, resulting in rapid c-Myc degradation that induced expression and activation of several receptor tyrosine kinases (RTKs). RNAi knockdown of ERK or c-Myc mimicked RTK induction by MEK inhibitors, whereas prevention of proteasomal c-Myc degradation blocked kinome reprogramming. MEK inhibitor-induced RTK stimulation overcame MEK2 but not MEK1 inhibition, reactivating ERK and producing drug resistance. The C3Tag GEMM for TNBC similarly induced RTKs in response to MEK inhibition. The inhibitor-induced RTK profile suggested a kinase inhibitor combination therapy that produced GEMM tumor apoptosis and regression where single agents were ineffective. This approach defines mechanisms of drug resistance, allowing rational design of combination therapies for cancer.
PMCID: PMC3328787  PMID: 22500798
12.  HIF1α and HIF2α independently activate SRC to promote melanoma metastases 
The Journal of Clinical Investigation  2013;123(5):2078-2093.
Malignant melanoma is characterized by a propensity for early lymphatic and hematogenous spread. The hypoxia-inducible factor (HIF) family of transcription factors is upregulated in melanoma by key oncogenic drivers. HIFs promote the activation of genes involved in cancer initiation, progression, and metastases. Hypoxia has been shown to enhance the invasiveness and metastatic potential of tumor cells by regulating the genes involved in the breakdown of the ECM as well as genes that control motility and adhesion of tumor cells. Using a Pten-deficient, Braf-mutant genetically engineered mouse model of melanoma, we demonstrated that inactivation of HIF1α or HIF2α abrogates metastasis without affecting primary tumor formation. HIF1α and HIF2α drive melanoma invasion and invadopodia formation through PDGFRα and focal adhesion kinase–mediated (FAK-mediated) activation of SRC and by coordinating ECM degradation via MT1-MMP and MMP2 expression. These results establish the importance of HIFs in melanoma progression and demonstrate that HIF1α and HIF2α activate independent transcriptional programs that promote metastasis by coordinately regulating cell invasion and ECM remodeling.
PMCID: PMC3635738  PMID: 23563312
13.  Performance of rapid influenza H1N1 diagnostic tests: a meta-analysis 
Following the outbreaks of 2009 pandemic H1N1 infection, rapid influenza diagnostic tests have been used to detect H1N1 infection. However, no meta-analysis had been undertaken to assess their diagnostic accuracy at the time this manuscript was drafted.
The literature was systematically searched to identify studies that reported the performance of rapid tests. Random effects meta-analyses were conducted to summarize the overall performance.
Seventeen studies were selected with 1879 cases and 3477 non-cases. The overall sensitivity and specificity estimates of the rapid tests were 0.51 (95%CI: 0.41, 0.60) and 0.98 (95%CI: 0.94, 0.99). Studies reported heterogeneous sensitivity estimates, ranging from 0.11 to 0.88. If the prevalence was 30%, the overall positive and negative predictive values were 0.94 (95%CI: 0.85, 0.98) and 0.82 (95%CI: 0.79, 0.85). The overall specificities from different manufacturers were comparable, while there were some differences for the overall sensitivity estimates. BinaxNOW had a lower overall sensitivity of 0.39 (95%CI: 0.24, 0.57) compared to all the others (p-value < 0.001), whereas QuickVue had a higher overall sensitivity of 0.57 (95%CI: 0.50, 0.63) compared to all the others (p-value = 0.005).
Rapid tests have high specificity but low sensitivity and thus limited usefulness.
PMCID: PMC3288365  PMID: 21883964
meta analysis; H1N1; diagnostic tests; rapid tests; sensitivity and specificity
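The predictive values quoted in entry 13 follow from sensitivity, specificity, and an assumed prevalence via Bayes' rule. The sketch below shows that arithmetic; the function name `predictive_values` is hypothetical, and plugging in the rounded pooled estimates is only an approximation of the paper's model-based figures.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test at a given disease prevalence.

    Hypothetical helper for illustration; all inputs are proportions in [0, 1].
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


# Rounded pooled estimates from the abstract, at the stated 30% prevalence.
ppv, npv = predictive_values(sensitivity=0.51, specificity=0.98, prevalence=0.30)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

With these rounded inputs the NPV comes out at about 0.82, matching the abstract, while the PPV comes out at about 0.92 rather than the reported 0.94. The gap is plausibly due to rounding of the specificity, since the PPV is very sensitive to small changes in specificity near 1.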
14.  DiffSplice: the genome-wide detection of differential splicing events with RNA-seq 
Nucleic Acids Research  2012;41(2):e39.
The RNA transcriptome varies in response to cellular differentiation as well as environmental factors, and can be characterized by the diversity and abundance of transcript isoforms. Differential transcription analysis, the detection of differences between the transcriptomes of different cells, may improve understanding of cell differentiation and development and enable the identification of biomarkers that classify disease types. The availability of high-throughput short-read RNA sequencing technologies provides in-depth sampling of the transcriptome, making it possible to accurately detect the differences between transcriptomes. In this article, we present a new method for the detection and visualization of differential transcription. Our approach does not depend on transcript or gene annotations. It also circumvents the need for full transcript inference and quantification, which is a challenging problem because of short read lengths, as well as various sampling biases. Instead, our method takes a divide-and-conquer approach to localize the difference between transcriptomes in the form of alternative splicing modules (ASMs), where transcript isoforms diverge. Our approach starts with the identification of ASMs from the splice graph, constructed directly from the exons and introns predicted from RNA-seq read alignments. The abundance of alternative splicing isoforms residing in each ASM is estimated for each sample and is compared across sample groups. A non-parametric statistical test is applied to each ASM to detect significant differential transcription with a controlled false discovery rate. The sensitivity and specificity of the method have been assessed using simulated data sets and compared with other state-of-the-art approaches. 
Experimental validation using qRT-PCR confirmed a selected set of genes that are differentially expressed in a lung differentiation study and a breast cancer data set, demonstrating the utility of the approach applied on experimental biological data sets. The software of DiffSplice is available at
PMCID: PMC3553996  PMID: 23155066
15.  Integrating Prior Knowledge in Multiple Testing under Dependence with Applications to Detecting Differential DNA Methylation 
Biometrics  2012;68(3):774-783.
DNA methylation has emerged as an important hallmark of epigenetics. Numerous platforms including tiling arrays and next generation sequencing, and experimental protocols are available for profiling DNA methylation. Similar to other tiling array data, DNA methylation data shares the characteristics of inherent correlation structure among nearby probes. However, unlike gene expression or protein DNA binding data, the varying CpG density which gives rise to CpG island, shore and shelf definition provides exogenous information in detecting differential methylation. This paper aims to introduce a robust testing and probe ranking procedure based on a non-homogeneous hidden Markov model that incorporates the above-mentioned features for detecting differential methylation. We revisit the seminal work of Sun and Cai (2009, J. R. Stat. Soc. B. 71, 393-424) and propose modeling the non-null using a non-parametric symmetric distribution in two-sided hypothesis testing. We show that this model improves probe ranking and is robust to model misspecification based on extensive simulation studies. We further illustrate that our proposed framework achieves good operating characteristics as compared to commonly used methods in real DNA methylation data that aims to detect differential methylation sites.
PMCID: PMC3449228  PMID: 22260651
Non-homogeneous Hidden Markov Model; False Discovery Rate; Microarray; CpG Island; Kernel Density Estimation; Semiparametric Model
16.  Epstein-Barr virus infected gastric adenocarcinoma expresses latent and lytic viral transcripts and has a distinct human gene expression profile 
EBV DNA is found within the malignant cells of 10% of gastric cancers. Modern molecular technology facilitates identification of virus-related biochemical effects that could assist in early diagnosis and disease management.
In this study, RNA expression profiling was performed on 326 macrodissected paraffin-embedded tissues including 204 cancers and, when available, adjacent non-malignant mucosa. Nanostring nCounter probes targeted 96 RNAs (20 viral, 73 human, and 3 spiked RNAs).
In 182 tissues with adequate housekeeper RNAs, distinct profiles were found in infected versus uninfected cancers, and in malignant versus adjacent benign mucosa. EBV-infected gastric cancers expressed nearly all of the 18 latent and lytic EBV RNAs in the test panel. Levels of EBER1 and EBER2 RNA were highest and were proportional to the quantity of EBV genomes as measured by Q-PCR. Among protein coding EBV RNAs, EBNA1 from the Q promoter and BRLF1 were highly expressed while EBNA2 levels were low positive in only 6/14 infected cancers. Concomitant upregulation of cellular factors implies that virus is not an innocent bystander but rather is linked to NFKB signaling (FCER2, TRAF1) and immune response (TNFSF9, CXCL11, IFITM1, FCRL3, MS4A1 and PLUNC), with PPARG expression implicating altered cellular metabolism. Compared to adjacent non-malignant mucosa, gastric cancers consistently expressed INHBA, SPP1, THY1, SERPINH1, CXCL1, FSCN1, PTGS2 (COX2), BBC3, ICAM1, TNFSF9, SULF1, SLC2A1, TYMS, three collagens, the cell proliferation markers MYC and PCNA, and EBV BLLF1 while they lacked CDH1 (E-cadherin), CLDN18, PTEN, SDC1 (CD138), GAST (gastrin) and its downstream effector CHGA (chromogranin). Compared to lymphoepithelioma-like carcinoma of the uterine cervix, gastric cancers expressed CLDN18, EPCAM, REG4, BBC3, OLFM4, PPARG, and CDH17 while they had diminished levels of IFITM1 and HIF1A. The druggable targets ERBB2 (Her2), MET, and the HIF pathway, as well as several other potential pharmacogenetic indicators (including EBV infection itself, as well as SPARC, TYMS, FCGR2B and REG4) were identified in some tumor specimens.
This study shows how modern molecular technology applied to archival fixed tissues yields novel insights into viral oncogenesis that could be useful in managing affected patients.
PMCID: PMC3598565  PMID: 22929309
Gastric adenocarcinoma; Epstein-Barr virus; RNA expression profile; Stromal cells; Pharmacogenetic test
17.  Metabolomic Profiling Reveals Mitochondrial-Derived Lipid Biomarkers That Drive Obesity-Associated Inflammation 
PLoS ONE  2012;7(6):e38812.
Obesity has reached epidemic proportions worldwide. Several animal models of obesity exist, but studies are lacking that compare traditional lard-based high fat diets (HFD) to “Cafeteria diets” (CAF) consisting of nutrient-poor human junk food. Our previous work demonstrated the rapid and severe obesogenic and inflammatory consequences of CAF compared to HFD including rapid weight gain, markers of Metabolic Syndrome, multi-tissue lipid accumulation, and dramatic inflammation. To identify potential mediators of CAF-induced obesity and Metabolic Syndrome, we used metabolomic analysis to profile serum, muscle, and white adipose from rats fed CAF, HFD, or standard control diets. Principal component analysis identified elevations in clusters of fatty acids and acylcarnitines. These increases in metabolites were associated with systemic mitochondrial dysfunction that paralleled weight gain, physiologic measures of Metabolic Syndrome, and tissue inflammation in CAF-fed rats. Spearman pairwise correlations between metabolites, physiologic, and histologic findings revealed strong correlations between elevated markers of inflammation in CAF-fed animals, measured as crown-like structures in adipose, and specifically the pro-inflammatory saturated fatty acids and oxidation intermediates laurate and lauroyl carnitine. Treatment of bone marrow-derived macrophages with lauroyl carnitine polarized macrophages towards the M1 pro-inflammatory phenotype through downregulation of AMPK and secretion of pro-inflammatory cytokines. Results presented herein demonstrate that compared to a traditional HFD model, the CAF diet provides a robust model for diet-induced human obesity, which models Metabolic Syndrome-related mitochondrial dysfunction in serum, muscle, and adipose, along with pro-inflammatory metabolite alterations. 
These data also suggest that modifying the availability or metabolism of saturated fatty acids may limit the inflammation associated with obesity leading to Metabolic Syndrome.
PMCID: PMC3373493  PMID: 22701716
18.  DNA Methylation Profiling Distinguishes Malignant Melanomas from Benign Nevi 
Pigment cell & melanoma research  2011;24(2):352-360.
DNA methylation, an epigenetic alteration typically occurring early in cancer development, could aid in the molecular diagnosis of melanoma. We determined technical feasibility for high-throughput DNA-methylation array-based profiling using formalin-fixed paraffin-embedded tissues for selection of candidate DNA-methylation differences between melanomas and nevi. Promoter methylation was evaluated in 27 common benign nevi and 22 primary invasive melanomas using a 1505 CpG-site microarray. Unsupervised hierarchical clustering distinguished melanomas from nevi; and 26 CpG sites in 22 genes were identified with significantly different methylation levels between melanomas and nevi after adjustment for age, sex, and multiple comparisons and with β-value differences of ≥ 0.2. Prediction Analysis for Microarrays identified 12 CpG loci that were highly predictive of melanoma, with area under the receiver operating characteristic curves of greater than 0.95. Of our panel of 22 genes, 14 were statistically significant in an independent sample set of 29 nevi (including dysplastic nevi) and 25 primary invasive melanomas after adjustment for age, sex, and multiple comparisons. This first report of a DNA-methylation signature discriminating melanomas from nevi indicates that DNA methylation appears promising as an additional tool for enhancing melanoma diagnosis.
PMCID: PMC3073305  PMID: 21375697
melanoma; nevi; methylation profiling; diagnostic markers
19.  A statistical framework for Illumina DNA methylation arrays 
Bioinformatics  2010;26(22):2849-2855.
Motivation: The Illumina BeadArray is a popular platform for profiling DNA methylation, an important epigenetic event associated with gene silencing and chromosomal instability. However, current approaches rely on an arbitrary detection P-value cutoff for excluding probes and samples from subsequent analysis as a quality control step, which results in missing observations and information loss. It is desirable to have an approach that incorporates the whole data, but accounts for the different quality of individual observations.
Results: We first investigate and propose a statistical framework for removing the source of biases in Illumina Methylation BeadArray based on several positive control samples. We then introduce a weighted model-based clustering called LumiWCluster for Illumina BeadArray that weights each observation according to the detection P-values systematically and avoids discarding subsets of the data. LumiWCluster allows for discovery of distinct methylation patterns and automatic selection of informative CpG loci. We demonstrate the advantages of LumiWCluster on two publicly available Illumina GoldenGate Methylation datasets (ovarian cancer and hepatocellular carcinoma).
Availability: R package LumiWCluster can be downloaded from
Supplementary information: Supplementary data are available at Bioinformatics online.
PMCID: PMC3025715  PMID: 20880956
20.  Discovering Transcription Factor Binding Sites in Highly Repetitive Regions of Genomes with Multi-Read Analysis of ChIP-Seq Data 
PLoS Computational Biology  2011;7(7):e1002111.
Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) is rapidly replacing chromatin immunoprecipitation combined with genome-wide tiling array analysis (ChIP-chip) as the preferred approach for mapping transcription-factor binding sites and chromatin modifications. The state of the art for analyzing ChIP-seq data relies on using only reads that map uniquely to a relevant reference genome (uni-reads). This can lead to the omission of up to 30% of alignable reads. We describe a general approach for utilizing reads that map to multiple locations on the reference genome (multi-reads). Our approach is based on allocating multi-reads as fractional counts using a weighted alignment scheme. Using human STAT1 and mouse GATA1 ChIP-seq datasets, we illustrate that incorporation of multi-reads significantly increases sequencing depths, leads to detection of novel peaks that are not otherwise identifiable with uni-reads, and improves detection of peaks in mappable regions. We investigate various genome-wide characteristics of peaks detected only by utilization of multi-reads via computational experiments. Overall, peaks from multi-read analysis have similar characteristics to peaks that are identified by uni-reads except that the majority of them reside in segmental duplications. We further validate a number of GATA1 multi-read only peaks by independent quantitative real-time ChIP analysis and identify novel target genes of GATA1. These computational and experimental results establish that multi-reads can be of critical importance for studying transcription factor binding in highly repetitive regions of genomes with ChIP-seq experiments.
Author Summary
Annotating repetitive regions of genomes experimentally is a challenging task. Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) provides valuable data for characterizing repetitive regions of genomes in terms of transcription factor binding. Although ChIP-seq technology has been maturing, available ChIP-seq analysis methods and software rely on discarding sequence reads that map to multiple locations on the reference genome (multi-reads), thereby generating a missed opportunity for assessing transcription factor binding to highly repetitive regions of genomes. We develop a computational algorithm that takes multi-reads into account in ChIP-seq analysis. We show with computational experiments that multi-reads lead to significant increase in sequencing depths and identification of binding regions that are otherwise not identifiable when only reads that uniquely map to the reference genome (uni-reads) are used. In particular, we show that the number of binding regions identified can increase up to 36%. We support our computational predictions with independent quantitative real-time ChIP validation of binding regions identified only when multi-reads are incorporated in the analysis of a mouse GATA1 ChIP-seq experiment.
PMCID: PMC3136429  PMID: 21779159
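The fractional-count idea in the abstract above (entry 20) can be illustrated with a minimal sketch. It is not the authors' algorithm: the weighting rule (uni-read counts per bin plus a pseudocount), the bin labels, and the function name `allocate_reads` are all assumptions made for illustration.

```python
from collections import defaultdict


def allocate_reads(uni_reads, multi_reads, pseudocount=1.0):
    """Return fractional read counts per genomic bin.

    uni_reads:   list of bin ids, one per uniquely mapped read.
    multi_reads: list of lists; each inner list holds the candidate
                 bins that one multi-read aligns to.
    pseudocount: ASSUMPTION, keeps bins without uni-reads eligible.
    """
    counts = defaultdict(float)
    for bin_id in uni_reads:
        counts[bin_id] += 1.0

    # ASSUMPTION: weights come from uni-read counts only (single pass).
    base = dict(counts)

    for candidates in multi_reads:
        weights = [base.get(b, 0.0) + pseudocount for b in candidates]
        total = sum(weights)
        for bin_id, weight in zip(candidates, weights):
            # Each multi-read contributes exactly one read in total,
            # split across its candidate locations.
            counts[bin_id] += weight / total
    return dict(counts)


# Hypothetical toy data: bin "C" is covered only by multi-reads.
uni = ["A", "A", "A", "B"]
multi = [["A", "C"], ["B", "C"]]
result = allocate_reads(uni, multi)
```

In this toy input, bin "C" receives a nonzero count only because multi-reads are retained. This mirrors the abstract's point that some regions are not identifiable from uni-reads alone.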
21.  RhoGDI2 antagonizes ovarian carcinoma growth, invasion and metastasis 
Small GTPases  2011;2(4):202-210.
Previous studies described functional roles for Rho GDP dissociation inhibitor 2 (RhoGDI2) in bladder, gastric and breast cancers. However, only limited expression analyses and no functional analyses have been done for RhoGDI2 in ovarian cancer. We determined RhoGDI2 protein expression and function in ovarian cancer. First, protein gel blot analysis was performed to determine the expression levels of RhoGDI2 in ovarian cell lines. RhoGDI2 but not RhoGDI1 protein expression levels varied widely in ovarian carcinoma cell lines, with elevated levels seen in Ras-transformed ovarian epithelial cells. Next, immunohistochemistry was performed to detect RhoGDI2 expression in patient samples of ovarian cysts and ovarian cancer with known histological subtype, stage, grade and outcome. RhoGDI2 protein was significantly overexpressed in high-grade compared with low-grade ovarian cancers and correlated with histological subtype, but did not correlate with stage of ovarian cancer or differ between carcinomas and benign cysts. Unexpectedly, stable suppression of RhoGDI2 protein expression in HeyA8 ovarian cancer cells increased anchorage-independent growth and Matrigel invasion in vitro and tail-vein lung colony metastatic growth in vivo. Finally, we found that RhoGDI2 stably associated preferentially with Rac1 and that suppression of RhoGDI2 expression resulted in decreased Rac1 activity and Rac-associated JNK and p38 mitogen-activated protein kinase signaling. RhoGDI2 antagonizes the invasive and metastatic phenotype of HeyA8 ovarian cancer cells. In summary, our results suggest significant cell context differences in RhoGDI2 function in cancer cell growth.
PMCID: PMC3225909  PMID: 22145092
guanine nucleotide dissociation inhibitor 2; Rho small GTPase; ovarian cancer; Rac; metastasis
22.  A Non-Homogeneous Hidden-State Model on First Order Differences for Automatic Detection of Nucleosome Positions 
The ability to map individual nucleosomes accurately across genomes enables the study of relationships between dynamic changes in nucleosome positioning/occupancy and gene regulation. However, the highly heterogeneous nature of nucleosome densities across genomes and short linker regions pose challenges in mapping nucleosome positions based on high-throughput microarray data of micrococcal nuclease (MNase) digested DNA. Previous works rely on additional detrending and careful visual examination to detect low-signal nucleosomes, which may exist in a subpopulation of cells. We propose a non-homogeneous hidden-state model based on first order differences of experimental data along genomic coordinates that bypasses the need for local detrending and can automatically detect nucleosome positions of various occupancy levels. Our proposed approach is applicable to both low and high resolution MNase-Chip and MNase-Seq (high throughput sequencing) data, and is able to map nucleosome-linker boundaries accurately. This automated algorithm is also computationally efficient and only requires a simple preprocessing step. We provide several examples illustrating the pitfalls of existing methods, the difficulties of detrending the observed hybridization signals and demonstrate the advantages of utilizing first order differences in detecting nucleosome occupancies via simulations and case studies involving MNase-Chip and MNase-Seq data of nucleosome occupancy in yeast S. cerevisiae.
PMCID: PMC2861327  PMID: 19572828
23.  CMARRT: A Tool for the Analysis of ChIP-chip Data from Tiling Arrays by Incorporating the Correlation Structure 
Whole genome tiling arrays at a user specified resolution are becoming a versatile tool in genomics. Chromatin immunoprecipitation on microarrays (ChIP-chip) is a powerful application of these arrays. Although there is an increasing number of methods for analyzing ChIP-chip data, perhaps the most simple and commonly used one, due to its computational efficiency, is testing with a moving average statistic. Current moving average methods assume exchangeability of the measurements within an array. They are not tailored to deal with the issues due to array designs such as overlapping probes that result in correlated measurements. We investigate the correlation structure of data from such arrays and propose an extension of the moving average testing via a robust and rapid method called CMARRT. We illustrate the pitfalls of ignoring the correlation structure in simulations and a case study. Our approach is implemented as an R package called CMARRT and can be used with any tiling array platform.
PMCID: PMC2862456  PMID: 18229712
ChIP-chip; moving average; autocorrelation; false discovery rate
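To show why ignoring correlation matters for a moving average statistic (entry 23), here is a minimal sketch. It is not the CMARRT implementation: it uses a plain sample autocovariance to correct the variance of the window mean, and the function names and toy signal are assumptions for illustration only.

```python
import math


def autocov(x, lag):
    """Biased sample autocovariance of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    return sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag)) / n


def moving_average_z(x, window, use_correlation=True):
    """Standardized moving average over consecutive probes.

    With use_correlation=False the probes are treated as exchangeable,
    so Var(window sum) = w * gamma(0). Otherwise the autocovariance terms
    are added: Var(window sum) = w*gamma(0) + 2*sum_k (w - k)*gamma(k).
    """
    n = len(x)
    mean = sum(x) / n
    var_sum = window * autocov(x, 0)
    if use_correlation:
        var_sum += 2 * sum((window - k) * autocov(x, k) for k in range(1, window))
    if var_sum <= 0:
        raise ValueError("non-positive variance estimate")
    sd_mean = math.sqrt(var_sum) / window
    return [
        (sum(x[i:i + window]) / window - mean) / sd_mean
        for i in range(n - window + 1)
    ]


# Hypothetical smooth signal: neighbouring probes are positively correlated.
signal = [0, 1, 2, 1, 0, -1, -2, -1] * 2
z_corrected = moving_average_z(signal, window=2)
z_naive = moving_average_z(signal, window=2, use_correlation=False)
```

For this positively correlated signal the naive statistic is larger in magnitude than the corrected one at every window, so a test that assumes independence would overstate significance.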
24.  A Non-Homogeneous Hidden-State Model on First Order Differences for Automatic Detection of Nucleosome Positions* 
The ability to map individual nucleosomes accurately across genomes enables the study of relationships between dynamic changes in nucleosome positioning/occupancy and gene regulation. However, the highly heterogeneous nature of nucleosome densities across genomes and short linker regions pose challenges in mapping nucleosome positions based on high-throughput microarray data of micrococcal nuclease (MNase) digested DNA. Previous works rely on additional detrending and careful visual examination to detect low-signal nucleosomes, which may exist in a subpopulation of cells. We propose a non-homogeneous hidden-state model based on first order differences of experimental data along genomic coordinates that bypasses the need for local detrending and can automatically detect nucleosome positions of various occupancy levels. Our proposed approach is applicable to both low and high resolution MNase-Chip and MNase-Seq (high throughput sequencing) data, and is able to map nucleosome-linker boundaries accurately. This automated algorithm is also computationally efficient and only requires a simple preprocessing step. We provide several examples illustrating the pitfalls of existing methods, the difficulties of detrending the observed hybridization signals and demonstrate the advantages of utilizing first order differences in detecting nucleosome occupancies via simulations and case studies involving MNase-Chip and MNase-Seq data of nucleosome occupancy in yeast S. cerevisiae.
PMCID: PMC2861327  PMID: 19572828
nucleosomes; MNase-chip; MNase-Seq; non-homogeneous hidden Markov model; first order differences; smoothing
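The role of first order differences described in the abstract above can be sketched in a few lines. This is only an illustration under stated assumptions: a simple threshold stands in for the paper's non-homogeneous hidden-state model, and the signal, threshold, and function names are hypothetical.

```python
def first_differences(signal):
    """Differences between adjacent positions along the genomic coordinate."""
    return [b - a for a, b in zip(signal, signal[1:])]


def boundary_candidates(signal, threshold):
    """Flag positions where the signal jumps by at least `threshold`.

    ASSUMPTION: thresholding replaces the hidden-state model here.
    Returns (position, "rise" | "fall") pairs.
    """
    diffs = first_differences(signal)
    return [
        (i + 1, "rise" if d > 0 else "fall")
        for i, d in enumerate(diffs)
        if abs(d) >= threshold
    ]


# Hypothetical signal: slow linear trend plus one plateau at positions 5..9.
signal = [0.1 * i + (2.0 if 5 <= i < 10 else 0.0) for i in range(15)]
boundaries = boundary_candidates(signal, threshold=1.0)
# boundaries == [(5, "rise"), (10, "fall")]
```

The linear trend contributes only a small constant to each difference, so both plateau edges are found without a separate detrending step.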
25.  Starr: Simple Tiling ARRay analysis of Affymetrix ChIP-chip data 
BMC Bioinformatics  2010;11:194.
Chromatin immunoprecipitation combined with DNA microarrays (ChIP-chip) is an assay used for investigating DNA-protein-binding or post-translational chromatin/histone modifications. As with all high-throughput technologies, it requires thorough bioinformatic processing of the data for which there is no standard yet. The primary goal is to reliably identify and localize genomic regions that bind a specific protein. Further investigation compares binding profiles of functionally related proteins, or binding profiles of the same proteins in different genetic backgrounds or experimental conditions. Ultimately, the goal is to gain a mechanistic understanding of the effects of DNA binding events on gene expression.
We present a free, open-source R/Bioconductor package Starr that facilitates comparative analysis of ChIP-chip data across experiments and across different microarray platforms. The package provides functions for data import, quality assessment, data visualization and exploration. Starr includes high-level analysis tools such as the alignment of ChIP signals along annotated features, correlation analysis of ChIP signals with complementary genomic data, peak-finding and comparative display of multiple clusters of binding profiles. It uses standard Bioconductor classes for maximum compatibility with other software. Moreover, Starr automatically updates microarray probe annotation files by a highly efficient remapping of microarray probe sequences to an arbitrary genome.
Starr is an R package that covers the complete ChIP-chip workflow from data processing to binding pattern detection. It focuses on the high-level data analysis, e.g., it provides methods for the integration and combined statistical analysis of binding profiles and complementary functional genomics data. Starr enables systematic assessment of binding behaviour for groups of genes that are aligned along arbitrary genomic features.
PMCID: PMC2868012  PMID: 20398407
