Small nucleolar RNAs (snoRNAs) and small Cajal body-specific RNAs (scaRNAs) are non-coding RNAs involved in the maturation of other RNA molecules. Alterations of sno/scaRNA expression may play a role in carcinogenesis. This study elucidates the patterns of sno/scaRNA expression in 211 chronic lymphocytic leukemia (CLL) patients (Binet stage A) and compares them with those of different normal B-cell subsets.
The patterns of sno/scaRNA expression in highly purified CD19+ B-cells of 211 CLL patients and in 18 normal B-cell samples - 6 from peripheral blood, and 12 from tonsils (4 germinal center, 2 marginal zone, 3 switched memory and 3 naïve B-cells) - were analyzed on the Affymetrix GeneChip® Human Gene 1.0 ST array.
CLLs display a sno/scaRNA expression profile similar to that of normal memory, naïve and marginal-zone B-cells, with the exception of a few down-regulated transcripts (SNORA31, -6, -62, and -71C). Our analyses also suggest some heterogeneity in the pattern of sno/scaRNA expression, which is apparently unrelated to the major biological (ZAP-70 and CD38), molecular (IGHV mutation) and cytogenetic markers. Moreover, we found that SNORA70F was significantly down-regulated in poor-prognosis subgroups and that this was associated with down-regulation of its host gene COBLL1. Finally, we generated an independent model based on SNORA74A and SNORD116-18 expression, which appears to distinguish two different prognostic CLL groups.
These data extend the view of sno/scaRNA deregulation in cancer and may contribute to the discovery of novel biomarkers associated with the disease and potentially useful for predicting the clinical outcome of early-stage CLL patients.
As part of the civil aviation safety program to define the adverse effects of ethanol on flying performance, we performed a DNA microarray analysis of human whole blood samples to discover significant gene expression changes in response to ethanol exposure. Samples came from a five-time-point study of subjects administered ethanol orally, with breathalyzer analysis used to monitor blood alcohol concentration (BAC).
Subjects were administered either orange juice or orange juice with ethanol. Blood samples were taken based on BAC and total RNA was isolated from PaxGene™ blood tubes. The amplified cDNA was used in microarray and quantitative real-time polymerase chain reaction (RT-qPCR) analyses to evaluate differential gene expression. Microarray data were summarized and normalized in a pipeline, and the results were evaluated for relative expression across time points with multiple methods. Candidate genes showing distinctive expression patterns in response to ethanol were clustered by pattern and further analyzed for related function, pathway membership and common transcription factor binding within and across clusters. RT-qPCR was used with representative genes to confirm that relative transcript levels across time matched those detected in microarrays.
Microarray analysis of samples representing 0%, 0.04%, 0.08%, return to 0.04%, and 0.02% wt/vol BAC showed that changes in gene expression could be detected across the time course. The expression changes were verified by RT-qPCR.
The candidate genes of interest (GOI) identified from the microarray analysis and clustered by expression pattern across the five BAC points showed seven coordinately expressed groups. Analysis showed function-based networks, shared transcription factor binding sites and signaling pathways for members of the clusters. These include hematological functions, innate immunity and inflammation functions, metabolic functions expected of ethanol metabolism, and pancreatic and hepatic function. Five of the seven clusters showed links to the p38 MAPK pathway.
The results of this study provide a first look at changing gene expression patterns in human blood during an acute rise in blood ethanol concentration and its depletion because of metabolism and excretion, and demonstrate that it is possible to detect changes in gene expression using total RNA isolated from whole blood. The analysis approach for this study serves as a workflow to investigate the biology linked to expression changes across a time course and from these changes, to identify target genes that could serve as biomarkers linked to pilot performance.
Ethanol; Blood; Gene expression; Biomarkers; Microarray
It is suspected that early gastric carcinoma (GC) is a dormant variant that rarely progresses to advanced GC. We demonstrated that the dormant and aggressive variants of tubular adenocarcinomas (TUBs) of the stomach are characterized by loss of MYC and gain of TP53 and gain of MYC and/or loss of TP53, respectively. The aim of this study is to determine whether this is also the case in undifferentiated-type GCs (UGCs) of different genetic lineages: one with a layered structure (LS+), derived from early signet ring cell carcinomas (SIGs), and the other, mostly poorly differentiated adenocarcinomas, without LS but with a minor tubular component (TC), dedifferentiated from TUBs (LS−/TC+).
Using 29 surgically resected stomachs with 9 intramucosal and 20 invasive UGCs (11 LS+ and 9 LS−/TC+), 63 genomic DNA samples of mucosal and invasive parts and corresponding reference DNAs were prepared from formalin-fixed, paraffin-embedded tissues with laser microdissection, and were subjected to array-based comparative genomic hybridization (aCGH), using 60K microarrays, and subsequent unsupervised, hierarchical clustering. Of 979 cancer-related genes assessed, we selected genes with mean copy numbers significantly different between the two major clusters.
Based on similarity in genomic copy-number profile, the 63 samples were classified into two major clusters. Clusters A and B, which were rich in LS+ UGC and LS−/TC+ UGC, respectively, were discriminated on the basis of 40 genes. The aggressive pattern was more frequently detected in LS−/TC+ UGCs (20/26; 77%) than in LS+ UGCs (17/37; 46%; P = 0.0195), whereas no dormant pattern was detected in any of the UGC samples.
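The comparison of the two proportions above can be reproduced from the counts given in the text. The sketch below is illustrative only: the abstract does not say which test produced the reported P value, so the use of a two-sided Fisher's exact test is an assumption, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum over every table that is no more probable than the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Counts from the text: aggressive pattern in 20/26 LS-/TC+ samples
# and in 17/37 LS+ samples (assumed to be the inputs to the test).
p_value = fisher_exact_two_sided(20, 6, 17, 20)
print(f"{20 / 26:.0%} vs {17 / 37:.0%}, two-sided p = {p_value:.4f}")
```

Under these assumptions the computed value should land close to the reported P = 0.0195; a chi-square test on the same table would give a somewhat different figure.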
In contrast to TUBs, copy number alterations of MYC and TP53 exhibited an aggressive pattern in LS+ SIG at early and advanced stages, indicating that early LS+ UGCs inevitably progress to advanced GC. Cluster B (enriched in LS−/TC+) exhibited more frequent gain of driver genes and a more frequent aggressive pattern than cluster A, suggesting a potentially worse prognosis in UGCs of cluster B.
Down syndrome (DS) is a complex disorder caused by trisomy of either the entire chromosome 21 or a critical region of it (21q22.1-22.3). Although DS is the most common cause of mental retardation, its molecular bases are still largely unknown.
To better understand the pathogenesis of DS, we analyzed the genome-wide transcription profiles of lymphoblastoid cell lines (LCLs) from six DS and six euploid individuals and investigated differential gene expression and pathway deregulation associated with trisomy 21. Connectivity map and PASS-assisted exploration were used to identify compounds whose molecular signatures counteracted those of DS lymphoblasts and to predict their therapeutic potential. An experimental validation in DS LCLs and fetal fibroblasts was performed for the most deregulated GO categories, i.e. ubiquitin-mediated proteolysis and the NF-kB cascade.
We show, for the first time, that the level of protein ubiquitination is reduced in human DS cell lines and that proteasome activity is increased both in basal conditions and in an oxidative microenvironment. We also provide the first evidence that NF-kB transcription levels, a paradigm of gene expression control by ubiquitin-mediated degradation, are impaired in DS due to reduced IkB-alpha ubiquitination, increased levels of the NF-kB inhibitor IkB-alpha, and a reduced p65 nuclear fraction. Finally, the DSCR1/DYRK1A/NFAT genes were analysed. In human DS LCLs, we confirmed the presence of increased protein levels of DSCR1 and DYRK1A, and showed that the levels of the transcription factor NFATc2 were decreased in DS along with a reduction of its nuclear translocation upon induction of calcium fluxes.
The present work offers new perspectives to better understand the pathogenesis of DS and suggests a rationale for innovative approaches to treat some pathological conditions associated with DS.
Down syndrome; Trisomy 21; Expression; Ubiquitin-proteasome system; NF-kB
End-stage renal failure is associated with profound changes in physiology and health, but the molecular causation of these pleomorphic effects termed “uremia” is poorly understood. The genomic changes of uremia were explored in a whole genome microarray case-control comparison of 95 subjects with end-stage renal failure (n = 75) or healthy controls (n = 20).
RNA was separated from blood drawn in PAXgene tubes and gene expression was analyzed using Affymetrix Human Genome U133 Plus 2.0 arrays. Quality control and normalization were performed, and statistical significance was determined with multiple test corrections (qFDR). Biological interpretation was aided by knowledge mining using NIH DAVID, MetaCore and PubGene.
Over 9,000 genes were differentially expressed in uremic subjects compared to normal controls (fold change: -5.3 to +6.8), and more than 65% were lower in uremia. Changes appeared to be regulated through key gene networks involving cMYC, SP1, P53, AP1, NFkB, HNF4 alpha, HIF1A, c-Jun, STAT1, STAT3 and CREB1. Gene set enrichment analysis showed that mRNA processing and transport, protein transport, chaperone functions, the unfolded protein response and genes involved in tumorigenesis were prominently lower in uremia, while insulin-like growth factor activity, neuroactive receptor interaction, the complement system, lipoprotein metabolism and lipid transport were higher in uremia. Pathways involving cytoskeletal remodeling, the clathrin-coated endosomal pathway, T-cell receptor signaling and CD28 pathways, and many immune and biological mechanisms were significantly down-regulated, while the ubiquitin pathway and certain others were up-regulated.
End-stage renal failure is associated with profound changes in human gene expression, which appear to be mediated through key transcription factors. Dialysis and primary kidney disease had minor effects on gene regulation, but uremia was the dominant influence in the changes observed. These data provide important insight into the changes in cellular biology and function, opportunities for biomarkers of disease progression and therapy, and potential targets for intervention in uremia.
Gene expression profiling; Uremia; Chronic renal failure
SCA28 is an autosomal dominant ataxia associated with AFG3L2 gene mutations. We performed whole-genome expression profiling using lymphoblastoid cell lines (LCLs) from four SCA28 patients and six unrelated healthy controls matched for sex and age.
Gene expression was evaluated with the Affymetrix GeneChip Human Genome U133A 2.0 Arrays and data were validated by real-time PCR.
We found 66 genes whose expression was statistically different in SCA28 LCLs, 35 of which were up-regulated and 31 down-regulated. The differentially expressed genes were clustered in five functional categories: (1) regulation of cell proliferation; (2) regulation of programmed cell death; (3) response to oxidative stress; (4) cell adhesion, and (5) chemical homeostasis. To validate these data, we performed functional experiments that showed impaired growth of SCA28 LCLs compared to controls (p < 0.005), an increased number of cells in the G0/G1 phase (p < 0.001), and increased mortality due to apoptosis (p < 0.05). We also showed that respiratory chain activity and reactive oxygen species levels were not altered, although lipid peroxidation in SCA28 LCLs was increased in basal conditions (p < 0.05). We did not detect large mitochondrial DNA deletions. An increase of TFAM, a crucial protein for mtDNA maintenance, and of DRP1, a key regulator of mitochondrial dynamics, suggested an alteration of fission/fusion pathways.
Whole genome expression profiling, performed on SCA28 LCLs, allowed us to identify five altered functional categories that characterize the SCA28 LCLs phenotype, the first reported in human cells to our knowledge.
Autosomal dominant cerebellar ataxia; Spinocerebellar ataxia; SCA28; AFG3L2; Genome-wide expression; LCLs
Alternative splicing is critical for generating complex proteomes in response to extracellular signals. Nuclear receptors including estrogen receptor alpha (ERα) and their ligands promote alternative splicing. The endogenous targets of ERα:estradiol (E2)-mediated alternative splicing and the influence of extracellular kinases that phosphorylate ERα on E2-induced splicing are unknown.
MCF-7 and its anti-estrogen derivatives were used for the majority of the assays. CD44 mini gene was used to measure the effect of E2 and AKT on alternative splicing. ExonHit array analysis was performed to identify E2 and AKT-regulated endogenous alternatively spliced apoptosis-related genes. Quantitative reverse transcription polymerase chain reaction was performed to verify alternative splicing. ERα binding to alternatively spliced genes was verified by chromatin immunoprecipitation assay. Bromodeoxyuridine incorporation-ELISA and Annexin V labeling assays were done to measure cell proliferation and apoptosis, respectively.
We identified the targets of E2-induced alternative splicing and deconstructed some of the mechanisms surrounding E2-induced splicing by combining splice array with ERα cistrome and gene expression array. E2-induced alternatively spliced genes fall into at least two subgroups: those coupled to E2-regulated transcription and those with ERα binding to the gene without an effect on the rate of transcription. Further, AKT, which phosphorylates both ERα and splicing factors, influenced ERα:E2-dependent splicing in a gene-specific manner. Genes that are alternatively spliced include FAS/CD95, FGFR2, and AXIN-1. E2 increased the expression of the FGFR2 C1 isoform but reduced the C3 isoform at the mRNA level. E2-induced alternative splicing of FAS and FGFR2 in MCF-7 cells correlated with resistance to FAS activation-induced apoptosis and response to keratinocyte growth factor (KGF), respectively. Resistance of MCF-7 breast cancer cells to the anti-estrogen tamoxifen was associated with ERα-dependent overexpression of FGFR2, whereas resistance to fulvestrant was associated with ERα-dependent isoform switching, which correlated with altered response to KGF.
E2 may partly alter the cellular proteome through alternative splicing uncoupled from its effects on transcription initiation, and aberrations in E2-induced alternative splicing events may influence the response to anti-estrogens.
The collection of viable DNA samples is an essential element of any genetics research programme. Biological samples for DNA purification are now routinely collected in many studies, with a variety of sampling methods available. Initial observations in this study suggested a reduced genotyping success rate for some saliva-derived DNA samples compared to blood-derived DNA samples, prompting further investigation.
Genotyping success rate was investigated to assess the suitability of using saliva samples in future safety and efficacy pharmacogenetics experiments. The Oragene® OG-300 DNA Self-Collection kit was used to collect and extract DNA from saliva from 1468 subjects enrolled in global clinical studies. Statistical analysis evaluated the impact of saliva collection volume on the quality, yield, concentration and performance of saliva DNA in genotyping assays.
Across 13 global clinical studies that utilized the Oragene® OG-300 DNA Self-Collection kit there was variability in the volume of saliva sample collection with ~31% of participants providing 0.5 mL of saliva, rather than the recommended 2 mL. While the majority of saliva DNA samples provided high quality genotype data, collection of 0.5 mL volumes of saliva contributed to DNA samples being significantly less likely to pass genotyping quality control standards. Assessment of DNA sample characteristics that may influence genotyping outcomes indicated that saliva sample volume, DNA purity and turbidity were independently associated with sample genotype pass rate, but that saliva collection volume had the greatest effect.
When employing saliva sampling to obtain DNA, it is important to encourage all study participants to provide sufficient sample to minimize potential loss of data in downstream genotyping experiments.
Global study; Volume of saliva collection; DNA characteristics; Genotyping performance
Genotype-Driven Recruitment (GDR) is a research design that recruits research participants based on genotype rather than based on the presence or absence of a particular condition or clinical outcome. Analyses of the ethical issues of GDR studies, and the recommendations derived from these analyses, are based on GDR research designs that make use of genetic information already collected in previous studies. However, as genotyping becomes more affordable, it is expected that genotypic information will become a common part of the information stored in biobanks and held in health care records. Furthermore, individuals will increasingly gain knowledge of their own genotypes through Direct-to-Consumer services. One can therefore foresee that individuals will be invited to participate not only in follow-up GDR studies but also in original GDR studies because genetic information about them is available. These individuals may or may not have participated in research before and may or may not be aware that their genetic information is available for research.
From a conceptual point of view, we investigate whether the current ethics-related recommendations for the conduct of GDR suffice for a broader array of circumstances under which genetic information can be available. Our analysis reveals that the existing recommendations do not suffice for a broader use of GDR.
Our findings refocus attention on ethical issues which are neither new nor specific to GDR but which place greater demand on coordinated solutions. These challenges and approaches for addressing them are discussed.
Genotype-Driven Recruitment; Feedback of results; Genetic testing; Biobanks
Cidofovir (CDV) has proved efficacious in the treatment of human papillomavirus (HPV)-related hyperplasias. Antiproliferative effects of CDV have been associated with apoptosis induction, S-phase accumulation, and increased levels of tumor suppressor proteins. However, the molecular mechanisms for the selectivity and antitumor activity of CDV against HPV-transformed cells remain unexplained.
We evaluated CDV drug metabolism and incorporation into cellular DNA, in addition to whole genome gene expression profiling by means of microarrays in two HPV+ cervical carcinoma cell lines, HPV- immortalized keratinocytes, and normal keratinocytes.
Determination of the metabolism and drug incorporation of CDV into genomic DNA demonstrated a higher rate of drug incorporation in HPV+ tumor cells and immortalized keratinocytes compared to normal keratinocytes. Gene expression profiling clearly showed distinct and specific drug effects in the cell types investigated. Although an effect on inflammatory response was seen in all cell types, different pathways were identified in normal keratinocytes compared to immortalized keratinocytes and HPV+ tumor cells. Notably, Rho GTPase pathways, LXR/RXR pathways, and acute phase response signaling were exclusively activated in immortalized cells. CDV-exposed normal keratinocytes displayed activated cell cycle regulation upon DNA damage signaling to allow DNA repair via homologous recombination, resulting in genomic stability and survival. Although CDV induced cell cycle arrest in HPV- immortalized cells, DNA repair was not activated in these cells. In contrast, HPV+ cells lacked cell cycle regulation, leading to genomic instability and eventually apoptosis.
Taken together, our data provide novel insights into the mechanism of action of CDV and its selectivity for HPV-transformed cells. The proposed mechanism suggests that this selectivity is based on the inability of HPV+ cells to respond to DNA damage, rather than on a direct anti-HPV effect. Since cell cycle control is deregulated by the viral oncoproteins E6 and E7 in HPV+ cells, these cells are more susceptible to DNA damage than normal keratinocytes. Our findings underline the therapeutic potential of CDV for HPV-associated malignancies as well as other neoplasias.
Cervical carcinoma; Cidofovir; DNA damage and repair; Gene expression profiling; Human papillomavirus
Chronic kidney disease (CKD) patients present a complex interaction between the innate and adaptive immune systems, in which immune activation (hypercytokinemia and acute-phase response) and immune suppression (impairment of response to infections and poor development of adaptive immunity) coexist. In this setting, circulating uremic toxins and microinflammation play a critical role. This condition, already present in the last stages of renal damage, seems to be enhanced by the contact of blood with bioincompatible extracorporeal hemodialysis (HD) devices. However, although widely described, the cellular machinery associated with CKD- and HD-related immune dysfunction is still poorly defined. Understanding the mechanisms behind this important complication may open a perspective for improving patient outcomes.
To better understand the biological bases of CKD-related immune dysfunction and to identify differences between CKD patients on conservative treatment (CKD) and those on HD treatment, we used a high-throughput strategy (microarray) combined with classical bio-molecular approaches.
Immune transcriptomic screening of peripheral blood mononuclear cells (1030 gene probe sets selected by Gene Ontology) showed that 275 gene probe sets (corresponding to 213 genes) discriminated 9 CKD patients stage III-IV (mean ± SD of eGFR: 32.27±14.7 ml/min) from 17 HD patients (p < 0.0001, FDR = 5%). Seventy-one genes were up-regulated and 142 down-regulated in HD patients. Functional analysis then revealed close biological links among the selected genes, with a pivotal role of PTX3, IL-15 (up-regulated in HD) and HLA-G (down-regulated in HD). ELISA, performed on an independent testing group [11 CKD stage III-IV (mean ± SD of eGFR: 30.26±14.89 ml/min) and 13 HD], confirmed that HLA-G, a protein with inhibitory effects on several immune cell lineages including natural killer (NK) cells, was down-regulated in HD (p = 0.04). Additionally, in the testing group, protein levels of CX3CR1, a highly selective chemokine receptor and surface marker for cytotoxic effector lymphocytes, were higher in HD than in CKD (p < 0.01).
Taken together, our results show, for the first time, that HD patients present a different immune pattern compared to undialyzed CKD patients. Some of the selected genes encode important biological elements involved in proliferation/activation of cytotoxic effector lymphocytes and in the immune-inflammatory cellular machinery. Additionally, this study reveals new potential diagnostic biomarkers and therapeutic targets.
It was previously reported that an association analysis based on haplotype clusters increased power over single-locus tests, and that another association test based on diplotype trend regression analysis outperformed other, more common association approaches. We suggest a novel algorithm to combine haplotype cluster- and diplotype-based analyses.
Diplotyper combines a novel algorithm designed to cluster haplotypes of interest from a given set of haplotypes with two existing tools: Haploview, for analyses of linkage disequilibrium blocks and haplotypes, and PLINK, to generate all possible diplotypes from given genotypes of samples and calculate linear or logistic regression. In addition, procedures for generating all possible diplotypes from the haplotype clusters and transforming these diplotypes into PLINK formats were implemented.
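The step of generating all possible diplotypes from a set of haplotype clusters can be illustrated with a small sketch. This is not Diplotyper's own code and does not reproduce its clustering algorithm or the PLINK file formats; the cluster labels and function names are hypothetical, and a diplotype is assumed to be an unordered pair of haplotype clusters.

```python
from itertools import combinations_with_replacement


def enumerate_diplotypes(haplotype_clusters):
    """Return every unordered pair of haplotype clusters (one pair = one diplotype)."""
    labels = sorted(haplotype_clusters)
    return list(combinations_with_replacement(labels, 2))


def diplotype_dosage(diplotype, cluster_of_interest):
    """Count copies (0, 1 or 2) of one cluster in a diplotype, e.g. as a regression covariate."""
    return sum(1 for hap in diplotype if hap == cluster_of_interest)


clusters = ["H1", "H2", "H3"]  # hypothetical haplotype cluster labels
for dip in enumerate_diplotypes(clusters):
    print(dip, "copies of H1:", diplotype_dosage(dip, "H1"))
```

With n clusters this yields n(n + 1)/2 diplotypes, so the number of categories grows quickly, which is one reason to cluster haplotypes before forming diplotypes.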
Diplotyper is a fully automated tool for performing association analysis based on diplotypes in a population. Diplotyper was tested through association analysis of hepatic lipase (LIPC) gene polymorphisms or diplotypes and levels of high-density lipoprotein (HDL) cholesterol.
Diplotyper is useful for identifying more precise and distinct signals than single-locus tests.
High-throughput (HT) RNA interference (RNAi) screens are increasingly used for reverse genetics and drug discovery. These experiments are laborious and costly, hence sample sizes are often very small. Powerful statistical techniques to detect siRNAs that potentially enhance treatment are currently lacking, because existing methods do not optimally use the amount of data in the other dimension, the feature dimension.
We introduce ShrinkHT, a Bayesian method for shrinking multiple parameters in a statistical model, where 'shrinkage' refers to borrowing information across features. ShrinkHT is very flexible in fitting the effect size distribution for the main parameter of interest, thereby accommodating skewness that naturally occurs when siRNAs are compared with controls. In addition, it naturally down-weights the impact of nuisance parameters (e.g. assay-specific effects) when these tend to have little effect across siRNAs. We show that these properties lead to better ROC curves than the popular limma software. Moreover, in a 3 + 3 treatment vs control experiment with 'assay' as an additional nuisance factor, ShrinkHT is able to detect three (out of 960) significant siRNAs with stronger enhancement effects than the positive control. These were not detected by limma. In the context of gene-targeted (conjugate) treatment, these are interesting candidates for further research.
Using annotations to the articles in MEDLINE®/PubMed®, over six thousand chemical compounds with pharmacological actions have been tracked since 1996. Medical Subject Heading Over-representation Profiles (MeSHOPs) quantitatively leverage the literature associated with biological entities such as diseases or drugs, providing the opportunity to reposition known compounds towards novel disease applications.
A MeSHOP is constructed by counting the number of times each medical subject term is assigned to an entity-related research publication in the MEDLINE database and calculating the significance of the count by comparing against the count of the term in a background set of publications. Based on the expectation that drugs suitable for treatment of a disease (or disease symptom) will have similar annotation properties to the disease, we successfully predict drug-disease associations by comparing MeSHOPs of diseases and drugs.
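The counting-and-significance step described above can be sketched as follows. The abstract does not name the statistical test, so the hypergeometric upper-tail test used here is an assumption, as are the function names and the toy counts; the entity's publications are assumed to be a subset of the background set.

```python
from math import comb


def overrepresentation_p(term_in_entity, entity_total, term_in_background, background_total):
    """Upper-tail hypergeometric p-value: chance of seeing at least
    `term_in_entity` annotated articles among `entity_total` articles drawn
    from a background in which `term_in_background` articles carry the term."""
    upper = min(entity_total, term_in_background)
    tail = sum(
        comb(term_in_background, k)
        * comb(background_total - term_in_background, entity_total - k)
        for k in range(term_in_entity, upper + 1)
    )
    return tail / comb(background_total, entity_total)


def build_profile(entity_counts, entity_total, background_counts, background_total):
    """Map each term to its over-representation p-value for one entity."""
    return {
        term: overrepresentation_p(n, entity_total, background_counts[term], background_total)
        for term, n in entity_counts.items()
    }


# Toy numbers (hypothetical): 1000 background articles, 20 about the entity.
background = {"Inflammation": 50, "Humans": 900}
entity = {"Inflammation": 10, "Humans": 18}
profile = build_profile(entity, 20, background, 1000)
for term, p in sorted(profile.items(), key=lambda kv: kv[1]):
    print(f"{term}: p = {p:.3g}")
```

In this toy profile the rare term that appears in half of the entity's articles scores far lower than the ubiquitous one, which is the behaviour a profile comparison between a drug and a disease would rely on.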
The MeSHOP comparison approach delivers an 11% improvement over bibliometric baselines. However, novel drug-disease associations are observed to be biased towards drugs and diseases with more publications. To account for the annotation biases, a correction procedure is introduced and evaluated.
By explicitly accounting for the annotation bias, unexpectedly similar drug-disease pairs are highlighted as candidates for drug repositioning research. MeSHOPs are shown to provide a literature-supported perspective for discovery of new links between drugs and diseases based on pre-existing knowledge.
Congenital muscular torticollis (CMT) is characterized by thickening and/or tightness of the unilateral sternocleidomastoid muscle (SCM), resulting in torticollis. Our aim was to identify differentially expressed genes (DEGs) and novel protein interaction network modules of CMT, and to discover the relationship between gene expression and clinical severity of CMT.
Twenty-eight SCMs (23 from subjects with CMT and 5 without CMT) were allocated for microarray, MRI, or immunohistochemical studies. We first identified 269 genes as DEGs in CMT. Gene ontology enrichment analysis revealed that the DEGs were mainly associated with the extracellular region part and developmental processes. Five CMT-related protein network modules were identified, which indicated that the key pathway is fibrosis related to collagen and elastin fibrillogenesis, with evidence of a DNA repair mechanism. Interestingly, the expression levels of 8 DEGs, termed CMT signature genes, whose mRNA expression was additionally confirmed by quantitative real-time PCR, showed good correlation with the severity of CMT as measured on preoperative MRI images (R² ranging from 0.21 to 0.82). Moreover, protein expression of ELN, ASPN and CHD3, which were identified from the CMT-related protein network modules, differed between CMT and normal SCM.
Here we provide an integrative analysis of CMT from gene expression to clinical significance, in which gene expression showed good correlation with the clinical severity of CMT. Furthermore, CMT-related protein network modules were identified, providing a more in-depth understanding of the pathophysiology of CMT.
Gene expression-based prostate cancer gene signatures of poor prognosis are hampered by lack of gene feature reproducibility and a lack of understandability of their function. Molecular pathway-level mechanisms are intrinsically more stable and more robust than an individual gene. The Functional Analysis of Individual Microarray Expression (FAIME) we developed allows distinctive sample-level pathway measurements with utility for correlation with continuous phenotypes (e.g. survival). Further, we and others have previously demonstrated that pathway-level classifiers can be as accurate as gene-level classifiers using curated genesets that may implicitly comprise ascertainment biases (e.g. KEGG, GO). Here, we hypothesized that transformation of individual prostate cancer patient gene expression to pathway-level mechanisms derived from automated high throughput analyses of genomic datasets may also permit personalized pathway analysis and improve prognosis of recurrent disease.
Via FAIME, three independent prostate gene expression arrays with both normal and tumor samples were transformed into two distinct types of molecular pathway mechanisms: (i) the curated Gene Ontology (GO) and (ii) dynamic expression activity networks of cancer (Cancer Modules). FAIME-derived mechanisms for tumorigenesis were then identified and compared. Curated GO and computationally generated "Cancer Module" mechanisms overlap significantly and are enriched for known oncogenic deregulations and highlight potential areas of investigation. We further show in two independent datasets that these pathway-level tumorigenesis mechanisms can identify men who are more likely to develop recurrent prostate cancer (log-rank p = 0.019).
Curation-free biomodules classification derived from congruent gene expression activation breaks from the paradigm of recapitulating the known curated pathway mechanism universe.
With the recent decreasing cost of genome sequence data, there has been increasing interest in rare variants and methods to detect their association to disease. We developed BioBin, a flexible collapsing method inspired by biological knowledge that can be used to automate the binning of low frequency variants for association testing. We also built the Library of Knowledge Integration (LOKI), a repository of data assembled from public databases, which contains resources such as: dbSNP and gene Entrez database information from the National Center for Biotechnology (NCBI), pathway information from Gene Ontology (GO), Protein families database (Pfam), Kyoto Encyclopedia of Genes and Genomes (KEGG), Reactome, NetPath - signal transduction pathways, Open Regulatory Annotation Database (ORegAnno), Biological General Repository for Interaction Datasets (BioGrid), Pharmacogenomics Knowledge Base (PharmGKB), Molecular INTeraction database (MINT), and evolutionary conserved regions (ECRs) from UCSC Genome Browser. The novelty of BioBin is access to comprehensive knowledge-guided multi-level binning. For example, bin boundaries can be formed using genomic locations from: functional regions, evolutionary conserved regions, genes, and/or pathways.
We tested BioBin using simulated data and 1000 Genomes Project low-coverage data, evaluating the method with simulated causative variants and with a pairwise comparison of rare variant (MAF < 0.03) burden differences between Yoruba individuals (YRI) and individuals of European descent (CEU). Lastly, we analyzed the NHLBI GO Exome Sequencing Project Kabuki dataset (Kabuki syndrome is a congenital disorder affecting multiple organs and often involving intellectual disability), contrasted with Complete Genomics data as controls.
The results from our simulation studies indicate that the type I error rate is controlled; however, power falls quickly for small sample sizes using variants with modest effect sizes. Using BioBin, we were able to find simulated variants in genes with fewer than 20 loci, but found the sensitivity to be much lower in large bins. We also highlighted the scale of population stratification between two 1000 Genomes Project populations, CEU and YRI. Lastly, we were able to apply BioBin to natural biological data from dbGaP and identify an interesting candidate gene for further study.
We have established that BioBin will be a very practical and flexible tool to analyze sequence data and potentially uncover novel associations between low frequency variants and complex disease.
Genes do not act in isolation but instead as part of complex regulatory networks. To understand how breast tumors adapt to the presence of the drug letrozole, at the molecular level, it is necessary to consider how the expression levels of genes in these networks change relative to one another.
Using transcriptomic data generated from sequential tumor biopsy samples, taken at diagnosis, following 10-14 days and following 90 days of letrozole treatment, and a pairwise partial correlation statistic, we build temporal gene coexpression networks. We characterize the structure of each network and identify genes that hold prominent positions for maintaining network integrity and controlling information-flow.
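As a minimal sketch of what a partial correlation measures, the example below computes the first-order partial correlation of two expression profiles while controlling for a third. The exact statistic used to build the networks is not specified here, so the formula choice, the function names and the toy profiles are assumptions for illustration only.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / math.sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Toy profiles: genes x and y are both driven by z plus independent noise,
# so they correlate directly but not once z is accounted for.
z = [1, 1, -1, -1]
x = [2, 0, 0, -2]   # z + [1, -1, 1, -1]
y = [2, 0, -2, 0]   # z + [1, -1, -1, 1]

direct = pearson(x, y)                  # 0.5
conditioned = partial_correlation(x, y, z)  # approximately 0
print(direct, conditioned)
```

In a coexpression network built this way, an edge between x and y would be dropped because their apparent relationship is explained by z.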
Letrozole treatment leads to extensive rewiring of the breast tumor coexpression network. Approximately 20% of gene-gene relationships are conserved over time in the presence of letrozole while 80% of relationships are condition dependent. The positions of influence within the networks are transiently held with few genes stably maintaining high centrality scores across the three time points.
Genes integral for maintaining network integrity and controlling information flow are dynamically changing as the breast tumor coexpression network adapts to perturbation by the drug letrozole.
Type 1 diabetes (T1D) is a complex disease that is harmful to human health, and most existing biomarkers measure the disease phenotype only after disease onset (or drastic deterioration). To date, no effective biomarker can predict the upcoming disease (or pre-disease state) before disease onset or deterioration. Further, the detailed molecular mechanism of such deterioration, e.g., the driver genes or causal network of the disease, is still unclear.
In this study, we detected early-warning signals of T1D and its leading biomolecular networks based on serial gene expression profiles of NOD (non-obese diabetic) mice by identifying a new type of biomarker, i.e., dynamical network biomarker (DNB) which forms a specific module for marking the time period just before the drastic deterioration of T1D.
Two dynamical network biomarkers were obtained to signal the emergence of two critical deteriorations of the disease, and could be used to predict the upcoming sudden changes during disease progression. We found that the two critical transitions led to peri-insulitis and hyperglycemia in NOD mice, which is consistent with other independent experimental results from the literature.
The identified dynamical network biomarkers can be used to detect the early-warning signals of T1D and predict upcoming disease onset before the drastic deterioration. In addition, we demonstrated that the leading biomolecular networks are causally related to the initiation and progression of T1D, and provided biological insight into the molecular mechanism of T1D. Experimental data from the literature and functional analysis of DNBs validated the computational results.
The rise of personalized medicine has reminded us that each patient must be treated as an individual. One factor in making treatment decisions is the physiological state of each patient, but definitions of relevant states and methods to visualize state-related physiologic changes are scarce. We constructed correlation networks from physiologic data to demonstrate changes associated with pressor use in the intensive care unit.
We collected 29 physiological variables at one-minute intervals from nineteen trauma patients in the intensive care unit of an academic hospital and grouped each minute of data as receiving or not receiving pressors. For each group we constructed Spearman correlation networks of pairs of physiologic variables. To visualize drug-associated changes we split the networks into three components: an unchanging network, a network of connections with changing correlation sign, and a network of connections only present in one group.
Out of a possible 406 connections between the 29 physiological measures, 64, 39, and 48 were present in the three component networks, respectively. The static network confirms expected physiological relationships while the network of associations with changed correlation sign suggests putative changes due to the drugs. The network of associations present only with pressors suggests new relationships that could be worthy of study.
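The three-way split can be sketched as follows, under stated assumptions: the correlation cutoff of 0.7, the variable names (hr, map, spo2, temp) and the toy values are hypothetical and not taken from the study. The sketch also confirms that 29 variables give 406 possible pairs.

```python
import math
from itertools import combinations


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rho: Pearson correlation of the ranks (non-constant inputs)."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


def network(data, threshold):
    """Edges {(a, b): rho} for variable pairs with |rho| >= threshold."""
    edges = {}
    for a, b in combinations(sorted(data), 2):
        rho = spearman(data[a], data[b])
        if abs(rho) >= threshold:
            edges[(a, b)] = rho
    return edges


def split_networks(off, on):
    """Split two networks into static, sign-flipped and one-group-only edges."""
    shared = off.keys() & on.keys()
    static = {e for e in shared if off[e] * on[e] > 0}
    flipped = shared - static
    unique = off.keys() ^ on.keys()
    return static, flipped, unique


# Number of possible pairs among 29 variables.
print(math.comb(29, 2))  # 406

# Hypothetical minute-by-minute values without and with pressors.
no_pressor = {"hr": [1, 2, 3, 4, 5], "map": [2, 4, 6, 8, 10],
              "spo2": [5, 3, 4, 1, 2], "temp": [3, 1, 4, 2, 5]}
pressor = {"hr": [1, 2, 3, 4, 5], "map": [10, 8, 6, 4, 2],
           "spo2": [5, 3, 4, 1, 2], "temp": [1, 2, 3, 4, 5]}

static, flipped, unique = split_networks(network(no_pressor, 0.7),
                                         network(pressor, 0.7))
print(static, flipped, unique)
```

In this toy case the hr-spo2 edge is static, the two edges involving map change sign, and the three edges involving temp appear only with pressors.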
We demonstrated that visualizing physiological relationships using correlation networks provides insight into underlying physiologic states while also showing that many of these relationships change when the state is defined by the presence of drugs. This method applied to targeted experiments could change the way critical care patients are monitored and treated.
Multifactor dimensionality reduction (MDR) is a powerful method for analysis of gene-gene interactions and has been successfully applied to many genetic studies of complex diseases. However, the main application of MDR has been limited to binary traits, while traits having ordinal features are commonly observed in many genetic studies (e.g., obesity classification - normal, pre-obese, mildly obese and severely obese).
We propose ordinal MDR (OMDR) to facilitate gene-gene interaction analysis for ordinal traits. As an alternative to balanced accuracy, the use of tau-b, a common ordinal association measure, was suggested to evaluate interactions. Also, we generalized cross-validation consistency (GCVC) to identify multiple best interactions. GCVC can be practically useful for analyzing complex traits, especially in large-scale genetic studies.
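For readers unfamiliar with tau-b, the following is a minimal sketch of Kendall's tau-b for two ordinal sequences using a direct pairwise count. It illustrates the measure only and is not the OMDR program; the function name and the toy class labels are assumptions.

```python
import math
from itertools import combinations


def kendall_tau_b(x, y):
    """Kendall's tau-b with tie correction, computed by pairwise comparison."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue            # tied on both variables: counted nowhere
        elif dx == 0:
            ties_x += 1         # tied on x only
        elif dy == 0:
            ties_y += 1         # tied on y only
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denom = math.sqrt((concordant + discordant + ties_x)
                      * (concordant + discordant + ties_y))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical predicted risk classes versus observed ordinal obesity classes
# (0 = normal ... 3 = severely obese).
predicted = [0, 0, 1, 1, 2, 2]
observed = [0, 1, 1, 2, 2, 3]
tau = kendall_tau_b(predicted, observed)
print(round(tau, 2))  # about 0.80
```

Unlike balanced accuracy, the measure rewards predictions that preserve the ordering of the trait classes, and the tie correction in the denominator keeps it comparable when many samples share a class.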
Results and conclusions
In simulations, OMDR showed fairly good performance in terms of power, predictability and selection stability and outperformed MDR. For demonstration, we used a real dataset of body mass index (BMI) and scanned one- to four-way interactions for the ordinal and binary obesity traits derived from BMI via OMDR and MDR, respectively. In the real data analysis, more interactions were identified for the ordinal trait than for the binary traits. On average, the commonly identified interactions showed higher predictability for the ordinal trait than for the binary traits. The proposed OMDR and GCVC were implemented in a C/C++ program, executables of which are freely available for Linux, Windows and MacOS upon request for non-commercial research institutions.
To identify, from a genome-wide screen, miRNA expression profiles for the diagnosis of acute myocardial infarction (AMI) and angina pectoris (AP), we investigated the altered profile of serum microRNAs in AMI and AP patients at a relatively early stage.
Serum samples were taken from 117 AMI patients, 182 AP patients and 100 age- and gender-matched controls. An initial screening of miRNA expression was performed by Solexa sequencing. Differential expression was validated using RT-qPCR in individual samples, which were arranged in a two-phase selection and validation.
The Solexa sequencing results demonstrated marked upregulation of serum miRNAs in AMI patients compared with controls. RT-qPCR analysis identified a profile of six serum miRNAs (miR-1, miR-134, miR-186, miR-208, miR-223 and miR-499) as AMI biomarkers. MiR-208 and miR-499 were more strongly elevated in AP cases than in AMI cases. The ROC curves indicated that the panel of six miRNAs has great potential to offer sensitive and specific diagnostic tests for AMI. In particular, the panel of six miRNAs shows significant differences between the AMI and AP cases.
The six-miRNA signature identified from genome-wide serum miRNA expression profiling may serve as a fingerprint for AMI and AP diagnosis.
Acute myocardial infarction; Angina pectoris; Serum microRNAs
Upon co-stimulation with CD3/CD28 antibodies, activated CD4+ T cells were found to lose their susceptibility to HIV-1 infection, exhibiting an induced resistant phenotype. This rather unexpected phenomenon has been repeatedly confirmed but the underlying cellular and molecular mechanisms are still unknown.
We first replicated the reported system using the specified Dynal beads with PHA/IL-2-stimulated and un-stimulated cells as controls. Genome-wide expression profiling and analysis were then performed using Agilent whole genome microarrays and established bioinformatics tools.
We showed that following CD3/CD28 co-stimulation, a homogeneous population emerged with uniform expression of the activation markers CD25 and CD69 as well as the memory marker CD45RO at high levels. These cells differentially expressed 7,824 genes when compared with the controls on microarrays. Series-Cluster analysis identified 6 distinct expression profiles containing 1,345 genes as the representative signatures in the permissive and resistant cells. Of them, 245 (101 potentially permissive and 144 potentially resistant) were significant in gene ontology categories related to immune response, cell adhesion and metabolism. Co-expression network analysis identified 137 “key regulatory” genes (84 potentially permissive and 53 potentially resistant), holding hub positions in the gene interactions. By mapping these genes on KEGG pathways, the predominance of actin cytoskeleton functions, proteasomes, and cell cycle arrest in induced resistance emerged. We also revealed an entire set of previously unreported genes for further mining and functional validation.
This initial microarray study will stimulate renewed interest in exploring this system and open new avenues for research into HIV-1 susceptibility and its reversal in target cells, serving as a foundation for the development of novel therapeutic and clinical treatments.
HIV-1; Susceptibility; Resistance; CD4+ T cells; CD3/CD28 costimulation
The identification of microRNA-disease associations is critical for understanding the molecular mechanisms of diseases. However, experimental determination of associations between microRNAs and diseases remains challenging. Meanwhile, as new microRNAs are discovered each year, target diseases need to be revealed for new microRNAs that have no known disease association information. Therefore, computational methods for microRNA-disease association prediction have gained a lot of research interest.
Herein, based on the assumption that functionally related microRNAs tend to be associated with phenotypically similar diseases, three inference methods were presented for microRNA-disease association prediction, namely MBSI (microRNA-based similarity inference), PBSI (phenotype-based similarity inference) and NetCBI (network-consistency-based inference). A global network similarity measure was used in the three methods to predict new microRNA-disease associations.
We tested the three methods on 242 known microRNA-disease associations by leave-one-out cross-validation for prediction evaluation, and achieved AUC values of 74.83%, 54.02% and 80.66%, respectively. The best-performing method, NetCBI, was then chosen for novel microRNA-disease association prediction. Some associations strongly predicted by NetCBI were confirmed by publicly accessible databases, which indicated the usefulness of this method. The newly predicted associations were publicly released to facilitate future studies. Moreover, NetCBI was especially applicable to predicting target diseases for microRNAs whose target association information was not available.
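To make the evaluation metric concrete, the sketch below computes an AUC from two lists of scores: those assigned to held-out known associations and those assigned to candidate pairs with no known association. The scores and the function name are hypothetical, and the scoring step itself (MBSI, PBSI or NetCBI) is not reproduced here.

```python
def auc(positive_scores, negative_scores):
    """Probability that a known association outscores an unknown pair.

    Equivalent to the area under the ROC curve; ties count as half a win.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical scores: each positive is the score a known association
# received when it was the one left out of the training data.
held_out_known = [0.9, 0.6, 0.4]
unknown_pairs = [0.7, 0.5, 0.3, 0.1]

value = auc(held_out_known, unknown_pairs)
print(value)  # 0.75
```

An AUC of 0.5 corresponds to random ranking, which puts the reported values of roughly 0.75, 0.54 and 0.81 for the three methods in context.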
The encouraging results suggest that our method NetCBI can not only provide help in identifying novel microRNA-disease associations but also guide biological experiments for scientific research.
MicroRNA-disease association prediction; Network similarity; Network consistency