Maternal alcohol consumption inflicts a multitude of phenotypic consequences that range from undetectable changes to severe dysmorphology. Using tightly controlled murine studies that deliver precise amounts of alcohol at discrete developmental stages, our group and other labs demonstrated in prior studies that the C57BL/6 and DBA/2 inbred mouse strains display differential susceptibility to the teratogenic effects of alcohol. Since the phenotypic diversity extends beyond the amount, dosage and timing of alcohol exposure, it is likely that an individual's genetic background contributes to the phenotypic spectrum. To identify the genomic signatures associated with these observed differences in alcohol-induced dysmorphology, we conducted a microarray-based transcriptome study that also interrogated the genomic signatures between these two lines based on genetic background and alcohol exposure. This approach is called a gene x environment (GxE) analysis; one example of a GxE interaction would be a gene whose expression level increases in C57BL/6, but decreases in DBA/2 embryos, following alcohol exposure. We identified 35 candidate genes exhibiting GxE interactions. To identify cis-acting factors that mediated these interactions, we interrogated the proximal promoters of these 35 candidates and found 241 single nucleotide variants (SNVs) in 16 promoters. Further investigation indicated that 186 SNVs (15 promoters) are predicted to alter transcription factor binding. In addition, 62 SNVs created, removed or altered the placement of a CpG dinucleotide in 13 of the proximal promoters, 53 of which overlapped putative transcription factor binding sites. These 53 SNVs are also our top candidates for future studies aimed at examining the effects of alcohol on epigenetic gene regulation.
fetal alcohol syndrome; gene x environment interactions; genomics; gene expression; next generation sequencing; genetic association; epigenetics
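The GxE example described in the abstract above (expression rising in one strain and falling in the other after exposure) can be expressed as a simple interaction contrast. The following is a minimal sketch, not the study's analysis pipeline: the function name `gxe_interaction`, the data layout, and the expression values are all hypothetical.

```python
from statistics import mean

def gxe_interaction(expr):
    """Return per-strain alcohol effects and their difference.

    expr maps (strain, treatment) -> list of log2 expression values.
    The interaction contrast is the C57BL/6 effect minus the DBA/2 effect.
    """
    effect = {}
    for strain in ("C57BL/6", "DBA/2"):
        effect[strain] = (mean(expr[(strain, "alcohol")])
                          - mean(expr[(strain, "control")]))
    interaction = effect["C57BL/6"] - effect["DBA/2"]
    return effect, interaction

# Hypothetical log2 expression values for one gene (not data from the study).
example = {
    ("C57BL/6", "control"): [8.0, 8.2, 7.8],
    ("C57BL/6", "alcohol"): [9.0, 9.1, 8.9],
    ("DBA/2", "control"): [8.1, 7.9, 8.0],
    ("DBA/2", "alcohol"): [7.0, 7.2, 6.8],
}
effect, interaction = gxe_interaction(example)
# True when alcohol raises expression in C57BL/6 but lowers it in DBA/2.
opposite_direction = effect["C57BL/6"] > 0 > effect["DBA/2"]
```

A nonzero contrast alone does not establish an interaction; a real analysis would test it against replicate variance, for example with a two-way model containing a strain-by-treatment term.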
Our efforts to prevent and treat breast cancer are significantly impeded by a lack of knowledge of the biology and developmental genetics of the normal mammary gland. To provide the specimens that will facilitate such an understanding, The Susan G. Komen for the Cure Tissue Bank at the IU Simon Cancer Center (KTB) was established. The KTB is, to our knowledge, the only biorepository in the world prospectively established to collect normal, healthy breast tissue from volunteer donors. As a first initiative toward a molecular understanding of the biology and developmental genetics of the normal mammary gland, the effect of the menstrual cycle and hormonal contraceptives on gene expression in the normal breast epithelium was examined.
Using normal breast tissue from 20 premenopausal donors to KTB, the changes in the mRNA of the normal breast epithelium as a function of phase of the menstrual cycle and hormonal contraception were assayed using next-generation whole transcriptome sequencing (RNA-Seq).
In total, 255 genes representing 1.4% of all genes were deemed to have statistically significant differential expression between the two phases of the menstrual cycle. The overwhelming majority (221; 87%) of the genes have higher expression during the luteal phase. These data provide important insights into the processes occurring during each phase of the menstrual cycle. There was only a single gene significantly differentially expressed when comparing the epithelium of women using hormonal contraception to those in the luteal phase.
We have taken advantage of a unique research resource, the KTB, to complete the first-ever next-generation transcriptome sequencing of the epithelial compartment of 20 normal human breast specimens. This work has produced a comprehensive catalog of the differences in the expression of protein-coding genes as a function of the phase of the menstrual cycle. These data constitute the beginning of a reference data set of the normal mammary gland, which can be consulted for comparison with data developed from malignant specimens, or to mine the effects of the hormonal flux that occurs during the menstrual cycle.
To determine whether a fusion protein of cholera toxin B subunit and active peptide from shark liver (CTB-APSL) can treat type 2 diabetes in mice, the CTB-APSL gene was cloned and expressed in the silkworm (Bombyx mori) baculovirus expression vector system (BEVS), and the fusion protein was then orally administered to diabetic mice at a dose of 100 mg/kg for five weeks. Oral administration of the CTB-APSL fusion protein effectively reduced the levels of both fasting blood glucose (FBG) and glycosylated hemoglobin (GHb), promoted insulin secretion and improved insulin resistance. It also significantly improved lipid metabolism, reducing triglyceride (TG), total cholesterol (TC) and low density lipoprotein (LDL) levels and increasing high density lipoprotein (HDL) levels, and it attenuated the inflammatory response of type 2 diabetic mice by lowering the levels of the inflammatory cytokines tumor necrosis factor-α (TNF-α) and interleukin-6 (IL-6). Histopathology showed that the fusion protein significantly repaired damaged pancreatic tissue, significantly improved hepatic steatosis and cloudy swelling of hepatic cells, reduced the content of lipid droplets, effectively inhibited invasion of the renal interstitium by inflammatory cells and improved nuclear pyknosis in renal tubular epithelial cells. These findings provide an experimental basis for the development of a new type of oral therapy for type 2 diabetes.
active peptide from shark liver; cholera toxin B subunit; Bombyx mori pupae; type 2 diabetes mellitus; oral administration
Summary: NGSUtils is a suite of software tools for manipulating data common to next-generation sequencing experiments, such as FASTQ, BED and BAM format files. These tools provide a stable and modular platform for data management and analysis.
Availability and implementation: NGSUtils is available under a BSD license and works on Mac OS X and Linux systems. Python 2.6+ and virtualenv are required. More information and source code may be obtained from the website: http://ngsutils.org.
Supplementary data are available at Bioinformatics online.
Our fundamental understanding of how several thousand diverse RNAs are recognized in the soma, sorted, packaged, transported and localized within the cell is fragmentary. The COPa and COPb proteins of the coatomer protein I (COPI) vesicle complex were reported to interact with specific RNAs and represent a candidate RNA sorting and transport system. To determine the RNA-binding profile of Golgi-derived COPI in neuronal cells, we performed formaldehyde-linked RNA immunoprecipitation, followed by high-throughput sequencing, a process we term FLRIP-Seq (FLRIP, formaldehyde-cross-linked immunoprecipitation). We demonstrate that COPa co-immunoprecipitates a specific set of RNAs that are enriched in G-quadruplex motifs and fragile X mental retardation protein-associated RNAs and that encode factors that predominantly localize to the plasma membrane and cytoskeleton and function within signaling pathways. These data support the novel function of COPI in inter-compartmental trafficking of RNA.
To adapt to stresses encountered in stationary phase, Gram-negative bacteria utilize the alternative sigma factor RpoS. However, some species lack RpoS; thus, it is unclear how stationary-phase adaptation is regulated in these organisms. Here we defined the growth-phase-dependent transcriptomes of Haemophilus ducreyi, which lacks an RpoS homolog. Compared to mid-log-phase organisms, cells harvested from the stationary phase upregulated genes encoding several virulence determinants and a homolog of hfq. Insertional inactivation of hfq altered the expression of ~16% of the H. ducreyi genes. Importantly, there were a significant overlap and an inverse correlation in the transcript levels of genes differentially expressed in the hfq inactivation mutant relative to its parent and the genes differentially expressed in stationary phase relative to mid-log phase in the parent. Inactivation of hfq downregulated genes in the flp-tad and lspB-lspA2 operons, which encode several virulence determinants. To comply with FDA guidelines for human inoculation experiments, an unmarked hfq deletion mutant was constructed and was fully attenuated for virulence in humans. Inactivation or deletion of hfq downregulated Flp1 and impaired the ability of H. ducreyi to form microcolonies, downregulated DsrA and rendered H. ducreyi serum susceptible, and downregulated LspB and LspA2, which allow H. ducreyi to resist phagocytosis. We propose that, in the absence of an RpoS homolog, Hfq serves as a major contributor of H. ducreyi stationary-phase and virulence gene regulation. The contribution of Hfq to stationary-phase gene regulation may have broad implications for other organisms that lack an RpoS homolog.
Pathogenic bacteria encounter a wide range of stresses in their hosts, including nutrient limitation; the ability to sense and respond to such stresses is crucial for bacterial pathogens to successfully establish an infection. Gram-negative bacteria frequently utilize the alternative sigma factor RpoS to adapt to stresses and stationary phase. However, homologs of RpoS are absent in some bacterial pathogens, including Haemophilus ducreyi, which causes chancroid and facilitates the acquisition and transmission of HIV-1. Here, we provide evidence that, in the absence of an RpoS homolog, Hfq serves as a major contributor of stationary-phase gene regulation and that Hfq is required for H. ducreyi to infect humans. To our knowledge, this is the first study describing Hfq as a major contributor of stationary-phase gene regulation in bacteria and the requirement of Hfq for the virulence of a bacterial pathogen in humans.
Haemophilus ducreyi causes chancroid, a genital ulcer disease that facilitates the transmission of human immunodeficiency virus type 1. In humans, H. ducreyi is surrounded by phagocytes and must adapt to a hostile environment to survive. To sense and respond to environmental cues, bacteria frequently use two-component signal transduction (2CST) systems. The only obvious 2CST system in H. ducreyi is CpxRA; CpxR is a response regulator, and CpxA is a sensor kinase. Previous studies by Hansen and coworkers showed that CpxR directly represses the expression of dsrA, the lspB-lspA2 operon, and the flp operon, which are required for virulence in humans. They further showed that CpxA functions predominantly as a phosphatase in vitro to maintain the expression of virulence determinants. Since a cpxA mutant is avirulent while a cpxR mutant is fully virulent in humans, CpxA also likely functions predominantly as a phosphatase in vivo. To better understand the role of H. ducreyi CpxRA in controlling virulence determinants, here we defined genes potentially regulated by CpxRA by using RNA-Seq. Activation of CpxR by deletion of cpxA repressed nearly 70% of its targets, including seven established virulence determinants. Inactivation of CpxR by deletion of cpxR differentially regulated few genes and increased the expression of one virulence determinant. We identified a CpxR binding motif that was enriched in downregulated but not upregulated targets. These data reinforce the hypothesis that CpxA phosphatase activity plays a critical role in controlling H. ducreyi virulence in vivo. Characterization of the downregulated genes may offer new insights into pathogenesis.
Bone marrow-derived mesenchymal stem cells (MSCs) are dominant seed cell sources for bone regeneration. Bone morphogenetic proteins (BMPs) initiate cartilage and bone formation in a sequential cascade. Vascular endothelial growth factor (VEGF) is an essential coordinator of extracellular matrix remodeling, angiogenesis and bone formation. In the present study, the effects of the vascular endothelial growth factor 165 (VEGF165) and bone morphogenetic protein 2 (BMP2) genes on bone regeneration were investigated by the lentivirus-mediated cotransfection of the two genes into rat bone marrow-derived MSCs. The successful co-expression of the two genes in the MSCs was confirmed using quantitative polymerase chain reaction (qPCR) and western blot analysis. The results of alizarin red and alkaline phosphatase (ALP) staining at 14 days subsequent to transfection showed that the area of staining in cells transfected with BMP2 alone was higher than that in cells transfected with BMP2 and VEGF165 or untransfected control cells, while the BMP2 + VEGF165 group showed significantly more staining than the untransfected control. This indicated that BMP2 alone exhibited a stronger effect in bone regeneration than BMP2 in combination with VEGF165. Similarly, in inducing culture medium, the ALP activity of the BMP2 + VEGF165 group was notably suppressed compared with that of the BMP2 group. The overexpression of VEGF165 inhibited BMP2-induced MSC differentiation and osteogenesis in vitro. Whether or not local VEGF gene therapy is likely to affect bone regeneration in vivo requires further investigation.
bone marrow-derived mesenchymal stem cells; bone morphogenetic protein 2; vascular endothelial growth factor; bone regeneration; co-transfection
Genome-wide association studies (GWAS) have identified hundreds of genetic variants associated with complex human diseases, clinical conditions and traits. Genetic mapping of expression quantitative trait loci (eQTLs) is providing us with novel functional effects of thousands of single nucleotide polymorphisms (SNPs). In a classical quantitative trait loci (QTL) mapping problem, multiple tests are done to assess whether one trait is associated with a number of loci. In contrast to QTL studies, thousands of traits are measured along with thousands of gene expressions in an eQTL study. For such a study, a huge number of tests have to be performed (~10^6). This extreme multiplicity gives rise to many computational and statistical problems. In this paper we address these issues using two closely related inferential approaches: an empirical Bayes method that bears the Bayesian flavor without requiring much a priori knowledge, and the frequentist method of false discovery rates. A three-component t-mixture model was used for the parametric empirical Bayes (PEB) method. Inferences were obtained using the Expectation/Conditional Maximization Either (ECME) algorithm. A simulation study was also performed to compare the PEB method with a nonparametric empirical Bayes (NPEB) alternative.
The results show that PEB has an edge over NPEB. The proposed methodology has been applied to human liver cohort (HLC) data. Our method enables the discovery of more significant SNPs at FDR < 10% than the previous study by Yang et al. (Genome Research, 2010).
In contrast to previously available methods based on p-values, the empirical Bayes method uses the local false discovery rate (lfdr) as the threshold. This method controls the false positive rate.
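For readers unfamiliar with the frequentist side of the comparison above, the sketch below shows the standard Benjamini-Hochberg step-up procedure for controlling the false discovery rate. It is offered only as an illustration: the abstract does not state which FDR procedure was used, it does not implement the t-mixture empirical Bayes model or the lfdr threshold, and the p-values are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha=0.10):
    """Return the indices of hypotheses rejected at FDR level alpha.

    Implements the Benjamini-Hochberg step-up rule: find the largest
    rank k such that p_(k) <= (k / m) * alpha, then reject the k
    hypotheses with the smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            k_max = rank
    return sorted(order[:k_max])

# Hypothetical p-values from ten SNP-transcript tests.
pvals = [0.041, 0.001, 0.212, 0.008, 0.039, 0.061, 0.042, 0.074, 0.205, 0.216]
rejected_05 = benjamini_hochberg(pvals, alpha=0.05)
rejected_10 = benjamini_hochberg(pvals, alpha=0.10)
```

Because the rule is step-up, a p-value that misses its own threshold can still be rejected when a larger p-value further down the ranking meets its threshold, which is why the loop keeps the largest qualifying rank rather than stopping at the first failure.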
Background. Genome-wide association studies (GWAS) have been successful during the last few years. A key challenge is that the interpretation of the results is not straightforward, especially for trans-acting SNPs. Integration of transcriptome data into GWAS may provide clues elucidating the mechanisms by which a genetic variant leads to a disease. Methods. Here, we developed a novel mediation analysis approach to identify new expression quantitative trait loci (eQTL) driving CYP2D6 activity by combining genotype, gene expression, and enzyme activity data. Results. 389,573 and 1,214,416 SNP-transcript-CYP2D6 activity trios were found to be strongly associated (P < 10^-5, FDR = 16.6% and 11.7%) for two different genotype platforms, namely, Affymetrix and Illumina, respectively. The majority of eQTLs are trans-SNPs. A single polymorphism can lead to widespread downstream changes in the expression of distant genes by affecting major regulators or transcription factors (TFs); such changes would be visible as an eQTL hotspot and can lead to large and consistent biological effects. Overlapping the eQTL hotspots with the mediators led to the discovery of 64 TFs.
Conclusions. Our mediation analysis is a powerful approach in identifying the trans-QTL-phenotype associations. It improves our understanding of the functional genetic variations for the liver metabolism mechanisms.
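The abstract above does not spell out its mediation model, so the following is only a sketch of the classic product-of-coefficients formulation for one SNP-transcript-activity trio, using ordinary least squares. The function names and the toy data are hypothetical; a real analysis would add significance testing and multiple-testing control.

```python
def slope_intercept(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def residuals(x, y):
    """Residuals of y after regressing out x."""
    slope, intercept = slope_intercept(x, y)
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

def mediation(snp, transcript, activity):
    """Decompose the SNP effect on activity into direct and mediated parts."""
    a, _ = slope_intercept(snp, transcript)    # SNP -> transcript
    total, _ = slope_intercept(snp, activity)  # SNP -> activity (total effect)
    # Effect of transcript on activity adjusted for the SNP, obtained by
    # regressing residuals on residuals (Frisch-Waugh-Lovell theorem).
    b, _ = slope_intercept(residuals(snp, transcript),
                           residuals(snp, activity))
    indirect = a * b
    return {"a": a, "b": b, "indirect": indirect,
            "direct": total - indirect, "total": total}

# Toy data: genotype coded 0/1/2; activity depends on transcript and SNP.
snp = [0, 0, 1, 1, 2, 2, 0, 1, 2, 1]
noise = [0.1, -0.1, 0.2, -0.2, 0.1, -0.1, 0.0, 0.05, -0.05, 0.0]
transcript = [2 * s + e for s, e in zip(snp, noise)]
activity = [3 * t + 0.5 * s for t, s in zip(transcript, snp)]
result = mediation(snp, transcript, activity)
```

In this toy example the activity is constructed as 3 times the transcript level plus 0.5 times the genotype, so the sketch should recover a transcript effect near 3 and a direct SNP effect near 0.5, with the remainder of the total effect attributed to mediation.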
BACKGROUND & AIMS
Early embryogenesis involves cell fate decisions that define the body axes and establish pools of progenitor cells. Development does not stop once lineages are specified; cells continue to undergo specific maturation events, and changes in gene expression patterns lead to their unique physiological functions. Secretory pancreatic acinar cells mature postnatally to synthesize large amounts of protein, polarize, and communicate with other cells. The transcription factor MIST1 is expressed by only secretory cells and regulates maturation events. MIST1-deficient acinar cells in mice do not establish apical-basal polarity, properly position zymogen granules, or communicate with adjacent cells, disrupting pancreatic function. We investigated whether MIST1 directly induces and maintains the mature phenotype of acinar cells.
We analyzed the effects of Cre-mediated expression of Mist1 in adult Mist1-deficient (Mist1KO) mice. Pancreatic tissues were collected and analyzed by light and electron microscopy, immunohistochemistry, real-time polymerase chain reaction analysis, and chromatin immunoprecipitation. Primary acini were isolated from mice and analyzed in amylase secretion assays.
Induced expression of Mist1 in adult Mist1KO mice restored wild-type gene expression patterns in acinar cells. The acinar cells changed phenotypes, establishing apical-basal polarity, increasing the size of zymogen granules, reorganizing the cytoskeletal network, communicating intercellularly (by synthesizing gap junctions), and undergoing exocytosis.
The exocrine pancreas of adult mice can be remodeled by re-expression of the transcription factor MIST1. MIST1 regulates acinar cell maturation and might be used to repair damaged pancreata in patients with pancreatic disorders.
DIMM; Exocrine Pancreas Disease; Secretion; Transcription
Active peptide from shark liver (APSL) is a cytokine from Chiloscyllium plagiosum that can stimulate liver regeneration and protects the pancreas. To study the effect of orally administered recombinant APSL (rAPSL) on an animal model of type 2 diabetes mellitus, the APSL gene was cloned, and APSL was expressed in Bombyx mori N cells (BmN cells), silkworm larvae and silkworm pupae using the silkworm baculovirus expression vector system (BEVS). It was demonstrated that rAPSL was able to significantly reduce the blood glucose level in mice with type 2 diabetes induced by streptozotocin. The analysis of paraffin sections of mouse pancreatic tissues revealed that rAPSL could effectively protect mouse islets from streptozotocin-induced lesions. Compared with the powder prepared from normal silkworm pupae, the powder prepared from pupae expressing rAPSL exhibited greater protective effects, and these results suggest that rAPSL has potential uses as an oral drug for the treatment of diabetes mellitus in the future.
active peptide from shark liver; Bombyx mori pupae; BmNPV/Bac-to-Bac baculovirus expression system; type 2 diabetes mellitus; oral administration
Micro-indels (insertions or deletions shorter than 21 bps) constitute the second most frequent class of human gene mutation after single nucleotide variants. Despite the relative abundance of non-frameshifting indels, their damaging effect on protein structure and function has gone largely unstudied. We have developed a support vector machine-based method named DDIG-in (Detecting disease-causing genetic variations due to indels) to prioritize non-frameshifting indels by comparing disease-associated mutations with putatively neutral mutations from the 1,000 Genomes Project. The final model gives good discrimination for indels and is robust against annotation errors. A webserver implementing DDIG-in is available at http://sparks-lab.org/ddig.
Bidirectional promoters are promoter sequences shared by a divergent gene pair (genes proximal to each other on opposite strands), and can regulate the genes in both directions. In the human genome, > 10% of protein-coding genes are arranged head-to-head on opposite strands, with transcription start sites that are separated by < 1,000 base pairs. Many transcription factor binding sites that influence the expression of the 2 opposite genes occur in bidirectional promoters. Recently, RNA polymerase II (RPol II) ChIP-seq data have been used to identify the promoters of coding genes and non-coding RNAs. However, bidirectional promoters have not yet been identified from RPol II ChIP-seq data.
In some bidirectional promoter regions, RPol II binding forms a bi-peak shape, which indicates that 2 promoters are located in the bidirectional region. We have developed a computational approach to identify the regulatory regions of all divergent gene pairs using genome-wide RPol II binding patterns derived from ChIP-seq data, based upon the assumption that the distribution of RPol II binding around a bidirectional promoter is the accumulation of RPol II binding at 2 promoters. In HeLa S3 cells, 249 promoter pairs and 1094 single promoters were identified, of which 76 promoters cover only positive-strand genes, 86 promoters cover only negative-strand genes, and 932 promoters cover 2 genes. Gene expression levels and STAT1 binding sites were then examined for the different promoter categories.
Identifying the regulatory regions of bidirectional promoters from RPol II binding patterns provides important temporal and spatial measurements regarding the initiation of transcription. Gene expression and transcription factor binding site analysis suggest that the promoters in bidirectional regions may regulate the closest gene, and that STAT1 is involved in the primary promoter.
Over 10,000 long intergenic non-coding RNAs (lincRNAs) have been identified in the human genome. Some have been well characterized and are known to participate in various stages of gene regulation. In the post-transcriptional process, another well-known class of small non-coding RNA, microRNA (miRNA), is very active in inhibiting mRNA. Similar features between mRNA and lincRNA have been revealed in several recent studies, and a few isolated miRNA-lincRNA relationships have been observed. Despite these advances, the comprehensive miRNA regulation pattern of lincRNA has not been clarified.
In this study, we investigated the possible interaction between the two classes of non-coding RNAs. Instead of using the existing long non-coding database, we employed an ab initio method to annotate lincRNAs expressed in a group of normal breast tissues and breast tumors.
Approximately 90 lincRNAs show strong inverse expression correlation with miRNAs for which they carry at least one predicted target site. These target sites are statistically more conserved than their neighboring genomic regions and other predicted target sites. Several miRNAs that target these lincRNAs are known to play an essential role in breast cancer.
As with their inhibition of mRNAs, miRNAs show potential for promoting the degradation of lincRNAs. Breast-cancer-related miRNAs may influence their target lincRNAs, resulting in differential expression in normal and malignant breast tissues. This implies that the miRNA regulation of lincRNAs may be involved in the regulatory process in tumor cells.
Typical analysis of time-series gene expression data such as clustering or graphical models cannot distinguish between early and later drug responsive gene targets in cancer cells. However, these genes would represent good candidate biomarkers.
We propose a new model - the dynamic time order network - to distinguish and connect early and later drug responsive gene targets. This network is constructed based on an integrated differential equation. Spline regression is applied for an accurate modeling of the time variation of gene expressions. Then a likelihood ratio test is implemented to infer the time order of any gene expression pair. One application of the model is the discovery of estrogen response biomarkers. For this purpose, we focused on genes whose responses are late when the breast cancer cells are treated with estradiol (E2).
Our approach has been validated by successfully finding time order relations between genes of the cell cycle system. More notably, we found late response genes potentially interesting as biomarkers of E2 treatment.
Alternative splicing increases proteome diversity by expressing multiple gene isoforms that often differ in function. Identifying alternative splicing events from RNA-seq experiments is important for understanding the diversity of transcripts and for investigating the regulation of splicing.
We developed Alt Event Finder, a tool for identifying novel splicing events by using transcript annotation derived from genome-guided construction tools, such as Cufflinks and Scripture. With a proper combination of alignment and transcript reconstruction tools, Alt Event Finder is capable of identifying novel splicing events in the human genome. We further applied Alt Event Finder on a set of RNA-seq data from rat liver tissues, and identified dozens of novel cassette exon events whose splicing patterns changed after extensive alcohol exposure.
Alt Event Finder is capable of identifying de novo splicing events from data-driven transcript annotation, and is a useful tool for studying splicing regulation.
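To make the notion of a cassette exon event concrete, the sketch below scans a set of reconstructed transcripts for an exon that is included in one isoform while another isoform splices directly from the upstream flanking exon to the downstream one. This is an illustrative simplification, not Alt Event Finder's actual algorithm; the function name, the transcript representation and the coordinates are hypothetical.

```python
def cassette_exons(transcripts):
    """Find cassette exon events among transcripts of one gene.

    transcripts maps a transcript name to a list of (start, end) exons,
    sorted by start coordinate on the same strand. An event is reported
    as (upstream_exon, cassette_exon, downstream_exon) when some
    transcript includes the middle exon and some transcript has a
    junction joining the upstream exon end to the downstream exon start.
    """
    # Collect every observed splice junction as (donor_end, acceptor_start).
    junctions = set()
    for exons in transcripts.values():
        for up, down in zip(exons, exons[1:]):
            junctions.add((up[1], down[0]))

    events = set()
    for exons in transcripts.values():
        for up, mid, down in zip(exons, exons[1:], exons[2:]):
            # The skipping junction bypasses the middle exon entirely.
            if (up[1], down[0]) in junctions:
                events.add((up, mid, down))
    return sorted(events)

# Hypothetical transcript models, e.g. parsed from a Cufflinks GTF file.
models = {
    "t1": [(100, 200), (300, 400), (500, 600)],
    "t2": [(100, 200), (500, 600)],
    "t3": [(100, 200), (300, 400), (500, 600), (700, 800)],
}
events = cassette_exons(models)
```

Matching on junction coordinates rather than whole exons lets the flanking exons differ at their outer boundaries, which is common in reconstructed transcripts.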
Estrogens control multiple functions of hormone-responsive breast cancer cells. They regulate diverse physiological processes in various tissues through genomic and non-genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon estrogen stimulation is a critical biological process underlying the onset and progress of the majority of breast cancer. ERα requires distinct co-regulator or modulators for efficient transcriptional regulation, and they form a regulatory network. Knowing this regulatory network will enable systematic study of the effect of ERα on breast cancer.
To investigate the regulatory network of ERα and discover novel modulators of ERα functions, we proposed an analytical method based on a linear regression model to identify translational modulators and their network relationships. In the network analysis, a group of specific modulator and target genes were selected according to the functionality of the modulator and the ERα binding. The network formed from target genes with ERα binding was called the ERα genomic regulatory network, while the network formed from target genes without ERα binding was called the ERα non-genomic regulatory network. Considering the active or repressive function of ERα, the active or repressive function of a modulator, and the agonist or antagonist effect of a modulator on ERα, the ERα/modulator/target relationships were categorized into 27 classes.
Using the gene expression data and ERα ChIP-seq data from the MCF-7 cell line, the ERα genomic/non-genomic regulatory networks were built by merging ERα/modulator/target triplets (TF, M, T), where TF refers to ERα, M refers to the modulator, and T refers to the target. Comparing these two networks, the ERα non-genomic network has a lower FDR than the genomic network. To validate these two networks, the same network analysis was performed on gene expression data from the ZR-75.1 cell line. The network overlap analysis between the two cancer cell lines showed 1% overlap for the ERα genomic regulatory network, but 4% overlap for the non-genomic regulatory network.
We proposed a novel approach to infer the ERα/modulator/target relationships, and construct the genomic/non-genomic regulatory networks in two cancer cells. We found that the non-genomic regulatory network is more reliable than the genomic regulatory network.
A number of empirical Bayes models (each with different statistical distribution assumptions) have now been developed to analyze differential DNA methylation using high-density oligonucleotide tiling arrays. However, it remains unclear which model performs best. For example, when differentially methylated regions are analyzed for conservative and functional sequence characteristics (e.g., enrichment of transcription factor-binding sites (TFBSs)), the sensitivity of such analyses under the various empirical Bayes models remains unclear. In this paper, five empirical Bayes models were constructed, based on either a gamma distribution or a log-normal distribution, for the identification of differentially methylated loci and their cell division-dependent (1, 3, and 5 divisions) and drug treatment-dependent (cisplatin) methylation patterns. While differential methylation patterns generated by log-normal models were enriched with numerous TFBSs, we observed almost no TFBS-enriched sequences using gamma assumption models. Statistical and biological results suggest that a log-normal, rather than gamma, empirical Bayes model distribution is a highly accurate and precise method for differential methylation microarray analysis. In addition, we presented one of the log-normal models for differential methylation analysis and tested its reproducibility by simulation study. We believe this research to be the first extensive comparison of statistical modeling for the analysis of differential DNA methylation, an important biological phenomenon that precisely regulates gene transcription.
Motivation: One of the fundamental questions in genetic studies is how to identify functional DNA variants that are responsible for a disease or phenotype of interest. Results from large-scale genetic studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities for identifying causal variants. Despite the technical advances, informatics methodologies need to be developed to prioritize thousands of variants for potential causative effects.
Results: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools, for prioritizing regulatory SNPs, i.e. the SNPs in the promoter regions that potentially affect phenotype through changing transcription of downstream genes. Compared with existing tools, regSNPs has two distinct features. It considers the degenerate features of binding motifs by calculating the differences in binding affinity caused by the candidate variants, and it integrates the potential phenotypic effects of various transcription factors. When tested using the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance on various diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. Potential effects of one of the variants were validated using a luciferase reporter assay.
Supplementary data are available at Bioinformatics online.
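The regSNPs abstract above describes scoring a variant by the change in transcription factor binding affinity it causes. One common way to approximate this is to score the reference and alternate sequences against a position weight matrix (PWM) and take the difference. The sketch below illustrates that idea only; it is not the regSNPs implementation, and the function names, the uniform background, the pseudocount and the toy motif are all assumptions.

```python
import math

def pwm_log_odds(counts, background=0.25, pseudocount=1.0):
    """Convert per-position base counts into a log2-odds matrix."""
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2((column.get(base, 0) + pseudocount)
                            / total / background)
            for base in "ACGT"
        })
    return pwm

def best_score(pwm, seq):
    """Best log-odds score over all windows of seq (forward strand only)."""
    width = len(pwm)
    return max(
        sum(pwm[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    )

def affinity_change(pwm, ref_seq, pos, alt_base):
    """Score difference (alternate minus reference) for an SNV at pos."""
    alt_seq = ref_seq[:pos] + alt_base + ref_seq[pos + 1:]
    return best_score(pwm, alt_seq) - best_score(pwm, ref_seq)

# Toy motif that strongly prefers TGA (hypothetical counts).
toy_pwm = pwm_log_odds([{"T": 10}, {"G": 10}, {"A": 10}])
disrupting = affinity_change(toy_pwm, "CCTGACC", 3, "C")  # G>C inside the site
flanking = affinity_change(toy_pwm, "CCTGACC", 0, "A")    # C>A outside the site
```

A negative difference suggests the variant weakens the best predicted site, a positive one that it strengthens or creates a site. A fuller treatment would also scan the reverse strand and use motif-specific score thresholds.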
Potential epigenetic mechanisms underlying fetal alcohol syndrome (FAS) include alcohol-induced alterations of methyl metabolism, resulting in aberrant patterns of DNA methylation and gene expression during development. Having previously demonstrated an essential role for epigenetics in neural stem cell (NSC) development and that inhibiting DNA methylation prevents NSC differentiation, here we investigated the effect of alcohol exposure on genome-wide DNA methylation patterns and NSC differentiation.
NSCs in culture were treated with or without a 6-hr 88mM (“binge-like”) alcohol exposure and examined at 48 hrs, for migration, growth, and genome-wide DNA methylation. The DNA methylation was examined using DNA-methylation immunoprecipitation (MeDIP) followed by microarray analysis. Further validation was performed using Independent Sequenom analysis.
NSCs differentiated within 24 to 48 hrs, showing migration, neuronal expression, and morphological transformation. Alcohol exposure retarded the migration, neuronal formation, and growth processes of NSCs, similar to treatment with the methylation inhibitor 5-aza-cytidine. When NSCs departed from the quiescent state, a genome-wide diversification of DNA methylation was observed; that is, many moderately methylated genes altered their methylation levels and became hyper- or hypomethylated. Alcohol prevented this diversification in many genes, including genes related to neural development, neuronal receptors, and olfaction, while retarding differentiation. Validation by Sequenom analysis demonstrated that alcohol exposure prevented methylation of specific genes associated with neural development [Cutl2 (cut-like 2), Igf1 (insulin-like growth factor 1), Efemp1 (epidermal growth factor-containing fibulin-like extracellular matrix protein 1), and Sox7 (SRY-box containing gene 7)]; eye development [Lim2 (lens intrinsic membrane protein 2)]; epigenetic regulation [Smarca2 (SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2)]; and developmental disorder [Dgcr2 (DiGeorge syndrome critical region gene 2)]. Specific sites with altered DNA methylation also correlated with transcription factor binding sites known to be critical for regulating neural development.
The data indicate that alcohol prevents normal DNA methylation programming of key neural stem cell genes and retards NSC differentiation. Thus, the role of DNA methylation in FAS warrants further investigation.
Epigenetics; Epigenomics; MeDIP-Chip; Neural development; Fetal alcohol syndrome
It is now established that, as compared to normal cells, the cancer cell genome has an overall inverse distribution of DNA methylation (“methylome”), i.e., predominant hypomethylation and localized hypermethylation within “CpG islands” (CGIs). Moreover, although cancer cells have reduced methylation “fidelity” and genomic instability, accurate maintenance of the aberrant methylomes that underlie malignant phenotypes remains necessary. However, the mechanism(s) of cancer methylome maintenance remain largely unknown. Here, we assessed CGI methylation patterns propagated over 1, 3, and 5 divisions of A2780 ovarian cancer cells, concurrent with exposure to the DNA cross-linking chemotherapeutic cisplatin, and observed cell generation-successive increases in total hyper- and hypomethylated CGIs. Empirical Bayesian modeling revealed five distinct modes of methylation propagation: (1) heritable (i.e., unchanged) high methylation (1186 probe loci in the CGI microarray); (2) heritable (i.e., unchanged) low methylation (286 loci); (3) stochastic hypermethylation (i.e., progressively increased, 243 loci); (4) stochastic hypomethylation (i.e., progressively decreased, 247 loci); and (5) considerable “random” methylation (582 loci). These results support a “stochastic model” of DNA methylation equilibrium deriving from the efficiency of two distinct processes, methylation maintenance and de novo methylation. A role for cis-regulatory elements in methylation fidelity was also demonstrated by highly significant (p < 2.2 × 10⁻⁵) enrichment of transcription factor binding sites in CGI probe loci showing heritably high (118 elements) and low (47 elements) methylation, and also in loci demonstrating stochastic hypermethylation (30 elements) and hypomethylation (31 elements). Notably, loci having “random” methylation heritability displayed nearly no enrichment.
These results demonstrate an influence of cis-regulatory elements on the nonrandom propagation of both strictly heritable and stochastically heritable CGIs.
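How an equilibrium can arise from two competing efficiencies may be easier to see in a toy calculation. The sketch below is not the empirical Bayesian model used in the study. It assumes a deliberately simplified two-state recurrence in which, at each division, a methylated site stays methylated with probability `e_maint` and an unmethylated site gains methylation with probability `e_denovo`; both parameter names and all numeric values are hypothetical.

```python
def propagate(m0, e_maint, e_denovo, divisions):
    """Return the methylated fraction of a locus before and after each division.

    Assumed recurrence: m_next = e_maint * m + e_denovo * (1 - m).
    """
    fractions = [m0]
    for _ in range(divisions):
        m = fractions[-1]
        fractions.append(e_maint * m + e_denovo * (1.0 - m))
    return fractions


def equilibrium(e_maint, e_denovo):
    """Steady-state methylated fraction of the recurrence in propagate()."""
    return e_denovo / (1.0 - e_maint + e_denovo)


# Illustrative values only: near-perfect maintenance with rare de novo
# methylation changes a fully methylated locus very slowly over 5 divisions.
print(propagate(1.0, 0.99, 0.001, 5))
print(equilibrium(0.99, 0.001))
```

Under this assumption the fraction moves toward its equilibrium by a factor of `e_maint - e_denovo` per division, so loci with maintenance efficiency close to 1 and little de novo activity look essentially unchanged over a handful of divisions, while loci with lower fidelity drift progressively.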
It is estimated that more than 90% of human genes express multiple mRNA transcripts due to alternative splicing. Consequently, the proteins produced by different splice variants will likely have different functions and expression levels. Several genes with splice variants are known in bone, with functions that affect osteoblast function and bone formation. The primary goal of this study was to evaluate the extent of alternative splicing in bone subjected to mechanical loading and subsequent bone formation. We used the rat forelimb loading model, in which the right forelimb was loaded axially for 3 minutes, while the left forearm served as a non-loaded control. Animals were subjected to loading sessions every day, with 24 hours between sessions. Ulnae were sampled at 11 time points, from 4 hours to 32 days after beginning loading. RNA was isolated and mRNA abundance was measured at each time point using Affymetrix exon arrays (GeneChip® Rat Exon 1.0 ST Arrays). An ANOVA model was used to identify potential alternatively spliced genes across the time course, and five alternatively spliced genes were validated with qPCR: Akap12, Fn1, Pcolce, Sfrp4, and Tpm1. The number of alternatively spliced genes varied with time, ranging from a low of 68 at 12h to a high of 992 at 16d. We identified genes across the time course that encoded proteins with known functions in bone formation, including collagens, matrix proteins, and components of the Wnt/β-catenin and TGF-β signaling pathways. We also identified alternatively spliced genes encoding cytokines, ion channels, muscle-related genes, and solute carriers that do not have a known function in bone formation and represent potentially novel findings. In addition, a functional characterization was performed to categorize the global functions of the alternatively spliced genes in our data set.
In conclusion, mechanical loading induces alternative splicing in bone, which may play an important role in the response of bone to mechanical loading.
Alternative splicing; bone formation; exon arrays; mechanical loading
Bone responds to mechanical loading with increased bone formation, and the time course of bone formation after initiating mechanical loading is well characterized. However, the regulatory activities governing the loading-dependent changes in gene expression are not well understood. The goal of this study was to identify the time-dependent regulatory mechanisms that govern mechanical loading-induced gene expression in bone using a predictive bioinformatics algorithm. A standard model for bone loading in rodents was employed in which the right forelimb was loaded axially for three minutes per day, while the left forearm served as a non-loaded, contralateral control. Animals were subjected to loading sessions every day, with 24 hours between sessions. Ulnas were sampled at 11 time points, from 4 hours to 32 days after beginning loading. Using a predictive bioinformatics algorithm, we created a linear model of gene expression and identified 44 transcription factor binding motifs and 29 microRNA binding sites that were predicted to regulate gene expression across the time course. Known and novel transcription factor binding motifs were identified throughout the time course, as were several novel microRNA binding sites. These time-dependent regulatory mechanisms may be important in controlling the loading-induced bone formation process.
bone; exon array; mechanical loading; microRNA; regulation; transcription factor
Next-generation sequencing technology provides new opportunities and challenges in the search for genetic variants that underlie complex traits. It will also presumably uncover many new rare variants, but exactly how these variants should be incorporated into the data analysis remains a question. Several papers in our group from Genetic Analysis Workshop 17 evaluated different methods of rare variant analysis, including single-variant, gene-based, and pathway-based analyses and analyses that incorporated biological information. Although the performance of some of these methods strongly depends on the underlying disease model, integration of known biological information is helpful in detecting causal genes. Two work groups demonstrated that use of a Bayesian network and a collapsing receiver operating characteristic curve approach improves risk prediction when a disease is caused by many rare variants. Another work group suggested that modeling local rather than global ancestry may be beneficial when controlling the effect of population structure in rare variant association analysis.
rare variant; association analysis; risk prediction model; population structure; biological information; receiver operating characteristic; Bayesian network
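The general notion of collapsing rare variants and then judging risk prediction with a receiver operating characteristic curve can be sketched as follows. This is not the workshop groups' method, whose details are not given here. It assumes a simple burden score (the count of minor alleles carried at variants below a frequency cutoff) and evaluates it by the area under the ROC curve; the function names, the 0.01 cutoff, and the toy genotypes are all hypothetical.

```python
def burden_scores(genotypes, allele_freqs, maf_cutoff=0.01):
    """Collapse rare variants: count minor alleles at variants below the cutoff.

    genotypes: one row per individual, each entry 0, 1, or 2 minor alleles.
    allele_freqs: minor allele frequency of each variant (one per column).
    """
    rare = [j for j, freq in enumerate(allele_freqs) if freq < maf_cutoff]
    return [sum(row[j] for j in rare) for row in genotypes]


def auc(scores, labels):
    """Area under the ROC curve by pairwise comparison (ties count one half)."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy data: four individuals, three variants (the third is common and ignored).
toy_genotypes = [[1, 0, 2], [0, 0, 1], [0, 1, 0], [0, 0, 0]]
toy_freqs = [0.005, 0.004, 0.3]
toy_status = [1, 0, 1, 0]  # 1 = case, 0 = control
print(auc(burden_scores(toy_genotypes, toy_freqs), toy_status))
```

Collapsing trades per-variant resolution for power: no single rare variant is carried by enough individuals to be informative alone, but their sum can still separate cases from controls when many of them act in the same direction.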