Background
Bidirectional promoters are promoter sequences shared between divergent gene pairs (genes proximal to each other on opposite strands) and can regulate the genes in both directions. In the human genome, > 10% of protein-coding genes are arranged head-to-head on opposite strands, with transcription start sites that are separated by < 1,000 base pairs. Many transcription factor binding sites that influence the expression of the 2 opposite genes occur in bidirectional promoters. Recently, RNA polymerase II (RPol II) ChIP-seq data have been used to identify the promoters of coding genes and non-coding RNAs. However, bidirectional promoters have not yet been identified using RPol II ChIP-seq data.
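The head-to-head definition above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the `Gene` record, the function name `divergent_pairs`, and the toy coordinates are all hypothetical, and the only criterion taken from the text is that the two transcription start sites lie on opposite strands and are separated by fewer than 1,000 base pairs.

```python
from typing import List, NamedTuple, Tuple


class Gene(NamedTuple):
    """Hypothetical minimal gene record used only for this sketch."""
    name: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site coordinate


def divergent_pairs(genes: List[Gene], max_gap: int = 1000) -> List[Tuple[str, str, int]]:
    """Return (minus_gene, plus_gene, gap) for head-to-head gene pairs.

    A pair is head-to-head when the minus-strand TSS lies to the left of
    the plus-strand TSS on the same chromosome, so the two genes are
    transcribed away from each other.
    """
    minus = [g for g in genes if g.strand == "-"]
    plus = [g for g in genes if g.strand == "+"]
    pairs = []
    for m in minus:
        for p in plus:
            if m.chrom != p.chrom:
                continue
            gap = p.tss - m.tss
            if 0 < gap < max_gap:
                pairs.append((m.name, p.name, gap))
    return pairs


# Toy coordinates (made up for illustration).
example_genes = [
    Gene("A", "chr1", "-", 10_000),
    Gene("B", "chr1", "+", 10_400),  # 400 bp from A: divergent pair
    Gene("C", "chr1", "+", 15_000),  # 5,000 bp from A: too far apart
    Gene("D", "chr2", "-", 500),
    Gene("E", "chr2", "+", 300),     # plus TSS left of minus TSS: not head-to-head
]
example_pairs = divergent_pairs(example_genes)
```

The quadratic loop is chosen for readability; a genome-scale version would sort genes by coordinate first.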
Results
In some bidirectional promoter regions, RPol II binding forms a bi-peak shape, which indicates that 2 promoters are located in the bidirectional region. We have developed a computational approach to identify the regulatory regions of all divergent gene pairs using genome-wide RPol II binding patterns derived from ChIP-seq data, based upon the assumption that the distribution of RPol II binding around a bidirectional promoter is the accumulation of RPol II binding at 2 promoters. In HeLa S3 cells, 249 promoter pairs and 1,094 single promoters were identified, of which 76 promoters cover only positive-strand genes, 86 promoters cover only negative-strand genes, and 932 promoters cover 2 genes. Gene expression levels and STAT1 binding sites were then examined for the different promoter categories.
Conclusions
Identifying the regulatory regions of bidirectional promoters based upon RPol II binding patterns provides important temporal and spatial measurements regarding the initiation of transcription. Gene expression and transcription factor binding site analyses suggest that the promoters in bidirectional regions may regulate the closest gene, and that STAT1 is involved in the primary promoter.
doi:10.1186/1755-8794-6-S1-S5
PMCID: PMC3552671
PMID: 23369456
Background
Over 10,000 long intergenic non-coding RNAs (lincRNAs) have been identified in the human genome. Some have been well characterized and are known to participate in various stages of gene regulation. In the post-transcriptional process, another well-known class of small non-coding RNA, microRNA (miRNA), is very active in inhibiting mRNA. Similar features between mRNA and lincRNA have been revealed in several recent studies, and a few isolated miRNA-lincRNA relationships have been observed. Despite these advances, the comprehensive miRNA regulation pattern of lincRNA has not been clarified.
Methods
In this study, we investigated the possible interaction between the two classes of non-coding RNAs. Instead of using the existing long non-coding database, we employed an ab initio method to annotate lincRNAs expressed in a group of normal breast tissues and breast tumors.
Results
Approximately 90 lincRNAs show strong inverse expression correlation with miRNAs for which they contain at least one predicted target site. These target sites are statistically more conserved than their neighboring genetic regions and other predicted target sites. Several miRNAs that target these lincRNAs are known to play an essential role in breast cancer.
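To make the notion of inverse expression correlation concrete, the sketch below computes a Pearson correlation between a miRNA and a lincRNA expression profile and flags strongly negative pairs. The abstract does not state which correlation measure or cutoff was used, so Pearson's r, the threshold of -0.8, the function names, and the toy expression values are all assumptions made for illustration.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return sxy / math.sqrt(sxx * syy)


def is_inverse_pair(mirna: Sequence[float], lincrna: Sequence[float],
                    threshold: float = -0.8) -> bool:
    """True when the two profiles are strongly anti-correlated.

    The threshold is a hypothetical cutoff, not a value from the study.
    """
    return pearson(mirna, lincrna) <= threshold


# Toy expression values across five samples (made up for illustration).
mirna_expr = [1.0, 2.0, 3.0, 4.0, 5.0]
lincrna_expr = [10.0, 8.0, 6.0, 4.0, 2.0]
r_example = pearson(mirna_expr, lincrna_expr)
```

In the study, a candidate pair additionally required a predicted miRNA target site on the lincRNA; that step is not modeled here.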
Conclusion
Similar to inhibiting mRNAs, miRNAs show potential in promoting the degeneration of lincRNAs. Breast-cancer-related miRNAs may influence their target lincRNAs resulting in differential expression in normal and malignant breast tissues. This implies the miRNA regulation of lincRNAs may be involved in the regulatory process in tumor cells.
doi:10.1186/1755-8794-6-S1-S7
PMCID: PMC3552696
PMID: 23369519
Background
Typical analysis of time-series gene expression data such as clustering or graphical models cannot distinguish between early and later drug responsive gene targets in cancer cells. However, these genes would represent good candidate biomarkers.
Results
We propose a new model - the dynamic time order network - to distinguish and connect early and later drug responsive gene targets. This network is constructed based on an integrated differential equation. Spline regression is applied for an accurate modeling of the time variation of gene expressions. Then a likelihood ratio test is implemented to infer the time order of any gene expression pair. One application of the model is the discovery of estrogen response biomarkers. For this purpose, we focused on genes whose responses are late when the breast cancer cells are treated with estradiol (E2).
Conclusions
Our approach has been validated by successfully finding time order relations between genes of the cell cycle system. More notably, we found late response genes potentially interesting as biomarkers of E2 treatment.
doi:10.1186/1752-0509-6-S3-S9
PMCID: PMC3524318
PMID: 23281615
Background
Alternative splicing increases proteome diversity by expressing multiple gene isoforms that often differ in function. Identifying alternative splicing events from RNA-seq experiments is important for understanding the diversity of transcripts and for investigating the regulation of splicing.
Results
We developed Alt Event Finder, a tool for identifying novel splicing events by using transcript annotation derived from genome-guided construction tools, such as Cufflinks and Scripture. With a proper combination of alignment and transcript reconstruction tools, Alt Event Finder is capable of identifying novel splicing events in the human genome. We further applied Alt Event Finder on a set of RNA-seq data from rat liver tissues, and identified dozens of novel cassette exon events whose splicing patterns changed after extensive alcohol exposure.
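As a rough illustration of what a cassette exon event looks like in transcript annotation, the sketch below compares two isoforms of one gene, each given as an ordered list of exon coordinates. It is not Alt Event Finder's actual algorithm; the function name, the coordinate convention, and the toy isoforms are hypothetical.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # (start, end) on one strand, in transcript order


def cassette_exons(including: List[Exon], skipping: List[Exon]) -> List[Exon]:
    """Exons of `including` that are skipped as a unit in `skipping`.

    An exon counts as a cassette exon when the splice junction that joins
    its two flanking exons directly is present in the other isoform.
    """
    # Junctions (end of upstream exon, start of downstream exon) in the skipping isoform.
    skip_junctions = {(a[1], b[0]) for a, b in zip(skipping, skipping[1:])}
    found = []
    for upstream, middle, downstream in zip(including, including[1:], including[2:]):
        if (upstream[1], downstream[0]) in skip_junctions:
            found.append(middle)
    return found


# Toy isoforms (made-up coordinates).
isoform_long = [(100, 200), (300, 350), (500, 600)]
isoform_short = [(100, 200), (500, 600)]
events = cassette_exons(isoform_long, isoform_short)
```

Matching on junction coordinates rather than on whole exons tolerates isoforms whose first or last exons differ in length.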
Conclusions
Alt Event Finder is capable of identifying de novo splicing events from data-driven transcript annotation, and is a useful tool for studying splicing regulation.
doi:10.1186/1471-2164-13-S8-S10
PMCID: PMC3535697
PMID: 23281921
Background
Estrogens control multiple functions of hormone-responsive breast cancer cells. They regulate diverse physiological processes in various tissues through genomic and non-genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon estrogen stimulation is a critical biological process underlying the onset and progress of the majority of breast cancers. ERα requires distinct co-regulators or modulators for efficient transcriptional regulation, and they form a regulatory network. Knowing this regulatory network will enable systematic study of the effect of ERα on breast cancer.
Methods
To investigate the regulatory network of ERα and discover novel modulators of ERα functions, we proposed an analytical method based on a linear regression model to identify translational modulators and their network relationships. In the network analysis, a group of specific modulator and target genes was selected according to the functionality of the modulator and the ERα binding. The network formed from target genes with ERα binding was called the ERα genomic regulatory network, while the network formed from target genes without ERα binding was called the ERα non-genomic regulatory network. Considering the active or repressive function of ERα, the active or repressive function of a modulator, and the agonist or antagonist effect of a modulator on ERα, the ERα/modulator/target relationships were categorized into 27 classes.
Results
Using the gene expression data and ERα ChIP-seq data from the MCF-7 cell line, the ERα genomic/non-genomic regulatory networks were built by merging ERα/modulator/target triplets (TF, M, T), where TF refers to ERα, M refers to the modulator, and T refers to the target. Comparing these two networks, the ERα non-genomic network has a lower FDR than the genomic network. In order to validate these two networks, the same network analysis was performed on the gene expression data from the ZR-75.1 cell line. The network overlap analysis between the two cancer cell lines showed 1% overlap for the ERα genomic regulatory network, but 4% overlap for the non-genomic regulatory network.
Conclusions
We proposed a novel approach to infer the ERα/modulator/target relationships and to construct the genomic/non-genomic regulatory networks in two cancer cell lines. We found that the non-genomic regulatory network is more reliable than the genomic regulatory network.
doi:10.1186/1471-2164-13-S6-S6
PMCID: PMC3481450
PMID: 23134758
Teng, Mingxiang | Wang, Yadong | Kim, Seongho | Li, Lang | Shen, Changyu | Wang, Guohua | Liu, Yunlong | Huang, Tim H. M. | Nephew, Kenneth P. | Balch, Curt
A number of empirical Bayes models (each with different statistical distribution assumptions) have now been developed to analyze differential DNA methylation using high-density oligonucleotide tiling arrays. However, it remains unclear which model performs best. For example, for analysis of differentially methylated regions for conservative and functional sequence characteristics (e.g., enrichment of transcription factor-binding sites (TFBSs)), the sensitivity of such analyses, using various empirical Bayes models, remains unclear. In this paper, five empirical Bayes models were constructed, based on either a gamma distribution or a log-normal distribution, for the identification of differentially methylated loci and their cell division- (1, 3, and 5) and drug treatment- (cisplatin) dependent methylation patterns. While differential methylation patterns generated by log-normal models were enriched with numerous TFBSs, we observed almost no TFBS-enriched sequences using gamma assumption models. Statistical and biological results suggest that a log-normal, rather than a gamma, empirical Bayes model distribution is a highly accurate and precise method for differential methylation microarray analysis. In addition, we presented one of the log-normal models for differential methylation analysis and tested its reproducibility by a simulation study. We believe this research to be the first extensive comparison of statistical modeling for the analysis of differential DNA methylation, an important biological phenomenon that precisely regulates gene transcription.
doi:10.1155/2012/376706
PMCID: PMC3432337
PMID: 22956892
Teng, Mingxiang | Ichikawa, Shoji | Padgett, Leah R. | Wang, Yadong | Mort, Matthew | Cooper, David N. | Koller, Daniel L. | Foroud, Tatiana | Edenberg, Howard J. | Econs, Michael J. | Liu, Yunlong
Motivation: One of the fundamental questions in genetic studies is to identify functional DNA variants that are responsible for a disease or phenotype of interest. Results from large-scale genetics studies, such as genome-wide association studies (GWAS), and the availability of high-throughput sequencing technologies provide opportunities for identifying causal variants. Despite the technical advances, informatics methodologies need to be developed to prioritize thousands of variants for potential causative effects.
Results: We present regSNPs, an informatics strategy that integrates several established bioinformatics tools, for prioritizing regulatory SNPs, i.e. the SNPs in the promoter regions that potentially affect phenotype through changing transcription of downstream genes. Compared with existing tools, regSNPs has two distinct features. It considers degenerative features of binding motifs by calculating the differences in binding affinity caused by the candidate variants, and it integrates potential phenotypic effects of various transcription factors. When tested using the disease-causing variants documented in the Human Gene Mutation Database, regSNPs showed mixed performance on various diseases. regSNPs predicted three SNPs that can potentially affect bone density in a region detected in an earlier linkage study. Potential effects of one of the variants were validated using a luciferase reporter assay.
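The idea of scoring a variant by the difference in binding affinity it causes can be illustrated with a position weight matrix. The sketch below is not the regSNPs implementation: the log-odds scoring against a uniform background, the function names, and the toy three-base motif are all assumptions chosen to keep the example short.

```python
import math
from typing import Dict, List

PWM = List[Dict[str, float]]  # one base-probability column per motif position


def log_odds(window: str, pwm: PWM, background: float = 0.25) -> float:
    """Log2-odds score of a window against a uniform background (an assumption)."""
    return sum(math.log2(col[base] / background) for base, col in zip(window, pwm))


def best_score(seq: str, snp_pos: int, pwm: PWM) -> float:
    """Best score among motif-length windows that overlap snp_pos (0-based)."""
    width = len(pwm)
    first = max(0, snp_pos - width + 1)
    last = min(snp_pos, len(seq) - width)
    if last < first:
        raise ValueError("sequence too short for this motif")
    return max(log_odds(seq[s:s + width], pwm) for s in range(first, last + 1))


def affinity_change(seq: str, snp_pos: int, alt: str, pwm: PWM) -> float:
    """Alternate-allele best score minus reference-allele best score."""
    alt_seq = seq[:snp_pos] + alt + seq[snp_pos + 1:]
    return best_score(alt_seq, snp_pos, pwm) - best_score(seq, snp_pos, pwm)


# Toy motif preferring "ACG" (probabilities made up; no zero entries).
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
# C>T at position 3 disrupts the middle base of the "ACG" match.
delta = affinity_change("TTACGTT", 3, "T", toy_pwm)
```

A negative value means the alternate allele weakens the best predicted site; real matrices would come from a curated motif database and usually need pseudocounts to avoid zero probabilities.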
Contact:
yunliu@iupui.edu
Supplementary information:
Supplementary data are available at Bioinformatics online
doi:10.1093/bioinformatics/bts275
PMCID: PMC3389767
PMID: 22611130
Background
Potential epigenetic mechanisms underlying fetal alcohol syndrome (FAS) include alcohol-induced alterations of methyl metabolism, resulting in aberrant patterns of DNA methylation and gene expression during development. Having previously demonstrated an essential role for epigenetics in neural stem cell (NSC) development and that inhibiting DNA methylation prevents NSC differentiation, here we investigated the effect of alcohol exposure on genome-wide DNA methylation patterns and NSC differentiation.
Methods
NSCs in culture were treated with or without a 6-hr, 88 mM (“binge-like”) alcohol exposure and examined at 48 hrs for migration, growth, and genome-wide DNA methylation. DNA methylation was examined using methylated DNA immunoprecipitation (MeDIP) followed by microarray analysis. Further validation was performed using independent Sequenom analysis.
Results
NSCs differentiated in 24 to 48 hrs with migration, neuronal expression, and morphological transformation. Alcohol exposure retarded the migration, neuronal formation, and growth processes of NSCs, similar to treatment with the methylation inhibitor 5-aza-cytidine. When NSCs departed from the quiescent state, a genome-wide diversification of DNA methylation was observed; that is, many moderately methylated genes altered methylation levels and became hyper- and hypomethylated. Alcohol prevented many genes from such diversification, including genes related to neural development, neuronal receptors, and olfaction, while retarding differentiation. Validation of specific genes by Sequenom analysis demonstrated that alcohol exposure prevented methylation of specific genes associated with neural development [Cutl2 (cut-like 2), Igf1 (insulin-like growth factor 1), Efemp1 (epidermal growth factor-containing fibulin-like extracellular matrix protein 1), and Sox7 (SRY-box containing gene 7)]; eye development [Lim2 (lens intrinsic membrane protein 2)]; the epigenetic mark Smarca2 (SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 2); and developmental disorder [Dgcr2 (DiGeorge syndrome critical region gene 2)]. Specific sites altered by DNA methylation also correlated with transcription factor binding sites known to be critical for regulating neural development.
Conclusion
The data indicate that alcohol prevents normal DNA methylation programming of key neural stem cell genes and retards NSC differentiation. Thus, the role of DNA methylation in FAS warrants further investigation.
doi:10.1111/j.1530-0277.2010.01391.x
PMCID: PMC3076804
PMID: 21223309
Epigenetics; Epigenomics; MeDIP-Chip; Neural development; Fetal alcohol syndrome
It is now established that, as compared to normal cells, the cancer cell genome has an overall inverse distribution of DNA methylation (“methylome”), i.e., predominant hypomethylation and localized hypermethylation, within “CpG islands” (CGIs). Moreover, although cancer cells have reduced methylation “fidelity” and genomic instability, accurate maintenance of aberrant methylomes that underlie malignant phenotypes remains necessary. However, the mechanism(s) of cancer methylome maintenance remains largely unknown. Here, we assessed CGI methylation patterns propagated over 1, 3, and 5 divisions of A2780 ovarian cancer cells, concurrent with exposure to the DNA cross-linking chemotherapeutic cisplatin, and observed cell generation-successive increases in total hyper- and hypo-methylated CGIs. Empirical Bayesian modeling revealed five distinct modes of methylation propagation: (1) heritable (i.e., unchanged) high methylation (1186 probe loci in CGI microarray); (2) heritable (i.e., unchanged) low methylation (286 loci); (3) stochastic hypermethylation (i.e., progressively increased, 243 loci); (4) stochastic hypomethylation (i.e., progressively decreased, 247 loci); and (5) considerable “random” methylation (582 loci). These results support a “stochastic model” of DNA methylation equilibrium deriving from the efficiency of two distinct processes, methylation maintenance and de novo methylation. A role for cis-regulatory elements in methylation fidelity was also demonstrated by highly significant (p < 2.2×10^-5) enrichment of transcription factor binding sites in CGI probe loci showing heritably high (118 elements) and low (47 elements) methylation, and also in loci demonstrating stochastic hyper- (30 elements) and hypo- (31 elements) methylation. Notably, loci having “random” methylation heritability displayed nearly no enrichment.
These results demonstrate an influence of cis-regulatory elements on the nonrandom propagation of both strictly heritable and stochastically heritable CGIs.
doi:10.1371/journal.pone.0032928
PMCID: PMC3295790
PMID: 22412954
It is estimated that more than 90% of human genes express multiple mRNA transcripts due to alternative splicing. Consequently, the proteins produced by different splice variants will likely have different functions and expression levels. Several genes with splice variants are known in bone, with functions that affect osteoblast function and bone formation. The primary goal of this study was to evaluate the extent of alternative splicing in a bone subjected to mechanical loading and subsequent bone formation. We used the rat forelimb loading model, in which the right forelimb was loaded axially for 3 minutes, while the left forearm served as a non-loaded control. Animals were subjected to loading sessions every day, with 24 hours between sessions. Ulnae were sampled at 11 time points, from 4 hours to 32 days after beginning loading. RNA was isolated and mRNA abundance was measured at each time point using Affymetrix exon arrays (GeneChip® Rat Exon 1.0 ST Arrays). An ANOVA model was used to identify potential alternatively spliced genes across the time course, and five alternatively spliced genes were validated with qPCR: Akap12, Fn1, Pcolce, Sfrp4, and Tpm1. The number of alternatively spliced genes varied with time, ranging from a low of 68 at 12h to a high of 992 at 16d. We identified genes across the time course that encoded proteins with known functions in bone formation, including collagens, matrix proteins, and components of the Wnt/β-catenin and TGF-β signaling pathways. We also identified alternatively spliced genes encoding cytokines, ion channels, muscle-related genes, and solute carriers that do not have a known function in bone formation and represent potentially novel findings. In addition, a functional characterization was performed to categorize the global functions of the alternatively spliced genes in our data set. 
In conclusion, mechanical loading induces alternative splicing in bone, which may play an important role in the response of bone to mechanical loading.
doi:10.1016/j.bone.2010.11.006
PMCID: PMC3039044
PMID: 21095247
Alternative splicing; bone formation; exon arrays; mechanical loading
Bone responds with increased bone formation to mechanical loading, and the time course of bone formation after initiating mechanical loading is well characterized. However, the regulatory activities governing the loading-dependent changes in gene expression are not well understood. The goal of this study was to identify the time-dependent regulatory mechanisms that governed mechanical loading-induced gene expression in bone using a predictive bioinformatics algorithm. A standard model for bone loading in rodents was employed in which the right forelimb was loaded axially for three minutes per day, while the left forearm served as a non-loaded, contralateral control. Animals were subjected to loading sessions every day, with 24 hours between sessions. Ulnas were sampled at 11 time points, from 4 hours to 32 days after beginning loading. Using a predictive bioinformatics algorithm, we created a linear model of gene expression and identified 44 transcription factor binding motifs and 29 microRNA binding sites that were predicted to regulate gene expression across the time course. Known and novel transcription factor binding motifs were identified throughout the time course, as were several novel microRNA binding sites. These time-dependent regulatory mechanisms may be important in controlling the loading-induced bone formation process.
doi:10.4137/GRSB.S8068
PMCID: PMC3273934
PMID: 22346344
bone; exon array; mechanical loading; microRNA; regulation; transcription factor
Next-generation sequencing technology provides new opportunities and challenges in the search for genetic variants that underlie complex traits. It will also presumably uncover many new rare variants, but exactly how these variants should be incorporated into the data analysis remains a question. Several papers in our group from Genetic Analysis Workshop 17 evaluated different methods of rare variant analysis, including single-variant, gene-based, and pathway-based analyses and analyses that incorporated biological information. Although the performance of some of these methods strongly depends on the underlying disease model, integration of known biological information is helpful in detecting causal genes. Two work groups demonstrated that use of a Bayesian network and a collapsing receiver operating characteristic curve approach improves risk prediction when a disease is caused by many rare variants. Another work group suggested that modeling local rather than global ancestry may be beneficial when controlling the effect of population structure in rare variant association analysis.
doi:10.1002/gepi.20649
PMCID: PMC3250084
PMID: 22128058
rare variant; association analysis; risk prediction model; population structure; biological information; receiver operating characteristic; Bayesian network
The advent of high-throughput measurements of gene expression and bioinformatics analysis methods offers new ways to study gene expression patterns. The primary goal of this study was to determine the time sequence for gene expression in a bone subjected to mechanical loading during key periods of the bone-formation process, including expression of matrix-related genes, the appearance of active osteoblasts, and bone desensitization. A standard model for bone loading was employed in which the right forelimb was loaded axially for 3 minutes per day, whereas the left forearm served as a nonloaded contralateral control. We evaluated loading-induced gene expression over a time course of 4 hours to 32 days after the first loading session. Six distinct time-dependent patterns of gene expression were identified over the time course and were categorized into three primary clusters: genes upregulated early in the time course, genes upregulated during matrix formation, and genes downregulated during matrix formation. Genes then were grouped based on function and/or signaling pathways. Many gene groups known to be important in loading-induced bone formation were identified within the clusters, including AP-1-related genes in the early-response cluster, matrix-related genes in the upregulated gene clusters, and Wnt/β-catenin signaling pathway inhibitors in the downregulated gene clusters. Several novel gene groups were identified as well, including chemokine-related genes, which were upregulated early but downregulated later in the time course; solute carrier genes, which were both upregulated and downregulated; and muscle-related genes, which were primarily downregulated. © 2011 American Society for Bone and Mineral Research.
doi:10.1002/jbmr.193
PMCID: PMC3179310
PMID: 20658561
EXON ARRAYS; GENE EXPRESSION; MECHANICAL LOADING
We present a report of the BIOCOMP'10 - The 2010 International Conference on Bioinformatics & Computational Biology and other related work in the area of systems biology.
doi:10.1186/1752-0509-5-S3-I1
PMCID: PMC3287563
PMID: 22784614
Background
Recent studies suggest that many proteins or regions of proteins lack 3D structure. Defined as intrinsically disordered proteins, these proteins/peptides are functionally important. Recent advances in next generation sequencing technologies enable genome-wide identification of novel nucleotide variations in a specific population or cohort.
Results
Using the exonic single nucleotide variations (SNVs) identified in the 1,000 Genomes Project and distributed by the Genetic Analysis Workshop 17, we systematically analysed the genetic and predicted disorder potential features of the non-synonymous variations. The result of experiments suggests that a significant change in the tendency of a protein region to be structured or disordered caused by SNVs may lead to malfunction of such a protein and contribute to disease risk.
Conclusions
After validation with functional SNVs on the traits distributed by GAW17, we conclude that it is valuable to consider structure/disorder tendencies while prioritizing and predicting mechanistic effects arising from novel genetic variations.
doi:10.1186/1471-2164-12-S5-S2
PMCID: PMC3287498
PMID: 22369681
Background
RNA-binding proteins (RBPs) play diverse roles in eukaryotic RNA processing. Despite their pervasive functions in coding and noncoding RNA biogenesis and regulation, elucidating the sequence specificities that define protein-RNA interactions remains a major challenge. Recently, CLIP-seq (cross-linking immunoprecipitation followed by high-throughput sequencing) has been successfully implemented to study the transcriptome-wide binding patterns of the SRSF1, PTBP1, NOVA and FOX2 proteins. These studies either adopted traditional methods like Multiple EM for Motif Elicitation (MEME) to discover the sequence consensus of an RBP's binding sites or used Z-score statistics to search for overrepresented nucleotide sequences of a certain size. We argue that most of these methods are not well suited for RNA motif identification, as they are unable to incorporate the RNA structural context of protein-RNA interactions, which may affect binding specificity. Here, we describe a novel model-based approach, RNAMotifModeler, to identify the consensus of protein-RNA binding regions by integrating sequence features and RNA secondary structures.
Results
As an example, we applied RNAMotifModeler to SRSF1 (SF2/ASF) CLIP-seq data. The sequence-structural consensus we identified is a purine-rich octamer 'AGAAGAAG' in a highly single-stranded RNA context. The unpaired probabilities (the probabilities of not forming base pairs) are significantly higher than those of negative controls and of the flanking sequence surrounding the binding site, indicating that SRSF1 proteins tend to bind to single-stranded RNA. Further statistical evaluations revealed that the second and fifth bases of the SRSF1 octamer motif have much stronger sequence specificities, but weaker single-strandedness, while the third, fourth, sixth and seventh bases are far more likely to be single-stranded, but have more degenerate sequence specificities. Therefore, we hypothesize that nucleotide specificity and secondary structure play complementary roles during binding site recognition by SRSF1.
Conclusion
In this study, we presented a computational model to predict the sequence consensus and optimal RNA secondary structure for protein-RNA binding regions. The successful application to SRSF1 CLIP-seq data demonstrates great potential to improve our understanding of the binding specificity of RNA-binding proteins.
doi:10.1186/1471-2164-12-S5-S8
PMCID: PMC3287504
PMID: 22369183
Identifying rare variants that are responsible for complex disease has been promoted by advances in sequencing technologies. However, statistical methods that can handle the vast amount of data generated and that can interpret the complicated relationship between disease and these variants have lagged. We apply a zero-inflated Poisson regression model to take into account the excess of zeros caused by the extremely low frequency of the 24,487 exonic variants in the Genetic Analysis Workshop 17 data. We grouped the 697 subjects in the data set as Europeans, Asians, and Africans based on principal components analysis and found the total number of rare variants per gene for each individual. We then analyzed these collapsed variants based on the assumption that rare variants are enriched in a group of people affected by a disease compared to a group of unaffected people. We also tested the hypothesis with quantitative traits Q1, Q2, and Q4. Analyses performed on the combined 697 individuals and on each ethnic group yielded different results. For the combined population analysis, we found that UGT1A1, which was not part of the simulation model, was associated with disease liability and that FLT1, which was a causal locus in the simulation model, was associated with Q1. Of the causal loci in the simulation models, FLT1 and KDR were associated with Q1 and VNN1 was correlated with Q2. No significant genes were associated with Q4. These results show the feasibility and capability of our new statistical model to detect multiple rare variants influencing disease risk.
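For readers unfamiliar with the zero-inflated Poisson (ZIP) distribution, the sketch below writes out its probability mass function and a log-likelihood for a set of collapsed per-gene variant counts. It shows only the distribution, without the regression on covariates used in the study; the parameter names `lam` and `pi`, the function names, and the toy counts are assumptions made for illustration.

```python
import math
from typing import Sequence


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """P(K = k) under a zero-inflated Poisson.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    if k == 0:
        return pi + (1.0 - pi) * poisson
    return (1.0 - pi) * poisson


def zip_log_likelihood(counts: Sequence[int], lam: float, pi: float) -> float:
    """Log-likelihood of independent counts under the ZIP distribution."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


# Toy collapsed counts for one gene across ten individuals (made up):
# most individuals carry no rare variant, a few carry several.
toy_counts = [0, 0, 0, 0, 0, 0, 0, 0, 2, 3]

# Plain Poisson (pi = 0) at the sample mean versus a ZIP with many structural zeros.
ll_poisson = zip_log_likelihood(toy_counts, lam=0.5, pi=0.0)
ll_zip = zip_log_likelihood(toy_counts, lam=2.5, pi=0.8)
```

With these toy counts the ZIP parameters give a higher log-likelihood than the plain Poisson, which is the excess-of-zeros effect the abstract refers to.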
doi:10.1186/1753-6561-5-S9-S103
PMCID: PMC3287826
PMID: 22373445
Recent evidence suggests that many complex diseases are caused by genetic variations that play regulatory roles in controlling gene expression. Most genetic studies focus on nonsynonymous variations that can alter the amino acid composition of a protein and are therefore believed to have the highest impact on phenotype. Synonymous variations, however, can also play important roles in disease pathogenesis by regulating pre-mRNA processing and translational control. In this study, we systematically survey the effects of single-nucleotide variations (SNVs) on the binding affinity of RNA-binding proteins (RBPs). Among the 10,113 synonymous SNVs identified in 697 individuals in the 1,000 Genomes Project and distributed by Genetic Analysis Workshop 17 (GAW17), we identified 182 variations located in alternatively spliced exons that can significantly change the binding affinity of nine RBPs whose binding preferences on 7-mer RNA sequences were previously reported. We found that the minor allele frequencies of these variations are similar to those of nonsynonymous SNVs, suggesting that they are in fact functional. We propose a workflow to identify phenotype-associated regulatory SNVs that might affect alternative splicing from exome-sequencing-derived genetic variations. Based on the SNVs affecting the quantitative traits simulated in GAW17, we further identified two and four functional SNVs that are predicted to be involved in alternative splicing regulation in traits Q1 and Q2, respectively.
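Because the RBP binding preferences were reported on 7-mers, a single SNV changes up to seven overlapping 7-mers. The sketch below enumerates those reference/alternate 7-mer pairs and looks them up in an affinity table. The table, its scores, the function names, and the toy sequence are hypothetical; the abstract does not describe how affinity changes were scored or tested for significance.

```python
from typing import Dict, List, Tuple


def overlapping_kmers(seq: str, pos: int, k: int = 7) -> List[str]:
    """All k-mers of seq that contain position pos (0-based)."""
    first = max(0, pos - k + 1)
    last = min(pos, len(seq) - k)
    return [seq[s:s + k] for s in range(first, last + 1)]


def kmer_pairs(seq: str, pos: int, alt: str, k: int = 7) -> List[Tuple[str, str]]:
    """(reference k-mer, alternate k-mer) pairs affected by the SNV."""
    alt_seq = seq[:pos] + alt + seq[pos + 1:]
    return list(zip(overlapping_kmers(seq, pos, k), overlapping_kmers(alt_seq, pos, k)))


def largest_change(pairs: List[Tuple[str, str]], affinity: Dict[str, float]) -> float:
    """Signed affinity change (alt minus ref) with the largest magnitude.

    k-mers missing from the table are scored 0.0, an assumption of this sketch.
    """
    changes = [affinity.get(a, 0.0) - affinity.get(r, 0.0) for r, a in pairs]
    return max(changes, key=abs)


# Toy example: A>G at position 6 of a 13-base sequence (values made up).
toy_seq = "GATTACAGGCTTA"
toy_pairs = kmer_pairs(toy_seq, 6, "G")
toy_affinity = {"GATTACA": 1.0, "GATTACG": 0.25}
toy_change = largest_change(toy_pairs, toy_affinity)
```

Near the ends of a sequence fewer than seven windows exist, which the index arithmetic in `overlapping_kmers` handles.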
doi:10.1186/1753-6561-5-S9-S40
PMCID: PMC3287877
PMID: 22373210
Introduction
Serum microRNAs have the potential to be valuable biomarkers of cancer. This investigation addresses two issues that impact their utility: a) appropriate normalization controls and b) whether their altered levels persist in patients who are clinically free of the disease.
Methods
Sera from 40 age-matched healthy women and 39 breast cancer patients without clinical disease at the time of serum collection were analyzed for microRNAs let-7f, miR-16, miR-21 and miR-155 using quantitative real-time PCR. U6 and 5S, which are transcribed by RNA polymerase III (RNAP-III), and the small nucleolar RNA RNU44 (SNORD44) were also analyzed for normalization. Significant results from the initial study were verified using a second set of sera from 15 healthy patients, 15 breast cancer patients without clinical disease and 15 with metastatic disease, and a third set of 12 healthy and 18 patients with metastatic disease. U6 was further verified in the extended second cohort of 75 healthy and 68 breast cancer patients without clinical disease.
Results
U6:SNORD44 ratio was consistently higher in breast cancer patients with or without active disease (fold change range 1.5-6.6, p value range 0.0003 to 0.05). This increase in U6:SNORD44 ratio was observed in the sera of both estrogen receptor-positive (ER+) and ER-negative breast cancer patients. MiR-16 and 5S, which are often used as normalization controls for microRNAs, showed remarkable experimental variability and thus are not ideal for normalization.
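As an illustration of how a ratio such as U6:SNORD44 and its fold change might be derived from quantitative PCR cycle-threshold (Ct) values, the sketch below assumes perfect amplification efficiency (a doubling per cycle). The abstract does not state how the ratios were computed, so the formula, the function names, and the Ct values are assumptions, not the study's procedure.

```python
def relative_ratio(ct_target: float, ct_reference: float) -> float:
    """Target:reference abundance ratio, assuming a doubling per PCR cycle.

    A lower Ct means more starting template, so the ratio grows as the
    target Ct falls below the reference Ct.
    """
    return 2.0 ** (ct_reference - ct_target)


def fold_change(ratio_patient: float, ratio_control: float) -> float:
    """Fold change of the patient ratio relative to the control ratio."""
    return ratio_patient / ratio_control


# Hypothetical Ct values (made up for illustration).
patient_ratio = relative_ratio(ct_target=20.0, ct_reference=24.0)  # U6 vs SNORD44
control_ratio = relative_ratio(ct_target=22.0, ct_reference=24.0)
example_fold = fold_change(patient_ratio, control_ratio)
```

If primer efficiencies differ from 100%, the base of 2 would need to be replaced by the measured efficiency for each assay.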
Conclusions
Elevated serum U6 levels in breast cancer patients, irrespective of disease activity at the time of serum collection, suggest a new paradigm in cancer: persistent systemic changes during cancer progression, which result in elevated activity of RNAP-III and/or the stability/release pathways of U6 in non-cancer tissues. Additionally, these results highlight the need for developing standards for normalization between samples in microRNA-related studies for healthy versus cancer and for inter-laboratory reproducibility. Our studies rule out the utility of miR-16, U6 and 5S RNAs for this purpose.
doi:10.1186/bcr2943
PMCID: PMC3262198
PMID: 21914171
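The U6:SNORD44 ratio and fold changes reported in the abstract above come from quantitative real-time PCR. The abstract does not say how the ratio was computed, so the sketch below assumes the common 2^ΔCt convention with roughly 100% amplification efficiency; the function names and the Ct values in the example are invented for illustration.

```python
from statistics import mean
from typing import Sequence, Tuple

CtPair = Tuple[float, float]  # (Ct of U6, Ct of SNORD44) for one serum sample


def ratio_from_ct(ct_u6: float, ct_snord44: float) -> float:
    """Relative U6:SNORD44 abundance, assuming product doubles every cycle.

    A lower Ct means more template, so the ratio is 2^(Ct_SNORD44 - Ct_U6).
    """
    return 2.0 ** (ct_snord44 - ct_u6)


def fold_change(cases: Sequence[CtPair], controls: Sequence[CtPair]) -> float:
    """Fold change of the ratio between groups via 2^(ΔΔCt) on group means."""
    delta_cases = mean(snord - u6 for u6, snord in cases)
    delta_controls = mean(snord - u6 for u6, snord in controls)
    return 2.0 ** (delta_cases - delta_controls)


if __name__ == "__main__":
    # Made-up Ct values, not data from the study.
    patients = [(24.0, 27.0), (25.0, 28.0)]
    healthy = [(25.0, 27.0), (26.0, 28.0)]
    print(ratio_from_ct(24.0, 27.0))       # ratio for one sample
    print(fold_change(patients, healthy))  # patients relative to healthy
```

Averaging ΔCt before exponentiating gives a geometric-mean fold change; averaging the per-sample ratios instead would be an equally defensible choice and can give a different number.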
We previously showed that alcohol-preferring (P) rats have higher bone density than alcohol-nonpreferring (NP) rats. Genetic mapping in P and NP rats identified a major quantitative trait locus (QTL) between 4q22 and 4q34 for alcohol preference. At the same location, several QTLs linked to bone density and structure were detected in Fischer 344 (F344) and Lewis (LEW) rats, suggesting that bone mass and strength genes might cosegregate with genes that regulate alcohol preference. The aim of this study was to identify the genes segregating for skeletal phenotypes in congenic P and NP rats. Transfer of the NP chromosome 4 QTL into the P background (P.NP) significantly decreased areal bone mineral density (aBMD) and volumetric bone mineral density (vBMD) at several skeletal sites, whereas transfer of the P chromosome 4 QTL into the NP background (NP.P) significantly increased bone mineral content (BMC) and aBMD at the same skeletal sites. Microarray analysis of the femurs using Affymetrix Rat Genome arrays revealed 53 genes that were differentially expressed among the rat strains with a false discovery rate (FDR) of less than 10%. Nine candidate genes were found to be strongly correlated (r² > 0.50) with bone mass at multiple skeletal sites. The top three candidate genes, neuropeptide Y (Npy), α synuclein (Snca), and sepiapterin reductase (Spr), were confirmed using real-time quantitative PCR (qPCR). Ingenuity pathway analysis revealed relationships among the candidate genes related to bone metabolism involving β-estradiol, interferon-γ, and a voltage-gated calcium channel. We identified several candidate genes, including some novel genes on chromosome 4 segregating for skeletal phenotypes in reciprocal congenic P and NP rats. © 2010 American Society for Bone and Mineral Research.
doi:10.1002/jbmr.8
PMCID: PMC3153136
PMID: 20200994
bone mass; congenic; QTL; neuropeptide Y; gene
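The abstract above selects differentially expressed genes at a false discovery rate below 10% without naming the procedure. As one plausible reading, the sketch below applies the Benjamini-Hochberg adjustment to per-gene p-values; the choice of Benjamini-Hochberg, the function names and the placeholder gene labels and p-values are assumptions for illustration only.

```python
from typing import Dict, List, Sequence


def bh_adjust(pvalues: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over this rank and all higher ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_genes(pvals_by_gene: Dict[str, float], fdr: float = 0.10) -> List[str]:
    """Genes whose adjusted p-value falls below the FDR threshold."""
    genes = list(pvals_by_gene)
    qvalues = bh_adjust([pvals_by_gene[g] for g in genes])
    return [g for g, q in zip(genes, qvalues) if q < fdr]


if __name__ == "__main__":
    # Placeholder labels and made-up p-values, not results from the study.
    toy = {"g1": 0.01, "g2": 0.04, "g3": 0.03, "g4": 0.5}
    print(select_genes(toy))
```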
Di Leva, Gianpiero | Gasparini, Pierluigi | Piovan, Claudia | Ngankeu, Apollinaire | Garofalo, Michela | Taccioli, Cristian | Iorio, Marilena V. | Li, Meng | Volinia, Stefano | Alder, Hansjuerg | Nakamura, Tatsuya | Nuovo, Gerard | Liu, Yunlong | Nephew, Kenneth P. | Croce, Carlo M.
Background
Several lines of evidence have suggested that estrogen receptor α (ERα)–negative breast tumors, which are highly aggressive and nonresponsive to hormonal therapy, arise from ERα-positive precursors through different molecular pathways. Because microRNAs (miRNAs) modulate gene expression, we hypothesized that they may have a role in ER-negative tumor formation.
Methods
Gene expression profiles were used to highlight the global changes induced by miRNA modulation of ERα protein. miRNA transfection and luciferase assays enabled us to identify new targets of miRNA 206 (miR-206) and miRNA cluster 221-222 (miR-221-222). Northern blot, luciferase assays, estradiol treatment, and chromatin immunoprecipitation were performed to identify the miR-221-222 transcription unit and the mechanism implicated in its regulation.
Results
Different global changes in gene expression were induced by overexpression of miR-221-222 and miR-206 in ER-positive cells. miR-221 and -222 increased proliferation of ERα-positive cells, whereas miR-206 had an inhibitory effect (mean absorbance units [AU]: miR-206: 500 AU, 95% confidence interval [CI] = 480 to 520; miR-221: 850 AU, 95% CI = 810 to 873; miR-222: 879 AU, 95% CI = 850 to 893; P < .05). We identified hepatocyte growth factor receptor and forkhead box O3 as new targets of miR-206 and miR-221-222, respectively. We demonstrated that ERα negatively modulates miR-221 and -222 through the recruitment of transcriptional corepressor partners: nuclear receptor corepressor and silencing mediator of retinoic acid and thyroid hormone receptor.
Conclusions
These findings suggest that the negative regulatory loop involving miR-221-222 and ERα may confer proliferative advantage and migratory activity to breast cancer cells and promote the transition from ER-positive to ER-negative tumors.
doi:10.1093/jnci/djq102
PMCID: PMC2873185
PMID: 20388878
Shen, Changyu | Huang, Yiwen | Liu, Yunlong | Wang, Guohua | Zhao, Yuming | Wang, Zhiping | Teng, Mingxiang | Wang, Yadong | Flockhart, David A | Skaar, Todd C | Yan, Pearlly | Nephew, Kenneth P | Huang, Tim HM | Li, Lang
Background
Estrogens regulate diverse physiological processes in various tissues through genomic and non-genomic mechanisms that result in activation or repression of gene expression. Transcription regulation upon estrogen stimulation is a critical biological process underlying the onset and progress of the majority of breast cancers. Dynamic gene expression changes have been shown to characterize the breast cancer cell response to estrogens, the molecular mechanism of which is still not well understood.
Results
We developed a modulated empirical Bayes model and constructed a novel topological and temporal transcription factor (TF) regulatory network in the MCF7 breast cancer cell line upon stimulation by 17β-estradiol. In the network, significant TF genomic hubs were identified, including ER-alpha and AP-1; significant non-genomic hubs included ZFP161, TFDP1, NRF1, TFAP2A, EGR1, E2F1, and PITX2. Although the early and late networks were distinct (<5% overlap of ERα target genes between the 4 and 24 h time points), all nine hubs were significantly represented in both networks. In MCF7 cells with acquired resistance to tamoxifen, the ERα regulatory network was unresponsive to 17β-estradiol stimulation. The significant loss of hormone responsiveness was associated with marked epigenomic changes, including hyper- or hypo-methylation of promoter CpG islands and repressive histone methylations.
Conclusions
We identified a number of estrogen-regulated target genes and established an estrogen-regulated network that distinguishes the genomic and non-genomic actions of the estrogen receptor. Many gene targets of this network were no longer active in anti-estrogen-resistant cell lines, possibly because their DNA methylation and histone acetylation patterns have changed.
doi:10.1186/1752-0509-5-67
PMCID: PMC3117732
PMID: 21554733
The region of chromosome 1q33–q54 harbors quantitative trait loci (QTL) for femur strength in COP×DA and F344×LEW F2 rats. The purpose of this study is to identify the genes within this QTL region that contribute to the variation in femur strength. Microarray analysis was performed using RNA extracted from femurs of COP, DA, F344 and LEW rats. Genes differentially expressed in the 1q33–q54 region among these rat strains were then ranked based on the strength of correlation with femur strength in F2 animals derived from these rats. A total of 214 genes in this QTL region were differentially expressed among all rat strains, and 81 genes were found to be strongly correlated (r² > 0.50) with femur strength. Of these, 12 candidate genes were prioritized for further validation, and 8 of these genes (Ifit3, Ppp2r5b, Irf7, Mpeg1, Bloc1s2, Pycard, Sec23ip, and Hps6) were confirmed by quantitative PCR (qPCR). Ingenuity Pathway Analysis suggested that these genes were involved in interferon alpha, nuclear factor-kappa B (NFkB), extracellular signal-related kinase (ERK), hepatocyte nuclear factor 4 alpha (HNF4A) and tumor necrosis factor (TNF) pathways.
doi:10.1016/j.ygeno.2009.05.008
PMCID: PMC3052638
PMID: 19482074
Femur strength; Gene expression; Microarray; QTLs; Osteoporotic fracture
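The abstract above ranks differentially expressed genes by the strength of their correlation with femur strength and keeps those with r² above 0.50. A minimal sketch of that ranking step follows. It assumes each gene has one expression value per group, paired with one phenotype value per group; the abstract does not describe how expression and phenotype were paired, so the data layout, function names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation between two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation signal
    return (sxy * sxy) / (sxx * syy)


def rank_candidates(
    expression: Dict[str, Sequence[float]],
    phenotype: Sequence[float],
    threshold: float = 0.50,
) -> List[Tuple[str, float]]:
    """Genes with r^2 above threshold, strongest correlation first."""
    scored = [(gene, r_squared(values, phenotype)) for gene, values in expression.items()]
    kept = [(gene, r2) for gene, r2 in scored if r2 > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Placeholder gene labels and made-up values, not data from the study.
    strength = [1.0, 2.0, 3.0, 4.0]
    expr = {
        "gene_a": [2.0, 4.0, 6.0, 8.0],
        "gene_b": [4.0, 3.0, 2.0, 1.0],
        "gene_c": [1.0, 3.0, 2.0, 4.0],
        "gene_d": [1.0, -1.0, -1.0, 1.0],
    }
    for gene, r2 in rank_candidates(expr, strength):
        print(gene, round(r2, 2))
```

Because r² discards the sign of the correlation, genes whose expression falls as strength rises rank alongside those that rise with it.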
It is well established that in adults, long-term repopulating hematopoietic stem cells (HSC) are mitotically quiescent cells that reside in specialized bone marrow (BM) niches that maintain the dormancy of HSC. Our laboratory demonstrated that the engraftment potential of human HSC (CD34+ cells) from BM and mobilized peripheral blood (MPB) is restricted to cells in the G0 phase of cell cycle but that in the case of umbilical cord blood (UCB)-derived CD34+ cells, cell cycle status is not a determining factor in the ability of these cells to engraft and sustain hematopoiesis. We used this distinct in vivo behavior of CD34+ cells from these tissues to identify genes associated with the engraftment potential of human HSC. CD34+ cells from BM, MPB, and UCB were fractionated into G0 and G1 phases of cell cycle and subjected in parallel to microarray and proteomic analyses. A total of 484 target genes were identified to be associated with engraftment potential of HSC. Systems biology modeling indicated that the top four signaling pathways associated with these genes are Integrin signaling, p53 signaling, cytotoxic T lymphocyte-mediated apoptosis, and Myc-mediated apoptosis signaling. Our data suggest that a continuum of functions of hematopoietic cells directly associated with cell cycle progression may play a major role in governing the engraftment potential of stem cells. While proteomic analysis identified a total of 646 proteins in analyzed samples, a very limited overlap between genomic and proteomic data was observed. These data provide a new insight into the genetic control of engraftment of human HSC from distinct tissues and suggest that mitotic quiescence may not be the requisite characteristic of engrafting stem cells, but instead may be the physiologic status conducive to the expression of genetic elements favoring engraftment.
doi:10.1371/journal.pone.0017498
PMCID: PMC3049784
PMID: 21408179
Previously, we found that the regions of chromosomes 10q12–q31 and 15p16–q21 harbor quantitative trait loci (QTLs) for lumbar volumetric bone mineral density (vBMD) in female F2 rats derived from Fischer 344 (F344) × Lewis (LEW) and Copenhagen 2331 (COP) × Dark Agouti (DA) crosses. The purpose of this study is to identify the candidate genes within these QTL regions contributing to the variation in lumbar vBMD. RNA was extracted from bone tissue of F344, LEW, COP, and DA rats. Microarray analysis was performed using Affymetrix Rat Genome 230 2.0 Arrays. Genes differentially expressed among the rat strains were then ranked based on the strength of the correlation with lumbar vBMD in F2 animals derived from these rats. Quantitative PCR (qPCR) analysis was performed to confirm the prioritized candidate genes. A total of 285 genes were differentially expressed among all strains of rats with a false discovery rate less than 10%. Among these genes, 18 candidate genes were prioritized based on their strong correlation (r² > 0.90) with lumbar vBMD. Of these, 14 genes (Akap1, Asgr2, Esd, Fam101b, Irf1, Lcp1, Ltc4s, Mdp-1, Pdhb, Plxdc1, Rabep1, Rhot1, Slc2a4, Xpo4) were confirmed by qPCR. We identified several novel candidate genes influencing spinal vBMD in rats.
doi:10.1007/s10142-009-0147-6
PMCID: PMC2835802
PMID: 19841953
Lumbar vBMD; Gene expression; Microarray; QTLs; Osteoporotic fracture