The mechanistic relevance of intergenic disease-associated genetic loci (IDAGL) containing highly statistically significant disease-linked SNPs remains unknown. Here, we present experimental and clinical evidence supporting an important role for IDAGL in human diseases. A targeted RT-PCR screen coupled with sequencing of purified PCR products detects widespread transcription at multiple IDAGL and identifies 96 small noncoding trans-regulatory RNAs of ~100–300 nt in length containing SNPs (snpRNAs) associated with 21 common disorders. Multiple independent lines of experimental evidence support functionality of snpRNAs by documenting their cell type-specific expression and evolutionary conservation of sequences, genomic coordinates and biological effects. Chromatin state signatures, expression profiling experiments and luciferase reporter assays demonstrate that many IDAGL are Polycomb-regulated long-range enhancers. Expression of snpRNAs in human and mouse cells markedly affects cellular behavior and induces allele-specific clinically relevant phenotypic changes: NLRP1-locus snpRNAs rs2670660 exert regulatory effects on monocyte/macrophage transdifferentiation, induce prostate cancer (PC) susceptibility snpRNAs and transform low-malignancy hormone-dependent human PC cells into highly malignant androgen-independent PC. Q-PCR analysis and luciferase reporter assays demonstrate that snpRNA sequences represent allele-specific “decoy” targets of microRNAs that function as SNP allele-specific modifiers of microRNA expression and activity. We demonstrate that trans-acting RNA molecules facilitating resistance to androgen depletion (RAD) in vitro and castration-resistant phenotype (CRP) in vivo of PC contain intergenic 8q24-locus SNP variants (rs1447295; rs16901979; rs6983267) that were recently linked with increased risk of PC.
Q-PCR analysis of clinical samples reveals markedly increased and highly concordant (r = 0.896; p < 0.0001) snpRNA expression levels in tumor tissues compared with the adjacent normal prostate [122-fold and 45-fold in Gleason 7 tumors (p = 0.03); 370-fold and 127-fold in Gleason 8 tumors (p = 0.0001) for NLRP1-locus and 8q24-locus snpRNAs, respectively]. Our experiments indicate that RAD and CR phenotype of human PC cells can be triggered by ncRNA molecules transcribed from the NLRP1-locus intergenic enhancer at 17p13 and by downstream activation of the 8q24-locus snpRNAs. Our results define the IDAGL at 17p13 and 8q24 as candidate regulatory loci of RAD and CR phenotypes of PC, reveal previously unknown molecular links between the innate immunity/inflammasome system and development of hormone-independent PC and identify novel molecular and genetic targets with diagnostic and therapeutic potentials, exploration of which should be highly beneficial for personalized clinical management of PC.
Meta-analysis of genomic coordinates of nearly 800 significant SNP phenotype associations (p < 5 × 10−8) identified in nearly 600 genome-wide association studies (GWAS) of 150 distinct diseases and traits demonstrates that only 12% of disease-linked SNPs identified to date are located within exons of protein-coding genes (or occur in tight linkage disequilibrium with protein-coding regions of genes), and a vast majority (80%) of variations occurs within non-protein-coding sequences.1 These findings are particularly striking, because SNPs in protein-coding regions are heavily over-represented on genotyping arrays,2,3 which further highlights the significant knowledge gap in our understanding of the role of intronic and intergenic sequences in regulation of phenotypes. Intergenic SNPs, which are located in genomic regions distant from known protein-coding genes and microRNAs, comprise ~40% and represent the largest class of SNP variations with significant associations to multiple common human disorders.4 Recently, we reported identification of 13 intergenic small trans-regulatory RNAs (trans-RNAs) containing disease-linked SNPs and manifesting cell type-specific patterns of expression in human cells.4 Functional analysis of one of these trans-RNAs (rs2670660), which is transcribed from the intergenic sequence located in the 17p13 region at ~30 kb distance from the NLRP1 gene and termed NLRP1-locus snpRNA, demonstrates SNP allele-specific biological effects of trans-RNAs on cell cycle progression, differentiation and gene expression in human cells.4 These initial observations indicate that intergenic disease-associated genetic loci (IDAGL) are transcriptionally active and may influence the biological behavior of human cells via noncoding RNA intermediaries.
Several independent GWAS identified prostate cancer susceptibility SNPs within the 8q24 gene desert,5–12 which were subsequently classified into three adjacent genomic regions of increased risk for prostate cancer.13 These regions contain multiple prostate cancer risk SNPs with the most significant convincingly replicated SNPs being rs1447295 in region 1, rs16901979 in region 2 and rs6983267 in region 3.13 Consistent with the hypothesis of biological significance of IDAGL, several groups reported that multiple long-range enhancers are present within genetically defined prostate cancer risk regions of the 8q24 gene desert14–18 and demonstrated functional correlations of enhancer activity with prostate cancer risk SNP sequences.
In this work, we report sequences of 96 snpRNAs transcribed from IDAGL that were discovered in GWAS and linkage studies of common human disorders and replicated in independent cohorts of patients (Supplemental refs. 1–43). Functional analyses of snpRNAs associated with prostate cancer reveal novel genome-wide trans-regulatory networks of noncoding snpRNAs that are transcribed from long-range enhancers and facilitate development of a castration-resistant phenotype of human prostate cancer. Expression level of 8q24-locus prostate cancer (PC) susceptibility snpRNAs is regulated by NLRP1-locus snpRNAs (rs2670660), which are transcribed from the intergenic long-range enhancer sequence located in the 17p13 region at ~30 kb distance from the NLRP1 gene. Highly concordant expression profiles of the NLRP1-locus snpRNAs and 8q24 snpRNAs (r = 0.896; p < 0.0001) in clinical PC samples and experimental evidence of trans-regulatory effects of NLRP1-locus snpRNAs on expression of 8q24-locus snpRNAs indicate that resistance to androgen depletion (RAD) in vitro and the castration resistance (CR) phenotype in vivo of human PC cells can be triggered by noncoding (nc) RNA molecules transcribed from the NLRP1-locus intergenic enhancer and downstream activation of the 8q24-locus snpRNAs. Our experiments indicate that activation of the NLRP1-locus snpRNA/miR-205 axis may contribute to development of clinically significant prostate cancer by reducing expression of the PTEN tumor suppressor.
To date, we analyzed 109 IDAGL using a targeted RT-PCR-based screening protocol that is designed for identification of RNA molecules containing disease-associated SNP sequences.4 We identified 96 trans-regulatory RNAs (snpRNAs) 100 to 300 nucleotides in length containing intergenic SNPs that are associated with 21 common human disorders (Tables S1–9). Molecular identities of discovered RNA molecules were established based on a requirement of reverse transcription for detection of expected size PCR products and confirmed by size correspondence, sensitivity to RNase treatment and resistance to DNase treatment of primary PCR products and nested PCR products, which were derived from second round of amplification of purified primary PCR products. In all instances, molecular identities of discovered RNA molecules were validated by direct sequencing of purified primary PCR products (Tables S6–9).
Considering the evidence of evolutionary conservation as one of the important criteria supporting the hypothesis of functionality of discovered snpRNA molecules, we analyzed a set of 13 IDAGL that generate evolutionarily conserved snpRNAs. Notably, 9 of 13 (70%) evolutionarily conserved intergenic sequences manifest common genomic topologies in the mouse and human genomes; that is, snpRNA-encoding sequences in both species have the same flanking protein-coding genes (Fig. S1). We made use of the extensive genome-wide chromatin domain maps19,20 to assess the chromatin state of evolutionarily conserved IDAGL. This analysis reveals a consensus chromatin signature of evolutionarily conserved snpRNA-encoding IDAGL comprising H3K27Me3, CBP/CREB and POL2 proteins (Fig. S2), which indicates that IDAGL may represent a distinct class of Polycomb-regulated enhancers. Consistent with this notion, many evolutionarily conserved snpRNA-encoding IDAGL display the enhancer-signature H3K4me1 histone marks (Fig. S2), whereas histones H3K4Me3 and H3K36Me3, which represent chromatin signatures of promoters and transcriptionally active sites,21–23 appear less frequently. We confirmed the validity of these findings for both evolutionarily conserved and non-conserved IDAGL by performing the analysis of the chromatin state maps of IDAGL in human embryonic stem cells, ESC (Figs. S1–3 and Table S10). Based on this analysis, we concluded that the nearly ubiquitous chromatin state signature of IDAGL in human ESC consists of histone H3K27me3 and Ezh2 protein (Figs. S2 and S3). In addition to consensus components, the IDAGL chromatin maps display clearly discernible disorder type-specific protein marks that manifest common association patterns for pathogenetically and epidemiologically related disease phenotypes (Fig. S4 and Table S10).
Analysis of ENCODE chromatin state maps in nine human cell lines validates this conclusion and reveals a consensus chromatin signature of IDAGL comprising H3K27Me3 and H3K4Me1 histones, Ezh2 and disease-state-specific parts of transcription factors (genome.ucsc.edu/ENCODE).
Given that SNP phenotype associations can result from tagging and may, therefore, not necessarily indicate the true causal relationships, we sought to undertake the functional analysis of selected snpRNAs to show that snpRNA-mediated phenotype-altering effects on cellular behavior reflect their role in diseases. Functional analysis of human cell lines engineered to constitutively express distinct allelic variants of NLRP1-locus snpRNAs revealed marked changes of biological behavior of cells expressing disease-associated allelic variants of snpRNAs that are consistent with targeted alterations of cell cycle progression/differentiation pathways.4 To ascertain whether changes of cellular functions caused by forced expression of NLRP1-locus snpRNAs are consistent with their suspected role in diseases, we analyzed monocyte/macrophage transdifferentiation of human THP-1 cells constitutively expressing 52 nt NLRP1-locus snpRNAs containing either ancestral, major A allele or disease-linked, minor G allele (Fig. 1). We found that G allele-expressing THP-1 cells, in contrast to control or A allele-expressing cells, undergo massive apoptosis during the differentiation process and are capable of generating 5-fold fewer macrophages compared with A allele-expressing cells (Fig. 1A). Expression of snpRNAs containing ancestral A allele confers protection against apoptosis and facilitates production of nearly 2-fold more macrophages compared with control cells (Fig. 1A). Notably, macrophages derived from A allele-expressing THP-1 cells manifest significantly more potent phagocytic activity compared with control cells or macrophages derived from G allele-expressing THP-1 cells (Fig. 1A, inset).
These phenotypic changes are not due to the generally diminished cellular functions, because G allele-expressing cells manifest a sustained long-term viability in optimal growth conditions and have significantly higher motility compared with the control cultures and A allele-expressing cells (Fig. 1B).
Next, we sought to determine whether the evolutionary conservation of NLRP1-locus snpRNAs is sufficient to facilitate similar biological effects on mouse cells. We found that human NLRP1-locus snpRNAs exert biological effects on mouse macrophages and mesenchymal cells that are strikingly similar to those documented for human cells (Fig. 1C, top parts). Furthermore, the allele-specific biological effects of forced expression of NLRP1-locus snpRNAs mediated by the lentiviral gene transfer can be readily reproduced by transfection of synthetic LNA oligonucleotide analogs of snpRNA molecules (Fig. 1C, bottom parts). Based on these observations, we conclude that substitution of a single nucleotide in snpRNA molecules may cause dramatic alterations of cell fate and result in marked changes of cellular phenotypes of potential clinical relevance.
Q-RT-PCR expression profiling experiments reveal that NLRP1-locus snpRNAs markedly alter the expression of prostate cancer susceptibility (PCS) snpRNAs in human cells (Fig. 2), which suggests that extensive phenotypically relevant regulatory crosstalk may exist between distinct classes of snpRNAs. We noted particularly interesting patterns of association between expression of NLRP1-locus snpRNAs and 8q24-locus PCS-snpRNAs A6, expression of which appears elevated in highly metastatic variants of human prostate cancer cell lines as well as prostate cancer cell lines selected for growth in androgen-free media (Fig. 2). To confirm that observations of clinically relevant biological effects of IDAGL snpRNAs are not limited to the NLRP1-locus snpRNAs, we analyzed the effect of forced expression of distinct allelic variants of PCS-snpRNAs on behavior of hormone-dependent human prostate carcinoma cell line LNCap cultured under various growth conditions. We found that expression of either NLRP1-locus snpRNAs or PCS-snpRNAs has no effect on growth of LNCap cells in adherent cultures (data not shown), whereas the anchorage-independent growth potential of LNCap cells in agar was markedly enhanced by PCS-snpRNAs A6, which was more significant for PCS-snpRNAs containing the prostate cancer susceptibility allele (Fig. 2C). Human prostate carcinoma LNCap cells are hormone-dependent and are not capable of growing in androgen-free media (Fig. 2C). Intriguingly, when we cultured LNCap cells engineered to stably express snpRNAs in androgen-free conditions, we found that expression of either NLRP1-locus snpRNAs or PCS-snpRNAs A6 facilitates hormone depletion-independent growth of LNCap cells (Fig. 2C).
Highly metastatic LNCapLN3 cells that were selected for increased metastatic potential by serial orthotopic re-implantation after recovery from metastatic lesions acquire markedly increased ability to grow in agar and in androgen-depleted media (Fig. 2C, insets). RT-PCR expression profiling analysis reveals that LNCapLN3 cells manifest markedly higher expression level of the PCS-snpRNA A6 (Fig. 2A). To determine whether altered growth properties of the highly metastatic LNCapLN3 cells are associated with increased expression of PCS-snpRNA A6, we engineered LNCapLN3 cell variants stably expressing antisense variants of the PCS-snpRNA A6 (AS-PCS-snpRNA A6). We found that expression of AS-PCS-snpRNA A6 did not affect growth of adherent cultures of LNCapLN3 cells in complete media (data not shown). In contrast, expression of AS-PCS-snpRNA A6 markedly diminishes growth of LNCapLN3 cells in agar cultures and in androgen-depleted media (Fig. 2C, insets).
Parental LNCap cells are poorly tumorigenic in vivo and are not capable of forming tumors in castrated mice. Remarkably, when we inoculated LNCap cells engineered to stably express PCS-snpRNAs A6 (LNCap-A6 cells), all castrated mice developed rapidly growing, large tumors (Fig. 2D), which appear to grow faster in castrated animals compared with sham-operated controls. As expected, no tumors were detected in castrated mice after inoculation of parental or control-transfected LNCap cells. These experiments demonstrate that stable expression of PCS-snpRNA A6 in low-malignancy, hormone-dependent human prostate carcinoma cells confers castration-resistant phenotype and facilitates androgen depletion-independent growth.
To facilitate evaluation of the clinical significance of discovered RNA molecules, we developed and validated a Q-PCR-based method of quantitative analysis of snpRNA expression. Our Q-PCR method of snpRNA expression analysis demonstrates excellent reproducibility for both NLRP1-locus snpRNAs (r² = 0.888 for technical replicates) and 8q24-locus snpRNAs (r² = 0.808 for technical replicates) and a > 8,000-fold dynamic range of detection sensitivity in 3–5 ng of cDNA (Fig. 3C). snpRNA expression analysis in clinical PC samples reveals markedly increased snpRNA expression levels in tumor tissues compared with the adjacent normal prostate (Fig. 3A and B). Notably higher expression levels of snpRNAs in human prostate adenocarcinoma samples and apparent association of increased snpRNA expression with pathohistological features of clinically significant disease [122-fold and 45-fold in stage 2, Gleason 7 tumors (p = 0.03); 370-fold and 127-fold in stage 2, Gleason 8 tumors (p = 0.0001) for NLRP1-locus and 8q24-locus snpRNAs, respectively; Fig. 3C] underscore potential translational relevance of our findings. Unexpectedly, we found a remarkably high correlation (r = 0.896; r² = 0.803; p < 0.0001) of expression levels of the NLRP1-locus snpRNAs and 8q24 RAD-locus snpRNAs in clinical PC samples (Fig. 3C). We conclude that Q-PCR analysis of clinical samples reveals pathophysiological relevance of discovered snpRNA molecules by demonstrating (1) markedly increased expression levels in prostate cancer; (2) association of increased expression levels with pathohistological features of the clinically aggressive disease; (3) highly concordant expression profiles of snpRNAs transcribed from genetic loci located at distinct chromosomes (17p13 and 8q24).
Collectively, these data indicate the presence of a novel trans-regulatory pathway that involves increased expression of intergenic snpRNAs transcribed from 17p13 and 8q24 loci and suggest co-selection of these regulatory features in clinically significant prostate cancer. Our clinical and experimental findings, presented in Figures 2 and 3, indicate that prostate cancer cells that will emerge in individuals expressing high levels of the prostate cancer susceptibility snpRNAs are more likely to acquire the intrinsic ability to grow in androgen-low or androgen-free conditions, and, therefore, this type of prostate cancer is more likely to progress early to hormone-independent, incurable metastatic disease.
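The arithmetic behind fold-change and correlation figures of this kind can be sketched briefly. The example below assumes the commonly used 2^-ΔΔCt method for relative quantification, which the text does not state was the method applied here; the Ct values, the reference gene and the function names are hypothetical illustrations. The last helper simply checks that a correlation coefficient of r = 0.896 corresponds to r² ≈ 0.803.

```python
def fold_change_ddct(ct_target_tumor, ct_ref_tumor,
                     ct_target_normal, ct_ref_normal):
    """Relative expression (tumor vs. adjacent normal) by the 2^-ddCt method.

    Assumption: both amplicons amplify with ~100% efficiency, so one
    threshold cycle corresponds to a 2-fold difference in template.
    """
    delta_tumor = ct_target_tumor - ct_ref_tumor      # normalize to reference gene
    delta_normal = ct_target_normal - ct_ref_normal
    return 2.0 ** -(delta_tumor - delta_normal)


def r_squared(r):
    """Coefficient of determination for a simple linear correlation."""
    return r * r


# Hypothetical Ct values: the target crosses threshold 7 cycles earlier in tumor.
example_fold = fold_change_ddct(22.0, 18.0, 29.0, 18.0)
print(example_fold)
print(round(r_squared(0.896), 3))
```

Under these assumptions, a 7-cycle shift corresponds to a 128-fold difference, which is the order of magnitude of the tumor-versus-normal differences reported for Gleason 7 tumors.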
Several intergenic PC-risk SNPs were mapped to functional enhancer elements in human cells,14–18 suggesting that snpRNAs are products of transcriptional activity of long-range enhancers. Consistent with this idea, RNA polymerase II (RNAPII) binding to thousands of enhancers and widespread transcription of neuronal activity-regulated enhancers in mice have been documented recently.19 RNAPII at enhancers transcribes bi-directionally to generate small non-coding enhancer RNAs (eRNAs), synthesis of which appears to require the engagement of the enhancers with promoters of target protein-coding genes.19 We sought to investigate whether the NLRP1-locus IDAGL may exert a detectable allele-specific functional effect on transcription by cloning distinct allelic variants of the chemically synthesized 2 kb NLRP1 IDAGL region into the vector containing the firefly luciferase reporter, transfecting the purified experimental and control plasmids into target cells and measuring the levels of luciferase activity. To control for transfection efficiency, the IDAGL-containing firefly luciferase reporter plasmids were co-transfected with plasmids containing a renilla luciferase reporter. We found that the presence of distinct allelic variants of IDAGL sequences significantly alters the expression of firefly luciferase (Fig. 4), which suggests that IDAGL may function as SNP allele-specific intergenic enhancers/insulators and affect the transcription of protein-coding genes. DNA/RNA complementarity rules suggest that snpRNAs may affect the activity of corresponding enhancer elements. Consistent with this idea, both enhancer and insulator activities of the NLRP1-locus IDAGL are regulated by NLRP1-locus snpRNAs (Fig. 4). The enhancer/insulator functions of IDAGL/snpRNA feedback regulatory loops appear specific, because they are dependent on the SNP allele status of both DNA and RNA sequences (Fig. 4 and data not shown).
Collectively, results of ChIP-seq experiments and luciferase reporter assays are consistent with the hypothesis that many snpRNAs are products of transcriptional activity of intergenic long-range enhancers.
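The normalization step of the dual-reporter assay described above can be illustrated with a minimal sketch. It assumes the conventional approach of dividing each well's firefly signal by its co-transfected renilla signal and then expressing the test construct relative to the control construct; the luminescence readings and function names are hypothetical and are not taken from Fig. 4.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Per-well firefly/renilla ratios.

    The renilla signal serves as the internal control for transfection
    efficiency, so the ratio is comparable across wells.
    """
    if len(firefly) != len(renilla):
        raise ValueError("firefly and renilla readings must be paired per well")
    return [f / r for f, r in zip(firefly, renilla)]


def relative_activity(test_ratios, control_ratios):
    """Mean normalized activity of a test construct relative to control."""
    return mean(test_ratios) / mean(control_ratios)


# Hypothetical triplicate readings (arbitrary luminescence units).
control = normalized_activity([1000.0, 1200.0, 800.0], [500.0, 600.0, 400.0])
allele_variant = normalized_activity([3000.0, 2400.0, 3600.0], [500.0, 400.0, 600.0])
print(relative_activity(allele_variant, control))
```

A ratio above 1 would be read as enhancer-like activity of the cloned variant and a ratio below 1 as insulator-like activity, subject to appropriate replicate statistics.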
Analysis of the predicted secondary structures of identified RNA molecules reveals that one of the notable common features of snpRNAs secondary structures is the presence of loop sequences that contain SNP-bearing 8–11 nt segments identical to primary sequences of microRNAs (Fig. S5). Intergenic snpRNA sequences contain multiple potential target sites for microRNAs, which are often clustered around SNP nucleotides (Fig. S5). These data suggest that an epigenetic regulatory crosstalk between snpRNAs and microRNAs may exist with the potential downstream effect on expression of protein-coding genes. Analysis of human cell lines engineered to stably express distinct allelic variants of the NLRP1-locus snpRNAs confirmed the validity of this hypothesis.4 Expression profiling experiments identify 36 microRNAs differentially regulated in BJ1 cells expressing distinct allelic variants of the NLRP1-locus snpRNAs (Figs. 5 and S6). Analysis of genomic coordinates reveals that genes encoding 18 of 36 (50%) NLRP1-locus snpRNA-regulated microRNAs are located within ~200 kb regions on 14q32, which is immediately adjacent to the long noncoding RNA gene, MEG3 (Figs. S6 and S7). Thus, our analysis demonstrates that principal molecular targets for allele-specific trans-regulatory actions of NLRP1-locus snpRNAs are clusters of microRNAs within a 200 kb segment of the 14q32 region, which is adjacent to the snpRNA-targeted long noncoding RNA gene, MEG3. Detailed exploration of individual snpRNA-regulated transcripts shows that expression levels of distinct classes of noncoding RNAs are significantly altered in human cells engineered to stably express NLRP1-locus snpRNAs (Figs. 5E and S6). 
These results indicate that one of the important epigenetic features of snpRNA-mediated regulatory effects is genome-wide changes in expression of multiple diverse classes of noncoding RNAs, documented examples of which include snoRNAs and snoRNA-host genes (SNORD113; SNHG1; SNHG3; SNHG8); long noncoding RNAs (MEG3, HOTAIR, tncRNA and MALAT1); microRNAs, microRNA-precursors and protein-coding genes introns of which host microRNA genes (ATAD2; KIAA1199; Figs. 5E and S6).
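The notion of potential microRNA target sites within an snpRNA sequence can be made concrete with a simple seed-match scan. The sketch below is an assumption-laden illustration rather than the prediction procedure used in this work: it looks only for perfect Watson-Crick complements of the microRNA seed (nucleotides 2–8), and both sequences are toy examples, not sequences from Fig. S5.

```python
# Watson-Crick complement table for RNA (G:U wobble pairs are ignored here).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna):
    """Reverse complement of an RNA sequence written 5'->3'."""
    return rna.translate(COMPLEMENT)[::-1]


def seed_sites(target, mirna, seed_start=1, seed_end=8):
    """0-based start positions in `target` that perfectly pair with the
    microRNA seed (default: nucleotides 2-8 of the microRNA)."""
    seed = mirna[seed_start:seed_end]
    site = reverse_complement(seed)
    return [i for i in range(len(target) - len(site) + 1)
            if target.startswith(site, i)]


def sites_near_snp(sites, snp_pos, site_len=7, window=10):
    """Sites whose span lies within `window` nt of the SNP position."""
    return [s for s in sites
            if s - window <= snp_pos <= s + site_len - 1 + window]


toy_mirna = "UGAGGUAGUAGGUUGUGUGGUU"    # hypothetical 22-nt microRNA
toy_target = "AAACUACCUCGGGCUACCUCAA"   # hypothetical snpRNA fragment
print(seed_sites(toy_target, toy_mirna))
```

Filtering the hits with `sites_near_snp` gives a crude way to ask whether candidate sites cluster around the SNP nucleotide; thermodynamic scoring of the full duplex would be needed for anything beyond a first pass.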
To understand the molecular basis of the regulatory crosstalk between snpRNAs and a diverse network of noncoding RNAs, we performed sequence homology profiling and structure-functional analysis of relevant molecular entities. This analysis shows that snpRNA-regulated noncoding RNAs manifest discernible homology/complementarity features of primary sequences. Notably, all 36 NLRP1-locus snpRNA-regulated microRNAs have at least one potential target site within 152 nt sequence of the NLRP1-locus snpRNA molecule, and many microRNA target sites manifest allele-associated changes of the minimal free energy (mfe) of snpRNA/microRNA hybridization (Figs. 5 and S5–7). Comparisons of the allele-associated changes of the mfe values and experimentally defined changes of the microRNA expression levels reveal a highly significant inverse correlation between these two variables; that is, lower mfe values appear to correspond to higher levels of microRNA expression (Figs. 5 and S8). These results suggest a model of snpRNA-mediated regulation of microRNA expression according to which high affinity (low mfe) snpRNA alleles would facilitate increased abundance levels of corresponding microRNAs (Figs. 5 and S8).
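The inverse relationship between allele-associated mfe changes and microRNA expression changes can be expressed as a correlation test. The sketch below computes a Pearson coefficient from paired values; the choice of Pearson correlation, the variable names and all numbers are assumptions made for illustration and do not reproduce the data in Figs. 5 and S8.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two paired sequences of length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-microRNA values:
# delta_mfe  = mfe(G allele) - mfe(A allele), in kcal/mol
# log2_ratio = log2(expression in G-allele cells / expression in A-allele cells)
delta_mfe = [-3.1, -1.8, -0.4, 0.9, 2.2, 3.5]
log2_ratio = [1.6, 0.9, 0.3, -0.2, -1.1, -1.4]

r = pearson_r(delta_mfe, log2_ratio)
print(r < 0)  # a negative r means lower mfe pairs with higher expression
```

A rank-based coefficient would be a reasonable alternative if the relationship is monotonic but not linear.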
Recent experiments demonstrate that let-7 microRNA release from complexes with Argonaute proteins and subsequent degradation can both be blocked by addition of microRNA target RNA, which results in increased let-7 microRNA levels.24 Computer modeling experiments demonstrate that let-7b microRNA follows the pattern of allele-associated mfe changes characteristic of microRNAs, expression levels of which are lower in G allele-expressing cells (Figs. 5 and S6D). We reasoned that if the let-7 bioactivity model is valid for snpRNA-mediated effects on microRNAs, we should be able to detect the corresponding snpRNA allele-context-specific changes of the let-7b expression and activity in NLRP1-locus snpRNA-expressing cells. Consistent with this hypothesis, Q-PCR experiments and luciferase reporter assays demonstrate that both expression and activity of the let-7 microRNA are significantly modified in RWPE1 cells genetically engineered to stably express NLRP1-locus snpRNAs (Figs. 5 and S8D). We confirmed the validity of these conclusions by documenting similar relationships between snpRNA allele-context-specific mfe changes and effects on microRNA expression and activity for miR-205 microRNA (Figs. 5 and S8D, bottom parts). Based on these observations, we propose that one of the mechanisms of snpRNA-mediated effects on microRNA-regulated processes is associated with snpRNA allele-specific changes of microRNA abundance level and activity. The underlying molecular events are likely to resemble the let-7 bioactivity model, according to which interaction of RNA molecules with microRNAs interferes with microRNA release from complexes with Argonaute proteins and prevents subsequent degradation of microRNA.24
Microarray analysis demonstrates that epigenetic reprogramming of human cells following activation of the NLRP1-locus snpRNA-regulated network of noncoding RNAs involves thousands of downstream protein-coding targets, including 586 transcripts encoded by Polycomb-regulated bivalent chromatin domain genes4 and 209 mRNAs encoded by human homologs of mouse genes flanking intergenic enhancers (data not shown). Collectively, these results as well as functional data support the model of sequential regulatory pathway of NLRP1-locus snpRNAs > microRNAs > protein-coding transcripts. Further analysis of the regulatory circuitries of this network reveals markedly altered expression of prostate susceptibility snpRNAs in cell lines genetically engineered to stably express either NLRP1-locus snpRNAs or snpRNA-regulated microRNAs (Figs. 2A, 2B and S8). Results of these experiments demonstrate that:
Interestingly, forced expression of miR-205 recapitulates many molecular, phenotypic and clinical features associated with expression of NLRP1-locus snpRNAs (Fig. 6), including markedly decreased expression of the PTEN tumor suppressor. These data in conjunction with the experimental evidence of NLRP1-locus snpRNA-induced expression and activity of miR-205 in human cells (Fig. 5) indicate that activation of the NLRP1-locus snpRNA/miR-205 axis may contribute to development of clinically significant prostate cancer.
GWAS identify highly significant SNP phenotype associations (p < 5 × 10−8), the vast majority of which (80%) occur within non-protein-coding sequences.1 These consistent and reproducible findings highlight a major knowledge gap in our understanding of phenotype-defining functions of human genome segments lacking protein-coding potentials. Our experiments reveal widespread transcription at IDAGL, raising the possibility that non-protein-coding RNA molecules may play an important role in predisposition to multiple common human disorders. We demonstrate that forced expression of 52 nt snpRNAs imposes a castration-resistant phenotype on human prostate carcinoma cells. It transforms low-malignancy, hormone-dependent human prostate cancer cells into highly malignant, androgen depletion-independent prostate cancer. To facilitate the assessment of clinical significance of discovered snpRNAs, we developed a Q-PCR method of quantitative analysis of snpRNAs and validated its utility by analysis of snpRNA expression in clinical samples. Our analysis reveals markedly elevated snpRNA expression levels in prostate cancer tissues compared with the adjacent normal prostate (Fig. 3). Notably higher expression levels of snpRNAs in human prostate adenocarcinoma samples and apparent association of increased snpRNA expression with pathohistological features of clinically significant disease (high Gleason score) highlight potential translational relevance of our findings. Collectively, our data imply that prostate cancer cells that emerge in individuals expressing high levels of the prostate cancer susceptibility snpRNAs are more likely to progress early to hormone-independent, incurable, metastatic disease.
Our work defines the intergenic 8q24 region as a RAD-regulatory locus of critical significance for human prostate cancer, reveals previously unknown molecular links between the innate immunity/inflammasome system and development of hormone-independent PC and identifies novel diagnostic and therapeutic targets, successful validation of which should be highly beneficial for clinical management of prostate cancer patients.
Our experiments demonstrate that IDAGLs represent multifunctional genomic trans-regulatory domains possessing a broad range of intrinsic regulatory functions that are mediated by both DNA sequences and transcribed RNA molecules. Many IDAGLs harbor a consensus chromatin signature comprising H3K27Me3 and H3K4Me1 histones, Ezh2 and disease-state-specific parts of transcription factors. IDAGL's functions as cell type-specific, long-range enhancers or insulators appear dependent on the allelic status of a disease-linked SNP and are regulated by snpRNAs. Our experiments indicate that microRNAs that have complementary sequences in corresponding snpRNAs may constitute one of the primary targets of snpRNA-induced genomewide epigenetic regulatory networks, engagement of which is triggered by distinctive single-base-level molecular recognition events (Figs. 7 and S8). Altered microRNA expression and activity would facilitate an epigenetic amplification of a single-base-driven regulatory event by inducing downstream mRNA expression changes of many (perhaps, thousands) protein-coding genes which would ultimately cause clinically significant alterations of cellular functions. Examples of experimentally identified components of such a regulatory network are key inflammasome/innate immunity pathway-related genetic targets.4 In agreement with this mechanism, we found markedly altered expression of prostate cancer susceptibility snpRNAs in cell lines genetically engineered to stably express either NLRP1-locus snpRNAs or snpRNA-regulated microRNAs (Figs. 2 and S8). Further, our data suggest that microRNAs may contribute to biogenesis of snpRNAs by guiding Argonaute family endonucleases to execute a sequence-specific cleavage of snpRNAs and putative small snpRNA-precursors, long noncoding snpRNAs. 
Consistently, we found that many small snpRNAs exhibit cell type specific expression profiles, whereas long noncoding snpRNAs containing disease-associated SNP sequences manifest more ubiquitous expression patterns.4 These observations indicate that small snpRNAs may represent products of a cell type-specific processing of long noncoding snpRNAs and support the hypothesis that microRNAs are intrinsic regulatory components of snpRNA/enhancer IDAGL networks that contribute to maintenance of epigenetic regulatory state in a cell.
Our experiments indicate that activation of the NLRP1-locus snpRNA/miR-205 axis may contribute to development of clinically significant prostate cancer by reducing expression of the PTEN tumor suppressor. We found that NLRP1-locus snpRNAs induce expression and activity of miR-205 in human cells (Fig. 5) and forced expression of miR-205 recapitulates many molecular, phenotypic and clinical features associated with expression of NLRP1-locus snpRNAs (Fig. 6), including markedly decreased expression of the PTEN tumor suppressor. PTEN tumor suppressor has been identified as a target for miR-205 based on target prediction algorithms and miR-205 overexpression experiments using the pGL3-PTEN 3′UTR reporter constructs with mutations of miR-205-binding sites, microarray and TaqMan Q-PCR analyses and protein gel blot analysis.25 Altered expression and activity of miR-205 have been associated with epithelial-to-mesenchymal transition (EMT), emergence of stem cell-like properties and maintenance of mammary epithelial cell progenitors.25–27
Results of our experiments are highly consistent with the emerging concept of the pervasive, global transcriptional activity of human genomes, which is supported by observations that the vast majority of transcripts in human cells are noncoding RNAs (refs. 28–37; see Sup. Material for discussion and additional references). Collectively, they lend credence to the idea that intergenic DNA sequence variations may contribute to disease pathogenesis via noncoding RNA intermediaries that exert trans-regulatory effects on the epigenetic regulatory circuitry of a cell. This concept challenges the dominant position of the protein-centric experimental paradigm, which is focused on analysis of the effects of genetic variations on protein-coding genes within or near the boundaries where the genetic variants are located. Conclusive evidence of ubiquitous expression in human cells of intergenic disease-associated genetic loci indicates the potential clinical benefits of concurrent analyses of DNA sequences and expression profiling of the corresponding RNA products. This is critically important because transcriptional activity and transcript abundance levels are genetically defined, quantitative traits, assessment of which should enhance the disease risk prediction power of corresponding genetic tests. We anticipate that the findings reported here should have major near-term implications for the design and execution of genome-wide association studies and follow-up mechanistic studies of non-protein-coding disease-linked loci. To this end, we provide sequences of validated primers and experimental protocols (Tables S1–5) to facilitate the immediate implementation of this type of analysis for 96 intergenic SNPs associated with increased risk of developing 21 common human disorders.
Experimental progress in defining candidate SNP variations associated with human disorders is rapidly generating leads with potential clinical relevance. Analysis of defined SNP variations appears to distinguish distinct autoimmune disorders and has prognostic significance in leukemia and lymphoma.44,45 This progress is not limited to the investigations of disease phenotypes. Aging and longevity phenotypes in human populations have been associated with multiple SNP variations.46–48 Novel conceptual principles and comprehensive analytical approaches are beginning to emerge that signify growth and maturity of this exciting field. There is increasing understanding of a requirement for systematic, in-depth analysis of the functional and clinical significance of non-protein-coding risk regions, particularly with respect to the cancer risk loci.49,50 Rapidly emerging experimental and clinical evidence supports the growing recognition of the important roles of microRNAs and other classes of noncoding RNAs in human diseases.51,52 Based on theoretical considerations as well as computational and bioinformatics analyses, we proposed a disease phenocode hypothesis that integrates the potential mechanistic relationships between structural features and gene expression patterns of disease-linked SNPs, microRNAs and mRNAs of protein-coding genes in association with phenotypes of multiple common human disorders.53–58 In this context, our work documents several important steps toward critical experimental testing and validation of the potential practical utility and clinical relevance of the disease phenocode hypothesis.
Detailed description of all aspects of the experimental procedures, including snpRNAs and microRNA isolation and activity analysis; microRNA expression analysis; cell staining and flow cytometry; induced differentiation of THP-1 cells; lentivirus production and generation of stably transfected BJ1, RWPE1, U937 and THP-1 cells; colony growth and clonogenic growth assays can be found in the online Supplement and previous publications.4 Supplemental information is available at www.genlighttechnology.com
Sub-confluent monolayers of control LNCaP or LNCaP-A6 cells were trypsinized, counted and then resuspended in RPMI-1640 medium with 25% charcoal-dextran stripped fetal bovine serum (CDS-FBS) and 50% (vol/vol) 2x Matrigel (Millipore, Inc.). Aliquots (250 µl) containing 10^6 cells were then injected subcutaneously in the flanks of 8-week-old male NCR Nu/Nu mice (Taconic, Inc.) that had been either surgically castrated 10 d prior through a scrotal incision under anesthesia or sham-castrated, which involved the scrotal incision without removal of the testis. Tumor size was estimated from largest diameter (a), smallest diameter (b) and height (c) measurements with calipers, with tumor volume calculated using the formula V = π/6 × a × b × c. Mice were euthanized at the end of the experiments; tumors were surgically recovered, weighed and processed for pathological and molecular analyses.
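As an illustration of the volume formula above, the following minimal Python sketch computes V = π/6 × a × b × c from three caliper measurements. The function name and the millimeter units are assumptions made for the example; the text does not state the measurement units.

```python
import math


def tumor_volume(a, b, c):
    """Ellipsoid approximation of tumor volume: V = pi/6 * a * b * c.

    a: largest diameter, b: smallest diameter, c: height.
    All three values must share one length unit (mm is assumed here),
    so the result is in that unit cubed (mm^3).
    """
    if min(a, b, c) < 0:
        raise ValueError("caliper measurements must be non-negative")
    return math.pi / 6.0 * a * b * c


if __name__ == "__main__":
    # Hypothetical measurements, not data from this study.
    print(round(tumor_volume(12.0, 8.0, 6.0), 1))
```

Because the formula is a product of the three measurements, a relative error in any single caliper reading carries over proportionally to the estimated volume.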
Primary data sets of SNPs for meta-analysis of genomic coordinates of SNP variations identified in genome-wide association studies (GWAS) of up to 712,253 samples comprising 221,158 disease cases, 322,862 controls and 168,233 case/control subjects of obesity GWAS were obtained from previously published studies (references in online Sup. Material). Mapping of the SNP genomic coordinates was performed based on the NCBI release of Human Genome Build 36.3 (reference assembly). Genomic coordinates of the human K4-K36 domains and human lincRNAs are available in the online Supplemental Data Set (Sup. Methods, ref. 7). Genomic coordinates and gene names of the human bivalent domain genes were obtained from a recently published study (Sup. Methods, ref. 8).
Gene expression analysis experiments, including their technical and analytical aspects and stringent QC and statistical protocols, were carried out essentially as described in our published work.39–42 Briefly, the array hybridization and processing, data retrieval and analysis were performed using standard sets of Affymetrix equipment, software and protocols in a state-of-the-art Affymetrix microarray core facility. RNA was extracted from cell cultures of two independent biological replicates of each experimental condition and analyzed for sample purity and integrity using a BioAnalyzer (Agilent). Expression analysis of 54,675 transcripts was performed for each sample in duplicate using Affymetrix HG-U133A Plus 2.0 arrays. Data retrieval and analysis were performed using MAS5.0 software, and concordant changes of gene expression for each experimental condition were determined at the statistical threshold of p < 0.05 (two-tailed t-test). All microarray analysis data are publicly available coincident with the date of publication.
microRNA was extracted from adherent cells lysed on culture plates using the mirVana miRNA Isolation kit (Ambion). Homogenized cell lysates were frozen at −80°C for at least 24 h prior to microRNA purification. miRNA concentration was checked using a NanoDrop (Thermo Scientific) before checking quality on a Bioanalyzer (Agilent Technologies).
To assay the activity of microRNAs in transfected cells, we used a miRNA Luciferase Reporter Vector (Signosis) specific for the microRNA of interest. The target site sequence of the reporter vector is complementary to the miRNA; therefore, a decrease in luciferase signal indicates an increase in microRNA activity. Cells were transfected with the reporter vector using FuGENE 6 Transfection Reagent (Roche); the transfection was allowed to run for 48 h before the cells were lysed using Luciferase Cell Culture Lysis Reagent (Promega). The lysates were read using the FLUOstar OPTIMA system (BMG Lab Technologies), with 20 µl of Luciferase Assay Reagent (Promega) injected into each well immediately prior to reading.
To analyze a spectrum of miRNA activity in the infected cell lines, we performed qPCR using the TaqMan Human MicroRNA Array v1.0 (Applied Biosystems) run on the 7900HT Fast Real-Time PCR System fitted with the specific block to run 384-well TaqMan Low Density Arrays (Applied Biosystems). This TaqMan array is distributed on a microfluidic card, which allows for high reproducibility with minimal error. The array contains 365 different human miRNA assays and two small nucleolar RNAs that function as endogenous controls for data normalization. All miRNA samples were analyzed for quality control and processed at the Functional Genomics Core of the University of Rochester in Rochester, New York. We used the SDS 2.2 software, the platform for the computer interface with the 7900HT PCR System, to generate normalized data, compare samples and calculate relative quantity (RQ) values.
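For readers who wish to reproduce RQ values outside the instrument software, the sketch below applies the standard comparative Ct (2^-ΔΔCt) calculation. It is an illustration only: it assumes roughly 100% amplification efficiency, it averages the Ct values of the endogenous controls as one possible normalization choice, and the function name and input values are hypothetical rather than taken from the SDS 2.2 implementation.

```python
from statistics import mean


def relative_quantity(ct_target_sample, ct_controls_sample,
                      ct_target_calibrator, ct_controls_calibrator):
    """Comparative Ct estimate of relative quantity: RQ = 2 ** (-ddCt).

    ct_controls_* are lists of Ct values for the endogenous controls
    (for example, the two small nucleolar RNAs on the array); they are
    averaged here, which is an assumption of this sketch.
    """
    d_ct_sample = ct_target_sample - mean(ct_controls_sample)
    d_ct_calibrator = ct_target_calibrator - mean(ct_controls_calibrator)
    dd_ct = d_ct_sample - d_ct_calibrator
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target miRNA crosses threshold three
    # cycles earlier (after normalization) in the sample than in the
    # calibrator, which corresponds to an 8-fold higher relative quantity.
    rq = relative_quantity(22.0, [18.0, 20.0], 25.0, [18.0, 20.0])
    print(rq)
```

An RQ above 1 indicates higher expression in the test sample than in the calibrator, and an RQ below 1 indicates lower expression.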
(Detailed descriptions of protocols A–E are presented in the Supplement)
Luciferase reporter assays were performed to identify allele-specific features of SNP-bearing RNA and DNA sequences, including the enhancer/insulator activities of the 2 kb intergenic DNA sequence containing distinct allelic variants of the rs2670660 NLRP1-locus SNP. Luciferase reporter assays to assess the enhancer activity of the 2 kb NLRP1-locus intergenic region were performed on RWPE1 and HEK293 cells transiently expressing either control luciferase reporter plasmids or plasmids containing chemically synthesized 2 kb sequences flanking distinct allelic variants of the NLRP1-locus intergenic SNP rs2670660, which is positioned exactly in the middle of the enhancer's sequence. Because allelic differences in the luciferase reporter assays tend to be modest, measurements were controlled by multiple replications of all experimental elements of the enhancer assay. We considered multiple potential sources of variation, including plasmid preps and transfection replicates (along with transfection efficiency measurements, such as Renilla luciferase co-transfection controls used for normalization). At least three independent experiments were performed for each setting and three replicates of the luciferase assay were performed in each experiment. Results were validated using at least two independent plasmid preps to control for this potentially significant source of variation.
Cells were stained at a concentration of 1 × 10^6 cells per 100 µl of HBSS with 2% HICS. Antibodies at appropriate dilutions (CD14-Pacific Blue, Biolegend, Inc. and CD11b-Alexa Fluor® 647, Biolegend, Inc.) were added. Staining was performed for 30 min with rotation at 4°C. Cells were then washed with staining medium three times and re-suspended in staining medium. The stained specimens were then analyzed using FACSVantage (BD Biosciences; www.bdbiosciences.com) or FACSAria with either Diva or CellQuest software (BD Biosciences). The cell counter of the flow cytometers was used to determine cell numbers. Cells were collected into HBSS with 2% HICS.
Approximately 2 × 10^6 THP-1 cells (5 × 10^5 cells/ml) in a 25 cm² flask were induced to differentiate by treatment with 20 µM PMA (Sigma-Aldrich) for 4 d.
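The replicate design described above can be summarized with a simple normalization scheme. The Python sketch below divides each firefly reading by its Renilla co-transfection control and then expresses the mean normalized signal of a test construct relative to the control plasmid. The function names and readings are hypothetical, and the calculation is one common way to process dual-reporter data, not necessarily the exact procedure used in this study.

```python
from statistics import mean


def normalized_activity(firefly, renilla):
    """Per-well firefly/Renilla ratios to correct for transfection efficiency."""
    if len(firefly) != len(renilla):
        raise ValueError("each firefly reading needs a matching Renilla reading")
    return [f / r for f, r in zip(firefly, renilla)]


def fold_over_control(test_ratios, control_ratios):
    """Mean normalized signal of the test construct relative to the control."""
    return mean(test_ratios) / mean(control_ratios)


if __name__ == "__main__":
    # Hypothetical triplicate readings from a single experiment.
    control = normalized_activity([1000.0, 1100.0, 900.0], [500.0, 550.0, 450.0])
    allele_a = normalized_activity([2600.0, 2400.0, 2500.0], [520.0, 480.0, 500.0])
    print(fold_over_control(allele_a, control))
```

Computing the fold change separately for each independent experiment and each plasmid prep, and only then comparing alleles, keeps those sources of variation visible instead of pooling them into one average.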
Allele-specific sense and antisense variants of the rs2670660 sequence of 52 nt in length (Fig. 1A in ref. 4; nucleotide sequence shown in shaded box) were chemically synthesized and cloned sequentially into pUC57 plasmid by EcoRV (GeneScript Corporation) and pCDH-CMV-MCS-EF1-copGFP plasmid by EcoRI and NotI (SystemsBio). The integrity and molecular identity of the synthetic sequences as well as the designed plasmid vectors were monitored by restriction enzyme-mapping analysis and direct sequencing. Lentiviruses were generated by co-transfecting pLentiviral vector with GFP-only plasmids (control cultures) or GFP plasmids with synthetic allele-specific 52 nt sequences of the SNP rs2670660, as shown in Figure 1 (ref. 4 and Sup. Material for additional information), and packaging mix (Invitrogen) into 293FT cells using Lipofectamine 2000 according to the manufacturer's instructions (Invitrogen). Target cells were then infected with viral supernatant for 24 h. Flow cytometry analysis for GFP expression was performed to confirm the infection and assess the transfection efficiency. Experiments were performed using cultures with transfection efficiency > 90%.
Cells from sub-confluent cultures (~70% confluence) were seeded in triplicate into 6-well plates (100 cells per well), cultured for 2 weeks and then stained with 0.1% crystal violet for 5 min. Plates were scanned, and the number of colonies containing > 50 cells was counted.
Detailed protocols for data analysis and documentation of the sensitivity, reproducibility and other aspects of the quantitative statistical microarray analysis using Affymetrix technology have been reported previously in references 39–42. Forty to 60 percent of the surveyed genes were called present by Affymetrix Microarray Suite version 5.0 software in these experiments. The concordance analysis of differential gene expression across the data sets was performed using Affymetrix MicroDB version 3.0 and DMT version 3.0 software as described previously in references 39–42. We processed the microarray data using the Affymetrix Microarray Suite version 5.0 software and performed statistical analysis of the expression data set using the Affymetrix MicroDB and Affymetrix DMT software. The Pearson correlation coefficient for individual test samples and the appropriate reference standard were determined using GraphPad Prism version 4.00 software (GraphPad Software). Two-tailed t-test or Fisher's exact tests were employed to estimate the statistical significance of the differences between various experimental conditions and corresponding controls. We calculated the significance of the overlap between the lists of differentially regulated genes and other experimental end-points by using the hypergeometric distribution test.43 Analyses of the predicted secondary structures of identified snpRNAs, screening for potential target sites of microRNAs and sequence homology profiling of snpRNAs and microRNAs were performed using web-accessible, publicly available resources (rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi; www.ncrna.org/centroidfold; bioinfo.uni-plovdiv.bg/microinspector; www.mirbase.org).
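To make the overlap test concrete, the following Python sketch computes the upper-tail hypergeometric probability of observing at least a given number of shared genes between two lists drawn from a common gene universe. The function name, argument names and example numbers are hypothetical (only the 54,675 transcript count comes from the array description), and the sketch illustrates the general test and not the specific software used in the study.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(universe, size_a, size_b).

    universe: total number of genes surveyed
    size_a:   number of genes in the first list
    size_b:   number of genes in the second list
    overlap:  number of genes found in both lists
    """
    max_overlap = min(size_a, size_b)
    if not 0 <= overlap <= max_overlap or max(size_a, size_b) > universe:
        raise ValueError("inconsistent list sizes")
    # Exact integer arithmetic avoids floating-point underflow in the tail.
    total = comb(universe, size_b)
    tail = sum(
        comb(size_a, k) * comb(universe - size_a, size_b - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical example: two lists of 200 and 150 differentially
    # regulated transcripts out of 54,675 surveyed, sharing 20 entries.
    print(overlap_p_value(54675, 200, 150, 20))
```

A small p-value indicates that the two lists share more genes than would be expected if both had been drawn at random from the same universe.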
This work was supported, in part, by grants from the National Institutes of Health and the Charitable Leadership Foundation (CLF), Clifton Park, NY. We thank the many reviewers for constructive critical comments during the nearly two-year-long peer review process of this manuscript and acknowledge their contribution to our work.
A.B.G., J.M. and S.M. contributed equally to this work. G.V.G. conceived the idea and conceptualized the framework of the experimental work. G.V.G. and A.B.G. conceived and designed the experiments. S.M., J.M., D.G., C.L., I.G., R.B. performed the experiments. G.V.G., A.B.G., S.M., J.M., D.G., C.L., R.B. analyzed the data. G.V.G., A.B.G., S.S., R.B. contributed reagents/materials/analysis tools/financial support. G.V.G. wrote the paper.
No potential conflicts of interest were disclosed.