Many studies have highlighted the role that microRNAs have in physiological processes and how their deregulation can lead to cancer. More recently it has been proposed that the presence of single nucleotide polymorphisms in microRNA genes, their processing machinery and target binding sites affects cancer risk, treatment efficacy and patient prognosis. In reviewing this new field of cancer biology, we describe the methodological approaches of these studies and make recommendations for which strategies will be most informative in the future.
Human populations are estimated to be 99% identical at the level of the genetic code; thus, human diversity (other than epigenetics) arises from the remaining 1% of variation1, most of which is due to single nucleotide polymorphisms (SNPs). These are a non-repetitive form of sequence variation that was first identified in 1978 in the β-globin gene cluster2. To date, approximately 10 million SNPs have been identified in the human genome, occurring on average every 100 to 300 base pairs (International HapMap Project website; see Further information)1,3. Although most SNPs are silent, epidemiological studies have established a link between variations in gene sequence, environmental interaction and cancer risk. By identifying genetic markers of susceptibility and characterizing gene–environment interactions, it might be possible to reduce cancer mortality through early diagnosis and personalized therapy.
As our knowledge of the topology of the genome has evolved, a new class of non-coding RNAs has emerged called microRNAs (miRNAs). The latest release of the miRBase database has catalogued 721 human miRNAs. Smaller than protein-coding genes, miRNAs can regulate the translation of hundreds of genes through sequence-specific binding to mRNA4; depending on the degree of sequence complementarity, this binding results in the inhibition of translation and/or degradation of target mRNAs4,5. Interestingly, a recent report shows that miR-369-3p can upregulate the expression of its target, tumour necrosis factor-α (TNFα)6.
Our knowledge and understanding of miRNA biogenesis has evolved in recent years, and is thoroughly described elsewhere4,7 (FIG. 1). Briefly, mature miRNAs are short RNA molecules of between 19 and 22 nucleotides in length. Nucleotides 2–7 of the mature miRNA sequence create the ‘seed region’ (REFS 8–11), which primarily determines which mRNA the miRNA will bind. The degree of specificity conferred by the seed region is comparable to that of the DNA sites recognized by transcription factors12. Although the seed region binds its target site with (mostly) perfect Watson–Crick complementarity, flanking regions do not have to bind with equal precision. In an additional analogy to transcription factors, it is now apparent that base pairing outside the seed region provides a further layer of specificity, just as chromatin structure limits the potential for transcription factor binding. As multiple transcription factors work cooperatively to ignite gene expression, so too can multiple miRNAs bind to cognate sites in the 3′ untranslated region (UTR) of target mRNAs. Indeed, the complexity of translation can be further extended through heterotypic miRNA–mRNA interactions, as genes can harbour binding sites for several miRNAs12–14.
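The seed definition given above (nucleotides 2–7 of the mature miRNA) can be made concrete with a minimal sketch. The function names and the toy 3′ UTR fragment are hypothetical and introduced only for illustration; the let-7a sequence is the mature hsa-let-7a-5p sequence. Perfect hexamer matching is a deliberate simplification: real target prediction tools also weigh site context, conservation and pairing outside the seed.

```python
# Illustrative sketch only: perfect seed matching, ignoring site context,
# conservation and pairing outside the seed.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed(mirna: str) -> str:
    """Return nucleotides 2-7 (1-based) of a mature miRNA written 5'->3'."""
    return mirna[1:7]


def seed_match(mirna: str) -> str:
    """Return the mRNA hexamer (5'->3') that pairs perfectly with the seed."""
    return "".join(COMPLEMENT[nt] for nt in reversed(seed(mirna)))


def find_seed_sites(utr: str, mirna: str) -> list:
    """Return 0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_match(mirna)
    last_start = len(utr) - len(site)
    return [i for i in range(last_start + 1) if utr[i:i + len(site)] == site]


LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"       # mature hsa-let-7a-5p
TOY_UTR = "AAGCUACCUCAGGAUUUACCUCGA"    # hypothetical 3' UTR fragment

print(seed(LET_7A))                      # seed hexamer of the miRNA
print(seed_match(LET_7A))                # hexamer expected in the target mRNA
print(find_seed_sites(TOY_UTR, LET_7A))  # positions of candidate sites
```

Because a single gene can carry several such sites, for the same or for different miRNAs, the function returns every position rather than stopping at the first match.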
To date, miRNAs have been linked to the aetiology, progression and prognosis of cancer15, and miRNA expression profiles can uniquely identify cancer types16,17. The gain or loss of specific miRNAs can function as an oncogene or tumour suppressor18,19, the archetypical examples of this being miR-21 and Let-7, respectively. It should also be noted that some miRNAs can have dual oncogenic and tumour suppressive roles in cancer depending on the cell type and pattern of gene expression20. In addition, approximately 50% of all annotated human miRNA genes are located in fragile sites or areas of the genome that are associated with cancer21–23.
Their functional association with cancer, small gene size and potential to simultaneously affect a multitude of genes makes them unique candidate loci for conferring cancer susceptibility, as a small genetic change in an miRNA sequence can theoretically lead to widespread phenotypic effects24,25.
The initial demonstration that miRNA-related SNPs can affect phenotype was elegantly depicted by Abelson et al.26 who found that a mutation in the miR-189 binding site of SLITRK1 was associated with Tourette’s syndrome. Since then, several studies have used systematic sequencing or in silico approaches to identify SNPs in miRNA-related genes, catalogues of which have been created and made public27–29. These reports provide fertile ground for follow-up case–control studies to determine the association between these genetic markers and cancer risk.
Sequencing has shown that SNPs in miRNA coding genes, and specifically in miRNA seed regions, are rare27,30 and would at first glance seem to be of limited functional importance. On close inspection, however, it is apparent that such rare occurrence is the result of well-documented evolutionary selective pressures27,31 (BOX 1).
From the seminal findings by Ambros150, Ruvkun151 and colleagues that lin-4 (REF. 150) and the 3′ untranslated region of its target, lin-14 (REF 151), had blocks of conservation across Caenorhabditis elegans species, cross-species sequence conservation has become part of the standard used in prediction of both microRNA (miRNA) genes and their target sites. The widespread understanding that has stemmed from these and subsequent studies is that the functional importance of these sequences leads to selective pressure that limits the frequency of deleterious alleles. This is in fact the case, as single nucleotide polymorphism (SNP) density in miRNA genes and miRNA target sites has been found to be lower in human miRNA loci27, as well as predicted miRNA binding sites, particularly seed sites, than their flanking regions30. SNPs in miRNA seed regions were predicted to occur at a frequency of 1%27. Conversely, given the constraint on sequence conservation as a means of prediction, it is not entirely surprising to find that SNP density is, in fact, lower in these regions. It is satisfying that evidence for negative selection has also been observed in human-specific miRNAs; that is, those that do not show cross-species conservation13. Although globally rare, mir-SNPs may be positively propagated, and so contribute to gene expression differences that become established in species. Such is the indication from miRNAs that are conserved in primates only152 or that are human-specific57, and there is evidence of local positive selection that may have contributed to expression diversity and adaptation, as well as higher-order thinking in humans. mir-SNPs may be another genetic modification that explains phenotypic differences in humans compared with other primates in the absence of high genotypic diversity. From the perspective of the molecular epidemiology of cancer, because mir-SNPs are so rare, their likelihood to be disruptive is higher.
SNPs in miRNA genes are thought to affect function in one of three ways: first, through the transcription of the primary transcript; second, through pri-miRNA and pre-miRNA processing; and third, through effects on miRNA–mRNA interactions (FIGS 2,3). See Supplementary information S1 (table) for a list of SNPs in all currently known pre-miRNAs and mature miRNAs, which has been created using build 130 of dbSNP (see Further information). Most studies that have followed a biologically based candidate gene approach to search for SNPs in miRNAs that might confer cancer susceptibility rely on knowledge of a functional link between a particular miRNA and gene target.
The first evidence that point mutations in miRNA genes can have a functional effect and confer cancer susceptibility comes from a seminal study by Carlo Croce’s group32 in which a germline mutation in pri-mir-16-1 was found in a kindred with familial chronic lymphocytic leukaemia (CLL) and resulted in low levels of miR-16-1 expression (FIG. 2). Moreover, the mutation was subsequently discovered in New Zealand black mice that naturally develop CLL-like disease33. However, this mutation was not detected in a large panel of other tumour types, thereby showing specificity for cancers of a particular origin34. Notably, the oncogenic mir-17–92 cluster contains two miRNAs that harbour three SNPs (see Supplementary information S1 (table)), but the relevance to inherited cancer risk and carcinogenesis remains to be investigated.
To date, the literature suggests that the functional consequences of SNPs in pri-miRNAs relate to processing and levels of the mature miRNA. For example, SNPs in the pri regions of let-7e and mir-16 lead to decreased mature miRNA levels29,32. Indeed, several studies show an association between pri-mir SNPs and cancer risk (TABLE 1). Specifically, rs7372209 in mir-26a-1 was associated with a 64% decreased risk of bladder cancer in females35, and a twofold increased risk of premalignant oral lesions36. The rs531564 SNP in mir-124-1 is associated with an increased risk of bladder cancer35 and oesophageal cancer in males37, and the pri-miR SNP rs213210 in mir-219 increased the risk of oesophageal cancer37. Another pri-miR SNP, rs2292932 in mir-149, has been tested in several cancers but has not been associated with cancer risk38,39. This suggests that the molecular mechanisms underlying the genetic associations of mir-SNPs with cancer are complex and vary by cancer site.
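Risk estimates such as a "64% decreased risk" or a "twofold increased risk" are typically reported in case–control studies as odds ratios, which approximate relative risk when the disease is rare. The sketch below shows how a crude odds ratio and a Woolf (log-based) 95% confidence interval follow from a 2×2 table of carriers and non-carriers. The counts are hypothetical, chosen only so that the odds ratio equals 0.36, and the function name is an assumption; the cited studies used their own, generally covariate-adjusted, models, which this sketch does not reproduce.

```python
import math


def odds_ratio(case_carriers, case_noncarriers, control_carriers, control_noncarriers):
    """Crude odds ratio with a Woolf (log-based) 95% confidence interval."""
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - 1.96 * se_log)
    upper = math.exp(math.log(estimate) + 1.96 * se_log)
    return estimate, lower, upper


# Hypothetical counts, chosen only so that the odds ratio is 0.36.
estimate, lower, upper = odds_ratio(36, 100, 100, 100)
percent_change = (estimate - 1) * 100  # negative values mean reduced odds

print(f"OR = {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
print(f"change in odds = {percent_change:.0f}%")
```

An odds ratio of 0.36 corresponds to 64% lower odds among carriers, and an odds ratio of 2.0 to a twofold increase.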
Case–control studies have also provided evidence for an association of pre-mir SNPs and cancer risk. Rs11614913 in pre-mir-196a-2 contributes to the risk of developing breast40, lung39 and gastric cancers41 in the Chinese population (TABLE 1). In each case the rs11614913 variant homozygote CC was associated with increased cancer risk. Risk of developing oesophageal cancer in Caucasian males and never-smokers was significantly associated with the rs11614913 variant homozygote TT, the minor allele in this population37. Rs11614913 is located in the 3′ passenger (3p) strand mature sequence of mir-196a-2, thereby possibly affecting both maturation and the repertoire of target mRNAs with which it interacts. Indeed, previous studies have shown that sequence variations in mature and precursor miRNA sequences affect miRNA biogenesis28,42, and levels of mature miR-196a-2 were lower in CC carriers than in TT carriers39. Notably, this SNP has also been associated with poor survival in patients with lung cancer43. Indeed, this was the first demonstration that miRNA-related SNPs could be related to cancer prognosis.
An SNP in the terminal loop of pre-mir-27a, rs895819, confers a reduced risk of developing breast cancer in families with a history of non-BRCA-related disease44 and in families with mutant BRCA2 (REF. 45). It has been shown that artificial mutations in the terminal loop of miRNAs such as mir-21 and mir-30 can block miRNA maturation46, so it is conceivable that the variant allele of rs895819 might impair the maturation of oncogenic mir-27a, thus explaining the protective effect of the SNP. However, no alterations of free energy or conformation of the miR-27a–mRNA duplex were predicted in silico, thereby leading to the assumption that the SNP would not affect mir-27a maturation or targeting45. Nevertheless, SNPs in pre-miRNAs may affect expression even in the absence of apparent effects on its secondary structure. Such is the case for an SNP in let-7e (rs41275792), which leads to reduced levels of the mature miRNA in vivo even though its secondary structure is not predicted to change29. The location of the rs895819 SNP in the centre of the terminal loop of mir-27a is likely to decrease the size of the loop and affect the binding of DROSHA, thus decreasing miRNA maturation29,46. Alternatively, the SNP might influence the binding affinity of several DROSHA inhibitors, such as Lin28 (REFS 47,48).
Another pre-miRNA SNP, rs6505162 in pre-mir-423, is associated with an increased risk of bladder cancer35 and ovarian cancer in carriers of mutant BRCA2 (REF. 45), and decreased risk of oesophageal cancer in Caucasians37. There is no clear explanation for the opposing effects of this SNP in different cancer types within the same population. Modulations of mature levels of miR-423 have not yet been functionally linked to this SNP, and the RNAFold software (see Further information) does not predict a change in secondary structure45. An A/C SNP in mir-30c-2 was predicted to cause the greatest change in target gene identification, and as such was postulated to affect cancer risk31, although results from a study in hepatocellular carcinoma (HCC) did not support this hypothesis49.
A unique example of a functional miRNA SNP is rs2910164, which is located in the 3p strand of mir-146a (FIG. 3). This polymorphism involves a mispairing in the hairpin of the precursor, which leads to altered processing, lower expression of the mature sequence and predisposition to papillary thyroid carcinoma50. Intriguingly, individuals with a heterozygous genotype have a greater risk of developing papillary thyroid carcinoma than homozygous individuals50. Heterozygosity as a genetic risk factor is rare, and as postulated by the authors may be a form of genetic epistasis in which the phenotype of the heterozygote differs from the sum of the phenotypes of both alleles. Indeed, in a follow-up study, the same authors found that the SNP fell within the seed of the 3p strand, and would give rise to three mature miRNAs (one from the leading strand and two from the 3p strand) instead of the expected two observed in homozygous individuals51. The resulting complexity is underscored by the finding that each mature miRNA will target a specific repertoire of mRNAs51. Interestingly, there is evidence of homozygous to heterozygous somatic mutations at rs2910164 in several patients with papillary thyroid tumours50. These seminal studies elegantly decipher how a small genetic change can influence the gene expression profile and show that somatic mutations in miRNAs can be an oncogenic event. In separate studies, rs2910164 was associated with an increased risk of developing prostate cancer52 and HCC53 in males through reduced expression levels of miR-146a53. Although not directly associated with breast cancer risk, rs2910164 was associated with a younger age of breast cancer diagnosis in familial breast cancer after adjustment for BRCA1 and BRCA2 mutation status54. The variant allele of rs2910164 leads to increased levels of mature miR-146a and binds with greater affinity to BRCA1. Predisposition may therefore develop through the downregulation of BRCA1 (REF. 54). 
Alternatively, rs2910164 could disrupt the well-documented role of miR-146a as a mediator of the pro-apoptotic transcription factor nuclear factor-κB (NF-κB)55,56. In support of this possibility, genes involved in the regulation of apoptosis were differentially transcribed in heterozygote rs2910164 carriers51.
Although studies have started to reveal the nature of the association between miRNA SNPs and cancer risk, several considerations remain: most of the studies used a candidate gene approach; of those that used a systematic approach, their lists are outdated owing to enhanced screening techniques that have identified new miRNA genes and updated builds of genome-wide SNP repositories. According to the most recent updates of miRNAs, 202 pre-miR genes have 283 SNPs (Supplementary information S1 (table)). It is likely that as new builds of dbSNP are formulated more SNPs with potential relevance to cancer risk will be identified. In addition, the minor allele frequencies of many of the mir-SNPs already identified have not been determined. Therefore, population studies should be conducted that will ascertain whether or not these SNPs are polymorphic and if so in what populations. This is an important consideration, as data are emerging to suggest that some mir-SNPs have evolved to a high level of variance in distinct populations. For example, the variant alleles of several SNPs occur only in populations of African descent57. These epidemiological associations need to be validated in independent populations and functionally tested58. Furthermore, the inclusion of mir-SNPs in future genome-wide association studies (GWAS) will help to unveil low-penetrance susceptibility mutations. Most mir-SNPs are not included in current GWAS designs, and as such there is a paucity of information in this regard. The identification of tag SNPs for miRNA-related SNPs will also be a useful endeavour. Clarification of the extent of the pri region of miRNA genes is also needed to more accurately assess miRNA-related genetic variation. Many studies currently limit their analysis of mir-SNPs to the pre and mature regions, as these are clearly defined.
The power of SNPs to affect phenotype is highlighted by the myriad of disorders and traits discovered in association with them59–71. In an analogous manner to seed region SNPs, an SNP in the 3′ UTR of a gene may create, as well as destroy, an miRNA binding site (FIG. 3). Disruption of miRNA-dependent regulation by SNPs in the miRNA binding site of target mRNAs is a bona fide mechanism for altered gene expression in cancer. Let-7 binds to the 3′ UTR of KRAS and regulates its expression72, and both let-7 and KRAS are implicated in lung carcinogenesis73. Chin et al.74 sequenced the 3′ UTR of KRAS and identified ten Let-7 complementarity sites (LCS). A new SNP in LCS6 (now designated rs61764370) was present in 20% of lung cancer cases and 5% of the control population (TABLE 2). Correspondingly, the SNP was associated with a 2.3-fold increased risk of developing lung cancer in moderate smokers. Having validated their findings in a second large cohort, the authors demonstrated that the SNP led to increased luciferase activity and decreased levels of Let-7 family members, especially Let-7b 75, suggesting the possibility of a negative feedback loop, similar to that between Let-7 and Lin28 (REFS 76,77). It is possible that the presence of the SNP along with mutant KRAS could lead to an amplified oncogenic hit, and might identify a group of patients who are at a particularly high risk of developing lung cancer75. Subsequently, this SNP has been associated with reduced survival in patients with oral cancer78.
As illustrated in this example, several elements have to converge for an miRNA binding site SNP to be considered functional: the SNP must have a proven association with cancer, both the miRNA and its predicted target must be expressed in the tissue, and the allelic changes must result in differential binding of the miRNA and affect expression of the target gene. Choosing candidate genes to analyse in specific cancer types has been one favoured approach to find such interactions. In this manner, rs17281995 in the 3′ UTR of CD86 was predicted to disrupt the binding sites for five miRNAs and was found to be associated with an increased risk of colorectal cancer79. miR-582, which is expressed in normal colon tissue, bound less tightly to the variant allele of CD86, thus increasing its expression level. CD86 functions as a co-stimulatory molecule and increases the production of the pro-inflammatory cytokine interleukin-4 (IL-4)80, which might explain the contribution of this SNP to colorectal cancer risk.
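The sequence-level part of these criteria, namely whether an allelic change alters a predicted binding site, can be sketched as follows. This is a minimal illustration under stated assumptions: binding is reduced to a perfect match to seed nucleotides 2–7, the two 3′ UTR fragments and the SNP positions are hypothetical, and the function names are invented for this example. It says nothing about expression in the relevant tissue or about measured binding affinity, both of which the criteria above also require.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match(mirna: str) -> str:
    """mRNA hexamer (5'->3') complementary to miRNA nucleotides 2-7."""
    return "".join(COMPLEMENT[nt] for nt in reversed(mirna[1:7]))


def has_site_at_snp(utr: str, pos: int, site: str) -> bool:
    """True if a perfect seed match overlaps the 0-based position `pos`."""
    k = len(site)
    first = max(0, pos - k + 1)
    last = min(pos, len(utr) - k)
    return any(utr[i:i + k] == site for i in range(first, last + 1))


def classify_snp(utr_ref: str, pos: int, alt: str, mirna: str) -> str:
    """Compare the reference and variant alleles at one 3' UTR position."""
    site = seed_match(mirna)
    utr_alt = utr_ref[:pos] + alt + utr_ref[pos + 1:]
    ref_has = has_site_at_snp(utr_ref, pos, site)
    alt_has = has_site_at_snp(utr_alt, pos, site)
    if ref_has and not alt_has:
        return "destroys site"
    if alt_has and not ref_has:
        return "creates site"
    return "no change"


LET_7A = "UGAGGUAGUAGGUUGUAUAGUU"  # mature hsa-let-7a-5p

# Hypothetical UTR fragments and SNP positions, for illustration only.
print(classify_snp("AAGCUACCUCAGG", 7, "G", LET_7A))
print(classify_snp("GGAUACGUCAA", 6, "C", LET_7A))
print(classify_snp("AAGCUACCUCAGG", 11, "A", LET_7A))
```

Only windows overlapping the polymorphic position are inspected, so a second, unaffected site elsewhere in the UTR does not mask the effect of the SNP.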
Among the 120,000 known SNPs that occur in 3′ UTRs, ~17% destroy putative conserved or non-conserved miRNA binding sites. Furthermore, 8.6% create new predicted target sites according to the Patrocles database81 (see Further information; Supplementary Information S2 (box)). Yu et al.82 searched the 3′ UTRs of all genes in the genome and cross-referenced this with dbSNP 126 and the Targetscan database (see Further information) for possible overlap. Interestingly, they observed that 12 of these SNPs were differentially expressed in cancer EST databases82. Subsequently, one of these SNPs, rs16917496 in SETD8, was associated with an increased risk of breast cancer83. Landi et al.84 have now catalogued 79 SNPs in the 3′ UTRs of 129 colorectal cancer-associated genes. One of these, an insertion/deletion (indel) polymorphism (rs3783553) in the 3′ UTR of IL1A led to a 38% decrease in the risk of developing HCC. The TTCA insertion allele for rs3783553 disrupts a binding site for miR-122 and miR-378, thereby increasing transcription of IL1A in vitro and in vivo 85.
Saetrom et al.86 mapped HapMap SNPs to putative miRNA recognition sites in genes deregulated in oestrogen receptor-stratified breast tumours and used local linkage disequilibrium patterns to identify high-ranking SNPs in the Cancer Genetic Markers of Susceptibility (CGeMS) breast cancer GWAS. Two SNPs, rs1970801 and rs1109745, were in strong linkage disequilibrium with rs1434536, an SNP in an miR-125b target site in the 3′ UTR of bone morphogenetic protein receptor type 1B (BMPR1B). Subsequently, rs1434536 was validated and miR-125b was shown to differentially regulate the C and T alleles. These results suggest that allele-specific regulation of BMPR1B by miR-125b explains the observed disease risk86.
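Linkage disequilibrium between a genotyped GWAS SNP and an untyped SNP in an miRNA target site is commonly summarized by the statistics D and r². The sketch below computes both from phased two-locus haplotype counts using the standard definitions D = p(AB) − p(A)p(B) and r² = D² / [p(A)(1 − p(A))p(B)(1 − p(B))]. The haplotype counts are hypothetical and the function name is an assumption; this is not the procedure or the data used in the study described above.

```python
def linkage_disequilibrium(n_AB: int, n_Ab: int, n_aB: int, n_ab: int):
    """Return (D, r_squared) from phased two-locus haplotype counts.

    A/a are the alleles at the first SNP and B/b those at the second.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total
    p_B = (n_AB + n_aB) / total
    denominator = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denominator == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    d = p_AB - p_A * p_B
    return d, d * d / denominator


# Hypothetical haplotype counts (100 chromosomes).
d, r2 = linkage_disequilibrium(45, 5, 5, 45)
print(f"D = {d:.2f}, r^2 = {r2:.2f}")
```

An r² close to 1 means that the genotyped SNP is a good proxy for the target-site SNP, which is why a high-ranking GWAS signal can point to a nearby functional variant that was never genotyped directly.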
Drawing on the list of putative polymorphic miRNA binding sites in the genome provided by Chen et al.87, 11 possible candidate SNPs were selected for their potential relevance to breast cancer88. Subsequently, rs2747648, which resides in a predicted binding site for three miRNAs in the oestrogen receptor-α (ESR1) gene, was associated with a 27% reduction in breast cancer risk in premenopausal women. When the C allele is present, miR-453 binds with greater affinity to ESR1, thus leading to decreased levels of ERα. Postmenopausal women already have reduced levels of endogenous oestrogen, perhaps explaining why this SNP is relevant only in premenopausal women. Indeed, the authors pose the interesting question of whether or not carriers of the ancestral T allele would respond better to endocrine therapy, given that they will naturally express increased levels of the receptor88. Moreover, there is evidence that the variant allele of another SNP in ESR1, rs93410170, enhances the binding between miR-206 and the 3′ UTR, thereby decreasing ERα levels89. Subsequently, the authors postulated that the lower incidences of breast cancer in Hispanic and European populations could be partially associated with the increased prevalence of the variant T allele.
The idea that an SNP in an miRNA binding site could specifically affect pharmacokinetics is illustrated in a study that described the effect of a C–T SNP near the miR-24 binding site in the 3′ UTR of human dihydrofolate reductase (DHFR) on its translation. The variant allele interferes with miR-24 binding to the 3′ UTR and leads to a twofold increase in the mRNA half-life, DHFR overexpression and methotrexate resistance90. Moreover, a recent follow-up study found that DHFR imparts a selective growth advantage and neoplastic transformation in immortalized cells91. However, an important caveat of this study, and one that introduces an additional level of complexity when studying miRNA-related SNPs, is that the SNP that affected drug resistance was not located in the miRNA binding site. Rather, it was located further downstream. The relationship between miRNAs and pharmacogenomics was recently discussed92. The concept of ‘integrative epidemiology’ (REF. 93) proposes several models that depict how SNPs may overlap in their relationships to therapeutic response, susceptibility and prognosis94. An intriguing observation reminds us that multiple SNPs in the miRNA network may interact in a similar manner in carcinogenesis. DHFR is one of the predicted targets of miR-196a-2, the SNP of which, rs11614913, is associated with cancer risk43. Thymidylate synthase, the target of the cancer drug 5-fluorouracil, is another predicted target of miR-196a-2 (REF. 43). Indeed, recent studies have found that genetic variations in the 3′ UTR of thymidylate synthase are associated with resistance95,96. Whether rs11614913 affects expression of these genes and therapeutic efficacy is an intriguing question.
A recent paper provides new insight into mir-SNPs and their relevance to cancer. New miRNAs have been found in Epstein–Barr virus, which is a causal factor for several cancers97. One of these miRNAs, mir-BART22, harbours a single genetic variant that increases its mature levels and downregulates its target, latent membrane protein 2A (LMP2A), a potent immunogenic viral antigen. As a consequence, it is possible that miR-BART22 could accelerate carcinogenesis through the downregulation of LMP2A and the evasion of the host immune response98. Whether similar interactions contribute to other viral-related cancers remains to be determined.
Although much less explored, miRNAs can also bind to target sites in the 5′ UTR and open reading frames99. Such binding sites can occasionally be interrupted by introns and therefore require splicing to bind with their complementary miRNAs100. Several lines of evidence support miRNA binding to the 5′ UTR101–103 and coding sequences9,11,104. For example, miR-10a binds to the 5′ UTR of ribosomal proteins to increase their translation105, and miR-148 regulates DNMT3B expression through a conserved site in the protein coding sequence. Interestingly, the target site is absent in the DNMT3B3 splice variant. Therefore, the expression of miR-148 changes the relative abundance of DNMT3B splice variants106. Let-7 directly targets DICER1 in its coding sequence, thus establishing a mechanism for a miRNA–DICER1 negative feedback loop107. This work suggests that the search for miRNA-related susceptibility loci should be expanded to include both the 5′ UTR and coding regions. Sensitive alleles identified in epidemiological studies, but with obscure functional roles, should perhaps be tested under miRNA prediction algorithms that are not limited to the 3′ UTR of genes, particularly if evidence indicates that altered expression of that gene can be associated with specific phenotypes.
Although the data on SNPs in miRNA binding sites and cancer risk are exciting, several limitations and caveats remain that may affect the field as it moves forwards. One of the major pitfalls in these studies is the ambiguity of computationally predicted miRNA binding sites. Programs such as Patrocles and PolymiRTS (see Further information; Supplementary information S2 (box)) collate these predictions and cross-reference them with dbSNP information, and as such are invaluable in the search for polymorphic miRNA binding sites. However, miRNAs bind to their targets in a manner that is either 5′ dominant (perfect base pairing at 7–8 nucleotides in the 5′ end) or 3′ compensatory (imperfect 5′ binding, therefore the 3′ end has a stronger degree of complementarity)108. Algorithms that do not take this into account might miss SNPs that could be important for 3′ compensatory-based matching. Furthermore, the databases need to keep pace with the discovery of new SNPs and miRNA genes to maintain their relevance for researchers. Therefore, it is noteworthy that several SNPs50,51,57,74, including the KRAS LCS6 SNP, were identified through direct sequencing. A major lesson from GWAS is that variants in regulatory regions are much more likely to cause disease than nonsynonymous coding SNPs, and as such miRNA binding regions should be considered in future GWAS approaches109.
The global repression of miRNA maturation promotes cellular transformation and tumorigenesis15, thus SNPs that affect the proteins involved in miRNA biogenesis may have deleterious effects on the miRNAome (FIG. 4). Furthermore, recent data suggest that SNPs in the biogenesis machinery are also linked to cancer risk, drug response and prognosis. Low levels of DROSHA, for example, are associated with poor cancer survival110. However, SNPs in DROSHA and DGCR8 did not predispose to cancer susceptibility35–37,111 (TABLE 3).
By contrast, the nuclear export proteins XPO5 and RAN have been associated with cancer risk37. XPO5 is responsible for miRNA nuclear export, and knocking down its expression leads to reduced miRNA levels112. XPO5 is downregulated in bronchioloalveolar carcinoma and stage 1 lung cancer113 but upregulated in high-grade prostate cancer114. The SNPs in XPO5 and RAN occur in their 3′ UTRs, suggesting that they might affect mRNA stability. Our analysis of the RAN SNP, rs14035, in the PolymiRTS database suggests that the ancestral allele lies in a binding site for miR-575, which is disrupted by the derived allele that in addition creates a binding site for miR-182*. Although these are in silico results, they raise the possibility that in addition to affecting cancer risk through the disruption of miRNA nuclear export, a more intricate pathway may be involved that includes miRNA regulation.
DICER1 and transactivation-responsive RNA-binding protein (TRBP) mediate pre-miRNA processing. A recent study indicated that DICER1 functions as a haploinsufficient tumour suppressor in cancer115. Indeed, lower levels of DICER1 mRNA have been associated with decreased cancer survival110. Interestingly, rs3742330 in the 3′ UTR of DICER1 was associated with an increased risk of premalignant oral lesions in individuals with leukoplakia and/or erythroplakia36. Melo and colleagues116 identified two frameshift mutations in TRBP that introduce premature stop codons, resulting in reduced TRBP expression. One function of TRBP is regulating DICER stability, thus these mutations resulted in reduced DICER expression and lower miRNA production and were associated with higher cellular proliferation levels116.
RISC has a pivotal role in guiding single-stranded mature miRNA sequences to their target mRNA sites. The variant allele of the GEMIN3 nonsynonymous SNP, rs197412, was associated with a reduced risk of premalignant oral lesions, and rs197414, also in GEMIN3, was associated with an increased risk of bladder35 and oesophageal cancer37. Therefore, GEMIN3 variants could alter global miRNA homeostasis and have a major effect on cellular signalling pathways. Two SNPs in GEMIN4, rs2740348 and rs7813, were associated with a decreased risk of renal cell carcinoma111 and reduced transformation of Hep3B cells117.
The tumour suppressor p53 was recently implicated in miRNA processing118. Through interaction with p68 and DROSHA, p53 facilitates pri-miRNA to pre-miRNA processing. Given the well-documented relationship between p53 mutations and cancer119–121, it is possible that there might be p53 mutations or SNPs that affect miRNA processing and so increase or decrease the risk of cancer development. This also raises the possibility that p53-associated conditions, such as Li–Fraumeni syndrome, may relate to a global decrease in miRNA production and function.
On a cautionary note, although these associations are interesting and may have far-reaching implications, none of the studies of SNPs in miRNA processing machinery has been validated in independent studies, nor have the biological mechanisms by which they affect miRNA maturation and cancer been delineated.
A new form of miRNA sequence heterogeneity has recently been identified. Before their official annotation as ‘isomiRs’ by Morin et al.122 in 2008, miRNA variants had been identified in cloning studies but were ambiguously misclassified as putative experimental error123,124. In cases in which several closely matching sequences were discovered for a single miRNA, the most commonly detected sequence was chosen as the reference122,125,126. So far, three main types of miRNA sequence modification have been described: 3′ deletion/addition, 5′ deletion/addition and internal modifications (BOX 2). Distinct from SNPs, the location of these variations suggests that they generally arise from variable cleavage sites for DROSHA and DICER1 in the hairpin. Their identification was possible owing to the application of next-generation deep 454 sequencing approaches to miRNA discovery.
The most prevalent type of modification noted among mature microRNA (miRNA) sequences is single nucleotide 3′ extensions122,127,153,154. These modifications produce an isomiR that matches the genome at every position except the terminal nucleotide. A 3′ extension was found in 66% of mir-326 reads129. The nucleotides most commonly added were adenine and uridine, followed by cytosine and guanine122,153. Intriguingly, in the study by Kuchenbauer et al.125, 151 miRNAs and miRNA*s had a 3′ variation that did not match the genome, suggesting the possibility of an as yet unknown mechanism of miRNA processing. The changes in terminal nucleotide were proposed to be partly due to deamination of cytosine to uracil by cytidine deaminases (CDARs) or deamination of an adenosine to inosine by adenosine deaminases (ADARs).
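The definition given above (a read that matches the genome at every position except its terminal 3′ nucleotide) can be expressed as a simple comparison. The following is a minimal sketch, not the method used in the cited studies: the function name, the category labels and the hairpin fragment are all hypothetical, and real pipelines must also handle sequencing errors and multi-nucleotide additions.

```python
def classify_3p_end(read: str, template: str, start: int) -> str:
    """Classify a read against the genomic template it maps to.

    `start` is the 0-based template position of the read's first nucleotide.
    Returns "templated", "3p_nontemplated" or "other" (hypothetical labels).
    """
    genomic = template[start:start + len(read)]
    if len(genomic) < len(read):
        return "other"  # the read runs past the end of the template
    if read == genomic:
        return "templated"  # every position matches the genome
    if read[:-1] == genomic[:-1]:
        return "3p_nontemplated"  # only the terminal 3' nucleotide differs
    return "other"


# Hypothetical hairpin fragment (not a real miRNA precursor).
template = "CCGAUUCGGACUAGCAUGCAAGUCCGUAAGCUU"
reference = template[3:25]     # annotated mature sequence, 22 nt
a_extended = reference + "A"   # the next genomic base is G, so +A is non-templated

print(classify_3p_end(reference, template, 3))
print(classify_3p_end(a_extended, template, 3))
```

A read that is one nucleotide longer but still matches the genome is reported as templated, which separates variable DICER1 or DROSHA cleavage from nucleotide addition after processing.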
Uridylation at the 3′ end of miRNAs has also been reported76,155,156. The biological importance of 3′ uridylation is unclear. It could mediate miRNA turnover or facilitate mRNA–miRNA binding in cases in which 3′ compensatory binding is predominant. This signal may also function as a degradation signal, perhaps in a manner that is analogous to protein ubiquitylation156.
Contrary to the idea that pre-miRNA processing leads to a mature miRNA sequence with a fixed nucleotide at the 5′ end, recent data indicate that isomiRs may also result from variation at the 5′ end. They are of particular interest as they have a different seed sequence from the reference miRNA, and therefore have the potential to bind a different repertoire of targets. Not surprisingly, nucleotide modification to the 5′ end of mature miRNAs seems to be less likely than at the 3′ end. It is currently unclear whether these non-canonical variants associate with the RNA-induced silencing complex (RISC). If so, the presence of isomiRs may have implications in future annotation of miRNAs and the development of new target prediction algorithms. Modification at the 5′ end of miRNAs has been noticed in T cells, in which distinct mir-142 variants seem to regulate different target gene pools157. IsomiRs of mir-142 seem to arise from shifting of the processing sites in the pri-miRNA sequence by DROSHA to generate alternative pre-miRNA variants that can then be independently processed by DICER1 to generate mature miRNAs that might have altered target specificity. The seed-matched target binding sites of the different miR-142 variants seem to be evolutionarily conserved157 and these distinct miR-142 variants seem to regulate different 3′ UTR targets. It is also noteworthy that miR-142 is the most highly expressed miRNA in naive T cells. There is also evidence for 5′ end processing in Caenorhabditis elegans, mammals, viruses and Drosophila melanogaster25,123,157,158. Therefore, 5′ end modifications are a conserved phenomenon157. It is possible that many more miRNAs with 5′ shifts will be found in the future.
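The effect of a 5′ shift on the seed can be illustrated with a short sketch. It assumes that the seed spans nucleotides 2–8 of the mature miRNA, a commonly used convention that is exposed here as a parameter because other windows are also in use. The precursor fragment and function names are hypothetical.

```python
def seed(mirna: str, first: int = 2, last: int = 8) -> str:
    """Return the seed of a mature miRNA (1-based, inclusive window).

    The default window of nucleotides 2-8 is an assumption; adjust as needed.
    """
    return mirna[first - 1:last]


def mature_from_precursor(precursor: str, start: int, length: int = 22) -> str:
    """Excise a mature sequence beginning at 0-based position `start`."""
    return precursor[start:start + length]


# Hypothetical precursor fragment (not a real miRNA hairpin).
precursor = "CCGAUUCGGACUAGCAUGCAAGUCCGUAAGCUU"

reference = mature_from_precursor(precursor, start=3)
shifted = mature_from_precursor(precursor, start=4)  # cleavage one nucleotide downstream

print(seed(reference))  # UUCGGAC
print(seed(shifted))    # UCGGACU
```

A shift of a single nucleotide in the cleavage site moves the whole seed window by one position, so the two variants would be predicted to pair with different target sites even though they derive from the same hairpin.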
Using massively parallel sequencing in mouse ovary, Reid and colleagues128 found evidence of internal editing of murine let-7a in the form of internal insertions, deletions and substitutions. There is a selection against nucleotide alterations in nucleotides 3–7 (the seed) and 10–15 (the cleavage and anchor sites), that is, those positions that generally have Watson–Crick binding. It has been speculated that the change in nucleotide sequence expands the target repertoire and/or enhances mRNA decay over translational repression by increasing or decreasing the degree of complementarity128.
The specificity of isomiRs as bona fide genetic variants of miRNAs, and not experimental artefacts, is strengthened by the detection of the new miRNA sequences and analogous nucleotide modifications in different genomes122,125,126; their detection using both a linker-based miRNA cloning approach127 and massively parallel sequencing122; an observation that the frequencies of the nucleotide modifications were remarkably higher than the estimates attributed to sequencing errors; and the non-random positioning of the nucleotide changes125,128. Indeed, an analysis of the most prevalent 3′ additions in human and mouse tissues demonstrated that the nature of the nucleotide change is evolutionarily conserved125.
Despite the large number of isomiRs detected, their role in post-transcriptional regulation remains to be experimentally determined. It is postulated that the modifications could affect miRNA half-life, subcellular localization and miRNA target specificity. IsomiRs resulting from variation at the 5′ end may be of particular interest, as they have different seed sequences from the reference miRNA, with the subsequent ability to potentially target different transcripts.
Several isomiRs have been implicated in cancer. In a mouse model of leukaemia, several isomiRs of mmu-miR-10a, mmu-miR-155, mmu-miR-27a, mmu-miR-27c, mmu-let-7a and mmu-miR-222 were differentially expressed125. Concordant with their reference sequences, most of the isomiRs were downregulated in the tumour cells. One isomiR of mmu-miR-223-5p was downregulated ~2,500-fold in metastases125. The isomiR count for members of the Let-7 family is among the highest detected129. For example, Let-7a-5p has 78 sequences derived from various combinations of 5′ and 3′ modification: some of these sequences had counts greater than 4,000 and were therefore highly expressed. The mir-181 family of putative tumour suppressors and mir-21, an oncogene, also have a remarkable level of sequence variation. A full description of the detected isomiR sequences is provided by Morin et al.122. Although these variations have not been interrogated in human cancer, it is plausible that they are relevant to tumorigenesis and it is likely that navigating this mercurial maze may lead to many answers underlying both normal and cancer cell biology.
miRNA expression can also be affected by epigenetic silencing. Indeed, many miRNAs are found in CpG islands (Supplementary information S1 (table)). Epigenetic silencing of several miRNAs is a frequent and early event in breast cancer130,131, and although the let-7 family is globally downregulated in lung cancer73,132 there is evidence of let-7a-3 hypomethylation133; this is perhaps another example of how miRNAs can have bivalent roles in malignancy20. A case–control analysis exploring a possible relationship between aberrant epigenetic profiles and cancer risk has yet to be instigated, and so the implications of this form of genetic variation at the population level are unknown. Moreover, it is likely that mir-SNPs in CpG islands might also affect the pattern of miRNA expression and contribute to cancer susceptibility. Consistent with this idea, an SNP occurring in the promoter of an miRNA (whether in a CpG island or not) would also be predicted to affect miRNA levels. Indeed, Sevignani et al.23 found that most of the sequence differences in miRNA genes in tumour-susceptible mice rather than tumour-resistant mice occur in the promoters.
Approximately 60% of human genes are thought to undergo alternative splicing134. The implications of this form of transcriptional control for miRNAs remain largely unexplored. However, there is evidence that proliferating cells have shorter 3′ UTRs, fewer miRNA binding sites and therefore diminished regulation by miRNAs135, although the converse is also true64. Furthermore, miR-124 augments neural differentiation by targeting PTBP1 mRNA, a global repressor of alternative pre-mRNA splicing in non-neuronal cells136. Sandberg et al.135 found evidence of shorter 3′ UTRs in proliferating T cells mediated by alternative cleavage and polyadenylation. Moreover, it was subsequently shown that cancer cells globally have shorter 3′ UTRs than untransformed cells and so escape regulation by miRNAs; these shorter isoforms can give rise to tenfold more protein. In addition, the expression of the shorter isoform of the proto-oncogene insulin-like growth factor 2 mRNA binding protein 1 (IMP1) led to oncogenic transformation, although the longer form did not, thus showing that the loss of repressive elements in the mRNA sequence through alternative cleavage and polyadenylation promotes an oncogenic phenotype137,138.
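How 3′ UTR shortening removes miRNA binding sites can be sketched by locating seed-matched sites in a full-length UTR and asking which survive cleavage at a proximal polyadenylation site. This is a simplified illustration under stated assumptions: a site is taken to be a perfect reverse complement of nucleotides 2–8 of the miRNA, and all sequences, positions and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, first: int = 2, last: int = 8) -> str:
    """Reverse complement of the seed, i.e. the motif expected in the mRNA."""
    seed = mirna[first - 1:last]
    return seed.translate(COMPLEMENT)[::-1]


def site_positions(utr: str, mirna: str) -> list[int]:
    """0-based start positions of perfect seed matches in a 3' UTR."""
    site = seed_site(mirna)
    return [i for i in range(len(utr) - len(site) + 1) if utr.startswith(site, i)]


def sites_retained(utr: str, mirna: str, cleavage_at: int) -> list[int]:
    """Sites lying entirely upstream of a cleavage position (0-based, exclusive)."""
    width = len(seed_site(mirna))
    return [p for p in site_positions(utr, mirna) if p + width <= cleavage_at]


# Hypothetical miRNA and 3' UTR carrying two seed-matched sites.
mirna = "AUUCGGACUAGCAUGCAAGUCC"
site = seed_site(mirna)
utr = "CCAUG" + site + "AAUAAACCCUUU" + site + "GGAUAAUAAA"

full_length = site_positions(utr, mirna)
short_isoform = sites_retained(utr, mirna, cleavage_at=24)
print(len(full_length), "sites in the long isoform")
print(len(short_isoform), "site(s) retained in the short isoform")
```

In this toy transcript the distal site is lost when the proximal cleavage site is used, which is the mechanism by which a shorter isoform would be expected to escape part of its miRNA-mediated repression.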
This Review has focused on the genetic variations known to occur in miRNAs, their binding sites and the genes that facilitate their processing. However, as the ‘miRNAome’ evolves, it is likely that new candidate SNPs and forms of genetic variation linked to cancer susceptibility will emerge. Furthermore, given the differential cell of origin for cancers arising from different anatomic sites, and the cell type specificity of miRNA transcriptomes, it is reasonable to assume that the effects of mir-SNPs will be modulated in a cell type-specific manner. The incorporation of miRNA target co-expression and expression quantitative trait locus (eQTL) mapping should aid in deciding whether a mir-SNP is functional. Such features are part of the databases Patrocles and PolymiRTS (Supplementary information S2 (box)).
Although candidate gene approaches can certainly ascertain the effect of a single SNP on an individual’s risk of cancer, the cumulative effect of the inheritance of multiple SNPs in miRNA-related genes might augment risk. Consistent with this idea, an increased risk of oesophageal and bladder cancer was observed in individuals with SNPs in both miRNAs and miRNA processing genes37. Furthermore, some of those mir-SNPs might be in linkage disequilibrium and therefore inherited together. The polygenic model of inherited breast cancer purports that unfavourable combinations of polymorphic genetic variants in low-penetrance susceptibility genes contribute to the excess familial breast cancer risk and most of these genes have not yet been discovered139–141. As mir-SNPs are rare27,30, and their minor allele frequencies are globally low, large studies will be needed to draw out their importance.
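Whether two mir-SNPs are inherited together can be quantified with the standard r² measure of linkage disequilibrium, computed from phased haplotypes. The sketch below uses the textbook definition (D = p_AB − p_A·p_B, r² = D² / [p_A(1 − p_A)·p_B(1 − p_B)]); the haplotype data and function name are hypothetical and serve only to illustrate the calculation.

```python
def ld_r2(haplotypes: list[tuple[int, int]]) -> float:
    """r^2 between two biallelic SNPs from phased haplotypes.

    Each haplotype is (allele at SNP 1, allele at SNP 2), with the minor
    allele coded as 1 and the major allele as 0.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # deviation from independent assortment
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d * d / denom


# Hypothetical samples: alleles always co-inherited versus assorting independently.
linked = [(1, 1)] * 3 + [(0, 0)] * 7
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(round(ld_r2(linked), 3))
print(round(ld_r2(unlinked), 3))
```

Values of r² close to 1 indicate that genotyping one SNP captures most of the information about the other, which matters when the cumulative effect of several mir-SNPs is being estimated.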
The complexity of the miRNA network is further intensified by the discovery of miRNA functions that fall outside their classic range. For example, there is evidence of miRNA-mediated increases in protein translation6, nuclear import of miRNAs with distinctive hexanucleotide terminal motifs142 and the secretion of miRNAs143,144. Furthermore, an alternative miRNA processing pathway has been uncovered in both Drosophila melanogaster and Caenorhabditis elegans that bypasses DROSHA and instead uses a splicing technique to generate miRNA precursors from short intronic sequences (mirtrons)145–147. How SNPs in miRNAs affect these pathways remains to be tested.
The data reviewed here explore the strong link between alterations in miRNA structure and function and inherited cancer risk. Although the pathways that mediate this risk have not been fully elucidated, there is a clear suggestion that cancer risk is mediated by changes in miRNA sequence and maturation. mir-SNPs affect cancer susceptibility, response to treatment and prognosis75,78,148. In addition to broadening our understanding of the astounding complexity of how miRNAs function, the study of genetic variation in miRNA networks has expanded our knowledge of the myriad ways in which miRNAs can affect cancer. The 3′ UTRs of genes are involved in multiple levels of regulation87. These oft-neglected regions are now known to be prime regulators of the transcriptome, and the importance of SNPs in these regions for human traits is exemplified by the range of phenotypes affected by these mutations (for reviews see REFs 59,149). Historically, these regions were not extensively mined for SNP discovery, something that should now be addressed74. Moreover, it is clear that miRNA genetic miscellany can affect the diversity of the genome and is related to cancer susceptibility. Several approaches have been used to apply this genetic basis to cancer, and a polygenic, network-based approach should be adopted in the future140. The validation of these findings in multiple cohorts and the testing of their applicability to different ethnic populations are also required. Furthermore, the linkage of population-based studies to functional validation is crucial for both basic science and the advancement of these findings to clinical applications; this step should no longer be overlooked in epidemiological studies if we are to unravel the implications of these networks in human disease. Therefore, despite the initial insights covered in this Review, we believe that a vast anthology of knowledge remains to be discovered.
This work was supported by the Intramural Research Program of the National Institutes of Health, NCI-CCR.
Competing interests statement
The authors declare no competing financial interests.
Entrez Gene: http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene.
Let-7b | let-7e | miR-10a | miR-16-1 | miR-21 | mir-26a-1 | mir-27a | miR-122 | mir-124-1 | miR-182* | mir-149 | miR-189 | mir-196a-2 | miR-378 | miR-423 | miR-453 | miR-575 | miR-582 | mir-BART22 | mmu-miR-10a | mmu-miR-27a | mmu-miR-27c | mmu-miR-155 | mmu-miR-222
Curtis C. Harris’ homepage: http://www3.cancer.gov/intra/LHC/LHCPAGE.htm
International HapMap Project: http://hapmap.ncbi.nlm.nih.gov/
Patrocles database: http://www.patrocles.org/
PolymiRTS database: http://compbio.uthsc.edu/miRSNP/
RNAfold software: http://rna.tbi.univie.ac.at/cgi-bin/RNAfold.cgi
Targetscan database: http://www.targetscan.org/
See online article: S1 (table) | S2 (box)