

HHS Public Access; Author Manuscript; Accepted for publication in peer-reviewed journal
Mol Cancer Res. Author manuscript; available in PMC 2013 October 1.
Published in final edited form as:
PMCID: PMC3475755

SINE Retrotransposons Cause Epigenetic Reprogramming of Adjacent Gene Promoters


Almost half of the human genome and as much as 40% of the mouse genome are composed of repetitive DNA sequences. The majority of these repeats are retrotransposons of the SINE and LINE families, and such repeats are generally repressed by epigenetic mechanisms. It has been proposed that these elements can act as methylation centers from which DNA methylation spreads into gene promoters in cancer. Contrary to a methylation center function, we have found that retrotransposons are enriched near promoter CpG islands that stay methylation-free in cancer. It is therefore important to determine which influence, if any, these repetitive elements have on nearby gene promoters. Using an in vitro system, we confirm here that SINE B1 elements can influence the activity of downstream gene promoters, with acquisition of DNA methylation and loss of activating histone marks, resulting in a repressed state. SINE sequences themselves did not immediately acquire DNA methylation, but were marked by H3K9me2 and H3K27me3. Moreover, our bisulfite sequencing data did not support the idea that gain of DNA methylation in gene promoters occurred by methylation spreading from SINE B1 repeats. Genome-wide analysis of SINE repeat distribution showed that their enrichment is directly correlated with the presence of USF1, USF2 and CTCF binding, proteins with insulator function. In summary, our work supports the concept that SINE repeats interfere negatively with gene expression and that their presence near gene promoters is counter-selected, except when the promoter is protected by an insulator element.

Keywords: DNA methylation, gene silencing, retrotransposons


Retrotransposons of the SINE and LINE classes have been extremely successful in colonizing mammalian genomes (1). In human and mouse specifically, these two classes of repeats represent almost 50% of the genome. This high frequency was in the past regarded as a result of multiplication by retrotransposition and random integration into the genome. However, new evidence suggests that the current distribution of these elements was shaped by evolutionary constraints and that retrotransposons and other repeats have evolved from parasitic sequences into functional genomic elements (2, 3). In addition, earlier evidence indicated that the distribution of interspersed elements is conserved between the human and mouse genomes (4). The direct comparison of these genomes revealed a spatial concordance in the positioning of SINE but not LINE repeats, indicating evolutionary pressure to maintain and/or exclude these repeats from orthologous regions (5).

Mobilization of retroelements is inhibited by epigenetic mechanisms (6). SINE and LINE elements are frequently methylated, as confirmed by recent genome-wide methylation studies in mouse and in human (7, 8). In normal cells, DNA methylation is rare in promoter-associated CpG islands but important for X-inactivation (9) and genomic imprinting (10). During tumorigenesis, hundreds to thousands of genes gain methylation in promoter-associated CpG islands, affecting many pathways including tumor-suppressor pathways (11). The causes of this massive switch in DNA methylation remain mysterious. When searching for sequences that differentiate genes that are rarely methylated in cancer from those frequently methylated, we found that SINE and LINE retrotransposons are enriched around the transcription start sites (TSSs) of genes that are never or rarely methylated in cancer (12). This association was so strong that it was possible to build a mathematical model to predict gene methylation based on the presence of retrotransposons near the gene TSS, and this model worked well when applied to methylation data from both cancer cell lines and primary tissues. Similar findings, supporting the view that a signature of retrotransposon abundance predicts epigenetic states, have been described for DNA methylation in normal tissues (13), DNA methylation in cancer (14), genomic imprinting (15) and X-chromosome inactivation (16), among other studies. Given the hypermethylated state of retrotransposons in normal cells, it is somewhat paradoxical that SINE and LINE elements are poorly represented in the vicinity of genes that become hypermethylated in cancer (12, 14, 17). Moreover, SINE B1 elements were previously reported to cause transcriptional silencing of the adjacent aprt gene by spreading of DNA methylation (18).

We hypothesized that the enrichment of retrotransposons near cancer methylation-resistant genes is due to a negative influence of SINE repeats on gene expression and to counter-selection over evolution, resulting in exclusion of these repeats from vulnerable genomic environments. Thus, only gene promoters with intrinsic resistance to DNA methylation are permissive to the nearby presence of retrotransposons. This hypothesis predicts that (i) SINE retrotransposons cause silencing of vulnerable nearby gene promoters, and (ii) SINE abundance near promoters correlates with genomic features that limit the influence of retrotransposons on adjacent chromatin. Here, we show that both of these predictions are correct.


Cell Lines, Plasmids and Transfection

Promoter regions containing CpG islands from the murine genes Cdkn2d (−440 bp to +618 bp from the gene TSS), Cdkn2a long transcript, also known as p14Arf (−273 bp to +474 bp from the gene TSS), and Mlh1 (−226 bp to +267 bp from the gene TSS) were identified from mouse genome sequence repositories and amplified by PCR. The rationale for selecting these genes among thousands of others with a promoter CpG island was that p14Arf and Mlh1 have tumorigenic potential in mice and that their human homologues are prone to become hypermethylated in human neoplasias (19, 20). Cdkn2d was included because it belongs to the same gene family as p14Arf. Also, none of these promoter CpG islands is constitutively methylated in normal cells. NIH/3T3 DNA was used as template for the Cdkn2d and Mlh1 PCRs, and p14Arf was amplified from BALB/c mouse liver DNA. The PCR fragments were digested with appropriate restriction enzymes and subcloned into a pGL3-Basic reporter plasmid (wild-type promoters). SINE B1 elements were inserted immediately upstream of the cloned promoters, generating the plasmid variants containing 2, 4 or 6 B1 elements (see Table S1 for more details on the cloned SINE B1s and other repetitive elements used in this study). Mouse embryonic fibroblast NIH/3T3 cells were transiently co-transfected with the promoter constructs (1 μg) and 20 ng of pRL-TK vector using FuGene reagent (Roche) according to standard methods. Cells were harvested approximately 24, 48 and 72 hours after transfection, and luciferase activity from cell extracts was detected using the Luciferase Assay System (Promega, Madison, WI, USA) as specified by the manufacturer. The magnitude of activation of luciferase constructs was determined after normalization to pRL-TK activity, and the value of each wild-type promoter was then taken as 1.0-fold. Stable transfections were performed by co-transfection of a neomycin-expression vector.
Neomycin selection was initiated on day 2 after transfection, the medium was replaced with fresh DMEM containing 10% calf serum and G418 at 400 μg/ml (Invitrogen), and the cells were sub-cultured every 2–3 days. Cell pellets were collected for DNA extraction and luciferase readings at days 19, 36, 48, 60 and 90 for Cdkn2d and Mlh1 constructs and at days 15, 22, 29, 36, 43 and 50 for p14Arf constructs. Sequences of all primers used in this work are provided in Table S2. For investigation of human SINEs, we cloned the human CDH1 gene promoter upstream of the luciferase reporter gene in four different configurations: wt-CDH1, without any retroelements (−303 bp to +1,370 bp from the gene TSS); SINE-CDH1, with Alu and Mir sequences downstream of the gene promoter (−303 bp to +2,318 bp from the gene TSS); LINE-CDH1, with a LINE element sequence downstream of the gene promoter, cloned at position +836 bp from the gene TSS in the wt-CDH1 plasmid; and (S/L)INE-CDH1, with Alu, Mir and LINE sequences downstream of the promoter (see Fig. 6a for a graphic representation of these constructs). Luciferase activity of the transgenes was measured as described above after transient transfection in the human cell lines RKO (colorectal carcinoma) and NCI-H1299 (lung carcinoma). All cell lines were obtained directly from ATCC (Manassas, Virginia). These cell lines were not re-authenticated in our laboratory since the vendor had already authenticated them.
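
The normalization described above can be sketched as follows. This is only an illustrative sketch, not the authors' analysis script: the function name and arguments are hypothetical, and it assumes that the reporter signal is a firefly luciferase reading (pGL3-Basic) and that the pRL-TK signal is a Renilla luciferase reading taken from the same extract.

```python
def relative_promoter_activity(firefly: float, renilla: float,
                               wt_firefly: float, wt_renilla: float) -> float:
    """Return construct activity relative to the wild-type promoter (wild type = 1.0).

    Each reporter reading is first divided by the co-transfected pRL-TK reading
    (assumed here to be Renilla luciferase) to correct for transfection efficiency.
    """
    if renilla <= 0 or wt_renilla <= 0 or wt_firefly <= 0:
        raise ValueError("control and wild-type readings must be positive")
    construct_ratio = firefly / renilla      # normalized activity of the B1 construct
    wt_ratio = wt_firefly / wt_renilla       # normalized activity of the wild-type promoter
    return construct_ratio / wt_ratio        # fold activity, wild type defined as 1.0


# Hypothetical readings: a B1-containing construct with half the wild-type activity.
print(relative_promoter_activity(500.0, 100.0, 1000.0, 100.0))
```

With these made-up readings the construct scores 0.5-fold, i.e. 50% less active than the wild-type promoter.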

Fig. 6
The human CDH1 gene promoter is sensitive to SINE, but not LINE retrotransposons. (a) Cloning strategy of the CDH1 gene promoter in the pGL3-basic vectors, with and without SINE and LINE retrotransposons. As shown, four different constructs were transiently ...

DNA Methylation Studies

Bisulfite treatment was performed as previously reported (21) and 1/10th of the final volume was used as a template for PCR. Except for p14Arf (which is deleted in NIH/3T3 cells), the gene promoters were amplified by semi-nested PCRs, where one of the primers of the first reaction was located in the plasmid sequence to avoid detection of the endogenous gene. Methylation density was determined by COBRA (Combined Bisulfite Restriction Analysis (22)); PCR products were separated by 6% polyacrylamide gel electrophoresis, stained with ethidium bromide, imaged, and quantitated in a Bio-Rad Geldoc 2000 imager (Bio-Rad, Hercules, CA), and the methylation density for each sample was computed as the ratio of the density of the digested bands to the density of all bands in a given lane. DNA methylation was confirmed by bisulfite sequencing and/or pyroMeth analysis. The full list of primers used in these studies is presented in Table S2.
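
The COBRA ratio defined above can be written as a short calculation. The sketch below is illustrative only; the function name and the band intensities are hypothetical, and it assumes that background-subtracted densitometry values are already available for every band in the lane.

```python
from typing import Sequence


def cobra_methylation_density(digested: Sequence[float],
                              undigested: Sequence[float]) -> float:
    """Fraction of lane signal found in digested (methylated) bands.

    digested   -- densities of the restriction-digested band(s)
    undigested -- densities of the uncut band(s) in the same lane
    """
    total = sum(digested) + sum(undigested)
    if total <= 0:
        raise ValueError("lane has no measurable signal")
    return sum(digested) / total


# Hypothetical lane: two digested fragments and one uncut band.
print(cobra_methylation_density([30.0, 20.0], [50.0]))
```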

Chromatin Immunoprecipitation (ChIP) Assays

Log-phase growing cells were crosslinked with 1% formaldehyde, washed twice with cold PBS containing protease inhibitors and harvested by scraping. Cells were sonicated in SDS lysis buffer, followed by elution in ChIP buffer. We chose to evaluate promoter marking by active histone modifications that are universally observed in active promoters (H3K9ac) or preferentially found in active promoter CpG islands (H3K4me3), and by repressive modifications that are associated with gene repression with or without concurrent DNA methylation (H3K9me2) or that are typically mutually exclusive with DNA methylation (H3K27me3). The following antibodies were used for immunoprecipitation: H3K4me3 (Millipore, 17-614); H3K9ac (Millipore, 07-352); H3K9me2 (ABCAM, ab1220); H3K27me3 (Millipore, 17-622); Histone H3 (ABCAM, ab1791-100); and rabbit IgG (ABCAM, ab46540). Chromatin-antibody complexes were captured using Dynabeads Protein A/G (Invitrogen, Carlsbad) and the immunoprecipitated DNA was treated with proteinase K, purified by column filtration and eluted in Tris buffer. Quantitative real-time PCR was used to detect amplicons from the target sequences (p14Arf promoter and SINE B1) and from positive controls for active (tuba) and repressed genes (hbb-b1 and nanog), and the fold-enrichment of each histone modification and of rabbit IgG relative to Histone H3 was calculated using the delta-Ct method.
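
A delta-Ct calculation of this kind can be sketched as below. The text does not give the exact formula used, so this is an assumption-laden illustration: the function name is hypothetical, the Ct values are invented, and the default base of 2 assumes perfect doubling per PCR cycle.

```python
def fold_enrichment_vs_h3(ct_mark: float, ct_h3: float,
                          efficiency: float = 2.0) -> float:
    """Enrichment of a histone-mark (or IgG) IP relative to the total-H3 IP.

    ct_mark    -- Ct of the amplicon in the mark-specific (or IgG) IP
    ct_h3      -- Ct of the same amplicon in the Histone H3 IP
    efficiency -- amplification factor per cycle (2.0 = assumed ideal doubling)
    """
    delta_ct = ct_mark - ct_h3            # lower Ct in the mark IP means more DNA
    return efficiency ** (-delta_ct)


# Hypothetical Ct values: the mark IP crosses threshold two cycles before the H3 IP.
print(fold_enrichment_vs_h3(24.0, 26.0))
```

Under these assumptions a two-cycle advantage corresponds to a four-fold enrichment over Histone H3.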

Genome-wide mapping of SINE and LINE repeats

Transcription start site coordinates of mouse and human RefSeq genes and SINE repeats were downloaded from the UCSC Genome Browser (mm8 and hg18 releases). The coordinates of CTCF binding sites in the human genome were obtained from public data releases from the Encode Chromatin Group at Broad Institute and Massachusetts General Hospital, and mouse CTCF binding sites were reported by Chen et al. (23). Genome coordinates of USF1 and USF2 binding were obtained from (24). Each RefSeq gene was represented by 20 bins of 1-kb sequence each (10 bins upstream and 10 bins downstream of the gene TSS), and each bin was then annotated as occupied or not by SINE retrotransposons (SINEs spanning two bins were assigned to the bin closest to the TSS). We compared the average abundance of SINE retrotransposons per bin in genes bound and not bound by each of CTCF, USF1 and USF2 in the 20-kb genomic region around the gene TSS. Genes were also classified as having or lacking a CpG island overlapping their proximal promoter region (−200 bp to +200 bp). The frequency distribution of SINEs in the two groups (genes bound and not bound by insulator proteins) was compared using Student's t-test.
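
The binning scheme above can be illustrated with a minimal sketch. All names and coordinates are hypothetical, and several details are assumptions rather than statements of the authors' pipeline: repeat intervals are treated as half-open, offsets are made strand-aware so that negative values are upstream, and per-bin abundance is summarized as the fraction of genes with an occupied bin.

```python
from typing import List, Optional, Sequence, Tuple

BIN_SIZE = 1_000        # 1-kb bins
BINS_PER_SIDE = 10      # 10 bins upstream and 10 downstream of the TSS
N_BINS = 2 * BINS_PER_SIDE


def bin_index(offset: int) -> Optional[int]:
    """Map a signed offset from the TSS (negative = upstream) to a bin 0..19."""
    half_window = BIN_SIZE * BINS_PER_SIDE
    if offset < -half_window or offset >= half_window:
        return None                                  # outside the 20-kb window
    return offset // BIN_SIZE + BINS_PER_SIDE        # floor division handles negatives


def sine_occupancy(tss: int, strand: str,
                   sines: Sequence[Tuple[int, int]]) -> List[bool]:
    """Flag each bin as occupied or not; a SINE goes to its bin closest to the TSS."""
    occupied = [False] * N_BINS
    for start, end in sines:                         # half-open [start, end)
        lo, hi = start - tss, end - 1 - tss
        if strand == "-":                            # flip so negative is upstream
            lo, hi = -hi, -lo
        if lo <= 0 <= hi:
            nearest = 0                              # repeat overlaps the TSS itself
        elif hi < 0:
            nearest = hi                             # upstream repeat: TSS-proximal end
        else:
            nearest = lo                             # downstream repeat: TSS-proximal end
        idx = bin_index(nearest)
        if idx is not None:
            occupied[idx] = True
    return occupied


def mean_abundance_per_bin(occupancies: Sequence[Sequence[bool]]) -> List[float]:
    """Fraction of genes with a SINE in each bin (one simple abundance summary)."""
    if not occupancies:
        raise ValueError("no genes supplied")
    n = len(occupancies)
    return [sum(occ[i] for occ in occupancies) / n for i in range(N_BINS)]


# Hypothetical plus-strand gene with one SINE straddling the -1 kb bin boundary.
gene_a = sine_occupancy(50_000, "+", [(48_900, 49_200)])
gene_b = sine_occupancy(80_000, "+", [])
print(mean_abundance_per_bin([gene_a, gene_b])[9])
```

The per-bin values of the bound and unbound gene groups could then be passed to any standard two-sample t-test implementation.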


SINE B1 elements reduce the transcriptional activity of adjacent promoters

To test the capacity of SINE B1 elements to promote gene silencing, we generated a system in which the luciferase gene is under the control of three different mouse gene promoters (Cdkn2d, p14Arf and Mlh1), and we inserted two and four copies of SINE B1 elements upstream of these promoters (Fig. 1a; all primers used in this study are included in Table S2). One plasmid containing six copies of SINE B1 elements (6B1-p14Arf) was also used in some of the experiments. These plasmids were transfected into the immortalized mouse fibroblast cell line NIH/3T3 (see Methods for more details). This cell line was chosen for our experiments because it has been shown to sustain DNA methylation in transfected plasmids (25) and because of its high transfection efficiency. We initially assayed luciferase activity after short-term transfection, and we observed that the plasmids with SINE B1 elements were 30–60% less transcriptionally active than plasmids without SINE B1 insertion (Fig. 1b–d). The level of repression was dependent on the number of copies of inserted SINE elements, and we found that the repression increased with time. Transcriptional repression was independent of the orientation of SINE B1 elements, as these retrotransposons caused the same degree of repression when they were inserted in negative orientation relative to the gene promoter (Fig. S1).

Fig. 1
Gene promoters with nearby SINE B1 show reduced transcriptional activity. (a) Plasmid constructs used to evaluate the effect of SINE B1 elements on promoter activity. The luciferase reporter gene was put under the control of mouse Cdkn2d, Mlh1 or p14Arf ...

p14Arf is silenced by SINE B1s and shows characteristic repressed chromatin

We followed the dynamics of promoter activity over a two-month period, and we found that the p14Arf plasmids with SINE B1 insertion became fully repressed (Fig. 2a). The repression of p14Arf promoter activity was reproducible in two independent experiments, ruling out possible effects related to plasmid insertion sites in the genome. Notably, the loss in promoter activity occurred gradually over time, suggesting that reinforcing mechanisms act to promote gene repression. To evaluate whether epigenetic reprogramming had occurred, DNA methylation density near the TSS of p14Arf was measured using the COBRA assay (22). No special considerations for primer design were required to avoid investigation of an endogenous copy of p14Arf because this gene is homozygously deleted in NIH/3T3 cells. After an initial gain of methylation in the p14Arf promoter by day 29, no significant difference was observed afterwards (Fig. 2b), despite consistent transcriptional repression. The methylation data at day 29 were confirmed by pyroMeth, showing that the gain of methylation was not restricted to an individual CpG site but was concordant between 9 individual CpG sites located between positions −21 and +88 base pairs from the gene TSS (Fig. 2c). We hypothesized that for p14Arf the DNA methylation mark was replaced by other epigenetic mechanisms of silencing, most likely histone marking. Indeed, using chromatin immunoprecipitation (ChIP) assays, we found that the p14Arf promoter showed a large enrichment for H3K4me3 and H3K9ac in p14Arf-wt compared to 4B1-p14Arf plasmids. However, the repressive marks H3K9me2 and H3K27me3 did not differ significantly (Fig. 2d), suggesting that other marks are relevant here. In summary, the presence of SINE B1 elements near the gene TSS promoted epigenetic reprogramming, with changes in DNA methylation and histone posttranslational modifications.

Fig. 2
SINE B1 elements cause transcriptional repression and epigenetic reprogramming of the p14Arf gene promoter. (a) Promoter activity over time with and without proximal insertion of SINE B1 elements. The p14Arf promoter shows gradual decline in promoter activity ...

Cdkn2d is also silenced by SINE B1s while the Mlh1 gene promoter is refractory to SINE-mediated silencing

We also studied the long-term effects of SINE B1s on Cdkn2d and Mlh1 promoter activity. Consistent with a role of SINE B1s as repressors, the Cdkn2d promoter gradually lost activity until reaching only 5% to 10% of its original strength when cloned close to B1 sequences (Fig. 3a). In sharp contrast to p14Arf, there was a progressive and persistent gain of methylation in the Cdkn2d promoter in B1-containing plasmids (Fig. 3b). An exception to continued silencing by SINE B1s was seen for Mlh1 plasmids (Fig. 3c). Despite the decrease in Mlh1 promoter activity in B1-containing plasmids in short-term transfections, no additional repression was measured afterwards. Inverting the direction of the Mlh1 promoter did not change this result in an independent experiment. These differences between p14Arf, Cdkn2d and Mlh1 in response to SINE B1s in long-term transfections are also supported by the non-normalized data (Fig. S2). The Mlh1 promoter remained mostly unmethylated in both wt- and B1-containing plasmids, in agreement with the lack of change in promoter activity as measured by luciferase activity (Fig. 3d). Bisulfite sequencing of the Mlh1 promoter region confirmed that the CpG sites close to the TSS were resistant to de novo methylation (Fig. 3e, 3f), whereas the remaining CpG sites gained methylation. However, this gain in methylation did not differ between wt- and B1-plasmids, suggesting that it occurred stochastically.

Fig. 3
Mlh1 is refractory to silencing by SINE B1s. (a) As observed for p14Arf, the Cdkn2d promoter was sensitive to the presence of SINE B1s and showed gradual decline in promoter activity along time, but relatively stable activity when cloned in a B1-free ...

A cryptic methylation-center upstream to p14Arf TSS is destabilized by SINE B1s

An important question is whether the SINE B1s are acting as methylation centers in this system, as reported previously for the mouse aprt gene. To address this question, we performed bisulfite sequencing to gain deeper information regarding methylation changes in the p14Arf promoter. The pattern of methylation was consistent with the data generated by COBRA and pyroMeth, with higher overall methylation of B1-containing plasmids (Fig. 4a). Two additional pieces of information were generated in this experiment. First, the pattern of methylation was not consistent with methylation spreading from B1 elements into nearby DNA, as the promoter sequence adjacent to SINE B1 remained mostly unmethylated. We observed, however, the existence of internal sequences in the p14Arf promoter that gained methylation before other CpG sites, and methylation spreading from these regions, which we call here a cryptic methylation center. Interestingly, the alignment of the four central CpG sites in the cryptic methylation center revealed a repeated sequence motif (A/G)GCG(A/G)(A/G), but no protein is known or predicted to bind to this site. Although also present in wt-p14, these regions gained methylation more rapidly in B1-containing plasmids. Second, as observed for the Mlh1 promoter, the CpG sites away from the TSS are more susceptible to gain of methylation (Fig. 4b), accumulating methylation density as high as 90% (for the methylation center) and 35% (the region upstream to the TSS).
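
As an illustration of how the repeated motif (A/G)GCG(A/G)(A/G) could be located in a promoter sequence, a small sketch follows. It is not part of the original analysis: the function name and the example sequences are hypothetical, and only the given strand is scanned (the reverse complement would need a separate pass).

```python
import re
from typing import List, Tuple

# (A/G)GCG(A/G)(A/G) written as a regular expression; the lookahead lets
# overlapping occurrences be reported as well.
MOTIF = re.compile(r"(?=([AG]GCG[AG][AG]))")


def find_motif(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based position, matched hexamer) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(sequence.upper())]


# Hypothetical sequence containing a single occurrence.
print(find_motif("TTAGCGGATT"))
```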

Fig. 4
The p14Arf promoter has a cryptic methylation center. (a) Four CpG sites in the p14Arf promoters gained methylation at a faster rate than other CpG sites at 22 days after transfection, and spreading of DNA methylation can be observed in the vicinity of ...

Cloned SINE B1s did not become methylated, but acquired a silent chromatin configuration

In order to better characterize the epigenetic status of the SINE B1 elements, we performed bisulfite sequencing of the most 5′ region of the p14Arf promoter together with two or four copies of SINE B1s. This sequencing of B1 elements in p14Arf plasmids revealed that they remained methylation-free, supporting the idea that in this system these elements do not function as centers for spreading of DNA methylation (Fig. 5a). B1 elements, however, acquired H3K9me2 and H3K27me3, both markings associated with repressed status (Fig. 5b) as revealed by ChIP assay in comparison to the physiologically repressed genes Hbb-b1 and Nanog, and the constitutively expressed Tuba1a gene.

Fig. 5
SINE B1 elements remained methylation-free and do not function as methylation centers in this system. (a) Bisulfite sequencing of B1 sequences upstream to the p14Arf promoter shows no sign of significant cytosine methylation of these elements. (b) Despite ...

Human Alu SINEs also trigger rapid gene repression

Since mouse SINE B1 elements caused transcriptional repression, we sought to investigate whether human SINE and LINE repeats could have the same effect. For this study, we used the human E-cadherin gene (CDH1) as a model (Fig. 6a). The human CDH1 gene has four SINEs adjacent to the 3′ region of its CpG island (three Alus and a Mir element), and in reporter assays the removal of these repeats resulted in higher, stable promoter activity in a cell line-dependent manner (Fig. 6b). However, the introduction of LINE sequence to plasmids with or without SINEs did not change promoter activity. Thus, SINE retrotransposons of human origin, similar to the tested mouse SINE, can interfere negatively with promoter activity.

Retrotransposon distribution is influenced by insulator elements

To investigate the second prediction derived from our hypothesis, i.e. that SINE fitness is influenced by insulator elements, we annotated the abundance of mouse and human SINE retrotransposons in a 20-kb genomic region around gene TSSs, and then compared the frequency of these elements between genes bound and not bound by CTCF, USF1 and USF2 (Fig. 7). These transcription factors have previously been shown to function as barriers to heterochromatin spreading (26, 27), and data on their in vivo binding in mouse ES cells (23) and multiple normal cell lines are available from whole-genome maps (24) and in public releases from the Encode Chromatin Group at Broad Institute and Massachusetts General Hospital. Other putative insulators such as SP1 (25, 28) and VEZF1 (29) were not tested because they lack extensive in vivo binding studies, and we excluded predicted binding sites from our study. In general, genes bound by any of these factors show a higher frequency of SINE repeats than unbound genes, and the observed difference is statistically significant (p<0.01, t-test). An exception is that the distribution of Alu repeats between CTCF-bound and -unbound genes is similar in promoter CpG islands (but different in non-CpG island promoters). This finding was concordant between the mouse and human genomes, and we believe that the lack of difference is related to the high frequency of CTCF binding sites in CpG islands (40% of promoter CGIs are bound by CTCF, compared to 20% of non-CpG island promoters), to a weaker insulating activity of CTCF compared to USF1 and USF2, and to the fact that CpG islands are maintained in an open chromatin state by additional mechanisms (for example, binding of Cfp1 and KDM2A proteins (30, 31)). USF1 and USF2 were similarly distributed between CGI and non-CGI promoters (USF1 and USF2 are present in 7% and 5% of gene promoters, respectively).
In conclusion, the presence of SINE repeats is better tolerated by gene promoters insulated through USF1, USF2 and CTCF (in the case of non-CpG island promoters).

Fig. 7
SINE retrotransposons accumulate near gene promoters bound by insulator proteins. Gene promoters were grouped according to their overlap with CpG islands (CGI) and binding by CTCF (mouse and human promoters), USF1 and USF2 (human promoters only). The graphics ...


Here we show that SINE B1 elements can influence the activity of proximal promoters and ultimately lead to epigenetic reprogramming. The early effect on promoter activity is compatible with direct recruitment of co-repressors; however, alternative mechanisms cannot be ruled out. Independent of the mechanism by which SINE elements promote transcriptional repression with associated epigenetic remodeling of adjacent promoters, it is evident that their close proximity to gene promoters has a deleterious effect and, as such, their insertion in close proximity to gene promoters is evolutionarily constrained. Indeed, we show here that Alu repeats in the human genome are more frequently found near gene promoters that are bound by insulator proteins. Additionally, our results show that for the tested gene promoters, transcriptional repression occurred before DNA methylation, supporting previous findings in a different cellular system (32). In the case of the Cdkn2d promoter, transcriptional repression was sufficient to trigger relatively stable DNA methylation. A more dynamic mechanism of repression was involved in p14Arf silencing, with DNA methylation being subsequently replaced by histone marking. A similar phenomenon has been observed in cancer cells and dubbed “epigenetic switch”; however, in that case repressive histone marks were replaced by DNA methylation (33, 34). It is clear from our results that not all genes are equally sensitive to repression by retrotransposons, as Mlh1 promoter activity was only moderately affected by B1 SINEs: despite an initial decrease in promoter activity very early after transfection, no additional effect was observed later on. We speculate that the proximal promoter of Mlh1 is protected from heterochromatinization; indeed, profiling of multiple cancer tissues has shown that Mlh1 is methylated in a lower fraction of tumors compared to other classically studied genes such as p16 and DAPK1.

Our data add to previous reports in which silencing of nearby genes was induced by a retrotransposon. As mentioned earlier, mouse B1 elements were reported to act as methylation centers from which DNA methylation leaked into the aprt gene promoter (18). Our data differ from this report in that spreading of DNA methylation from the SINE B1 into the nearby gene promoters was not observed. Still, this retroelement mediated gene repression and facilitated methylation spreading from a cryptic methylation center located in the proximal promoter of the p14Arf gene. A possible explanation for this difference is that hundreds of thousands of SINE B1 elements occur in mice, and it is likely that different subfamilies exert their influence on nearby sequences through alternative mechanisms. For example, a subfamily of B1 elements has been previously described, and its influence on the transcriptional activity of proximal genes is mediated by the transcription factors Ahr and Slug (35). The primordial role that this newly identified SINE B1 subfamily plays in the genome remains to be fully understood, as the same research group later described an insulator function for this element (36). Such apparently contradictory findings show the complexity of the issue, with alternative outcomes of SINE B1 presence being mediated by sequence variation, physical localization in the genome and potentially tissue-specific regulation. Lunyak et al. (37) have identified an insulator activity for another SINE family, SINE B2. In their study, the insulating activity of SINE B2 elements appeared to be tissue-specific, as it created a permissive chromatin state for the transcription of the pituitary-specific growth hormone gene, and also developmentally regulated. Altogether, these data indicate that it is unlikely that all retroelements of a given family have a universal function. Subtle changes in sequence and location in the genome, which create an opportunity for interaction with diverse regulatory elements, will ultimately shape their activity.

A weakness of our study is that our system is based on random integration of plasmids. Thus, we cannot rule out site-specific effects that may influence SINE B1 activity and the transcriptional outcome of the tested gene promoters. However, it appears that site-specific effects were not the major determinants of the transcriptional and epigenetic fate of the tested constructs, given the reproducibility of the data across different gene promoters and between repeated experiments. In addition, the observed correlation of SINE elements in both human and mouse genomes with known insulator factors suggests that, by and large, there is a need for isolating genes from these elements. Recent reports revealed that retrotransposition occurs at relatively high levels and accounts for genome variability across individuals and between normal and disease states (38–41). Once enough data have been generated with correlated genome, epigenome and transcriptome information for multiple subjects, it will be possible to more directly assess the effect of newly inserted elements on the transcription and epigenetic state of neighboring genes. Naturally, hundreds to thousands of observations will be necessary to distinguish noise from an actual effect.

Supplementary Material




This work was supported by National Institutes of Health grants P50CA100632, RO1CA098006 and R33CA89837. J-PJI is an American Cancer Society Professor. All DNA sequencing was performed in the DNA Analysis Core Facility at the M.D. Anderson Cancer Center, which is supported by NCI Grant CA-16672 (DAF).


Conflict of interest: None


1. Kazazian HH Jr. Mobile elements: drivers of genome evolution. Science. 2004;303:1626–32.
2. Faulkner GJ, Kimura Y, Daub CO, Wani S, Plessy C, et al. The regulated retrotransposon transcriptome of mammalian cells. Nat Genet. 2009;41:563–71.
3. Shen S, Lin L, Cai JJ, Jiang P, Kenkel EJ, et al. Widespread establishment and regulatory impact of Alu exons in human genes. Proc Natl Acad Sci U S A. 2011;108:2837–42.
4. Soriano P, Meunier-Rotival M, Bernardi G. The distribution of interspersed repeats is nonuniform and conserved in the mouse and human genomes. Proc Natl Acad Sci U S A. 1983;80:1816–20.
5. Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420:520–62.
6. Walsh CP, Chaillet JR, Bestor TH. Transcription of IAP endogenous retroviruses is constrained by cytosine methylation. Nat Genet. 1998;20:116–7.
7. Lister R, Pelizzola M, Dowen RH, Hawkins RD, Hon G, et al. Human DNA methylomes at base resolution show widespread epigenomic differences. Nature. 2009;462:315–22.
8. Meissner A, Mikkelsen TS, Gu H, Wernig M, Hanna J, et al. Genome-scale DNA methylation maps of pluripotent and differentiated cells. Nature. 2008;454:766–70.
9. Heard E, Clerc P, Avner P. X-chromosome inactivation in mammals. Annu Rev Genet. 1997;31:571–610.
10. Barlow DP. Gametic imprinting in mammals. Science. 1995;270:1610–3.
11. Jones PA, Baylin SB. The fundamental role of epigenetic events in cancer. Nat Rev Genet. 2002;3:415–28.
12. Estecio MR, Gallegos J, Vallot C, Castoro RJ, Chung W, et al. Genome architecture marked by retrotransposons modulates predisposition to DNA methylation in cancer. Genome Res. 2010;20:1369–82.
13. Bock C, Paulsen M, Tierling S, Mikeska T, Lengauer T, et al. CpG island methylation in human lymphocytes is highly correlated with DNA sequence, repeats, and predicted DNA structure. PLoS Genet. 2006;2:e26.
14. Feltus FA, Lee EK, Costello JF, Plass C, Vertino PM. DNA motifs associated with aberrant CpG island methylation. Genomics. 2006;87:572–9.
15. Greally JM. Short interspersed transposable elements (SINEs) are excluded from imprinted regions in the human genome. Proc Natl Acad Sci U S A. 2002;99:327–32.
16. Wang Z, Willard HF, Mukherjee S, Furey TS. Evidence of influence of genomic DNA sequence on human X chromosome inactivation. PLoS Comput Biol. 2006;2:e113.
17. Feltus FA, Lee EK, Costello JF, Plass C, Vertino PM. Predicting aberrant CpG island methylation. Proc Natl Acad Sci U S A. 2003;100:12253–8.
18. Yates PA, Burman R, Simpson J, Ponomoreva ON, Thayer MJ, et al. Silencing of mouse Aprt is a gradual process in differentiated cells. Mol Cell Biol. 2003;23:4461–70.
19. Fu Z, Regan K, Zhang L, Muders MH, Thibodeau SN, et al. Deficiencies in Chfr and Mlh1 synergistically enhance tumor susceptibility in mice. J Clin Invest. 2009;119:2714–24.
20. Kamijo T, Zindy F, Roussel MF, Quelle DE, Downing JR, et al. Tumor suppression at the mouse INK4a locus mediated by the alternative reading frame product p19ARF. Cell. 1997;91:649–59.
21. Toyota M, Ho C, Ahuja N, Jair KW, Li Q, et al. Identification of differentially methylated sequences in colorectal cancer by methylated CpG island amplification. Cancer Res. 1999;59:2307–12.
22. Xiong Z, Laird PW. COBRA: a sensitive and quantitative DNA methylation assay. Nucleic Acids Res. 1997;25:2532–4.
23. Chen X, Xu H, Yuan P, Fang F, Huss M, et al. Integration of external signaling pathways with the core transcriptional network in embryonic stem cells. Cell. 2008;133:1106–17.
24. Rada-Iglesias A, Ameur A, Kapranov P, Enroth S, Komorowski J, et al. Whole-genome maps of USF1 and USF2 binding and histone H3 acetylation reveal new aspects of promoter structure and candidate genes for common human disorders. Genome Res. 2008;18:380–92.
25. Boumber YA, Kondo Y, Chen X, Shen L, Guo Y, et al. An Sp1/Sp3 binding polymorphism confers methylation protection. PLoS Genet. 2008;4:e1000162.
26. Bell AC, West AG, Felsenfeld G. The protein CTCF is required for the enhancer blocking activity of vertebrate insulators. Cell. 1999;98:387–96.
27. West AG, Huang S, Gaszner M, Litt MD, Felsenfeld G. Recruitment of histone modifications by USF proteins at a vertebrate barrier element. Mol Cell. 2004;16:453–63.
28. Mummaneni P, Yates P, Simpson J, Rose J, Turker MS. The primary function of a redundant Sp1 binding site in the mouse aprt gene promoter is to block epigenetic gene inactivation. Nucleic Acids Res. 1998;26:5163–9.
29. Dickson J, Gowher H, Strogantsev R, Gaszner M, Hair A, et al. VEZF1 elements mediate protection from DNA methylation. PLoS Genet. 2010;6:e1000804.
30. Blackledge NP, Zhou JC, Tolstorukov MY, Farcas AM, Park PJ, et al. CpG islands recruit a histone H3 lysine 36 demethylase. Mol Cell. 2010;38:179–90.
31. Thomson JP, Skene PJ, Selfridge J, Clouaire T, Guy J, et al. CpG islands influence chromatin structure via the CpG-binding protein Cfp1. Nature. 2010;464:1082–6.
32. Mutskov V, Felsenfeld G. Silencing of transgene transcription precedes methylation of promoter DNA and histone H3 lysine 9. EMBO J. 2004;23:138–49.
33. Gal-Yam EN, Egger G, Iniguez L, Holster H, Einarsson S, et al. Frequent switching of Polycomb repressive marks and DNA hypermethylation in the PC3 prostate cancer cell line. Proc Natl Acad Sci U S A. 2008;105:12979–84.
34. Kondo Y, Shen L, Cheng AS, Ahmed S, Boumber Y, et al. Gene silencing in cancer by histone H3 lysine 27 trimethylation independent of promoter DNA methylation. Nat Genet. 2008;40:741–50.
35. Roman AC, Benitez DA, Carvajal-Gonzalez JM, Fernandez-Salguero PM. Genome-wide B1 retrotransposon binds the transcription factors dioxin receptor and Slug and regulates gene expression in vivo. Proc Natl Acad Sci U S A. 2008;105:1632–7.
36. Roman AC, Gonzalez-Rico FJ, Molto E, Hernando H, Neto A, et al. Dioxin receptor and SLUG transcription factors regulate the insulator activity of B1 SINE retrotransposons via an RNA polymerase switch. Genome Res. 2011;21:422–32.
37. Lunyak VV, Prefontaine GG, Nunez E, Cramer T, Ju BG, et al. Developmentally regulated activation of a SINE B2 repeat as a domain boundary in organogenesis. Science. 2007;317:248–51.
38. Beck CR, Collier P, Macfarlane C, Malig M, Kidd JM, et al. LINE-1 retrotransposition activity in human genomes. Cell. 2010;141:1159–70.
39. Ewing AD, Kazazian HH Jr. High-throughput sequencing reveals extensive variation in human-specific L1 content in individual human genomes. Genome Res. 2010;20:1262–70.
40. Huang CR, Schneider AM, Lu Y, Niranjan T, Shen P, et al. Mobile interspersed repeats are major structural variants in the human genome. Cell. 2010;141:1171–82.
41. Iskow RC, McCabe MT, Mills RE, Torene S, Pittard WS, et al. Natural mutagenesis of human genomes by endogenous retrotransposons. Cell. 2010;141:1253–61.