This study discovered a strong association between the genomic location of HERV-H proviruses and H3K4me3-modified histones that is exclusive to human ES cells and some iPS cells. Consistent with H3K4me3 serving as a marker for active transcription, HERV-H expression was high in these pluripotent stem cells. Moreover, the pluripotency transcription factors NANOG, OCT4 and SOX2 bound to the LTRs of transcriptionally active HERV-H proviruses, or within 2 kb of them. NANOG, OCT4 and SOX2 frequently co-occupy the promoters of target genes, many of which encode transcription factors that regulate development, such as homeodomain proteins [41].
These observations strongly support the hypothesis that HERV-H transcripts play a role in human pluripotency and that this role is finely regulated by three of the most important transcription factors in ES cells. In addition to the binding of NANOG, OCT4 and SOX2 to the HERV-H promoter, HERV-H RNA decreased as ES cells differentiated, in a manner that was proportional to the expression of NANOG and OCT4. Conversely, HERV-H RNA was undetectable in primary fibroblasts but increased enormously after forced reprogramming to generate pluripotent stem cells (unpublished data provided by Audrey Letourneau and Stylianos Antonarakis). HERV-H, then, can be exploited as a reliable marker of ES cell pluripotency, as well as an indicator of the degree of “stemness” of iPS cells as they are generated from fibroblasts.
HERV-H transcripts are 5 to 6 kb in length and lack open reading frames. We can only speculate about the function of these lncRNAs. They might, for example, serve to soak up miRNAs that promote differentiation, as has been shown with linc-MD1 in muscle differentiation [42] or the PTENP1 pseudogene in the regulation of PTEN and growth suppression [43]. They might bind to chromatin and act as a scaffold for the local recruitment of pluripotency transcription factors, similar to other lncRNAs such as HOTAIR for histone modification complexes [44] and Xist in the context of X-chromosome inactivation [44]. Alternatively, HERV-H might counteract retrovirus spread by interfering with packaging of retroviral genomic RNA [47] or by soaking up miRNAs that are required for retrovirus transduction.
This study failed to identify chromatin markers that associate with endogenous retro-elements in mice. This was somewhat surprising given the many endogenous retro-elements in this species, including endogenous gamma-retroviruses, some of which are intact and functional [8]. It was also surprising because exogenous gamma-retroviruses have the same integration site preferences in mouse cells as they have in human cells [18]. MLV integration sites are associated with the H3K4me3 profile in mouse embryonic fibroblasts (F score
). Similar results with murine hematopoietic stem cells (F score of 0.81; p
) indicate that, as in human cells, the association strength is cell-type dependent.
One possible explanation for the failure to identify chromatin markers associated with endogenous murine retroviruses is species-specific differences in the recruitment of the transcriptional silencing machinery. In murine ES cells, for example, a sequence-specific DNA-binding protein, ZNF809 [49], recruits TRIM28 and other components of the cellular machinery that silence MLV [50]. ZNF809 has no orthologue in humans; perhaps ZNF809 arose as a result of selective pressure exerted by murine-specific gamma-retroviruses during evolution.
Previous work demonstrated that when exogenous retroviruses integrate, they home to sites of H3K4me3 [18]. Similarly, the association of the endogenous gamma-retrovirus HERV-H with H3K4me3 suggests that when human and simian germ cells were bombarded with the HERV-H ancestor 15 to 30 million years ago, these ancient retroviruses integrated in proximity to H3K4me3-marked chromatin. These proviruses might then have retained these cell-type-specific marks as they became fixed in the primate genome. Alternatively, unmethylated HERV-H LTRs might have recruited chromatin remodeling factors and induced H3K4me3 modification of the viral promoter after integration had occurred.
Analysis of the DNA surrounding HERV-H proviruses failed to clarify which of these two scenarios is more likely. Additionally, a search for epigenetic markers such as H3K4me3 in syntenic regions of the mouse genome was attempted to determine whether these chromatin marks are conserved across species and predate the entry of HERV-H into the primate genome. DNA surrounding HERV-H proviruses in the human genome was aligned to the mouse genome (using the tool LiftOver, http://genome.ucsc.edu/cgi-bin/hgLiftOver) after excision of all repetitive elements (performed with RepeatMasker, http://www.repeatmasker.org/). Nothing informative was found by measuring the association of mouse H3K4me3 with these syntenic regions and comparing these values with those obtained using control loci.
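A comparison of this kind, between syntenic regions and control loci, can be illustrated with a minimal sketch. It is not the procedure used in this study, which relied on F scores; it assumes instead that association is summarized as the fraction of regions overlapping at least one H3K4me3 peak, and that significance is estimated as a one-sided empirical p-value over control locus sets. The function names, the half-open `(chrom, start, end)` interval convention, and the toy coordinates are all hypothetical.

```python
# Illustrative sketch only: overlap fraction plus an empirical p-value.
# Intervals are (chrom, start, end) tuples, half-open, in one assembly.

def overlaps_any(region, peaks):
    """True if `region` overlaps at least one peak on the same chromosome."""
    chrom, start, end = region
    return any(c == chrom and s < end and start < e for c, s, e in peaks)

def overlap_fraction(regions, peaks):
    """Fraction of regions that overlap at least one peak."""
    if not regions:
        return 0.0
    return sum(overlaps_any(r, peaks) for r in regions) / len(regions)

def empirical_p(observed, control_fractions):
    """One-sided empirical p-value with the usual +1 correction:
    share of control sets whose fraction is >= the observed fraction."""
    at_least = sum(f >= observed for f in control_fractions)
    return (at_least + 1) / (len(control_fractions) + 1)

# Toy data (hypothetical coordinates, not taken from the study).
peaks = [("chr1", 100, 200), ("chr2", 500, 600)]
syntenic = [("chr1", 150, 250), ("chr1", 300, 400),
            ("chr2", 590, 700), ("chr3", 0, 1000)]
observed = overlap_fraction(syntenic, peaks)
p_value = empirical_p(observed, [0.25, 0.5, 0.75, 0.0])
```

With real data, each control fraction would come from a separately sampled set of control loci matched to the syntenic regions in number and length; an uninformative result corresponds to an observed fraction that sits well inside the control distribution.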
P300 and H3K27ac bind intra-species conserved regions and co-localize in embryonic-specific enhancers [34]. These markers are also associated with HERV-H in human ES cells, lying within 4 kb of 80% of HERV-H proviruses (F score 0.95; p
). It seems unlikely that an exogenous retrovirus would be capable of recruiting these factors by exploiting random conserved regions around its integration site. This suggests that a pre-existing layer of epigenetic markers favored integration of HERV-H into particular host loci and that these features are still preserved millions of years later.
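The proximity figure quoted above (a marker within 4 kb of a given share of proviruses) is straightforward to compute once provirus and marker positions are available. The sketch below is an illustration under stated assumptions, not the analysis code of this study: positions are reduced to single coordinates, marker positions are pre-sorted per chromosome, and the names `fraction_near_marker`, `proviruses` and `peaks` are hypothetical. It does not reproduce the F score.

```python
from bisect import bisect_left

def fraction_near_marker(proviruses, peaks, window=4000):
    """Fraction of provirus positions with a marker within `window` bp.

    proviruses: list of (chrom, position) tuples
    peaks: dict mapping chrom -> sorted list of marker positions
    """
    if not proviruses:
        return 0.0
    hits = 0
    for chrom, pos in proviruses:
        sorted_peaks = peaks.get(chrom, [])
        i = bisect_left(sorted_peaks, pos)
        # The nearest marker is either just left of the insertion
        # point or at/right of it, so only those two are checked.
        candidates = sorted_peaks[max(i - 1, 0):i + 1]
        if any(abs(p - pos) <= window for p in candidates):
            hits += 1
    return hits / len(proviruses)

# Toy data (hypothetical coordinates, not taken from the study).
proviruses = [("chr1", 10000), ("chr1", 50000), ("chr2", 7000),
              ("chr3", 100), ("chr1", 90000)]
peaks = {"chr1": [12000, 47000, 200000], "chr2": [1000]}
fraction = fraction_near_marker(proviruses, peaks)
```

Binary search over sorted positions keeps the lookup logarithmic per provirus, which matters when marker sets contain many thousands of peaks.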