In summary, we found that TEs have contributed nearly half of the open chromatin regions of the human genome and the majority of primate-specific elements. This estimate is a lower bound that is likely to grow, because better strategies using longer and paired-end reads will be needed to measure the contribution of young repeat subfamilies and polymorphic sites (Figure S3). An example is the L1PA2 repeat subfamily where, despite a mappability ratio of only 0.08, 117 and 257 of the 4904 L1PA2 instances contributed to the H1 and H7 DHSs, respectively. This finding is consistent with previous observations but greatly expands our understanding of the repeat families contributing to open chromatin in the human genome.
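To put the L1PA2 counts in proportion, the short sketch below converts them into the fraction of annotated instances that contributed to DHSs in each cell line. The function name `fraction_contributing` is a hypothetical helper introduced here for illustration; only the counts (117, 257 and 4904) come from the text.

```python
def fraction_contributing(n_contributing: int, n_total: int) -> float:
    """Return the fraction of repeat instances that overlap a DHS."""
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    return n_contributing / n_total


# Counts quoted in the text for the L1PA2 subfamily.
L1PA2_TOTAL = 4904
L1PA2_IN_DHS = {"H1": 117, "H7": 257}

for cell_line, count in L1PA2_IN_DHS.items():
    fraction = fraction_contributing(count, L1PA2_TOTAL)
    print(f"{cell_line}: {fraction:.1%} of L1PA2 instances")
```

Because only a small share of L1PA2 sequence is uniquely mappable, these fractions are best read as underestimates of the true contribution.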
To better understand the regulatory functions that could have been retained in exapted TEs beyond the ones that have already been studied (e.g. ), we predicted a total of 2150 TF-repeat subfamily associations and confirmed that a broad range of functional proteins target these regions ( and Tables S3). This resource should provide insights into the regulation of some of the TE-derived loci that have already been implicated in disease. There is an important distinction between biochemical activity and functional relevance to the host. To help confirm the importance of these regions, we also showed that repeat instances contributing to open chromatin were more conserved than expected by chance ().
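One generic way to ask whether a set of elements is "more conserved than expected by chance" is a resampling test. The sketch below is only an illustration under stated assumptions, not the procedure used in this study: it assumes each repeat instance has a single numeric conservation score, and it compares the mean score of the open-chromatin instances with the means of random draws of the same size from a background pool. The function name `empirical_p_value` and its parameters are hypothetical.

```python
import random


def empirical_p_value(observed_scores, background_scores, n_draws=1000, seed=0):
    """One-sided empirical p-value for the observed mean conservation score.

    Draws n_draws random subsets (without replacement) of the background,
    each the same size as the observed set, and counts how often the
    random mean is at least as large as the observed mean.
    """
    k = len(observed_scores)
    if k == 0 or k > len(background_scores):
        raise ValueError("observed set must be non-empty and no larger than background")
    rng = random.Random(seed)
    observed_mean = sum(observed_scores) / k
    hits = 0
    for _ in range(n_draws):
        sample = rng.sample(background_scores, k)
        if sum(sample) / k >= observed_mean:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_draws + 1)
```

In practice the background would need to be matched to the observed set (for example by subfamily and element length), which this sketch does not attempt.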
Next, we demonstrated that LTR/ERV repeats have contributed a disproportionate fraction of cell type-specific accessible chromatin regions, especially in embryonic and cancer cell lines (). This is interesting given that network rewiring using ERV elements has already been described in ESCs and that stem cell potency has been shown to fluctuate with endogenous retrovirus activity in mouse. The level of activity observed in ERV sequences is likely a consequence of the permissive chromatin state found in ESCs that is sometimes reinstated in cancer. There is a fine balance between the successful replication of endogenous retroviruses, from which these repeats are derived, and retrotransposition control in the host. One intriguing possibility is that the manipulations that were initially exerted by the ancestral viruses on their host to bypass these control mechanisms have also facilitated co-option.
Finally, we reported that repeat subfamilies activated in a cell type-specific manner were frequently associated with higher expression of neighboring genes. This result corroborates the observation that, at the level of expression, TE-derived transcripts, including lincRNAs, are also usually tissue-specific. Interestingly, this pattern was observed not only in ESCs but also in differentiated and cancer cells ( and Table S6).
Taken together, these results demonstrate that TEs, and in particular endogenous retroviruses, have considerably transformed the transcriptional landscape during primate evolution.