Transposable elements (TEs) occupy almost half (46%) of the human genome, making the TE content of our genome one of the highest among mammals, second only to the opossum genome with a reported TE content of 52% [1]. The total representation of TE-related sequences in the human genome is probably even higher, as many of the sequences of the most ancient TEs have deteriorated beyond recognition [3]. The human genome contains two major classes of TEs, DNA and RNA transposons, defined by the type of molecule used as an intermediate in their mobilization.
DNA TEs encode a transposase that re-enters the nucleus to specifically recognize transposon sequences in chromosomal DNA. The transposase excises these sequences from their genomic location and inserts them into a new genomic site (reviewed in [4]); this is also referred to as 'cut and paste' transposition. Human DNA TE activity subsided over 37 million years ago [5]; as a result, DNA TEs no longer contribute significantly to the ongoing mutagenesis in humans.
Retrotransposons, or retroelements, make use of an RNA-mediated transposition process. Retroelements are subdivided into two major groups: those containing long terminal repeats (LTR retroelements) and all others, which are grouped together as non-LTR retroelements. Although inactive in humans for millions of years, the best-known LTR retrotransposons, the endogenous retroviruses, make up approximately 8% of the human genome [1]. This contrasts with rodent genomes, in which LTR elements continue to contribute a high proportion of the germline TE-associated mutations (reviewed in [6]).
Non-LTR retrotransposons include autonomous and non-autonomous members. The autonomous long interspersed element-1 (LINE-1 or L1), and its non-autonomous partners, such as 'SINE-R, VNTR, and Alu' (SVA) and the short interspersed element (SINE) Alu, are the only mobile elements with clear evidence of current retrotranspositional activity in the human genome [7] and will therefore be the primary focus of this article.
The human L1 is about 6 kb long and encodes two open reading frames, ORF1 and ORF2, which are both required for L1 retrotransposition (Figure 1) [8]. ORF2 encodes endonuclease and reverse transcriptase activities that are crucial for the insertion mechanism [8]. SINEs and SVA elements do not encode any proteins [10]; instead, they depend on the presence of functional L1s and are therefore often referred to as L1 parasites [11]. In contrast to L1, Alu elements require only ORF2 of L1 for their mobilization [11]. Alu elements are transcribed by RNA polymerase III and carry a variable-length adenosine-rich region at their 3' end, a critical feature for retrotransposition [10]. SVA is a composite element containing a (CCCTCT)n hexamer repeat region, an Alu-derived region, a variable number tandem repeat (VNTR) region and a retroviral-derived sequence (Figure 1) [13]. The requirements for SVA mobilization are still poorly understood [13].
Figure 1. L1 expression leads to different types of DNA damage. Schematic structures of an SVA element (labeled SVA), showing the CCCTCT repeat, the Alu-derived (A-like) region, the variable number tandem repeat (VNTR) region, and the long terminal repeat (LTR)-derived ...
TE activity has often been assumed to be confined to the germline, early embryogenesis, and potentially cancer cells [15]. The most recent reports indicate that expression of L1 RNA (VP Belancio, A Roy-Engel, R Pochampally and P Deininger, personal communication) and L1 protein [16] occurs in human somatic tissues and that somatic L1 retrotransposition takes place in transgenic mouse models [19]. Interestingly, L1 transgenic mice show higher L1 mobilization in somatic tissues than in the germline [19]. Other evidence of somatic L1 mobilization comes from a somatic L1 insertion that inactivates the adenomatous polyposis coli (APC) gene, leading to colon cancer [22]. There are currently very limited data on the somatic expression of Alu and SVA elements, and we do not have a true appreciation of the level of somatic insertion that is occurring from endogenous elements.