Mobile elements rely on cellular processes to replicate, and therefore, mobile element proteins frequently interact with a variety of cellular factors. The integrase (IN) encoded by the retrotransposon Ty5 interacts with the heterochromatin protein Sir4, and this interaction determines Ty5's preference to integrate into heterochromatin. We explored the hypothesis that Ty5's targeting mechanism arose by mimicking an interaction between Sir4 and another cellular protein(s). Mutational analyses defined the requirements for the IN-Sir4 interaction, providing criteria to screen for cellular analogues. Esc1, a protein associated with the inner nuclear membrane, interacted with the same domain of Sir4 as IN, and 75% of mutations that disrupted IN-Sir4 interactions also abrogated Esc1-Sir4 interactions. A small motif critical for recognizing Sir4 was identified in Esc1. The functional equivalency of this motif and the Sir4-interacting domain of IN was demonstrated by swapping these motifs and showing that the chimeric IN and Esc1 proteins effectively target integration and partition DNA, respectively. We conclude that Ty5 targets integration by imitating the Esc1-Sir4 interaction and suggest molecular mimicry as a general mechanism that enables mobile elements to interface with cellular processes.
The impact that transposable elements have on genome organization and function is determined by the site of integration. For a number of transposable elements, target site choice is nonrandom (8, 10, 16, 43). Whereas some transposable elements recognize specific DNA sequences, integration sites of others are influenced by the epigenetic state of the target, including its transcriptional status. Chromatin effects on target site choice are particularly evident for the long terminal repeat retrotransposons and retroviruses (collectively referred to as retroelements), which copy a retroelement mRNA into cDNA by reverse transcription. The element-encoded integrase (IN) inserts this cDNA into the host genome, the site of which is often dictated by DNA-bound protein complexes.
The influence of chromatin and transcription complexes on target site choice has been most extensively studied for the retrotransposons of yeast. The Tf1 element of Schizosaccharomyces pombe, for example, integrates within a narrow window upstream of transcription start sites, consistent with a role for RNA polymerase II transcription complexes in target site choice (7, 48). In Saccharomyces cerevisiae, the Ty1 and Ty3 retrotransposons integrate preferentially upstream of genes transcribed by RNA polymerase III (Pol III), such as tRNA and 5S ribosomal genes. Ty1 requires Pol III transcription to integrate within its preferred target, a 700-bp window upstream of Pol III-transcribed genes. Both the TFIIIB subunit Bdp1 and the chromatin remodeler Isw2 influence the pattern of Ty1 insertions upstream of target genes (6, 18). Ty3 integration site choice, on the other hand, is more precise, occurring within 1 to 3 bp of Pol III transcription start sites. This suggests that Ty3 recognizes a factor closely tied to Pol III transcription (12) and is consistent with the observation that the TFIIIB subunits Brf and TATA-binding protein are required for targeted integration (59).
In contrast to Ty1 and Ty3, the S. cerevisiae Ty5 retrotransposon integrates into regions of heterochromatin at the telomeres and silent mating-type loci. Ty5 selects integration sites through a direct interaction between Ty5 IN and the heterochromatin protein Sir4. Essential for Sir4 recognition is a 6-amino-acid motif of Ty5 IN, termed the targeting domain (TD) (58). The TD is phosphorylated, and substitutions in TD that prevent phosphorylation abrogate IN-Sir4 interactions and result in the random integration of Ty5 throughout the genome (17). Sir4 serves as a molecular scaffold at the nuclear periphery and interacts with many proteins, including the hypoacetylated tails of histone H3 (22). The TD recognizes the Sir4 C terminus (amino acids [aa] 951 to 1358), a region of the protein that encodes a coiled-coil domain with lamin-like heptad repeats (19). The recognition of Sir4 by IN is sufficient to mediate targeted integration: Ty5 integration occurs with high efficiency at DNA sites to which Sir4 is ectopically tethered (61). Furthermore, Ty5 target specificity can be changed by replacing TD with peptide motifs that recognize other DNA-bound protein partners (61).
Recognition of chromatin by integration complexes also appears to underlie retroviral target site choice. Both human immunodeficiency virus (HIV) and murine leukemia virus integrate preferentially at sites of RNA Pol II transcription (39, 44, 56), and HIV integration favors sites with transcription-associated chromatin modifications (55). For HIV, IN is the primary virally encoded determinant of integration site choice (32), and lens epithelium-derived growth factor (LEDGF)/p75, a chromatin-bound cotranscription factor, interacts with HIV IN and impacts target site preference. The absence or reduction of LEDGF levels in vivo significantly decreases HIV integration frequencies and target site bias (14, 35, 47, 51). Furthermore, HIV integration can be directed in vitro to sites of ectopically tethered LEDGF (15). Other chromatin-associated proteins also influence HIV integration. Barrier-to-autointegration factor (BAF) is an essential protein associated with chromatin structure and nuclear assembly (30, 45, 60) and prevents the autointegration of retroviral cDNA (31). Therefore, chromatin is emerging as a critical player in retroelement integration and target site choice.
In all cases examined to date, retrotransposons and retroviruses recognize aspects of chromatin or components of transcription complexes that serve critical cellular functions. This suggested to us that retroelements may mimic cellular factors that normally recognize DNA-bound protein complexes. We explored this hypothesis using Ty5 and specifically sought a cellular protein that interacts with Sir4 in a manner similar to that of Ty5 IN. Here, we demonstrate that Ty5 IN and Esc1, a protein associated with the nuclear periphery, are virtually equivalent in terms of their interactions with Sir4. We conclude that Ty5 targets integration to heterochromatin by mimicking the Esc1-Sir4 interaction.
YPH499 or isogenic strains with esc1, ku70, or sir4 deletions were used in the tethered targeting and mitotic stability assays. Deletions were constructed by the one-step gene knockout method (5) using hygromycin B or G418 cassettes from pFA6a-hphNT1 and pFA6a-KANMX, respectively (28, 54). The yeast two-hybrid reporter strains L40 and PJ69 were used for measuring protein-protein interactions with LexA-Sir4 and GBD-Esc1 fusion proteins, respectively (24, 27). Strains YSB1, YSB2, YSB35, and YSB41, used for monitoring the nucleation of heterochromatin, were described elsewhere previously (13). All cultures were grown at 30°C unless noted otherwise.
All LexA-Sir4 constructs were made by PCR amplification of the relevant coding sequences (aa 950 to 1358) from a previously described LexA-Sir4 plasmid (3). PCR products were then digested with EcoRI/BglII and ligated into the EcoRI/BamHI sites of the LexA expression construct pBTM116 (40). A fusion between the Gal4 activating domain (GAD) and Sir4 (aa 950 to 1358) was made by moving the Sir4-containing EcoRI/PstI fragment from pYZ127 into the corresponding sites of pYZ277, a TRP1-marked version of pGAD-C1 (27). pPF204 (GAD-Esc1 aa 1361 to 1658) was isolated from the yeast two-hybrid screen. GAD-Esc1 (aa 1440 to 1473) was made by PCR amplification of pPF204 (GAD-Esc1 aa 1361 to 1658) and ligation of the PCR product into the EcoRI/BamHI sites of pGAD-C1. Mutant versions of the construct at aa 1440 to 1473 were made by primer extension of partially overlapping primers, followed by digestion and ligation into pGAD-C1. Other GAD-Esc1 constructs (aa 1443 to 1455 and aa 1448 to 1456) were made by annealing complementary primers and ligating them into the EcoRI/BamHI sites of pGAD-C1. GBD-Esc1 plasmids were generated by digesting GAD-Esc1 plasmids with EcoRI/EcoRV and ligating the resulting fragment into the EcoRI/MscI sites of pGBD-U (27). Chimeric Ty5 and Esc1 sequences were made using four-primer PCR to generate the desired coding sequence. For Ty5 constructs, PCR products were introduced into the BspEI/PflMI sites of Ty5 on pTB60, a pDR14 derivative with an Arg-Gly-Ser-His6 epitope tag within IN (21, 25). For chimeric Esc1 constructs, PCR products were ligated into the ApaI sites of pDZ45, a centromeric plasmid containing the entire Esc1 reading frame that complements esc1Δ cells (2). For the tethered targeting assay, target plasmids with either zero or four LexA operators were created by swapping the TRP1 marker from pYZ316 and pYZ317 with LEU2 (61). 
For the reverse two-hybrid screens, a plasmid suitable for generating the LexA-Sir4 mutant library by recombination was made by substituting silent mutations into LexA-Sir4 aa 950 to 1358 at residues K971 and R1331 to introduce SacI and PpuMI sites, respectively. All plasmids constructed using PCR were verified by sequencing. Primer sequences are available upon request.
Mutagenic PCR was used to create the mutant LexA-Sir4 library used for the reverse two-hybrid screens (11, 53). Briefly, Sir4-His6 was PCR amplified in a 100-μl reaction mixture containing final concentrations of 7 mM MgCl2, 0.5 mM MnCl2, 0.2 mM dATP and dGTP, and 1 mM dCTP and dTTP. Mutagenized Sir4 fragments were cotransformed with a gapped pCS439 plasmid (SacI/PpuMI) using a yeast strain expressing GAD-IN (58). Transformants were stamped after 2 days of growth on selective medium lacking histidine to identify colonies in which Sir4 and IN no longer interacted. Candidates were retested, and those colonies in which Sir4 and IN were still unable to interact were subjected to colony PCR to determine whether they contained a Sir4 insert. The remaining pool of candidates was then screened by immunoblot analysis to detect Sir4 proteins that were unstable or truncated due to nonsense mutations. Plasmids were rescued from yeast by the glass bead method (5) and sequenced.
A previously described assay was used to measure the mitotic stability of unstable plasmids to which various LexA-Sir4 proteins were tethered (3). Cells were grown in 96-well plates until cultures reached an optical density at 600 nm of 3.0 to 4.0 (36 to 48 h). The effect of various chimeric Esc1 proteins on mitotic stability was tested in a similar manner in an esc1Δ strain. For these experiments, wild-type Esc1 or the Esc1 chimeras were encoded on a LEU2 plasmid. The targeted integration of Ty5 to sites of tethered Sir4 was measured as described previously (9).
We previously demonstrated that Ty5 IN interacts with the C terminus of Sir4 (residues 951 to 1358) and that N-terminal truncations beyond residue 971 impair interactions with IN (Fig. 1A) (61). To define the C-terminal boundary of the IN-interacting domain, yeast two-hybrid assays were performed with GAD-IN and various Sir4 C-terminal truncations fused to LexA. The coiled-coil domain of Sir4 was not required for the interaction, because the LexA fusion at aa 951 to 1251 still interacted with IN (Fig. 1A) even though it lacks the coiled-coil domain (residues 1271 to 1346) (41). Truncations past aa 1082 resulted in a low level of expression of the Sir4 fusion proteins (see Fig. S1A in the supplemental material), leading us to designate aa 971 to 1082 as the minimal IN-interacting domain of Sir4.
Mutations in two Sir4 residues in the IN-interacting domain (W974 and R975) were previously shown to abrogate Sir4-IN interactions (61) (Fig. 1B). Four additional residues were identified by alanine-scanning mutagenesis of the region spanning aa 976 to 990 (Fig. 1B; see Fig. S1B in the supplemental material). The relatively large number of critical residues identified (6 out of 19 tested) suggested that the IN-interacting domain may be more extensive, and so a reverse two-hybrid screen was performed by randomly mutagenizing a DNA fragment encoding the Sir4 C terminus (aa 951 to 1358) and screening for variants that failed to interact with IN. Forty-one mutants with single or multiple amino acid substitutions were recovered, 90% of which contained at least one missense mutation in the minimal IN-interacting domain (see Table S1 and Fig. S1C in the supplemental material).
Six single point mutants and three double mutants were selected for further analyses. The protein expression levels of these mutants were comparable to levels of the wild-type protein (see Fig. S1D in the supplemental material). In Fig. 1B, the mutations in the six single and three double mutants are mapped onto an amino acid sequence alignment of Sir4 homologues from different yeast species. All of the single mutations reside within the minimal IN-interacting region. T957L and K1123R both lie outside this domain; however, each one was recovered in a double mutant, and we did not determine if the amino acid change responsible for the phenotype was within the IN interaction domain. All mutants were tested for their abilities to interact with Sir4 in sir2Δ, sir3Δ, and sir4Δ strains with little or no observable effect, suggesting that the interaction is direct. Interestingly, all of the mutations are in residues that are conserved in at least three of the four Sir4 homologues, and 7 of the 11 residues in the single mutants are highly hydrophobic, implying that the IN interaction is mediated by a hydrophobic interface or that these residues are important for the proper folding of the protein. Based on the data from both mutagenesis approaches, we conclude that IN interacts with a discrete domain of Sir4. Furthermore, the amino acid sequence conservation suggests that this domain is likely to be important for Sir4 function.
Because targeting occurs as a consequence of the IN-Sir4 interaction, mutations in Sir4 that abrogate interactions with IN might affect Ty5 target specificity. To test this, we measured Ty5 integration specificity using a previously established targeting assay that measures integration in the vicinity of LexA operators to which the various mutant Sir4 proteins are tethered (61) (Fig. 2A). In this assay, a subset of Ty5 insertions occur near LexA operators on a target plasmid, and the ratio of target plasmids with Ty5 to those without Ty5 provides a measure of targeting efficiency. Of the nine mutants tested in this assay, the majority showed a decrease in targeting efficiency of greater than threefold (Fig. 2B). Several Sir4 alleles still supported targeting, however, suggesting that endogenous Sir4 may affect targeting through the formation of complexes with the LexA-Sir4 fusions (see below) or that residual IN-Sir4 interactions that were undetectable by the two-hybrid assay were sufficient to target Ty5.
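The targeting-efficiency measure used in this assay reduces to a simple ratio. A minimal sketch in Python, assuming hypothetical plasmid counts (the function names and numbers are illustrative and are not data from this study), shows how the ratio and the fold decrease relative to wild-type Sir4 could be computed:

```python
def targeting_efficiency(plasmids_with_ty5: int, plasmids_without_ty5: int) -> float:
    """Ratio of target plasmids carrying Ty5 to target plasmids without Ty5."""
    if plasmids_without_ty5 <= 0:
        raise ValueError("need at least one target plasmid without Ty5")
    return plasmids_with_ty5 / plasmids_without_ty5


def fold_decrease(wild_type: float, mutant: float) -> float:
    """How many times lower the mutant efficiency is than the wild-type efficiency."""
    if mutant <= 0:
        raise ValueError("mutant efficiency must be positive")
    return wild_type / mutant


# Illustrative counts only (hypothetical, not measured values).
wt = targeting_efficiency(plasmids_with_ty5=80, plasmids_without_ty5=1000)
mut = targeting_efficiency(plasmids_with_ty5=20, plasmids_without_ty5=1000)
print(f"wild type: {wt:.3f}, mutant: {mut:.3f}, fold decrease: {fold_decrease(wt, mut):.1f}")
```

Under this definition, a mutant would count as showing a greater-than-threefold decrease when `fold_decrease` exceeds 3.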
Ty5 insertions mediated by interactions between Sir4 and IN would be expected to cluster around the LexA operator, and so sites of Ty5 integration in the Sir4 mutants were determined by DNA sequencing and compared to the integration pattern observed with wild-type Sir4. Between 7 and 11 target plasmids with Ty5 insertions were analyzed for each of the nine Sir4 mutants (Fig. 2C). The majority of insertions mirrored the wild-type pattern, occurring within a window ±100 bp of the LexA operators. The S1047P allele, which supported targeting at background levels, displayed the only altered integration pattern, with 4 of 11 insertions occurring near the origin of replication on the target plasmid (data not shown). In a sir4Δ strain, this altered integration pattern was no longer observed (zero of seven insertions characterized) (data not shown). No significant changes in targeting efficiencies were observed for the other mutants in a sir4Δ strain, except for the F1076L mutant, which showed a sixfold decrease in targeting (data not shown). These data indicate that endogenous Sir4 can contribute to both target specificity and integration efficiency in some instances where interactions between IN and the LexA-Sir4 fusion are attenuated.
If the conserved, IN-interacting domain of Sir4 is critical for its function, then this domain may interact with other proteins to carry out its biological activity. To identify such proteins, a yeast two-hybrid screen was carried out using LexA-Sir4C (aa 951 to 1358) as bait. The screen identified six proteins: Sum1, Chd1, Nma2, and Nup157 (none of which were previously shown to bind Sir4), Esc1 (a known Sir4-interacting protein) (2), and Sir4 itself. Sum1 is a transcriptional repressor of middle-sporulation-specific genes and is involved in telomere maintenance (4, 57). Chd1 is a chromatin remodeler (50), and Nma2 (nicotinic acid mononucleotide adenylyltransferase) is involved in the NAD+ salvage pathway (20). Nup157 is a member of the nucleoporin protein family (1), and Esc1 is important for the assembly of the nuclear pore complex (33) and tethers telomeres to the nuclear periphery (49). All of the proteins interacted with LexA-Sir4 in sir2Δ, sir3Δ, and sir4Δ strains (data not shown), suggesting that the interaction between these proteins and Sir4 is direct.
To determine whether IN mimics one of the identified cellular proteins in its interaction with Sir4, we tested these proteins for interactions with selected Sir4 deletion constructs and the W974A mutant to assess whether their interactions resembled those of GAD-IN (Fig. 3A). In contrast to IN, Sum1 required the entire Sir4 C terminus for the interaction, and Nma2, Chd1, Nup157, and Sir4 required only the coiled-coil domain. As reported previously, Esc1 interacted with the Sir4 C terminus in the absence of the coiled-coil domain but, like IN, required the upstream N-terminal region (2). The W974A mutation abrogated interactions with Sum1, Chd1, and Esc1 but not interactions with Nma2, Nup157, or Sir4 itself (Fig. 3B). Based on these results, we concluded that Esc1 shares the most similarity with IN in terms of its Sir4 interactions.
To further characterize the similarity between the IN-Sir4 and Esc1-Sir4 interactions, Esc1-Sir4 two-hybrid assays were conducted with all of the Sir4 truncations and point mutants tested with IN. Two fragments of Esc1 were used: a GAD-Esc1 C-terminal fragment (aa 1361 to 1658) isolated in our two-hybrid screen and a 34-amino-acid region of Esc1 (aa 1440 to 1473) reported previously to interact with Sir4 (2). Esc1 from aa 1361 to 1658 failed to interact with N-terminal truncations past aa 971 and C-terminal truncations preceding aa 1080. Esc1 from aa 1440 to 1473 had similar requirements, with N- and C-terminal boundaries of aa 961 and 1082, respectively (Fig. 4A).
Among the LexA-Sir4 mutants tested, 10 of 15 mutants failed to interact with the full-length Esc1 C terminus (Fig. 4B and data not shown). Of the six mutants that interacted, three (R975G, T957L/K1037E, and V987A) exhibited slow growth, indicating a weakened interaction. This effect was more evident on media containing higher concentrations of 3AT, an inhibitor of the HIS3 reporter gene (data not shown). The smaller Esc1 fragment interacted only with K1064N and K1050M/R1075G mutants, both of which showed strong interactions with the larger Esc1 C terminus. These residues are therefore important for interactions with IN and are not critical for interactions with Esc1. Collectively, the data show that Esc1 interacts with Sir4 in a fashion similar to that of IN and that the 34-amino-acid Esc1 fragment (aa 1440 to 1473) encodes a strong Sir4-interacting region that behaves like the larger Esc1 C terminus.
The region of Sir4 that interacts with IN and Esc1 is involved in partitioning DNA to daughter cells during mitosis (3). This domain, referred to as the partitioning and anchoring domain (PAD), spans residues 950 to 1262. Partitioning of DNA by the PAD was demonstrated in assays in which a LexA-Sir4 PAD fusion was tethered to an otherwise unstable plasmid. The interaction of the Sir4 PAD with Esc1 is necessary for the plasmid to be partitioned to daughter cells, and partitioning is measured by the retention of a plasmid marker after cell division. While the full PAD is required for optimal partitioning, a smaller region (aa 950 to 1150) retains partial activity even in the absence of endogenous Sir4 (3). The minimal PAD roughly corresponds to the IN-interacting domain of Sir4.
To assess whether mutations that abrogate IN-Sir4 interactions affect Esc1 function in vivo, nine Sir4 mutants were tested for their ability to support DNA partitioning. The two mutants that interacted strongly with both Esc1 from aa 1361 to 1658 and aa 1440 to 1473 (K1064N and K1050M/R1075G) displayed the highest DNA-partitioning activity (compare Fig. 4B and C). The remaining seven mutants that showed little or no interaction with Esc1 from aa 1440 to 1473 had reduced levels of partitioning, which were significantly different from that of the wild type (Fig. 4C). The ability of these Sir4 mutants to engage in yeast two-hybrid interactions correlates well with their ability to efficiently partition plasmids. The data further suggest a role for aa 1440 to 1473 of Esc1 in DNA partitioning and support the hypothesis that IN imitates Esc1 in its interaction with Sir4.
The ability of the Sir4 mutants to interact with Esc1 in two-hybrid assays correlated well with their ability to target Ty5 integration (compare Fig. 2B and 4B). To test whether Esc1 or other associated proteins in the nuclear periphery affect Ty5 target specificity, targeted transposition assays were carried out in single- and double-deletion strains lacking Esc1, Sir4, and Ku70. We did not detect any effect on targeted integration in any of these deletion strains, indicating that these proteins either alone or in combination are not required for integration in the tethered targeting assay (see Table S2 in the supplemental material).
In addition to having comparable Sir4 interaction profiles, the C termini of IN and Esc1 share amino acid sequence similarities. First, the Sir4-interacting domain of Esc1 (aa 1395 to 1551) is serine and proline rich (11.9% serine and 9.3% proline), a feature shared with the Ty5 IN C terminus (aa 934 to 1131, with 11.6% serine and 7.1% proline). This amino acid sequence composition is well above the average for proteins in the UniProtKB database (254,609 sequence entries), which average 6.8% serine and 4.8% proline. Second, an alignment of Esc1 homologues from five different yeast species revealed a stretch of 13 highly conserved residues within the region spanning aa 1440 to 1473, suggesting that this Sir4 interaction domain plays an important role in Esc1 function (Fig. 5A). Although there is little significant similarity in terms of amino acid sequence signatures between the Esc1 and Ty5 IN domains, we did notice that Esc1 encodes a sequence motif within the conserved block (aa 1448 to 1453, LPSDPP) that has three of the four residues in the Ty5 TD that are required for targeted integration (Fig. 5B).
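The serine and proline percentages quoted above are simple residue frequencies. As a rough illustration, the following Python sketch computes such percentages for a one-letter protein sequence and compares them with the UniProtKB averages given in the text. The function name and the toy sequence are hypothetical; the toy sequence is not the Esc1 or Ty5 IN sequence.

```python
def residue_percentages(sequence: str, residues: str = "SP") -> dict:
    """Percentage of each requested residue in a one-letter protein sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    return {aa: 100.0 * seq.count(aa) / len(seq) for aa in residues}


# Database-wide averages quoted in the text (UniProtKB): 6.8% Ser, 4.8% Pro.
BACKGROUND = {"S": 6.8, "P": 4.8}

# Hypothetical toy sequence for illustration only.
toy = "MSSPLPSDPPKSTSPAEL"

for aa, pct in residue_percentages(toy).items():
    enrichment = pct / BACKGROUND[aa]
    print(f"{aa}: {pct:.1f}% ({enrichment:.1f}x the database average)")
```

Applied to the actual Esc1 (aa 1395 to 1551) and Ty5 IN (aa 934 to 1131) sequences, the same calculation would be expected to reproduce the percentages reported above.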
To assess the functional relevance of the TD-like motif of Esc1, we performed two-hybrid assays between GAD-Sir4 and a Gal4 DNA binding domain (GBD) fusion with the Esc1 fragment at aa 1440 to 1473 or two derivatives with alanine point mutations (S1450A and D1451A) (Fig. 5C). Of the proteins tested, only GBD-Esc1 (S1450A) failed to interact with Sir4. All proteins were expressed well, including the S1450A mutant (see Fig. S2A in the supplemental material). Comparable results were produced in a heterochromatin nucleation assay in which the GBD-Esc1 fusion proteins were assayed for their abilities to nucleate heterochromatin at a mutant HMR silencer (see Fig. S2B in the supplemental material). The Esc1 fragments, with the exception of the S1450A mutant, nucleated heterochromatin in a fashion similar to that of a GBD-TD fusion protein (see Fig. S2C in the supplemental material). In summary, the data indicate that a TD-like motif within the conserved Esc1 domain is capable of interacting with Sir4 and that a serine within the motif at position 1450 is critical for the interaction.
Given the similar requirements for interactions of IN and Esc1 with Sir4 and the existence of small Sir4-interacting motifs in both proteins, we hypothesized that these domains are functionally equivalent and therefore may functionally substitute for one another. To test this, Ty5's TD and adjacent amino acids were swapped with the conserved region of Esc1 (aa 1443 to 1455). An S1450A mutation was introduced into the Ty5-Esc1 chimera and tested along with the wild type for targeted integration using our tethered targeting assay. The chimera targeted as well as did wild-type Ty5, showing that the Esc1 motif can functionally substitute for TD (Fig. 6A). The S1450A mutation, which disrupted interactions with Sir4 in yeast two-hybrid assays and failed to nucleate heterochromatin, significantly reduced targeting when placed in the context of the Ty5 IN.
A reciprocal swap in which Esc1 at aa 1443 to 1455 was replaced with a TD-encoding sequence from Ty5 or an equivalently sized region of glutathione S-transferase (GST) predicted to have a structure similar to that of the Esc1 motif was also performed (data not shown). The Esc1 chimeras were introduced into an esc1Δ strain on single-copy plasmids and tested for their abilities to partition DNA as described above. The replacement of the conserved region of Esc1 with GST abolished all partitioning activity (Fig. 6B). The TD-containing Esc1 chimera showed an intermediate level of DNA partitioning that was well above levels for the negative control. The ability of these domains to replace one another demonstrates their functional equivalency and supports the hypothesis that Ty5 targets integration by mimicking the interaction between Esc1 and Sir4.
Different retroelements use different strategies to populate their host genomes. Retrotransposons often integrate into “safe havens,” benign regions of the genome that can withstand insertions without deleterious consequences to host fitness, thereby ensuring that both the retrotransposon and host persist (8, 10, 16, 43). The retroviruses, on the other hand, often integrate preferentially into regions of the genome conducive to transcription, presumably so that progeny viruses can be produced to amplify the infection. How do retroelements evolve their targeting strategies? Here, we present evidence that one mechanism is for IN to mimic a chromatin factor already associated with the desired target sites. For Ty5, targeting integration to heterochromatin is accomplished by mimicking the interaction between Esc1 and the heterochromatin protein Sir4.
Although Esc1 and IN act in very different biological processes, both proteins have short motifs in their C termini that mediate the interaction with Sir4. Importantly, the motifs are functionally equivalent: Esc1 chimeras with the Ty5 TD partition DNA during mitosis, and Ty5 chimeras with the Esc1 motif target integration to sites of tethered Sir4. The amino acid sequence signature of critical residues in the Ty5 targeting domain is strikingly similar to the core of the Esc1 motif (LDSSPP for Ty5 [underlined residues are important for function] and LPSDPP for Esc1 [underlined residues are shared with the TD]). The only difference is the second serine in the TD (S1095), which is an aspartate in Esc1 (D1451). We have recently shown that S1095 of the TD is phosphorylated and that phosphorylation is required for both interactions with Sir4 and targeted integration (17). Furthermore, negatively charged amino acids can functionally replace S1095. This likely explains the ability of the Esc1 motif to serve as a targeting determinant when it is substituted for the TD. The two motifs, however, are not completely equivalent: a D1451A mutation in Esc1, for example, does not affect interactions with Sir4, whereas mutations at the analogous position of TD (S1095) do (17). This difference may reflect the adaptation of this Sir4 binding motif for optimal function in distinct biological processes. Interestingly, Esc1 has been reported to be a phosphoprotein (23), and we found by mass spectrometry that the 34-aa fragment of Esc1 that interacts with Sir4 is multiply phosphorylated (data not shown). The phosphorylation status of Esc1 underscores the similarities between the small Sir4-interacting regions of these two proteins. Whether Esc1 phosphorylation plays a role in its affinity for Sir4 is currently under investigation. It will also be interesting to know if this Esc1 domain has any role in telomere localization and/or nuclear pore function (33, 49).
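The residue-by-residue comparison of the two motifs can be sketched in Python. The motif strings and the Esc1 coordinates (aa 1448 to 1453) come from the text; the TD start coordinate of 1092 is an assumption inferred from S1095 being the fourth residue of LDSSPP, and the function name is hypothetical. A raw comparison reports two mismatched positions; of these, only the second serine of the TD (S1095, aligned with D1451) is among the residues described above as required for targeting.

```python
TD_MOTIF = "LDSSPP"      # Ty5 targeting domain, as quoted in the text
ESC1_MOTIF = "LPSDPP"    # Esc1 motif, aa 1448 to 1453
TD_START = 1092          # assumption: inferred from S1095 being residue 4 of the TD
ESC1_START = 1448


def motif_differences(td, esc1, td_start, esc1_start):
    """List aligned positions where two equal-length motifs differ."""
    if len(td) != len(esc1):
        raise ValueError("motifs must have the same length")
    return [
        (f"{a}{td_start + i}", f"{b}{esc1_start + i}")
        for i, (a, b) in enumerate(zip(td, esc1))
        if a != b
    ]


for td_res, esc1_res in motif_differences(TD_MOTIF, ESC1_MOTIF, TD_START, ESC1_START):
    print(f"TD {td_res} aligns with Esc1 {esc1_res}")
```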
Although the Ty5 TD and the Esc1 motif are essential for interactions with Sir4 in vivo, they are not the only Sir4-interacting regions of these proteins. Both motifs reside in extended serine/proline-rich domains, a feature shared by some protein interaction domains regulated by phosphorylation (29, 42). Larger fragments of IN show a significantly stronger interaction with Sir4 in yeast two-hybrid assays than does TD alone (58). Similarly, regions of Esc1 other than those described here can nucleate heterochromatin, presumably through their interaction with Sir4 (2). Although other regions of Esc1 and IN appear to bind Sir4, both the Esc1 motif described here and the TD are critical for productive Sir4 interactions. These motifs may serve as communication modules for a larger protein scaffold that directs their respective Sir4 associations.
How did Ty5 acquire its TD? Esc1 and Ty5 IN may have independently evolved a common motif for interacting with Sir4. Alternatively, and what seems to be a more likely scenario, Ty5 acquired the TD by transducing the motif from Esc1 or a related protein, similar to the mechanism by which oncoviruses acquire oncogenes. Some retrotransposons have chromodomains in their IN C termini (38), which, in cellular genes, recognize specific histone modifications and therefore are logical targeting determinants. It may be that the acquisition of chromatin interaction modules has been commonly employed by retroelements to evolve new target site specificities.
The strategy of tethering integration complexes through mimicry raises interesting questions about the dynamics of the integration process. Because both IN and Esc1 interact with the same region of Sir4, it is possible that these two proteins compete for binding in vivo. However, we saw no transposition or targeting phenotype in esc1Δ strains. We also saw no effect in mitotic stability assays when Ty5 was overexpressed or when IN was overexpressed as a GAD or GBD fusion protein (our unpublished observations). Finally, no effect on transposition was observed upon GBD-Esc1 overexpression (our unpublished observations). Previous colocalization studies of Sir4 and an overexpressed GFP-Esc1 fusion protein showed that the two proteins only partially colocalize (2). Although this observation may not reflect the association of these proteins at endogenous expression levels, it is likely that free Sir4 with which both IN and Esc1 can interact is available.
Although Sir4 appears to be the main targeting determinant, several lines of evidence suggest that other factors play a role. For example, the level of targeted integration in sir4Δ strains is significantly higher than expected for random integration, suggesting the involvement of targeting cofactors (62). Sir4 interacts with numerous proteins, and we have not exhaustively tested other Sir4-interacting proteins for transposition phenotypes, including the novel Sir4 interactors identified in this study. Among the Sir4 interactors that we did analyze, namely, Esc1, Ku70, and Sir4 itself, we did not observe an effect on Ty5 transposition frequencies or target specificity as measured by our tethered targeting assay (see Table S2 in the supplemental material) (data not shown). However, there is a hint that Esc1 or the Esc1-Sir4 interaction contributes to target specificity: many of the Sir4 mutants that failed to interact with IN retained the ability to target Ty5 integration at wild-type levels, and all of these mutants also interacted with the C terminus of Esc1. A subtle effect on targeting in esc1Δ strains may not be revealed by our tethered targeting assays but may be evident by surveying integration patterns of chromosomal insertions.
Also of interest is how integration complexes access DNA during the integration process. One mechanism for Ty5 may be to link integration to certain times in the cell cycle, perhaps when heterochromatin is dismantled or newly established. At the telomeres, heterochromatin formation begins with Rap1 binding to the telomeric repeats, followed by the recruitment of Sir4 (36). At this point, Ty5 could gain access to DNA targets as Esc1 begins to tether the telomeres to the nuclear periphery (Fig. 7). Regulation by phosphorylation could help coordinate the timing of such events, raising the possibility that the IN-Sir4 and Esc1-Sir4 interactions are controlled by the same regulatory pathways. Work to determine the timing of Ty5 integration during the cell cycle, to test for a role for phosphorylation in the Esc1-Sir4 interaction, and to identify kinases involved in regulating the interactions of these proteins is currently under way.
Recent findings demonstrate that HIV also targets integration through a protein-protein interaction between HIV IN and LEDGF/p75, a general transcription factor that is constitutively chromatin bound (52). The depletion of LEDGF blocks HIV integration (35), indicating the importance of chromatin localization in the infection process. Emerin, a protein localized at the nuclear periphery, has also been reported to be required for HIV infectivity (26). Although the role of emerin remains controversial (46), clear parallels can be drawn between Ty5 and HIV integration in terms of the impact of chromatin-bound proteins, some of which localize to the nuclear periphery (Fig. 7).
Another similarity between Ty5 and HIV is the recent finding that LEDGF interacts with the Myc-binding host factor JPO2 (37). This LEDGF-JPO2 association stabilizes JPO2 protein levels, and LEDGF tethers JPO2 to chromatin in a manner analogous to the way it tethers IN (34, 52) (Fig. 7). Mutations that affect the IN binding surface of LEDGF disrupt interactions with JPO2, suggesting that an HIV IN-JPO2 relationship is similar to that of Ty5 IN and Esc1. Thus, mimicry of host factor interactions may be a general phenomenon utilized by retroelements to tether integration complexes to target sites. A further understanding of the role of chromatin in target site selection will likely provide insight into how these elements impact genome organization and may offer new therapeutic opportunities for treating retroviral diseases.
We thank P. James for the yeast two-hybrid library, M. Gartenberg for reagents for measuring mitotic stability, and R. Sternglanz for Esc1 plasmids and yeast strains for the heterochromatin nucleation assays. We are grateful to X. Gao for help with statistical analyses.
This work was supported by National Institutes of Health grant GM061657 to D.F.V.
Published ahead of print on 17 December 2007.
†Supplemental material for this article may be found at http://mcb.asm.org/.