Translocated in liposarcoma, Ewing's sarcoma and TATA-binding protein-associated factor 15 constitute an interesting and important family of proteins known as the TET proteins. The proteins function in several aspects of cell growth control, including multiple different steps in gene expression, and they are also found mutated in a number of specific diseases. For example, all contain domains for binding nucleic acids and have been shown to function in both RNA polymerase II-mediated transcription and pre-mRNA splicing, possibly connecting these two processes. Chromosomal translocations in human sarcomas result in a fusion of the amino terminus of these proteins, which contains a transcription activation domain, to the DNA-binding domain of a transcription factor. Although the fusion proteins have been characterized in a clinical environment, the function of the cognate full-length protein in normal cells is a more recent topic of study. The first part of this review will describe the TET proteins, followed by detailed descriptions of their multiple roles in cells. The final sections will examine changes that occur in gene regulation in cells expressing the fusion proteins. The clinical implications and treatment of sarcomas will not be addressed but have recently been reviewed.
The TET family of proteins consists of translocated in liposarcoma (TLS), Ewing's sarcoma (EWS) and TATA-binding protein-associated factor 15 (TAF15), and has various roles in gene expression. TLS, EWS and TAF15 are predominantly nuclear and are highly expressed in all human fetal and adult tissues examined (Zinszner et al., 1994; Morohoshi et al., 1996; Andersson et al., 2008). Each of these proteins contains an amino terminus rich in Gln, Gly, Ser and Tyr, and all three also contain a conserved RNA-binding domain (RBD), RGG regions that may affect RNA binding, and a Cys2–Cys2 zinc finger that may bind nucleic acids (Figure 1).
Both EWS and TLS (also called FUS) were originally discovered as a result of characteristic chromosomal translocations in Ewing's sarcoma and myxoid liposarcoma, respectively. In the former, the chimeric protein consists of the amino terminus of EWS joined to the DNA-binding domain of the transcription factor FLI-1, whereas in the latter, the full-length CHOP protein replaces the carboxy-terminus of TLS (Delattre et al., 1992; Crozat et al., 1993; Rabbitts et al., 1993). TAF15 was originally found as a TAF in the general transcription initiation TFIID complex, and cloned both through biochemical methods and through its homology in the RBD to TLS and EWS (Bertolotti et al., 1996; Morohoshi et al., 1996; Tora, 2002). The TAF15 gene has subsequently been found translocated to the transcription factor CIZ in acute leukemia (Martini et al., 2002) and to the nuclear receptor CHN/TEC in extraskeletal myxoid chondrosarcoma (Attwooll et al., 1999; Panagopoulos et al., 1999; Sjogren et al., 1999), but these occur at a much lower frequency than translocations involving TLS and EWS. In all cases, the fusions are expressed under the control of the TET protein promoter.
The amino terminus of TET proteins can function like a transcriptional activation domain when fused to a DNA-binding domain (Zinszner et al., 1994; Bertolotti et al., 1999). Although the amino terminus of all TET proteins is enriched for Gln, Gly, Ser and Tyr residues, there are also some variations between the TET proteins. The amino terminus of EWS is rich in Pro and Thr and contains many copies of a hexapeptide repeat of consensus Ser-Tyr-Gly-Gln-Gln-Ser, with absolute conservation of Tyr at the second position and high conservation of Gln at the fourth position (Ng et al., 2007). This region is reminiscent of the heptapeptide repeats that constitute the carboxy-terminal domain (CTD) of the RNA polymerase (RNAP) II largest subunit, and the high Gln and Pro content is similar to activation domains of various transcription factors (Delattre et al., 1992). Computational modeling and synthetic constructs indicated that multiple Tyr residues, or at least an aromatic side chain since Phe can substitute, are required for transcription activation, and that the region is highly disordered, so structure is not a key determinant (Lee, 2007; Ng et al., 2007).
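The repeat rule described above (a six-residue unit with consensus Ser-Tyr-Gly-Gln-Gln-Ser, invariant Tyr at position 2 and mostly conserved Gln at position 4) can be sketched as a simple window scan. This is only an illustration: the function name, the identity threshold and the toy sequence are assumptions made for the example, not the method or data of Ng et al. (2007), and the toy string is not the real EWS sequence.

```python
CONSENSUS = "SYGQQS"  # hexapeptide consensus quoted in the text (one-letter codes)


def hexapeptide_hits(seq, min_identity=3):
    """Return (start, window, identity) for 6-residue windows resembling the repeat.

    Assumed criteria for this sketch: Tyr is required at position 2, and at
    least `min_identity` of the six residues must match the consensus.
    """
    hits = []
    for i in range(len(seq) - 5):
        window = seq[i:i + 6]
        if window[1] != "Y":  # position 2 must be Tyr
            continue
        identity = sum(a == b for a, b in zip(window, CONSENSUS))
        if identity >= min_identity:
            hits.append((i, window, identity))
    return hits


# Hypothetical toy sequence, invented for illustration only.
toy = "MASYGQQSAPYSQQPTSYGQTS"
for start, window, identity in hexapeptide_hits(toy):
    print(start, window, identity)
```

Loosening or tightening `min_identity` shows how many degenerate copies are counted. Because Phe can substitute for Tyr in activation assays, a variant of the check could accept either aromatic residue at position 2.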
The 90-amino acid RBD in TET proteins folds into a structure with a sheet of four anti-parallel β-strands perpendicular to two α-helices (Burd and Dreyfuss, 1994). Within this domain, two short motifs, RNP-1 and RNP-2, directly contact RNA via hydrogen bonds and ring stacking. However, unlike other proteins with this type of RBD, TET proteins have an acidic residue at the second position and a Thr residue in the fourth position of RNP-1 as well as an unusually long loop after the first α-helix (Bertolotti et al., 1999), which may affect the structure of the protein since this region contributes to the hydrophobic core of the domain and to RNA-binding specificity or affinity. The RBD is the most conserved region within the TET protein family (Figure 2). Sequence-specific RNA binding by TET proteins has been examined by various groups. Early on, it was shown that EWS binds polyU and polyG sequences through its carboxy-terminal RGG domain (Ohno et al., 1994). Subsequent experiments showed that TLS also has affinity for polyG and polyU homoribopolymers, and in vitro systematic evolution of ligands by exponential enrichment (Tuerk and Gold, 1990) experiments suggested that the RBD and RGG motifs co-operate to bind GGUG RNA, but with relatively low affinity (Kd = 250 nM) (Lerga et al., 2001).
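To give a feel for what a Kd of 250 nM implies, the fraction of RNA bound can be estimated with a one-site binding isotherm. This is a minimal sketch under stated assumptions: simple 1:1 binding, RNA present in trace amounts so that free protein is approximately equal to total protein, and a hypothetical function name. It is not a model taken from Lerga et al. (2001).

```python
def fraction_bound(protein_nM, kd_nM=250.0):
    """Fraction of RNA bound for simple 1:1 binding: [P] / (Kd + [P]).

    Assumes RNA is at trace concentration, so free protein ~ total protein.
    The default Kd is the value quoted in the text for GGUG RNA.
    """
    if protein_nM < 0 or kd_nM <= 0:
        raise ValueError("concentrations must be non-negative and Kd positive")
    return protein_nM / (kd_nM + protein_nM)


for conc in (25, 250, 2500):
    print(conc, round(fraction_bound(conc), 3))
```

Under these assumptions half of the RNA is bound when the protein concentration equals the Kd, and a tenfold excess over the Kd is needed to bind roughly 90% of it.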
In addition to the RBD, TET proteins contain other domains that bind nucleic acids. The zinc finger of TET proteins resembles those in ZIS, a splicing protein that contains two zinc fingers, a stretch of acidic amino acids and a Ser/Arg-rich (RS) domain (Ladomery and Dellaire, 2002). Limited proteolysis followed by MALDI-TOF and circular dichroism analyses demonstrated that both the RBD and zinc finger regions of TLS fold into protease-resistant structures, and NMR analysis suggested that the zinc finger may bind RNA (Iko et al., 2004). The carboxy-terminus of TLS also contains three RGG motifs that may increase RNA affinity of the RBD or zinc finger, and may also be the site of post-translational modifications that regulate RNA binding or protein–protein interactions (Burd and Dreyfuss, 1994). The carboxy-terminus of TAF15 contains ~20 copies of perfect or degenerate Gly-Gly-Tyr-Gly-Gly-Asp-Arg repeats, and this region is likely to have a role in RNA binding as it encompasses many RGG boxes (Burd and Dreyfuss, 1994; Bertolotti et al., 1996; Morohoshi et al., 1996).
TET proteins also bind single-stranded DNA and possibly double-stranded DNA. TLS promotes D-loop formation, a process where a single strand of DNA invades and pairs with one of the strands in a double-stranded region of DNA, and this process is necessary for DNA repair and recombination (Baechtold et al., 1999; Bertrand et al., 1999). DNA binding is likely to occur through the Cys2–Cys2 zinc finger at the carboxy-terminus of the protein, a domain frequently found in transcription factors that is well known for binding nucleic acids, particularly DNA (Pieler and Theunissen, 1993). There are, however, reports of RBDs mediating sequence-specific DNA binding (Ding et al., 1999; Ladomery and Dellaire, 2002). Further analysis of each of the nucleic acid-binding domains is required to separate the specificity, affinity and function.
A Drosophila melanogaster ortholog was found to share homology with the RBD and the carboxy-terminus of TLS and EWS, and named SARFH (sarcoma-associated RNA-binding fly homolog) or Cabeza (Immanuel et al., 1995; Stolow and Haynes, 1995). This protein both binds RNA in vitro through its RBD and co-localizes with sites of active RNAP II transcription (Immanuel et al., 1995; Stolow and Haynes, 1995).
TET proteins contain many sites for post-translational modifications, including phosphorylation and arginine methylation. Such modifications in or near the RBD and RGG motifs may affect RNA- and DNA-binding, interactions with other proteins, protein stability or subcellular localization (Burd and Dreyfuss, 1994).
TLS and EWS are targets of various Ser/Thr protein kinases. Ser 42 in TLS, but not in the TLS–CHOP fusion protein, is phosphorylated by the protein kinase ATM in response to double-strand breaks in DNA caused by ionizing radiation (Gardiner et al., 2008). EWS and EWS–FLI1 are phosphorylated at Thr 79 by the JNK and p38 families of protein kinases in response to DNA damage, and weakly phosphorylated by ERK in response to mitogens (Klevernic et al., 2008). TET proteins are also substrates for the PKC family: EWS is phosphorylated at Ser 266 and TLS at Ser 256 (Deloulme et al., 1997; Perrotti et al., 1998). Phosphorylation of Ser 256 protects TLS from proteolytic degradation by masking the binding site that targets TLS for proteasome-mediated degradation (Perrotti et al., 2000). Separately, the form of TLS that has homologous DNA pairing activity was also found to be phosphorylated in the zinc finger domain at Ser 439 (Guipaud et al., 2006). TLS and TAF15 are also substrates for Tyr phosphorylation. TLS is a target of the fibroblast growth factor receptor 1 kinase, and a distinct fraction of TLS is phosphorylated on Tyr residues and localized to the cytoplasm (Klint et al., 2004). Whereas Tyr phosphorylation of TLS results in cytoplasmic localization, TAF15 phosphorylation on Tyr residues by v-Src leads to increased transcriptional activation by TAF15 in the nucleus (Lee et al., 2004). The amino terminus of EWS is also post-translationally modified by O-linked-β-N-acetylglucosaminylation, which may overlap the sites of and thus prevent Ser/Thr phosphorylation (Bachmaier et al., 2009). These data indicate that phosphorylation of TET proteins regulates cellular localization, protein stability and function.
Methylation of arginine residues is a post-translational modification that occurs on many RNA-binding proteins and may affect protein–RNA and protein–protein interactions, protein stability or subcellular localization (Liu and Dreyfuss, 1995). Although methylation does not affect the positive charge on Arg residues, it does alter the steric properties of the side chain contacts to RNA or proteins and may decrease the hydrophilicity of a protein (Beyer et al., 1977). This modification is performed by protein arginine methyltransferases (PRMTs) and the structure of Arg allows for mono- and di-methylation. Type I PRMTs, including PRMT1 and PRMT3, monomethylate and asymmetrically dimethylate Arg residues, whereas type II PRMTs monomethylate and symmetrically dimethylate Arg residues. Dimethylation of hnRNPs accounts for the bulk of arginine methylation in the nucleus (Jobert et al., 2009a), occurs on ~12% of Arg residues in hnRNPs (Liu and Dreyfuss, 1995) and targets hnRNPs for nuclear export (Stallcup, 2001).
TET proteins were found to be targets of PRMTs, and all three are substrates for PRMT1. TLS contains over 20 asymmetrically dimethylated Arg residues in the RGG motifs at the carboxy-terminus of the protein (Rappsilber et al., 2003). Similarly, EWS is extensively methylated in vitro and in vivo, and the majority of RGG sites are asymmetrically dimethylated, with rare cases of monomethylation and no instances of symmetric dimethylation (Belyanskaya et al., 2001; Pahlich et al., 2005). Such post-translational modification may alter cellular localization, including nuclear export of methylated EWS (Belyanskaya et al., 2003; Araya et al., 2005). TAF15 is also methylated on Arg residues in its RGG boxes by PRMT1, but in this case, the modification appears to be important for nuclear localization rather than nuclear export (Jobert et al., 2009a). Furthermore, cellular stress or lack of Arg methylation through knockdown of PRMT1 causes TAF15 to localize to stress granules (Jobert et al., 2009a). PRMT8, a type I PRMT that is located at the cell surface and predominantly expressed in brain tissue, binds the third RGG box in EWS (Pahlich et al., 2008), but the functional significance of this interaction is unclear. It should be noted that a brain-specific form of EWS, which results from alternative splicing, has been detected (Melot et al., 2001). The brain-specific form of EWS bound by PRMT8 may be monomethylated or asymmetrically dimethylated, opening the possibility for differential regulation of EWS in different cellular compartments and cell types.
TET proteins are likely to function in RNAP II transcription by interacting with TFIID and subunits of RNAP II itself. TFIID consists of TBP and a heterogeneous mixture of TAFs in stoichiometric or sub-stoichiometric ratios, and the various complexes may have different effects on basal or activated transcription (Brou et al., 1993). It is possible that the proteins associated with core TFIID components may affect promoter choice and recruitment of processing factors. Since each of the TET proteins co-purifies with a separate fraction of TFIID in a sub-stoichiometric ratio (Bertolotti et al., 1996, 1998), these related proteins may have distinct functions and may regulate different groups of genes.
TLS, EWS and TAF15 also associate directly with RNAP II (Bertolotti et al., 1996, 1998). EWS interacts with the Rpb3 subunit of RNAP II, whereas TAF15 interacts with Rpb3, Rpb5 and Rpb7 (Bertolotti et al., 1998). Another study indicated that the EWS amino terminal trans-activation domain also interacts with Rpb4 and Rpb7 subunits (Zhou and Lee, 2001). Rpb7 resembles the bacterial σ factor and in yeast may have a role in transcription regulation (Petermann et al., 1998). The amino terminus of EWS binds RNAP II and although this region is present in the fusion protein, it is unclear whether EWS–FLI1 is also able to do so. One study found that the EWS–FLI1 fusion protein does not interact with RNAP II (Bertolotti et al., 1998), whereas another group found that not only does EWS–FLI1 interact with RNAP II but also that overexpression of RPB7 increases EWS–FLI1, but not FLI1, transactivation (Petermann et al., 1998).
In addition to directly contacting general transcription factors and RNAP II, TET proteins may regulate transcription through contacting activators or repressors. TLS, but not EWS, interacts with the DNA-binding domain of various nuclear hormone receptors, without affecting the ability of the receptor to bind DNA response elements, suggesting that TLS may have a role in activating transcription of certain receptors under specific conditions (Powers et al., 1998). EWS binds various proteins containing a POU DNA-binding domain, including the transcriptional activator Oct4, which is expressed in embryonic stem and germ cells to maintain an undifferentiated totipotent state (Lee et al., 2005), and Brn3a, which is expressed in the developing and adult nervous systems to promote development of specific neuronal lineages (Thomas and Latchman, 2002).
The amino terminus of each TET protein was shown to act as a transcriptional activator when fused to a DNA-binding domain (Bailly et al., 1994; Zinszner et al., 1994; Bertolotti et al., 1999). In the case of EWS, however, transcription activation potential is decreased or even abolished in the full-length protein, indicating that the RBD and RGG boxes may regulate the activation domain in the normal protein (Li and Lee, 2000; Rossow and Janknecht, 2001; Alex and Lee, 2005). Although the EWS–FLI1 fusion protein activates transcription in vitro (Uren et al., 2004), others suggest that the in vitro system does not reflect in vivo events (Ng et al., 2009). EWS interacts with both the CREB-binding protein (CBP; Fujimura et al., 2001; Rossow and Janknecht, 2001; Araya et al., 2003) transcriptional co-activator and also the transcriptional repressor SF1 (Zhang et al., 1998), indicating that it may positively and negatively regulate transcription.
TLS can also repress transcription by RNAP III, which transcribes small structural and catalytic RNAs (A.Y.T. and J.L.M., submitted for publication). Repression occurs both in vitro and in vivo, and changes in levels of TLS protein in cells affect levels of RNAP III transcripts. This regulation is likely to occur through TLS binding to TBP and possibly affecting interactions with the general RNAP III transcription machinery. TLS joins a list of cancer-related factors, such as p53 and Rb, which can affect transcription by more than one RNAP.
Many experiments have suggested a role for TET proteins in pre-mRNA splicing. Early on, mass spectrometry identified TLS as hnRNP P2 in the H complex of proteins that assembles non-specifically and in an ATP-independent manner on pre-mRNAs in vitro (Calvio et al., 1995). More significantly, TLS was found to crosslink to the pre-mRNA 3′ splice site during the second step of splicing (Wu and Green, 1997) and was also found at the 5′ splice site in a large complex containing hyperphosphorylated RNAP II, U1 snRNP, p54nrb/PSF and transcription elongation factors P-TEFb, Tat-SF1 and TFIIF (Kameoka et al., 2004). Large-scale purification of functional spliceosomes and mass spectrometry analyses confirmed the presence of TET proteins in the spliceosome (Rappsilber et al., 2002; Zhou et al., 2002).
TET proteins bind various SR proteins and other known splicing factors. For example, the carboxy-terminus of TLS was shown to interact with SR proteins TASR (TLS-associated SR protein; also known as SRp38) and SC35 (Yang et al., 1998, 2000b; Lerga et al., 2001; Shin and Manley, 2002). In addition, TLS associates with SRp75 and SRm160 and with hnRNPs A1 and C1/C2 and PTB (Lerga et al., 2001; Meissner et al., 2003). The carboxy-termini of TLS and EWS were found to interact with YB-1, a splicing activator that may also have a role in mRNA packaging (Chansky et al., 2001).
Interactions between TET proteins and known splicing factors can affect patterns of alternative splicing. Transient transfection of TLS and reporter constructs in HeLa cells showed that TLS overexpression increased production of the 13S and 12S isoforms from an adenovirus E1A pre-mRNA, reflecting preferential use of downstream alternative 5′ splice sites (Yang et al., 1998). However, in an erythroleukemic cell line that lacks the transcription/splicing factor Spi-1, transient expression of TLS and the E1A reporter construct increased the use of the upstream 5′ splice site and thus production of the 9S isoform (Hallier et al., 1998). The differing results observed with TLS may reflect the presence or absence of Spi-1, which can also affect E1A splicing in these assays, and which binds directly to TLS (Hallier et al., 1998). These contrasting results highlight the difficulties in analyzing effects on splicing solely by transient overexpression assays. It will be important in the future to confirm and extend these results using other methods, such as RNAi-mediated knockdowns and in vitro splicing assays, to determine how TET proteins affect alternative splicing and to clarify what interactions are important in different cell types or conditions.
Transcription and splicing are likely to be connected by proteins with roles in both processes. Proteins and their various mechanisms of coordinating transcription and splicing are being characterized, and some may be gene-specific. It is possible that TET proteins connect transcription and splicing, since the amino terminus mediates interactions with RNAP II and the carboxy-terminus binds to splicing factors. TET proteins may recruit splicing factors to the RNAP II CTD, which co-ordinates pre-mRNA processing events (for review, see Hirose and Manley, 2000; Bentley, 2005; de Almeida and Carmo-Fonseca, 2008). As mentioned above, TLS was found as a component of in vitro transcription and splicing complexes (Kameoka et al., 2004), further suggesting that TLS may bridge these processes. Confirmation that TLS links transcription to splicing and elucidation of the mechanism by which this occurs would add to existing knowledge about the regulation of gene expression and may suggest how this protein family functions in disease. Furthermore, it is possible that each TET protein recruits different splicing factors to separate types of genes; such specificity could be achieved through TET proteins associating with distinct populations of TFIID.
In addition to the type of promoter and recruitment of specific splicing factors, transcription elongation rate may also affect alternative splicing (Kornblihtt et al., 2004) and splicing may affect mRNA transcription (Manley, 2002). Tat-SF1, a protein distantly related to the TET proteins, stimulates the Tat protein of HIV-1 and acts as a general transcription elongation factor (Zhou and Sharp, 1996; Li and Green, 1998). Tat-SF1 contains two tandem RRMs with homology to EWS and TLS and a highly acidic carboxy-terminus (Zhou and Sharp, 1996). Tat-SF1 recruits the CTD kinase P-TEFb to Tat bound to TAR RNA to stimulate transcription, and U snRNPs to elongating RNAP II to enhance splicing (Fong and Zhou, 2001).
The TET proteins interact with a diverse range of proteins. In keeping with this, they may act to connect a number of processes within the cell, possibly by acting as a scaffold. For example, EWS has been shown to interact with a large number of other proteins using high-throughput mapping of protein–protein interactions (Rual et al., 2005). In addition, systematic analysis of proteins in a complex containing the RNase III enzyme Drosha, which processes microRNA precursors, identified EWS along with a number of other proteins involved in RNA binding (Gregory et al., 2004). The functional significance of EWS co-eluting with Drosha in this large complex remains to be determined, but it provides a possible link between the TET proteins and regulation of microRNAs.
More recently, TAF15 was found associated with U1 snRNA, chromatin and RNA, in a complex distinct from the Sm-containing U1 snRNP that functions in splicing. Although the function of this particle has not been determined, the TAF15-U1 snRNA interaction increased after transcriptional inhibition and the complex localized to perinucleolar structures (Jobert et al., 2009b). These findings raise the possibility that TET proteins may regulate transcription and splicing through multiple mechanisms.
TET proteins have also been implicated in the DNA repair pathway. In vitro pairing on membrane (POM) assays demonstrated that TLS mediates annealing of complementary DNA strands to form D-loops, and this function is lost in TLS–CHOP (Baechtold et al., 1999; Bertrand et al., 1999). The POM blot followed by mass spectrometry revealed that all TET proteins are able to pair homologous DNA in vitro, and that this activity was specific to TET proteins and PSF, a splicing factor with domains similar to those in the TET proteins (Guipaud et al., 2006). Thus, TET proteins may have a role in DNA repair, especially at sites of active transcription or after cellular signaling via kinase cascades.
In an animal model, TLS knockout mice have high genomic instability due to chromosomal pairing defects and enhanced sensitivity to radiation (Hicks et al., 2000; Kuroda et al., 2000). The phenotype of TLS knockout mice depends on the genetic background: inbred TLS−/− mice showed defects in B-lymphocyte development, genomic instability and perinatal death (Hicks et al., 2000). Although outbred TLS−/− mice were able to survive until adulthood, they displayed defects in spermatogenesis and sensitivity to ionizing radiation (Kuroda et al., 2000). The c-ABL and ATM kinases are both activated by DNA damage and target TLS (Perrotti et al., 1998; Gardiner et al., 2008), so cells lacking TLS might not undergo appropriate DNA repair, resulting in genomic instability. DNA damage can also cause TLS to bind a non-coding RNA near the cyclin D1 promoter. This subsequently inhibits the histone acetyl transferase activity of CBP/p300 and results in transcriptional repression of the cyclin D1 gene (Wang et al., 2008). These data suggest that TLS could have multiple roles in DNA repair, including a link to the cell cycle through regulating transcription of cyclin D1.
Similarly, knockout of EWS in mice resulted in B-lymphocyte defects, decreased meiotic recombination, sensitivity to ionizing radiation, and pre-natal mortality of inbred mice and high rates of post-natal mortality in outbred strains (Li et al., 2007). Although EWS and TLS are quite similar, loss of EWS does not result in increased levels of TLS, and TLS has a role in pairing autosomes whereas EWS is involved with pairing the XY sex chromosomes during meiosis (Li et al., 2007). Morpholino-mediated knockdown of EWS orthologs in zebrafish resulted in mitotic defects during development, apoptosis of preneural cells and embryonic lethality, whereas siRNA-mediated knockdown of EWS in HeLa cells also led to mitotic defects and apoptosis (Azuma et al., 2007). Members of the TET family thus have distinct, non-redundant roles in meiosis and functional conservation between species, as demonstrated by studies using animal models.
Although TET proteins are predominantly nuclear, TLS shuttles between the nucleus and the cytoplasm attached to RNA (Zinszner et al., 1997b) and thus may function in RNA transport. TLS was found in a complex with RNA-transporting proteins, translational regulators and mRNA, and this kinesin-associated granule transports RNA to the plus ends of microtubules, such as the dendrites of neurons (Kanai et al., 2004). TLS binds and is transported with the mRNA encoding Nd1-L, a protein that stabilizes actin, and thus might have a role in regulating actin organization in dendritic spines (Fujii and Takumi, 2005). Both microtubules and actin target RNA-bound TLS to dendritic granules, and this process is activated by metabotropic glutamate receptor 5 (mGluR5) excitation of synapses (Fujii et al., 2005). TLS also binds Sam68, an RNA-binding protein that functions in splicing but is also found in dendritic granules in neurons, and the N-methyl-d-aspartate receptor, which is involved in synaptic signaling (Belly et al., 2005). Taken together, these results indicate that TLS could affect mRNA transport with either actin or microtubules. This may alter dendritic structure after excitation and may affect long-term synaptic plasticity.
TLS has also been implicated in a form of familial amyotrophic lateral sclerosis (ALS), a fatal neurodegenerative disease. Genetic linkage analysis had previously implicated a region of chromosome 16 as the source of an autosomal dominant mutation. Very recently, two groups found that specific mutations in the highly conserved last 13 amino acids of TLS correlated with familial ALS but not sporadic ALS (Kwiatkowski et al., 2009; Vance et al., 2009). Single nucleotide changes were found that result in the substitution of a single amino acid, including missense mutations in each of the last five Arg residues of TLS in ALS patients, with the most common being an Arg to Cys substitution (R521C). These mutations resulted in increased cytoplasmic localization of TLS, apparently in an aggregated form, in the motor neurons of patients, as well as in transfected cells and rat cortical neurons and in cells expressing a fluorescently tagged version of the mutant TLS protein (Kwiatkowski et al., 2009; Vance et al., 2009). The carboxy-terminus is crucial for normal functioning of the protein in neurons, since the mutated Arg residues may be targets for dimethylation that could alter subcellular localization of TLS. The role of TLS in ALS remains to be characterized, and it is currently not clear whether ALS results from loss or alteration of TLS function, or from a toxic effect of the cytoplasmic aggregates.
TLS and EWS were discovered translocated in sarcomas. Sarcomas are aggressive cancers of mesodermal origin that occur in connective tissue and possibly originate from mesenchymal progenitor cells; the translocations rarely appear in carcinomas of epithelial origin (Arvand and Denny, 2001). Sarcomas are relatively infrequent, accounting for under 10% of human cancers, and bone sarcomas tend to affect children and adolescents whereas soft tissue sarcomas such as liposarcoma are more common in adults (Riggi and Stamenkovic, 2007; Osuna and de Alava, 2009).
Translocation results in the TET protein promoter driving expression of the fusion protein. Decreased level of the full-length TET protein in cells containing the translocation is unlikely to be the cause of transformation given that mice heterozygous for either EWS or TLS are indistinguishable from their wild-type littermates (Kuroda et al., 2000; Li et al., 2007). Instead, introduction of the gain-of-function fusion protein into cells causes deregulated expression of target genes and alters the differentiation pattern of certain cell types (Aman, 1999; Martini et al., 2002). Fusion proteins target genes that permit cell growth and require other genetic alterations for tumor development (reviewed in Janknecht, 2005; Xia and Barr, 2005). Variations in chromosome breakpoints lead to fusion proteins with differing transcriptional activation strengths and abilities to transform cells, and these correlate with phenotype, tumor progression and patient prognosis (Zoubek et al., 1996; Lin et al., 1999; Gonzalez et al., 2007). Each type of sarcoma is caused by one or a small number of related fusion proteins, suggesting that the fusion proteins can only activate the oncogenic pathway in specific cell types.
The process of translocation has been studied, and characteristic sequences at the fusion gene breakpoints, recognized by specific DNA-binding proteins, may have a role in recombination (Panagopoulos et al., 1997). Sequencing at the breakpoints and the presence of extra nucleotides indicate that certain proteins may be required for the chromosomal breakage and re-ligation reactions. In the case of TLS–CHOP, the protein Translin, which is often found at recombination hotspots, binds to DNA sequences near some TLS and CHOP chromosome breakpoints (Kanoe et al., 1999; Hosaka et al., 2000; Yu and Hecht, 2008). The translocation breakpoints also contain binding sites for topoisomerase II, which creates staggered ends of double-stranded DNA and allows for the re-joining of DNA strands (Kanoe et al., 1999). Alu repeat sequences or an octamer similar to the bacterial Chi recombination site could also promote translocation, suggesting that an established pathway is involved in the process of recombination (Xiang et al., 2008).
Many variations to EWS–FLI1 and TLS–CHOP (Figure 3) have been documented in clinical literature and involve a TET protein fused to the DNA-binding domain or a full-length transcription factor (for a full list, see Oda and Tsuneyoshi, 2009; Osuna and de Alava, 2009). Clinical therapies are beyond the scope of this review but are discussed elsewhere (Olsen et al., 2002; Bernstein et al., 2006; Oda and Tsuneyoshi, 2009). In the following sections, we will address some of the mechanisms through which TLS–CHOP and EWS–FLI1 lead to cellular transformation. We focus on these two fusions as they are the most prevalent and best studied. The role of these fusion proteins has been studied both by introducing the fusion protein into normal cells and by knocking down expression of the fusion protein in sarcoma cells.
Myxoid liposarcoma displays a number of characteristic phenotypes, including morphology change to round nuclei in lipoblasts, intracellular lipid accumulation and induction of adipocyte-specific genes (Osuna and de Alava, 2009). Myxoid liposarcoma arises from translocation t(12;16)(q13;p11) and expression of the TLS–CHOP fusion protein. CHOP is a transcription factor that negatively regulates the CCAAT/enhancer binding protein (C/EBP) family; this family includes genes that control adipogenesis (Zinszner et al., 1994). CHOP dimerizes and induces growth arrest in response to cellular stress. Although TLS–CHOP can dimerize, it does not result in growth arrest and acts in a dominant negative manner, possibly by competing for binding partners or inappropriately activating transcription (Barone et al., 1994). TLS–CHOP is more highly localized to the nucleus than TLS, and is not exported to the cytoplasm or localized to the nucleolus when transcription is inhibited (Zinszner et al., 1994, 1997a), which allows for unregulated gene expression.
The oncogenic transformation in human liposarcoma has been attributed to uncontrolled transcriptional activation by TLS–CHOP and to its interference with adipocytic differentiation. TLS–CHOP-containing cells express the neural-specific genes Nexin and Neuronatin, as well as the RET proto-oncogene, at higher levels than they are expressed in normal fat tissue (Thelin-Jarnum et al., 1999). TLS–CHOP activates the DOL54 gene, which encodes a secreted protein that can promote tumorigenicity (Kuroda et al., 1999). In addition, TLS–CHOP halts C/EBP-mediated activation of the PPARγ adipocyte differentiation cascade (Perez-Mancera et al., 2007, 2008) to promote cell proliferation and liposarcoma development.
Significantly, the TLS–CHOP fusion protein is only able to direct tumor development in certain cell types: despite expressing TLS–CHOP in all cells, a transgenic mouse model develops only liposarcomas (Perez-Losada et al., 2000a). Overexpression of CHOP alone is not enough to transform cells in culture or induce tumors in mice (Zinszner et al., 1994; Perez-Losada et al., 2000b). Thus, while transcriptional activation conferred by the amino terminus of TLS is required, unregulated expression of TLS–CHOP is only able to activate the liposarcoma tumorigenesis pathway in adipocytes. Similarly, in patients, each type of translocation (and the resulting fusion protein) corresponds to a specific type of sarcoma, suggesting that unregulated expression of target genes can only drive tumor formation in certain cell types.
Ewing's sarcoma typically arises in bone as a result of translocation t(11;22)(q24;q12) joining the EWS and FLI1 genes (Delattre et al., 1992). The EWS–FLI1 fusion protein is found in ~85% of the cases, whereas fusions of EWS to other transcription factors, including ERG, ETV-1, E1AF, ATF-1 and WT-1, make up the remainder (Oda and Tsuneyoshi, 2009). The ETS family, which includes FLI1, ERG, ETV-1 and E1AF, is a large group of transcription factors found only in metazoans that share a characteristic DNA-binding domain recognizing GGAA or GGAT sequences (for a review, see Oikawa and Yamada, 2003). EWS fusions to other transcription factors, including ATF-1 and WT-1, have been found but are infrequent, and are reviewed elsewhere (Gerald and Haber, 2005; Riggi et al., 2007). The amino terminus of FLI1 contains a weak transcription activation domain, which is replaced by the stronger EWS activation domain in the fusion protein. EWS–FLI1, like TLS–CHOP and unlike EWS, is predominantly nuclear with very little cytoplasmic localization (Yang et al., 2000a).
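As an illustration of what recognition of GGAA or GGAT core sequences means at the sequence level, the following minimal Python sketch lists every occurrence of either core on both strands of a DNA string. It is not taken from any of the cited studies: the function names `reverse_complement` and `find_ets_cores` are hypothetical, and a plain string match ignores the flanking bases and chromatin context that influence real ETS binding, so hits should be read as candidate sites only.

```python
import re

# Core ETS recognition sequences named in the text.
ETS_CORES = ("GGAA", "GGAT")

# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_ets_cores(seq: str):
    """Return sorted (position, strand, core) tuples for each core match.

    Positions are 0-based and always given on the forward strand.
    A '-' strand hit means the reverse complement of the core was
    found at that forward-strand position.
    """
    seq = seq.upper()
    hits = []
    for core in ETS_CORES:
        # A zero-width lookahead reports overlapping matches as well.
        for match in re.finditer(f"(?={core})", seq):
            hits.append((match.start(), "+", core))
        rc_core = reverse_complement(core)
        for match in re.finditer(f"(?={rc_core})", seq):
            hits.append((match.start(), "-", core))
    return sorted(hits)
```

For example, `find_ets_cores("ACCGGAAGT")` reports a single forward-strand GGAA core at position 3, whereas the palindrome-like string `"GGATCC"` yields a GGAT core on each strand.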
The breakpoints in EWS are not homogeneous but are instead clustered within a small region of EWS and spread over a larger region in the FLI1 gene. Although it is near the EWS breakpoint and could be included in-frame in the fusion protein, the RBD of EWS was not detected during examination of a large number of tumors containing EWS–FLI1 or EWS–ERG fusion proteins, suggesting that the RBD may decrease or inhibit the oncogenic potential of the fusion protein (Zucman et al., 1993). The particular variant of the EWS–FLI1 fusion correlates with patient prognosis, reflecting different transcriptional activation potentials of the resulting fusion proteins (de Alava et al., 1998; Lin et al., 1999; Aryee et al., 2000). Indeed, the transcriptional activation and metastatic activity of different isoforms of EWS–FLI1 in a mouse model correlated with disease progression in human patients (Gonzalez et al., 2007). Furthermore, duplication or loss of certain chromosome regions, which is frequent in Ewing's sarcoma patients, also correlates with disease progression and clinical outcome (Savola et al., 2009).
Introduction of EWS–FLI1 into a cultured cell line confers changes typical of transformation: alterations in cell morphology, an increased proliferation rate and anchorage-independent growth. Expression of EWS–FLI1 leads to growth arrest in primary cells (Lessnick et al., 2002) but to growth and inhibition of differentiation in transformed cells (Eliazer et al., 2003) and prevents apoptosis in immortalized 3T3 cells (Yi et al., 1997). Many of these changes are the result of de-regulated gene expression by the fusion protein. Both FLI1 and EWS–FLI1 bind the same sequences, due to the ETS domain of FLI1, but EWS–FLI1 drives stronger activation of target genes due to its interaction with RNAP II through the EWS amino terminus (Bailly et al., 1994; Yang et al., 2000a). However, EWS–FLI1 proteins with mutations in the DNA-binding domain were still able to transform cells (Jaishankar et al., 1999; Welford et al., 2001), indicating that some of the oncogenic effects are independent of DNA binding.
EWS–FLI1 affects a range of genes to alter gene expression and cause cellular transformation (for reviews, see Janknecht, 2005; Kovar, 2005; Riggi and Stamenkovic, 2007). EWS–FLI1, but not FLI1 alone, rapidly up-regulates transcription of the gene encoding the SH2 domain-containing adapter protein EAT-2, and this is correlated with transformation (Thompson et al., 1996). EWS–FLI1 also up-regulates expression of the morphogenic gene manic fringe (MFNG), which has a role in somatic development and can cause tumorigenesis when overexpressed (May et al., 1997). EWS–FLI1 binds to the promoter of the telomerase reverse transcriptase (TERT) gene and recruits CBP/p300 to activate TERT transcription (Takahashi et al., 2003); an elevated level of TERT is important for continued cell division and is a characteristic of many cancers. EWS–FLI1 also represses the tumor suppressor TGF-β type II receptor to relieve growth inhibition (Hahm et al., 1999). Depleting EWS–FLI1 protein in Ewing's sarcoma-derived cells decreased tumorigenicity in vivo and, in vitro, reduced growth and induced cell cycle arrest, partly by alleviating repression of the Rb tumor suppressor and decreasing cyclin D1 and CDK2 levels (Tanaka et al., 1997; Chansky et al., 2004; Prieur et al., 2004; Smith et al., 2006; Hu et al., 2008). Similarly, examination of global changes in Ewing's sarcoma cell lines indicated that EWS–FLI1 alters cell cycle and proliferation genes to down-regulate differentiation (Kauer et al., 2009).
To find genes regulated by EWS–FLI1 but not FLI1, cells overexpressing these proteins were analyzed by cDNA microarray. EWS–FLI1 binds the promoter region and up-regulates expression of the Id2 (inhibitor of DNA binding 2) gene (Nishimori et al., 2002). Id2 is a helix-loop-helix (HLH) protein that has a dominant negative repressive effect on basic HLH transcription factors and can affect cell differentiation and proliferation (Nishimori et al., 2002). Similarly, EWS–FLI1, but not FLI1, binds to the promoter and up-regulates the expression of DAX1, a nuclear receptor (Mendiola et al., 2006). Further gene expression analysis suggests that the DAX1 pathway regulates ~10% of EWS–FLI1 target genes, especially those involved in G1 to S cell cycle progression (Garcia-Aragoncillo et al., 2008). DAX1 is thus an important direct target of EWS–FLI1 and indicates a pathway that is important for proliferative growth of sarcoma cells. Taken together, these findings indicate that EWS–FLI1 represses key tumor suppressors and activates a range of genes to promote transformation in cells and tumorigenesis in vivo.
The TET proteins are involved in a wide range of cellular processes and are tightly regulated to ensure appropriate localization and activity. The domains in these proteins and early experiments suggested a role in RNAP II transcription and splicing, possibly coupling these processes, but subsequent studies have shown that TET proteins are also involved in a wide range of other processes, including repressing RNAP III transcription, DNA repair and RNA transport in neurons. The mechanism of TET protein activity, regulation of their cellular localization and the domains of the proteins involved in various processes remain to be clarified, especially in light of the recent discovery implicating TLS in familial ALS.
The amino terminus of TET proteins is also found fused to transcription factors, and these oncogenic fusion proteins in sarcomas are responsible for inappropriate transcriptional activation of target genes. Many therapies are in development, including knocking down the levels of fusion proteins, and further knowledge of the pathways that are disrupted by the fusion proteins will aid both the understanding of sarcoma development and the identification of other possible drug targets.
A.Y.T. was partially funded by a Postgraduate Scholarship from the Natural Sciences and Engineering Research Council of Canada and related work in the lab of J.L.M. was supported by grants from the National Institutes of Health.