The data support the hypothesis that the major triplex DNA-binding protein in human cells is more abundant, and has higher binding activity in vitro, in extracts from colorectal cancer tissues than in adjacent normal tissues. This increased binding activity correlated significantly with expression of the triplex/G-quadruplex DNA-unwinding helicase WRN, and with lymph node spread, metastasis, and reduced overall survival. The major triplex DNA-binding protein in gel shifts was identified as the U2AF65 splicing factor. U2AF65 expression was higher in more advanced colon tumor stages and correlated significantly with total and truncated beta-catenin expression.
U2AF is a non-small nuclear ribonucleoprotein (snRNP) splicing factor required for the binding of U2 snRNP to the pre-mRNA branch site [41, 42]. Purified U2AF comprises two polypeptides of 65 kDa (U2AF65) and 35 kDa (U2AF35). U2AF65 binds to the polypyrimidine (Py) tract adjacent to the 3’ splice site using RNA-recognition motifs and cross-links to the branch point in an ATP-independent manner at the earliest stage of spliceosome formation [43]. Both subunits of U2AF are essential for the viability of many model organisms, such as zebrafish, Drosophila, C. elegans, and S. pombe [44]. Both U2AF65 and U2AF35 shuttle continuously between the nucleus and cytoplasm by a mechanism that involves carrier receptors and is independent of binding to mRNA. It has also been suggested that U2AF participates in the nuclear export of mRNA [45].
U2AF65 binds to single-stranded RNA and recognizes a wide variety of pyrimidine (Py)-tracts. The Py-tracts of higher eukaryotic pre-mRNAs are often interrupted with purines, yet U2AF65 must identify these degenerate Py-tracts for accurate pre-mRNA splicing. Based on in vitro studies, investigators have proposed that U2AF35 assists U2AF65 recruitment to nonconsensus polypyrimidine tracts. Pacheco et al. analyzed the roles of the two U2AF subunits in vivo in the selection of alternative 3' splice sites associated with polypyrimidine tracts of different strengths. Their results revealed a feedback mechanism by which RNA interference-mediated depletion of U2AF65 triggers downregulation of U2AF35 expression. They also showed that knockdown of each U2AF subunit inhibits weak 3' splice site recognition, while over-expression of U2AF65 alone is sufficient to activate selection of this splice site [46, 47]. It would be interesting to examine whether over-expression of U2AF65 alone in the context of cancer activates splicing at weak or nonconsensus polypyrimidine tracts, which could tip the balance of splicing regulation in a subset of cellular transcripts and thereby promote tumorigenesis.
The proteins we identified in RKO nuclear extracts using biotin triplex DNA affinity were PSF, a 100-kDa protein that also binds to the polypyrimidine tract, and its heterodimeric binding partner p54nrb. We speculate that the 100- and 60-kDa proteins identified in previous studies using Southwestern blotting with HeLa nuclear extracts [16] probed with the same purine triplex DNA probe used in this study are indeed PSF and p54nrb, but this has yet to be tested. Both PSF and p54nrb bind to double-stranded (ds)DNA, single-stranded (ss)DNA, and RNA, and contain DNA- and RNA-binding domains. PSF participates in constitutive pre-mRNA splicing and is a component of later spliceosomal B and C complexes (when U2AF65 is no longer present). PSF and p54nrb also bind and function in nuclear retention of defective RNAs and are involved in transcriptional regulation and the DNA damage response [48-51]. Interestingly, PSF also functions in DNA annealing: its in vitro pairing activity requires ssDNA and dsDNA with sequence homology, as well as divalent cations. PSF can promote the incorporation of ssDNA within the two separated strands of a homologous superhelical DNA duplex and produce a three-stranded D-loop structure, which is required for homologous recombination. The splicing factors SF2/ASF and U2AF65 also caused DNA annealing but could not form D-loops [52]. PSF and p54nrb, as well as GRSF-1, YB-1, and polypyrimidine tract-binding protein (PTB), also bind to the MYC family of internal ribosome entry sites (IRES) and positively regulate translation of the Myc family of oncoproteins in vitro and in vivo [53]. Protein array data in this study showed that expression of both PSF and p54nrb in colorectal tissue extracts correlated significantly with c-Myc expression levels, which is consistent with a role for PSF and p54nrb in the regulation of c-Myc protein expression.
Researchers identified both U2AF and PSF, as well as hnRNP C and PTB, as RNA-binding proteins that bind to two regions 3’ of the (CUG)n repeat expansion in the 3’-UTR of the DMPK gene, where expansion of this trinucleotide repeat causes the neuromuscular disorder myotonic dystrophy [54]. Their study explored RNA-binding proteins interacting with non-CUG regions or higher order structures in the DMPK 3’-UTR that may be involved in RNA-mediated pathogenesis. Their finding that both U2AF and PSF can bind near this triplet repeat sequence with the potential to form higher order structures such as triplexes is consistent with our data on biotin triplex DNA affinity identification of both U2AF65 and PSF. Another group identified an RNA/protein complex in both Drosophila and 293 cells that consisted of expanded CAG RNA, U2AF65, and the NXF1 nuclear export receptor, providing further evidence that in other models, U2AF65 interacts with these triplet repeat sequences [55]. We believe that the purine triplex DNA EMSA probe can be a surrogate multiplex nucleic acid structure that acts as a “bait and hook” to capture proteins that may be binding D-loops, R-loops, triplexes, G-quadruplexes, or other multi-stranded structures containing Hoogsteen or reverse Hoogsteen base pairs in vivo.
PTB also binds to polypyrimidine tracts in pre-mRNAs, and numerous studies have shown that PTB competes with U2AF65 for binding to these sequences [56-61]. Since PSF is a PTB-associated protein, binding competition between PSF and U2AF65 may be possible as well, which may explain why we identified PSF with the biotinylated triplex DNA in RKO nuclear extracts and U2AF65 in RKO cytoplasmic extracts. Gama-Carvalho and colleagues performed immunoprecipitation of U2AF65- and PTB-associated RNAs from HeLa cells followed by microarray analysis to determine which mRNAs are associated with these two splicing factors that can compete for binding to polypyrimidine tracts [36]. Among U2AF65-associated mRNAs was a predominance of transcription factors and cell cycle regulators, whereas PTB-associated transcripts were enriched in mRNAs that encode proteins implicated in intracellular transport, vesicle trafficking, and apoptosis.
Related to cancer, researchers found that 2 of 14 patients with malignant mesothelioma, a pulmonary malignancy, had antibodies against U2AF65 using the SEREX technique (serologic identification by recombinant expression cloning) [62]. Additionally, a patient with liver cirrhosis that progressed to hepatocellular carcinoma had antinuclear antibodies that recognized a nuclear protein putatively identified as U2AF65 [63]. Other splicing factors, most notably SFRS1 (ASF/SF2), are reported to be over-expressed in colon, thyroid, kidney, lung and breast cancer cells [64]. Other splicing factors shown to be over-expressed in colorectal cancer cells are hnRNP-F and -K, SPF45, and SRPK1 [64]. However, the present report is the first to describe correlation of increased expression or binding activity of U2AF65 in primary colorectal tumors with tumor stage, lymph node disease, metastasis and reduced overall survival.
Why U2AF65 is over-expressed in colorectal tumor cells, and whether this over-expression is important to the development and/or progression of colorectal cancer or a passive effect of general gene deregulation are unknown. About 75% of sporadic colorectal cancers are characterized by a chromosomal instability (CIN) phenotype. The most common reported chromosomal losses involve 5q (APC), 18q (DCC), and 17p (p53), while the most common gains involve 8q and 20q. The gene encoding U2AF65 (U2AF2) is located at c19q13.42. Chromosomal amplifications at c19q13.42 have been found in a rare embryonal tumor using array CGH and FISH [65, 66]. Other groups have reported amplifications or aberrations at c19q13 in colorectal tumors, particularly in liver metastases compared to primary tumors [67], and in other solid tumors including pancreatic [68] and ovarian [69].
Regarding genomic instability, Vasquez and colleagues recently showed that both non-B DNA sequences and WRN helicase deficiency induce mutations characterized by single base changes, mostly at C-G base pairs, in an additive but not synergistic manner [70]. Because no synergy was observed, the authors concluded that a role for WRN in reducing mutation frequencies via a mechanism dependent on its cellular helicase activity (for example, on non-B DNA sequences) is unlikely. Their data do not directly support our present hypothesis, which resembles theirs: if one function of the WRN helicase were to resolve non-B (triplex and Z-DNA) structures, as observed in vitro, then mutation frequencies may be higher in WRN-deficient cells than in WRN-wild type cells because both the number and stability of such structures would be greater in WRN-deficient cells. However, they did verify that purified WRN protein was able to unwind the third purine-rich strand of a synthetic triplex in vitro. Although our data suggest a correlation between expression of the WRN helicase and triplex DNA-binding activity in both normal and tumor tissue extracts, defining a functional role and mechanism of non-B DNA unwinding activity by WRN helicase and G*G multiplex binding (for example, by U2AF65) will require further study.
Beta-catenin, as a transcription factor complexed with TCF4, is known to upregulate expression of many relevant proteins in colorectal cancer, such as c-myc, cyclin D1, LEF-1, CD44, and c-jun. Whether beta-catenin influences the expression of U2AF65 is unknown, but a search of transcription factor binding sites in the U2AF65 (U2AF2) gene promoter did not indicate any beta-catenin or TCF family transcription factor sites among the 55 high-scoring (>85%) sites we identified (Cold Spring Harbor Laboratory Mammalian Promoter Database http://rulai.cshl.edu/CSHLmpd2/; Transcription Factor Search http://www.cbrc.jp/research/db/TFSEARCH.html). Similarly, mining through microarray expression studies revealed no reports describing U2AF65 (U2AF2) as a beta-catenin, TCF4, or Wnt target gene (NCBI GEO; R Nusse Wnt/Beta catenin targets list: http://www.stanford.edu/~rnusse/pathways/targets.html). The biological significance of the correlation of U2AF65 and beta-catenin expression in colorectal tumor tissues, such as whether beta-catenin as a transcription factor affects U2AF65 expression, or whether U2AF65 as a splicing factor affects the splicing or expression of beta-catenin, remains to be determined.
Several studies have examined the interaction of beta-catenin with splicing factors and the role of beta-catenin in mRNA splicing. Researchers identified alternative splicing of SLC39A14, a divalent cation transporter, in colorectal tumors and found it to be regulated by the Wnt pathway, probably through regulation of splicing factor SRSF1 [71]. The beta-catenin/TCF4 pathway also modifies alternative splicing through modulation of expression of splicing factors SRp20 [72] and SF1 [73] and direct interaction with FUS/TLS (translocated in liposarcoma) and various other RNA-binding proteins, including p54nrb [74]. Others have shown that beta-catenin regulates multiple steps of RNA metabolism in colon cancer cells and may coordinate RNA metabolism [75].
Authors have also reported identification of truncated beta-catenin isoforms, mostly in colorectal cancer cells. In primary colorectal tumors, a relatively small percentage (7 of 58 examined) contained somatic interstitial deletions that included all or part of exon 3 of the beta-catenin gene, and RT-PCR analysis of 3 of the 7 tumors detected transcripts that lacked exon 3 as well as the normal transcript [39]. Researchers also detected two novel beta-catenin mRNA splice variants in the SW480 colon cancer cell line and in primary colorectal tumors [38]. A truncated beta-catenin protein of 80 kDa was also detected in three colorectal metastases to the liver [40]. Several of these isoforms have truncations in the NH2-terminus of the protein that delete key serine and threonine residues that are phosphorylated by GSK-3 beta. Because this phosphorylation is important for proteasomal degradation, loss of these residues was hypothesized to stabilize the protein and have a dominant oncogenic effect [76]. Data from this and other studies lead us to speculate that U2AF65 could be binding to a multi-stranded nucleic acid structure such as R-loops, D-loops, or G-quartet mRNA in vivo that is mimicked by the purine triplex DNA probe in our study, and that overexpression or increased EMSA binding activity of U2AF65 in tumor tissues could cause deregulation of mRNA splicing and protein isoform expression, such as that of beta-catenin, which could contribute to colorectal cancer initiation and/or progression.