HHS Public Access; Author Manuscript; Accepted for publication in peer reviewed journal
 
Genes Dev. Author manuscript; available in PMC 2007 October 28.
Published in final edited form as:
PMCID: PMC2043112
NIHMSID: NIHMS26349

U2AF homology motifs: protein recognition in the RRM world

Abstract

Recent structures of the heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) have revealed two unexpected examples of RNA recognition motif (RRM)-like domains with specialized features for protein recognition. These unusual RRMs, called U2AF homology motifs (UHMs), represent a novel class of protein recognition motifs. Defining a set of rules to distinguish traditional RRMs from UHMs is key to identifying novel UHM family members. Here we review the critical sequence features necessary to mediate protein–UHM interactions, and perform comprehensive database searches to identify new members of the UHM family. The resulting implications for the functional and evolutionary relationships among candidate UHM family members are discussed.

Keywords: U2AF, RNA recognition motif, protein-protein interaction, RNA-binding domain, PUMP, splicing factor

The processes of RNA splicing, transport, capping, editing, and polyadenylation are heavily dependent on protein factors that recognize the pre-mRNA and assemble the appropriate pre-mRNA processing complexes. Surprisingly, the many different protein factors that guide pre-mRNA modification pathways are composed of a limited number of conserved, modular RNA-binding domains (Burd and Dreyfuss 1994). Of these, the RNA recognition motif (RRM) domain is by far the most abundant type of eukaryotic RNA-binding motif. In addition to associations between protein and RNA, protein–protein interactions are essential to recruit catalytic components to sites of RNA modification and to coordinate pre-mRNA processing with other cellular pathways. Interestingly, traditional protein interaction domains, such as SH2, SH3, and WW motifs, are rarely observed in pre-mRNA processing factors (e.g., see Shatkin and Manley 2000; Zhou et al. 2002), implying that the ability to interact with other proteins may reside in the sequences previously thought to be involved in RNA binding. Consistent with this idea, recent structures of the heterodimeric splicing factor U2 snRNP auxiliary factor (U2AF) have revealed two unexpected examples of RRM-like domains with specialized features for protein recognition (Kielkopf et al. 2001; Selenko et al. 2003). In light of this structural information, we call these unusual RRMs U2AF homology motifs (UHMs) to reflect their distinct role in protein recognition. Here, the critical sequence features necessary to mediate protein–UHM interactions are reviewed and formulated in a manner that has permitted a comprehensive database search designed to identify members of the UHM family. The resulting implications for the functional and evolutionary relationships among candidate members of the UHM family are discussed. 
This review represents a first step toward distinguishing canonical RRMs from UHMs, and thereby contributes toward a major goal of the postgenomic era (Thornton et al. 2000): to convert genomic sequences into testable functional hypotheses.

Structural features of RNA recognition by canonical RRMs

The RNA-binding function of the canonical RRM domain has been extensively investigated over the last two decades. The most conserved RRM signature sequence is an eight-residue motif called ribonucleoprotein 1 (RNP1; Adam et al. 1986; Sachs et al. 1986), which has the consensus [RK]-G-[FY]-[GA]-[FY]-[ILV]-X-[FY] (where X is any amino acid). A second six-residue region of homology, called RNP2, is typically located ~30 residues N-terminal to RNP1 (Lahiri and Thomas 1986; Dreyfuss et al. 1988), and has the consensus [ILV]-[FY]-[ILV]-X-N-L. Additional conserved amino acids define an ~80-residue domain that encompasses the RNA-binding function (Query et al. 1989; Scherly et al. 1989; Birney et al. 1993).
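
The two consensus patterns quoted above translate directly into regular expressions, which makes the "RNP2 roughly 30 residues upstream of RNP1" arrangement easy to check on a sequence. The following is only a minimal sketch: the function name `find_rnp_pairs`, the 20 to 40 residue spacing window, and the toy sequence are assumptions made for illustration and are not part of the original analysis.

```python
import re

# Regex forms of the consensus patterns quoted in the text ("X" becomes ".").
RNP1 = re.compile(r"[RK]G[FY][GA][FY][ILV].[FY]")
RNP2 = re.compile(r"[ILV][FY][ILV].NL")


def find_rnp_pairs(sequence, min_gap=20, max_gap=40):
    """Return (rnp2_start, rnp1_start) pairs as 0-based indices.

    A pair is reported when RNP1 starts min_gap..max_gap residues after
    the start of RNP2. The window is an assumed stand-in for "~30 residues".
    """
    pairs = []
    for rnp2 in RNP2.finditer(sequence):
        for rnp1 in RNP1.finditer(sequence):
            gap = rnp1.start() - rnp2.start()
            if min_gap <= gap <= max_gap:
                pairs.append((rnp2.start(), rnp1.start()))
    return pairs


# Synthetic sequence invented for illustration; it is not a real protein.
toy_rrm = "M" + "IFVGNL" + "A" * 30 + "KGFGFVTF" + "AA"
print(find_rnp_pairs(toy_rrm))
```

A strict consensus match of this kind would be expected to miss divergent RRMs, which is part of the reason the fold is normally defined by a longer ~80-residue profile rather than by the two RNP motifs alone.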

The three-dimensional structure of the canonical RRM domain was first determined for the RRM of U1A (Nagai et al. 1990; Hoffman et al. 1991). The RRM fold is composed of two α-helices packed against four antiparallel β-strands with topology βαββαβ, which form an α/β sandwich (Fig. 1). The RNP consensus motifs form two central β-strands, with RNP1 in β3 and RNP2 in β1. Because of the alternating side-chain conformations of the pleated β-sheet, some of the consensus residues maintain the core fold, whereas others are displayed on the surface for nucleic acid recognition. Structures of single RRMs complexed with RNA have been determined for the U1A-RRM bound to a hairpin loop of U1 snRNA (Oubridge et al. 1994; Price et al. 1998; Deo et al. 1999; Handa et al. 1999; Allain et al. 2000; Wang and Tanaka Hall 2001), and for a ternary complex of the U2 snRNP proteins U2B″-RRM/U2A′ with a U2 snRNA hairpin loop (Price et al. 1998). In contrast to the isolated RRM of U1A, in most cases multiple RRMs are observed within a single polypeptide, with an average of two RRMs per protein (Letunic et al. 2004). The structures of several proteins composed of two tandem RRMs complexed with single-stranded RNA oligonucleotides have been determined, including the alternative splicing factor Sxl (Handa et al. 1999), PAB (Deo et al. 1999), pre-rRNA packaging protein nucleolin (Allain et al. 2000), and translation regulatory protein HuD (Wang and Tanaka Hall 2001). A comparison of these six different structures has revealed some common themes, as well as differences, in the mode of canonical RRM/RNA recognition. When all the structures are superimposed, structurally equivalent hydrogen bonds or stacking interactions are observed between single-stranded RNA and residues in the RNP1 and RNP2 motifs. A variety of sequences and RNA conformations are recognized by a variety of complementary hydrogen bonds with specific bases and differing arrangements of single or multiple RRMs.

Figure 1
Representative canonical RRM fold, from the structure of the U1A/RNA complex (PDB code 1URN). The N- and C-terminal ends of the RRM are indicated. The position and orientation of the RNA are represented with a ribbon diagram.

U2AF, the UHM prototype

During pre-mRNA splicing, U2AF and other essential factors facilitate sequential association of small nuclear RNP particles (snRNPs), including U1, U2, U4, U5, and U6 snRNPs, with the borders of intervening pre-mRNA sequences (for review, see Brow 2002). Following assembly of the functional spliceosome, the intron is excised as a branched lariat by two catalytic steps, and adjacent exons are joined together to form the spliced mRNA. U2AF was identified as a factor that binds to pre-mRNA consensus sequences at the 3′ splice site (3′SS), and is required for stable association of the U2 snRNP core spliceosome particle with the pre-mRNA branch point sequence (BPS) during the first ATP-dependent step of the splicing process (Complex A; Ruskin et al. 1988; Zamore and Green 1989). The importance of U2AF in vitro was soon corroborated by the discovery that both subunits are essential in Drosophila melanogaster (Kanaar et al. 1993; Rudner et al. 1996, 1998b) and Caenorhabditis elegans (Zorio and Blumenthal 1999b). Moreover, U2AF65 is an essential protein in Schizosaccharomyces pombe (Potashkin et al. 1993), and U2AF35 is necessary for vertebrate development (Golling et al. 2002). Because U2AF commits the pre-mRNA to the first critical ATP-dependent step of splicing, its binding is often regulated during alternative splicing (Smith and Valcarcel 2000). In humans, the products of five U2AF35-like open reading frames and the single U2AF65 subunit may form distinct heterodimers with different functional activities (Tupler et al. 2001; Shepard et al. 2002). In addition to U2AF, other non-snRNP protein factors are required for formation of Complex A, including Splicing Factor 1 (SF1) and Splicing Factor 3b (SF3b), a multisubunit component of the U2 snRNP (Kramer and Utans 1991).

To perform its role in RNA splicing, two central canonical RRM domains of U2AF65 recognize the polypyrimidine tract (Py-tract) in the pre-mRNA (Fig. 2). Binding of U2AF65 to the Py-tract is strengthened by cooperative protein–protein interactions with SF1 at the upstream BPS (Berglund et al. 1998; Rain et al. 1998) and with U2AF35, which contacts the downstream 3′SS consensus (Merendino et al. 1999; Wu et al. 1999; Zorio and Blumenthal 1999a). The C-terminal UHM domain of U2AF65 interacts with the N-terminal domain of SF1 (U2AF65-UHM/SF1-ligand; Rain et al. 1998). At the opposite end of the large subunit, the N-terminal domain of U2AF65 provides a ligand that interacts with the central UHM domain of U2AF35 (U2AF35-UHM/U2AF65-ligand; Zhang et al. 1992; Rudner et al. 1998b). Subsequently, entry of the U2 snRNP displaces SF1 by interacting with the BPS via the U2 snRNA (Nelson and Green 1989; Wu and Manley 1989; Zhuang and Weiner 1989; Query et al. 1994), and with the U2AF65 C-terminal domain via the SF3b subunit, SAP155 (Gozani et al. 1998; Habara et al. 1998). Once the U2 snRNP has contacted the pre-mRNA, U2AF is dissociated by conformational rearrangements of the spliceosome components (Bennett et al. 1992; Chiara et al. 1997). In summary, key protein–protein interactions are mediated by the U2AF65-UHM, which interacts with SF1 and subsequently SAP155, and by the U2AF35-UHM, which interacts with the U2AF65 N terminus.

Figure 2
Diagram of protein–protein interactions mediated by the U2AF heterodimer during the initial stages of pre-mRNA splicing. The U2AF heterodimer (mediated by the U2AF35-UHM/U2AF65-ligand interaction) binding to the polypyrimidine tract (Py-tract) ...

Structural features of protein–protein interactions by UHMs

Based on primary sequence analysis, both the U2AF65 C-terminal domain and the central domain of U2AF35 were suspected to contain unusual variations of the RRM fold (Birney et al. 1993). However, the borders of the U2AF-UHM domains could not be assigned accurately because of sequence insertions in the first helix of the fold (Helix A) and the absence of aromatic amino acids in the RNP-like motifs that are normally critical for RNA recognition. The independent determination of the X-ray structure of the U2AF35-UHM/U2AF65-ligand complex (Kielkopf et al. 2001) and NMR structure of the U2AF65-UHM/SF1-ligand complex (Selenko et al. 2003) confirmed that both the C-terminal U2AF65 and central U2AF35 protein interaction domains adopt the βαββαβ RRM-fold topology. Within the RRM-like fold, the sequence insertions separating the RNP-like motifs increase the length of Helix A from three turns observed among canonical RRMs to five or eight turns for U2AF65 and U2AF35, respectively; the functional role of these sequence insertions, if any, is unclear. The parallel use of an RRM-like fold to recognize similar peptide ligands implies that the U2AF35-UHM and U2AF65-UHM domains represent a new type of protein–protein interaction motif, hitherto undetected amid the many canonical RRMs of pre-mRNA processing factors.

Three-dimensional structural information revealed unanticipated sequence features of U2AF35-UHM and U2AF65-UHM domains that enable interaction with short protein ligands. Despite low primary sequence identity (23%), ligand recognition by the different UHM domains is very similar (Fig. 3). In both the U2AF35-UHM/U2AF65-ligand and U2AF65-UHM/SF1-ligand structures, a critical Trp residue in the ligand sequence inserts into a tight hydrophobic pocket between the α-helices and the RNP1- and RNP2-like motifs (Kielkopf et al. 2001; Selenko et al. 2003). In addition to aliphatic residues, a conserved Arg–X–Phe motif (where X is any amino acid; see below) on the loop connecting the last α-helix (Helix B) and β-strand of the UHM fold contributes to the Trp-binding pocket. The Arg residue in the loop (U2AF35-Arg 133 or U2AF65-Arg 452) forms an intramolecular salt bridge with the last Glu residue of Helix A (U2AF35-Glu 88 or U2AF65-Glu 405) that shields one face of the ligand-Trp, whereas the Phe residue (U2AF35-Phe 135 or U2AF65-Phe 454) encloses the opposite Trp face. In addition to the extensive interface with the ligand-Trp, a series of acidic residues in Helix A of the UHM interacts with basic residues at the N terminus of the protein ligand. Specifically, electrostatic interactions between U2AF35-Glu 84 and U2AF65-Lys 90 as well as U2AF65-Asp 401 and SF1-Arg 21 are observed at similar positions for both structures. The essential nature of acidic residues within Helix A, Phe 454, and the Trp-binding pocket was confirmed for the U2AF65-UHM/SF1-ligand complex by site-directed mutagenesis of the U2AF65-UHM or SF1-ligand followed by pull-down assays (Selenko et al. 2003). Likewise, the U2AF65-ligand-Trp 92 was found to contribute two orders of magnitude to the affinity of the U2AF35-UHM/U2AF65-ligand complex by isothermal titration calorimetry.

Figure 3
Structures of U2AF-UHM/ligand complexes. The UHM is shown in blue, the protein ligand is shown in yellow. Key interacting side chains are drawn in ball-and-stick representation. (A) The U2AF35-UHM bound to the U2AF65-ligand. (B) The U2AF65-UHM bound to ...

In the U2AF35-UHM, a distinctive Trp residue (U2AF35-Trp 134) is observed at the X position of the Arg–X–Phe motif on the last loop of the UHM domain. Bulky aromatic residues such as Trp at this solvent-exposed position are especially rare among canonical RRM domains (1% of 676 annotated RRM domains in the SWISS-PROT database). The most frequently observed residues at the corresponding position of canonical RRMs are highly charged, including Glu (16%) and Lys (15%), as is also observed for the U2AF65-UHM (Arg–Lys–Phe). The unusual U2AF35-Trp 134 inserts between a series of unique Pro residues at the C terminus of the U2AF65-ligand, which are completely absent from the SF1-ligand of the U2AF65-UHM. The additional Trp/Pro interaction significantly contributes to the high affinity of the U2AF heterodimer (1.7 nM Kd; Kielkopf et al. 2001). Because the U2AF65-UHM/SF1-ligand complex lacks the corresponding Trp/Pro interaction, the affinity is relatively weak (~100 nM Kd; Selenko et al. 2003). The sequence differences in the ligands recognized by the U2AF35-UHM and U2AF65-UHM domains reflect the different functional roles of the complexes, which, respectively, maintain the constitutive U2AF35-UHM/U2AF65 heterodimer (Zhang et al. 1992; Rudner et al. 1998b) or form a transient U2AF65/SF1 intermediate during spliceosome assembly (Rutz and Seraphin 1999).
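
To give a feel for what the difference between the two quoted dissociation constants means energetically, the standard relation ΔG° = RT ln(Kd) can be applied to both values. This is a rough sketch only: the temperature of 298.15 K and the 1 M standard state are assumptions, the ~100 nM value is itself approximate, and the function name is hypothetical.

```python
import math

R_KCAL_PER_MOL_K = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_molar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(Kd) with a 1 M standard state; the default temperature
    is an assumption, not a value reported in the review.
    """
    return R_KCAL_PER_MOL_K * temperature_k * math.log(kd_molar)


dg_u2af_heterodimer = binding_free_energy(1.7e-9)   # U2AF35-UHM/U2AF65-ligand
dg_u2af65_sf1 = binding_free_energy(100e-9)         # U2AF65-UHM/SF1-ligand

# Positive value: the heterodimer is the more stable complex.
ddg = dg_u2af65_sf1 - dg_u2af_heterodimer
print(f"ddG = {ddg:.2f} kcal/mol")
```

Under these assumptions the roughly 60-fold difference in Kd corresponds to a little under 2.5 kcal/mol, a magnitude that a small number of additional side-chain contacts could plausibly account for.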

The structures of the U2AF35-UHM/U2AF65-ligand and U2AF65-UHM/SF1-ligand complexes revealed several sequence features that distinguish UHMs from canonical RRM domains. One striking feature of UHM domains is their atypical RNP-like motifs. The first residue of the RNP1-like motif and the second residue of the RNP2-like motif are unusual in that they are exposed on the β-sheet surface rather than directly involved in RNA binding. Residues in these positions consist of aliphatic amino acids (U2AF35-Ala 47, Val 110, or U2AF65-Cys 379, Cys 429) as opposed to the basic and aromatic residues used for RNA recognition by canonical RRM domains. Other prominent distinguishing sequences include the Arg–X–Phe motif and acidic residues in Helix A (especially U2AF35-Glu 84/Glu 88 and U2AF65-Asp 401/Glu 405). As a consequence of the acidic nature of Helix A and lack of a basic RNP1 residue that usually contacts the RNA, the isoelectric points of UHM domains are remarkably low (pI 4.1 for the U2AF35-UHM and pI 4.3 for the U2AF65-UHM) compared with the typically basic character of canonical RRMs (pI >9) that function to bind anionic RNA ligands. The majority of the aliphatic residues lining the Trp-binding pocket (including U2AF35-UHM Leu 48, Val 85, Leu 130, and Ile 140 and their U2AF65-UHM counterparts Leu 380, Val 402, Leu 449, and Val 459), however, cannot be used to distinguish UHM from canonical RRM domains, because they also serve to preserve the RRM fold (Birney et al. 1993). One exception is the last aliphatic residue of the RNP2 motif (U2AF35-Ile 51 or U2AF65-Met 383), which contributes to the Trp-binding pocket and consequently differs from the conserved RNP2-Leu residue within the hydrophobic core of canonical RRM domains. Thus, at least three major sequence differences required for UHM–protein interactions distinguish UHMs from canonical RRM domains: (1) atypical RNP-like motifs, (2) an Arg–X–Phe motif in the last loop, and (3) an acidic character of Helix A.
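
The contrast between acidic UHMs and basic canonical RRMs can be illustrated with a simple isoelectric point estimate based on the Henderson–Hasselbalch equation. The sketch below is not the method used to obtain the pI values quoted above: the pKa table is an assumed textbook-style set, the peptides are invented, and dedicated tools use different pKa tables and will return somewhat different numbers.

```python
# Assumed pKa values for illustration; not taken from the review.
PKA_POSITIVE = {"K": 10.5, "R": 12.5, "H": 6.0}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
PKA_N_TERM = 9.0
PKA_C_TERM = 2.0


def net_charge(sequence, ph):
    """Approximate net charge of a peptide at the given pH."""
    charge = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))   # protonated N terminus
    charge -= 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))  # deprotonated C terminus
    for residue in sequence:
        if residue in PKA_POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return charge


def isoelectric_point(sequence, tolerance=1e-4):
    """Find the pH of zero net charge by bisection between pH 0 and 14."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged: pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2.0


# Invented peptides: one acid-rich (UHM-like), one base-rich (RRM-like).
print(round(isoelectric_point("GDEEDELAG"), 1))
print(round(isoelectric_point("GKRKRKLAG"), 1))
```

Bisection works here because the net charge decreases monotonically as the pH rises, so there is a single zero crossing between the two bounds.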

Identifying novel UHM family members

The discovery of two examples of RRM-like domains with specialized sequence characteristics for protein–protein interactions raised the question of whether the U2AF35-UHM and U2AF65-UHM domains represented a larger family of modular protein interaction domains. Examples of proteins with domains similar to the U2AF65 or U2AF35-UHM had been previously noted (Kielkopf et al. 2001; Selenko et al. 2003), including the C-terminal homodimerization domain of PUF60, which has previously been referred to as the PUF60, U2AF65, MUD2 protein–protein interaction (PUMP) domain (Page-McCaw et al. 1999). Several search strategies were used to further extend the UHM family. An initial consensus pattern for the U2AF65, U2AF35, PUF60, and Tat-SF1 UHM domains, defined automatically using the program PRATT (Brazma et al. 1996), proved too stringent as it only matched homologs of these proteins in a ScanProsite search of the SWISS-PROT/TrEMBL databases (Gattiker et al. 2002). Therefore, a target pattern ([ILMVFC]-X-[LIFV]-X-[NSHT]-[ILMVC]-X(6,40)-[VLIT]-X(2)-[ED]-X(4,5)-G-X-[IVA]-X(4)-[VIL]-X(4,25)-[GV]-X-[VIAL]-[FY]-[VIL]-X-[FYC]-X(6,12)-[AC]-[LVMIC]-X-X-[LMIF]-X-[NG]-R-[WYKM]-[FY]-X-G-X(4,8)-[IVL]) was defined manually based upon conserved residues that either maintain the RRM-like fold (Birney et al. 1993) or mediate protein–protein interactions in the structures of U2AF35-UHM/U2AF65-ligand and U2AF65-UHM/SF1-ligand (Kielkopf et al. 2001; Selenko et al. 2003). A search of the SWISS-PROT/TrEMBL databases with this target pattern identified several novel UHM candidates. The UHM family was further extended by manually inspecting RRM family alignments (Prosite PS50102) and the results of iterative PHI-PSI BLAST searches (Altschul et al. 1997) for similarities to the signature Arg–X–Phe motif observed in the last loop of the prototype U2AF-UHM domains.
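
The manually defined target pattern is written in PROSITE-style syntax, which maps almost one-to-one onto a regular expression. The following minimal sketch shows that mapping for the subset of the syntax the pattern uses; the helper name `prosite_to_regex` is hypothetical, the first element is read as [ILMVFC], and the converter does not handle other PROSITE constructs such as exclusions or terminus anchors. It illustrates the pattern only and does not reproduce the database search itself.

```python
import re

UHM_PATTERN = (
    "[ILMVFC]-X-[LIFV]-X-[NSHT]-[ILMVC]-X(6,40)-[VLIT]-X(2)-[ED]-X(4,5)-"
    "G-X-[IVA]-X(4)-[VIL]-X(4,25)-[GV]-X-[VIAL]-[FY]-[VIL]-X-[FYC]-"
    "X(6,12)-[AC]-[LVMIC]-X-X-[LMIF]-X-[NG]-R-[WYKM]-[FY]-X-G-X(4,8)-[IVL]"
)

# One pattern element: a residue or residue class, optionally followed
# by a repeat count such as (4) or (6,40).
ELEMENT = re.compile(r"(\[[A-Z]+\]|[A-Z])(?:\((\d+)(?:,(\d+))?\))?")


def prosite_to_regex(pattern):
    """Convert a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.split("-"):
        match = ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        residue, low, high = match.groups()
        if residue == "X":
            residue = "."          # X means any amino acid
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(residue + repeat)
    return "".join(parts)


UHM_REGEX = re.compile(prosite_to_regex(UHM_PATTERN))
print(UHM_REGEX.pattern)
```

In the resulting expression, the `R[WYKM][FY]` segment near the C-terminal end appears to correspond to the Arg–X–Phe loop, with the allowed middle residues covering the Trp, Tyr, Lys, and Met variants seen among the UHM sequences.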

Sequence comparisons revealed that the principal features that distinguish UHM candidates from canonical RRMs are conserved among 12 novel UHM candidates (Table 1; Fig. 4A), including (1) poor conservation of amino acids in the RNP1- and RNP2-like consensus motifs that would normally bind RNA (first/third and second positions, respectively); (2) an Arg–X–Phe motif in the last loop of the RRM-like fold; and (3) conserved acidic residues in the predicted Helix A and a low isoelectric point (average pI ~4.5). Seven additional UHM candidates displayed a subset of the UHM characteristics. To further investigate the evolutionary relationship among members of the UHM and RRM families, a phylogenetic tree of the candidates was constructed using neighbor joining with correction for multiple substitutions (Fig. 4B; Thompson et al. 1997). A comparison with canonical RRMs whose role in RNA recognition has been established by structure determination (including U1A, SXL, PAB, HuD, and nucleolin) revealed that the 12 convincing UHM candidates occupy a phylogenetic branch distinct from canonical RRMs that diverged from a common ancestral domain. The dendrogram also confirms that several of the putative UHM candidates (i.e., those that displayed only a subset of the UHM characteristics) are more closely related to canonical RRM domains than to U2AF or other UHM candidates, indicating that these proteins may have independently evolved UHM-like sequence motifs. Additional proteins from diverse eukaryotes displayed UHM signature sequences, but were considered homologs of other UHM candidates based on high sequence identity (Table 2). Given the difficulty of distinguishing UHMs from canonical RRM domains based on primary sequence comparisons alone, additional UHM protein interaction domains may be hidden within the RRM superfamily.

Figure 4
Identification of candidate UHM-containing proteins. (A) Structure-based alignment of candidate UHM sequences, U2AF35-UHM, U2AF65-UHM, and representative canonical RRMs whose structures in complex with RNA are known. Structures were aligned using the ...
Table 1
Signature sequences of UHM candidates compared with representative canonical RRMs
Table 2
Representative homologs of UHM candidates

In a few cases, UHM candidates that share a similar domain organization may represent homologs despite low sequence identities and/or a lack of consistent functional data, including SPF45 and DRT111; HCC1 and PAD1; TAT-SF1, UAP2, and CUS2; and MUD2 and U2AF65. In a well-studied example, MUD2 is the S. cerevisiae homolog of U2AF65 based on similar functional interactions with the Py-tract, U2 snRNP, and SF1 (Abovich et al. 1994; Rain et al. 1998). Consistent with their low sequence identity (16%), heterologous complexes between MUD2 and human SF1, or between S. cerevisiae SF1 and human U2AF65, have not been observed (Rain et al. 1998), indicating that the ligand specificity of the human U2AF65-UHM has diverged from the MUD2-UHM. These differences in protein–protein interaction specificity are consistent with functional divergence of MUD2 from other U2AF large subunits. For example, MUD2 is dispensable for viability in S. cerevisiae (Abovich et al. 1994), whereas the UHM domain of the S. pombe U2AF65 homolog (which shares 31% sequence identity with human U2AF65) is required in vivo (Banerjee et al. 2004). The U2AF65-UHM interacts with an N-terminal domain of the SAP155 subunit of the U2 snRNP that is absent from the S. cerevisiae homolog of SAP155 (Gozani et al. 1998). Moreover, S. cerevisiae lacks an ortholog of the U2AF small subunit, indicating that MUD2 functions in the absence of the heterodimeric partner. These differences between S. cerevisiae and other U2AF homologs, coupled with the identification of eight human UHM candidates and 35 more homologs in a variety of higher eukaryotes compared with only three convincing yeast UHM candidates, suggest that the UHM diverged from the canonical RRM late in the evolutionary timeframe to serve the complicated pre-mRNA processing requirements of multicellular organisms.

The 12 candidate UHM domains are found in the context of a variety of domain arrangements within their protein sequences; a subset is detailed in Table 3. With the exception of the central URP-UHM, the UHM domains often occur near the C terminus of the candidate proteins, providing an exposed position to facilitate molecular recognition. Many of the UHM candidates also contain motifs frequently observed in splicing factors, such as canonical RRMs, arginine–serine (RS) domains, zinc fingers, and Gly-rich regions. Additional unexpected domains are also observed, including the LAP2-Emerin-Man1 (LEM) protein–protein interaction domain of MAN1 and the kinase domain of KIS.

Table 3
Function of human UHM candidates and their potential disease relevance

The diverse functional domains of the UHM candidates are accompanied by an array of different biological functions (Table 3). Like U2AF65 and U2AF35, many of the UHM candidates play important roles during RNA splicing. Interestingly, a few of the candidates (PUF60, TAT-SF1, and HCC1) play a dual role in regulating transcription. Because RNA splicing factors can influence the efficiency of transcription, and conversely, transcription often influences the efficiency and products of alternative RNA splicing (Auboeuf et al. 2002; Rosonina and Blencowe 2002), these UHM candidates may couple RNA splicing with transcription to coordinate gene expression. Other candidate UHM proteins, namely, the membrane protein MAN1 and the regulatory kinase KIS, are important for signal transduction, but have not yet been shown to affect RNA processing. It remains to be determined whether KIS and MAN1 couple pre-mRNA splicing with other cellular pathways via their established roles in signal transduction. Demonstrating the significance of their functions, misregulated UHM-containing proteins are associated with several human diseases, including human immunodeficiency virus type 1 (HIV-1; Zhou and Sharp 1996), and certain cancers (Liu et al. 2001; Bieche et al. 2003; Sampath et al. 2003).

Specificity of protein–protein interaction by UHM domains

As stated above, the structures of the U2AF35-UHM and U2AF65-UHM complexed with their ligands revealed remarkably similar modes of protein interaction. Notably, both the U2AF65 and SF1 ligands bind their respective UHM domains via a similar arrangement of basic and Trp residues. A search for the ligand consensus pattern [RK]-X-[RK]-W, shared by both the SF1 and U2AF65 ligands, found >1000 matches within the SWISS-PROT database, indicating that predicting protein ligands of candidate UHMs is impractical without further experimental information to identify their functional binding partners. Given that all 12 compelling UHM candidates possess the signature sequences predicted to recognize ligands containing the [RK]-X-[RK]-W motif, it remains an open question whether each UHM domain specifically recognizes a single target, or could promiscuously interact with the intended ligand of a different UHM in the absence of temporal or spatial regulation. Toward answering this question, the U2AF65-UHM has been found not to interact with the N terminus of the U2AF65-ligand in two-hybrid assays (Tronchere et al. 1997) and pull-down experiments (C. Kielkopf, unpubl.), suggesting that individual UHMs may, indeed, specifically recognize a unique target. By analogy with other modular peptide binding domains (Pawson and Nash 2003), UHM sequences flanking the Trp-binding pocket may ensure specific and directional interactions (N-to-C orientation) with the ligand.
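
A back-of-the-envelope estimate shows why a four-residue motif such as [RK]-X-[RK]-W is expected to match very often by chance. The sketch below assumes that all 20 amino acids are equally frequent and that positions are independent; both are simplifications (Trp in particular is among the least frequent amino acids in real proteins), so the numbers indicate scale only. The function names and the 400-residue example length are hypothetical.

```python
from fractions import Fraction


def chance_match_probability(allowed_counts, alphabet_size=20):
    """Probability that a random window matches a motif, given the number
    of allowed residues at each position (uniform frequencies assumed)."""
    probability = Fraction(1)
    for allowed in allowed_counts:
        probability *= Fraction(allowed, alphabet_size)
    return probability


def expected_matches(protein_length, probability, motif_length):
    """Expected number of chance matches in one protein of a given length."""
    windows = max(protein_length - motif_length + 1, 0)
    return windows * probability


# [RK]-X-[RK]-W: two allowed, any of 20, two allowed, one allowed.
LIGAND_MOTIF = [2, 20, 2, 1]
p_window = chance_match_probability(LIGAND_MOTIF)
per_protein = expected_matches(400, p_window, len(LIGAND_MOTIF))

print(p_window)            # chance of a match at any one position
print(float(per_protein))  # expected chance matches in a 400-residue protein
```

Under these assumptions, roughly one in five proteins of that length would carry a match by chance, which is in line with the large number of database hits and with the conclusion that the ligand motif alone cannot pinpoint genuine UHM binding partners.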

In support of this analogy, structure-based modeling suggests that variation of the central X residue in the UHM Arg–X–Phe loop may provide one mechanism for UHM recognition of diverse C-terminal ligand sequences. Several of the UHM candidates (hURP, PUF60, Tat-SF1, and HCC1) share a Trp residue within the Arg–X–Phe loop that is essential in the U2AF35-UHM for specific recognition of C-terminal U2AF65-ligand Pro residues (Kielkopf et al. 2001). Other UHM sequences vary from similar Arg–Tyr–Phe motifs (SPF45, DRT111, UAP2, CUS2, and PAD1) to divergent Lys (U2AF65) and Met (KIS) residues. Besides recognizing the ligand C terminus via the Arg–X–Phe loop, distinct U2AF65-UHM or U2AF35-UHM residues make specific contacts with N-terminal ligand residues. In particular, the bulky U2AF65-ligand Tyr 91 stacks against unique U2AF35 aromatic residues (Tyr 52 and Phe 81), and forms a specific hydrogen bond with His 77 of the U2AF35-UHM (Kielkopf et al. 2001). Similar or identical residues in the URP-UHM (Phe 206, Phe 239, and Gln 235) suggest that a bulky, hydrophobic residue preceding the consensus ligand-Trp would be recognized in an analogous manner, consistent with an interaction between hURP and U2AF65 in pull-down and yeast two-hybrid assays (Tronchere et al. 1997). The smaller size of the corresponding U2AF65-UHM residues (Ile 398 and Val 384) would leave the hydrophobic side chain of a ligand-Tyr in an unfavorable, solvent-exposed environment (Selenko et al. 2003). Considering the variety of cellular roles played by UHM candidates and the consequent requirement to recognize diverse protein ligands, it will be important to determine whether variation in the positions corresponding to U2AF35 Tyr 52, Phe 81, and His 77, and the central position of the Arg–X–Phe loop enables recognition of distinct ligand sequences by UHM domains.

In addition to recognizing short peptide ligands, UHM domains can self-associate to form protein homodimers. For example, the PUF60-UHM domain interacts with itself in two-hybrid assays (Poleev et al. 2000) and forms SDS-resistant homodimers during electrophoresis (Page-McCaw et al. 1999). The U2AF35-UHM has been shown to form weak homodimers by gel filtration, analytical ultracentrifugation, dynamic light scattering (Kielkopf et al. 2001), and two-hybrid assays (Wentz-Hunter and Potashkin 1996), whereas homodimers of the U2AF65-UHM have not been observed (Tronchere et al. 1997). Homo- or heterotypic oligomerizations also have been observed for classical protein–protein interaction domains, with several different effects on ligand recognition. For example, the nNOS-PDZ/syntrophin heterodimer prohibits peptide recognition (Hillier et al. 1999), whereas GRIP or Shank PDZ homodimers leave the peptide-binding pockets free (Im et al. 2003a,b) and the Eps8-SH3 homodimer alters the ligand specificity (Kishan et al. 1997). Although a U2AF35-UHM homodimer can be modeled with the solvent-exposed Arg–Trp–Phe loop binding to the Trp-binding site on a second UHM domain, alternative interfaces are possible that would allow the oligomer to simultaneously recognize peptide ligands, as observed for established protein–protein interaction domains.

Do UHM domains recognize RNA?

Modeling of the U2AF-UHM/ligand structures with RNA has revealed that peptide binding to the helical surface of the RRM-like fold is not predicted to physically interfere with putative RNA interactions on the opposite β-sheet face (Kielkopf et al. 2001; Selenko et al. 2003). However, the U2AF35-UHM/U2AF65-ligand complex binds RNA only weakly (Kd >6 μM), and accessory protein factors and adjacent domains in the full-length U2AF35 sequence (e.g., flanking zinc fingers and an RS domain) are required to assist this weak interaction (Rudner et al. 1998a; J. Valcarcel, pers. comm.). Likewise, the U2AF65-UHM domain is not required for Py-tract recognition (Banerjee et al. 2003), and does not appear to interact with RNA (Selenko et al. 2003). These results indicate that UHM domains are not likely to be involved in RNA interactions.

Instead, the UHM family has evolved sequence characteristics that have no benefit for RNA binding, while optimizing the interaction with peptide ligands. In most canonical RRM-RNA structures, conserved aromatic Phe/Tyr residues at the third RNP1 position or second RNP2 position stack with RNA bases or sugars, and a basic Arg/Lys residue at the first position of the RNP1 motif frequently forms a salt bridge with the phosphate backbone (Fig. 5A; Oubridge et al. 1994; Price et al. 1998; Deo et al. 1999; Handa et al. 1999; Allain et al. 2000; Wang and Tanaka Hall 2001). In contrast, the corresponding U2AF35-UHM (Fig. 5B) and U2AF65-UHM (Fig. 5C) residues are replaced with aliphatic substitutions that are not predicted to interact favorably with RNA. Moreover, UHMs display unexpectedly low isoelectric points for optimal binding of basic peptides.

Figure 5
RNA recognition by an RRM domain compared with the U2AF-UHM domains. The UHM or RRM domains are shown in blue, protein ligands are yellow, and RNA ligands are purple. (A) The U1A-RRM recognizing an RNA oligonucleotide. The RNP1-Arg recognizes the RNA ...

In addition to poor conservation of RNP-like motifs and overall negative charge, RNA binding by the U2AF65-UHM structure is further inhibited by a C-terminal α-helix that forms a tight hydrophobic interface with the putative RNA-binding surface of the RRM-like fold (Selenko et al. 2003). In contrast, the C-terminal extensions of canonical RRMs more often strengthen rather than inhibit RNA binding. For example, the C-terminal helical extension of the N-terminal U1A-RRM not only contributes to RNA binding (Oubridge et al. 1994; Zeng and Hall 1997) but also mediates dimer formation for recognition of tandem RNA elements (Klein Gunnewiek et al. 2000; Varani et al. 2000). Based on the U2AF65-UHM structure, Phe 433 in the RNP1-like motif and Tyr 463 within the preceding turn interact with Tyr 469, Phe 474, and Trp 475 in the C-terminal α-helix (Selenko et al. 2003). Although counterparts of Phe 474 and Trp 475 are absent among the UHM candidates, aromatic residues at positions corresponding to Phe 433, Tyr 463, and Tyr 469 are observed for the UHM domains of KIS, PUF60, SPF45, and HCC1. This raises the possibility that some of the UHM candidates may have a hydrophobic C-terminal extension that may either interfere with RNA binding as for the U2AF65-UHM, or contribute to homodimer formation in a manner similar to U1A.

Conclusions

The canonical RRM domain was a relatively late evolutionary addition to the array of RNA-binding folds that emerged in response to the needs of complex pre-mRNA processing pathways (Anantharaman et al. 2002). As processes that were originally based on the RNA world became progressively more regulated and reliant on protein interactions, the RRM fold further developed specialized sequence characteristics for protein recognition to form the UHM subfamily. These UHM signature sequences include divergent residues in the RNP-like motifs, an Arg–X–Phe loop sequence, and key acidic residues that collectively recognize the Trp residue and positive charge of the protein ligand. Convincing UHM candidates have been discovered in association with a variety of fundamental cellular processes, ranging from pre-mRNA splicing to transcription, DNA repair, and signal transduction. The large number of proteins that share the signature protein–protein interaction residues of UHM domains supports the proposal that the U2AF-UHMs represent a novel family of modular protein interaction domains.
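As a rough illustration of how two of these signature features could be screened in a sequence, the Python sketch below locates every Arg–X–Phe tripeptide and computes the fraction of acidic residues. It is a deliberately simplified stand-in, not the pattern used for the database searches in this review: the bare `R.F` expression ignores the flanking context of the loop, and the function names and the test sequence are hypothetical.

```python
import re

# Arg, any residue, Phe. The lookahead lets overlapping matches be reported.
RXF_PATTERN = re.compile(r"(?=(R.F))")


def find_rxf(seq: str) -> list:
    """Return (1-based position, tripeptide) for every Arg-X-Phe in seq."""
    return [(m.start() + 1, m.group(1)) for m in RXF_PATTERN.finditer(seq)]


def acidic_fraction(seq: str) -> float:
    """Fraction of residues that are Asp or Glu (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(seq.count(aa) for aa in "DE") / len(seq)
```

For the made-up sequence `"AARWFDDERGFEE"`, `find_rxf` reports Arg–X–Phe tripeptides at positions 3 and 9. Because a three-residue pattern matches many unrelated proteins by chance, such a hit is meaningful only when it falls within an RRM-like fold and coincides with the other features summarized above.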

Because protein interaction domains are attractive modules for communication among a network of pathways, the UHM domain may be an evolutionary extension of RRMs that couples pre-mRNA processing with other nuclear processes. Protein recognition by so-called RNA-binding domains is an emerging theme in molecular recognition. An early example of RRM–protein interactions was observed in the structure of U2B″/U2A′, in which the α-helical surface of the U2B″-RRM interacts with the U2A′ leucine-rich repeat motif (Price et al. 1998). Several recent structures of the β-sheet surfaces of heterodimeric RRM domains interacting with α-helical protein ligands have revealed a second mode of RRM–protein recognition distinct from that of UHM/protein complexes (Fribourg et al. 2003; Lau et al. 2003; Shi and Xu 2003; Kadlec et al. 2004). In addition to distinguishing protein recognition domains within the RRM family, a growing list of fold families such as the Sterile α-Motif (SAM; Kim and Bowie 2003), LEM (Cai et al. 2001; Laguri et al. 2001), Pumilio/HEAT-repeat domains (Wang et al. 2002), and zinc fingers (Morgan et al. 1997) have been found to bind either nucleic acids or protein ligands through slight variations of a common scaffold. Furthermore, RS domains have been shown to contact the pre-mRNA during splicing (Valcarcel et al. 1996; Shen et al. 2004), and have also been reported to mediate protein–protein interactions (Wu and Maniatis 1993). Because a major goal of the “postgenomic” era is the ability to predict protein functions even in the absence of corroborating experimental results (Thornton et al. 2000), it will be essential to compile a lexicon of signature sequences, such as those that distinguish UHMs from canonical RRMs, for other fold families whose members play diverse functional roles.

Acknowledgments

We thank S. Evans for editorial assistance, and J. Bender, M. Matunis, M. Swenson, and J. Wedekind for careful reading of the manuscript. Funding for C.L.K. is provided by the Johns Hopkins University Center for AIDS Research grant #P30 AI42855.

References

  • Abovich N, Liao XC, Rosbash M. The yeast MUD2 protein: An interaction with PRP11 defines a bridge between commitment complexes and U2 snRNP addition. Genes & Dev. 1994;8:843–854. [PubMed]
  • Adam SA, Nakagawa T, Swanson MS, Woodruff TK, Dreyfuss G. mRNA polyadenylate-binding protein: Gene isolation and sequencing and identification of a ribonucleoprotein consensus sequence. Mol Cell Biol. 1986;6:2932–2943. [PMC free article] [PubMed]
  • Allain FH, Bouvet P, Dieckmann T, Feigon J. Molecular basis of sequence-specific recognition of pre-ribosomal RNA by nucleolin. EMBO J. 2000;19:6870–6881. [PubMed]
  • Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ. Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–3402. [PMC free article] [PubMed]
  • Anantharaman V, Koonin EV, Aravind L. Comparative genomics and evolution of proteins involved in RNA metabolism. Nucleic Acids Res. 2002;30:1427–1464. [PMC free article] [PubMed]
  • Auboeuf D, Honig A, Berget SM, O’Malley BW. Coordinate regulation of transcription and splicing by steroid receptor coregulators. Science. 2002;298:416–419. [PubMed]
  • Auboeuf D, Dowhan DH, Kang YK, Larkin K, Lee JW, Berget SM, O’Malley BW. Differential recruitment of nuclear receptor coactivators may determine alternative RNA splice site choice in target genes. Proc Natl Acad Sci. 2004;101:2270–2274. [PubMed]
  • Banerjee H, Rahn A, Davis W, Singh R. Sex lethal and U2 small nuclear ribonucleoprotein auxiliary factor (U2AF65) recognize polypyrimidine tracts using multiple modes of binding. RNA. 2003;9:88–99. [PubMed]
  • Banerjee H, Rahn A, Gawande B, Guth S, Valcarcel J, Singh R. The conserved RNA recognition motif 3 of U2 snRNA auxiliary factor (U2AF65) is essential in vivo but dispensable for activity in vitro. RNA. 2004;10:240–253. [PubMed]
  • Bennett M, Michaud S, Kingston J, Reed R. Protein components specifically associated with prespliceosome and spliceosome complexes. Genes & Dev. 1992;6:1986–2000. [PubMed]
  • Berglund JA, Abovich N, Rosbash M. A cooperative interaction between U2AF65 and mBBP/SF1 facilitates branchpoint region recognition. Genes & Dev. 1998;12:858–867. [PubMed]
  • Bieche I, Manceau V, Curmi PA, Laurendeau I, Lachkar S, Leroy K, Vidaud D, Sobel A, Maucuer A. Quantitative RT–PCR reveals a ubiquitous but preferentially neural expression of the KIS gene in rat and human. Brain Res Mol Brain Res. 2003;114:55–64. [PubMed]
  • Birney E, Kumar S, Krainer AR. Analysis of the RNA-recognition motif and RS and RGG domains: Conservation in metazoan pre-mRNA splicing factors. Nucleic Acids Res. 1993;21:5803–5816. [PMC free article] [PubMed]
  • Boehm M, Yoshimoto T, Crook MF, Nallamshetty S, True A, Nabel GJ, Nabel EG. A growth factor-dependent nuclear kinase phosphorylates p27Kip1 and regulates cell cycle progression. EMBO J. 2002;21:3390–3401. [PubMed]
  • Brazma A, Jonassen I, Ukkonen E, Vilo J. Discovering patterns and subfamilies in biosequences. Proc Int Conf Intell Syst Mol Biol. 1996;4:34–43. [PubMed]
  • Brow DA. Allosteric cascade of spliceosome activation. Annu Rev Genet. 2002;36:333–360. [PubMed]
  • Burd CG, Dreyfuss G. Conserved structures and diversity of functions of RNA-binding proteins. Science. 1994;265:615–621. [PubMed]
  • Cai M, Huang Y, Ghirlando R, Wilson KL, Craigie R, Clore GM. Solution structure of the constant region of nuclear envelope protein LAP2 reveals two LEM-domain structures: One binds BAF and the other binds DNA. EMBO J. 2001;20:4399–4407. [PubMed]
  • Chiara MD, Palandjian L, Feld Kramer R, Reed R. Evidence that U5 snRNP recognizes the 3′ splice site for catalytic step II in mammals. EMBO J. 1997;16:4746–4759. [PubMed]
  • Collaborative Computational Project, number 4. The CCP4 suite: Programs for protein crystallography. Acta Crystallogr. 1994;D50:760–763. [PubMed]
  • Dean W, Bowden L, Aitchison A, Klose J, Moore T, Meneses JJ, Reik W, Feil R. Altered imprinted gene methylation and expression in completely ES cell-derived mouse fetuses: Association with aberrant phenotypes. Development. 1998;125:2273–2282. [PubMed]
  • Dendouga N, Callebaut I, Tomavo S. A novel DNA repair enzyme containing RNA recognition, G-patch and specific splicing factor 45-like motifs in the protozoan parasite Toxoplasma gondii. Eur J Biochem. 2002;269:3393–3401. [PubMed]
  • Deo RC, Bonanno JB, Sonenberg N, Burley SK. Recognition of polyadenylate RNA by the poly(A)-binding protein. Cell. 1999;98:835–845. [PubMed]
  • Dreyfuss G, Swanson MS, Pinol-Roma S. Heterogeneous nuclear ribonucleoprotein particles and the pathway of mRNA formation. Trends Biochem Sci. 1988;13:86–91. [PubMed]
  • Fong YW, Zhou Q. Stimulatory effect of splicing factors on transcriptional elongation. Nature. 2001;414:929–933. [PubMed]
  • Fribourg S, Gatfield D, Izaurralde E, Conti E. A novel mode of RBD–protein recognition in the Y14–Mago complex. Nat Struct Biol. 2003;10:433–439. [PubMed]
  • Gattiker A, Gasteiger E, Bairoch A. ScanProsite: A reference implementation of a PROSITE scanning tool. Appl Bioinformatics. 2002;1:107–108. [PubMed]
  • Golling G, Amsterdam A, Sun Z, Antonelli M, Maldonado E, Chen W, Burgess S, Haldi M, Artzt K, Farrington S. et al. Insertional mutagenesis in zebrafish rapidly identifies genes essential for early vertebrate development. Nat Genet. 2002;31:135–140. [PubMed]
  • Gozani O, Potashkin J, Reed R. A potential role for U2AF–SAP155 interactions in recruiting U2 snRNP to the branch site. Mol Cell Biol. 1998;18:4752–4760. [PMC free article] [PubMed]
  • Habara Y, Urushiyama S, Tani T, Ohshima Y. The fission yeast prp10+ gene involved in pre-mRNA splicing encodes a homologue of highly conserved splicing factor, SAP155. Nucleic Acids Res. 1998;26:5662–5669. [PMC free article] [PubMed]
  • Handa N, Nureki O, Kurimoto K, Kim I, Sakamoto H, Shimura Y, Muto Y, Yokoyama S. Structural basis for recognition of the tra mRNA precursor by the Sex-lethal protein. Nature. 1999;398:579–585. [PubMed]
  • Hillier BJ, Christopherson KS, Prehoda KE, Bredt DS, Lim WA. Unexpected modes of PDZ domain scaffolding revealed by structure of nNOS–syntrophin complex. Science. 1999;284:812–815. [PubMed]
  • Hoffman DW, Query CC, Golden BL, White SW, Keene JD. RNA-binding domain of the A protein component of the U1 small nuclear ribonucleoprotein analyzed by NMR spectroscopy is structurally similar to ribosomal proteins. Proc Natl Acad Sci. 1991;88:2495–2499. [PubMed]
  • Im YJ, Lee JH, Park SH, Park SJ, Rho SH, Kang GB, Kim E, Eom SH. Crystal structure of the Shank PDZ–ligand complex reveals a class I PDZ interaction and a novel PDZ–PDZ dimerization. J Biol Chem. 2003a;278:48099–48104. [PubMed]
  • Im YJ, Park SH, Rho SH, Lee JH, Kang GB, Sheng M, Kim E, Eom SH. Crystal structure of GRIP1 PDZ6-peptide complex reveals the structural basis for class II PDZ target recognition and PDZ domain-mediated multimerization. J Biol Chem. 2003b;278:8501–8507. [PubMed]
  • Imai H, Chan EK, Kiyosawa K, Fu XD, Tan EM. Novel nuclear autoantigen with splicing factor motifs identified with antibody from hepatocellular carcinoma. J Clin Invest. 1993;92:2419–2426. [PMC free article] [PubMed]
  • Jung DJ, Na SY, Na DS, Lee JW. Molecular cloning and characterization of CAPER, a novel coactivator of activating protein-1 and estrogen receptors. J Biol Chem. 2002;277:1229–1234. [PubMed]
  • Kadlec J, Izaurralde E, Cusack S. The structural basis for the interaction between nonsense-mediated mRNA decay factors UPF2 and UPF3. Nat Struct Mol Biol. 2004;11:330–337. [PubMed]
  • Kanaar R, Roche SE, Beall EL, Green MR, Rio DC. The conserved pre-mRNA splicing factor U2AF from Drosophila: Requirement for viability. Science. 1993;262:569–573. [PubMed]
  • Kielkopf CL, Rodionova NA, Green MR, Burley SK. A novel peptide recognition mode revealed by the X-ray structure of a core U2AF35/U2AF65 heterodimer. Cell. 2001;106:595–605. [PubMed]
  • Kim CA, Bowie JU. SAM domains: Uniform structure, diversity of function. Trends Biochem Sci. 2003;28:625–628. [PubMed]
  • Kishan KV, Scita G, Wong WT, Di Fiore PP, Newcomer ME. The SH3 domain of Eps8 exists as a novel intertwined dimer. Nat Struct Biol. 1997;4:739–743. [PubMed]
  • Kitagawa K, Wang X, Hatada I, Yamaoka T, Nojima H, Inazawa J, Abe T, Mitsuya K, Oshimura M, Murata A. et al. Isolation and mapping of human homologues of an imprinted mouse gene U2AF1-RS1. Genomics. 1995;30:257–263. [PubMed]
  • Klein Gunnewiek JM, Hussein RI, van Aarssen Y, Palacios D, de Jong R, van Venrooij WJ, Gunderson SI. Fourteen residues of the U1 snRNP-specific U1A protein are required for homodimerization, cooperative RNA binding, and inhibition of polyadenylation. Mol Cell Biol. 2000;20:2209–2217. [PMC free article] [PubMed]
  • Kramer A, Utans U. Three protein factors (SF1, SF3 and U2AF) function in pre-splicing complex formation in addition to snRNPs. EMBO J. 1991;10:1503–1509. [PubMed]
  • Kuldau GA, Raju NB, Glass NL. Repeat-induced point mutations in PAD1, a putative RNA splicing factor from Neurospora crassa, confer dominant lethal effects on ascus development. Fungal Genet Biol. 1998;23:169–180. [PubMed]
  • Laguri C, Gilquin B, Wolff N, Romi-Lebrun R, Courchay K, Callebaut I, Worman HJ, Zinn-Justin S. Structural characterization of the LEM motif common to three human inner nuclear membrane proteins. Structure (Camb) 2001;9:503–511. [PubMed]
  • Lahiri DK, Thomas JO. A cDNA clone of the hnRNP C proteins and its homology with the single-stranded DNA binding protein UP2. Nucleic Acids Res. 1986;14:4077–4094. [PMC free article] [PubMed]
  • Lallena MJ, Chalmers KJ, Llamazares S, Lamond AI, Valcarcel J. Splicing regulation at the second catalytic step by Sex-lethal involves 3′ splice site recognition by SPF45. Cell. 2002;109:285–296. [PubMed]
  • Lau CK, Diem MD, Dreyfuss G, Van Duyne GD. Structure of the Y14–Magoh core of the exon junction complex. Curr Biol. 2003;13:933–941. [PubMed]
  • Letunic I, Copley RR, Schmidt S, Ciccarelli FD, Doerks T, Schultz J, Ponting CP, Bork P. SMART 4.0: Towards genomic data integration. Nucleic Acids Res. 2004;32:D142–D144. [PMC free article] [PubMed]
  • Li XY, Green MR. The HIV-1 Tat cellular coactivator Tat-SF1 is a general transcription elongation factor. Genes & Dev. 1998;12:2992–2996. [PubMed]
  • Lin F, Blake DL, Callebaut I, Skerjanc IS, Holmer L, McBurney MW, Paulin-Levasseur M, Worman HJ. MAN1, an inner nuclear membrane protein that shares the LEM domain with lamina-associated polypeptide 2 and emerin. J Biol Chem. 2000;275:4840–4847. [PubMed]
  • Liu J, He L, Collins I, Ge H, Libutti D, Li J, Egly JM, Levens D. The FBP interacting repressor targets TFIIH to inhibit activated transcription. Mol Cell. 2000;5:331–341. [PubMed]
  • Liu J, Akoulitchev S, Weber A, Ge H, Chuikov S, Libutti D, Wang XW, Conaway JW, Harris CC, Conaway RC. et al. Defective interplay of activators and repressors with TFIIH in xeroderma pigmentosum. Cell. 2001;104:353–363. [PubMed]
  • Marchler-Bauer A, Panchenko AR, Shoemaker BA, Thiessen PA, Geer LY, Bryant SH. CDD: A database of conserved domain alignments with links to domain three-dimensional structure. Nucleic Acids Res. 2002;30:281–283. [PMC free article] [PubMed]
  • Maucuer A, Camonis JH, Sobel A. Stathmin interaction with a putative kinase and coiled-coil-forming protein domains. Proc Natl Acad Sci. 1995;92:3100–3104. [PubMed]
  • Maucuer A, Le Caer JP, Manceau V, Sobel A. Specific Ser-Pro phosphorylation by the RNA-recognition motif containing kinase KIS. Eur J Biochem. 2000;267:4456–4464. [PubMed]
  • McKinney R, Wentz-Hunter K, Schmidt H, Potashkin J. Molecular characterization of a novel fission yeast gene spUAP2 that interacts with the splicing factor spU2AF59. Curr Genet. 1997;32:323–330. [PubMed]
  • Merendino L, Guth S, Bilbao D, Martinez C, Valcarcel J. Inhibition of msl-2 splicing by Sex-lethal reveals interaction between U2AF35 and the 3′ splice site AG. Nature. 1999;402:838–841. [PubMed]
  • Morgan B, Sun L, Avitahl N, Andrikopoulos K, Ikeda T, Gonzales E, Wu P, Neben S, Georgopoulos K. Aiolos, a lymphoid restricted transcription factor that interacts with Ikaros to regulate lymphocyte differentiation. EMBO J. 1997;16:2004–2013. [PubMed]
  • Nagai K, Oubridge C, Jessen TH, Li J, Evans PR. Crystal structure of the RNA-binding domain of the U1 small nuclear ribonucleoprotein A. Nature. 1990;348:515–520. [PubMed]
  • Nelson K, Green M. Mammalian U2 snRNP has a sequence-specific RNA-binding activity. Genes & Dev. 1989;3:1562–1571. [PubMed]
  • Neubauer G, King A, Rappsilber J, Calvio C, Watson M, Ajuh P, Sleeman J, Lamond A, Mann M. Mass spectrometry and EST-database searching allows characterization of the multi-protein spliceosome complex. Nat Genet. 1998;20:46–50. [PubMed]
  • Osada S, Ohmori SY, Taira M. XMAN1, an inner nuclear membrane protein, antagonizes BMP signaling by interacting with Smad1 in Xenopus embryos. Development. 2003;130:1783–1794. [PubMed]
  • Oubridge C, Ito N, Evans PR, Teo CH, Nagai K. Crystal structure at 1.92 Å resolution of the RNA-binding domain of the U1A spliceosomal protein complexed with an RNA hairpin. Nature. 1994;372:432–438. [PubMed]
  • Page-McCaw PS, Amonlirdviman K, Sharp PA. PUF60: A novel U2AF65-related splicing activity. RNA. 1999;5:1548–1560. [PubMed]
  • Pang Q, Hays JB, Rajagopal I. Two cDNAs from the plant Arabidopsis thaliana that partially restore recombination proficiency and DNA-damage resistance to E. coli mutants lacking recombination-intermediate-resolution activities. Nucleic Acids Res. 1993;21:1647–1653. [PMC free article] [PubMed]
  • Pawson T, Nash P. Assembly of cell regulatory systems through protein interaction domains. Science. 2003;300:445–452. [PubMed]
  • Poleev A, Hartmann A, Stamm S. A trans-acting factor, isolated by the three-hybrid system, that influences alternative splicing of the amyloid precursor protein minigene. Eur J Biochem. 2000;267:4002–4010. [PubMed]
  • Potashkin J, Naik K, Wentz-Hunter K. U2AF homolog required for splicing in vivo. Science. 1993;262:573–575. [PubMed]
  • Price SR, Evans PR, Nagai K. Crystal structure of the spliceosomal U2B″–U2A′ protein complex bound to a fragment of U2 small nuclear RNA. Nature. 1998;394:645–650. [PubMed]
  • Query CC, Bentley RC, Keene JD. A common RNA recognition motif identified within a defined U1 RNA binding domain of the 70K U1 snRNP protein. Cell. 1989;57:89–101. [PubMed]
  • Query CC, Moore MJ, Sharp PA. Branch nucleophile selection in pre-mRNA splicing: Evidence for the bulged duplex model. Genes & Dev. 1994;8:587–597. [PubMed]
  • Rain JC, Rafi Z, Rhani Z, Legrain P, Kramer A. Conservation of functional domains involved in RNA binding and protein–protein interactions in human and Saccharomyces cerevisiae pre-mRNA splicing factor SF1. RNA. 1998;4:551–565. [PubMed]
  • Raju GP, Dimova N, Klein PS, Huang HC. SANE, a novel LEM domain protein, regulates bone morphogenetic protein signaling through interaction with Smad1. J Biol Chem. 2003;278:428–437. [PubMed]
  • Rosonina E, Blencowe BJ. Gene expression: The close coupling of transcription and splicing. Curr Biol. 2002;12:R319–R321. [PubMed]
  • Rudner DZ, Kanaar R, Breger KS, Rio DC. Mutations in the small subunit of the Drosophila U2AF splicing factor cause lethality and developmental defects. Proc Natl Acad Sci. 1996;93:10333–10337. [PubMed]
  • Rudner DZ, Breger KS, Kanaar R, Adams MD, Rio DC. RNA binding activity of heterodimeric splicing factor U2AF: At least one RS domain is required for high-affinity binding. Mol Cell Biol. 1998a;18:4004–4011. [PMC free article] [PubMed]
  • Rudner DZ, Kanaar R, Breger KS, Rio DC. Interaction between subunits of heterodimeric splicing factor U2AF is essential in vivo. Mol Cell Biol. 1998b;18:1765–1773. [PMC free article] [PubMed]
  • Ruskin B, Zamore PD, Green MR. A factor, U2AF, is required for U2 snRNP binding and splicing complex assembly. Cell. 1988;52:207–219. [PubMed]
  • Rutz B, Seraphin B. Transient interaction of BBP/ScSF1 and Mud2 with the splicing machinery affects the kinetics of spliceosome assembly. RNA. 1999;5:819–831. [PubMed]
  • Sachs AB, Bond MW, Kornberg RD. A single gene from yeast for both nuclear and cytoplasmic polyadenylate-binding proteins: Domain structure and expression. Cell. 1986;45:827–835. [PubMed]
  • Sampath J, Long PR, Shepard RL, Xia X, Devanarayan V, Sandusky GE, Perry WL, III, Dantzig AH, Williamson M, Rolfe M, et al. Human SPF45, a splicing factor, has limited expression in normal tissues, is overexpressed in many tumors, and can confer a multidrug-resistant phenotype to cells. Am J Pathol. 2003;163:1781–1790. [PubMed]
  • Scherly D, Boelens W, van Venrooij WJ, Dathan NA, Hamm J, Mattaj IW. Identification of the RNA binding segment of human U1 A protein and definition of its binding site on U1 snRNA. EMBO J. 1989;8:4163–4170. [PubMed]
  • Selenko P, Gregorovic G, Sprangers R, Stier G, Rhani Z, Kramer A, Sattler M. Structural basis for the molecular recognition between human splicing factors U2AF65 and SF1/mBBP. Mol Cell. 2003;11:965–976. [PubMed]
  • Shatkin AJ, Manley JL. The ends of the affair: Capping and polyadenylation. Nat Struct Biol. 2000;7:838–842. [PubMed]
  • Shen H, Kan JL, Green MR. Arginine–serine-rich domains bound at splicing enhancers contact the branchpoint to promote prespliceosome assembly. Mol Cell. 2004;13:367–376. [PubMed]
  • Shepard J, Reick M, Olson S, Graveley BR. Characterization of U2AF26, a splicing factor related to U2AF35. Mol Cell Biol. 2002;22:221–230. [PMC free article] [PubMed]
  • Shi H, Xu RM. Crystal structure of the Drosophila Mago nashi–Y14 complex. Genes & Dev. 2003;17:971–976. [PubMed]
  • Smith CW, Valcarcel J. Alternative pre-mRNA splicing: The logic of combinatorial control. Trends Biochem Sci. 2000;25:381–388. [PubMed]
  • Thompson JD, Gibson TJ, Plewniak F, Jeanmougin F, Higgins DG. The CLUSTAL_X windows interface: Flexible strategies for multiple sequence alignment aided by quality analysis tools. Nucleic Acids Res. 1997;25:4876–4882. [PMC free article] [PubMed]
  • Thornton JM, Todd AE, Milburn D, Borkakoti N, Orengo CA. From structure to function: Approaches and limitations. Nat Struct Biol. 2000;7:991–994. [PubMed]
  • Tronchere H, Wang J, Fu XD. A protein related to splicing factor U2AF35 that interacts with U2AF65 and SR proteins in splicing of pre-mRNA. Nature. 1997;388:397–400. [PubMed]
  • Tupler R, Perini G, Green MR. Expressing the human genome. Nature. 2001;409:832–833. [PubMed]
  • Valcarcel J, Gaur RK, Singh R, Green MR. Interaction of U2AF65 RS region with pre-mRNA branch point and promotion of base pairing with U2 snRNA. Science. 1996;273:1706–1709. [PubMed]
  • Van Buskirk C, Schupbach T. Half pint regulates alternative splice site selection in Drosophila. Dev Cell. 2002;2:343–353. [PubMed]
  • Varani L, Gunderson SI, Mattaj IW, Kay LE, Neuhaus D, Varani G. The NMR structure of the 38 kDa U1A protein–PIE RNA complex reveals the basis of cooperativity in regulation of polyadenylation by human U1A protein. Nat Struct Biol. 2000;7:329–335. [PubMed]
  • Waite KA, Eng C. From developmental disorder to heritable cancer: It’s all in the BMP/TGF-β family. Nat Rev Genet. 2003;4:763–773. [PubMed]
  • Wang X, Tanaka Hall TM. Structural basis for recognition of AU-rich element RNA by the HuD protein. Nat Struct Biol. 2001;8:141–145. [PubMed]
  • Wang X, McLachlan J, Zamore PD, Hall TM. Modular recognition of RNA by a human pumilio-homology domain. Cell. 2002;110:501–512. [PubMed]
  • Wentz-Hunter K, Potashkin J. The small subunit of the splicing factor U2AF is conserved in fission yeast. Nucleic Acids Res. 1996;24:1849–1854. [PMC free article] [PubMed]
  • Wu JY, Maniatis T. Specific interactions between proteins implicated in splice site selection and regulated alternative splicing. Cell. 1993;75:1061–1070. [PubMed]
  • Wu J, Manley J. Mammalian pre-mRNA branch site selection by U2 snRNP involves base pairing. Genes & Dev. 1989;3:1553–1561. [PubMed]
  • Wu S, Romfo CM, Nilsen TW, Green MR. Functional recognition of the 3′ splice site AG by the splicing factor U2AF35. Nature. 1999;402:832–835. [PubMed]
  • Yan D, Perriman R, Igel H, Howe KJ, Neville M, Ares M. CUS2, a yeast homolog of human Tat-SF1, rescues function of misfolded U2 through an unusual RNA recognition motif. Mol Cell Biol. 1998;18:5000–5009. [PMC free article] [PubMed]
  • Zamore PD, Green MR. Identification, purification, and biochemical characterization of U2 small nuclear ribonucleoprotein auxiliary factor. Proc Natl Acad Sci. 1989;86:9243–9247. [PubMed]
  • Zeng Q, Hall KB. Contribution of the C-terminal tail of U1A RBD1 to RNA recognition and protein stability. RNA. 1997;3:303–314. [PubMed]
  • Zhang M, Zamore PD, Carmo-Fonseca M, Lamond AI, Green MR. Cloning and intracellular localization of the U2 small nuclear ribonucleoprotein auxiliary factor small subunit. Proc Natl Acad Sci. 1992;89:8769–8773. [PubMed]
  • Zhou Q, Sharp PA. Tat-SF1: Cofactor for stimulation of transcriptional elongation by HIV-1 Tat. Science. 1996;274:605–610. [PubMed]
  • Zhou Z, Licklider LJ, Gygi SP, Reed R. Comprehensive proteomic analysis of the human spliceosome. Nature. 2002;419:182–185. [PubMed]
  • Zhuang Y, Weiner A. A compensatory base change in human U2 snRNA can suppress a branch site mutation. Genes & Dev. 1989;3:1545–1552. [PubMed]
  • Zorio DA, Blumenthal T. Both subunits of U2AF recognize the 3′ splice site in Caenorhabditis elegans. Nature. 1999a;402:835–838. [PubMed]
  • Zorio DA, Blumenthal T. U2AF35 is encoded by an essential gene clustered in an operon with RRM/cyclophilin in Caenorhabditis elegans. RNA. 1999b;5:487–494. [PubMed]