Most transcripts of higher eukaryotes contain intervening sequences (introns) between the protein-coding regions (exons) that must be excised by pre-mRNA splicing before nuclear export and translation of the mRNA product. Pre-mRNA splicing is accomplished by the spliceosome, an assembly of constitutive splicing factors and small nuclear (sn)RNAs that undergoes a series of ATP-dependent conformational rearrangements (Jurica and Moore, 2003). In addition, alternative splicing factors generate transcript diversity for cell growth and differentiation by incorporating different exons into the final mRNA (Maniatis and Tasic, 2002). An essential splicing factor, U2 Auxiliary Factor (U2AF), recognizes consensus 3′-splice site sequences in the pre-mRNA and coordinates the initial stages of spliceosome assembly. Since formation of the U2AF complex commits the pre-mRNA to be spliced (Michaud and Reed, 1991), U2AF/pre-mRNA interactions present a key target for regulation during alternative splicing, for example by Sex-lethal (SXL) (Valcarcel et al., 1993) or polypyrimidine tract binding protein (PTB) (Sharma et al., 2005). Accurate recognition of the 3′-splice site by U2AF is critical for pre-mRNA splicing, as demonstrated by the association of an estimated half of human genetic diseases with errors in splice site recognition (Garcia-Blanco et al., 2004).
U2AF is a heterodimer of two subunits. The large subunit (U2AF65) recognizes an essential polypyrimidine (Py)-tract pre-mRNA consensus that is composed predominantly of uridines (Zamore et al., 1992). The small subunit (U2AF35) associates tightly with a region near the U2AF65 N-terminus, and contacts an adjacent ‘AG’ consensus dinucleotide at the nearby intron-exon boundary (Merendino et al., 1999; Wu et al., 1999; Zorio and Blumenthal, 1999). Initially, the U2AF heterodimer binds the pre-mRNA as a ternary complex with a third protein, Splicing Factor 1 (SF1) (Abovich and Rosbash, 1997). SF1 recognizes the branchpoint consensus sequence (BPS) of the pre-mRNA, where the first step of the splicing reaction ultimately takes place. In parallel with assembly of the U2AF/SF1/3′-splice site complex, the U1 small nuclear ribonucleoprotein (snRNP) associates with the 5′-splice site. Formation of this early splicing complex brings the BPS and the 5′- and 3′-splice sites together in a structured conformation (Kent and MacMillan, 2002; Kent et al., 2003). Next, the U2 snRNP component of the spliceosome forms an ATP-dependent complex with the BPS and U2AF as SF1 dissociates. Stable association of the U2 snRNP requires an N-terminal, arginine-serine-rich (RS) domain of U2AF65 (Shen and Green, 2004; Valcarcel et al., 1996), and RNP-unwindases such as the U2AF-Associated-Protein-56KD (UAP56) (Fleckner et al., 1997). Ultimately, U2AF is released from the pre-mRNA before the splicing reaction is catalyzed by the active spliceosome.
U2AF65 preferentially binds uridine-rich RNA sequences: in vitro genetic selection experiments with U2AF65 enrich polyuridine sequences (Singh et al., 1995), and chemical modification of the uridine N3 or O4 atoms inhibits U2AF65 binding by ~100-fold (Singh et al., 2000). Accordingly, Py-tracts composed of long uridine stretches promote use of adjacent 3′-splice sites (Coolidge et al., 1997; Reed, 1989). However, natural mammalian Py-tracts vary in length and sequence composition (Senapathy et al., 1990). U2AF65 universally recognizes these diverse natural Py-tracts, which are frequently interrupted by cytosines or purines, albeit with a broad (200-fold) range of affinities (Zamore et al., 1992). In contrast, the alternative splicing factors SXL and PTB bind specific Py-tract sequences of regulated splice sites (guanosine-containing uridine-tracts or alternating (CU)-tracts, respectively) (Perez et al., 1997; Singh et al., 1995; Sosnowski et al., 1989; Valcarcel et al., 1993).
Despite their distinct Py-tract specificities, the RNA binding domains of U2AF65 (Ito et al., 1999), SXL (Handa et al., 1999), and PTB (Conte et al., 2000; Oberstrass et al., 2005; Simpson et al., 2004) share a similar structural scaffold of consecutive RNA recognition motifs (RRMs). The RRM, one of the most common types of eukaryotic RNA binding domains, is characterized by two ribonucleoprotein consensus motifs (RNP1 and RNP2) with aromatic and basic residues that interact with single-stranded RNAs (Maris et al., 2005). Specifically, the two central RRMs (RRM1 and RRM2) of U2AF65 comprise the minimal Py-tract binding domain (U2AF651,2) (Banerjee et al., 2003; Banerjee et al., 2004; Zamore et al., 1992). To investigate how these two U2AF65 RRMs accomplish versatile Py-tract recognition, we present the X-ray structure and a complementary mutational analysis of the U2AF65 RNA binding domain in complex with polyuridine RNA. The structure reveals that U2AF65 recognizes uridines through a network of hydrogen bond interactions with the base edges, rather than by shape selection for the smaller pyrimidine bases over the larger purine bases. A significant number of side-chain and water-mediated hydrogen bonds may explain the ability of U2AF65 to bind a variety of natural Py-tract sequences.