RNA-binding proteins play critical roles in gene expression through regulation of RNA splicing, localization, translation, and decay. Members of the PUF family, named after the two founding members
Drosophila melanogaster Pumilio (PUM) and
Caenorhabditis elegans fem-3 binding factor (FBF), are RNA-binding proteins that regulate gene expression posttranscriptionally. They induce mRNA decay or repress translation (
Wickens et al., 2002) and have recently been shown to activate translation of some mRNA targets (
Kaye et al., 2009;
Suh et al., 2009). PUF proteins exist exclusively in eukaryotes and bind sequence-specifically to regulatory sequences in the 3′ UTRs of their target mRNAs. All PUF proteins share a highly conserved RNA-binding domain known as the Pumilio-homology domain (PUM-HD) or PUF domain (
Wharton et al., 1998;
Zamore et al., 1997;
Zhang et al., 1997). Structural studies of different PUF proteins with RNA have revealed the details of their RNA recognition schemes (
Gupta et al., 2008;
Miller et al., 2008;
Wang et al., 2002;
Wang et al., 2009b;
Zhu et al., 2009).
Crystal structures of the PUM-HD of human PUMILIO1 (PUM1) bound to the Nanos Response Element (NRE) sequences in
D. melanogaster hunchback (
hb) mRNA provide a prototypical model of modular RNA recognition (
Wang et al., 2002). The PUM-HD comprises eight α-helical PUM repeats and two pseudo-repeats at the N and C termini, which together adopt a crescent shape. The inner concave surface binds target RNAs in an ‘anti-parallel’ orientation, with the N-terminal end of the protein binding to the 3′ end of the RNA. Each PUM repeat recognizes one RNA base using three side chains at specific positions in the repeat. Thus, 8 RNA bases are recognized by 8 PUM repeats. We refer to this as a 1:1 binding mode. Two side chains contact the Watson-Crick edge of the base, and a third side chain stacks with the same base and/or the preceding base. Certain combinations of side chains recognize particular RNA bases. Mutation of these conserved combinations of residues allows design of the RNA recognition specificity of PUM-HDs (
Cheong and Hall, 2006;
Furman et al., 2010;
Koh et al., 2009;
Opperman et al., 2005;
Ozawa et al., 2007;
Stumpf et al., 2008;
Tilsner et al., 2009;
Wang et al., 2002;
Wang et al., 2009a).
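The anti-parallel 1:1 arrangement described above can be illustrated with a minimal sketch. It assumes an idealized 8-repeat domain bound to an 8-base site, numbers repeats from the N terminus and bases from the 5′ end, and therefore pairs repeat r with base 9 − r. The function name and the example site (one instance of the consensus with U at the fifth position) are illustrative only and are not taken from the structures.

```python
def pair_repeats_with_bases(rna, n_repeats=8):
    """Pair each PUM repeat with one RNA base in the idealized 1:1 mode.

    Assumption: repeats are numbered 1..n from the N terminus and bases
    1..n from the 5' end, so the anti-parallel orientation pairs
    repeat r with base (n + 1 - r).
    Returns a list of (repeat_number, base_position, base) tuples.
    """
    if len(rna) != n_repeats:
        raise ValueError("the 1:1 mode needs exactly one base per repeat")
    pairs = []
    for repeat in range(1, n_repeats + 1):
        position = n_repeats + 1 - repeat      # 1-based position from the 5' end
        pairs.append((repeat, position, rna[position - 1]))
    return pairs


if __name__ == "__main__":
    # Illustrative 8-base site; not a sequence from a specific crystal structure.
    for repeat, position, base in pair_repeats_with_bases("UGUAUAUA"):
        print(f"repeat {repeat} <-> base {position} ({base})")
```

Under these assumptions the N-terminal repeat 1 pairs with the 3′-most base and repeat 8 pairs with the 5′ U of the UGU motif. Binding modes in which a domain accommodates more bases than it has repeats fall outside this simple pairing.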
In contrast to other RNA-binding protein families with hundreds of different family members per organism, the PUF protein family is small. Humans and other mammals express two PUF proteins, D. melanogaster expresses one, Saccharomyces cerevisiae expresses six, C. elegans expresses nine, and Arabidopsis thaliana encodes up to 26. Each organism expresses at least one PUF family member closely related to human PUM1 and Drosophila PUM, which contains a PUM-HD that binds to the recognition sequence found in hb mRNA, 5′-UGUANAUA-3′. All PUM-HDs bind to sequences containing a 5′ UGU sequence.
The identification of mRNA targets of PUF proteins has revealed more variability in mRNA sequence recognition than expected based on the 1:1 binding mode observed in crystal structures of human PUM1 with
hb RNA and the high conservation of RNA recognition side chains among PUF proteins. Yeast Puf4p and Puf5p use 8 PUM repeats to bind to sequences containing, respectively, 9 or 10 bases starting from the 5′ UGU (
Gerber et al., 2004). Similarly, worm PUF proteins with 8-repeat PUM-HDs recognize longer RNA sequences (
Koh et al., 2009;
Opperman et al., 2005;
Stumpf et al., 2008). Crystal structures of yeast Puf4p and worm FBF-2 demonstrate that additional bases can be accommodated by direct stacking of bases or flipping bases away from the RNA-binding surface, influenced by changes in curvature of the RNA-binding surfaces of these proteins (
Miller et al., 2008;
Wang et al., 2009b). Additional specificity of a PUM-HD is achieved by a specialized binding pocket at the C-terminal end of the domain of Puf3p, which specifically recognizes a cytosine two bases upstream of the 5′ UGU motif (
Zhu et al., 2009). A cytosine at this (−2) position is required for
in vivo target recognition. Hence, PUF proteins utilize binding modes in addition to 1 repeat:1 RNA base to recognize RNA targets. As our understanding of the structures and RNA target recognition by specific PUF proteins grows along with corresponding knowledge of downstream effects, computational prediction of binding modes and biological effects may be possible.
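As a small illustration of the (−2) cytosine requirement described for Puf3p, the following sketch checks whether a given UGU motif is preceded by a cytosine two bases upstream. It assumes a numbering with no position zero, so that the U of UGU is +1, the base immediately upstream is −1, and the next is −2. The function name and the example sequences are hypothetical.

```python
def has_minus2_cytosine(rna, ugu_start):
    """Return True if a cytosine sits two bases upstream of a UGU motif.

    rna: RNA sequence written 5'->3'.
    ugu_start: 0-based index of the first U of the UGU motif.
    Assumption: the U of UGU is position +1 and there is no position 0,
    so position -2 corresponds to index ugu_start - 2.
    """
    if rna[ugu_start:ugu_start + 3] != "UGU":
        raise ValueError("ugu_start does not point at a UGU motif")
    if ugu_start < 2:
        return False                      # not enough upstream sequence
    return rna[ugu_start - 2] == "C"


if __name__ == "__main__":
    # Hypothetical sequences used only to exercise the check.
    print(has_minus2_cytosine("CAUGUAAAUA", 2))
    print(has_minus2_cytosine("AAUGUAAAUA", 2))
```

The check covers only the sequence feature named in the text; it says nothing about the binding pocket itself or about affinity.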
Mammalian cells express two PUF proteins, PUM1 and PUM2, whose RNA-binding domains are highly similar to
D. melanogaster PUM (80% and 78% amino acid positions identical to PUM, respectively). Until recently, little was known about target mRNAs of human PUM1 and PUM2. Several studies in the past few years have identified mRNA targets associated with human PUM1 and PUM2 and revealed the same consensus recognition sequence as that of fly PUM, 5′-UGUANAUA-3′, where N is A, U, or C (
Galgano et al., 2008;
Gerber et al., 2006;
Hafner et al., 2010;
Morris et al., 2008).
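The consensus above lends itself to a simple sequence search. The sketch below scans a 3′ UTR for 5′-UGUANAUA-3′ with N restricted to A, U, or C, as stated in the text. The function name, the 0-based coordinates, the conversion of DNA-style input (T to U), and the example sequence are assumptions made for illustration, and a match indicates only a candidate site, not demonstrated binding.

```python
import re

# Consensus from the text: UGUA, then N (A, U, or C), then AUA.
# The lookahead lets overlapping candidate sites be reported as well.
PUM_CONSENSUS = re.compile(r"(?=(UGUA[AUC]AUA))")


def find_pum_sites(utr):
    """Return (0-based start, matched site) for each consensus match.

    Input may be RNA or DNA letters in either case; T is read as U.
    """
    rna = utr.upper().replace("T", "U")
    return [(m.start(), m.group(1)) for m in PUM_CONSENSUS.finditer(rna)]


if __name__ == "__main__":
    # Hypothetical UTR fragment: one consensus site, plus one site with
    # G at the fifth position, which the stated consensus excludes.
    example = "AAUUGUAUAUAGGCUGUAGAUACC"
    for start, site in find_pum_sites(example):
        print(start, site)
```

A pattern this strict will miss targets that are only similar to the consensus, so in practice it would serve as a first filter rather than a complete predictor.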
The primordial function of PUF proteins appears to be regulating germline stem cell differentiation (
Wickens et al., 2002). mRNAs of mitogen-activated protein (MAP) kinases have been shown to be targets of PUF proteins in stem cells (
Lee et al., 2007). FBF regulates
mapk/erk2 mRNAs in
C. elegans germline cells. Human PUM2 has been shown to down-regulate the expression of MAP kinase homolog mRNAs,
p38α and
erk2, in human embryonic stem cells (
Lee et al., 2007). These two mRNAs contain sequences similar to the PUM consensus sequence, and mutation of the UGU motifs in these sequences results in reporter mRNAs refractory to PUM2 regulation.
These advances in the identification of native mRNA targets of human PUF proteins prompted us to revisit how cognate RNA target sequences are recognized and further examine human PUF protein substrate specificities. We determined crystal structures of PUM1-HD and PUM2-HD in complex with four different RNA sequences including three cognate target sequences from MAPK homolog mRNAs. We also analyzed biochemically the affinity and specificity of binding to these RNAs by PUM1 and PUM2. We observe three different modes of binding to RNAs around the 5th RNA base, which varies in the consensus sequences. The different modes of binding do not appear to affect binding affinity in vitro, but in vivo the protein:RNA complexes may present alternative recognition surfaces that could direct downstream effector complex formation.