Conceived and designed the experiments: LHS HWP MH MM HS. Performed the experiments: PS TK SvdB RC LL MH LHS WT MH MM AGT. Analyzed the data: PS TK LL MH HWP. Wrote the paper: PS HS.
DEAD-box RNA helicases play various, often critical, roles in all processes where RNAs are involved. Members of this family of proteins are linked to human disease, including cancer and viral infections. DEAD-box proteins contain two conserved domains that both contribute to RNA and ATP binding. Despite recent advances, the molecular details of how these enzymes convert chemical energy into RNA remodeling are unknown. We present crystal structures of the isolated DEAD-domains of human DDX2A/eIF4A1, DDX2B/eIF4A2, DDX5, DDX10/DBP4, DDX18/myc-regulated DEAD-box protein, DDX20, DDX47, DDX52/ROK1, and DDX53/CAGE, and of the helicase domains of DDX25 and DDX41. Together with prior knowledge this enables a family-wide comparative structural analysis. We propose a general mechanism for opening of the RNA binding site. This analysis also provides insights into the diversity of DExD/H-box proteins, with implications for understanding the functions of individual family members.
DExD/H-box RNA helicases, from viruses and bacteria to eukaryotes, play important roles in processes including ribosome biogenesis, RNA processing and folding, ribonucleoprotein (RNP) remodeling, RNA nuclear export, the regulation of RNA translation and transcription, and nonsense-mediated RNA decay. DExD/H-box RNA helicases have multiple functions in these processes: They can act as RNA chaperones, as ATP-dependent RNA helicases and unwindases, as RNPases by mediating RNA-protein association and dissociation, or as co-activators and co-repressors of transcription. Cancer cell lines often feature deregulated expression or impaired functioning of RNA helicases. In addition, several family members are captured and regulated by viral proteins, are involved in viral RNA maturation, or mediate antiviral host defense. Inhibition of individual RNA helicases as a therapeutic route is currently being explored.
DExD/H-box proteins often contain accessory regulatory domains and localization modules, but their cores consist of two RecA-like domains joined by a short flexible linker. The N-terminal domain is commonly referred to as conserved domain-1, or DEAD-domain, and the C-terminal domain as conserved domain-2, or helicase domain , , . Both domains contribute to the binding site for RNA substrates and both contribute to ATP hydrolysis. These activities are coupled to one another by allostery throughout the protein molecules. Consequently, a detailed understanding of how these proteins convert chemical energy into RNA remodeling requires knowledge of the structures of the two conserved domains independent of each other and interacting in the closed active state. To date, crystal structures of tandem domains are available for several DExD/H-box helicases, also in complex with RNA substrates –. To understand the RNA remodeling event and the underlying structural rearrangements, it is important to compare these structures with those of each domain in isolation.
We have solved crystal structures of single domains from eleven human DExD/H-box helicases of the DEAD-motif subfamily. A comparative analysis of these structures uncovered not only isoform specific features, but also nucleotide specific positioning of flexible elements that are common to several proteins. We suggest a structural mechanism for the linkage between binding of ATP and activation of the RNA binding site.
We used X-ray crystallography to determine the structures of the DEAD-domains of DDX2A, DDX2B, DDX5, DDX10, DDX18, DDX20, DDX47, DDX52, and DDX53, as well as the helicase domains of DDX25 and DDX41. While the physiological roles of these proteins are diverse (Table 1) all structures show the RecA-like fold. Superposition of the DEAD-domain structures gives root mean square deviations of Cα-atom positions between 0.6 and 1.9 Å for proteins with sequence identity between 86 and 27%. The two helicase domains have a sequence identity of 23% and their structures superimpose with an r.m.s.d. of 3 Å. Details of the synchrotron data collection, structure determination, and refinement statistics are presented in Table 2.
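The two quantities quoted above, pairwise sequence identity and the r.m.s.d. of Cα-atom positions, can be illustrated with a minimal Python sketch. It assumes that the sequences have already been aligned and that the coordinates have already been superposed and matched residue by residue; the superposition step itself is not shown. The function names and the toy input are hypothetical, and identity is computed here over ungapped alignment columns, which is only one of several conventions and not necessarily the one used in this study.

```python
import math


def percent_identity(aligned_a, aligned_b):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only columns in which both sequences carry a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def ca_rmsd(coords_a, coords_b):
    """R.m.s.d. (same units as the input) between matched, pre-superposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy input, invented for illustration (not taken from the deposited structures).
identity = percent_identity("DEAD-K", "DEAHRK")
rmsd = ca_rmsd(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)],
)
print(f"identity = {identity:.1f}%, r.m.s.d. = {rmsd:.2f} Angstrom")
```

Because the r.m.s.d. depends on which atoms are matched and on how the structures were superposed, values computed in this way are comparable only when both choices are made consistently across all pairs.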
Superposition of the different crystal structures illustrates the location of flexible regions (Figure 1A, 1B). In general, regions of high sequence conservation (the conserved motifs in particular) contribute to the binding sites for nucleotide and for RNA, and these sites coincide with the highest structural similarity (Figure 2). Conversely, unconserved regions in the DEAD-domains determined here show a higher r.m.s.d. in their Cα-atom positions. Some of the unconserved regions in the structures are flexible, as documented by high B-factors and partially missing electron density.
We compared the surface charge distributions of the DEAD-domain structures (Figure 1E, 1H). All DEAD-domains feature a conserved patch that constitutes the nucleotide binding site and part of the RNA binding site. This patch forms a negatively charged channel between α-helices 8 and 10 that extends to the Mg2+-binding site. The negative charges originate from the side chains of the two helices, including the DEAD-motif on α-helix 8. As expected, the RNA binding cleft is positively charged in all DEAD-domains, but the charged patches differ in size. The remainder of the DEAD-domain surfaces differs in electrostatic surface properties among the family members.
Conserved motifs I (the P-loop), Ia, II, and the Q-motif participate in nucleotide binding. The P-loop and motif II coordinate the nucleotide phosphates and the magnesium ion, whereas residues of the Q-motif bind and recognize the adenine moiety. The side chains that participate in nucleotide and magnesium binding are highly conserved (Figure 2). The nucleotide phosphates interact with backbone atoms, a conserved lysine, and the divalent cation. Superposition of the DEAD-domains shows that the structures of the P-loop and motif III are determined by the state of nucleotide hydrolysis. The P-loop is in a wide-open conformation when ATP is bound, as seen in DDX20 as well as in the previously published structures of DDX19 and eIF4AIII. In the crystal complexes with either ADP or AMP the loop closes up, resulting in a shift in Cα-atom positions by up to 3.5 Å between the ATP- and the AMP-states, or by up to 2.5 Å between the ADP- and the AMP-states (Figure 3A). Thus, the conformation of the P-loop is determined by the nucleotide phosphates, and longer phosphate tails result in a more open loop. This observation agrees with previous results. Motif III follows the P-loop transition and its position changes by up to 3 Å toward the P-loop. Motifs Ia, Ib and II seem unaffected by the state of ATP hydrolysis, and their conformations remain unchanged even in the crystal structures in which the nucleotide binding site is not occupied.
Two of the structures show unique P-loop conformations. The DDX2B structure features an α-helix 4 that is longer than in other helicases, and leads into an unusually closed P-loop conformation (Figure 3C). As a consequence, the ATP binding site is not visible on the surface of the DDX2B structure. This conformation is most likely induced by a crystal contact in this region.
The AMPPNP-bound DDX20 structure contains no metal ion (Figure 3B). Lack of γ-phosphate coordination by a metal ion leads to a shift in the position of the β- and γ-phosphates, which bind where the α- and β-phosphates are bound in other ATP complexes. Since the adenine base is coordinated in the usual fashion, the α-phosphate and the sugar moiety are tilted out of their expected positions. This illustrates that DDX20 (and presumably other helicases) can bind ATP even in the absence of a divalent cation. However, a divalent cation is needed to allow coordination of the three phosphates in the correct geometry for catalysis.
Some of the side chains that interact with the nucleotides are not conserved, and most of these are found in the Q-motif. Three hydrogen bonds between the adenine ring and the protein ensure specific binding of adenosine nucleotides. These are formed by the conserved glutamine and the backbone carbonyl five residues upstream of the glutamine (Figure 3). The sixth residue upstream of the conserved glutamine is an aromatic residue in most DEAD-box helicases. Its side chain stacks with the nucleotide base, stabilizing it in its position. Interestingly, the identity of this residue is not conserved: While phenylalanine is most common, DDX10 has a tyrosine and DDX47 has a tryptophan in the corresponding position. Moreover, an aromatic residue in this position is not obligatory: DDX53 features an isoleucine, whose van der Waals interactions with the adenine ring are weaker than the base stacking interactions of the aromatic side chains (Figure 3D). We analyzed the protein-nucleotide binding interfaces in these crystal structures using the PISA server. This analysis showed that, while the overall ligand interface areas are similar in the different nucleotide complexes, the contribution of the base stacking residues varies considerably. The variability in the stacking residue position may reflect different needs for conformational flexibility in this region of the DEAD-domains.
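The positional rule described above (the stacking residue sits six positions upstream of the conserved Q-motif glutamine) can be expressed as a short lookup. The following Python sketch is illustrative only: the function name, the restriction of "aromatic" to F, Y and W, and the toy peptides are assumptions made for this example, and the peptides are not real DDX sequences.

```python
# Residues treated as aromatic for base stacking in this sketch (an assumption:
# only the three residue types named in the text are included).
AROMATIC = frozenset("FYW")


def stacking_residue(sequence, gln_index, offset=6):
    """Return (residue, is_aromatic) for the position `offset` residues
    upstream of the conserved glutamine at 0-based index `gln_index`."""
    if sequence[gln_index] != "Q":
        raise ValueError("gln_index does not point at a glutamine")
    position = gln_index - offset
    if position < 0:
        raise ValueError("sequence is too short upstream of the glutamine")
    residue = sequence[position]
    return residue, residue in AROMATIC


# Hypothetical toy peptides, invented for illustration; the glutamine is the
# last residue (index 7) in both.
for peptide in ("TFSDLNLQ", "TISDLNLQ"):
    residue, aromatic = stacking_residue(peptide, 7)
    kind = "aromatic, can stack" if aromatic else "non-aromatic"
    print(f"{peptide}: residue at Q-6 is {residue} ({kind})")
```

A lookup of this kind only classifies the residue type; the actual contribution of the side chain to the ligand interface has to come from a structure-based analysis such as the one carried out with the PISA server.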
The helicase domain contributes to nucleotide coordination via motifs V and VI. From the closed state DDX19 structure  it is apparent that four side chains are of particular importance: The aspartate of motif V coordinates the O3′ of the ribose. The second arginine side chain of motif VI (HRxGRxGR) interacts with the γ-phosphate. The third arginine, which is also the putative arginine finger during ATP hydrolysis, coordinates all three ATP phosphates. The variable residue that follows this arginine coordinates the adenosine ring by different means. In the DDX19 helicase domain a phenylalanine stacks with the adenosine rings. A superposition of DDX19 with the DDX25 and DDX41 helicase domains shows that in the latter two structures part of motif VI is not visible in the electron density, indicating its flexibility. The conserved motifs IV and VI superpose well, whereas motif V shows different conformations in all three structures (Figure 1B).
The only part of motif IV that is not flexible is the histidine-arginine pair, and it superposes in all three crystal structures. The arginine points to a negatively charged pocket formed in part by side chains from motifs IV and V in the inside of the helicase domain. The aliphatic part of the arginine side chain makes a hydrophobic contact with the phenylalanine of motif IV. In the two-domain closed state structures the histidine interacts with the SAT motif from the helicase domain. Therefore, the SAT motif is indirectly linked to the ATP binding site as well as to the RNA binding sites of both domains. This explains the central importance of this motif in the coupling of ATP hydrolysis and RNA unwinding. In SAT-motif mutants of eIF4A the ATPase and helicase activities were uncoupled: The SAT-to-AAA mutant protein is capable of binding RNA in an ATP-dependent manner, but lacks RNA unwinding activity.
The available atomic resolution structures of DEAD-box helicases with bound RNA – show that the DEAD-domain contributes to RNA binding through two conserved and one variable structural element: (i) Motif Ia; (ii) α-helix 7, with its conserved motif Ib; and (iii) the variable loop connecting β-sheets 3 and 4. These interactions are illustrated for DDX19 in Figure 4A: While the variable loop clamps the RNA substrate in a specific conformation, motifs Ia and Ib each coordinate an RNA-backbone phosphate and induce a tilt of one or more RNA bases.
Conserved motifs Ia and Ib of DDX19 and all DEAD-domain structures described here superimpose perfectly (Figure 4). This leads us to conclude that RNA substrates are bound in a similar conformation by the conserved motifs of all these DEAD-domains. The variability in part of the RNA binding sites (Figure 4D), on the other hand, implies that different helicases could stabilize specific RNA conformations. In addition, variable side chain contribution may also reflect optimal recognition of specific nucleotide sequences.
Inspection of the RNA complexes of DDX19, vasa, and eIF4AIII – shows that the conserved motif that makes the most extensive contacts with the RNA-backbone phosphates is motif Ib. In two of our DEAD-domain crystal structures, anions from the crystallization buffers are bound to motif Ib (a sulfate in DDX5, and a phosphate in DDX47) highlighting the ability of this motif to bind polyanions.
Our crystal structures of both DEAD-domains and helicase domains in isolation reveal that the RNA binding site on each domain is in a conformation that is incompetent to bind RNA substrate. In the free helicase domain structures, motif V, an important RNA backbone interaction site, is in a binding-incompetent conformation. In the closed state, an RNA binding-competent conformation is stabilized by the interaction of the conserved arginine of motif V with the C-terminal aspartic acid of the DEAD-motif (Figure 5C). In all single DEAD-domain structures, α-helix 8 has adopted a position that would block the RNA binding site. By contrast, upon cleft closure in the two-domain ATP analog and RNA complexes, α-helix 8 has moved out of the RNA binding site (Figure 5).
Thus, superposition of single DEAD-domain structures onto the closed state structures of DDX19 and eIF4AIII suggests involvement of α-helix 8 in the formation of a competent RNA binding site. How is α-helix 8 displaced to allow access to the RNA substrate binding site? No direct interaction between α-helix 8 and the RNA has been observed; thus displacement of α-helix 8 by the RNA substrate itself seems unlikely. Also, binding of ATP itself cannot cause α-helix 8 to rotate out of the RNA site: The DEAD-motif is the only link between the nucleotide and α-helix 8, but the state of nucleotide hydrolysis does not influence the conformation of the DEAD-motif (motif II; Figures 1A, 3A).
Instead, we propose direct involvement of the helicase domain in the activation of the RNA binding site on the DEAD-domain: In the complex structures, the conserved arginine of motif V in the helicase domain forms a salt bridge with the C-terminal aspartic acid of the DEAD-motif, which is also the terminal residue of α-helix 8 (Figure 5C, 5D). This interaction stabilizes a conformation where α-helix 8 is rotated out of the RNA binding site (Figure 5D). We propose that ATP binding primes the helicases for RNA substrate binding by bringing the domains together to allow motif V to push α-helix 8 out of the RNA site on the DEAD-domain. RNA binding to the DEAD-domain then completes cleft closure to allow formation of an active ATPase site (Figure 6).
This model of cleft closure and helicase activation through regulation of α-helix 8 can reconcile published data. Moreover, it can explain how substrate release in the post-hydrolysis state is achieved. DEAD-box helicases typically bind ADP with higher affinity than ATP, and binding of ATP and RNA is cooperative. Thus, the binding energy of the RNA-protein interaction likely stabilizes a strained conformation that is competent for ATP hydrolysis. Conversely, relief of this strain upon ATP hydrolysis and phosphate release likely drives RNA substrate remodeling. According to our comparative structural analysis, ATP hydrolysis and phosphate release would allow α-helix 8 to move back into its original position, releasing the RNA substrate and switching the RNA site on the DEAD-domain back to a binding-incompetent state.
DExH-box RNA helicases differ in some aspects from the DEAD-motif containing helicases. The hepatitis C virus DExH-box helicase NS3 binds RNA in the absence of ATP. The DExH helicase NPH-II unwinds RNA in a processive fashion and thus stays bound to the RNA after each unwinding step. Our model for the role of α-helix 8 in cleft closure of DEAD-box proteins is also consistent with these properties of DExH-box RNA helicases. Whereas α-helix 8 is conserved in all DEAD-box proteins, it is missing in the DExH-box proteins. Moreover, the DEAD-motif aspartic acid side chain that mediates opening of the RNA binding site (Figure 5) is replaced by the histidine of the DExH-motif. Thus, in the absence of an α-helix 8 that could block the RNA site, this terminal aspartic acid is apparently redundant, and the histidine that substitutes for it fulfills a different function. We conclude that DEAD- and DExH-box helicases differ significantly in the coupling of the RNA binding event to the conformational cycle of the two RecA domains.
All proteins were expressed in Escherichia coli as N-terminally hexahistidine tagged fusion proteins, and purified by nickel affinity chromatography and gel filtration. Proteins were crystallized in sitting drops at 4°C or 20°C. X-ray diffraction data were collected at the APS (Chicago, USA), the BESSY (Berlin, Germany), the Diamond (Oxfordshire, UK), the ESRF (Grenoble, France), and the MaxLab (Lund, Sweden) synchrotron radiation facilities. Data were indexed and integrated using XDS, MOSFLM, or DENZO, and scaled using XSCALE, SCALA, or SCALEPACK. Structures were solved by molecular replacement using PHASER or MOLREP, and refined using REFMAC. Refinement rounds were complemented with manual rebuilding using COOT.
The coordinates have been deposited in the Protein Data Bank with accession codes 2G9N, 3BOR, 3FE2, 2PL3, 3LY5, 3B7G, 2RB4, 2P6N, 3BER, 3DKP, and 3IUY.
We thank the beamline staff at the APS, BESSY, Diamond, ESRF, and MaxLab synchrotron radiation facilities for excellent support. We would also like to acknowledge our colleagues at the Structural Genomics Consortium.
Competing Interests: Clarifying statements: 1. The Structural Genomics Consortium (SGC) is a not-for-profit organization that receives funding from a funder consortium that includes commercial sources (GlaxoSmithKline and Merck & Co., Inc.). This circumstance does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials. The SGC and its scientists are committed to making their research outputs (materials and knowledge) available without restriction on use. This means that the SGC will promptly place its results in the public domain and will not agree to file for patent protection on any of its research outputs. It will seek the same commitment from any research collaborator. 2. One of the authors (LHS) is currently employed by a commercial company. As the role of this author in the current study was terminated before her affiliation with that company, this circumstance does not alter the authors' adherence to all the PLoS ONE policies on sharing data and materials.
Funding: The Structural Genomics Consortium is a registered charity (number 1097737) that receives funds from the Canadian Institutes for Health Research, the Canada Foundation for Innovation, Genome Canada through the Ontario Genomics Institute, GlaxoSmithKline, Karolinska Institutet, the Knut and Alice Wallenberg Foundation, the Ontario Innovation Trust, the Ontario Ministry for Research and Innovation, Merck & Co., Inc., the Novartis Research Foundation, the Swedish Agency for Innovation Systems, the Swedish Foundation for Strategic Research and the Wellcome Trust. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.