One of the earliest descriptions of an RNA helicase was the report that incubation of globin mRNA with the translation initiation factor eIF4A and ATP changed the susceptibility of the mRNA to nucleases (13). Thus, eIF4A altered the structure of the mRNA in such a way that the RNase digestion pattern changed. This change depended on a source of energy in the form of ATP. The translation initiation factor eIF4A could therefore be considered a helicase that melts (local) secondary structures and makes the RNA accessible to nucleases. Since then, many RNA helicases involved in a variety of cellular processes have been described.
In 1988, Gorbalenya et al. (14) defined a group of NTPases and showed that they shared several common sequence elements. This analysis, together with the description of a number of proteins involved in RNA metabolism (p68, SrmB, MSS116, vasa, PL10, mammalian eIF4A, yeast eIF4A), led to the birth of the DEAD-box protein family, defined on the basis of the eIF4A sequence (15). Today, the alignment of all annotated sequences in SwissProt from all species reveals nine conserved sequence motifs with very little variation (15, 16). The simultaneous presence of these motifs is a criterion for inclusion of a protein within the family, although enzymatic activity has been demonstrated for only a limited number of members. Motif II (or Walker B motif) contains the amino acids D-E-A-D, which gave the family its name. This motif, together with motif I (or Walker A motif), the Q-motif and motif VI, is required for ATP binding and hydrolysis (16–19). Motifs Ia, Ib, III, IV and V are less well characterized but may be involved in interaction with RNA (20) and in intramolecular rearrangements necessary for remodeling activity.
Proteins related to eIF4A in sequence can be found in all eukaryotic cells and in most eubacteria and archaebacteria. The genome of the yeast Saccharomyces cerevisiae encodes 25 DEAD-box proteins (21, 22). Interestingly, it has two genes (TIF1 and TIF2) encoding exactly the same eIF4A protein, and it encodes two related proteins, Ded1 and Dbp1. The deletion of DED1 is lethal, whereas the deletion of DBP1 is not lethal under normal laboratory conditions. However, overexpression of Dbp1 can suppress the lethal deletion of DED1 (23), indicating (but not proving) a functional redundancy. A comparison with another fungal species, Ashbya gossypii, which is considered to be the free-living eukaryote with the smallest genome (24), reveals all the DEAD-box proteins found in S.cerevisiae, with the exception of Dbp1 and Prp28 (involved in pre-mRNA splicing, see below), and with only one eIF4A copy. Thus, the DEAD-box proteins of A.gossypii could represent the minimal set of such proteins required for a free-living eukaryote.
In multicellular eukaryotes, several additional DEAD-box proteins can be found. A search in the human genome revealed 38 DEAD-box proteins, which can tentatively be classified into 32 subfamilies. These subfamilies have been defined by iterative BLAST searches against the SwissProt/trEMBL databases, using all human DEAD-box proteins. Approximately 250 best-scoring sequences from each BLAST search were then used for a ClustalW analysis to identify related sequences. In some cases, where two human or two yeast proteins clustered together, the members of the putative subfamily from other model organisms were analyzed further to determine whether there were one or two proteins within this subfamily. If other model organisms had only one protein, the subfamily was defined as such. However, if most model organisms also had two representatives within the putative subfamily, the subfamily was divided into two. An example is the separation of the Ddx3/Ded1 and Vasa subfamilies. Drosophila and other multicellular eukaryotes have two or more DEAD-box proteins related to Ded1 or Vasa. However, with the exception of the yeast S.cerevisiae, unicellular eukaryotes have only one of these proteins (25), and therefore these proteins have been divided into two subfamilies. Another example is the subfamily of proteins homologous to the yeast Dbp5 protein. In the human genome, three proteins, Ddx19A, Ddx19B and Ddx25, are very similar to Dbp5 and are therefore included in the same subfamily. Clearly, this definition of subfamilies is somewhat arbitrary and should be regarded as a working tool to compare proteins and predict functions. In some cases, cross-species complementation could be demonstrated (26, 27), but in any case experiments are needed to characterize these subfamilies further. The Ddx7 (28) protein has no homologs in other mammals, and a tblast search against the human genome does not report any significant similarity. It is therefore excluded from the list presented here. According to the criteria defined above, 11 human DEAD-box proteins have no direct or obvious counterpart in yeast (Ddx1, Ddx4/vasa, Ddx20/DP103, Ddx21/RNA helicase Gu-alpha, Ddx28, Ddx50/RNA helicase Gu-beta, Ddx41/abstrakt, Ddx42, Ddx43, Ddx53, Ddx59). Although it may be expected that the human genome contains more DEAD-box proteins than the simple budding yeast, it may seem surprising that three DEAD-box proteins present in yeast (Dbp3, Mss116, Mrh4) have no obvious counterpart in humans. The DEAD-box proteins Mss116 and Mrh4 have been shown to be required for gene expression in yeast mitochondria (29, 30). It is tempting to speculate that these proteins are simply not required in human mitochondria, because the structural organization of human mitochondrial genes differs from that of yeast mitochondrial genes, which harbor many introns. In contrast, Ddx28 may be involved in gene expression in human mitochondria, insofar as it shows nuclear and mitochondrial localization (31, 32). The yeast Dbp3 protein is involved in ribosome biogenesis and is one of the rare DEAD-box proteins that are not essential for growth under normal laboratory conditions (33). In contrast to eukaryotes, bacterial genomes encode far fewer DEAD-box proteins, and some bacterial species seem not to encode DEAD-box proteins at all (5, 8). Today, searches in SwissProt reveal ~205 annotated sequences and >700 different entries in SwissProt and trEMBL. Based on the activity of eIF4A and on the sequence alignments, it is thought that the members of the DEAD-box family have similar biochemical activities.
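The rule used above to decide whether a putative subfamily should be kept whole or divided into two can be illustrated with a minimal sketch. It assumes that clustering has already been performed and that each cluster is available as a list of (species, protein) pairs. The function name `should_split`, the placeholder species and protein labels, and the reading of "most model organisms" as a strict majority are assumptions made for illustration only; they are not taken from the analysis described here.

```python
from collections import Counter

def should_split(members, reference_species=("human", "yeast")):
    """Decide whether a putative subfamily should be divided into two.

    members: list of (species, protein) pairs that clustered together.
    reference_species: the species whose duplicated proteins raised the
    question; they are ignored when counting (an assumption of this sketch).
    """
    # Count how many proteins each species contributes to the cluster.
    counts = Counter(species for species, _ in members)
    # Only the other model organisms inform the decision.
    others = {sp: n for sp, n in counts.items() if sp not in reference_species}
    if not others:
        return False  # no outside evidence: keep as one subfamily
    # "Most" is read here as a strict majority of the other organisms
    # having two or more representatives (hypothetical threshold).
    with_two = sum(1 for n in others.values() if n >= 2)
    return with_two > len(others) / 2

# Placeholder data, not real annotations.
single_copy_cluster = [
    ("human", "H1"), ("human", "H2"),
    ("org_a", "A1"), ("org_b", "B1"), ("org_c", "C1"),
]
two_copy_cluster = [
    ("human", "H1"), ("human", "H2"),
    ("org_a", "A1"), ("org_a", "A2"),
    ("org_b", "B1"), ("org_b", "B2"),
    ("org_c", "C1"),
]

print(should_split(single_copy_cluster))
print(should_split(two_copy_cluster))
```

In the first placeholder cluster every other organism contributes a single protein, so the subfamily is kept as one; in the second, two of the three other organisms contribute two proteins, so the sketch would divide it. A real analysis would still require inspection of the alignment and, as noted above, experimental characterization.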
| Table 1. A tentative assignment of yeast and human DEAD-box protein subfamilies |