Using probabilistic models and advanced tree building methods, we have been able to build the most comprehensive evolutionary tree of the Rab family that has been reported to date. In our analysis, we defined the Rab family as being distinct from the Ran family, and thus excluded some proteins which have been referred to as Rab-like but which are more likely to be 'Ran-like' or have roles distinct from Rabs. These include RabL2, RabL3, and RabL5.
Our findings confirm and extend the evidence that the LECA had a large number of Rabs [16]. If our size estimate of the LECA Rab repertoire is incorrect, it will, if anything, prove to be an underestimate once more genome sequences become available from non-metazoan organisms, in particular protozoa. Indeed, after our analysis was completed, Elias et al. reported an analysis of LECA Rabs that used a different method to identify evolutionary relationships, applied to only 55 species, and concluded that the LECA could have contained 23 Rabs. Although this 'ScrollSaw' method was not validated with well-established protein families, and found only two of the six supergroups we have identified, there is nonetheless a reassuring overlap of 19 Rabs with our LECA repertoire (although Elias et al. used non-standard names for Rab7L1 (Rab32B) and RabX1 (Rab50)). The one extra Rab in our proposed set is Rab29, which Elias et al. did not distinguish from Rab7L1, possibly because they used a smaller genome set. As discussed above, we felt we could tentatively include Rab29 in the LECA repertoire, based not only on its presence in amoebozoa and metazoans, but also on there being an excavate sequence that scores above the strict cut-off with our Rab29 HMM. Elias et al. also suggest that the LECA contained four further Rabs that we did not include: Rab20, Rab34, RabL2 (which they renamed RTW), and a new family that they named RabTitan. Their assignment of Rab20 to the LECA rests on four sequences from outside the unikonts, but in each case the relevant sequence gave a higher score with our Rab24 model, and the organism concerned had no other Rab24 in its genome, suggesting that these sequences are really Rab24s. The inclusion by Elias et al. of Rab34 is based on a Rab34-like sequence that is present in one non-unikont genome (the excavate N. gruberi). With our Rab34 HMM, this sequence gave a score below the strict cut-off point, and so we felt it premature to include this protein. We also did not include RTW/RabL2, as it is well established to be an outlier from the Rab family that is more closely related to Ran, and so is unlikely to be a Rab [14]. RabTitan is based on a set of rather distantly related sequences from several kingdoms, but all lack C-terminal cysteines and seem only very distantly related to the Rabs. We tested a 'RabTitan' from a metazoan (Branchiostoma floridae), an amoebozoan (Dictyostelium discoideum) and an excavate (N. gruberi), and in each case it scored much lower with our general Rab HMM than did either Ran or Rho/Rac from the corresponding species. Determining which G proteins these RabTitans evolved from will probably require a phylogenetic analysis of the entire Ras superfamily.
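The decision rule applied in the comparisons above (assign a sequence to the subfamily whose HMM gives the highest score, and accept the assignment only if that score clears the strict cut-off) can be illustrated with a minimal sketch. The function name, the cut-off values, and the scores below are hypothetical and chosen only for illustration; they are not values from our analysis.

```python
from typing import Dict, Optional


def assign_subfamily(scores: Dict[str, float],
                     strict_cutoffs: Dict[str, float]) -> Optional[str]:
    """Return the best-scoring subfamily if it clears its strict cut-off.

    `scores` maps a subfamily HMM name to the score a sequence obtained
    with that model; `strict_cutoffs` maps each model to its threshold.
    Both are hypothetical inputs for this sketch.
    """
    if not scores:
        return None
    # Pick the model that gives the highest score for this sequence.
    best = max(scores, key=lambda name: scores[name])
    # Accept the assignment only above the strict cut-off for that model.
    if scores[best] >= strict_cutoffs[best]:
        return best
    return None


# Hypothetical scores, for illustration only.
cutoffs = {"Rab20": 150.0, "Rab24": 150.0, "Rab34": 150.0}

# A putative 'Rab20' that scores higher with the Rab24 model.
putative_rab20 = {"Rab20": 160.0, "Rab24": 185.0}
# A Rab34-like sequence that falls below the strict cut-off.
rab34_like = {"Rab34": 120.0}

print(assign_subfamily(putative_rab20, cutoffs))  # Rab24
print(assign_subfamily(rab34_like, cutoffs))      # None
```

Under these assumed numbers, the first sequence would be reassigned to Rab24 and the second would be left unassigned, mirroring the reasoning given for Rab20 and Rab34.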
Irrespective of whether any further Rabs will be added to the LECA repertoire, it now seems unambiguously established that the repertoire was large, probably consisting of at least 20 members, which strongly supports the notion that the LECA had a complex set of internal organelles and trafficking steps [16]. This is consistent with the 'complexity early' model of eukaryotic evolution that has been suggested by examining other families of proteins such as cytoskeletal components and motors [24]. The notion that a single-celled LECA needed such complexity in its internal membranes is consistent with the Rab repertoires of more than 50 members found in some extant single-celled protozoa [63].
The widespread loss of Rabs during the diversification of eukaryotes probably reflects in part the loss of particular structures or processes during specialization. These include cilia/flagella and a capacity for phagocytosis. Interestingly, a LECA Rab of unknown function, Rab28, has a phylogenetic distribution similar to that of cilia, and it groups with RabL4, a Rab known to be involved in cilia function. Thus we suggest that Rab28 is a candidate for a role in cilia formation or function. In addition, the requirements for membrane-trafficking events during cytokinesis are likely to have changed as this process has diversified greatly in different lineages [66].
It is possible that some cases of Rab loss reflect overlap in function between particular Rabs that belong to the same family, even if they had already diverged from a common ancestor in the LECA. For instance, Rab4 and Rab11 were both present in the LECA, but are part of the same fundamental group of Rabs and have been found to share some effectors [67]. This may have allowed Rab4 to be lost more readily than if Rab11 had not been present to take over some of its roles. Likewise, mammalian Rab5 shares the same binding site on some effectors with other LECA Rabs of its group, Rab21 and Rab22, and again the latter have often been lost during evolution [68]. Such 'cryptic' redundancy suggests that caution may be needed in interpreting the effects of deleting particular Rabs; that is, a failure to see a perturbation of a particular process need not imply that the deleted Rab is not normally involved in that process. This is more likely to be the case for Rabs that have duplicated more recently in evolution, and may explain the surprisingly mild phenotypes observed after loss of some mammalian Rabs such as Rab18 or the Rab3 family, as both have relatives that arose from duplications at the dawn of metazoan evolution [67].
Not only have some Rabs been lost, but new ones have also been gained by gene duplication and divergence. A large increase in the Rab repertoire occurred at the root of the metazoan lineage, which may be linked to multicellularity without a cell wall, and hence the opportunity for intercellular contacts and communications to be mediated by proteins directed to the cell surface. In most cases it seems likely that the duplications generated Rabs that carry out spatially or functionally related roles. However, there are one or two enigmatic exceptions; for instance, Rab35 is related to Rab1 and yet has a role in endocytosis at the plasma membrane rather than acting at the Golgi [70], while Rab2 emerged in a LECA group with Rab4, Rab11, and Rab14, and yet acts on the Golgi rather than on endosomes [72].
For the non-metazoan kingdoms, a lack of sufficient genome sequences restricts a deep understanding of Rab expansions, although plants seem to have expanded Rab11 [73]. The protozoans Paramecium, Trichomonas, and Entamoeba have greatly expanded their Rab repertoires, perhaps reflecting the complex endocytic/phagocytic routes in these organisms and their elaborate cilia or flagella, but understanding the history of these expansions will require more genome sequences from other excavates and chromalveolates [63].
A second round of Rab expansion occurred in vertebrates as a result of the two rounds of genome duplication that seem to have happened early in vertebrate evolution and that also generated paralogs for many other gene families [76]. It remains to be determined which of these new Rabs have been conserved because they had different functions or different expression patterns [16]. However, it is clear that the naming of these paralogs has not been consistent, with some being distinguished by letters (for example, Rab4a and Rab4b) and others by new numbers. For instance, Rab25 could be Rab11C, and Rab34 and Rab36 could be Rab34a and Rab34b. Adopting a consistent nomenclature for the vertebrate Rabs might avoid some confusion, and in particular the risk that potentially highly redundant paralogs are overlooked because they have different numbers. These anomalies and our suggested 'rational' names for human Rabs are shown in Figure .
Finally, it is interesting to compare the evolutionary patterns of Rabs with those of other components of membrane traffic. Known vesicle coat proteins (adaptor proteins (AP)-1 to AP-4, retromer) and multi-subunit tethering complexes (conserved oligomeric Golgi (COG), Golgi-associated retrograde protein (GARP), exocyst, homotypic fusion and vacuole protein sorting (HOPS) and transport protein particle (TRAPP)) have undergone little expansion since the LECA, indicating that their different subtypes correspond to the fundamental routes of traffic [78]. These protein families must be the core components of ancient machineries involved in the fundamental steps of vesicle budding and fusion, which diversified during evolution of the LECA to adapt to the needs of specialized compartments. The SNARE protein families show a post-LECA diversification pattern comparable with that of Rabs; for example, during the rise of metazoans, the development of vertebrates [82], and in angiosperms [83]. In addition, metazoan SNARE diversification is, like Rab8 expansion, associated with those SNAREs involved in transport steps to the plasma membrane. However, the Rab proteins show a much more extensive and dynamic gain-and-loss history, with SNARE proteins having a tendency to become essential 'housekeeping' genes. For example, S. cerevisiae has retained only 6 of the 20 LECA Rabs but 21 of the 22 LECA SNAREs.
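The contrast in retention can be expressed as simple fractions of the LECA repertoire. The short sketch below uses only the counts quoted above for S. cerevisiae; the variable names are ours and purely illustrative.

```python
# Counts quoted in the text for S. cerevisiae versus the inferred LECA set.
leca_repertoire = {"Rab": 20, "SNARE": 22}
retained_in_yeast = {"Rab": 6, "SNARE": 21}

# Fraction of each LECA family that is still present in yeast.
retention = {
    family: retained_in_yeast[family] / leca_repertoire[family]
    for family in leca_repertoire
}

for family, fraction in retention.items():
    print(f"{family}: {fraction:.0%} of the LECA repertoire retained")
# Rab: 30% of the LECA repertoire retained
# SNARE: 95% of the LECA repertoire retained
```

On these numbers, yeast has kept roughly a third of the ancestral Rabs but nearly all of the ancestral SNAREs.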
This greater diversity in Rab evolution may reflect the fact that it is difficult to duplicate a coat, a tethering complex, or a SNARE tetramer, as they are encoded by multiple genes. More importantly, it suggests that a particular Rab can be gained or lost without a concomitant gain or loss of any known member of the coat, SNARE, or tethering complex families. This is rather unexpected, as it implies that either these components are associated only with those Rabs that are 'indispensable', or that the Rabs only add functionality to existing processes, or, as seems likely in at least some cases, that new Rabs can add transport steps without the cell having to evolve new coats, tethering complexes, or SNAREs. This suggests that Rabs have directed the evolutionary plasticity of membrane traffic, and hence, by implication, that they encode much of its specificity. Given that the function of many Rabs, including some of those that were present in the LECA, is poorly understood, it seems certain that there is still much to learn about the involvement of Rabs in the organization of internal organelles and trafficking pathways.