All myosin protein sequences have been derived by manually inspecting the corresponding DNA, either the published cDNA or genomic DNA, or the genomic DNA provided by sequencing centers. Published sequences contained errors in many cases, either from sequencing or from manual annotation, while automatic annotations provided by the sequencing centers resulted in mispredicted exons in almost all transcripts. For many sequences, the prediction of the correct exons was only possible with the help of the analysis of the homologs of related species. Thus, not only has the quantity of myosin data increased as more and more genomes have been analyzed but also the quality as all ambiguous regions could be resolved for those sequences for which data from a closely related organism are available. Therefore, mispredicted exons may be limited to a few orphan myosins.
For the phylogenetic analysis of the myosin motor domains we created a structure-guided manual sequence alignment whose quality is far beyond any computer-generated alignment. It is obvious that all secondary structure elements of the class-II myosin motor domain structure remain conserved in all myosins, even in the most divergent homologs. Sequence motifs that would not have been aligned at first glance were placed based on the analysis of their supposed three-dimensional counterparts, which always maintained the structural integrity of the respective region. Thus, strong sequence variation and sequence insertions were limited to loop regions. Based on the phylogenetic tree constructed from 1,984 myosin motor domains, 35 classes have been assigned (Figures and ; Additional data files 2 and 3). There are 149 myosins that still remain unclassified due to our conservative view on designating classes but it is anticipated that sequencing of further genomes will result in their classification and will substantially increase the existing number of classes. For generating the tree it does not matter whether long loop regions (for example, the 300 amino acid loop-1 of the Arthropoda Myo1C proteins) are included in the alignment or not (data not shown). So far, almost all orphan myosins belong to taxa that have not undergone large-scale comparative sequencing efforts. Only short sequence fragments have been found for 277 myosins. These sequences were excluded from the phylogenetic analysis but have been classified based on their similarity in the multiple sequence alignment. Nevertheless, these data are important for defining myosin diversity in as many organisms as possible.
The highest number of myosins in a single organism has been found in Brachydanio rerio (61 myosins grouped into 13 classes), while the broadest class distribution is expected for the Phytophthora species (25 myosins grouped into at least 15 classes). The high numbers of vertebrate myosin genes in general are due to several whole genome duplications that happened after the separation from the Craniata and Urochordata [27].
Our survey of the myosin gene family now allows the reconstruction of the tree of 328 eukaryotes (Figure ). The organisms of the major clades Fungi/Metazoa, Euglenozoa, Stramenopiles and Alveolata have distinct sets of myosin classes (except class-I), showing that horizontal gene transfer of myosins has not happened in later stages of eukaryotic evolution. However, we cannot yet exclude that horizontal gene transfer of myosins happened at the origin of eukaryotic evolution. Hence, only paralogs and orthologs have to be resolved. Figure represents a schematic reconstruction of the phylogenetic relationships of major taxa, derived from class-specific trees, together with the information on myosin class evolution and distribution. For example, Tetrahymena thermophila, Perkinsus marinus, Toxoplasma gondii, Plasmodium falciparum, and Babesia bovis have all been classified as Alveolata. However, the relation between Ciliophora (Tetrahymena thermophila), Perkinsea (Perkinsus marinus), and Apicomplexa (Toxoplasma gondii, Plasmodium falciparum, and Babesia bovis) has not been resolved yet. Tetrahymena thermophila does not share any myosin with the other Alveolata and should, therefore, have diverged before the other species. Perkinsus marinus shares two myosin classes with the Apicomplexa. Thus, they must have had a common ancestor. The Apicomplexa developed three further common classes, of which single classes have been lost by different species. The myosin class-specific trees show that the Coccidia, the Haemosporida, and the Piroplasmida form distinct lineages. However, their relation cannot be resolved further. This principle for reconstructing the tree has been applied to all species.
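The grouping principle described above, in which taxa that share myosin classes are inferred to share an ancestor, can be illustrated with a minimal sketch. The class labels below (`a` to `e`, `t1`, `t2`) are hypothetical placeholders and not real myosin class names; only the counts of shared classes follow the text (none for Tetrahymena thermophila, two between Perkinsus marinus and the Apicomplexa, three further classes within the Apicomplexa with single losses). The function name `shared_classes` is likewise an assumption made for illustration.

```python
from itertools import combinations

# Hypothetical class repertoires. The labels are placeholders, NOT real
# myosin class names; only the overlap pattern mirrors the text.
repertoires = {
    "Tetrahymena thermophila": {"t1", "t2"},          # shares nothing
    "Perkinsus marinus": {"a", "b"},                   # two classes shared
    "Toxoplasma gondii": {"a", "b", "c", "d", "e"},    # all Apicomplexa classes
    "Plasmodium falciparum": {"a", "b", "c", "d"},     # lost placeholder "e"
    "Babesia bovis": {"a", "b", "c", "e"},             # lost placeholder "d"
}


def shared_classes(reps):
    """Return the set of classes shared by every pair of species."""
    return {
        frozenset((x, y)): reps[x] & reps[y]
        for x, y in combinations(sorted(reps), 2)
    }


shared = shared_classes(repertoires)


def n_shared(a, b):
    """Number of classes that species a and b have in common."""
    return len(shared[frozenset((a, b))])


for pair, classes in sorted(shared.items(), key=lambda kv: -len(kv[1])):
    print(" / ".join(sorted(pair)), "->", len(classes), "shared")
```

Pairs with no shared classes are candidates for early divergence, while pairs with many shared classes are grouped closer together. A presence/absence comparison of this kind only complements the class-specific trees and does not replace them.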
Figure 8. Schematic drawing of the evolution of myosin diversity. The tree has been constructed based on the combination of the phylogenetic information obtained from the analysis of single myosin classes as well as the analysis of the class distribution of major ...
The class-I myosins show the widest taxonomic distribution and are devoid of the amino-terminal SH3-like domain; they are thus suggested to be the first myosins to have evolved (see below). Only two major lineages, the Viridiplantae and the Alveolata, do not contain class-I myosins (Figure ). The Alveolata have either lost the class-I myosin, or their class-I myosin has diverged so far that a common ancestor could not be reconstructed. The Apicomplexa developed several specific classes, while the Ciliophora myosins cannot be classified yet. The evolutionary history of the Euglenozoa and Stramenopiles cannot be further resolved because neither shares any further myosin classes with other species, and their taxonomic sampling is not dense enough for a more precise grouping.
The second myosin class to develop during the evolution of the Fungi and Metazoa kingdoms was class-V. The plants have developed two kingdom-specific classes. However, the domain organization of the plant-specific class-XI is similar to that of class-V, suggesting that both had a common ancestor. In contrast to the class-I myosins, the class-V and class-XI myosins have diverged so far that a common ancestry is not visible beyond their general domain organization. After separation of the plant lineage, the class-II myosins arose. The protists Entamoeba sp., Acanthamoeba castellanii, Naegleria gruberi, and Dictyostelium discoideum have closely related myosins, suggesting that they share a common ancestor that diverged shortly before the Fungi and Metazoa split. While the Entamoebidae have lost their class-V myosin, retaining only a class-I and a class-II myosin, the Acanthamoebidae, Dictyosteliida, and Heterolobosea have developed several additional specific myosins with unique domain organizations, in addition to the increase in the number of myosin genes through single gene or whole genome duplications. The Acanthamoebidae and Dictyosteliida already contain the combination of the myosin motor domain and the MyTH4 domain that is also widely found in the metazoan lineage. However, a lack of genomic data prevents the designation of a common myosin motor domain-MyTH4 containing ancestor. The fungi developed the class-XVII myosin that consists of a functionally restricted myosin motor domain fused with a highly conserved chitin synthetase [28]. While the Ascomycetes, Basidiomycetes, and Chytridiomycota have retained one member of each of the four myosin classes, the Zygomycetes Rhizopus arrhizus and Phycomyces blakesleeanus have undergone several single gene or whole genome duplications. The Saccharomycetes, Schizosaccharomycetes, and Microsporidia have lost their class-XVII myosin.
Two different models can be proposed for the further evolution of the Metazoa (Figures and ). In both models a considerable boost of myosin diversity happened during the early evolution of the Metazoa. The most reasonable model based on the myosin class distribution suggests an increase of the myosin diversity in three steps. After separation of the Fungi, the Metazoa developed four new classes, class-VI, class-VII, class-IX, and class-XVIII. These classes are shared by species of all Metazoa taxa sequenced so far, except the choanoflagellate Monosiga brevicollis, which does not contain class-IX and class-XVIII myosins. However, single species of the other taxa have also lost their members of these four classes; for example, the nematode Trichinella spiralis contains only a class-VII myosin, the Caenorhabditis species have lost their class-XVIII myosins, and the Drosophila species have lost their class-IX myosin. Our model places the choanoflagellates within the Coelomata, which invented the related class-X, class-XV, and class-XXII myosins. After separation of the choanoflagellates, the Bilateria gained another three classes, class-III, class-XIX, and class-XX. The Deuterostomia, in which we placed the Cnidaria, invented the class-XXVIII myosins and lost class-XXII myosins. Later in evolution, the Chordata lost class-XX myosins. This model proposes the continuous invention of new myosin classes over a relatively long time and the subsequent loss of single myosin classes by certain species and lineages. The placement of the Cnidaria within the Deuterostomia is surprising, as the Cnidaria are commonly considered to be a sister group of the Bilateria. However, the analysis of the Nematostella vectensis genome showed that, from a genomic perspective, Nematostella more closely resembles modern vertebrates than the fruit fly or nematodes [29], which is consistent with our analysis. But as long as genome sequences of further Cnidaria species are not available, this placement could also be the result of long branch attraction effects in the phylogenetic tree. Sequencing of further species of the lineages Choanoflagellida, Cnidaria, and Echinodermata, which are as yet represented only by single species, will provide a better picture of these taxa, as has been obtained for the nematodes, Arthropoda, and vertebrates, which show a wide distribution of the myosin content between their member species. For example, during the evolution of the Arthropoda, the Insecta lost the class-XIX myosin. Later in evolution the ancestor of all Drosophila species lost the class-III and class-IX myosins, and finally most Drosophila species lost the class-XXII myosin. Most of the lineages, like the Nematoda, Arthropoda and Vertebrata, have developed further branch-specific myosins. We propose that sequencing of organisms related to Strongylocentrotus purpuratus and Monosiga brevicollis will result in the classification of their orphan myosins and, thus, also of branch-specific myosins for these lineages.
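The bookkeeping behind such statements about absent classes can be sketched as a set difference between an assumed ancestral repertoire and the classes found in a taxon. The sketch below is restricted to the four early metazoan classes named in the text (class-VI, class-VII, class-IX, class-XVIII) and to the taxa for which the text states their presence or absence; the names `EARLY_METAZOAN`, `present`, and `missing_classes` are assumptions introduced for illustration.

```python
# The four classes stated in the text to have arisen after separation
# of the Fungi.
EARLY_METAZOAN = {"VI", "VII", "IX", "XVIII"}

# Presence of these four classes per taxon, as stated in the text.
present = {
    "Monosiga brevicollis": {"VI", "VII"},       # no class-IX, class-XVIII
    "Trichinella spiralis": {"VII"},             # only class-VII
    "Caenorhabditis": {"VI", "VII", "IX"},       # lost class-XVIII
    "Drosophila": {"VI", "VII", "XVIII"},        # lost class-IX
}


def missing_classes(ancestral, found):
    """Classes of the ancestral repertoire that are absent from a taxon."""
    return sorted(ancestral - found)


missing = {
    taxon: missing_classes(EARLY_METAZOAN, classes)
    for taxon, classes in present.items()
}

for taxon, classes in missing.items():
    print(f"{taxon}: missing class-" + ", class-".join(classes))
```

An absence found this way does not by itself distinguish a secondary loss from a class that was never present in the lineage. That distinction depends on where the taxon is placed in the tree, which is exactly where the two models differ.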
Figure 9. Schematic drawing of the evolution of myosin diversity in the Fungi/Metazoa lineage based on the 'accepted' taxonomy. The inventions and losses of the myosin classes have been plotted onto the 'accepted' phylogeny of the Eukaryotes available at NCBI. ...
In contrast, the metazoan tree based on classical taxa and nodes shows the invention of ten myosin classes in a very short time scale (Figure ). The evolution of the Metazoa would thus mainly be characterized by gene losses. While the Anthozoa Nematostella vectensis shares all its 12 myosin classes with vertebrates, the nematodes must have lost 6 of the 13 common Metazoa myosin classes. The nematode Trichinella spiralis has lost another three of the remaining classes, sharing only four classes with the other Metazoa. The Arthropoda must also have lost at least two of the common Metazoa myosin classes. This scenario, the invention of ten myosin classes during the evolution of only two taxa nodes and the subsequent major losses of myosin classes until the final speciation, seems very unlikely compared to the other model that proposes the invention of new myosin classes over a long period with the subsequent loss of single classes.
In both models, the tree of myosin diversity gives clear support for the classical Coelomata hypothesis that groups Arthropoda with Deuterostomia in a monophyletic group. The Nematoda sequenced so far lack four classes that the Arthropoda share with the vertebrates. It is very unlikely that the Nematoda have lost just these four classes and not one or more of the others. The class-specific phylogenetic trees show that the Nematoda myosins always separate before the Arthropoda-Deuterostomia split, except for the class-IX myosins, where the Nematoda and Arthropoda homologs group separately from the Deuterostomia homologs. These findings illustrate the advantage of analyzing the diversity of a large protein family in contrast to looking at single-gene phylogenies that have supported the monophyletic grouping of Nematoda and Arthropoda in some cases [30].
The comparative analysis of the phylogenetic relationship of the species in single myosin classes showed several incongruities. We hypothesized that the myosin genes of the corresponding organisms might have evolved asynchronously, as has been observed for a number of yeast genes [31]. From the phylogenetic tree we therefore determined the distances between pairs of sequences. To compensate for differences in general diversity within each class, all distances were normalized. Asynchronous evolution is visualized by the comparison of the deviation from the mean distances. As examples we analyzed the myosins of completely sequenced mammalian (Figure ) and fungal (Figure ) genomes. As expected, all primates are very closely related, with the chimpanzee generally closer to Homo sapiens than to the macaque. The myosin proteins from dog and cow are more closely related to those of the primates than to those from rodents. The opossum Monodelphis domestica is, in general, the most divergent mammal with respect to myosins, although in the case of Myo1E and Myo16, it is more closely related to the dog and the primates than to the rodents. The myosin proteins from cow show the most asynchronous phylogenetic relationship of the analyzed mammalian genomes. They either diverge before the split of the rodents and primates/dog, after this split, or form a monophyletic group with the corresponding dog orthologs. Hence, either the phylogenetic grouping of the cow cannot be resolved in general, or it cannot be resolved using the myosin proteins, or sequences from additional mammals have to be added to better resolve the tree.
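The normalization and deviation-from-mean comparison described above can be sketched as follows. The text does not state the exact normalization, so this sketch assumes that each class's pairwise distances are divided by that class's mean distance, and that the deviation is taken relative to the mean of each species pair across classes. The distance values are made-up numbers, not data from this study, and the function names `normalize` and `deviation_from_mean` are hypothetical.

```python
from statistics import mean

# Made-up pairwise tree distances per myosin (NOT data from the study).
distances = {
    "Myo1E": {
        ("human", "mouse"): 0.10,
        ("human", "cow"): 0.05,
        ("mouse", "cow"): 0.15,
    },
    "Myo16": {
        ("human", "mouse"): 0.40,
        ("human", "cow"): 0.40,
        ("mouse", "cow"): 0.40,
    },
}


def normalize(per_class):
    """Assumed normalization: divide by the mean distance of each class."""
    normalized = {}
    for cls, pairs in per_class.items():
        class_mean = mean(pairs.values())
        normalized[cls] = {pair: d / class_mean for pair, d in pairs.items()}
    return normalized


def deviation_from_mean(normalized):
    """Deviation of each class from the across-class mean of each pair."""
    pairs = list(next(iter(normalized.values())))
    pair_mean = {p: mean(normalized[c][p] for c in normalized) for p in pairs}
    return {
        c: {p: normalized[c][p] - pair_mean[p] for p in pairs}
        for c in normalized
    }


deviations = deviation_from_mean(normalize(distances))

for cls, row in deviations.items():
    print(cls, {"-".join(p): round(v, 2) for p, v in row.items()})
```

Under these assumptions, a class whose deviation pattern differs from that of the other classes for the same species pairs would point to asynchronous evolution. A uniform pattern across classes would indicate that the proteins diverged in step.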
Figure 10. Asynchronous evolution of mammalian myosin proteins. The matrix illustrates the normalized distances between corresponding sequences. Asynchronous evolution is observed if the pattern of the deviation from the mean is different. For example, the pattern ...
Figure 11. Asynchronous evolution of fungi myosin proteins. The matrix is shown in a similar way as in Figure 10. The consensus tree from the analysis of the single myosin class trees is shown. The obtained polytomic tree is the result of the asynchronous evolution ...
The fungal myosins show several distinct groups that are related to the established taxa. However, the analysis resolves some so far unrecognized relationships. The Saccharomycotina do not group with the Ascomycota in all myosin classes, but have evolved asynchronously. Based on our analysis of the myosins, the Saccharomycotina should be considered as an independent clade that evolved from Fungi, in parallel with the Ascomycota, the Basidiomycota, the Zygomycota, and the Schizosaccharomycetes. These clades developed very asynchronously so that their phylogeny cannot be resolved. In addition, the species in these clades have undergone considerable asynchronous development. Yarrowia lipolytica, which has been considered a yeast species, is more closely related to the Ascomycota than to the Saccharomycotina, based on both the phylogenetic relation of the respective myosin homologs and its myosin content, as it contains a class-XVII myosin that all Saccharomycotina have lost.
What did the very first myosin look like? At the beginning of eukaryotic evolution, the myosin motor domain developed (Figure ). During subsequent early evolution, an extensive process of domain fusions started, during which the carboxy-terminal IQ motif was added first. After duplication of this gene, the amino-terminal SH3-like domain was fused to the motor domain. These two domain organizations are shared by myosins of all species. The class-I myosins show the widest taxonomic distribution, are devoid of the amino-terminal SH3-like domain and, thus, are suggested to be the first distinct myosin class to have evolved. We propose that the most ancient myosin motor domain had a sequence very close to that of the class-I myosins.
Figure 12. Evolution of the first myosins. The first myosin, called ur-myosin, is expected to consist only of the myosin motor domain. By domain fusion it generated the IQ motif either directly carboxy-terminal to the motor domain (2), or after a gene duplication ...