Reviewer 1: Gáspár Jékely, Max Planck Institute for Developmental Biology, Tuebingen, Germany
The identification of the Nitrosoarchaeum tubulins by Yutin and Koonin is potentially interesting, and upon first read I tended to agree with their conclusion that the results are compatible with the origin of eukaryotic tubulins from Nitrosoarchaeum tubulins. However, upon closer inspection, I found a few potential caveats with this interpretation, and I would like to ask the authors to address these. Additionally, I would also like to suggest a few points that would need further clarification.
The authors write that "Eukaryotic tubulin sequences ... aligned with these proteins [Nitrosoarchaeum tubulins] over a region of approximately 300 amino acid residues" and that "the similarity between eukaryotic tubulins and FtsZ-like proteins ... covered regions of approximately 100 amino acid centered at the GTP-binding loops". Having performed the BLAST searches, I can confirm these results. However, a statement like this is slightly misleading, since it implies that Nitrosoarchaeum tubulins are related to eukaryotic tubulins across 300 residues and FtsZ only across 100 residues, and that is not true. If one performs PSI-BLAST searches, after three iterations it becomes apparent that the alignments with Nitrosoarchaeum tubulins and FtsZ proteins all cover about 61-66% of the query sequence (I used mouse alpha tubulin as query). This section should be clarified to indicate that the region of homology is not longer between the Nitrosoarchaeum sequences and tubulins than between FtsZ and tubulin.
Authors' response: The text in question pertains to a single iteration of BLAST search, and the observed differences are highlighted to emphasize that the similarity between artubulins and tubulins is greater than that between tubulins and FtsZ. There is no implication that the actual homologous domains are of different size. Indeed, it should be obvious that the GTPase domain is conserved as a whole.
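The coverage figure quoted in this exchange (alignments spanning about 61-66% of the query) can be made concrete with a small calculation. The following is a minimal sketch, not the reviewer's or the authors' procedure: the function name `query_coverage`, the query length of 450, and the interval coordinates are all hypothetical values chosen for illustration.

```python
def query_coverage(query_len, hsps):
    """Return the fraction of the query covered by the union of aligned intervals.

    hsps is a list of (start, end) query coordinates, 1-based and inclusive.
    Overlapping intervals are counted only once.
    """
    covered = 0
    last_end = 0  # highest query position already counted
    for start, end in sorted(hsps):
        start = max(start, last_end + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / query_len


# Hypothetical example: two overlapping aligned segments on a 450-residue query.
print(round(query_coverage(450, [(1, 200), (150, 290)]), 3))
```

Coverage computed this way describes only how much of the query is aligned. It says nothing about how similar the aligned residues are, which is the distinction the reviewer and the authors are drawing.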
I would like to point out a caveat about the rooting of the tree in Figure . The authors chose to root it on FtsZ proteins; however, with the same topology, the tree could also be rooted on the Nitrosoarchaeum sequences, and this would show the FtsZ clade as a sister to eukaryotic tubulins. Alternatively, the root could also be placed between eukaryotic sequences and FtsZ + Nitrosoarchaeum. These different rootings would dramatically affect the conclusions of the paper. The only justification for using FtsZ as a root, I assume, is that in BLAST searches the Nitrosoarchaeum sequences show higher similarity to the eukaryotic tubulins. This means that the phylogenetic tree does not constitute evidence independent of BLAST, and therefore does not confirm the close relationship between the Nitrosoarchaeum sequences and eukaryotic tubulins.
Authors' response: The rooting of the tree in Figure is justified not so much by sequence similarity as by the phyletic distribution of tubulins and FtsZ. Indeed, both Nitrosoarchaea, in addition to the artubulins, encode FtsZ proteins typical of other Thaumarchaeota. Rooting the tree by artubulins would imply an ancient duplication with subsequent massive loss of artubulin genes in all bacteria and archaea except for two Nitrosoarchaea, which is an extremely non-parsimonious scenario.
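The rooting question raised in this exchange can be illustrated with a toy enumeration. This sketch assumes a deliberate simplification: each of the three groups is collapsed to a single leaf, and the internal structure of the real tree is ignored. The function names `rootings` and `sister_of` are hypothetical.

```python
def rootings(clades):
    """Return the three rooted topologies of an unrooted three-clade tree.

    Each topology is ((ingroup_1, ingroup_2), outgroup).
    """
    a, b, c = clades
    return [((a, b), c), ((a, c), b), ((b, c), a)]


def sister_of(tree, clade):
    """Return the sister of `clade`, or None if `clade` is the outgroup
    (in which case it is sister to the whole remaining pair)."""
    (x, y), outgroup = tree
    if clade == outgroup:
        return None
    return y if clade == x else x


groups = ("eukaryotic tubulins", "artubulins", "FtsZ")
for tree in rootings(groups):
    print("root on", tree[1], "-> sister of eukaryotic tubulins:",
          sister_of(tree, "eukaryotic tubulins"))
```

The same unrooted topology gives artubulins as the sister group when the root is on FtsZ, FtsZ as the sister group when the root is on artubulins, and neither when the root is on the eukaryotic branch. This is why the choice of root needs a justification from outside the tree itself, such as the phyletic distribution argument given in the response.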
The good BLAST score is due to alignments that are longer between the eukaryotic tubulins and the Nitrosoarchaeum sequences. However, if one looks at the extended alignment at the C-terminal side, the similarity is really poor, and this similarity is not picked up by BLAST if only this portion is used. Could this extended alignment be due to residue composition or other bias (e.g. Nitrosoarchaeum sequences are less derived than FtsZ)? There is another disturbing observation. Namely, if one runs BLAST with the portion of eukaryotic tubulins that is represented in the alignment (e.g. El_Musmu58037275), the best hit is to Thermococcus FtsZ (2e-08), and not to Nitrosoarchaeum, which does not even show up until a PSI-BLAST iteration is performed. Since the argument hinges on the phylogenetic tree, the above considerations should inspire extra caution.
Authors' response: These concerns seem to stem from a certain misunderstanding of the way the BLAST algorithm works. The algorithm extends the initial hit to the extent that is statistically justified and halts when further extension leads to increased E-values. Therefore, a longer alignment recovered by BLAST is indeed evidence of greater sequence similarity. Spurious extension of an alignment due to compositional bias is possible but highly unlikely given the composition-based statistical corrections implemented in the current version of BLAST.
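The extension behavior described in this response can be sketched in a few lines. This is a simplified illustration under stated assumptions, not BLAST itself: it performs ungapped extension only, uses a raw match/mismatch score instead of a substitution matrix and E-values, and stops with a drop-off rule loosely modeled on the X-drop heuristic. The function name `extend_right`, the scoring values, and the sequences are hypothetical.

```python
def extend_right(query, subject, q_start, s_start,
                 match=1, mismatch=-1, x_drop=3):
    """Extend an ungapped hit to the right from (q_start, s_start).

    Extension stops when the running score falls more than x_drop below
    the best score seen so far. Returns (length, score) of the best
    extension found.
    """
    score = 0
    best_score = 0
    best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        same = query[q_start + i] == subject[s_start + i]
        score += match if same else mismatch
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        elif best_score - score > x_drop:
            break  # further extension is no longer paying for itself
    return best_len, best_score


# Similar region followed by unrelated sequence: extension stops early.
print(extend_right("AAAAAXXXXXXXX", "AAAAAYYYYYYYY", 0, 0))
# A single mismatch inside a similar region is bridged.
print(extend_right("AAAXAAAA", "AAAYAAAA", 0, 0))
```

In this toy version, a longer reported extension does reflect more matching positions downstream of the seed, which is the point the response makes. Whether a weakly similar tail is retained depends on the drop-off threshold, which is the kind of sensitivity the reviewer's question about the C-terminal region is probing.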
As a minor comment, I suggest that the authors discuss in more detail the phylogenetic position of the Prosthecobacter tubulins in their tree, in particular because others have suggested that Prosthecobacter tubulins may be ancestral to all eukaryotic tubulins. For example, Pilhofer et al. [5] speculate about a "vertical evolution" scenario where eukaryotic tubulins evolved from the bacterial ones. This may have been justified given the poor resolution of their trees, showing no clear relationship between Prosthecobacter tubulins and any of the eukaryotic tubulin families. The present paper shows a tree (the first one to my knowledge) that finds strong support for a clade uniting Prosthecobacter tubulins with alpha and beta tubulins. This strongly argues against the vertical evolution scenario. This would be important to discuss, given that these bacterial tubulins sometimes feature in arguments about a purported evolutionary connection between eukaryotes and Planctomycetes-Verrucomicrobia-Chlamydiae bacteria, e.g. [44].
The relationship of Prosthecobacter tubulins and alpha and beta tubulins is not resolved. Others concluded [6] that Prosthecobacter tubulins have mosaic sequences with intertwining features from both alpha and beta tubulin. This analysis and the tree shown in this paper are consistent with a scenario where Prosthecobacter tubulins arose from an early horizontal gene transfer from an ancient tubulin, prior to the duplication of alpha and beta. This may be interesting to point out. It would also be interesting to see a technical comment on why the position of Prosthecobacter is resolved in the present tree, but not in previous attempts. Was there a difference in methodology? Were the sequence evolution models used more realistic in this study?
Authors' response: Pinpointing the exact reasons behind differences in the results of phylogenetic analyses is very difficult. We are inclined to believe that the key factor is the more representative and balanced species sampling behind the trees presented here.
That said, we have investigated the phylogenetic position of bacterial tubulins in greater detail, with the following conclusions.
1. Placing the bacterial branches outside the eukaryotic tubulin subtree was firmly rejected by the same statistical test of tree topology that we used in the paper (AU < 0.01). Thus, we have reasonable confidence that Prosthecobacterial tubulins are not ancestors of eukaryotic tubulins.
2. Monophyly of bacterial tubulins remains a matter of considerable uncertainty. This clade is not strongly supported in the tree in Figure (bootstrap value of 71 at best). Furthermore, following the suggestion of Reviewer 2, we ran ProtTest to select the best substitution matrix, which in this case turned out to be the LG matrix. Two alternative trees were constructed using RAxML from the same alignment as used for the tree shown in Figure . In both trees, Prosthecobacterial tubulins A and B grouped, respectively, with eukaryotic tubulins α and β, but the respective branches were not supported (Additional File 4). In addition, eukaryotic and Prosthecobacterial tubulins were realigned without artubulins and FtsZ, in order to obtain an extended, higher quality alignment, and a tree was constructed using TreeFinder (Additional File 4). In this tree, bacterial tubulin A grouped with α/κ tubulins whereas bacterial tubulin B grouped with γ tubulins, exactly reproducing the topology in Figure 6 of Pilhofer et al. [5], but again with weak support.
Thus, we can only assert that Prosthecobacterial tubulins evolved from within the eukaryotic subtree, but the actual scenario for their evolution remains uncertain. However, even this conclusion is sufficient to dismiss Prosthecobacterial tubulins as an argument for an evolutionary connection between the PVC superphylum of bacteria and eukaryotes. This connection seems to be non-existent as argued in detail elsewhere.
Reviewer 2: J. Peter Gogarten, University of Connecticut
The manuscript by Natalya Yutin and Eugene Koonin reports an exciting discovery: the presence of tubulin-encoding genes in Thaumarchaeota. Tubulins are an important component of the eukaryotic cytoskeleton. The absence of closely related sequences ancestral to tubulins in prokaryotes was used to argue for a eukaryotic stem group that existed for a long period of time as a lineage distinct from archaea and bacteria. The argument is that a lot of substitutions would be needed to evolve tubulin from FtsZ, and that these many substitutions would be more compatible with a deep origin of the eukaryotes (see [47] for discussion). The finding of archaeal tubulins weakens this argument: if some archaea possess tubulins that branch outside the eukaryotic tubulins, but much closer to the tubulins than to FtsZ, then it is conceivable that the eukaryotes branch from inside the archaeal domain and inherited the tubulin from their archaeal ancestor.
Authors' response: Indeed, the discovery of artubulins seems to invalidate the use of the distant relationship between tubulins and FtsZ as an argument for a eukaryotic stem outside Archaea. Along with other recent observations, e.g. [20], these findings seem to be best compatible with the origin of eukaryotes from a highly complex, possibly transiently existing archaeon.
The archaeal tubulin sequences presented and discussed by Yutin and Koonin appear more similar to the eukaryotic tubulins than to FtsZ. This observation is confirmed by their phylogenetic analysis. The authors discuss their findings with appropriate caution, and I don't think that more sophisticated analyses will change the findings; nevertheless, the following two concerns seem worthy of consideration. First, the phylogenetic reconstruction does not appear to consider among-site rate variation (ASRV); that is, the choice of model used in phylogenetic reconstruction using maximum likelihood is not well described and justified. If one were to incorporate ASRV, I expect the deep branches of the phylogeny to become longer, because multiple substitutions are more efficiently corrected for, and the distinction between tubulins (including the archaeal tubulins) and FtsZs to become stronger, strengthening the authors' conclusion that these are indeed tubulins. Nevertheless, the choice of model should be discussed, and could be improved.
Authors' response: We applied ProtTest and found LG to be the optimal matrix; accordingly, two alternative trees were built using LG (Additional File 4). The topologies of these trees differ in many places from the topology of the tree in Figure but these differences do not affect the conclusions of this work (see also the response to Reviewer 1 regarding the bacterial tubulins).
Second, aligning divergent sequences is difficult, and the alignment itself can create a strong phylogenetic bias. This is a concern, because the archaeal tubulins and the FtsZ sequences are very divergent. How certain can we be that these archaeal sequences group outside the eukaryotic domain, as one would expect if archaeal tubulins were ancestral to the eukaryotic ones, and not inside the eukaryotic domain, as one would expect if the Thaumarchaeota acquired the tubulins from a eukaryote through horizontal gene transfer? An analysis that simultaneously considers phylogeny and alignment, such as SATé [48], might help to exclude the possibility of a eukaryote to archaeon transfer with more confidence. However, the best approach to address this uncertainty will be additional archaeal tubulin sequences, which hopefully will become available in the future.
Authors' response: SATé is beyond doubt an attractive phylogenetic approach but one that has not been sufficiently tested on phylogenies including distantly related, real sequences. We fully agree with the reviewer that the primary advance is likely to be brought about by further sampling of diverse archaea that is expected to reveal a greater diversity of artubulins.
Minor suggestions: In spelling the species names for Candidatus species, the convention is to italicize the word Candidatus, and to leave the suggested species name in normal font, e.g., Candidatus Nitrosoarchaeum koreensis. As no members of the genus have been cultivated, the Candidatus should also be used for the genus (e.g., the corresponding line in the abstract should read: "... genus Candidatus Nitrosoarchaeum that we denote artubulins. Phylogenetic ..." - also, the period was missing after artubulins).
1. Fournier GP, Dick AA, Williams D, Gogarten JP (2011) Evolution of the Archaea: emerging views on origins and phylogeny. Research in Microbiology 162: 92-98.
2. Liu K, Raghavan S, Nelesen S, Linder CR, Warnow T (2009) Rapid and accurate large-scale coestimation of sequence alignments and phylogenetic trees. Science 324: 1561-1564.
Authors' response: Corrected.