The origin of vertebrate retroviruses (Retroviridae) has yet to be thoroughly investigated, but because of their similarity and identical gag-pol (and env) genome structure, it is generally accepted that they evolved from Ty3/Gypsy LTR retroelements, the retrotransposons and retroviruses of plants, fungi and animals. These 2 groups of LTR retroelements code for 3 proteins that are rarely studied because of their high variability: the gag polyprotein, the protease and the GPY/F module. In relation to the 3 previously proposed Retroviridae classes I, II and III, investigation of these proteins uncovers important insights into the ancient history of Ty3/Gypsy and Retroviridae LTR retroelements.
We performed a comprehensive study of 120 non-redundant Ty3/Gypsy and Retroviridae LTR retroelements. A phylogenetic reconstruction based on the concatenated analysis of the gag and pol polyproteins shows a robust phylogenetic signal for the clustering of OTUs. Evaluating the gag and pol polyproteins separately, however, yields discordant information. While the pol signal supports the traditional perspective (2 monophyletic groups), the gag polyprotein describes an alternative scenario in which each Retroviridae class can be distantly related to one or more Ty3/Gypsy lineages. We investigated this evidence in more depth through comparative analyses of the gag polyprotein, the protease and the GPY/F module. Our results indicate that, contrary to the traditional monophyletic view of the origin of vertebrate retroviruses, Retroviridae class I is a molecular fossil, preserving features that were probably predominant among Ty3/Gypsy ancestors predating the split of plants, fungi and animals. In contrast, classes II and III maintain other phenotypes that emerged more recently during Ty3/Gypsy evolution.
The 3 Retroviridae classes I, II and III exhibit phenotypic differences that delineate a previously unreported network of relationships between Ty3/Gypsy and Retroviridae LTR retroelements. This new scenario reveals that the diversity of vertebrate retroviruses recurs polyphyletically within Ty3/Gypsy evolution, i.e. it is older than previously thought. The simplest hypothesis to explain this finding is that classes I, II and III trace back to at least 3 Ty3/Gypsy ancestors that emerged at different evolutionary times prior to the protostome-deuterostome divergence. We have called this "the three kings hypothesis" concerning the origin of vertebrate retroviruses.