The caleosin genes encode proteins with a single conserved EF-hand calcium-binding domain and comprise small gene families found in a wide range of plant species. These proteins may be involved in many cellular and biological processes closely coupled to the synthesis, degradation, or stability of oil bodies. Although this protein family has been studied in Arabidopsis and other species, the evolution of the caleosin gene family in plants remains poorly understood.
In this study, comparative genomic analysis was performed to investigate the phylogenetic relationships, evolutionary history, functional divergence, positive selection, and coevolution of caleosins. First, 84 caleosin genes were identified in 15 species representing five main lineages. Phylogenetic analysis placed these caleosins into five distinct subfamilies (sub I–V), including two subfamilies that had not been identified previously. Among these subfamilies, sub II coincided with the distinct P-caleosin isoform recently identified in the pollen oil bodies of lily. Caleosin genes from the same lineage tended to cluster together in the phylogenetic tree. A special motif was found to be related to the classification of caleosins, which may have resulted from a deletion in sub I and sub III that occurred after the evolutionary divergence of monocot and dicot species. Additionally, several segmentally and tandemly duplicated gene pairs were identified in seven species, and further analysis revealed that caleosins of different species did not share a common expansion model. The age of each duplicated pair was calculated, and most ages were consistent with the timing of genome-wide duplication events in the respective species. Functional divergence analysis showed that changes in functional constraints have occurred between subfamilies I/IV, II/IV, and II/V, and several amino acid sites critical to this functional divergence were identified. Additional analyses revealed that caleosins were under positive selection during evolution, and seven candidate amino acid sites for positive selection (70R, 74G, 88L, 89G, 100K, 106A, 107S) were identified. Interestingly, the critical amino acid residues for functional divergence and positive selection were mainly located in the C-terminal domain. Finally, three groups of coevolved amino acid sites were identified. Among these coevolved sites, seven from group 2 were located in the functionally crucial Ca2+-binding region.
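The abstract does not state how the duplication ages were calculated. A common approach, assumed here purely for illustration, converts the synonymous substitution rate (Ks) of a duplicated pair into an approximate age using T = Ks / (2λ), where λ is the number of synonymous substitutions per site per year. The function name, the example Ks, and the example λ below are hypothetical and are not values taken from this study.

```python
def duplication_age_mya(ks: float, subs_per_site_per_year: float) -> float:
    """Approximate age, in million years, of a duplicated gene pair.

    Uses the molecular-clock relation T = Ks / (2 * lambda). This is an
    assumed method; the abstract does not specify the one actually used.
    """
    if ks < 0:
        raise ValueError("Ks must be non-negative")
    if subs_per_site_per_year <= 0:
        raise ValueError("substitution rate must be positive")
    years = ks / (2.0 * subs_per_site_per_year)
    return years / 1e6  # convert years to million years


if __name__ == "__main__":
    # Hypothetical inputs chosen only to show the arithmetic.
    example_ks = 0.3
    example_rate = 1.5e-8  # assumed synonymous substitutions per site per year
    age = duplication_age_mya(example_ks, example_rate)
    print(f"Estimated duplication age: {age:.1f} million years")
```

Because λ differs among plant lineages, the same Ks maps to different ages in different species, so each estimate should be compared only with the genome-wide duplication events of its own species.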
In this study, the evolutionary and expansion patterns of the caleosin gene family were predicted, and a series of amino acid sites relevant to their functional divergence, adaptive evolution, and coevolution were identified. These findings provide data to facilitate further functional analysis of caleosin gene families in the plant lineage.