A fundamental question in Se biology is the extent of functional exchangeability between the amino acids Sec and Cys, a measure of the distinct contribution of Sec to protein function. Sec is a nonstandard amino acid, and previous evolutionary studies on amino acid exchangeability have not considered this rare residue. To gain insight into this question, we have characterized the evolutionary forces shaping the exchange of Sec/Cys residues in vertebrates, a challenging inference given the small number of Sec sites in vertebrate proteomes. We believe this approach to be superior to physicochemical or experimental measures of exchangeability (Grantham 1974; Miyata et al. 1979; Yampolsky and Stoltzfus 2005) for the question at hand, as it distinguishes selection from mutational biases and accounts for the different fitness effects of using a Se-dependent amino acid in proteins. The recent characterization of vertebrate selenoproteomes is believed to be quite complete (Kryukov et al. 2003; Castellano et al. 2004; Shchedrina et al. 2007), and knowledge of the vertebrate phylogeny and mutation rates enables us, for the first time, to test current hypotheses on the role of Se and Sec in protein activity.
Our results are consistent not only with strong purifying selection acting on both Sec and Cys sites (as expected of functional sites), but also with a low level of functional exchangeability between the two residues over half a billion years of vertebrate evolution. These results underscore the unique role of Sec in protein activity. In interpreting these findings, it is worth noting that, as with any evolutionary inference, they depend on the null model adopted and the test statistic used. In our simulations of the neutral divergence of vertebrate selenoproteomes, the expected number of synonymous substitutions per synonymous site is used as a proxy for the neutral mutation rate (see Materials and Methods). Synonymous mutations in mammals and other vertebrates with small population sizes are commonly assumed to be neutral. Although many synonymous mutations are no doubt free from selection, selective pressures related to translational efficiency, mRNA stability, splicing control, and other processes suggest that weak purifying selection may act on an unknown fraction of synonymous sites (Chamary et al. 2006). Weak purifying selection would lead us to underestimate mutation rates in vertebrate genomes, but would not compromise the tests. On the contrary, a slower neutral rate of evolution would make our tests conservative in the inference of purifying selection, a statistical property shared by our divergence summary statistic (see Materials and Methods).
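The conservativeness argument can be illustrated numerically. The following is a minimal sketch, not the simulation procedure used in this study: it assumes a single branch, a Jukes-Cantor substitution model, and hypothetical values for the number of Sec sites, the neutral divergence, and the observed number of exchanges. It relies only on the fact that Sec is encoded by UGA and Cys by UGU/UGC, so a Sec to Cys exchange requires a third-position change to one of two specific bases.

```python
import math


def jc_prob_to_targets(d, n_targets=2):
    """Jukes-Cantor probability that a base has become one of `n_targets`
    specific other bases after d expected substitutions per site."""
    return n_targets * 0.25 * (1.0 - math.exp(-4.0 * d / 3.0))


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def constraint_p_value(observed, n_sites, d):
    """One-sided probability of seeing `observed` or fewer Sec to Cys
    exchanges at `n_sites` sites if they evolved neutrally with divergence d."""
    return binom_cdf(observed, n_sites, jc_prob_to_targets(d))


# Hypothetical inputs, chosen only for illustration (not values from the study).
N_SITES = 25            # assumed number of Sec sites
OBSERVED = 1            # assumed number of observed Sec to Cys exchanges
D_TRUE = 0.5            # assumed true neutral divergence (substitutions/site)
D_UNDERESTIMATED = 0.3  # assumed dS when some synonymous sites are constrained

p_true = constraint_p_value(OBSERVED, N_SITES, D_TRUE)
p_under = constraint_p_value(OBSERVED, N_SITES, D_UNDERESTIMATED)
print(f"p-value with true neutral rate:          {p_true:.4f}")
print(f"p-value with underestimated neutral rate: {p_under:.4f}")
```

Under these assumptions the underestimated rate yields the larger p-value, so a neutral rate that is too low makes it harder, not easier, to infer purifying selection.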
A more problematic bias would be the underestimation of the extent of mutation rate heterogeneity in a genome, which would result in an overestimation of sequence divergence. Such a biased neutral expectation could result in the false inference of constraint. Several lines of evidence, though, suggest that this is an unlikely explanation for our results. First, synonymous sites in selenoproteins and Cys-homolog genes between humans and chimpanzees are not unusually constrained, suggesting that mutations accumulate at a typical rate in these genes (Castellano S, data not shown). Second, selenoproteins and Cys-homolog genes are located in different chromosomal regions within and between species genomes. It is highly improbable that a large fraction of these genes consistently falls in regions of low mutation in most species, as would be required to mimic the pervasive purifying selection inferred above. Third, neutral simulations with increasing levels of mutation rate heterogeneity suggest that our tests are, to a large degree, robust to nonuniform mutation rates. Therefore, the evidence supports the view that vertebrate selenoproteomes are selectively constrained and that such evolutionary conservation can be of functional relevance.
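Why ignoring rate heterogeneity inflates the neutral expectation can be seen in a small calculation. This is a sketch under stated assumptions rather than the heterogeneity model used in our simulations: it assumes a Jukes-Cantor model, a hypothetical mean divergence, and site-specific rates drawn from a gamma distribution with mean 1 and shape `alpha` (smaller `alpha` means more heterogeneity). Because the probability of change is a concave function of the rate, averaging over variable rates gives a lower expected divergence than a single uniform rate with the same mean.

```python
import math


def p_change_uniform(d):
    """Jukes-Cantor probability of a change to one of two specific bases
    when every site evolves at the same rate (divergence d)."""
    return 0.5 * (1.0 - math.exp(-4.0 * d / 3.0))


def p_change_gamma(d, alpha):
    """Same probability averaged over gamma-distributed relative rates
    (mean 1, shape alpha), using the gamma moment-generating function."""
    return 0.5 * (1.0 - (1.0 + 4.0 * d / (3.0 * alpha)) ** (-alpha))


D_MEAN = 0.5  # hypothetical mean neutral divergence (substitutions/site)

print(f"uniform rate:      {p_change_uniform(D_MEAN):.4f}")
for alpha in (50.0, 5.0, 1.0, 0.5):  # decreasing alpha = more heterogeneity
    print(f"gamma, alpha={alpha:>4}: {p_change_gamma(D_MEAN, alpha):.4f}")
```

In this sketch the expected divergence falls as heterogeneity increases, so a null model that assumes too little heterogeneity expects too many neutral changes and could mistake a slowly mutating region for a constrained one.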
Accordingly, we discuss previously proposed selective pressures on Sec usage in the context of the inferred constraint:
- Nutrition is a prominent selective force in humans and other species (Haygood et al. 2007), and dietary adaptations are likely to have arisen primarily due to changes in nutrient availability. For example, iron deficiency in populations of European descent may have caused recent local positive selection on the HFE gene (iron absorption regulation), where an enhancing Cys to Tyr mutation has reached a relatively high frequency in only ~60 generations (Bamshad and Wooding 2003). Environmental changes and range expansions in populations may also have resulted in different nutritional pressures regarding the dietary intake of Se, a trace element that is unevenly distributed worldwide (Shamberger 1981; Levander 1987; Valentine 1997). Indeed, claims of selection related to Se availability in vertebrates and other eukaryotes have been recently published (Lobanov et al. 2007, 2008a, 2008b). If so, patterns of selenoproteome divergence and diversity should bear the footprint of past and present Se abundance or deficiency events. Although vertebrate species may have repeatedly encountered extreme Se environments in the last half billion years, our exchangeability test fails to support extensive positive selection targeting Sec/Cys sites. This result is consistent with low functional exchangeability between Sec and Cys amino acids and a minor role for environmental Se in driving the use of Sec in vertebrate enzymes. Furthermore, despite a considerable range of variation in dietary Se intake among human populations (Levander 1987), we find no evidence of variation in the use of Sec and Cys residues among populations worldwide (see supplementary table S5, Supplementary Material online), suggesting that Se availability has not altered the size of the human selenoproteome across regions of the world.
- Atmospheric O2 levels have played a key role in the evolution of vertebrates (Canfield et al. 2007). Leinfelder et al. (1988) have suggested that the highly oxidizable Sec (Jacob et al. 2003) is counterselected (substituted by Cys) in response to rising O2 levels, a hypothesis later embraced by Jukes (1990). Although this adaptive factor was suggested to be important 2.4 billion years ago, examples of molecular adaptations to variable O2 concentrations have been described in animals (Bargelloni et al. 1998). A great increase in O2 levels in the late Proterozoic (~600 Ma) preceded the appearance of the first animals, and wide variations in atmospheric O2 concentrations followed in the Phanerozoic (~550 Ma to the present). Vertebrates have evolved for half a billion years with a maximum O2 concentration around 300 Ma (~31% O2), a minimum about 200 Ma (~13% O2), followed by a steady rise to present times (21% O2) (Berner 2006; Berner et al. 2007). O2 levels have been recently proposed to drive nonneutral evolution of eukaryotic selenoproteomes (Lobanov et al. 2007, 2008a, 2008b). The extensive constraint identified in Sec and Cys residues during vertebrate evolution is, however, in agreement with a limited role of O2 in shaping Sec usage, as broad fluctuations in selection intensity would have resulted in episodic positive selection, most likely in different genes in different lineages, leading to higher selenoproteome divergence. In agreement, no significant negative correlation between O2 levels and selenoproteome sizes during the Phanerozoic was found (Results and supplementary table S4, Supplementary Material online). However, the uncertainty of these estimates, particularly of divergence times between lineages, and the small number of selenoproteomes tested make this lack of correlation, at most, suggestive. Nevertheless, the observation that vertebrate selenoproteomes have remained similar in size, virtually unchanged in mammals, for hundreds of millions of years despite levels of atmospheric O2 exhibiting the greatest variability of any geological period is stronger evidence of a minor role of O2 concentrations in driving Sec use in vertebrates.
- Metabolic costs of amino acid biosynthesis and incorporation into proteins are usually overlooked selective pressures (Akashi and Gojobori 2002). Sec is an expensive residue due to its complex biosynthetic pathway (Xu et al. 2007) and its elaborate and inefficient cotranslational insertion into proteins (Berry et al. 1992; Driscoll and Copeland 2003; Mehta et al. 2004). However, the strong purifying selection in both Cys and, more importantly, Sec sites suggests that the larger metabolic cost of Sec has no major detrimental effect on fitness. Rather than to anabolic fitness effects of Sec, the slightly higher number of Sec to Cys than Cys to Sec changes can be attributed to the requirement of a functional SECIS element in selenoproteins. This result provides some support, at least in vertebrates, for the pattern of Sec usage following Dollo's Principle (Farris 1977), in which the derived state (Sec) arose only once and reversals to Cys have occurred multiple times.
- Functional constraints on particular amino acid sites, although difficult to document, can explain in part the heterogeneity in protein rates of evolution. The extent of constraint in Sec and Cys sites across vertebrate selenoproteomes strongly suggests that some functional characteristics account for the low exchangeability between Sec and Cys residues. The fine molecular features behind the observed degree of constraint in each selenoprotein or Cys homolog may vary and are not fully clear, as the majority of these enzymes remain poorly characterized. Nevertheless, it is now apparent that the higher catalytic activity usually attributed to Sec-containing enzymes (Berry et al. 1992; Rocher et al. 1992; Maiorino et al. 1995; Zhong and Holmgren 2000) can only justify a fraction of the extensive conservation in Sec and Cys sites during vertebrate evolution. Similar catalytic activity between homologous Sec- and Cys-containing enzymes, most likely due to additional compensatory substitutions in the active site of Cys enzymes, has been recently reported (Gromer et al. 2003; Kim and Gladyshev 2005; Shchedrina et al. 2007). A broader range of substrates and pH in which selenoenzyme activity is possible (Gromer et al. 2003) or different catalytic mechanisms between Sec- and Cys-containing enzymes (Kim and Gladyshev 2005) may account for the constraint and the deleterious effect of Sec/Cys replacements inferred here. A more complex view of Sec in protein activity is emerging, and other biochemical and functional differences with fitness consequences may apply to the majority of uncharacterized selenoenzymes. Hence, to the question posed by Johansson et al. (2005) of whether every reaction catalyzed by Sec can be supported by Cys, the evolutionary analysis of all Sec and Cys residues in vertebrate proteomes provides a negative answer.
Overall, our results support and extend to the protein, organismal, and population level the characterized physicochemical differences between Se and S (Stadtman 1996).
We have derived a global measure of functional exchangeability across vertebrate selenoproteins and selenoproteomes and provided the first evolutionary assessment of several selective pressures proposed to drive Sec use in proteins. The low exchangeability between Sec and Cys residues is better explained by strong natural selection due to Sec/Cys functional differences and, at best, a moderate role of environmental and metabolic forces, suggesting caution in the interpretation of evolutionary trends in Sec usage as ecological adaptations. Although our results only apply to the vertebrate clade, we feel that common claims of ecological adaptations in the Se field may be premature. Despite the difficulties and uncertainties associated with any molecular inference of the past, different selective factors leave different signatures of selection, and these adaptive hypotheses can be examined through established evolutionary principles. Strong evidence for selection is most needed for genes of plausible ecological importance, like selenoproteins, as apparent selective factors may discourage considering alternatives to environmental adaptations (Gould and Lewontin 1979; Mitchell-Olds et al. 2007). Furthermore, natural selection is just one of several evolutionary mechanisms responsible for differences at the molecular level (Lynch 2007) and, despite typical assumptions in Se biology regarding the role of natural selection, no Sec to Cys or Cys to Sec substitution has yet been shown to be adaptive. Whether nonneutral evolutionary processes are responsible for some of these amino acid replacements is unknown. Similarly, whether adaptation to local Se levels or other selective factors has driven the evolution of selenoprotein expression, Se intake, metabolism, or transport has not been addressed. These are open questions in Se biology.
A better understanding of the selenoproteomes and neutral evolutionary patterns in other taxa will be necessary to fully assess the generality of our conclusions. For example, the recent identification in the Drosophila clade of the first animal without selenoproteins is remarkable (Drosophila 12 Genomes Consortium 2007). Although all other known Drosophila species have three selenoproteins, Drosophila willistoni has none. Indeed, insects seem to have a higher number of Sec/Cys exchanges in proteins than vertebrates (Chapple and Guigó 2008; Lobanov et al. 2008a). The evolutionary forces and selective pressures, if any, driving these replacements are still unclear. Beyond the Sec residue, the evolutionary forces targeting selenoprotein genes as a whole are also poorly known. A notable exception is the Glutathione peroxidase 1 gene, which may have been under adaptive evolution in recent human history (Foster et al. 2006). In any case, if the results obtained here are representative of more divergent species, the firm conclusion is that Sec plays a unique role in protein activity and evolution. Overall, Sec and Cys residues may be less functionally exchangeable than usually thought and, if some instances of Sec/Cys substitutions have been adaptive in vertebrates or other taxa, the distinct biochemical properties of Sec, rather than the geographical distribution of Se, global O2 levels, or metabolic cost, may have played a major role in the evolution of selenoproteomes.