Among the five groups of visual pigments in vertebrates, the rhodopsin type 2 (RH2) group shows the largest number of gene duplication events. We have isolated three intact and one nonfunctional RH2 opsin gene each from Northern lampfish (Stenobrachius leucopsarus) and scabbardfish (Lepidopus fitchi). Using the deduced amino acid sequences of these and other representative RH2 opsin genes in vertebrates, we have estimated the divergence times and evolutionary rates of amino acid substitution at various stages of RH2 opsin evolution. The results show that the duplications of the lampfish and scabbardfish RH2 opsins occurred ~60 and ~30 million years ago (Ma), respectively. The evolutionary rates of RH2 opsins in the early vertebrate ancestors were ~0.25 × 10⁻⁹/site/year, which increased to ~1 × 10⁻⁹ to 3 × 10⁻⁹/site/year in euteleost lineages and to ~0.3 × 10⁻⁹ to 0.5 × 10⁻⁹/site/year in coelacanth and tetrapods.
The fundamental mechanism of evolution of multigene families is gene duplication. Tandemly duplicated genes facilitate a further increase in the number of genes by unequal crossover. During the process of gene duplication, some genes may acquire new functions (Ohno 1970), but many others are also expected to become nonfunctional (Nei and Rooney 2005). Indeed, duplicated globin, sensory receptor, and many other genes reveal the dynamic processes of birth and death of duplicated genes in genomes (Nei 1987; Li 1997; Nei et al. 2008). The rates of sequence evolution can also vary considerably among duplicated genes (Li and Gojobori 1983; Li et al. 1985; Zhang 2006).
Visual pigments mediate the first step of vision, and each pigment consists of a chromophore (either 11-cis-retinal or 11-cis-3,4-dehydroretinal) and an opsin encoded by a specific opsin gene. Many vertebrate species have five major groups of visual pigments that have arisen through four gene duplication events: rhodopsin type 1 (RH1, rhodopsins), rhodopsin type 2 (RH2, RH1-like), short wavelength–sensitive type 1 (SWS1), short wavelength–sensitive type 2 (SWS2), and middle and long wavelength–sensitive (M/LWS) groups, having the evolutionary relationship of [(((RH1, RH2), SWS2), SWS1), M/LWS] (Yokoyama S and Yokoyama R 1996; Yokoyama 2000b; Ebrey and Koutalos 2001). The vertebrate ancestor already possessed all five types of opsins (Yokoyama S and Yokoyama R 1996). During vertebrate evolution, additional gene duplications have occurred in all five groups (Spady et al. 2006; Yokoyama et al. 2008), among which the RH2 group shows the largest number of gene duplication events (Spady et al. 2006).
The major function of a visual pigment can be characterized by its wavelength of maximal absorption (λmax). The λmax values of the currently known RH2 pigments range from 467 nm in gecko (Kojima et al. 1992) and zebrafish (Chinen et al. 2005) to 535 nm in a cichlid (Carleton et al. 2005). The adaptive λmax shift of the coelacanth RH2 pigment to the depth of 200 m was caused by E122Q and M207L (Yokoyama et al. 1999; Yokoyama et al. 2008). The blue-shifted λmax of the RH2 pigment in the nocturnal gecko was caused mostly by specific amino acid substitutions at four sites (Takenaka and Yokoyama 2007). Because the amount of green–red light is reduced significantly at twilight (McFarland and Munz 1975), the blue-shifted λmax of the gecko pigment seems to be adaptive. The variable λmaxs (467–505 nm) of the zebrafish RH2 pigments are based largely on the amino acids (E or Q) at site 122 (Chinen et al. 2005); however, the biological significance of these variable λmaxs is not immediately clear. If these λmax shifts were needed for organisms to adapt to their ecological environments, then the organisms had to wait for the necessary mutations to accumulate. Consequently, the speed of mutation accumulation is expected to significantly affect the efficiency of such phenotypic adaptations.
Here we have isolated four RH2 opsin genes each from Northern lampfish (Stenobrachius leucopsarus) and scabbardfish (Lepidopus fitchi); in each species, one RH2 gene has become nonfunctional. Considering these and other representative RH2 opsins in a wide range of vertebrate species, we shall evaluate two fundamental evolutionary statistics: the divergence times at all nodes and the rates of sequence evolution at all branches in the composite phylogenetic tree of RH2 opsins in vertebrates.
High molecular weight DNAs of Northern lampfish (S. leucopsarus) and scabbardfish (L. fitchi) were isolated from their body tissues using a standard phenol–chloroform extraction procedure (Yokoyama et al. 2008).
The nucleotide sequences between exons 3 and 4 of the lampfish RH2 genes were first cloned by the polymerase chain reaction (PCR) method using a degenerate forward primer (5′-TBTGYAARCCMATGGGNAGYTTYAAATT-3′) and a degenerate reverse primer (5′-GCYTTYTGRGTRGAHKCWGADTCYTGCTG-3′). The PCR products in the pBluescript SK(−) vector were sequenced using an LI-COR 4300 automated DNA sequencer (LI-COR, Lincoln, NE). To clone the remaining coding regions of the lampfish opsin genes, inverse PCRs were performed using an identical set of primers: 5′-CTGTCATGATCCAYTTYAGGCAGC-3′ (forward) and 5′-CCAGCTGAAASCAACTCCRGCTCC-3′ (reverse). The PCR products were cloned into the pBluescript SK(−) vector and sequenced.
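The degenerate positions in these primers follow the standard IUPAC nucleotide codes (for example, Y = C or T, N = any base). As a minimal illustrative sketch, not part of the original protocol, the number of distinct oligonucleotides each primer represents can be counted as follows; the function name `degeneracy` is hypothetical.

```python
from math import prod

# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences encoded by a degenerate primer.

    Hypothetical helper: multiplies the number of bases allowed
    at each position of the primer.
    """
    return prod(len(IUPAC[base]) for base in primer.upper())


# Primer sequences as given in the text (5' to 3').
FORWARD = "TBTGYAARCCMATGGGNAGYTTYAAATT"
REVERSE = "GCYTTYTGRGTRGAHKCWGADTCYTGCTG"
INVERSE_FORWARD = "CTGTCATGATCCAYTTYAGGCAGC"
INVERSE_REVERSE = "CCAGCTGAAASCAACTCCRGCTCC"

if __name__ == "__main__":
    for name, seq in [("forward", FORWARD), ("reverse", REVERSE),
                      ("inverse forward", INVERSE_FORWARD),
                      ("inverse reverse", INVERSE_REVERSE)]:
        print(name, degeneracy(seq))
```

By this count the degenerate forward and reverse primers correspond to 384 and 1,152 distinct sequences, respectively, whereas each inverse PCR primer contains only two twofold-degenerate positions.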
To clone the RH2 genes of the scabbardfish, a genomic DNA library was constructed with XhoI-digested λFIXII vector and Sau3AI partially digested scabbardfish genomic DNA (10–20 kb). Screening ~1 × 10⁶ recombinant plaques using the bovine RH1 opsin gene (Bos taurus, GenBank accession number M21606) as the probe, we obtained two clones that contained RH2 genes. One of them contained the entire coding region of an RH2 gene (referred to as scabbard 2A). Using the segment between exons 1 and 5 of this gene, including four introns, as the probe, we rescreened the same genomic DNA library and obtained six additional clones. From their restriction maps and DNA sequences, these eight clones were classified into four RH2 genes. All coding regions and four intron regions of the RH2 genes were then sequenced.
We considered 25 representative RH2 opsins from a wide range of species: lamprey 2 (Geotria australis: GenBank accession number AY366494), lampfish 2A (S. leucopsarus: GQ414753), lampfish 2B (GQ414754), lampfish 2D (GQ414756), scabbard 2A (L. fitchi: GQ414752), scabbard 2B (GQ421593), scabbard 2C (GQ421594), scabbard 2D (GQ421595), goldfish 2-1 (Carassius auratus: L11865), goldfish 2-2 (C. auratus: L11866), zebrafish 2-1 (Danio rerio: AB087805), zebrafish 2-2 (AB087806), zebrafish 2-3 (AB087807), zebrafish 2-4 (AB087808), medaka 2A (Oryzias latipes: AB223053), medaka 2B (AB223054), medaka 2C (AB223055), tilapia 2Aα (Oreochromis niloticus: DQ235683), tilapia 2Aβ (DQ235682), tilapia 2B (DQ235681), coelacanth 2 (Latimeria chalumnae: AF131258), chameleon 2 (Anolis carolinensis: AF134189), gecko 2 (Gekko gekko: M92035), pigeon 2 (Columba livia: AF149232), and chicken 2 (Gallus gallus: M92038). In the data analyses, we also considered six opsins of the RH1 group, which is most closely related to the RH2 group (Yokoyama 1997): zebrafish 1 (AF109368), coelacanth 1 (AF131253), frog 1 (Xenopus laevis: U23463), chameleon 1 (L31503), chicken 1 (D00702), and bovine 1 (M21606).
To construct a rooted phylogenetic tree of the RH2 opsins, we applied the Neighbor-Joining (NJ) method (Saitou and Nei 1987) to their nucleotide sequences at a total of 330 codons (positions 1–324 and 327–332) and deduced amino acid sequences of the total of 31 RH1 and RH2 genes. The NJ trees were further modified by incorporating more widely accepted species relationships based on other molecular data (Hedges and Kumar 2009) as well as morphological data (Carroll 1988; Nelson 1994), and the composite tree topology was obtained. The divergence times and the evolutionary rates of amino acid substitution were inferred by using a new model (see the section “A Model for the Duplication of RH2 Opsins”). The numbers of nucleotide and site-specific amino acid substitutions at all branches in the composite tree were also inferred by using PAML (Yang 2007).
Considering four sequences (A, B, X, and Y) at a time, we shall consider the numbers of amino acid substitutions per site per year (α, β, γ, and δ) and divergence times (T1, T2, and T) (fig. 1). In this model, we assume that 1) T1 and T2 are known and 2) the evolutionary rates γ1 at branch (M-N) and γ2 at branch (M-X) are both equal to γ.
Let drs be the number of amino acid substitutions per site between sequence r and sequence s (r, s = A, B, X, and Y). Then,
From these, we can obtain the following formulae:
To evaluate the standard errors of these parameters, let a, b, c, d, and e be the numbers of amino acid substitutions per site between N and A, between N and B, between M and N, between M and X, and between M and Y, respectively. Then, a = (dAB + dAX − dBX)/2, b = (dAB − dAX + dBX)/2, c = (−2dAB + dAX + dAY + dBX + dBY − 2dXY)/4, d = (dAX − dAY + dBX − dBY + 2dXY)/4, and e = (−dAX + dAY − dBX + dBY + 2dXY)/4. Expressing the parameters in equation (2) in terms of a, b, c, d, and e, we can obtain the standard errors of T, α, β, γ, and δ (Takezaki et al. 1995; Nei and Kumar 2000).
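The branch lengths a to e can be checked numerically. The sketch below is an illustration only: it assumes the four-sequence topology described in the text, with N the ancestor of A and B and M the node connecting N, X, and Y, and recovers the five branch lengths from the six pairwise distances. The function names and the dictionary keys are hypothetical.

```python
def pairwise_from_branches(a, b, c, d_mx, e):
    """Pairwise distances implied by additive branch lengths.

    a: N-A, b: N-B, c: M-N, d_mx: M-X, e: M-Y (substitutions per site).
    """
    return {
        "AB": a + b,
        "AX": a + c + d_mx,
        "AY": a + c + e,
        "BX": b + c + d_mx,
        "BY": b + c + e,
        "XY": d_mx + e,
    }


def branch_lengths(d):
    """Recover (a, b, c, d_mx, e) from the six pairwise distances."""
    a = (d["AB"] + d["AX"] - d["BX"]) / 2            # N to A
    b = (d["AB"] - d["AX"] + d["BX"]) / 2            # N to B
    c = (-2 * d["AB"] + d["AX"] + d["AY"]
         + d["BX"] + d["BY"] - 2 * d["XY"]) / 4      # M to N
    d_mx = (d["AX"] - d["AY"] + d["BX"]
            - d["BY"] + 2 * d["XY"]) / 4             # M to X
    e = (-d["AX"] + d["AY"] - d["BX"]
         + d["BY"] + 2 * d["XY"]) / 4                # M to Y
    return a, b, c, d_mx, e


if __name__ == "__main__":
    # Made-up branch lengths, used only to show the round trip.
    true_lengths = (0.01, 0.02, 0.03, 0.04, 0.05)
    distances = pairwise_from_branches(*true_lengths)
    print(branch_lengths(distances))
```

With perfectly additive distances the round trip returns the input branch lengths; with real data the estimates carry sampling error, which is why the standard errors are evaluated separately.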
Previously, using various molecular data, the critical divergence times of different vertebrate species have been obtained: 1) lamprey and other vertebrates—608 million years ago (Ma) (Hedges 2009); 2) fish and tetrapods—455 Ma (Hedges 2009); 3) zebrafish/goldfish and other euteleosts—130 Ma (Wittbrodt et al. 2002); 4) coelacanth and tetrapods—430 Ma (Hedges 2009); 5) reptiles and birds—275 Ma (Shedlock and Edwards 2009); 6) chameleon and gecko—200 Ma (Hedges and Vidal 2009); and 7) pigeon and chicken—105 Ma (van Tuinen 2009). Using some of these divergence times as T1 and T2, we can estimate a new divergence time (T) and the evolutionary rates (α, β, γ, and δ). Testing whether or not these T values agree with the previously known divergence times, we can check the validity of the estimation procedures of the evolutionary rates.
In the estimation, we can first evaluate T (T*), α (α*), and β (β*) for the most closely related sequences. We can repeat the procedure by replacing either sequence A or sequence B by another sequence that clusters next, and new T (T**), α (α**), and β (β**) can be evaluated. The evolutionary rate for the distant branch with a time span of T** – T* can then be evaluated by subtracting the d value for T* from that for T** and dividing by (T** – T*). Continuing this procedure, we can evaluate the divergence times and evolutionary rates at all branches in the composite evolutionary tree. The last step involving lamprey 2 can be determined simply by α = (dAB + dAX – dBX)/(2τ) and β = (dAB – dAX + dBX)/(2τ), where τ = 608 × 10⁶ years (Hedges 2009).
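The two calculations in this paragraph can be sketched in a few lines. This is one reading of the procedure, not code from the study: it assumes that the "d value" for a given divergence time is the product of the estimated rate and that time (the number of substitutions per site accumulated along the lineage), and all function names are hypothetical.

```python
TAU_LAMPREY = 608e6  # years; lamprey vs. other vertebrates (Hedges 2009)


def three_point_rates(d_ab, d_ax, d_bx, tau):
    """Rates for lineages A and B when their split is dated at tau.

    Implements alpha = (dAB + dAX - dBX) / (2 tau) and
    beta = (dAB - dAX + dBX) / (2 tau).
    """
    alpha = (d_ab + d_ax - d_bx) / (2 * tau)
    beta = (d_ab - d_ax + d_bx) / (2 * tau)
    return alpha, beta


def ancestral_segment_rate(rate_near, t_near, rate_far, t_far):
    """Rate on the branch segment between two nested nodes.

    Assumption: rate_near * t_near and rate_far * t_far are the
    substitutions per site from the nearer (T*) and the deeper (T**)
    node to the same tip; their difference, divided by the elapsed
    time, gives the rate on the deeper segment.
    """
    return (rate_far * t_far - rate_near * t_near) / (t_far - t_near)


if __name__ == "__main__":
    # Made-up values: 1e-9/site/year over the last 50 My and an
    # average of 1.5e-9/site/year over the last 100 My.
    print(ancestral_segment_rate(1.0e-9, 50e6, 1.5e-9, 100e6))
    # Made-up distances for the lamprey step.
    print(three_point_rates(0.30, 0.25, 0.35, TAU_LAMPREY))
```

In the made-up example the deeper segment must have evolved at about 2 × 10⁻⁹/site/year to raise the 100-My average to 1.5 × 10⁻⁹/site/year.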
After screening the scabbardfish genomic DNA library, we obtained a total of eight positive clones. Using their partial DNA sequences and restriction enzyme maps (data not shown), they were classified into four RH2 genes (scabbard 2A, 2B, 2C, and 2D). All of these genes consist of 351 codons, but scabbard 2D has a premature stop codon at site 177, caused by the replacement of the triplet AGA by TGA, and is a pseudogene (fig. 2). Unfortunately, the relative locations of these genes on chromosomes have not been determined, and the actual number of RH2 gene loci is unknown. However, having diverged ~30 Ma (see the section “Divergence Times and Evolutionary Rates”), the pseudogene (scabbard 2D) and the most closely related functional scabbard 2C (see the section “Molecular Phylogeny of Vertebrate RH2 Opsins”) should not be allelic and must be located at different loci. As for the most closely related scabbard 2A and scabbard 2B (see the section “Molecular Phylogeny of Vertebrate RH2 Opsins”), the nucleotides beyond –166 in their 5′ flanking regions become suddenly different (data not shown). Hence, it is most likely that scabbard 2A, 2B, 2C, and 2D represent four distinct genes.
Using PCR and inverse PCR methods, we also isolated four lampfish RH2 genes (lampfish 2A, 2B, 2C, and 2D) (fig. 2). Among these, lampfish 2C is a pseudogene, in which codon 175 is a stop codon (TGA) and the GT/AG splicing signal of its intron 2 has been changed to GA/AG. For this gene, we could not locate exon 1 even after sequencing 853 nucleotides of its intron 1. By comparison, the lengths of intron 1 of the other three lampfish RH2 genes are only 109–364 nucleotides. The other three lampfish RH2 genes consist of 341–348 codons (fig. 2). Nonfunctional lampfish 2C and functional lampfish 2D diverged ~7.5 Ma (see the Discussion), and they should not be allelic and must be located at different loci. Moreover, the closely related lampfish 2A and lampfish 2B (see the section “Molecular Phylogeny of Vertebrate RH2 Opsins”) consist of different numbers of amino acids (fig. 2). Hence, it is highly likely that lampfish also has four distinct genes.
Phylogenetic trees of the 25 RH2 opsins from 12 species (lampfish, scabbardfish, lamprey, medaka, tilapia, zebrafish, goldfish, coelacanth, chameleon, gecko, pigeon, and chicken) have been obtained by applying the NJ method to their nucleotide and deduced amino acid sequences. These NJ trees consistently show that the RH2 opsins in tetrapods are most closely related to those in coelacanth, euteleosts, and lamprey, in that order (fig. 3A), which is expected from morphological data (Carroll 1988) and molecular data (Hedges 2009). Furthermore, the RH2 opsins in lampfish, medaka, tilapia, and scabbardfish (lampfish/medaka/tilapia/scabbardfish) and those in zebrafish and goldfish (zebrafish/goldfish) are clustered into two separate groups, which is also consistent with the species tree based on morphological characters (Nelson 1994).
Figure 3A includes the following topologies as well: 1) scabbard 2A is most distantly related among the four scabbardfish RH2 opsins; 2) zebrafish 2-4 is more closely related to the two goldfish RH2 opsins than to zebrafish 2-3; and 3) chameleon 2 is more closely related to pigeon 2/chicken 2 than to gecko 2. First, the indels in introns 1 (6 indels), 2 (5 indels), 3 (4 indels), and 4 (one indel) of the four scabbardfish RH2 genes clearly show that scabbard 2A and 2B (scabbard 2A/2B) and scabbard 2C/2D form two distinct clusters (supplementary fig. S1, Supplementary Material online), despite the highly supported clustering of scabbard 2B, 2C, and 2D in figure 3A. Second, when the corresponding nucleotide sequences are compared, the zebrafish 2-3/2-4 grouping is supported by a bootstrap value of 0.99, and it clusters with the goldfish 2-1/2-2 group with a bootstrap value of 0.99. Hence, it is more likely that zebrafish 2-3/2-4 is more closely related to goldfish 2-1/2-2 than to zebrafish 2-1/2-2, as previously suggested by several authors (Chinen et al. 2003, 2005; Matsumoto et al. 2006; Spady et al. 2006; Takenaka and Yokoyama 2007). Third, the reptilian and avian pigments should form two distinct groups.
Incorporating these three features into the tree topology in figure 3A, we obtained the composite tree topology of RH2 opsins (fig. 3B). Applying PAML analysis to this predetermined tree topology, we evaluated the numbers of nucleotide substitutions per site. Compared with those in the amino acid tree (fig. 3A), the branches leading to the RH2 opsin genes in euteleosts and gecko are more pronounced in the nucleotide tree (fig. 3B).
In constructing the rooted phylogenetic trees of the RH2 opsins, we have compared the numbers (d) of amino acid substitutions per site within and between pairs of RH1 and RH2 opsins. The d values for the pairs of RH1 opsins (group 1), for those of the euteleost RH2 opsins (group 2), and for those of the coelacanth and tetrapod RH2 opsins (group 3) ranged from 0.168 to 0.321, 0.004 to 0.427, and 0.085 to 0.299, respectively, whereas the d values for the pairs between groups 1 and 2, between groups 1 and 3, and between groups 2 and 3 were 0.288–0.462, 0.277–0.401, and 0.311–0.419, respectively. Hence, some orthologous RH2 opsin pairs have larger distances than some paralogous RH1 and RH2 opsin pairs, showing that the molecular clock does not hold for RH2 opsins, which can also be seen in figures 3A and 3B.
To satisfy the assumption of γ1 = γ2 in our estimation procedure (see Materials and Methods), an appropriate sequence X must be selected. The variable branch lengths in figure 3A indicate that our choice of sequence X is rather limited, and the method is highly data dependent. Hence, we inferred the parameters (α, β, γ, δ, and T) using various RH2 opsins as sequence X: 1) zebrafish 2-1 for the lampfish/medaka/tilapia/scabbardfish opsins; 2) scabbard 2A for the zebrafish/goldfish opsins; 3) zebrafish 2-4 for coelacanth/tetrapod opsins; 4) chicken 2, chameleon 2, coelacanth 2, and zebrafish 2-4 for tetrapod opsins; and 5) lamprey 2 for the euteleost/tetrapod opsins (supplementary table S1, Supplementary Material online).
To check the validity of using these specific sequences as sequence X, we can first compare the γ values obtained by considering closely related sequence pairs. Because of their rapid evolution, the comparison of γ values for the pairs of euteleost opsins is informative. The γ values evaluated for the pairs between lampfish 2A/2B/2D, between lampfish 2A/2B/2D and medaka 2A/tilapia 2B, between medaka 2B/2C and tilapia 2Aα/2Aβ, and between scabbard 2A/2B and scabbard 2C/2D are 1.36–1.38, 1.07–1.11, 1.26–1.30, and 1.26 × 10⁻⁹ to 1.29 × 10⁻⁹/site/year, respectively (supplementary table S1, Supplementary Material online). Hence, the γ values within each group are very similar.
Second, by evaluating γ1 and γ2 separately, we can also check their equality more directly. Assuming that the lampfish/medaka/tilapia/scabbardfish and zebrafish/goldfish RH2 opsin groups diverged 136 Ma (Wittbrodt et al. 2002) and using zebrafish 2-1 as sequence X, we considered six pairs of sequences A and B: 1) lampfish 2A and 2B; 2) medaka 2A and tilapia 2B; 3) medaka 2B and 2C; 4) tilapia 2Aα and 2Aβ; 5) scabbard 2A and 2B; and 6) scabbard 2C and 2D. Then, using the evolutionary rates (or branch lengths) and appropriate T values given in figure 4, the six γ1 values and one γ2 value were obtained (table 1). The results show that the γ1 values vary between 1.28 × 10⁻⁹ and 1.84 × 10⁻⁹/site/year and differ significantly neither from each other nor from the γ2 value of 1.59 × 10⁻⁹/site/year, again justifying the assumption of γ1 = γ2.
Using equation (2), we estimated T, α, and β for all 48 branches that are specified in figure 3B (supplementary table S1, Supplementary Material online). Before studying these results, two comments may be in order. First, we evaluated the divergence times between euteleost and coelacanth/tetrapod (438 ± 36 Ma), between coelacanth and tetrapod (438 ± 34 Ma), between lampfish/medaka/tilapia/scabbardfish and zebrafish/goldfish (136 ± 15 Ma), between reptile and bird (253 ± 20 Ma), between chameleon and gecko (183 ± 15 Ma), and between pigeon and chicken (106 ± 11 Ma) RH2 opsins (fig. 4). These estimates are reasonably close to the times of the corresponding speciation events of 455, 430, 130, 275, 200, and 105 Ma (Wittbrodt et al. 2002; Hedges 2009; Hedges and Vidal 2009; Shedlock and Edwards 2009; van Tuinen 2009).
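The agreement between the estimated and reference divergence times can be expressed as a deviation in units of the reported standard error. The following sketch uses only the numbers quoted above and, as a simplifying assumption, treats the reference times as exact; the helper name `deviation_in_se` is hypothetical.

```python
# (comparison, estimate in Ma, standard error in Ma, reference time in Ma)
COMPARISONS = [
    ("euteleost vs coelacanth/tetrapod", 438, 36, 455),
    ("coelacanth vs tetrapod", 438, 34, 430),
    ("lampfish/medaka/tilapia/scabbardfish vs zebrafish/goldfish", 136, 15, 130),
    ("reptile vs bird", 253, 20, 275),
    ("chameleon vs gecko", 183, 15, 200),
    ("pigeon vs chicken", 106, 11, 105),
]


def deviation_in_se(estimate, se, reference):
    """Absolute difference between estimate and reference, in SE units."""
    return abs(estimate - reference) / se


# Assumption: the reference speciation times carry no error of their own.
deviations = {
    name: deviation_in_se(est, se, ref) for name, est, se, ref in COMPARISONS
}

if __name__ == "__main__":
    for name, z in deviations.items():
        print(f"{name}: {z:.2f} SE")
```

Under that assumption every estimate lies within about 1.2 standard errors of its reference value, consistent with the statement that the estimates are reasonably close to the speciation times.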
Second, using these and other divergence times obtained (fig. 4), we can also obtain the second set of evolutionary rates by inferring the actual numbers of site-specific amino acid substitutions at all branches using PAML and by dividing each by an appropriate evolutionary time span. The two sets of evolutionary rates are very similar at 38 out of a total of 48 branches (79%), but those at the remaining 10 branches differ significantly (supplementary table S2, Supplementary Material online). Among the latter group, the unreasonably high evolutionary rate (50.5 ± 16.2 × 10⁻⁹/site/year) for the common ancestor of coelacanth/tetrapod RH2 opsins (branch 40) seems to have occurred because of the short time span estimated (0.6 × 10⁶ years). If we adjust the split between euteleosts and tetrapods and that between coelacanth and tetrapods to 455 and 430 Ma, respectively (Hedges 2009), then the evolutionary rate becomes 1.21 ± 0.39 × 10⁻⁹/site/year and does not differ significantly from the estimate inferred from equation (2) (0.67 × 10⁻⁹/site/year, fig. 4).
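The adjustment for branch 40 can be reproduced with simple arithmetic. The sketch below assumes that the inferred number of substitutions on the branch is held fixed and only the time span changes, from 0.6 My to the 25 My between 455 and 430 Ma; this is an interpretation that happens to reproduce the reported values, and the function name `rescale_rate` is hypothetical.

```python
def rescale_rate(rate, old_span_years, new_span_years):
    """Rate after spreading the same substitutions over a new time span."""
    substitutions_per_site = rate * old_span_years
    return substitutions_per_site / new_span_years


OLD_SPAN = 0.6e6                # years, estimated span of branch 40
NEW_SPAN = (455 - 430) * 1e6    # years, span implied by the reference dates

# Rate and standard error reported for branch 40 (per site per year).
adjusted_rate = rescale_rate(50.5e-9, OLD_SPAN, NEW_SPAN)
adjusted_se = rescale_rate(16.2e-9, OLD_SPAN, NEW_SPAN)

if __name__ == "__main__":
    print(f"{adjusted_rate * 1e9:.2f} ± {adjusted_se * 1e9:.2f} × 10^-9/site/year")
```

The result, about 1.21 ± 0.39 × 10⁻⁹/site/year, matches the adjusted rate quoted in the text.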
The results in figure 4 show that the evolutionary rate of the RH2 opsin in the common ancestor of euteleosts and tetrapods was 0.26 × 10⁻⁹/site/year. This value is virtually identical to the evolutionary rates of lamprey 2, RH2 opsins in the euteleost ancestor, and coelacanth 2. Moreover, these rates do not differ significantly from that of the RH2 opsin in the common ancestor of coelacanth and tetrapods (0.67 × 10⁻⁹/site/year).
The RH2 opsin in the lampfish/medaka/tilapia/scabbardfish ancestor (2.8 × 10⁻⁹/site/year) evolved with a significantly faster rate than the zebrafish/goldfish ancestor (0.7 × 10⁻⁹/site/year) (P < 0.01). During the last 90–100 million years (My), the lineages leading to lampfish 2D and medaka 2B/2C have maintained high average rates of evolutionary changes at ~1 × 10⁻⁹ to 3 × 10⁻⁹/site/year, explaining their long branch lengths in figure 3A. On the other hand, during the last 30 My, the scabbardfish RH2 opsins, including nonfunctional scabbard 2D, reduced their evolutionary rates significantly (<1 × 10⁻⁹/site/year).
For 80–90 My after the separation from the lampfish/medaka/tilapia/scabbardfish opsin group, the zebrafish/goldfish RH2 opsins evolved with the rates of 0.7 × 10⁻⁹ to 1.1 × 10⁻⁹/site/year, causing the relatively shorter branch lengths of the zebrafish and goldfish RH2 opsins (fig. 3A). Among the zebrafish and goldfish opsins, zebrafish 2-1 has the longest branch length (fig. 3A), which is explained by the accelerated mutant substitution during the last ~45 My.
The evolutionary rate of the RH2 opsin in the common ancestor of coelacanth and tetrapods is estimated to be 0.67 × 10⁻⁹/site/year, which does not differ significantly from that of coelacanth 2 (0.31 × 10⁻⁹/site/year). During the last ~180 My, gecko 2 has evolved with a much faster rate than chameleon 2, producing an unusually long branch among the tetrapod RH2 opsins, as suggested by figure 3B. Even then, its evolutionary rate is much lower than those of many RH2 opsins in euteleosts. Hence, with the exception of gecko 2, the evolutionary rates of tetrapod RH2 opsins are ~0.2 × 10⁻⁹ to 0.5 × 10⁻⁹/site/year (fig. 4).
We have seen that the evolutionary rates of amino acid substitution in the RH2 opsins in vertebrate ancestors were ~0.25 × 10⁻⁹/site/year, which is much lower than those of such well-known proteins as hemoglobin α and β chains, ~1.2 × 10⁻⁹/site/year (Nei 1987). The RH2 opsins in the euteleost ancestor evolved equally slowly, but during the 136 My of euteleost evolution, the evolutionary rates of RH2 opsins have been accelerated roughly 5- to 6-fold, whereas those of the tetrapod RH2 opsins have increased by at most 2-fold.
One curious aspect of RH2 opsin evolution is that some orthologous RH2 opsin pairs can be more distantly related than paralogous RH1 and RH2 opsin pairs, showing that RH1 opsins evolved with much slower rates than some RH2 opsins. To evaluate the evolutionary rates for the RH1 opsins, we considered five representative RH1 opsins (zebrafish 1, coelacanth 1, chameleon 1, chicken 1, and bovine 1) and used lamprey 2 as the outgroup. The evolutionary rates of the RH1 opsins obtained vary between 0.1 × 10⁻⁹ and 1.0 × 10⁻⁹/site/year (fig. 5). As suspected (Yokoyama 2000a), chameleon 1 has evolved more quickly than others, but its evolutionary rate (0.65 × 10⁻⁹/site/year) is similar to that of gecko 2 and is still significantly lower than those of many RH2 opsins in euteleosts. Figure 5 also shows that the RH1 opsin of the tetrapod ancestor had a relatively high evolutionary rate of 0.97 × 10⁻⁹/site/year. However, because this rate is associated with a large standard error, it seems reasonable to consider that the RH1 opsins have evolved with the rates of ~0.1 × 10⁻⁹ to 0.4 × 10⁻⁹/site/year, which are similar to those of RH2 opsins in the early vertebrate ancestors and tetrapods.
Because of its incomplete sequence, lampfish 2C has been excluded from the analyses. To study the evolutionary processes of this pseudogene, we also compared the 209 deduced amino acids at positions 122–324 and 327–332 of lampfish 2C, 25 RH2, and 6 RH1 opsins (see Materials and Methods). In this region, no amino acid difference can be found between lampfish 2A and lampfish 2B (fig. 2). The NJ tree of the 32 sequences shows that lampfish 2A/2B and lampfish 2C/2D both cluster with bootstrap values of 1.0. When we compare the deduced amino acid sequences of lampfish 2C (sequence A), lampfish 2D (sequence B), zebrafish 2-1 (sequence X), and chameleon 1 (sequence Y), T = 7.6 ± 2.2 × 10⁶ years, α = 7.33 ± 2.23 × 10⁻⁹/site/year, and β = 2.97 ± 1.41 × 10⁻⁹/site/year. Hence, the evolutionary rate of lampfish 2C is significantly higher than that of scabbard 2D (0.39 × 10⁻⁹/site/year; P < 0.01).
Using zebrafish 2-1 (sequence X) and chameleon 1 (sequence Y), we can also show that lampfish 2A and lampfish 2C have evolved with the rates of 0.81 × 10⁻⁹ and 1.48 × 10⁻⁹/site/year, respectively, and lampfish 2A and lampfish 2D with the rates of 0.92 × 10⁻⁹ and 1.03 × 10⁻⁹/site/year, respectively. From these estimates, the common ancestor of the four lampfish RH2 opsins is traced back to 56 ± 6.9 Ma, and lampfish 2A (or lampfish 2B) and the lampfish 2C/2D ancestor have evolved at the rates of 0.87 ± 0.27 × 10⁻⁹ and 0.67 ± 0.26 × 10⁻⁹/site/year, respectively. The weighted average between the evolutionary rates of the lampfish 2C/2D ancestor and lampfish 2D is given by 0.98 ± 0.30 × 10⁻⁹/site/year and is very similar to the corresponding rate, 1.25 ± 0.26 × 10⁻⁹/site/year, in figure 4.
The variable evolutionary rates of mutation accumulation in the two pseudogenes (lampfish 2C and scabbard 2D) are comparable to the dramatically different evolutionary rates of the RH2 pseudogenes in fugu and Tetraodon pufferfish (Neafsey and Hartl 2005). Moreover, the SWS1 pseudogenes in two coelacanth species have evolved at rates an order of magnitude lower than those of hemoglobin α and β chains (Yokoyama and Tada 2000). Hence, scabbard 2D also shows that pseudogenes do not always evolve with a faster rate than the functional genes (see also fig. 3B), and their evolutionary rates can be much lower than a typical evolutionary rate of pseudogenes, ~5 × 10⁻⁹/site/year (Li et al. 1985).
If we assume that lampfish and scabbardfish have four duplicate RH2 genes, we can identify 10 independent gene duplication events in figure 4. Curiously, these gene duplications can be found only in euteleosts. The significance of this observation is not immediately clear, but it may be related to the functional expansion of duplicated RH2 opsins in diverse water environments. For example, zebrafish 2-1, 2-2, 2-3, and 2-4 pigments have λmaxs of 467, 478, 488, and 505 nm, respectively (Chinen et al. 2003), whereas the engineered ancestral pigments of zebrafish 2-1/2-2/2-3/2-4, of zebrafish 2-1/2-2, and of zebrafish 2-3/2-4 have had λmaxs of 506, 474, and 506 nm, respectively (Chinen et al. 2005). These results suggest that zebrafish 2-4 has retained the ancestral phenotype (λmax = ~505 nm), and the others have decreased their λmaxs. Although they are not statistically significant, the evolutionary rates of zebrafish 2-1, 2-2, and 2-3 tend to be higher than that of zebrafish 2-4 (fig. 4). As noted in the beginning, the blue-shifted λmax of the rapidly evolving gecko 2 is likely to be adaptive. These limited data suggest that the accelerated rate of sequence evolution and the gain of new function are positively correlated. The duplication of RH2 opsins seems to be playing a similar role.
To understand the biological significance of gene duplications, we would like to know 1) how gene duplications occur and 2) how these duplicate genes affect the fitness of organisms. The four scabbard RH2 genes may provide a rare opportunity in exploring the first question. As supplementary figure S1 (Supplementary Material online) suggests, the sequences of the four scabbardfish RH2 genes can be aligned easily, and the positions of indels and nucleotide substitutions can be relatively easily identified throughout the coding and noncoding segments. By obtaining the entire DNA sequence information in the intergenic regions of the RH2 genes, we may be able to reconstruct the detailed evolutionary history of the gene duplication events. The analyses may also shed light on the structural characteristics of why the RH2 genes have gained multiple copies so many times during euteleost evolution.
Although it is almost impossible to evaluate the fitness differences caused by various duplicated visual pigments (Lewontin 1979), we can still explore the mechanisms of molecular adaptation by studying the λmaxs of RH2 pigments of species living in different water environments (Yokoyama et al. 2008). This is because not only are visual pigments associated strongly with organisms’ ecological environments (Lythgoe 1979; Yokoyama S and Yokoyama R 1996; Yokoyama 1997) but also the light (or color) environments in the water can be reasonably well quantified (Jerlov 1976). Indeed, comparisons of various RH2 pigments in different fish species are expected to provide an excellent opportunity to study the adaptive significance of gene duplications.
This work was supported by the National Institutes of Health (R01EY016400) and Emory University.