Of the 11 genes, we found that medaka THEA2 contained a nonsynonymous SNP at exactly the same site where a high Fst is observed in humans (rs1702003 in exon 6: see the HapMap database; Fig. 2). THEA2 is known to be a temperature-responsive gene, and it is expressed in brown adipose tissue (BAT) in response to cold stress in mice [18]. The genotype frequencies at rs1702003 are 98.3% G/G and 1.7% G/A in Europeans and 100% A/A in East Asians and Africans. This could suggest that the European-specific allele of the cold-inducible gene is an adaptation of Europeans to the cold environment around 40,000 years ago, when early modern humans expanded into Europe. Interestingly, only Philippine medaka (Oryzias luzonensis), inhabiting a warmer environment, has a different allele from the other Oryzias species. While in situ hybridization showed that THEA2 is expressed ubiquitously in medaka embryos, RT-PCR indicated greater THEA2 expression in the brown tissue homologous to mammalian BAT than in other tissues in adult medaka (data not shown). In the structural predictions for THEA2, we found that the two SNPs indicated for the human and medaka proteins are located at the junction between the acyl-CoA hydrolase structural domains, in a loop predicted to be highly flexible. There, a G-D change (in humans) or an L-P change (in medaka) is likely to affect the dynamics of the protein chain and influence (1) the interaction between domains and/or (2) the transmission of conformational changes. We speculate that an amino acid change that affects protein flexibility may be related to temperature adaptation.
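As a rough illustration of why rs1702003 stands out, the genotype frequencies quoted above can be converted to allele frequencies and a simple two-population Fst. The sketch below assumes Wright's heterozygosity-based definition with equal population weights and no sample-size correction; it is not necessarily the estimator used by HapMap or in this study, and the function names are hypothetical.

```python
def allele_freq(hom, het):
    """Frequency of an allele given the frequencies of its homozygous
    genotype (hom) and of the heterozygous genotype (het)."""
    return hom + het / 2.0


def fst_two_pops(p1, p2):
    """Wright's Fst = (Ht - Hs) / Ht for one biallelic site in two
    equally weighted populations (assumption: no sample-size correction)."""
    p_bar = (p1 + p2) / 2.0
    h_t = 2.0 * p_bar * (1.0 - p_bar)                        # total expected heterozygosity
    h_s = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0  # mean within-population value
    if h_t == 0.0:
        return 0.0  # site is monomorphic overall, so no differentiation
    return (h_t - h_s) / h_t


# Genotype frequencies at rs1702003 as quoted in the text.
g_europeans = allele_freq(hom=0.983, het=0.017)   # G allele, Europeans
g_east_asians = allele_freq(hom=0.0, het=0.0)     # G allele absent (100% A/A)

print(f"G allele frequency in Europeans: {g_europeans:.4f}")
print(f"Fst (Europeans vs East Asians): {fst_two_pops(g_europeans, g_east_asians):.3f}")
```

Under these assumptions the two populations are nearly fixed for alternative alleles, so the resulting Fst is close to its maximum of 1.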
Figure 2 Nucleotide (upper) and amino acid (lower) sequence alignments of THEA2. Hd-rR is the inbred strain derived from the southern Japanese population for which the complete genome sequence has been determined. All three (Hd-rR, Northern Japanese and East ...
For another gene, RTTN, we found even more remarkable regional differentiation. The phylogenetic network, to which nine individuals from the northern Japanese population and one from the southern Japanese population were added, shows the nucleotide changes in the RTTN gene among geographical populations; each population forms a separate cluster and is separated from the others by unique amino acid changes (Fig. 3). According to bioinformatic predictions, the RTTN protein is composed of armadillo-like repeats separated in a few places by disordered loops (Fig. 4). A78 is partially buried, and its substitution may destabilize the protein structure. S92 is located on the surface and is predicted to be phosphorylated; hence, its substitution may affect structure and/or function by removing a site of posttranslational modification. N140, T143, and P158 are in the disordered loop. Substituting P158 with A may increase the flexibility of the main chain, the introduction of K140 and K143 may increase the entropy of the side chain, and substitution of T143 (predicted to be phosphorylated) may remove a site of posttranslational modification. Thus, substitutions of all these residues are predicted to influence the dynamics of the loop and hence its ability to bind to other molecules or to respond to changes in the environment.
Figure 3 Phylogenetic network of RTTN based on nucleotide sequences from exons 3 + 4 (271 bp). The circle represents geographical regional strains (N.JPN: northern Japanese population; S.JPN: southern Japanese population; W.KOR: western Korean; E.KOR: eastern ...
Figure 4 Structure prediction for RTTN: a well-folded globular part (armadillo-like repeats, aa 1-120) and an unstructured linker (aa 121-166). The protein chain is colored from blue (N-terminus) to red (C-terminus). α-helices are shown ...
To gain further insight into whether natural selection is involved in the observed nucleotide variations, we plotted the average number of nonsynonymous nucleotide differences per nonsynonymous site (dN) against the average number of synonymous nucleotide differences per synonymous site (dS), estimated for the 11 genes among the 27 medaka strains (Fig. ). Seven of the 11 genes, including THEA2, showed an average dN/dS of less than 1, suggesting that these seven genes are under purifying selection. In RTTN, in contrast, there are only nonsynonymous differences in the genomic regions examined (exons 3 and 4: 271 bp in total); in more than half of the population pairs, the dN/dS ratios are significantly greater than 1 (Z-test; p < 0.05). The dN/dS ratios of the LTC and GRK4 genes are also greater than 1, but these are not statistically significant at the 5% level for any pair. We also sequenced the entire RTTN cDNA for seven individual medaka from five geographical populations. Although there are synonymous variations in the other exons, the dN/dS ratios are greater than 1 overall, and in nine of the 21 pairs they are statistically significant (p < 0.05; Table ). These results suggest that RTTN is under positive selection in medaka.
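The pairwise test described above can be sketched as a one-sided Z-test on dN - dS. The sketch assumes that dN, dS and their variances (for example, from bootstrap resampling of codons) have already been estimated for a pair of sequences; the function name is hypothetical, and the numerical inputs are made up for illustration rather than taken from this study.

```python
import math


def z_test_positive_selection(d_n, d_s, var_dn, var_ds):
    """One-sided Z-test of H0: dN <= dS against H1: dN > dS.

    Returns (z, p). The variances are assumed to be supplied by the
    caller, e.g. from bootstrap replicates.
    """
    se = math.sqrt(var_dn + var_ds)
    if se == 0.0:
        raise ValueError("at least one variance must be positive")
    z = (d_n - d_s) / se
    # Upper-tail probability of the standard normal distribution.
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p


# Seven individuals give C(7, 2) = 21 pairwise comparisons, as in the text.
n_pairs = math.comb(7, 2)

# Hypothetical estimates for a single pair (illustrative values only).
z, p = z_test_positive_selection(d_n=0.020, d_s=0.005,
                                 var_dn=1e-5, var_ds=1e-5)
print(f"pairs: {n_pairs}, Z = {z:.2f}, one-sided p = {p:.4f}")
```

A pair would be counted as significant at the 5% level when the returned p is below 0.05.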
Synonymous (X axis) and nonsynonymous (Y axis) substitution ratios estimated by the Nei-Gojobori method. A dN/dS ratio significantly greater than 1 is a convincing indicator of positive selection.
The dN - dS values (upper diagonal) and the significance (lower diagonal) based on RTTN cDNA (5.8 kb) sequences
Although its exact function is not known, RTTN is reported to be involved in determining the rotation of the body axis and the left-right asymmetry of internal organs during embryonic development in mice [19]. The conspicuous differentiation of RTTN alleles among human populations also suggests that natural selection has acted differently on different populations: at a nonsynonymous SNP site (rs3911730) in RTTN exon 3, the A/A genotype occurs in 90% of Africans and 2% of Europeans and is absent in Asians, while the C/C genotype occurs in 3% of Africans, 80% of Europeans and 100% of Asians.
Previous studies have reported that genes identified in fish through "forward genetic" analysis of phenotypic mutants are involved in forming variations of related phenotypes in humans, e.g. skin pigmentation [20] and epithelial development [25]. Our approach extends these previous studies as a form of "reverse genetics": it targets genes that show, as a signature of natural selection acting on them, prominent diversification in allele frequency among populations with different ecological histories in both fish and humans. We found that, of the 11 genes in our analysis, the medaka THEA2 gene has a nonsynonymous polymorphic site at exactly the same position as its ortholog in humans, and the RTTN gene shows signs of population differentiation that can plausibly be explained by natural selection. The aim of our analysis is not to demonstrate evidence of natural selection in medaka, but to indicate that medaka is a marvelous resource as a "natural library" of genetic diversity, and that this approach is efficient enough to find candidate genes targeted by natural selection in both humans and medaka. The exact function of the genes and the exact nature of the functional differences between alleles can be studied more feasibly in medaka, where crossing experiments between different genotypes of interest and transgenic techniques have already been established [7]. This method can be applied to any polymorphic gene in humans, and larger-scale and more systematic screening of orthologous gene polymorphisms in medaka will find various target genes for further functional analyses. Because the medaka has been widely used in carcinogenesis and ecotoxicological studies [7], for example in screening for genetic variants that affect carcinogenesis and responses to ecotoxins, it could also be used for testing variations in drug response in humans. Thus, we conclude that the medaka is a good vertebrate model of the functional diversity caused by human DNA polymorphisms that have been identified by recent resequencing and typing efforts.