Charles Darwin proposed that evolution occurs primarily by natural selection, but this view has been controversial from the beginning. Two of the major opposing views have been mutationism and neutralism. Early molecular studies suggested that most amino acid substitutions in proteins are neutral or nearly neutral and the functional change of proteins occurs by a few key amino acid substitutions. This suggestion generated an intense controversy over selectionism and neutralism. This controversy is partially caused by Kimura's definition of neutrality, which was too strict (|2Ns| ≤ 1). If we define neutral mutations as the mutations that do not change the function of gene products appreciably, many controversies disappear because slightly deleterious and slightly advantageous mutations are engulfed by neutral mutations. The ratio of the rate of nonsynonymous nucleotide substitution to that of synonymous substitution is a useful quantity to study positive Darwinian selection operating at highly variable genetic loci, but it does not necessarily detect adaptively important codons. Previously, multigene families were thought to evolve following the model of concerted evolution, but new evidence indicates that most of them evolve by a birth-and-death process of duplicate genes. It is now clear that most phenotypic characters or genetic systems such as the adaptive immune system in vertebrates are controlled by the interaction of a number of multigene families, which are often evolutionarily related and are subject to birth-and-death evolution. Therefore, it is important to study the mechanisms of gene family interaction for understanding phenotypic evolution. Because gene duplication occurs more or less at random, phenotypic evolution contains some fortuitous elements, though the environmental factors also play an important role. The randomness of phenotypic evolution is qualitatively different from allele frequency changes by random genetic drift. 
However, there is some similarity between phenotypic and molecular evolution with respect to functional or environmental constraints and evolutionary rate. It appears that mutation (including gene duplication and other DNA changes) is the driving force of evolution at both the genic and the phenotypic levels.
In his book On the Origin of Species, Charles Darwin (1859) proposed that all organisms on earth evolved from a single proto-organism by descent with modification. He also proposed that the primary force of evolution is natural selection. Most biologists accepted the first proposition almost immediately, but the second proposal was controversial and was criticized by such prominent biologists as Thomas Huxley, Moritz Wagner, and William Bateson. These authors proposed various alternative mechanisms of evolution such as transmutation theory, Lamarckism, geographic isolation, and nonadaptive evolution (see Provine 1986, chap. 7). Because of these criticisms, Darwin later changed his view of the mechanism of evolution to some extent (Origin of Species 1872, chap. 7). He was a pluralistic man and accepted a weak form of Lamarckism and nonadaptive evolution (see Provine 1986). Nevertheless, he maintained the view that natural selection operating on spontaneous variation is the primary factor of evolution. His main interest was in the evolutionary change of morphological or physiological characters and speciation.
Another critic of evolution by natural selection was the post-Mendelian geneticist Thomas Morgan. He rejected Lamarckism and any creative power of natural selection and argued that the most important factor of evolution is the occurrence of advantageous mutations and that natural selection is merely a sieve to save advantageous mutations and eliminate deleterious mutations (Morgan 1925, 1932). For this reason, his view is often called mutationism. However, this view should not be confused with the saltation theory of Bateson (1894) or the macromutation theory of De Vries (1901–1903), in which natural selection plays little role. In Morgan's time the genetic basis of mutation was well established, and his theory of evolution was appealing to many geneticists. The only problem was that most mutations experimentally obtained were deleterious, and this observation hampered the general acceptance of his theory. He also proposed that some part of morphological evolution is caused by neutral mutation. In his 1932 book The Scientific Basis of Evolution, he stated “If the new mutant is neither more advantageous than the old character, nor less so, it may or may not replace the old character, depending partly on chance; but if the same mutation recurs again and again, it will most probably replace the original character” (p. 132).
However, Morgan's mutation-selection theory or mutationism gradually became unpopular as the neo-Darwinism advocated by Fisher (1930), Wright (1931), Haldane (1932), Dobzhansky (1937, 1951), and others gained support from many investigators in the 1940s. In neo-Darwinism, natural selection is assumed to play a much more important role than mutation, sometimes creating new characters in the presence of genetic recombination. Although there are several reasons for this change (see Nei 1987, chap. 14), two are particularly important. First, most geneticists at that time believed that the amount of genetic variability contained in natural populations is so large that any genetic change can occur by natural selection without waiting for new mutations. Second, mathematical geneticists showed that the gene frequency change by mutation is much smaller than the change by natural selection. Neo-Darwinism reached its pinnacle in the 1950s and 1960s, and at this time almost every morphological or physiological character was thought to have evolved by natural selection (Dobzhansky 1951; Mayr 1963).
This situation again started to change as molecular data on evolution accumulated in the 1960s. Studying the GC content of the genomes of various organisms, early molecular evolutionists such as Sueoka (1962) and Freese (1962) indicated the possibility that the basic process of evolution at the nucleotide level is determined by mutation. Comparative study of amino acid sequences of hemoglobins, cytochrome c, and fibrinopeptides from various organisms also suggested that most amino acid substitutions in a protein do not change the protein function appreciably and are therefore selectively neutral or nearly neutral, as mentioned below. However, this interpretation was immediately challenged by eminent neo-Darwinians such as Simpson (1964) and Mayr (1965), and this initiated a heated controversy over selectionism versus neutralism.
An even more intense controversy on this subject was generated when protein electrophoresis revealed that the extent of genetic variation within populations is much higher than previously thought. At that time, most evolutionists believed that the high degree of genetic variation can be maintained only by some form of balancing selection (Mayr 1963; Ford 1964). However, a number of authors argued that this variation can also be explained by neutral mutations. From the beginning of the 1980s, the study of molecular evolution was conducted mainly at the DNA level, but the controversy is still continuing. This longstanding controversy over selectionism versus neutralism indicates that understanding of the mechanism of evolution is fundamental in biology and that the resolution of the problem is extremely complicated. However, some of the controversies were caused by misconceptions of the problems, misinterpretations of empirical observations, faulty statistical analysis, and others.
Because I have been involved in this issue for the last 40 years and have gained some insights, I would like to discuss this controversy with historical perspectives. Obviously, the discussion presented will be based on my experience and knowledge, and therefore it may be biased. In my view, however, we can now reach some consensus and examine what has been solved and what should be done in the future. Needless to say, I shall not be able to cover every subject matter in this short review, and I would like to discuss only fundamental issues.
Before the molecular study of evolution was introduced around 1960, most studies of the mechanism of evolution were conducted by using the Mendelian approach. Because this approach depended on crossing experiments to identify homologous genes, the studies were confined to within-species genetic changes. Partly for this reason, the evolutionary study was concerned with allelic frequencies and their changes within species.
In the molecular approach, however, the evolutionary change of genes can be studied between any pair of species as long as the homologous genes can be identified. This removal of the species barrier introduced new knowledge about the long-term evolution of genes. Furthermore, because proteins are the direct products of transcription and translation of genes, we can study the evolutionary change of genes by examining the amino acid sequences of proteins. For this reason, a number of authors compared the amino acid sequences of hemoglobins, cytochrome c, fibrinopeptides, etc. from a wide variety of species. Many of these studies are presented in the symposium volume of Bryson and Vogel (1965), Evolving Genes and Proteins.
These studies revealed some interesting properties of molecular evolution. First, the number of amino acid substitutions between two species was approximately proportional to the time since divergence of the species (Zuckerkandl and Pauling 1962, 1965; Margoliash 1963; Doolittle and Blombach 1964). Second, amino acid substitutions occurred less frequently in the functionally important proteins or protein regions than in less important proteins or protein regions (Margoliash and Smith 1965; Zuckerkandl and Pauling 1965). Thus, the rate of amino acid substitution was much higher in less important fibrinopeptides than in essential proteins such as hemoglobins and cytochrome c, and the active sites of hemoglobins and cytochrome c showed a much lower rate of evolution than other regions of the proteins. A simple interpretation of these observations was to assume that amino acid substitutions in the nonconserved regions of proteins are nearly neutral or slightly positively selected and that the amino acids in functionally important sites do not change easily to maintain the same function (Freese and Yoshida 1965; Margoliash and Smith 1965; Zuckerkandl and Pauling 1965, pp. 148–149). Jukes (1966, p. 10) also stated “The changes produced in proteins by mutations will in some cases destroy their essential functions but in other cases the change allows the protein molecule to continue to serve its purpose.”
Nearly at the same time, a few molecular biologists who studied the interspecific variation of genomic GC content suggested that a large part of the variation is due to the difference between the forward (AT → GC) and backward (GC → AT) mutation rates and it has little to do with natural selection (Freese 1962; Sueoka 1962). This suggestion was probably largely correct, but it was not conclusive because the relationship between the GC content and the extent of selection was unclear at the gene level. Later Bernardi et al. (1985) discovered that the GC content in warm-blooded vertebrates varies considerably from chromosomal region to chromosomal region (isochores) and suggested that this variation is caused by natural selection.
In the middle 1960s, protein electrophoresis revealed that most natural populations contain a large amount of genetic polymorphism, and this finding led to a new level of controversy over selectionism and neutralism, as mentioned above. This subject will be discussed later in some detail.
At this juncture, Kimura (1968a) and King and Jukes (1969) formally proposed the neutral theory of molecular evolution. Kimura first computed the average number of nucleotide substitutions per mammalian genome (4 × 10⁹ nt) per year from data on amino acid substitutions in hemoglobins and a few other proteins and obtained about one substitution every 2 years. (Actually he used 3.3 × 10⁹ nt as the mammalian genome size after elimination of silent nucleotide sites.) He then noted that this rate is enormously high compared with the estimate of Haldane (1957) of the upper limit of the rate of gene substitution by natural selection that is possible in mammalian organisms (one substitution every 300 generations or every 1,200 years if the average generation time is 4 years in mammals). Haldane's estimate was based on the cost of natural selection that is tolerable by the average fertility of mammalian organisms. If we accept Haldane's estimate, such a high rate of nucleotide substitution (one substitution every 2 years) cannot occur by natural selection alone, but if we assume that most substitutions are neutral or nearly neutral and are fixed by random genetic drift, any number of substitutions is possible as long as the substitution rate is lower than the mutation rate. For this reason, Kimura concluded that most nucleotide substitutions must be neutral or nearly neutral.
This paper was immediately attacked by Maynard Smith (1968) and Sved (1968), who argued that Haldane's cost of natural selection can be reduced substantially if natural selection occurs by choosing only individuals in which the number of advantageous genes is greater than a certain number (truncation selection). Brues (1969) argued that gene substitution is the process of increase of population fitness, and therefore it must be beneficial and should not impose any cost to the population. However, Haldane's cost of natural selection is actually equivalent to the fertility excess required for gene substitution (Crow 1968, 1970; Felsenstein 1971; Nei 1971). In other words, for a gene substitution to occur in a population of constant size every individual should produce on average more than one offspring because natural selection occurs only when individuals containing disadvantageous mutations leave fewer offspring than other individuals. If there is no fertility excess, the population size would decline every generation. The higher the fertility excess, the higher the number of gene substitutions possible.
The criticism of Maynard Smith (1968) and Sved (1968) is also unimportant if we note that truncation selection apparently occurs very rarely in nature (Nei 1971, 1975). For truncation selection to occur, the number of advantageous genes in each individual must be identifiable before selection so that natural selection allows the best group of individuals to reproduce, as in the case of artificial selection. In practice, however, natural selection operates in various stages of development for different characters. Therefore, selection must be more or less independent for different loci. This justifies Haldane's theory of cost of natural selection and supports Kimura's argument for the neutral theory of molecular evolution.
However, Kimura's paper had at least two deficiencies. First, in the computation of the cost of natural selection, he assumed that all nucleotides in the genome were subject to natural selection. In practice, the unit of selection should be a gene or an amino acid because noncoding regions of DNA are largely irrelevant to the evolution of proteins and organisms. Considering the extent of mutation load tolerable for mammalian species, Muller (1967) had estimated that the number of genes in the human genome is probably no more than 30,000. Interestingly, the human genome sequence data suggest that the total number of functional genes is about 23,000 (International Human Genome Sequencing Consortium 2004). This number is much smaller than the number of nucleotides (3.3 × 10⁹), which Kimura used in his computation. If we consider a gene as the unit of selection, as Haldane did, and assume that the mammalian genome contains 23,000 genes, the average rate of gene substitution now becomes one substitution every 286,000 [= 2/(2.3 × 10⁴/3.3 × 10⁹)] years. (Here one nucleotide substitution every 2 years was assumed.) This is far less than Haldane's upper limit (one substitution every 1,200 years). A similar computation was made by Crow (1970). In contrast, if we consider an amino acid as the unit of selection and each gene encodes on average 450 amino acid sites (Zhang 2000), the average rate of amino acid substitution will be one substitution every 636 (= 2.86 × 10⁵/450) years. This rate is about two times higher than Haldane's upper limit.
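The arithmetic in this paragraph can be checked with a short Python sketch. The numerical inputs are the ones quoted in the text; the constant and function names are illustrative only and do not come from the original papers. Note that the text rounds the gene-level figure down to 286,000 before dividing by 450, so the unrounded results differ slightly (about 287,000 and 638 years).

```python
# Inputs quoted in the text; names are hypothetical, chosen for illustration.
YEARS_PER_NT_SUBSTITUTION = 2.0   # one nucleotide substitution per genome every 2 years
GENOME_SIZE_NT = 3.3e9            # genome size used by Kimura
N_GENES = 2.3e4                   # about 23,000 functional genes
AA_SITES_PER_GENE = 450           # average amino acid sites per gene
HALDANE_LIMIT_YEARS = 300 * 4     # 300 generations at 4 years per generation


def years_per_unit_substitution(years_per_nt_sub, n_units, genome_nt):
    """Rescale the per-nucleotide interval when only n_units sites are selected."""
    return years_per_nt_sub / (n_units / genome_nt)


# Gene as the unit of selection.
years_per_gene_sub = years_per_unit_substitution(
    YEARS_PER_NT_SUBSTITUTION, N_GENES, GENOME_SIZE_NT
)
# Amino acid site as the unit of selection.
years_per_aa_sub = years_per_gene_sub / AA_SITES_PER_GENE

print(f"gene as unit:       one substitution every {years_per_gene_sub:,.0f} years")
print(f"amino acid as unit: one substitution every {years_per_aa_sub:,.0f} years")
print(f"Haldane limit / amino acid interval: {HALDANE_LIMIT_YEARS / years_per_aa_sub:.2f}")
```

With a gene as the unit, the interval is far longer than Haldane's 1,200 years, so the rate is well under his limit; with an amino acid as the unit, the interval is roughly half of 1,200 years, which corresponds to the "about two times higher" rate stated above.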
The above computation was done under the assumption of an infinite population size, and it is known that in finite populations the cost of natural selection is reduced considerably (e.g., Kimura and Maruyama 1969; Ewens 1972; Nei 1975, p. 65). This will further weaken Kimura's original argument. However, at each locus deleterious mutations occur every generation and are expected to impose another kind of genetic load, i.e., mutation load. According to Muller (1950, 1967) this genetic load (genetic death) is substantial, and because of this load, the number of functional genes that can be maintained in a mammalian genome was estimated to be about 30,000, as mentioned above. If we consider both the cost of natural selection and the mutation load, Kimura's computation may not be so outrageous. Furthermore, what is important is the fact that Kimura initiated the study of population dynamics of neutral mutations and that he later became the strongest defender of the neutral theory and provided much evidence for it.
The second deficiency is concerned with the overly strict definition of selective neutrality. According to Kimura, mutations with |2Ns| ≤ 1 or |s| ≤ 1/(2N) are defined as neutral, where N is the effective population size and s is the selection coefficient for the mutant heterozygotes, the selection coefficient for the mutant homozygote being 2s (Kimura 1968a, 1983). Here the fitnesses of the wild-type homozygote (A1A1), the mutant heterozygote (A1A2), and the mutant homozygote (A2A2) are given by 1, 1 + s, and 1 + 2s, respectively. I recall that I did not like this definition when it was proposed because it generates quite unrealistic consequences. For example, if a deleterious mutation with s = −0.001 occurs in a population of N = 10⁶, |s| is much greater than 1/(2N) = 5 × 10⁻⁷. Therefore, this mutation will not be called “neutral.” In this case, however, the fitness of mutant homozygotes will be lower than that of wild-type homozygotes only by 0.002. Is this small magnitude of fitness difference biologically significant? In reality, it will have little effect on the survival of mutant homozygotes or heterozygotes because this magnitude of fitness difference is easily swamped by the large random variation in the number of offspring among different individuals, by which s is defined (see Appendix). By contrast, in the case of brother-sister mating N = 2, so that even a semilethal mutation with s = −0.25 will be called neutral. If this mutation is fixed in the population, the mutant homozygote has a fitness of 0.5 compared with the nonmutant homozygote. In this case the mutant line will quickly disappear in competition with the original line. This example clearly indicates that Kimura's definition is inadequate.
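The two examples in this paragraph can be reproduced with a minimal Python sketch of the criterion |s| ≤ 1/(2N), written here as |2Ns| ≤ 1. The function names are hypothetical and serve only to make the arithmetic explicit; the fitness scheme (1, 1 + s, 1 + 2s) is the one given in the text.

```python
def is_neutral_strict(s, n_e):
    """Strict statistical criterion from the text: neutral if |2 * N * s| <= 1."""
    return abs(2 * n_e * s) <= 1


def mutant_homozygote_fitness(s):
    """Fitness of the mutant homozygote (A2A2) when the heterozygote has 1 + s."""
    return 1 + 2 * s


# Example 1: large population, very mild deleterious mutation.
large_pop = is_neutral_strict(s=-0.001, n_e=10**6)
# Example 2: brother-sister mating (N = 2), semilethal mutation.
sib_mating = is_neutral_strict(s=-0.25, n_e=2)

print("s = -0.001, N = 10^6 -> neutral?", large_pop,
      "| homozygote fitness:", mutant_homozygote_fitness(-0.001))
print("s = -0.25,  N = 2    -> neutral?", sib_mating,
      "| homozygote fitness:", mutant_homozygote_fitness(-0.25))
```

Under this criterion the mutation that lowers homozygote fitness by only 0.002 is classified as non-neutral, whereas the mutation that halves homozygote fitness is classified as neutral, which is the inconsistency the paragraph points out.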
In my view, the neutrality of a mutation should be defined by considering its effect when the mutation is fixed in the population. The probability of fixation or the time until fixation is of secondary importance (fig. 1A).
Unlike mathematical geneticists, molecular biologists have had a relaxed concept of neutral mutations, and when a mutation does not change the gene function appreciably, they called it more or less neutral (e.g., Freese 1962; Freese and Yoshida 1965; King and Jukes 1969; Wilson, Carlson, and White 1977; Perutz 1983) (fig. 1A). According to this definition, |s| for neutral mutations is likely to be at least as great as 0.001. If we accept this definition, we do not have to worry about minor allelic differences in fitness and can avoid unnecessary controversies concerning the minor effects of selection.
There are some exceptions to be considered about the above definition. Bulmer (1991), Akashi (1995), and Akashi and Schaeffer (1997) estimated that the difference in fitness between the preferred and nonpreferred synonymous codons, which is caused by the differences in energy required for biosynthesis of amino acids (Akashi and Gojobori 2002), is small, with |s| ≤ 4/N at the nucleotide level. However, because the frequencies of preferred and nonpreferred synonymous codons are nearly the same for all genes in a given species, the cumulative effect of selection for all codon sites may become significant. If this cumulative effect is sufficiently large, one may explain the evolutionary origin of codon usage bias. In general, if there is a character controlled by a large number of loci, even a small |s| value for each locus might be sufficient for changing the character even though it may take a long time.
Furthermore, various statistical methods developed by Kimura (1983) and others will be useful for testing the neutral theory. In this case, if the neutral theory is not rejected by these methods, the newly defined neutral theory certainly cannot be rejected. Furthermore, even if the strict neutral theory is rejected, the new neutral theory may not be rejected. However, the same thing happens with Kimura's theory because his theory allows the existence of deleterious mutations (see below). At any rate, the biological definition of neutral alleles is more appropriate in the study of evolution than the statistical definition, and the neutrality of mutations should eventually be studied experimentally.
As mentioned above, the neutral theory allows the existence of deleterious alleles that may be eliminated from the population by purifying selection. Alleles of this type contribute to polymorphism but not to amino acid or nucleotide substitutions, and their existence has been predicted by the “classical” theory (see below). It is known that deleterious mutations cannot be fixed in the population unless the selection coefficient is very small (Li and Nei 1977).
In the above discussion we considered the selective neutrality of mutations at a locus ignoring the effects of other genes. As emphasized by Wright (1932), Lewontin (1974), and others, this type of situation would never occur in natural populations. Recent studies on the development of morphological characters or immune systems have shown that they are controlled by complicated networks of interactions between DNA (or RNA) and proteins or between different proteins (e.g., Davidson 2001; Wray et al. 2003; Klein and Nikolaidis 2005). In these genetic systems there are many different developmental or functional pathways for generating the same phenotypic character. Multiple functions of genes (gene sharing; Piatigorsky, in press) are also expected to enhance the extent of neutral mutations. For these reasons, there are many dispensable genes in the mammalian genome. Gene knockout experiments have shown that mice with many inactivated genes in the Hox, major histocompatibility complex (MHC) class I, or olfactory system can survive without any obvious harmful effects. Of course, these knockout genes would not be necessarily neutral in nature, but some of them could be nearly neutral (Wagner 2005). In this paper, however, we will not consider this subject though it is an important one.
King and Jukes (1969) took a different route to reach the idea of neutral mutations. They examined extensive amounts of molecular data on protein evolution and polymorphism and proposed that a large portion of amino acid substitutions in proteins occurs by random fixation of neutral or nearly neutral mutations, that mutation is the primary force of evolution, and that the main role of natural selection is to eliminate mutations that are harmful to the gene function. This idea was similar to that of Morgan (1925, 1932) but was against the then popular neo-Darwinian view in which the high rate of evolution is achieved only by natural selection (Simpson 1964; Mayr 1965). According to King and Jukes, proteins requiring rigid functional and structural constraint (e.g., histone and cytochrome c) are expected to be subject to stronger purifying selection than proteins requiring weak functional constraints (e.g., fibrinopeptides), and therefore the rate of amino acid substitution would be lower in the former than in the latter. Extending the results obtained by Zuckerkandl and Pauling (1965) and Margoliash (1963), they also emphasized that the functionally important parts of proteins (e.g., the active center of cytochrome c) have lower substitution rates than the less important parts. Later, Dickerson (1971) confirmed this finding by using an even larger data set. They also noted that cytochrome c from different mammalian species is fully interchangeable when tested in vitro with intact mitochondrial cytochrome oxidase (Jacobs and Sanadi 1960). For many molecular biologists, these data were more convincing in supporting the neutral theory than Kimura's computation of the cost of natural selection.
In the late 1960s and the 1970s, however, there was another controversy about the maintenance of protein polymorphism, as mentioned earlier. This controversy was initiated by the discovery that natural populations contain an unexpectedly large amount of protein polymorphism (Shaw 1965; Harris 1966; Lewontin and Hubby 1966). It was a new version of the previous controversy concerning the maintenance of genetic variation. In the 1950s population geneticists were divided into two camps, one camp supporting the classical theory and the other the “balance” theory (Dobzhansky 1955). The classical theory asserted that most genetic variation within species is maintained by the mutation-selection balance, whereas the balance theory proposed that genetic variation is maintained primarily by overdominant selection or some other types of balancing selection. The major supporters of the former theory were H. J. Muller, James Crow, and Motoo Kimura, and those supporting the latter theory were Theodosius Dobzhansky, Bruce Wallace, and E. B. Ford. During this controversy, it became clear that the amount of genetic variation maintained by overdominant selection can be much greater than that maintained by mutation-selection balance but that overdominant selection incurs a large amount of genetic load (genetic death or fertility excess required) that may not be bearable by mammalian organisms when the number of loci is large (Kimura and Crow 1964). For this reason, Lewontin and Hubby (1966) could not decide between the two hypotheses when they found electrophoretic variation.
However, Sved, Reed, and Bodmer (1967), King (1967), and Milkman (1967) proposed that this genetic load can be reduced substantially if natural selection occurs by choosing only individuals in which the number of heterozygous loci is greater than a certain number (truncation selection). Soon after these papers were published, Nei (1971, 1975) argued that, unlike artificial selection, natural selection does not occur in the form of truncation selection. In the meantime, Robertson (1967), Crow (1968), and Kimura (1968a, 1968b) suggested that most protein polymorphisms are probably neutral and that the wild-type alleles in the classical hypothesis are actually composed of many isoalleles or neutral alleles. However, the balance camp did not accept this suggestion because they believed that almost all genetic polymorphisms were maintained by balancing selection (Dobzhansky 1970; Clarke 1971).
One important advance in the study of evolution in this era was that the neutral theory generated many predictions about the allele frequency distribution within populations and the relationships between genetic variation within and between species, so that one could test the applicability of the neutral theory to actual data by using various statistical methods. In other words, we could use the neutral theory as a null hypothesis for studying molecular evolution (Kimura and Ohta 1971). This type of statistical study of evolution was almost never done before the neutral theory was proposed. The results of these studies are summarized by Lewontin (1974), Nei (1975, 1987), Wills (1981), and Kimura (1983). Although the interpretations of the results by these and other authors were not necessarily the same, it was clear by the early 1980s that the extent and pattern of protein polymorphism within species roughly agree with what would be expected from the neutral theory (e.g., Yamazaki and Maruyama 1972; Nei, Fuerst, and Chakraborty 1976; Chakraborty, Fuerst, and Nei 1978; Skibinski and Ward 1981; Nei and Graur 1984).
Of course, this does not mean that all amino acid substitutions are neutral. There must be some amino acid substitutions that are adaptive and change protein function. In fact, this was one of the important subjects of molecular evolution, and several such substitutions have been identified, as will be discussed later. There are also many deleterious mutations that are polymorphic but are eventually eliminated from the population, as expected from the classical theory of maintenance of genetic variation. Many of these deleterious mutations reduce the fitness of mutant heterozygotes only slightly, but some have lethal effects in the homozygous condition (Crow and Temin 1964; Mukai 1964; Simmons and Crow 1977).
In the study of evolution, DNA sequences are more informative than protein sequences because a large part of DNA sequences are not translated into protein sequences and there is degeneracy of the genetic code. The genetic variation in the noncoding regions of DNA such as the intergenic regions, introns, flanking regions and synonymous sites can only be studied by examining DNA sequences.
Because of degeneracy of the genetic code, a certain proportion of nucleotide substitutions in protein-coding genes are expected to be silent and result in no amino acid substitution. King and Jukes (1969) predicted that these silent or synonymous nucleotide substitutions should be more or less neutral and therefore the rate of synonymous nucleotide substitution should be higher than the rate of amino acid substitution if the neutral theory is correct. One of the first persons to study this problem empirically was Kimura (1977), who compared the rate of amino acid substitution (rA) with that of nucleotide substitution at the third codon position (r3) of histone 4 mRNA sequences from two species of sea urchins. The reason why he used the third codon position was that a majority of synonymous substitutions occur at this position. Kimura's results clearly supported the neutral theory. The highly conserved protein histone 4 showed an extremely low value of rA, which was estimated to be 0.006 × 10⁻⁹ per site per year. Yet, the rate of nucleotide substitution at the third codon position was r3 = 4 × 10⁻⁹. This latter rate is nearly the same as that for other nuclear genes studied later. These results supported King and Jukes' idea that the synonymous rate is nearly equal to the total mutation rate.
In 1981 even stronger support of the neutral theory came from studies of the evolutionary rate of pseudogenes (Li, Gojobori, and Nei 1981; Miyata and Yasunaga 1981). Pseudogenes are nonfunctional genes because they contain nonsense or frameshift mutations, and therefore according to the neutral theory the rate of nucleotide substitution is expected to be high and more or less equal to the total mutation rate. By contrast, if neo-Darwinism is right, one would expect that virtually no nucleotide substitution occurs in pseudogenes because they are functionless and there is no way for positive selection to operate. When Li, Gojobori, and Nei (1981) studied the rate of nucleotide substitution for three globin pseudogenes from the human, mouse, and rabbit, the average rate was about 5 × 10⁻⁹ per site per year and was much higher than the rates for the first, second, and third codon positions of the functional genes (table 1). An independent study by Miyata and Yasunaga (1981) of a mouse globin pseudogene also showed a high rate of substitution. These observations clearly supported the neutral theory rather than neo-Darwinian evolution.
In recent years, a large number of pseudogenes have been discovered in various organisms. For example, the human genome contains about 100 immunoglobulin (IG) heavy chain variable region (VH) genes, but about 50% of them are pseudogenes (Matsuda et al. 1993). Most of these genes have evolved faster than functional VH genes, so that they have long branches on phylogenetic trees compared with functional genes (fig. 2). However, as will be mentioned later, some pseudogenes have gained new functions and evolved slowly.
From the standpoint of the neutral theory, it is interesting to examine the extent of DNA polymorphism that is not expressed at the amino acid level. For genes which are subject to purifying selection, this silent or synonymous polymorphism is expected to be high compared with nonsynonymous polymorphism. In contrast, if polymorphism is maintained by advantageous mutations and selection (e.g., directional selection), one would expect that the level of synonymous polymorphism is lower than that of nonsynonymous polymorphism because selection occurs primarily for nonsynonymous substitutions. One way of distinguishing between the two hypotheses is to examine the extent of polymorphism at the first, second, and third codon positions of protein-coding genes. In nuclear genes all nucleotide changes at the second position lead to amino acid replacement, but if we consider all codon positions about 72% of nucleotide changes are expected to affect amino acids because of degeneracy of the genetic code (Nei 1975, 1987). At the first position, about 95% of nucleotide substitutions result in amino acid changes and at the third position about 28% result in amino acid substitutions. Therefore, if the neutral theory is correct, the extent of DNA polymorphism is expected to be highest at the third position and lowest at the second position. Available data indicate that this is generally the case. One of the fastest evolving genes in terms of the rate of nucleotide substitution is the hemagglutinin gene in the human influenza virus A (RNA virus). The rate is more than 1 million times higher than that of nuclear genes in eukaryotes (Air 1981; Holland et al. 1982). Initially, this high rate of nucleotide substitution appeared to be due to positive selection in addition to the high mutation rate. Table 2, however, indicates that even in this gene the third position has the highest degree of polymorphism and the second position has the lowest. 
Therefore, the high degree of polymorphism in this gene can be explained by the neutral theory.
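The proportions of amino acid-altering changes at each codon position can be checked roughly by enumerating all single-nucleotide changes in the standard genetic code. The following Python sketch does this under assumed conventions that are not taken from the studies cited above: every sense codon is weighted equally, all three alternative nucleotides are equally likely, and changes to stop codons are counted as nonsynonymous. Because of these conventions, the output need not match the percentages quoted above exactly, but the ordering of the three positions should be the same.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code listed in TCAG order; '*' marks stop codons.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def nonsyn_fraction_by_position():
    """Fraction of single-nucleotide changes that alter the amino acid,
    returned as a list for codon positions 1, 2, and 3."""
    changed = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODE.items():
        if aa == "*":
            continue  # start only from sense codons
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                # Assumption: a change to a stop codon counts as nonsynonymous.
                if CODE[mutant] != aa:
                    changed[pos] += 1
    return [c / t for c, t in zip(changed, total)]


if __name__ == "__main__":
    for pos, frac in enumerate(nonsyn_fraction_by_position(), start=1):
        print(f"codon position {pos}: {frac:.3f} of changes alter the amino acid")
```

Under the neutral theory, the expected ranking of polymorphism across codon positions is the reverse of the ranking produced by this function.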
The above studies indicate that the substitution rate is much higher at the DNA (or RNA) level than at the protein level and that many mutations resulting in amino acid changes are deleterious and eliminated from the population. However, the comparison of the rate of nucleotide substitution at the first, second, and third codon positions is an inefficient way of comparing the extents of synonymous and nonsynonymous nucleotide substitutions because some of the third-position substitutions lead to amino acid substitution and a small proportion of first-position substitutions do not change amino acids.
For this reason, a number of authors developed various statistical methods for estimating the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN) (Miyata and Yasunaga 1980; Perler et al. 1980; Li, Wu, and Luo 1985; Nei and Gojobori 1986; Li 1993; Pamilo and Bianchi 1993; Goldman and Yang 1994; Muse and Gaut 1994; Zhang, Rosenberg, and Nei 1998; Seo, Kishino, and Thorne 2004). Different methods depend on different assumptions but give similar estimates unless the extent of sequence divergence (d) is high. When d is high, the reliability of the estimates of dN and dS is low in all methods (Nei and Kumar 2000). For the purpose of testing positive or negative selection, conservative estimates of dN and dS are preferable because the assumptions of parametric methods are unlikely to be satisfied. For this reason, the method of Nei and Gojobori (1986) is often used. To minimize the errors due to incorrect assumptions, one may also use the number of synonymous “differences” per synonymous site (pS) and the number of nonsynonymous differences per nonsynonymous site (pN) (Zhang, Rosenberg, and Nei 1998).
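To make the quantities pS and pN concrete, the following Python sketch counts synonymous and nonsynonymous sites and differences for two aligned coding sequences, in the spirit of the unweighted-pathway approach of Nei and Gojobori (1986). It is a simplified illustration rather than a reference implementation: it assumes the standard genetic code, sequences without gaps or stop codons, and the convention that changes to stop codons are ignored when counting sites and that pathways passing through stop codons are discarded. The function names and the example sequences are hypothetical. The Jukes-Cantor formula is included to show how a proportion of differences p is converted into an estimated number of substitutions per site.

```python
from itertools import permutations, product
from math import log

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def syn_sites(codon):
    """Number of synonymous sites in a codon (between 0 and 3)."""
    s = 0.0
    for pos in range(3):
        syn = nonstop = 0
        for base in BASES:
            if base == codon[pos]:
                continue
            aa = CODE[codon[:pos] + base + codon[pos + 1:]]
            if aa == "*":
                continue  # assumed convention: ignore changes to stop codons
            nonstop += 1
            syn += aa == CODE[codon]
        if nonstop:
            s += syn / nonstop
    return s


def diff_counts(c1, c2):
    """Average (synonymous, nonsynonymous) differences over all pathways."""
    diff = [i for i in range(3) if c1[i] != c2[i]]
    if not diff:
        return 0.0, 0.0
    paths = []
    for order in permutations(diff):
        cur, sd, nd, valid = c1, 0, 0, True
        for pos in order:
            nxt = cur[:pos] + c2[pos] + cur[pos + 1:]
            if CODE[nxt] == "*":
                valid = False  # discard pathways through stop codons
                break
            if CODE[nxt] == CODE[cur]:
                sd += 1
            else:
                nd += 1
            cur = nxt
        if valid:
            paths.append((sd, nd))
    if not paths:  # assumption: treat all differences as nonsynonymous
        return 0.0, float(len(diff))
    return (sum(p[0] for p in paths) / len(paths),
            sum(p[1] for p in paths) / len(paths))


def pn_ps(seq1, seq2):
    """Return (pN, pS) for two aligned, gap-free coding sequences."""
    assert len(seq1) == len(seq2) and len(seq1) % 3 == 0
    S = Sd = Nd = 0.0
    for i in range(0, len(seq1), 3):
        c1, c2 = seq1[i:i + 3], seq2[i:i + 3]
        S += (syn_sites(c1) + syn_sites(c2)) / 2
        sd, nd = diff_counts(c1, c2)
        Sd += sd
        Nd += nd
    N = len(seq1) - S  # nonsynonymous sites
    return Nd / N, Sd / S


def jukes_cantor(p):
    """Convert a proportion of differences p (< 0.75) into a distance d."""
    return -0.75 * log(1 - 4 * p / 3)


if __name__ == "__main__":
    pN, pS = pn_ps("ATGGCTAAA", "ATGGCCAGA")  # hypothetical sequences
    print(f"pN = {pN:.4f}, pS = {pS:.4f}")
    print(f"dN = {jukes_cantor(pN):.4f}, dS = {jukes_cantor(pS):.4f}")
```

The correction becomes large and unstable as p approaches 0.75, which is one reason why the estimates of dN and dS are unreliable for highly divergent sequences.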
In the above approach, neutral evolution is examined by testing the null hypothesis of dN = dS or pN = pS. If dN (pN) is significantly greater than dS (pS), one may conclude that positive selection is involved. By contrast, dN (pN) < dS (pS) implies negative or purifying selection. Figure 3 shows the dN/dS ratio when a large number of orthologous genes are compared between humans and mice. This result clearly indicates that most genes are under purifying selection. Because purifying selection is allowed in the neutral theory, this result is also consistent with the theory.
In the last two decades various statistical methods for testing the neutral theory have been proposed (see Kreitman 2000; Wright and Gaut 2005 for reviews). They can be broadly divided into five major categories. The first one is to test the difference between synonymous (dS) and nonsynonymous (dN) nucleotide differences, as mentioned above. This test is useful for detecting the balancing selection (Hughes and Nei 1988, 1989a; Clark and Kao 1991) or directional selection (Tanaka and Nei 1989). However, I would like to discuss this method in Advantageous Mutations. The second category of tests is to examine the pattern of nucleotide frequency distribution and to detect positive or negative selection by testing the deviation of the distribution from the neutral expectation (Tajima 1989; Fu and Li 1993; Simonsen, Churchill, and Aquadro 1995). This is a DNA version of the test of neutrality of electrophoretic alleles of Watterson (1977), but it is expected to be more powerful than Watterson's if all the assumptions are satisfied. The third category of tests is to consider two or more loci and examine the consistency of within-species variation and between-species variation (Hudson, Kreitman, and Aguade 1987). If there is inconsistency, selection is invoked in one or more of the loci examined. This is again a DNA version of the previous test of electrophoretic data (Chakraborty, Fuerst, and Nei 1978; Skibinsky and Ward 1981). The fourth category is to examine whether the ratio of synonymous and nonsynonymous changes within populations is the same as that between populations (fixed differences) (McDonald and Kreitman 1991). The fifth category is to use Wright's (1938) steady-state distribution of allele frequency under irreversible mutations for synonymous and nonsynonymous sites separately (Sawyer and Hartl 1992). There are several other tests, but they are not used so often (Kreitman 2000).
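As an illustration of the second category of tests, the following Python sketch computes the D statistic of Tajima (1989) from a set of aligned sequences. It compares two estimates of the amount of variation, one based on the average number of pairwise differences and the other on the number of segregating sites, and scales their difference by its estimated standard deviation. The function name and the toy sequences are hypothetical, and the sketch assumes gap-free sequences of equal length sampled from a single population.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Tajima's D for a list of aligned, equal-length sequences."""
    n, length = len(seqs), len(seqs[0])
    if n < 4:
        raise ValueError("the variance term is zero for fewer than 4 sequences")
    # S: number of segregating (polymorphic) sites
    S = sum(1 for j in range(length) if len({s[j] for s in seqs}) > 1)
    if S == 0:
        return None  # D is undefined when there is no variation
    # pi: average number of pairwise nucleotide differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - S / a1) / sqrt(e1 * S + e2 * S * (S - 1))


if __name__ == "__main__":
    # Hypothetical samples: only rare variants versus only common variants.
    rare = ["TAAAA", "ATAAA", "AATAA", "AAATA", "AAAAT"]
    common = ["AAAA"] * 3 + ["TTTT"] * 3
    print("rare variants:  ", tajimas_d(rare))
    print("common variants:", tajimas_d(common))
```

An excess of rare variants gives a negative value and an excess of intermediate-frequency variants gives a positive value, but both patterns can also be produced by changes in population size, which is why the interpretation of such tests is not straightforward.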
In practice, however, many of the above methods (except the first category) are dependent on the assumption that the population size remains the same throughout the evolutionary process. If this assumption is violated, the interpretation of the test results becomes complicated. Furthermore, even if the above assumption approximately holds, the statistical power of the methods is generally quite low. Partly because of these reasons, it is often difficult to obtain convincing evidence of positive selection (Kreitman 2000; Wright and Gaut 2005). Nevertheless, these tests have shown that most mutations are nearly neutral or subject to purifying selection.
In the study of molecular evolution, many different theories such as overdominant selection, frequency-dependent selection, and varying selection intensity due to ecological factors have been proposed particularly with respect to the maintenance of genetic variation (Lewontin 1974; Nei 1975, 1987; Gillespie 1991). Most of them are no longer seriously considered as a general explanation, but the theory of slightly deleterious mutation of Ohta (1973, 1974) has recently received considerable attention. In early studies of protein polymorphism detected by electrophoresis, Lewontin (1974, p. 208) and Ohta (1974) noted that the average gene diversity or heterozygosity (H) for protein loci was about 6%–18% irrespective of the species studied and appeared to have no relationship with species population size. If this is true, it is certainly inconsistent with the neutral theory because in this theory the average heterozygosity should increase with population size if the mutation rate remains the same. For this reason, these authors criticized the neutral theory.
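The dependence of heterozygosity on population size under strict neutrality can be illustrated with the standard equilibrium result for the infinite-allele model, H = 4Nv/(1 + 4Nv), where N is the effective population size and v is the mutation rate per locus per generation. The following Python sketch simply evaluates this formula. The mutation rate and the population sizes are hypothetical values chosen for illustration only.

```python
def expected_heterozygosity(N, v):
    """Equilibrium heterozygosity under the neutral infinite-allele model.

    N: effective population size; v: mutation rate per locus per generation.
    """
    M = 4 * N * v
    return M / (1 + M)


if __name__ == "__main__":
    v = 1e-7  # hypothetical mutation rate per locus per generation
    for N in (10**4, 10**5, 10**6, 10**7):
        print(f"N = {N:>10,d}  H = {expected_heterozygosity(N, v):.4f}")
```

With a fixed mutation rate, the expected heterozygosity rises steadily with N and approaches 1 in very large populations, which is the expectation that the observed narrow range of H appeared to contradict.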
Ohta's original proposal of the slightly deleterious mutation theory was to explain this apparent constancy of average heterozygosity. She stated that if a population contains the wild-type alleles and many slightly deleterious mutations at a locus, the average heterozygosity in small populations would be relatively high because slightly deleterious alleles would behave as though they were neutral. In large populations, however, the effect of selection is stronger, and many deleterious mutations would be eliminated. Therefore, average heterozygosity could be more or less the same for different population sizes (Ohta 1974). However, the observation by Lewontin and Ohta was based on data from a small number of species, and when many different species were considered, average heterozygosity was generally lower in vertebrates with small population sizes than in invertebrate species with large population sizes (Nei 1975). Later Nei and Graur (1984) studied this problem using data from 341 species and reached the conclusion that average heterozygosity generally increases with increasing species size particularly when bottleneck effects are taken into account (fig. 4). Therefore, Ohta's original explanation is no longer applicable.
On the basis of classical experiments on lethal or other deleterious mutations, Ohta also argued that the mutation rate must be constant per generation rather than per year though empirical data on amino acid substitutions suggested approximate constancy per year. To resolve this inconsistency, she developed an elaborate mathematical formula about the relationships among the mutation rate (v), generation time (g), effective population size (N), and selection coefficient (s) and used it to support her theory (Ohta 1977). However, later studies showed that the rate of “nondeleterious mutations” is roughly constant per year rather than per generation when long-term evolution is considered (e.g., Nei 1975, 1987; Wilson, Carlson, and White 1977). Although the molecular clock is very crude, this raised a question about her formulation.
Nevertheless, investigators studying the mechanism of maintenance of DNA polymorphism have often concluded that there is an excess of polymorphism due to deleterious mutations compared with apparently neutral mutations that have been fixed in closely related species (e.g., Sunyaev et al. 2001; Hughes et al. 2003; Zhao et al. 2003; Hughes 2005). However, this type of observation does not necessarily refute the neutral theory because in the premolecular era it was already shown that most outbreeding populations contain a large number of deleterious alleles in heterozygous condition (Muller 1950; Morton, Crow, and Muller 1956; Simmons and Crow 1977). Note that the neutral theory never claims that all alleles are neutral but that the majority of mutations fixed in the population are neutral or nearly neutral, as mentioned above. Existence of deleterious alleles in the population also does not necessarily support Ohta's theory.
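Comparisons of this kind are often summarized in a 2 × 2 table of nonsynonymous and synonymous counts for polymorphic sites within species and fixed differences between species, as in the test of McDonald and Kreitman (1991) mentioned earlier. The following Python sketch computes the odds ratio of such a table (sometimes called a neutrality index) and a two-sided Fisher exact probability. The counts are hypothetical and are not taken from any of the studies cited above, and the function names are my own.

```python
from math import comb


def neutrality_index(pn, ps, dn, ds):
    """Odds ratio (Pn/Ps)/(Dn/Ds); values above 1 indicate an excess of
    nonsynonymous polymorphism relative to nonsynonymous fixed differences."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact probability for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Hypothetical counts: polymorphic (Pn, Ps) and fixed (Dn, Ds) sites
    Pn, Ps, Dn, Ds = 30, 40, 10, 40
    print("odds ratio:", neutrality_index(Pn, Ps, Dn, Ds))
    print("Fisher exact P:", fisher_exact_two_sided(Pn, Ps, Dn, Ds))
```

An odds ratio greater than 1 is the pattern expected when many segregating nonsynonymous variants are deleterious and rarely reach fixation, which is compatible with purifying selection and does not by itself distinguish between the theories discussed here.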
Another problem with Ohta's theory is that if deleterious mutations accumulate in a gene, the gene gradually deteriorates and eventually loses its function (fig. 1B). If this event occurs in many important genes, the population or species would become extinct (Kondrashov 1995). In some genes such as ribosomal RNA (rRNA) genes the effects of initial mutations occurring in the stem regions may be detrimental, but the effects can be nullified by subsequent compensatory mutations (Hartl and Taubes 1998). Ohta (1973) included these mutations in the category of slightly deleterious mutations. However, these mutations do not change the gene function when long-term evolution is considered (fig. 1A), and therefore they should be called neutral mutations (Itoh, Martin, and Nei 2002). Note that evolution cannot happen by deleterious mutations; it should be caused by either advantageous or neutral mutations, as recognized by Darwin (1859).
In recent years Ohta (1992, 2002) modified her theory calling it the nearly neutral theory. In this theory she now assumes that both positive and negative mutations occur, with the condition of |Ns| ≤ 4. However, this theory is essentially the same as the original neutral theory conceived by many early molecular biologists (fig. 1A). Actually, mutations with even larger |Ns| values can be called neutral if N is large.
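The practical meaning of bounds such as |2Ns| ≤ 1 or |Ns| ≤ 4 can be examined by comparing the fixation probability of a new mutation with that of a strictly neutral mutation. The following Python sketch uses the diffusion approximation of Kimura (1962) for genic selection, assuming that the effective size equals the actual size N and that the mutation starts as a single copy (frequency 1/(2N)). The population size and the selection coefficients are hypothetical, and the calculation is only an illustration; it does not by itself determine which definition of neutrality is appropriate.

```python
from math import exp


def fixation_prob(N, s):
    """Fixation probability of a single new mutation with selective
    advantage s in a diploid population of size N (assuming Ne = N)."""
    if s == 0:
        return 1 / (2 * N)
    return (1 - exp(-2 * s)) / (1 - exp(-4 * N * s))


def relative_to_neutral(N, s):
    """Fixation probability divided by the neutral value 1/(2N)."""
    return fixation_prob(N, s) * 2 * N


if __name__ == "__main__":
    N = 1000  # hypothetical population size
    for Ns in (-4, -1, -0.5, 0, 0.5, 1, 4):
        s = Ns / N
        print(f"Ns = {Ns:>4}: relative fixation probability = "
              f"{relative_to_neutral(N, s):.3f}")
```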
Muller (1932) argued that in the absence of back mutation, deleterious mutation would accumulate more quickly in asexual organisms than in sexual organisms. The reason is that in asexual organisms all deleterious mutations are inherited together from the parent to the offspring because in the absence of recombination they cannot be eliminated. Deleterious mutations can also accumulate in sexual populations, but the rate of accumulation is expected to be much lower than in asexual populations. This effect of asexual reproduction is called the ratchet effect or Muller's ratchet. Computer simulations (e.g., Felsenstein 1974; Haigh 1978; Takahata 1982) and theoretical studies (e.g., Crow and Kimura 1965; Pamilo, Nei, and Li 1987; Kondrashov 1995) have shown that Muller's argument is essentially correct and provided a theoretical basis of advantage of sexual reproduction over asexual reproduction. Muller's ratchet also predicts that asexual species eventually become extinct unless back mutation occurs, and this explains the fact that most asexual or parthenogenetic species in animals and plants are generally short lived (White 1978). Only in the presence of strong purifying selection at the protein level and a small amount of back mutations can asexual species survive for tens of millions of years (Welch and Meselson 2000).
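The ratchet effect can be demonstrated with a very simple simulation of an asexual population without back mutation. The following Python sketch is a toy model with assumed features that are not taken from the studies cited above: a constant population size, a Poisson-distributed number of new deleterious mutations per offspring with mean U, and multiplicative fitness (1 − s)^k for an individual carrying k mutations. All parameter values are hypothetical. The quantity tracked is the smallest number of mutations carried by any individual, which can never decrease in this model.

```python
import random
from math import exp


def poisson(lam, rng):
    """Poisson random number by Knuth's method (adequate for small lam)."""
    limit, k, p = exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def simulate_ratchet(N=200, U=0.3, s=0.02, generations=300, seed=1):
    """Return the minimum mutation load in the population each generation."""
    rng = random.Random(seed)
    loads = [0] * N  # number of deleterious mutations per individual
    min_load = []
    for _ in range(generations):
        # Parents are sampled in proportion to their fitness (1 - s)^k
        weights = [(1 - s) ** k for k in loads]
        parents = rng.choices(loads, weights=weights, k=N)
        # Offspring inherit the whole parental genome plus new mutations
        loads = [k + poisson(U, rng) for k in parents]
        min_load.append(min(loads))
    return min_load


if __name__ == "__main__":
    trajectory = simulate_ratchet()
    print("minimum load every 50 generations:", trajectory[::50])
```

Each increase in the minimum load corresponds to one click of the ratchet: the least-loaded class has been lost by chance and, without recombination or back mutation, cannot be recovered.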
A number of authors have argued that because slightly deleterious mutations are likely to be fixed with a higher probability in small asexual populations than in large outbreeding populations, a higher rate of amino acid substitution observed in sheltered chromosomes such as some mitochondrial genomes (Lynch 1996) and the genomes of the bacteria Buchnera species which are parasitic to aphids (e.g., Moran 1996; Clark, Moran, and Baumann 1999) is caused by the ratchet effect. It is true that slightly deleterious mutations may be fixed in small populations even if they are eliminated in large populations. This is particularly so for sheltered chromosomes without recombination such as the Y chromosome in mammals (Nei 1971; Charlesworth 1978). However, the continuous accumulation of deleterious mutations will deteriorate the gene function irrespective of the population size (fig. 1B). Because the symbiosis of Buchnera and aphids apparently occurred about 200 MYA (Moran et al. 1993) and the mitochondrial genome in eukaryotes appears to have originated by infection of an α-proteobacterial species about 1.5 billion years ago (Javaux, Knoll, and Walter 2001), the functional genes remaining in these genomes must have been maintained by strong purifying selection and occasional back mutation. This suggests that the ratchet effect is unlikely to explain the enhanced rate of amino acid substitution, which has been observed in Buchnera genomes.
Comparing the rates of amino acid substitution of all the genes of a Buchnera species with those of their closely related bacterial species, Itoh, Martin, and Nei (2002) suggested that the higher rate in Buchnera is caused by either enhanced mutation rate or relaxation of selective constraints in small populations. The first hypothesis was supported by the lack of several DNA repair enzymes in the Buchnera genome, and the latter hypothesis is likely to apply because of the change of metabolism in symbiotic bacteria. This explanation is more reasonable than Muller's ratchet. Of course, these genomes have lost many original genes either because they were no longer needed under the condition of symbiosis or because they were transferred to the host nuclear genome (Martin et al. 2002). However, this is a problem different from the enhancement of amino acid substitution and will be treated elsewhere.
Kimura (1983) proposed that molecular evolution occurs by random fixation of neutral or nearly neutral mutations, but he believed that the evolution of morphological or physiological characters occurs following the classical neo-Darwinian principle. However, we should note that all morphological characters are ultimately controlled by DNA, and therefore morphological evolution must be explained by molecular evolution of genes. In other words, evolution is not dichotomous as Kimura assumed, and we should be able to find the molecular basis of phenotypic evolution. For this reason, Nei (1975, 1987) presented the view that although a majority of amino acid substitutions may have occurred by random fixation, there must be some substitutions which are adaptive.
Indeed, there are now many such examples (table 3). One of the earliest works supporting this idea was the study of Perutz et al. (1981) on the evolutionary change of hemoglobin in crocodiles. Crocodilian hemoglobin lost its original function (the binding of organic phosphate, chloride, and carbamino CO2) and gained a new function (bicarbonate binding). This functional change represents an adaptive response to the blood acidity that occurs during the prolonged stay of crocodiles under water and can be explained by five amino acid substitutions. This is a small portion of the total number of amino acid substitutions (123) between crocodiles and humans. In general, most amino acid substitutions in hemoglobins do not appear to be related to any significant functional change (Perutz 1983). The functional change of stomach lysozyme of ruminants can also be explained by a small proportion of amino acid changes (Jolles et al. 1984).
The “red” and “green” color vision genes in humans are contiguously located on the X chromosome and are believed to have been generated by gene duplication that occurred just before humans and Old World monkeys diverged. The proteins (opsins) encoded by these two genes are known to have 15 amino acid differences (Nathans, Thomas, and Hogness 1986). However, only three amino acid differences are responsible for the functional difference of the two proteins, and other amino acid differences are virtually irrelevant (R. Yokoyama and S. Yokoyama 1990). Yokoyama and Radlwimmer (2001) have shown that most evolutionary changes of red-green color vision in vertebrates can be explained by amino acid changes at five critical sites of the protein. Some other examples of adaptive evolution by a few amino acid substitutions are given in table 3.
As mentioned earlier, positive Darwinian selection may be detected by comparing the number of synonymous substitutions per synonymous site (dS) and the number of nonsynonymous substitutions per nonsynonymous site (dN). One of the first applications of this approach was done by Hughes and Nei (1988, 1989b), who compared dN and dS for the peptide-binding site (PBS) (or antigen-recognition site) composed of about 57 amino acids and the non-PBS of MHC genes from humans and mice. MHC molecules are for distinguishing between self- and non–self-peptides and play a role of the initial step of the adaptive immunity. Their results clearly showed dN > dS for the PBS but dN < dS for the non-PBS. These results suggested that in the PBS positive selection operates, whereas in the non-PBS purifying selection prevails. Interestingly, vertebrate MHC loci are exceptionally polymorphic, and the cause of this high degree of polymorphism had been debated for more than two decades before 1988. One hypothesis for explaining this polymorphism was heterozygote advantage or overdominant selection (Doherty and Zinkernagel 1975), but there was no evidence supporting this hypothesis. Knowing that dN will be greater than dS under overdominant selection (Maruyama and Nei 1981), Hughes and Nei (1988) proposed that the high degree of MHC polymorphism is probably caused by overdominant selection. Later Takahata and Nei (1990) showed that the overdominance hypothesis can also explain the transspecific polymorphism of MHC genes previously observed by Figueroa, Günther, and Klein (1988), Lawlor et al. (1988), and Mayer et al. (1988), and others. Since then, hundreds of different studies have been conducted about the relative values of dN and dS for MHC genes from different species, and most of the studies have shown essentially the same results (Hughes and Yeager 1998; Hughes 1999). Some demographic data suggesting heterozygote advantage at MHC loci have also been published (Hedrick 2002).
These findings about MHC genes stimulated similar studies for many other immune systems genes including those for IGs (Tanaka and Nei 1989), T-cell receptors (TCRs) (Su and Nei 2001), and natural killer cell receptors (Hughes 2000). These studies also identified positive selection at the ligand-recognition site, but the genes involved are not as polymorphic as are MHC genes, and it appears that positive selection is not just for generating genetic polymorphism but for accelerating gene turnover in the population (Tanaka and Nei 1989). It is possible that the accelerated rate of nonsynonymous substitution is caused by protection of the host organism from the attack of ever-changing parasites such as viruses, bacteria, and fungi (arms race).
A higher value of dN than dS has also been observed in many disease-resistant genes in plants (e.g., Michelmore and Meyers 1998; Xiao et al. 2004). These genes are essentially immune system genes and defend the host organism from parasites. Another group of genes that often show the dN > dS relationship are antigenic genes in the influenza virus (Ina and Gojobori 1994; Fitch et al. 1997), human immunodeficiency virus-1 (Hughes 1999), plasmodia (Hughes 1999), and other parasites. These genes, especially RNA viral genes, usually show a high rate of mutation and help the parasites to avoid the surveillance systems of host organisms. Here the high rate of nonsynonymous substitution compared with that of synonymous substitution is apparently caused by the “arms race” between hosts and parasites.
Another class of genes that often show a high ratio of dN/dS is the genes expressed in reproductive organs such as testis and ovary (see Swanson and Vacquier 2002 for reviews). This was first noticed by Civetta and Singh (1995) in their electrophoretic study of interspecific protein differences in Drosophila. Recently this problem was studied by comparing synonymous and nonsynonymous nucleotide substitutions. Singh and colleagues (Torgerson, Kulathinal, and Singh 2002; Torgerson and Singh 2003) computed the dN/dS ratio between human and mouse orthologous genes and showed that the ratio is generally higher for the genes expressed in sperm than those expressed in other tissues, but it was generally lower than 1.
Transcription factor genes are generally highly conserved (Makalowski, Zhang, and Boguski 1996; Nam and Nei 2005). However, some homeobox genes which are located on the X chromosome and control testis-gene expression have often shown a dN/dS value higher than 1 (Sutton and Wilkinson 1997; Ting et al. 1998; Wang and Zhang 2004). This high dN/dS ratio is primarily caused by a high rate of nonsynonymous substitution in nonhomeobox domains. Furthermore, comparison of human and mouse genes showed that the X-linked homeobox genes expressed in testis generally evolve faster than the genes expressed in other tissues whether they are X linked or autosomal (Wang and Zhang 2004). A number of authors suggested that these genes might be related to the evolution of reproductive isolation. However, it is more likely that because the morphology of reproductive organs, particularly animal genitalia, is known to evolve rapidly without serious consequences (Darwin 1859; Eberhard 1985), the extent of purifying selection for the genes controlling reproductive organs may not be as strong as that for other organs. It is also possible that the accelerated evolution of these genes is due to the secondary effects of sexual selection (Eberhard 1985, 1996) or sperm competition (Clark 2002). At this stage, the real reasons for the accelerated evolution of genes expressed in reproductive organs remain unclear.
There are a number of reports indicating that the genes apparently controlling reproductive isolation between different species evolve faster than other genes. A well-documented case is the gene encoding sperm lysin in abalones, which are marine mollusks. In abalones, the eggs are enclosed by a vitelline envelope, and sperm must penetrate this envelope to fertilize the egg (Shaw et al. 1995). The vitelline envelope receptor for lysin (VERL) is a long acidic glycoprotein composed of 22 tandem repeats of 153 amino acids, and about 40 molecules of lysin bind with one molecule of VERL (Galindo, Vacquier, and Swanson 2003). The interaction between lysin and VERL is species specific, and therefore this pair of proteins apparently controls species-specific mating.
Interestingly, comparison of the lysin gene sequences from closely related abalone species generally shows a higher number of nonsynonymous substitutions (dN) than synonymous substitutions (dS) (Lee and Vacquier 1992; Lee, Ota, and Vacquier 1995). By contrast, comparison of VERL genes generally shows the relationship of dN < dS (Swanson and Vacquier 1998). Because these genes are involved in reproductive isolation of different species, a number of authors have speculated why the genes controlling reproductive isolation evolve faster than other genes (e.g., Swanson and Vacquier 2002).
Theoretically, the genes controlling reproductive isolation are expected to evolve slowly because they are supposed to maintain the mating of individuals within species and prevent the mating between different species or reduce the viability or fertility of interspecific hybrids (Nei, Maruyama, and Wu 1983; Nei and Zhang 1998). Figure 5 shows a genetic model explaining the species specificity between the lysin and VERL genes. Within a species (species 1 or 2), the lysin and VERL genes are compatible, so that mating occurs freely. However, if species 1 and 2 are hybridized, lysin and VERL are incompatible, and therefore the fertilization is blocked or reduced. This guarantees the species-specific mating when the two species are mixed. This is often called the Dobzhansky-Muller scheme of reproductive isolation. However, it is not very simple to produce species 2 from species 1 (or species 1 and 2 from their common ancestor) by a single mutation at the lysin and VERL loci because a mutation (Ak) at the lysin locus makes the lysin gene incompatible with the wild-type allele (Bi) at the VERL locus. A mutation (Bk) at the VERL locus also results in the incompatibility with the wild-type allele (Ai) at the lysin locus. Therefore, these mutations would not increase in frequency in the population. Of course, if mutations Ak and Bk occur simultaneously, lysin Ak and VERL Bk may become compatible. However, the chance that these mutants meet with each other in a large population would be negligible.
For this reason, Nei, Maruyama, and Wu (1983) proposed that the evolutionary change of allele Ai (or Bi) to Ak (or Bk) occurs by two or more steps of mutation and closely related alleles have similar functions. For example, Ai may mutate first to Aj and then to Ak, whereas Bi may mutate to Bj and then Bk. If Ai is compatible with Bi and Bj but not with Bk and if Bk is compatible with Aj and Ak but not with Ai, then it is possible to generate the species-specific combination of alleles at the lysin and VERL loci in species 2 with the aid of genetic drift or some kind of selection. However, this model of speciation would not accelerate the rate of nonsynonymous substitution compared with that of synonymous substitution. Rather it would generally give a relationship of dN < dS because negative selection operates when lysin Ai meets with VERL Bk or lysin Ak meets with VERL Bi. In fact, computer simulation has shown that the rate of amino acid substitution slows down in this case (Nei, Maruyama, and Wu 1983). A similar situation may occur with genes controlling hybrid inviability or infertility. Some authors called this a neutral model (see Coyne and Orr 2004), but this is clearly incorrect. The fertility of mating Ak × Bi need not be the same as that of mating Ai × Bk. This will explain asymmetric hybrid sterilities observed in experiments. Furthermore, many similar genetic loci are likely to be involved in real reproductive isolation. Orr (1995) presented a mathematical model of evolution of reproductive isolation. However, because he did not consider the process of fixation of incompatibility genes, his model does not really explain the evolution of reproductive isolation.
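The logic of the stepwise model can be checked with a small search over allele combinations. The following Python sketch treats a population as fixed for one lysin allele and one VERL allele, allows one mutation at a time to an adjacent allele in the series, and accepts a mutation only if the resulting lysin-VERL pair is compatible. The incompatibilities of Ai with Bk and of Ak with Bi follow the description above; the compatibility of every other pair (for example, Aj with all VERL alleles and Ak with Bj) is my assumption for illustration. The sketch ignores allele frequencies, drift, and selection intensity, so it shows only which routes are open, not how likely they are.

```python
from collections import deque

# Incompatible (lysin allele, VERL allele) pairs; all other pairs are
# assumed to be compatible.
INCOMPATIBLE = {("i", "k"), ("k", "i")}


def compatible(lysin, verl):
    return (lysin, verl) not in INCOMPATIBLE


def shortest_route(alleles):
    """Breadth-first search from the first to the last allele pair.

    alleles: ordered allele series, e.g. "ijk"; a mutation changes one
    locus to an adjacent allele in this series. Returns a list of
    (lysin, VERL) states, or None if no compatible route exists.
    """
    start, goal = (alleles[0], alleles[0]), (alleles[-1], alleles[-1])
    previous = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state == goal:
            route = []
            while state is not None:
                route.append(state)
                state = previous[state]
            return route[::-1]
        a, b = alleles.index(state[0]), alleles.index(state[1])
        for na, nb in ((a - 1, b), (a + 1, b), (a, b - 1), (a, b + 1)):
            if 0 <= na < len(alleles) and 0 <= nb < len(alleles):
                nxt = (alleles[na], alleles[nb])
                if nxt not in previous and compatible(*nxt):
                    previous[nxt] = state
                    queue.append(nxt)
    return None


if __name__ == "__main__":
    print("direct change (alleles i and k only):", shortest_route("ik"))
    print("stepwise change (alleles i, j, k):   ", shortest_route("ijk"))
```

With only the alleles i and k available, no route exists because every single mutation produces an incompatible pair, whereas the intermediate alleles Aj and Bj open a route of four mutational steps.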
Why, then, is the rate of nonsynonymous substitution enhanced in lysin genes? Swanson and Vacquier (1998, 2002) argued that the rate is enhanced because the internal repeats of the VERL gene are subject to concerted evolution, and the concerted evolution at this locus is the driving force of evolution of the lysin-VERL specificity enhancing the evolution of the lysin gene. However, this argument is not convincing. For example, if the lysin-VERL specificity is strong, a new mutant gene occurring in one of the VERL repeats would not be compatible with the wild-type lysin allele. Therefore, this mutant repeat would not spread through all the repeats of the VERL gene. Even if it spreads through many repeats, this new mutant allele will be incompatible with the wild-type lysin allele. This suggests that the Swanson-Vacquier model would not work. Vacquier and colleagues (Galindo, Vacquier, and Swanson 2003) later sequenced the entire set of VERL gene repeats for several abalone species and discovered that the last 20 C-terminal repeats of VERL show a pattern of sequence similarity as expected from concerted evolution. However, the N-terminal repeats 1 and 2 showed no evidence of concerted evolution. (Galindo, Vacquier, and Swanson [2003] actually claimed that repeats 1–2 evolved faster than repeats 3–22 because of positive selection. However, their phylogenetic tree for repeats 1–2 and 3–22 shows that repeats 3–22 evolved faster than repeats 1–2, the tree lengths for repeats 3–22 and repeats 1–2 being 1.3 and 0.75, respectively. Furthermore, their statistical method used for detecting positively selected sites was later criticized; see Selection at Single Codon Sites.) They then speculated that the accelerated rate of the lysin gene is probably due to sexual selection, sex conflict, or microbial attack (pathogen avoidance). However, the real mechanism for the enhanced lysin evolution remains unclear.
One hypothesis they did not consider is the high basicity of lysin (PI ≈ 10 ~ 11; J. Nam, personal communication) and the high acidity of VERLs (PI ≈ 4.7, Swanson and Vacquier 1998). These observations suggest that lysin must maintain a high level of basicity to bind acidic VERLs. Therefore, if a basic amino acid (arginine or lysine) in a lysin protein mutates to a nonbasic amino acid, some other amino acids must change to a basic amino acid to maintain a high PI level. This type of mutation and selection may enhance the rate of nonsynonymous substitution. Actually, this type of “unusual form of purifying selection” or “equilibrating selection” has been observed with the sperm protein protamine P1 (Rooney, Zhang, and Nei 2000). This small protein replaces histones and binds DNA in the process of spermatogenesis. It is a highly basic protein, and about 60% of amino acid residues of this protein are arginines, and the PI value is about 13. This high level of basicity is apparently required because the protein has to bind acidic DNA. Using various statistical methods, a number of authors suggested that the protamine P1 gene is subject to positive Darwinian selection in primates (e.g., Rooney and Zhang 1999; Wyckoff, Wang, and Wu 2000). However, if we know the function of protamine P1, it is unlikely that positive selection occurs continuously at many amino acid sites in this P1 gene. A more reasonable explanation is the occurrence of equilibrating selection to maintain a given PI level. If the same principle applies to lysin, the majority of amino acid substitutions in lysin may not be directly related to the species specificity of lysin and VERL, though the equilibrating selection alone would not be sufficient for explaining the high dN/dS ratio in lysin.
The lysin work stimulated further studies on the evolutionary rate of proteins that are potentially involved in reproductive isolation. Some of them are sea urchin bindin, which mediates the attachment of the sperm to the egg (Metz and Palumbi 1996), abalone protein sp18, which functions in the fusion of the sperm and egg (Swanson and Vacquier 1995), and a homeotic gene apparently responsible for male hybrid sterility (Ting et al. 1998). These genes appear to evolve fast, but the target genes have not been identified except for sea urchin bindin (Kamei and Glabe 2003). Therefore, the real reason for the accelerated evolution remains unclear.
Speciation or development of reproductive isolation is one of the most important unsolved problems in population genetics. To understand this problem, it is important to identify pairs of incompatibility genes between species and study the molecular mechanism of the incompatibility. At present, only Vacquier, Glabe, and their colleagues have done this in abalone and sea urchin. Unfortunately, their study has not given clear-cut answers. More studies with other incompatibility genes are necessary.
In plants there are many species showing self-incompatibility, and the genes controlling this self-incompatibility have recently been cloned and sequenced in a number of species. In these genes some portion of the coding regions appear to show the relationship of dN/dS > 1 (Clark and Kao 1991; Ishimizu et al. 1998; Takebayashi et al. 2003). This result is similar to that of MHC genes. However, this is what one would expect because the mating system ensures the occurrence of only heterozygotes for the self-incompatibility locus, and therefore this class of genes represents a case of strong overdominant selection (Wright 1939; Fisher 1958; Yokoyama and Nei 1979). Because of this strong overdominant selection, even a population of about 1,000 individuals may contain about 40 alleles in a species of evening primroses.
A genetic system similar to the plant self-incompatibility in terms of the formation of heterozygous individuals is the sex-determining or the complementary sex determiner (csd) gene in the honeybee. In this species males are produced from unfertilized eggs and are hemizygous (haploid) at the csd locus, whereas all individuals that are heterozygous develop into females (queen or worker bees). Homozygotes for this locus are also produced, but they are eaten by worker bees in the larval stage (Woyke 1963). In some other hymenopteran insects diploid males are also produced, but their offspring are sterile because diploid males produce diploid sperm (Hasselmann and Beye 2004). Therefore, only hemizygous males participate in reproduction. Yokoyama and Nei (1979) studied the population dynamics of the alleles at the csd locus and showed that a large number of alleles can be maintained even in a relatively small population primarily because of strong heterozygote advantage. Recently, the csd gene was cloned (Beye et al. 2003), and the polymorphic alleles were sequenced (Hasselmann and Beye 2004). The dN/dS analysis again showed that positive selection is operating in certain regions of the gene. Therefore, the evolutionary dynamics of alleles of plant incompatibility loci and of the honeybee csd locus are very similar.
As mentioned earlier, the extensive polymorphism at MHC loci is caused by diversifying selection at a relatively small number of amino acid sites. These sites were identified by examining the crystallographic structure of the protein molecule. Recently, two different types of statistical methods were developed to identify such amino acid sites without doing any experimental work. One of them is the individual site (IS) method proposed by Suzuki and Gojobori (1999). This method is based on the idea that if positive selection is operating at a given codon site, the total number of nonsynonymous substitutions (cN) for the codon site must be greater than the total number of synonymous substitutions (cS) when all branches of a phylogenetic tree are considered. In practice, a phylogenetic tree for the entire sequences is constructed by the neighbor-joining or some other method, and the nucleotide sequence at each ancestral node (organism) is inferred by parsimony methods (Fitch 1971; Hartigan 1973). The cN and cS values are then compared with the values expected under the assumption of neutrality. If cN is significantly greater than the neutral expectation, the codon site is considered to be under positive selection. Because cN and cS are computed by parsimony methods in the Suzuki-Gojobori method, it is model free as long as the sequence divergence is low. This makes it advantageous compared with model-dependent methods. However, the power of this method is low unless a large number (>100) of sequences are used. Recently, Suzuki (2004), Massingham and Goldman (2005), and Kosakovsky Pond and Frost (2005) developed a likelihood IS method, using a specific codon substitution model. Although some authors of this method claimed a higher efficiency of identifying selected sites than the other method mentioned below, it should be examined more carefully by using actual data.
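The per-site comparison of cN and cS with the neutral expectation can be illustrated with a small calculation. The following is a minimal sketch, not the published Suzuki-Gojobori implementation: it assumes that, under neutrality, each substitution at a codon site is nonsynonymous with a known probability `p_n` (a hypothetical input that would in practice be derived from the numbers of synonymous and nonsynonymous sites), and it uses a one-sided binomial tail as the test. All function names are illustrative.

```python
from math import comb


def upper_tail_probability(c_n: int, c_s: int, p_n: float) -> float:
    """Return P(X >= c_n) for X ~ Binomial(c_n + c_s, p_n).

    c_n and c_s are the nonsynonymous and synonymous substitution counts
    at one codon site, summed over all branches of the tree.
    p_n is the ASSUMED neutral probability that a substitution at this
    site is nonsynonymous (hypothetical input, not from the text).
    """
    total = c_n + c_s
    return sum(
        comb(total, k) * p_n**k * (1.0 - p_n) ** (total - k)
        for k in range(c_n, total + 1)
    )


def site_is_positively_selected(c_n: int, c_s: int, p_n: float,
                                alpha: float = 0.05) -> bool:
    """Flag the site when cN is significantly above its neutral expectation."""
    return upper_tail_probability(c_n, c_s, p_n) < alpha


# Nine nonsynonymous versus one synonymous change is not significant
# when 70% of changes are expected to be nonsynonymous anyway.
example_p = upper_tail_probability(9, 1, 0.7)
```

With these illustrative numbers the tail probability is about 0.15, so even a 9:1 excess is not significant at the 5% level. This is one way to see why the method needs many sequences, and hence many substitutions per site, to have reasonable power.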
The other method (pooled site or PS method) developed by Nielsen and Yang (1998) and Yang et al. (2000) is a Bayesian method of detecting selected sites using the mathematical quantity w = dN/dS. They first consider a specific selection model, in which a certain proportion of codon sites (p0) is assumed to have a given distribution of w ≤ 1 (neutral or negative selection), whereas other sites (p1) have a given value of w > 1 (positive selection). They then consider a null model in which no positive selection is assumed to operate with w ≤ 1. Otherwise, the two models are supposed to be close to each other. For example, one of their favorite null models assumes that w shows a β distribution among different sites with 0 ≤ w ≤ 1 (model M7). In this null model there are two parameters (a and b) for determining the β distribution and an additional parameter, k, representing the transition/transversion rate ratio. In the corresponding selection model (model M8), a proportion of sites (p0) follows a β distribution, and the remaining sites (p1) have a given w (w1 > 1). The existence of w1 is determined by conducting a likelihood ratio test for the two models. If the selection model shows a significantly higher maximum likelihood value than that for the null model, the posterior probability of w > 1 at each site of the p1 group is computed. If this probability is higher than 95%, the site is assumed to be positively selected.
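The likelihood ratio test between the null and selection models can be sketched as follows. This is only an illustration under stated assumptions: the log-likelihood values are hypothetical numbers, not results from any data set, and the test statistic is compared with a chi-square distribution with 2 degrees of freedom on the grounds that M8 has two more free parameters (p1 and w1) than M7. The text does not specify the reference distribution, so that choice is an assumption here.

```python
from math import exp


def lrt_statistic(lnl_null: float, lnl_selection: float) -> float:
    """Twice the gain in maximized log-likelihood of the selection model."""
    return 2.0 * (lnl_selection - lnl_null)


def chi2_sf_df2(x: float) -> float:
    """Upper-tail probability of a chi-square variable with 2 d.f.

    For 2 degrees of freedom the survival function is exactly exp(-x/2).
    """
    return exp(-x / 2.0) if x > 0 else 1.0


def m7_vs_m8_p_value(lnl_m7: float, lnl_m8: float) -> float:
    """Approximate P value for M7 (null) versus M8 (selection).

    ASSUMPTION: 2 d.f., one for each extra parameter of M8.
    """
    return chi2_sf_df2(lrt_statistic(lnl_m7, lnl_m8))


# Hypothetical maximized log-likelihoods for the two models.
p_value = m7_vs_m8_p_value(-1000.0, -995.0)
```

Only if this P value is small would the posterior probability of w > 1 be computed for individual sites, as described above.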
This PS method has been criticized by a number of authors. The main criticisms are as follows. (1) The initial computer program (PAML) often gave unreasonable results (Suzuki and Nei 2001; Wong et al. 2004). (2) This method often gives a high rate of false positives even when the assumptions hold (Suzuki and Nei 2002; Zhang 2004; Kosakovsky Pond and Frost 2005; Massingham and Goldman 2005). (3) False positives can also occur when wrong tree topologies are used (Suzuki and Nei 2004) or when intragenic recombination occurs (Shriner et al. 2003). (4) The null model to be compared with the selection model may not be appropriate (Swanson, Nielsen, and Yang 2003; Suzuki and Nei 2004). (5) Their selection model including M8 is often quite unrealistic. Taking into account some of these criticisms, Yang and his colleagues (Swanson, Nielsen, and Yang 2003; Wong et al. 2004; Yang, Wong, and Nielsen 2005) have revised both the computer program and the mathematical models for the likelihood ratio test. However, the problems raised have not been completely solved (Kosakovsky Pond and Frost 2005; Massingham and Goldman 2005). Theoretically, this method has an inherent tendency to generate false positives when dN and dS are small and the number of sequences used is small. The reason is that in this case w = (dN/dS) at a given site can be greater than 1 because of sampling errors, and if this happens the site is often included in the p1 site group. In particular, the sites with dN > 0 and dS = 0 are almost always included in the selected-sites group, even if dS = 0 occurred by chance (Suzuki and Nei 2004; Bishop 2005; Hughes and Friedman 2005; Suzuki 2005). Note also that dS varies considerably with gene region (Hughes and Nei 1988). In the PS method, the comparison of the selection and null models is always somewhat arbitrary.
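The sampling-error argument can be made concrete with a back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that each substitution at a neutral codon site is independently synonymous with probability `q_syn`. The value 0.25 is a hypothetical round number, not an estimate from the text. Under that assumption, the chance that a site with n substitutions shows no synonymous change at all (dS = 0) is (1 - q_syn)^n.

```python
def prob_no_synonymous(n_substitutions: int, q_syn: float) -> float:
    """Chance that none of n independent substitutions is synonymous.

    q_syn is the ASSUMED per-substitution probability of being
    synonymous under neutrality (hypothetical value).
    """
    return (1.0 - q_syn) ** n_substitutions


# How often dS = 0 arises by chance alone at a neutral site.
for n in (2, 5, 10):
    print(n, round(prob_no_synonymous(n, 0.25), 3))
```

When only a few substitutions have occurred at a site, dS = 0 is therefore a common outcome under neutrality, which is why sites with dN > 0 and dS = 0 are weak evidence of positive selection in small data sets.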
For these reasons, it appears that the IS method gives more reliable results than the PS method in detecting positively selected sites though the statistical power may not be high. However, whichever method is used, the estimated number of selected sites is generally small (Massingham and Goldman 2005). Therefore, the results obtained by these methods do not refute the early finding that the majority of amino acid substitutions are more or less neutral.
As mentioned above, the dN/dS test is often used to identify positive selection. In the case of MHC loci, overdominant or diversifying selection is apparently operating, and positive selection for a mutant gene is expected to enhance the fitness of the gene in the presence of newly introduced parasites. Needless to say, this positive selection is caused by some kind of functional change of the gene. Similar functional changes by amino acid substitution appear to occur in the disease-resistant genes in plants, which are known to be quite polymorphic (Xiao et al. 2004). However, in many other immune systems genes such as the IG genes in mammals and the antigenic sites of influenza viruses, the extent of polymorphism within a population is not necessarily high though the dN/dS ratio is quite high at the ligand-recognition sites (Tanaka and Nei 1989). This suggests that a high rate of turnover of amino acid sequences occurs in these genes, but it does not necessarily show overdominant selection. In other words, the amino acid sequences in these genes are subject to directional selection rather than diversifying selection, as mentioned earlier.
However, the functional change of a gene may occur even in the case of dN/dS < 1. As mentioned earlier, the function of a protein may change drastically even by a single or a few amino acid substitutions. In these cases the dN/dS ratio can be low even for this particular set of amino acid sites. In fact, Yokoyama and Takenaka (2004) reported that neither the Suzuki-Gojobori nor the Yang method can detect the five critical amino acid sites of color vision pigments that determine the variation of different vertebrate species.
Similarly, a high dN/dS value at some sites does not necessarily mean that the functional change of the gene has occurred. In hemoglobins, which are fairly well conserved, the rate of amino acid substitution is higher in the surface region of the molecule than in the interior or heme pocket region, but it does not seem to affect the function appreciably (Zuckerkandl and Pauling 1965; Kimura and Ohta 1973). (Yang et al. [2000] identified a few positively selected sites in the surface region by the PS method, but Massingham and Goldman [2005] could not find any such sites for the same data set by their method.) In general, most amino acid substitutions appear to be effectively neutral in the surface region of hemoglobin molecules, but those occurring in the other regions often change the protein function significantly (Perutz 1983). These results indicate that a high dN/dS ratio does not necessarily mean the improvement of protein function. The basic process of evolution or speciation is the functional change of phenotypic characters, which is undoubtedly caused by the functional change of proteins. Therefore, the evolutionary process should eventually be studied by experimental methods. However, it is more important to find a functional change of proteins than to identify sites with a high dN/dS value. Some statistical methods for this purpose have already been developed (e.g., Gu 1999, 2001; Knudsen and Miyamoto 2001; Nam et al. 2005).
Recent genome sequencing in many different model organisms has made it clear that gene duplication is an important mechanism of creating new genes and new genetic systems. The importance of gene duplication in creating new genes was first noted by Bridges (1935) and Muller (1936). They both reported that the salivary chromosome of Drosophila melanogaster contains many small intra-chromosomal duplications and that the mutant phenotype Bar is caused by a set of duplicate genes. This finding was later extended to propose that duplicate genes are the important source of creating new genes (Lewis 1951; Stephens 1951). However, molecular evidence for this idea was lacking until Ingram (1961, 1963) showed that myoglobin and hemoglobin α, β, and γ chains in humans are products of a series of gene duplications that occurred a long time ago. In the 1960s several more multigene families (Dayhoff 1969) were discovered, and this discovery prompted the study of evolution of multigene families. However, the real magnitude of the importance of gene duplication was not recognized until DNA sequence data became available from different model organisms.
There are two major mechanisms of producing duplicate genes: (1) genome duplication and (2) tandem gene duplication. Genome duplication does not necessarily double the number of functional genes because some genes are quickly silenced or lost from the genome (Wolfe and Shields 1997; Adams et al. 2003; Adams and Wendel 2005). Yet, this is the most effective mechanism for increasing the number of genes in the genome. By contrast, the increase of gene number by tandem duplication is small at a time, but it may produce thousands of genes if we consider long-term evolution as in the case of olfactory receptor (OR) genes (Glusman et al. 2001; Young et al. 2002; Zhang and Firestein 2002; Niimura and Nei 2003).
Considering the rate of increase of DNA content from bacteria to humans, the rate of amino acid substitution, and the rate of inactivation of duplicate genes, Nei (1969) predicted that vertebrate genomes contain a large number of duplicate genes and nonfunctional genes (pseudogenes). Recent genome sequencing in vertebrates has shown that this is indeed the case. It is interesting to know that the mammalian genome contains over 20,000 pseudogenes (Podlaha and Zhang 2004). This is nearly as large as the number of functional genes (about 23,000 genes). Nei also predicted that some nonfunctional genes may become useful again as genetic materials in the course of evolution. Although this was a bold hypothesis at that time, we now know that the IG pseudogenes in chicken and rabbits are used for the diversification of IG genes through somatic gene conversion (Reynaud et al. 1989; Tunyaplin and Knight 1995). Furthermore, some pseudogenes apparently evolve into regulatory genes, which are quite conserved (Korneev, Park, and O'Shea 1999; Podlaha and Zhang 2004). For example, the transcript of the mouse pseudogene Makorin1-p1 regulates the stability of the transcript of its paralogous functional gene Makorin1 (Hirotsune et al. 2003). For this reason, the gene Makorin1-p1 has a biological function and therefore evolves at a low rate in rodents (Podlaha and Zhang 2004).
Ohno (1970) has presented a treatise on the evolution by gene duplication. One of his main themes in this treatise was that genome duplication has an advantage over tandem duplication in the formation of new genes because in genome duplication both protein-coding and regulatory gene regions are duplicated, whereas tandem gene duplication may disrupt the coordination of regulatory and protein-coding genes. He then proposed that the mammalian genome experienced about two rounds of genome duplications before the evolution of the X and Y chromosomes. Ohno (1972) also proposed that a large proportion of the mammalian genome is noncoding or junk DNA. The latter view now seems to be largely correct, though the noncoding DNA contains a substantial number of regulatory elements.
However, the 2-round (2R) genome duplication hypothesis has been controversial. A number of authors including Ohno (1998) suggested that this hypothesis is supported by the presence of four chromosomes that contain the homologous set of Antennapedia-class Hox genes in mammals and only one linkage group in amphioxus (Branchiostoma floridae), a sister group of vertebrates. Kasahara et al. (1996) also supported this view by finding a similar pattern of multiple chromosomes having MHC class III gene clusters. By contrast, Hughes (1999), Friedman and Hughes (2001), and others examined the pattern of distribution of various duplicate genes on different chromosomes and rejected the importance of the 2R hypothesis.
In practice, however, we now know that a large number of tandem duplications or gene block duplications have occurred during the past several hundred million years and the duplicate genes have often been transferred to different chromosomes or chromosomal segments as was observed with OR genes (Glusman et al. 2001; Zhang and Firestein 2002; Niimura and Nei 2003). Therefore, even if the 2R hypothesis is correct, it would be very difficult to prove it because the history of genome duplication should have largely been erased (Makalowski 2001). Unlike Ohno's original argument, tandem duplication has no disadvantage in creating new genes compared with genome duplication.
In the past it was customary to study adaptive evolution by examining the evolution of a single or a few related genes, as in the case of globin and color vision genes. However, we now know that most genetic systems or phenotypic characters are controlled by many genes or many multigene families and their interaction. Here a genetic system means any functional unit of biological organization such as the olfactory system and the adaptive immune system (AIS) in vertebrates, flower development in plants, meiosis, and mitosis. Therefore it is important to understand the evolution of multigene families and their interaction.
Until around 1990, multigene families were believed to evolve following the model of concerted evolution as in the case of vertebrate rRNA genes (Brown, Wensink, and Jordan 1972; Smith 1974; Dover 1982). In this model it is assumed that all member genes of a gene family evolve as a unit in concert, and a mutation occurring in a repeat spreads through the entire set of member genes by repeated occurrence of unequal crossing over or gene conversion. This has an effect of homogenizing the member genes so that a large quantity of the same or similar gene products can be produced. This is very important for such a gene family as rRNA genes because a large quantity of rRNA is necessary for gene translation. However, recent studies indicate that most multigene families which are concerned with genetic systems or phenotypic characters evolve following the model of birth-and-death evolution (Nei and Hughes 1992; Nei, Gu, and Sitnikova 1997; Nei and Rooney 2005). This model assumes that new genes are created by gene duplication, and some duplicate genes are maintained in the genome for a long time because of the new gene function acquired whereas others are deleted or become nonfunctional through deleterious mutations. The number of cases to which this model applies is rapidly increasing (Nei and Rooney 2005). Because birth-and-death evolution allows some groups of duplicate genes to stay in the genome for a long time while others acquire new gene functions, this model is useful for understanding the origins of new genetic systems.
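The birth-and-death model described above can be illustrated with a toy simulation. This is a minimal sketch under simplifying assumptions that are not part of the text: time is discrete, each functional gene independently duplicates with probability `birth` or is lost with probability `death` in each step, and every lost gene is counted as a pseudogene that is never physically deleted. The rates and function name are hypothetical.

```python
import random


def simulate_birth_and_death(n_genes: int, birth: float, death: float,
                             steps: int, seed: int = 0) -> tuple[int, int]:
    """Return (functional genes, pseudogenes) after `steps` time steps.

    ASSUMPTIONS: discrete steps; per-gene duplication probability `birth`
    and inactivation probability `death` (birth + death <= 1);
    pseudogenes accumulate and are never removed.
    """
    rng = random.Random(seed)
    functional, pseudogenes = n_genes, 0
    for _ in range(steps):
        born = lost = 0
        for _ in range(functional):
            u = rng.random()
            if u < birth:            # gene duplication creates a new copy
                born += 1
            elif u < birth + death:  # deleterious mutation inactivates the gene
                lost += 1
        functional += born - lost
        pseudogenes += lost
    return functional, pseudogenes


# Equal birth and death rates: family size drifts while pseudogenes pile up.
functional, pseudogenes = simulate_birth_and_death(20, 0.02, 0.02, 100, seed=1)
```

Unlike concerted evolution, nothing in this process homogenizes the member genes, so long-lived copies are free to diverge in sequence and function.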
One of the most well-studied cases is the evolution of AIS in jawed vertebrates. In the AIS the immunity for certain groups of parasites (viruses, bacteria, fungi, and others) is memorized once the host is attacked by them. A well-known example is the immunity against smallpox viruses. However, the vertebrates without jaw and nonvertebrate animals do not have this system, though most animals have the so-called innate immune system which defends the host from parasites but does not memorize the past attack. The evolution of the AIS is still unclear, but this system works with the interaction of many different multigene families. Most of these multigene families are evolutionarily related (fig. 6) and are apparently products of long-term birth-and-death evolution. Therefore, it seems that continuous operation of birth-and-death process and interaction of genes within and between different gene families has generated this immune system.
The major gene families controlling the AIS are the MHC, TCR, and IG families. Each of these families is composed of several gene subfamilies (Klein and Horejsi 1997). Thus, the MHC family can be divided into the class I and class II subfamilies, which have different molecular structures and functions. The TCR family can be divided into the α, β, γ, and δ subfamilies, and the genes from different subfamilies are required to form TCRs. The IG family can also be divided into the heavy (H) and light (L) chain gene subfamilies. H and L genes encode the H and L chains of IG molecules. However, these components are not sufficient to form the AIS. There are several other molecules specific to the AIS in jawed vertebrates, and these genes are also necessary (Klein and Horejsi 1997; Klein and Nikolaidis 2005). For this system to work, some organs and cells that are apparently specific to jawed vertebrates, such as the thymus and T-lymphocytes, are also needed.
Because the MHC, TCR, and IG gene families and some organs exist only in jawed vertebrates, a number of authors (e.g., Abi Rached, McDermott, and Pontarotti 1999; Kasahara, Suzuki, and Pasquier 2004) proposed the big bang hypothesis that the AIS arose suddenly within a relatively short evolutionary time owing to the postulated two rounds of genome duplication. Klein and Nikolaidis (2005) examined the evolution of each component of the AIS and reached the conclusion that the evolution of the AIS must have begun long before the divergence of jawed and jawless vertebrates and that this evolution initially occurred through changes in genes unrelated to immune response and later these component genes were assembled to form the AIS. This view of evolution is consistent with Darwin's gradual evolution.
Another example of the origin of new genetic systems by birth-and-death evolution is the evolution of the olfactory system in vertebrates. One group of molecules that controls olfaction (odor recognition) is the OR group, expressed in epithelial neurons in the nasal cavity. The mouse genome contains about 1,000 functional genes that encode ORs and about 400 pseudogenes (Zhang and Firestein 2002; Young et al. 2002; Zhang et al. 2004; Niimura and Nei 2005a). It is relatively easy to identify the orthologous genes between different species of vertebrates. The numbers of functional genes and pseudogenes in humans, chicken, and Xenopus are somewhat smaller than those in mice, but zebrafish and pufferfish have much smaller numbers of genes (table 4).
Phylogenetic analysis of functional olfactory genes from humans and mice has shown that they can be divided into two major groups, i.e., class I and class II genes, and that both classes of genes are subject to birth-and-death evolution. The numbers of human class I and class II genes are 57 and 331, respectively, and class II genes are further divided into 19 phylogenetic clades (Niimura and Nei 2003) or 172 subfamilies identified by sequence similarity (Malnic, Godfrey, and Buck 2004). These phylogenetic clades or subfamilies are associated with the recognition of different odorants (Malnic, Godfrey, and Buck 2004). Therefore, the functional differentiation of duplicate genes has already occurred. Class I genes are often called “fish like” genes because they have some sequence similarity with fish sequences. These genes are believed to detect aquatic odorants. By contrast, class II genes are called “mammalian” genes and are supposed to be for airborne odorants.
However, the phylogenetic analysis of OR genes from zebrafish, pufferfish, Xenopus, chicken, mice, and humans showed that the fish genome has several highly divergent groups of duplicate genes, and one of them gave rise to the mammalian class I genes and another one generated the class II genes (Niimura and Nei 2005b). After bony fishes and mammals diverged about 400 MYA, class II genes increased enormously in the mammalian lineage. Bony fishes apparently have at least eight OR gene groups which are highly divergent (table 4). Although the functions of these groups of genes have not been studied, it is quite possible that the sequence divergences are associated with functional differentiation.
In the above paragraphs we presented two examples of evolution of genetic systems by gene duplication and differentiation. Another important genetic system is the animal and plant development controlled by the homeobox gene superfamily. This superfamily is very old and shared by animals, plants, and fungi, but in the three different kingdoms they play different roles. The homeobox genes can be divided into typical and atypical genes. Typical homeobox genes contain a homeobox of 60 codons, whereas atypical group genes have a homeobox with a little more or fewer codons (Burglin 1997). In animals the homeobox genes can be divided into at least 49 different families (Burglin 1997; Nam and Nei 2005). Different families show different developmental roles. For example, the HOX gene family is responsible for determination of body pattern, and the PAX6 family is concerned with eye formation and evolution (Gehring and Ikeo 1999). All these gene families are also products of gene duplication and functional differentiation. The lineage-specific gain and loss of gene families also appears to be important for the differentiation of morphological characters among different organisms (Wagner, Amemiya, and Ruddle 2003; Hughes and Friedman 2004; Nam and Nei 2005; Ogura, Ikeo, and Gojobori 2005).
Another homeotic gene superfamily is the MADS-box gene superfamily. This superfamily is also very old and concerned with the development of plants and animals. The most well-known function of this family is the flower formation in plants (Weigel and Meyerowitz 1994; Theissen 2001; Nam et al. 2003). Some other gene families such as those for aldehyde dehydrogenase (Yoshida et al. 1998; Kirch et al. 2004), adenosine triphosphate–binding cassette transporters (Dean, Rzhetsky, and Allikmets 2001), and plant disease resistance genes (Kuang et al. 2004) are also known to have functionally differentiated subfamilies of duplicate genes though the interaction among these subfamilies has not been studied well.
In the beginning of this paper I indicated that Charles Darwin recognized natural selection as the major factor of evolution, but he also accepted the possibility of nonadaptive evolution of some phenotypic characters. Thomas Morgan modernized this view by clarifying the roles of mutation and natural selection based on Mendelian genetics. He suggested that mutation is the primary force of evolution and selection is merely a sieve to save advantageous mutations and eliminate deleterious mutations. Neo-Darwinians criticized this view arguing that natural selection has creative power in the presence of abundant mutations that are stored in the population and that mutation merely provides raw material on which natural selection operates to create innovative characters (Simpson 1953; Mayr 1963; Dobzhansky 1970).
The molecular study of evolution has generated many new findings that have indicated the importance of mutation in the evolutionary change of DNA or protein molecules. Neo-Darwinians dismissed these findings by arguing that they have nothing to do with phenotypic evolution in which most evolutionists are interested (Mayr 2001). However, because phenotypic characters are ultimately controlled by DNA molecules, any change in phenotypic characters must be caused by some changes of DNA (excluding environmental effects). In fact, recent molecular and genomic studies indicate that the evolutionary change of phenotypic characters is primarily caused by new mutations including gene duplication and other DNA changes.
It now seems that the basic process of phenotypic evolution is essentially the same as that of molecular evolution. The major difference is in the relative importance of mutation and selection (Nei 1975, 1987, chap. 14). Obviously, natural selection plays more important roles in phenotypic evolution than in molecular evolution. However, the major force is still mutation for both types of evolution, and without mutation no evolution can occur. This is consistent with Morgan's mutation-selection theory, and in this sense molecular evolution is not non-Darwinian, as once called by King and Jukes (1969).
One might think that the extent of nonadaptive changes is greater in molecular evolution than in phenotypic evolution. I am not sure about this view, though it is difficult to compare these two types of evolution. We have certainly seen that the extent of nonadaptive changes of proteins is much greater than that of adaptive changes. However, this may be the case even with phenotypic evolution. At present, about 6 billion people are living on our planet, but all of them show different phenotypes except identical twins. Most of this phenotypic variation in human populations appears to have little to do with the fitness as measured by the number of children (excluding the effects of birth control). If this is the case, the extent of nonadaptive genetic changes of phenotypic characters may be as great as that of protein variation (Nei 1975, 1984, 1987).
Previously I mentioned that one of the most important findings in the study of molecular evolution is that of relationships between the functional constraints of gene products and the rates of amino acid or nucleotide substitution. We have seen that pseudogenes which do not have any function evolve at the fastest rate and functionally important genes evolve very slowly. Similar properties exist also in phenotypic evolution. Many unused or vestigial characters evolve quickly and lose their function or disappear, as in the case of the eyes of cavefish. However, if the environmental constraints are strong, the organism does not evolve very much. For example, the morphology of horseshoe crabs has scarcely changed for the last 200 Myr apparently because they have lived in essentially the same environment. By contrast, if a group of organisms lives in various new environments, the organisms may evolve rapidly into different taxonomic groups specialized in different environmental conditions, as in the case of mammalian radiation. Darwin (1872) discussed this type of evolution extensively with respect to morphological characters.
Molecular evolution, however, has a distinct feature which is not shared by phenotypic evolution. It is the evolution of multigene families. We now know the general properties of evolution of multigene families and their importance for the evolution of phenotypic characters and new genetic systems (Nei and Rooney 2005). In this case, both gene duplication and gene loss appear to play an important role (Nam and Nei 2005). Gene duplication also generates the multiple biochemical pathways for developing the same phenotypic character. However, the interaction of transcription factors and protein-coding genes or protein-protein interaction is quite complex. Therefore, the study of evolution of complex phenotypic characters would not be easy. Yet, this is one of the most important problems in evolutionary biology at present.
When gene duplication occurs, the occurrence of this event must be more or less random. The fixation of duplicate genes in multigene families also appears to be quite haphazard, depending on the existence of other gene families and environmental conditions. At present, we do not know the relative importance of chance and natural selection. However, the chance factor working here is a new feature of random evolution, which is qualitatively different from the neutral evolution of genes by random genetic drift. This view of evolution is based on a large amount of molecular data, and in this sense it is different from Morgan's mutationism, which was largely speculative. For this reason I have called it neomutationism (Nei 1983, 1984) or the neoclassical theory of evolution (Nei 1987, chap. 14). Whatever it is called, however, recent molecular data supports the theory of mutation-driven evolution rather than neo-Darwinism.
I thank Hiroshi Akashi, Jim Crow, Li Hao, Dan Hartl, Eddie Holmes, Jan Klein, Kateryna Makova, Yoshihito Niimura, Joram Piatigorsky, Will Provine, Naruya Saitou, Yoshiyuki Suzuki, Victor Vacquier, and Jianzhi Zhang for their comments on an earlier version of this paper. This work was supported by the National Institutes of Health grant GM020293.
Standard Error of the Fitness Difference Between Two Genotypes Caused by Progeny Size Variation
Studying the distribution of the number of children (progeny size) per female in a human population of rural Japan, Imaizumi, Nei, and Furusho (1970) showed that the progeny size approximately follows the Poisson distribution and therefore the mean and variance are approximately equal to each other. This study was done with the family cohorts, which practiced no birth control (around 1990). In a stable population, therefore, the mean and variance of progeny size per individual are both expected to be 1. Note also that the sum of progenies for a large number of individuals approximately follows a normal distribution. Therefore, the average progeny size is also normally distributed with mean and variance equal to 1. In this population the standard error (SE) of mean progeny size is given by $1/\sqrt{N}$, where $N$ is the effective population size. This indicates that SE is equal to 0.001 when $N = 10^6$.
Let us assume that the above population is fixed with allele $A_1$, so that all individuals have genotype $A_1A_1$. Consider another population of the same size, which is fixed with allele $A_2$, and denote the fitnesses of $A_1A_1$ and $A_2A_2$ by 1 and $1 + 2s$, respectively. When $s \ll 1$, the SE of the mean progeny size or fitness is given by $\sqrt{(1 + 2s)/N}$, which is approximately $1/\sqrt{N}$. Therefore, the difference in mean fitness between the two populations is $2s$, and the SE of this difference is given by $\sqrt{2/N}$ approximately. The normal deviate ($z$) of $2s$ is therefore $z = 2s/\sqrt{2/N} = s\sqrt{2N}$. If $N = 10^6$ and $s = 0.0001$ (corresponding to the case of $2Ns = 200$), $z = 0.14$. This is much smaller than the $z$ value (approximately 2) at the 5% significance level. This indicates that this magnitude of fitness difference between the two populations can easily be swamped by progeny size variation. For $z$ to be greater than 2, $s$ must be greater than $\sqrt{2/N}$. For example, if $N = 10^6$, $s$ must be greater than 0.0014 for the fitness difference to be significant. If $N = 10^4$, $s$ must be greater than 0.014. In practice, however, $z = 1$ (30% significance level) or $s > 1/\sqrt{2N}$ may be acceptable under certain conditions. In this case, we have $s > 0.0007$ for $N = 10^6$ and $s > 0.007$ for $N = 10^4$.
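The arithmetic in this paragraph can be checked with a short script. The following is only a minimal sketch, assuming that the SE of the fitness difference is taken as $\sqrt{2/N}$, so that $z = 2s/\sqrt{2/N} = s\sqrt{2N}$, which is the reading consistent with the numerical values quoted above. The function names `z_score` and `min_s` are illustrative and do not come from the paper.

```python
import math


def z_score(s, n):
    """Normal deviate of the fitness difference 2s.

    Assumes Poisson progeny size with mean and variance near 1, so the
    SE of the difference between two population means is about sqrt(2/N).
    """
    return 2 * s / math.sqrt(2 / n)


def min_s(n, z=2.0):
    """Smallest s whose normal deviate reaches z, i.e. s = z / sqrt(2N)."""
    return z / math.sqrt(2 * n)


if __name__ == "__main__":
    # Case discussed in the text: N = 10^6, s = 0.0001 (2Ns = 200).
    print("z for N=1e6, s=1e-4:", round(z_score(1e-4, 10**6), 2))

    # Thresholds on s for z = 2 and z = 1 at the two population sizes.
    for n in (10**6, 10**4):
        print("N =", n,
              "| s for z=2:", round(min_s(n, 2.0), 4),
              "| s for z=1:", round(min_s(n, 1.0), 4))
```

Under these assumptions, the thresholds for $z = 1$ are exactly half of those for $z = 2$ at any $N$, and both scale as $1/\sqrt{N}$, so a hundredfold reduction in population size raises the required $s$ tenfold.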
It should be noted that the variance of progeny size is often greater than the mean, especially in invertebrates (Crow and Morton 1955; Fisher 1958). Therefore, $s > \sqrt{2/N}$ or $s > 1/\sqrt{2N}$ should be a minimum requirement. Note also that in population genetics theory, $s$ is generally assumed to be the same for all generations, even if it is very small. This assumption is unlikely to hold in reality, because any gene would interact with some other genes differently in different generations (with different environments), and therefore the maintenance of constant $s$ would be impossible.
William Martin, Associate Editor