There are several reasons why the mitochondrial genome (mtDNA) of humans and other mammals has been considered useful in population genetic, phylogeographic, and phylogenetic studies. Besides the usually invoked mtDNA characteristics (high copy number per cell, compact organization, and maternal transmission), mtDNA has been widely used because it provides easy access to an orthologous set of genes with little or no recombination and rapid evolution [1]. Moreover, from a theoretical perspective, it has long been accepted that mtDNA haplotype frequencies are controlled primarily by migration and genetic drift and that most of the variation within a species is selectively neutral [1]. However, more recent reports support the hypothesis that mtDNA frequency variation is due to natural selection [2].
Given the importance of mitochondrial function, it is not straightforward to assume a priori that mtDNA evolves as a strictly neutral marker. Changes in the mtDNA sequence can have substantial impacts on the fitness of the organelle/cell (within individuals) and on the fitness of the individual host organism. Deviations from a strictly neutral model of evolution have been found in a variety of organisms [5]. Ballard and Rand [1], in a review, gave three main reasons why it is reasonable to predict that mtDNA variation may be under strong selection: i) the mitochondrion is the powerhouse of the cell and, in most organisms, a reduction in ATP production is expected to reduce fecundity; in humans, a reduction in the efficiency of ATP production is known to be highly deleterious, or lethal in the extreme case; ii) proteins encoded by mtDNA interact with those imported from the nuclear genome to form four of the five complexes of the electron transport chain; and iii) the assumption of a total absence of recombination in mtDNA means that each genome has a single genealogical history and that all genes share that history. Any evolutionary force acting at any one site will therefore equally affect the history of the whole molecule. Thus, the fixation of an advantageous mutation by selection, for example, will cause the fixation of all other polymorphisms by a process known as "genetic hitchhiking" [10]. Even the quickly evolving non-coding mtDNA region (D-loop) cannot be assumed to have neutral allele frequencies: it is linked to the rest of the genome (where selection has been documented), and conserved motifs within this region exhibit variation that significantly affects mitochondrial transcription and replication [11]. Alternatively, polymorphism within a mitochondrial genome may be depressed through selection against linked deleterious mutations, a process known as "background selection" [12].
Besides the significance of mtDNA selection per se, assessing the impact of selection on mtDNA is crucial for establishing the mtDNA evolutionary rate, as previously demonstrated by Denver et al. [7]. Under a strictly neutral model of evolution, the substitution rate depends only on the mutation rate per individual; if selection had any effect, however slight, this would no longer hold. According to Ohta [15], very slightly deleterious mutants are effectively selected against in sufficiently large populations; in small populations, however, these same mutants behave as selectively neutral and are governed by random drift. Thus, small populations may accumulate deleterious mitochondrial mutations at an increased rate [17].
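The neutral expectation invoked above can be written out as a short derivation. This is the standard textbook result for a haploid, maternally inherited genome, not a formula taken from the cited works; the symbols $N_f$ (number of breeding females), $\mu$ (mutation rate per site per generation), $s$ (selection coefficient) and $N_e$ (effective population size) are introduced here for illustration only.

```latex
% Neutral substitution rate: new mutations entering the population per
% generation, times the probability that any one of them drifts to fixation.
k \;=\; \underbrace{N_f\,\mu}_{\text{new mutations per generation}}
   \times
   \underbrace{\frac{1}{N_f}}_{\text{fixation probability}}
   \;=\; \mu

% Nearly neutral regime (order-of-magnitude criterion):
% a slightly deleterious mutant behaves as effectively neutral when
|N_e\, s| \ll 1,
% and is efficiently removed by selection when
|N_e\, s| \gg 1 .
```

The population size cancels only when the fixation probability is $1/N_f$, that is, under strict neutrality; once $s \neq 0$ the fixation probability depends on $N_e s$, which is why the substitution rate then differs between large and small populations.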
The human mtDNA substitution rate has been estimated mainly by phylogenetic [18] and empirical methods [30]. Estimates for the D-loop based on phylogenetic approaches range from 0.0575 mutations/site/Myr [20] to 0.2860 mutations/site/Myr [21], whereas estimates based on an empirical methodology range from 0 mutations/site/Myr [33] to 2.5 mutations/site/Myr [32]. As for the coding region, the few estimates made so far show a substitution rate ~10 times lower than that reported for the D-loop; furthermore, as observed for the D-loop, important differences between phylogenetic and empirical estimates have been identified [31]. Mishmar et al. [2], applying a phylogenetic approach to complete sequences of the coding mtDNA region, obtained a substitution rate of 0.0126 mutations/site/Myr, whereas Howell et al. [31], using an empirical estimate for the coding region of mtDNA, reported a value of 0.075 mutations/site/Myr.
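To make the size of these discrepancies concrete, the following minimal sketch computes fold differences between the rates quoted above. The numbers are those given in the text; the function and variable names are illustrative only, and the comparison of upper bounds for the D-loop is one arbitrary way of summarizing two ranges.

```python
# Rates quoted in the text, all in mutations/site/Myr.
DLOOP_PHYLOGENETIC = (0.0575, 0.2860)  # lowest and highest phylogenetic estimates
DLOOP_EMPIRICAL = (0.0, 2.5)           # lowest and highest empirical estimates
CODING_PHYLOGENETIC = 0.0126           # Mishmar et al.
CODING_EMPIRICAL = 0.075               # Howell et al.


def fold_difference(numerator: float, denominator: float) -> float:
    """Return how many times larger `numerator` is than `denominator`."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator


# Coding region: empirical estimate relative to phylogenetic estimate.
coding_fold = fold_difference(CODING_EMPIRICAL, CODING_PHYLOGENETIC)

# D-loop: highest empirical estimate relative to highest phylogenetic estimate.
dloop_upper_fold = fold_difference(DLOOP_EMPIRICAL[1], DLOOP_PHYLOGENETIC[1])

# D-loop: spread within the phylogenetic estimates themselves.
dloop_phylo_spread = fold_difference(DLOOP_PHYLOGENETIC[1], DLOOP_PHYLOGENETIC[0])

print(f"coding, empirical vs phylogenetic: {coding_fold:.2f}x")
print(f"D-loop, upper empirical vs upper phylogenetic: {dloop_upper_fold:.2f}x")
print(f"D-loop, spread among phylogenetic estimates: {dloop_phylo_spread:.2f}x")
```

With the quoted values, the empirical coding-region estimate is roughly six times the phylogenetic one, and the phylogenetic D-loop estimates themselves span about a fivefold range.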
Since the first empirical estimation of the mtDNA D-loop evolutionary rate by Howell et al. [30], there has been intense debate about the causes of the discrepancies between phylogenetic and empirical rates [9]. The discrepancy has been attributed to several distinct causes, namely: differences in the mutation rate at different mtDNA positions; the effects of selection and genetic drift; the occurrence of somatic mutations; the unintended sequencing of nuclear mitochondrial pseudogenes; and the leakage of paternal mtDNA and recombination. Moreover, Ho et al. [9] described an acceleration of the substitution rate at evolutionarily short timescales and attributed this acceleration, among other factors, to purifying selection acting on mtDNA [9]. However, Emerson [42], reanalyzing the data of Ho et al. [9], suggested that the timescale over which the pedigree rate converges to the evolutionary rate is very much shorter than the timescale on which Ho et al. [9] focused, and that it is debatable whether this convergence follows an exponential distribution or, if such a pattern exists, whether it could not equally be explained by coalescent effects.
Recently, we reported on the mutation rate of the mtDNA D-loop [37], using an empirical approach. Our results supported the conclusion that the discrepancy between phylogenetic and pedigree-derived rates can be attributed neither to the inclusion of somatic mutations in the calculations, nor to the use of families with mtDNA disease, nor to a paternal contribution of mtDNA. Moreover, the discrepancy cannot be explained by mutations observed in families occurring preferentially at hypervariable sites. Santos et al. [37] advanced two additional factors that must be taken into account: the gender of individuals carrying germinal mutations (mutations carried exclusively by men will never be passed to the next generation) and "the weight" of heteroplasmic germinal mutations (for mutations in mtDNA to reach polymorphism levels in the population, and eventually become fixed, they must first pass from a heteroplasmic to a homoplasmic state at the individual level, and this will be dictated by the initial levels of heteroplasmy).
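One way these two factors could enter a rate calculation is sketched below. This is a hypothetical illustration, not the procedure used by Santos et al. [37]: it assumes that mutations carried by men contribute nothing, and that a heteroplasmic mutation carried by a woman is weighted by its mutant fraction, on the further assumption that intraindividual drift is neutral, so that the chance of reaching homoplasmy equals the starting frequency. All names (`GermlineMutation`, `weighted_mutation_count`, `mutation_rate`) and the example data are invented for the sketch.

```python
from dataclasses import dataclass


@dataclass
class GermlineMutation:
    """A new germline mutation observed in a pedigree (hypothetical record)."""
    carrier_sex: str      # "F" or "M"
    heteroplasmy: float   # mutant fraction in the carrier, 0 < value <= 1


def weighted_mutation_count(mutations):
    """Sum the mutations that can contribute to population-level change.

    Assumptions of this sketch: male carriers are excluded because mtDNA is
    maternally transmitted, and each female-carried mutation is weighted by
    its heteroplasmy level (neutral intraindividual drift).
    """
    total = 0.0
    for m in mutations:
        if not 0.0 < m.heteroplasmy <= 1.0:
            raise ValueError("heteroplasmy must be in (0, 1]")
        if m.carrier_sex != "F":
            continue  # will never be passed to the next generation
        total += m.heteroplasmy
    return total


def mutation_rate(mutations, n_sites, n_transmissions):
    """Weighted mutations per site per generation (transmission event)."""
    if n_sites <= 0 or n_transmissions <= 0:
        raise ValueError("n_sites and n_transmissions must be positive")
    return weighted_mutation_count(mutations) / (n_sites * n_transmissions)


# Made-up example data, for illustration only.
example = [
    GermlineMutation("F", 0.5),
    GermlineMutation("M", 0.9),   # ignored: carried by a man
    GermlineMutation("F", 0.2),
]
raw_count = len(example)
weighted = weighted_mutation_count(example)
rate = mutation_rate(example, n_sites=1000, n_transmissions=100)
print(raw_count, round(weighted, 3), rate)
```

In this invented example, three observed mutations shrink to a weighted count well below three, which shows how accounting for carrier gender and heteroplasmy level can pull a pedigree-based rate downward.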
To date, only two studies have dealt with the empirical estimation of the mutation rate of the mtDNA coding region [31]. Moreover, the effect of selection on the fate of newly arising mutations in coding and non-coding portions of mtDNA has not been fully addressed. In this work, we present the results of the analysis of a portion of the coding region of mtDNA, using individuals belonging to extended families from the Azores Islands (Portugal). The main aims are: a) to provide empirical estimates of the mutation rate of the coding region of mtDNA under different assumptions, and b) to better understand the mtDNA evolutionary process, including the factors that control the levels and progress of mtDNA heteroplasmy until the intraindividual fixation of newly arising mutations.
Heteroplasmy was detected in 6.5% of the families analyzed. In all of these families, the presence of mtDNA heteroplasmy resulted from three new point mutations, and no insertions or deletions were identified. Our empirical estimate of the mutation rate of the mtDNA coding region, calculated taking several factors into account, is similar to that obtained using phylogenetic approaches.