We compared high-quality DNA sequences from a 344 bp region of the 5′ terminus of the mitochondrial HVR I region of a large sample of modern Adélie penguins. These comprised the mother and father from each of 508 families and typically two chicks per family. All blood samples were collected from an Adélie penguin colony at Cape Bird, Ross Island, Antarctica over four consecutive summers starting in 2001/2. DNA sequences were scored for quality using PHRED, and poor sequences were eliminated from our analysis or re-sequenced (see Materials and Methods). From the remaining sequence data, a number of mitochondrial heteroplasmic sites were detected. At these sites, two nucleotide signals were apparent in the same individual. Such heteroplasmies result from an earlier mutation event, after which two variants (the original and the mutant) have persisted in the same individual. In order to rule out false positives, such as substitutions that might arise from PCR amplification errors, all heteroplasmies were re-sequenced from different DNA extractions from the same samples. In total, we detected 62 heteroplasmies from DNA trace data. A calibration study (see Materials and Methods) showed that the proportions of each mitochondrial haplotype can be accurately inferred from the DNA trace data, with a standard error of approximately 5%. All but one of the recorded heteroplasmies were transitions, and all but three of the heteroplasmic sites were at positions in which polymorphisms were recorded in populations of Adélie penguins from colonies in the Ross Sea, Antarctica. The plot described below illustrates the position of the heteroplasmic sites and the frequency of the non-majority base in each case. Using available Adélie penguin life history data, we estimated the average intergenerational age (g) as 6.46 years. Using these data, the observed rate of heteroplasmies (μo) of the HVR I region is 54.9 mutations/site/Myr, with a 95% confidence interval of 41.2–68.6 mutations/site/Myr.
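The arithmetic behind the observed rate can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that the denominator is 508 maternal lineages × 344 sites × one generation of 6.46 years, and that the confidence interval is a normal approximation to a Poisson count of 62. Both assumptions are inferred from the fact that they reproduce the quoted figures.

```python
import math

def observed_heteroplasmy_rate(n_het, n_lineages, n_sites, gen_years, z=1.96):
    """Return (rate, lower, upper) in events/site/Myr.

    Assumption (not stated in the text): exposure is lineages x sites x one
    generation, and the interval is a normal approximation to a Poisson count.
    """
    exposure_site_myr = n_lineages * n_sites * gen_years / 1e6
    rate = n_het / exposure_site_myr
    half_width = z * math.sqrt(n_het) / exposure_site_myr
    return rate, rate - half_width, rate + half_width

# Values quoted in the text: 62 heteroplasmies, 508 families, 344 bp, g = 6.46 years.
rate, lower, upper = observed_heteroplasmy_rate(62, 508, 344, 6.46)
print(f"rate = {rate:.1f}, 95% CI = {lower:.1f} to {upper:.1f} per site per Myr")
```

Under these assumptions the result agrees with the reported 54.9 (41.2–68.6) mutations/site/Myr to within rounding.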
A plot of the frequency of the non-majority nucleotide bases across the 344 bp region of the HVR I for all adult penguins examined in this study.
The estimates of the frequency of observed heteroplasmies (μo) with the mean and 95% confidence intervals are shown.
Mutation events in mitochondria result in heteroplasmies that can persist over generations and which may or may not be detected, depending on the frequency of rarer variants. In our study, heteroplasmies appeared to be germline variants rather than somatic, as evidenced by the fact that, in all families, they were transmitted from the mother to one or both chicks. In combination, these data suggest that, in contrast to mutations in some other species, mutations in Adélie penguins persist in the heteroplasmic state for many generations. A heteroplasmy can only be transmitted across generations if a chick inherits multiple copies of its mother's genome. The larger the number of generations that a heteroplasmy persists, the higher will be the probability that it will be detected. A heteroplasmy can persist for many generations in a maternal line of descent until it is either lost or goes to fixation. This persistence time is influenced by the number of segregating mitochondrial genomes (N) that pass through the inheritance bottleneck. For human oocytes, N has been estimated to be between 15 and 70.
Details of heteroplasmies recorded from pedigree material of Adélie penguin families.
Our estimate of the mutation rate is affected by our ability to discriminate between low-frequency heteroplasmies and noise in DNA trace data. If we set the threshold detection level too low, we would mistakenly include ‘noise’ as evidence of heteroplasmies. To avoid such false positives, at least one of the two chicks had to have a haplotype whose frequency exceeded a threshold level. As a result of a calibration study (Materials and Methods), we set a detection threshold frequency (θ) of 23%. Our further analyses take account of the expected number of heteroplasmies that are excluded using this threshold.
We used a recently developed model to take account of the above factors in our estimate of the mutation rate. The model defines the rate at which new mutations enter the germ-line ($\alpha$). Assuming these mutations are neutral and that each mutation is equally likely to be transmitted to the next generation, only $1/N$ of these mutations are expected to go to fixation, so $\mu = \alpha/N$. This model assumes that we can only observe a heteroplasmy when the proportion of a new haplotype exceeds $\theta$. This model also assumes that if $\theta = 0.23$, most heteroplasmies are lost without reaching this proportion, and most heteroplasmies that reach this level do not go to fixation. If $N$ were doubled, then the number of heteroplasmies that reach the threshold halves, but the observed persistence of those that reach it doubles. Hence, the rate of observed heteroplasmies ($\mu_o$) is independent of $N$, and approximated by the expression $2\alpha\ln(1/\theta - 1)$. Thus for the threshold $\theta = 0.23$, $\mu_o \approx 2.417\alpha$, so $\mu \approx \mu_o/(2.417N)$.
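The relation between the observed rate and the mutation rate can be checked numerically. The sketch below takes the expression as 2α ln(1/θ − 1), the form that reproduces the quoted coefficient 2.417 at θ = 0.23, and combines it with μ = α/N. The function names are illustrative, and plugging in the median N of 38.3 reported later is an assumption made only to show the order of magnitude. The published point estimate of μ comes from a fuller likelihood analysis.

```python
import math

def observed_rate_factor(theta):
    """Coefficient c in mu_o ~= c * alpha, taken here as 2 * ln(1/theta - 1)."""
    return 2.0 * math.log(1.0 / theta - 1.0)

def mutation_rate(mu_observed, n_segregating, theta):
    """mu = alpha / N, with alpha = mu_o / c (neutral-model approximation)."""
    return mu_observed / (observed_rate_factor(theta) * n_segregating)

factor = observed_rate_factor(0.23)
# Illustrative inputs: mu_o = 54.9 per site per Myr, N = 38.3 (posterior median).
mu_sketch = mutation_rate(54.9, 38.3, 0.23)
print(f"factor = {factor:.3f}, mu = {mu_sketch:.2f} mutations/site/Myr")
```

The factor evaluates to about 2.417, and the resulting μ falls inside the reported interval of 0.29–0.88 mutations/site/Myr.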
The ratio of the heteroplasmic variants present in mothers and chicks was estimated from the relative peak heights in DNA trace data for each individual. These represent the relative proportions of each of the four possible nucleotides at any given position in a DNA sequence. Using the binomial distribution to model the inheritance of heteroplasmies, we showed that for a large N there is little difference in heteroplasmy ratios between mothers and chicks. We used this finding to infer N from the distribution of differences in heteroplasmy ratios of mothers and chicks. The distribution of these differences is summarised below.
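A small simulation illustrates why the mother-chick difference shrinks as N grows. It is a sketch under a simple assumption: a chick's ratio is the fraction of N independently sampled genomes that carry the variant, so the mean square difference should be close to p(1 − p)/N. The mother's ratio of 0.4, the values of N, and the number of pairs are arbitrary illustrative choices.

```python
import random

def simulate_chick_ratio(p_mother, n_genomes, rng):
    """Draw n_genomes from the mother and return the variant fraction."""
    inherited = sum(1 for _ in range(n_genomes) if rng.random() < p_mother)
    return inherited / n_genomes

def mean_square_difference(p_mother, n_genomes, n_pairs, seed=0):
    """Mean square mother-chick difference over simulated pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        diff = simulate_chick_ratio(p_mother, n_genomes, rng) - p_mother
        total += diff * diff
    return total / n_pairs

p = 0.4
results = {}
for n_genomes in (10, 40, 160):
    results[n_genomes] = mean_square_difference(p, n_genomes, n_pairs=5000)
    expected = p * (1 - p) / n_genomes
    print(f"N={n_genomes}: simulated {results[n_genomes]:.5f}, binomial {expected:.5f}")
```

Each fourfold increase in N cuts the mean square difference by roughly a factor of four, which is what allows N to be inferred from the observed spread.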
The distribution of differences between the two haplotypes in heteroplasmic mothers and their chicks is shown, as determined by DNA trace peak heights.
We present here a brief description of the method used to estimate $N$. A more detailed description is given elsewhere. From the mother-chick differences, we calculated the mean square difference in the frequency of haplotypes (as estimated by the peak heights in DNA trace data) between mothers and chicks, which we designated the “raw variance” ($\hat{\sigma}^2_{\text{raw}}$). The raw variance depends on the actual heteroplasmy difference between mother and chick, which we designated the “genetic variance” ($\sigma^2_{\text{genetic}}$), and on the uncertainties in measuring the heteroplasmy, known as the “measurement variance” ($\sigma^2_{\text{measure}}$). The genetic variance was estimated by subtracting the measurement variance from the raw variance. $N$ was estimated from this using the variance of a binomial distribution. We also estimated the uncertainties in our analysis. For a Gaussian distribution, the variance in the sample variance is $\operatorname{var}(\hat{\sigma}^2) = \frac{2(n-1)}{n^2}\,\sigma^4$, where $n$ is the number of samples and $\sigma^2$ represents the true variance.
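For Gaussian data and the variance estimator that divides by n, the standard result is var(σ̂²) = 2(n − 1)σ⁴/n². The sketch below checks that result by simulation. The sample size, σ, and number of trials are arbitrary illustrative values, and the helper names are not from the paper.

```python
import random

def biased_variance(values):
    """Variance estimator that divides by n (not n - 1)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def theoretical_variance_of_variance(n, sigma):
    """Gaussian result for the divide-by-n estimator: 2 (n - 1) sigma^4 / n^2."""
    return 2.0 * (n - 1) / n ** 2 * sigma ** 4

def simulated_variance_of_variance(n, sigma, trials, seed=0):
    """Spread of the variance estimate across many simulated Gaussian samples."""
    rng = random.Random(seed)
    estimates = [
        biased_variance([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return biased_variance(estimates)

theory = theoretical_variance_of_variance(10, 1.0)
simulated = simulated_variance_of_variance(10, 1.0, trials=20000)
print(f"theory = {theory:.4f}, simulated = {simulated:.4f}")
```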
Within these data, there are 123 mother-chick pairs, with mean square difference $\hat{\sigma}^2_{\text{raw}}$ (corresponding to a root mean square of 10.71%) and estimator variance $\operatorname{var}(\hat{\sigma}^2_{\text{raw}})$. From the calibration study, each measurement has variance $\hat{\sigma}^2_{\text{measure}}$ (i.e. a standard error in measurement of 4.62%) with estimator variance $\operatorname{var}(\hat{\sigma}^2_{\text{measure}})$. Then the genetic variance is $\hat{\sigma}^2_{\text{genetic}}$ with estimator variance $\operatorname{var}(\hat{\sigma}^2_{\text{genetic}}) = \operatorname{var}(\hat{\sigma}^2_{\text{raw}}) + 2\operatorname{var}(\hat{\sigma}^2_{\text{measure}})$ (corresponding to $(8.48 \pm 0.99)\%$). If the proportion of each haplotype inherited by the chick comes from a binomial distribution with population size $N$, then $\sigma^2_{\text{genetic}} = p(1-p)/N$, where $p$ is the mother's heteroplasmy ratio. The expression $p(1-p)$ varies little, so we use the mean value of 0.234 to estimate $1/N = 0.0319 \pm 0.0075$ (standard error), which then becomes $N = 31.3$ (95% confidence interval 21.5–57.9). Using a more detailed model, we found that the posterior distribution of $N$ had a median value of 38.3 and an HPD 95% confidence interval of 24.3 to 63.3. Using a maximum likelihood estimation, the corresponding point estimate for $\mu$ is 0.55 mutations/site/Myr with an HPD 95% confidence interval of 0.29–0.88 mutations/site/Myr. A number of authors have recently reported similar high rates of mutation from organisms as phylogenetically diverse as Caenorhabditis elegans and humans.
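Two pieces of the arithmetic in this paragraph can be sketched from the quoted figures. The sketch rests on two assumptions that are inferences from the numbers, not statements of the authors' procedure. First, the quoted 8.48% is reproduced when the measurement variance is subtracted twice from the raw variance, once for the mother and once for the chick. Second, the interval for N is obtained by inverting the normal-approximation limits of 1/N. The reported value of 1/N is taken as given here and is not recomputed.

```python
import math

# Quoted inputs: RMS mother-chick difference and per-measurement standard error.
rms_raw = 0.1071
se_measure = 0.0462

var_raw = rms_raw ** 2
var_measure = se_measure ** 2
# Assumption: each mother-chick difference carries two independent measurement errors.
var_genetic = var_raw - 2.0 * var_measure
sd_genetic = math.sqrt(var_genetic)

def n_from_inverse(inv_n, std_err, z=1.96):
    """Invert 1/N and its normal-approximation limits to (N, lower, upper)."""
    return 1.0 / inv_n, 1.0 / (inv_n + z * std_err), 1.0 / (inv_n - z * std_err)

n_hat, n_lower, n_upper = n_from_inverse(0.0319, 0.0075)
print(f"genetic SD = {100 * sd_genetic:.2f}%")
print(f"N = {n_hat:.1f} (95% CI {n_lower:.1f} to {n_upper:.1f})")
```

Because the limits are inverted, the interval for N is asymmetric about the point estimate. The small discrepancy from the reported upper limit of 57.9 is consistent with rounding of the inputs.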
In order to compare an evolutionary rate with this mutation rate, we expanded on an earlier study that analysed 96 known-age sub-fossil bones of Adélie penguins to determine the evolutionary rate for the HVR I region. For this study, we sequenced an additional 66 bones of ages up to 37,000 years (Tables S1) and estimated the rate of evolution for the same 344 bp of the HVR I region used to estimate the mutation rate. These data were characterised by 156 segregating sites and a nucleotide diversity of 0.049 (±0.006 S.E.). The majority of the substitutions were transitional changes (0.047±0.006 S.E.). We estimated the rate of evolution (k) using a Bayesian Markov chain Monte Carlo (MCMC) approach, as implemented in the software Bayesian Evolutionary Analysis Sampling Trees (BEAST v1.3). An MCMC simulation of 20 million steps, with the first 500,000 steps discarded as the burn-in time, estimated k to be between 0.53 and 1.17 substitutions/site/Myr (95% HPD) with a median value of 0.86 substitutions/site/Myr. Our previous median estimate of k was 0.96 substitutions/site/Myr, and our new analysis narrowed the confidence interval from the earlier 0.53–1.43 substitutions/site/Myr. Our analysis showed that the results were not significantly dependent upon the priors.
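For readers unfamiliar with the summary statistics quoted here, the sketch below shows one common way to discard a burn-in period and compute a highest posterior density (HPD) interval from MCMC samples, as the shortest interval covering the requested fraction of the samples. It is a generic illustration with a hypothetical toy chain and made-up helper names. It is not BEAST's implementation and does not use the penguin data.

```python
import math

def discard_burn_in(chain, n_burn_in):
    """Drop the first n_burn_in samples of an MCMC chain."""
    return chain[n_burn_in:]

def hpd_interval(samples, mass=0.95):
    """Shortest interval containing at least `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples the interval must cover; the epsilon guards float error.
    k = min(n, max(1, math.ceil(mass * n - 1e-12)))
    best_lo, best_hi = ordered[0], ordered[k - 1]
    for i in range(1, n - k + 1):
        lo, hi = ordered[i], ordered[i + k - 1]
        if hi - lo < best_hi - best_lo:
            best_lo, best_hi = lo, hi
    return best_lo, best_hi

# Hypothetical toy chain: two burn-in values followed by the retained samples.
toy_chain = [50, 40, 0, 1, 2, 3, 4, 5, 6, 7, 8, 100]
kept = discard_burn_in(toy_chain, 2)
interval = hpd_interval(kept, mass=0.9)
print(interval)
```

For a skewed posterior the HPD interval is shorter than the equal-tailed interval, which is why the two can differ for rate parameters such as k.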
In relation to the power of our analysis, if $\mu$ were, for example, four times as large as we observed, we would have recorded approximately four times as many heteroplasmies. Hence, the relative size of the confidence interval would be reduced by a factor of two (but as $\mu$ quadruples, the absolute size of the confidence interval would double). Approximating the posterior distributions of $\mu$ and $k$ as normal distributions, we find that for $\mu < 0.44$ or $\mu > 1.44$, we could reject the null hypothesis ($\mu = k$) at the 95% confidence level.
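The scaling argument about interval width can be made concrete with a short sketch. It assumes the number of recorded heteroplasmies behaves as a Poisson count with a normal-approximation interval, so the relative width scales as 1/√count and the absolute width as √count. The function name is illustrative.

```python
import math

def poisson_interval_widths(count, z=1.96):
    """Return (relative width, absolute width in count units) of a normal-approx CI."""
    absolute = 2.0 * z * math.sqrt(count)
    relative = absolute / count
    return relative, absolute

rel_observed, abs_observed = poisson_interval_widths(62)
rel_fourfold, abs_fourfold = poisson_interval_widths(4 * 62)

print(f"relative width ratio = {rel_fourfold / rel_observed:.2f}")
print(f"absolute width ratio = {abs_fourfold / abs_observed:.2f}")
```

Quadrupling the count halves the relative width and doubles the absolute width, matching the reasoning in the text.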