A total of 385 plants of Centaurea horrida were analysed using four microsatellite markers, identifying a total of 80 alleles. All the loci studied are highly polymorphic: the number of alleles detected per locus across all the populations ranged from 15 (locus 21D9) to 25 (locus 13D10). There was no indication of null alleles at any of the loci. No allele was fixed at any of the loci, nor was there evidence that any population harboured private (population-specific) alleles.
Genetic diversity was measured using Nei's heterozygosity (He) and ranged from 0·449 (locus 21D9, TAV population) to 0·925 (locus 13D10, DON population). The high estimates of genetic variability are confirmed by the average He values, ranging from 0·603 (LIO) to 0·854 (FAL and DON). These values are higher for the populations of the Stintino–Asinara region than for the two populations of the Alghero region and the isolated population of Tavolara.
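As an illustration of the diversity measure used here, the following minimal sketch computes Nei's expected heterozygosity, He = 1 − Σ p_i², and the observed heterozygosity (Ho) for a single locus. The function names and the genotypes are hypothetical and are not data from this study; the sketch also assumes the uncorrected form of He, since the text does not state whether a sample-size correction was applied.

```python
from collections import Counter


def expected_heterozygosity(genotypes):
    """Nei's gene diversity, He = 1 - sum(p_i^2), for one locus.

    `genotypes` is a list of (allele_a, allele_b) tuples, one per diploid plant.
    No sample-size correction is applied (an assumption of this sketch).
    """
    alleles = [allele for pair in genotypes for allele in pair]
    counts = Counter(alleles)
    total = len(alleles)
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def observed_heterozygosity(genotypes):
    """Proportion of plants carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


# Hypothetical genotypes (allele sizes in bp), NOT data from the study.
sample = [(150, 152), (150, 150), (152, 156), (150, 156)]
he = expected_heterozygosity(sample)
ho = observed_heterozygosity(sample)
print(f"He = {he:.3f}, Ho = {ho:.3f}")
```

Averaging such per-locus values over the four loci would give per-population means comparable in kind to those reported above.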
Observed and expected heterozygosity measured at each locus for each population, and averages over loci and populations
The Hardy–Weinberg equilibrium was tested for all the loci and populations by testing the departure of FIS from zero under the null hypothesis. FIS values are significantly different from zero for all the loci except locus 28A7 for the STR, FAL and DON populations, locus 12B1 for the FOR and LIO populations and locus 13D10 for the BAR and TAV populations. In the vast majority of cases, deviation from the Hardy–Weinberg equilibrium was associated with positive FIS values, while negative FIS values were mainly associated with the locus 28A7 (four populations).
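The sign of FIS can be read directly from the observed and expected heterozygosities. The sketch below assumes the common single-locus definition FIS = 1 − Ho/He; the function name and the input values are hypothetical, and the permutation test used to assess significance is not reproduced.

```python
def inbreeding_coefficient(ho, he):
    """FIS = 1 - Ho/He for one locus in one population.

    Positive values indicate a heterozygote deficit, negative values a
    heterozygote excess, relative to Hardy-Weinberg expectations.
    """
    if he <= 0:
        raise ValueError("He must be positive (the locus must be polymorphic)")
    return 1.0 - ho / he


# Hypothetical heterozygosities, NOT values from the study.
deficit = inbreeding_coefficient(ho=0.50, he=0.80)  # heterozygote deficit
excess = inbreeding_coefficient(ho=0.90, he=0.75)   # heterozygote excess
print(f"FIS (deficit) = {deficit:.3f}, FIS (excess) = {excess:.3f}")
```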
The non-random association of the alleles at different loci, or linkage disequilibrium (LD), was investigated. A significant departure from equilibrium at the 5 % level was found for almost all pairs of loci within populations. Only five comparisons out of 42 (six pairs of loci in each of the seven populations) were not significant: the pairs of loci 21D9–13D10 (LIO), 21D9–28A7 (LIO and FOR) and 28A7–12B1 (DON and FOR).
Genetic differentiation among populations
The genetic divergence among populations was measured using both FST and RST. Their significance was tested by a permutation procedure: all FST and RST values differed significantly from zero. The maximum FST value was found between the LIO and TAV populations and the maximum RST value between the BAR and TAV populations. Notably, the pairwise RST values are consistently higher than the respective FST values, with the exception of the values relating to the LIO population.
FST (below diagonal) and RST (above diagonal) values for each population pair
The overall genetic differentiation between populations was significant. The overall FST was 0·123 (95 % confidence interval: 0·072 ≤ FST ≤ 0·178), indicating that >12 % of the genetic variance can be attributed to differentiation between populations. The same procedure for RST yielded an overall RST = 0·158, with a 95 % confidence interval of 0·137 ≤ RST ≤ 0·196.
Isolation by distance
A Mantel test demonstrated a correlation between genetic differentiation [estimated as FST/(1 – FST)] and geographic distance (log km) between populations (P = 0·004, G = 2·41, Z = 10·6), indicating that the present distribution of genetic variation among the remnant populations of Centaurea horrida is, at least in part, the result of an equilibrium between drift and gene flow. Gene flow (Nm) was estimated on the basis of either FST or RST. The maximum value of Nm was 8·37 (populations FAL and DON), whereas the minimum value was 1·33 (populations LIO and TAV).
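The two transformations of FST involved here can be sketched as follows. The linearised distance FST/(1 − FST) is the quantity stated above; the conversion to Nm assumes Wright's island-model approximation, Nm = (1/FST − 1)/4, which is a common choice but is an assumption of this sketch, as the text does not state the estimator used. The function names are hypothetical.

```python
def linearised_fst(fst):
    """Genetic distance FST / (1 - FST), as regressed on log geographic distance."""
    return fst / (1.0 - fst)


def nm_from_fst(fst):
    """Number of migrants per generation under Wright's island model.

    Nm = (1/FST - 1) / 4; this estimator is an assumption of the sketch.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


def fst_from_nm(nm):
    """Inverse of nm_from_fst: FST = 1 / (4 Nm + 1)."""
    return 1.0 / (4.0 * nm + 1.0)


# Overall FST reported in the text; the derived values are illustrative only.
overall_fst = 0.123
nm = nm_from_fst(overall_fst)
lin = linearised_fst(overall_fst)
print(f"Nm = {nm:.2f}, FST/(1 - FST) = {lin:.3f}")
```

Under this approximation, larger pairwise Nm values correspond to smaller pairwise FST values, which matches the contrast between the geographically close FAL and DON populations and the distant LIO and TAV populations.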
Under the assumption of drift–gene flow equilibrium, the distribution of the expected heterozygosities was compared with the Hardy–Weinberg heterozygosity for each locus and for all populations, to identify those populations which could have experienced a reduction of Ne in recent times. Of the three statistical methods used by the BOTTLENECK software (sign test, Wilcoxon test and standardized differences test), the last was not employed, because it requires at least 20 polymorphic loci to be reliable. Even so, the four polymorphic SSRs do not guarantee high statistical power. The presence of genetic bottlenecks was tested under the IAM, the SMM and the TPM models of evolution. In no case was evidence of a recent (within approx. the past 2Ne–4Ne generations) bottleneck found.
Analysis of the population structure
Since this study concerns a rare and endangered species, it was of paramount importance to estimate K, the most probable number of ‘genetic units’ or ‘gene pools’ present in the data, in order to suggest possible mechanisms that have shaped the genetic variability of the populations, and to reach conservation recommendations. This was done by applying the Bayesian clustering method as implemented by STRUCTURE (Pritchard et al., 2000). The estimate of K was based on ΔK, the second-order rate of change of the likelihood function with respect to K, as suggested by Evanno et al. (2005). A sharp signal was found at K = 2 (see Supplementary Information Table, available online), suggesting that two homogeneous gene pools shaped the genetic structure of the populations analysed. To check the composition of each individual population and each plant with respect to the inferred populations, further analysis was conducted based on K = 2. The results are shown for the populations in Fig. 3. Analysis of the genetic components of the populations shows that the STR, FOR, FAL, DON and TAV populations derive the major component of their genetic composition from the first inferred population and the LIO and BAR populations from the second. Quantitative analysis of this process is also shown in a supplementary figure (Supplementary Information, available online), where the contribution of the two inferred gene pools is reported in graphical form for each of the plants analysed.
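The ΔK criterion can be sketched as follows. This is a minimal illustration that assumes the commonly used form ΔK = |L(K + 1) − 2L(K) + L(K − 1)| / s[L(K)], where L(K) is the mean log-likelihood over replicate STRUCTURE runs at a given K and s is its standard deviation across runs. The function name and the log-likelihood values are hypothetical and are not the values obtained in this study.

```python
from statistics import mean, stdev


def delta_k(lnp_by_k):
    """Evanno-style delta K for every K that has both neighbours K-1 and K+1.

    `lnp_by_k` maps K to a list of log-likelihoods from replicate runs.
    Each K needs at least two replicates with non-identical values,
    otherwise the standard deviation is undefined or zero.
    """
    means = {k: mean(values) for k, values in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 in means and k + 1 in means:
            second_order = abs(means[k + 1] - 2 * means[k] + means[k - 1])
            result[k] = second_order / stdev(lnp_by_k[k])
    return result


# Hypothetical log-likelihoods from three replicate runs per K (NOT study data).
runs = {
    1: [-5200.0, -5202.0, -5198.0],
    2: [-4800.0, -4801.0, -4799.0],
    3: [-4780.0, -4790.0, -4770.0],
    4: [-4775.0, -4795.0, -4760.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)
print(dk, "most likely K:", best_k)
```

In this invented example the likelihood rises steeply from K = 1 to K = 2 and then plateaus with growing variance among runs, so ΔK peaks at K = 2. Note that ΔK cannot be evaluated for the smallest or largest K tested, because it requires both neighbouring values.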
Fig. 3. Analysis of population structure according to a Bayesian clustering method. The populations studied derive their genetic structure from two inferred populations (‘gene pools’ 1 and 2) of origin. A pie diagram indicates the proportion of …
The total amount of genetic variation was also partitioned by AMOVA into components according to the geographic subdivision of the populations. First, based upon the analysis of the population structure, the hypothesis that the populations fall into two geographic regions was tested, separating the Alghero area from the rest of the range. The AMOVA results (AMOVA table, part A) show that the within-population component accounts for 82 % of the total variance and that both the differences between regions and the differences between populations within a region account for smaller, but significant, amounts of the total genetic variation. Second, the hypothesis that all three geographic areas (Stintino–Asinara, Alghero and Tavolara) harbour significant amounts of variation was tested. This partitioning of the data revealed that 10 % of the genetic variance resided between regions and 7 % between populations within regions (AMOVA table, part B).
Analysis of molecular variance (AMOVA) based on four SSRs for the seven populations of Centaurea horrida