Natural selection is expected to increase the frequency of locally advantageous alleles, resulting in higher among-population differentiation at these adaptive loci (measured by locus-specific
FST), compared with differentiation in the rest of the genome (neutral loci) [
1,
2]. Identifying adaptive divergence among populations at specific loci from genome scans is an active and challenging research area (see [
3] for a review). This task is especially demanding for dominant biallelic markers such as Amplified Fragment Length Polymorphism (AFLP) markers, although they represent an easy way to scan a large number of markers scattered throughout the genome in non-model species [
4-
6]. So far, the most widely used method to detect outliers from AFLP genome scans is implemented in Dfdist, an extension of the Fdist software that allows the use of dominant markers [
7]. Dfdist is a frequentist method based on summary statistics of a symmetrical island model (i.e. drift-migration equilibrium, [
8]). Each locus is compared to the neutral distribution built from the mean
FST averaged across all populations. Dfdist is likely to generate false positives (i.e. loci with higher than expected
FST) when gene flow is asymmetric among populations and/or when some populations experience bottlenecks. To overcome that limitation, Foll and Gaggiotti [
9] recently developed a hierarchical Bayesian method (BayeScan) that also accommodates AFLP data. Their method is derived from that of Beaumont and Balding [
10]. It produces a posterior probability for each locus being under selection. The main advantage of BayeScan is that it estimates population-specific
FST coefficients, thereby allowing for different demographic histories and different amounts of genetic drift among populations. In structured populations, the Bayesian approach is therefore less likely to produce false positives. However, the proportion of false positives among detected outliers cannot easily be estimated with Bayesian approaches, as this requires simulating datasets under different scenarios [
9]. By contrast, it can be estimated using the false discovery rate [
11] in frequentist methods.
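As an illustration of how a false discovery rate can be controlled in a frequentist outlier scan, the following is a minimal sketch of the Benjamini-Hochberg step-up procedure. The per-locus p-values, the function name and the 5% threshold are hypothetical and chosen only for illustration; this is not the implementation used in Dfdist.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Flag loci as outliers while controlling the false discovery rate.

    Returns a list of booleans in the same order as p_values.
    """
    m = len(p_values)
    # Indices of the loci sorted by increasing p-value
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k such that p_(k) <= (k / m) * alpha
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank

    # Every locus ranked at or below the cutoff is declared an outlier
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for six loci (illustration only)
p_values = [0.001, 0.008, 0.039, 0.041, 0.20, 0.74]
flags = benjamini_hochberg(p_values, alpha=0.05)
print(flags)
```

With these hypothetical values, only the first two loci pass the threshold. Loci with p-values near 0.04 would be significant in an uncorrected test but are not retained once multiple testing is accounted for.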
Once outlier loci have been detected, it remains challenging to verify whether they are involved in local adaptation and to isolate, within a complex natural environment, the ecological factors responsible for the behaviour of each outlier [
4,
12]. Indeed, despite recent advances in distinguishing adaptive genes from the neutral background of genetic variability across populations of a species, the relative roles of biotic and abiotic constraints acting on genomes in their natural environment remain largely unexplored. For non-model organisms, outlier loci cannot be mapped, and their functional role therefore remains unknown. An alternative approach is to correlate their variation in frequency across the sampling area with continuous environmental variables, such as altitude or climate [
13-
15], or qualitative variables such as different host-plants for insects [
6,
16,
17] or different life habits (limnetic or benthic) for fish ecotypes [
18].
The evolutionary success of phytophagous insects is thought to result mainly from their adaptation to various host-plants, with insect adaptation driving plant diversification in a co-evolutionary process [
19-
21]. Alternatively, the diversification of widespread species could be driven by adaptation along environmental abiotic gradients. The large pine weevil
Hylobius abietis L. (Curculionidae) is a good model for addressing this question because it is widespread in Europe (and thus exposed to large environmental variation) and because, during larval development, it depends exclusively on two plant genera: spruce (
Picea) and pine (
Pinus). The large pine weevil is one of the most economically important pests of European conifer forests. The larvae feed under the bark of stumps and roots of recently felled trees, and take from three months to two years to develop into adults, depending on location [
22], presumably because of climatic conditions and/or host plant quality [
23,
24]. The adults are active only under cool climatic conditions, usually in spring and autumn, and burrow into the soil during hot summers and cold winters; adults can fly large distances and can live up to four years [
22]. Because of this complex life cycle, several environmental factors, including temperature, precipitation, soil, frost and wind speed, may have either a direct impact on larval or adult survival or a more indirect impact on fitness through host-plant quality, and therefore represent potential selective forces acting on the pine weevil genome at a large geographical scale. Adults are attracted to recently felled trees (spruce or pine), where the females lay eggs under the bark. Managed pine and spruce forests planted in Western Europe 200-300 years ago offer many oviposition opportunities for this weevil, allowing large stable populations to be sustained in contrasting climatic environments.
The first study of the population genetic structure of the pine weevil at the European scale [
25] revealed that genetic variation in this insect is better explained by geography than by host-plant (5% versus 1% of total variation). Furthermore, a locus-by-locus AMOVA identified some loci with significant
FST across different host-plant groups, suggesting that host-plant linked selection might occur in this species. A second analysis used univariate logistic regressions to search for correlations between molecular markers scattered throughout the genome and several environmental variables; it consistently identified 2 of 83 unlinked AFLP markers as outliers, suggesting an effect of climate on weevil adaptation [
14]. However, the effect of the host-plant type was not tested.
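To make the correlative approach concrete, the sketch below fits a univariate logistic regression of AFLP band presence (0/1) on a single environmental variable and tests the slope with a likelihood ratio test. The altitude values, band scores and function names are hypothetical, and the fitting routine (Newton-Raphson on two parameters) is only one possible choice, not necessarily the procedure used in the cited analysis.

```python
import math


def fit_logistic(x, y, n_iter=25):
    """Fit P(band present) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0 = b1 = 0.0
    for _ in range(n_iter):
        # Gradient (g) and information matrix (h) of the log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system explicitly
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def log_likelihood(x, y, b0, b1):
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        ll += yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return ll


def likelihood_ratio_test(x, y):
    """Compare the model with the environmental variable to an intercept-only model."""
    b0, b1 = fit_logistic(x, y)
    p_bar = sum(y) / len(y)
    null_b0 = math.log(p_bar / (1.0 - p_bar))
    g = 2.0 * (log_likelihood(x, y, b0, b1) - log_likelihood(x, y, null_b0, 0.0))
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(max(g, 0.0) / 2.0))
    return b1, g, p_value


# Hypothetical data: altitude (km) of 12 individuals and band presence at one locus
altitude_km = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2]
band = [0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 1, 1]
slope, g_stat, p_value = likelihood_ratio_test(altitude_km, band)
print(slope, g_stat, p_value)
```

In a genome scan, one such test would be run for each marker and each environmental variable, so the resulting p-values would need a correction for multiple testing. This simple routine also assumes that the data are not perfectly separated by the predictor, in which case the estimates would diverge.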
In the present analysis, we focused on disentangling the role of abiotic environmental variables on the one hand from the effect of the host-plant (pine or spruce) on the other. We therefore excluded adult weevils from the original AFLP dataset because they cannot be assigned to a host-plant, reducing the analysed dataset to 296 individual larvae. We first detected outlier loci across geographic and host-specific groups of individuals in 16 managed forest sites distributed across 4 large forestry regions (Table 1, Figure ) using population genetic approaches (Dfdist and BayeScan). We then used a correlative approach to disentangle the effects of host-plants from those of various environmental variables. To further confirm the involvement of selection in genetic patterns of differentiation, we tested for drift-migration equilibrium (i.e. an isolation by distance pattern) on neutral loci. If adaptation to host-plant is promoting divergence, we would expect to find outlier loci when comparing populations from different hosts, whereas candidate loci diverging independently of host-plant would instead be correlated with other ecological pressures, such as climate.
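An isolation by distance pattern is commonly assessed with a Mantel test, which correlates a matrix of pairwise genetic distances with a matrix of pairwise geographic distances and evaluates significance by permutation. The sketch below is a minimal version under stated assumptions: the five site coordinates, the genetic distances and the function names are hypothetical, and the test shown is a generic Mantel test rather than the specific procedure used in this study.

```python
import random


def pearson(a, b):
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5


def upper_triangle(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, n_perm=999, seed=0):
    """One-sided Mantel test: is genetic distance positively correlated with geography?"""
    rng = random.Random(seed)
    n = len(genetic)
    identity = list(range(n))
    geo_vec = upper_triangle(geographic, identity)
    r_obs = pearson(upper_triangle(genetic, identity), geo_vec)
    count = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)
        # Permute population labels of the genetic matrix only
        if pearson(upper_triangle(genetic, perm), geo_vec) >= r_obs:
            count += 1
    return r_obs, count / (n_perm + 1)


# Hypothetical example: five sites along a transect (positions in km)
coords = [0, 10, 25, 45, 70]
n_sites = len(coords)
geographic = [[abs(coords[i] - coords[j]) for j in range(n_sites)] for i in range(n_sites)]
# Hypothetical genetic distances that increase with geographic distance, plus small noise
genetic = [[0.001 * geographic[i][j] + 0.002 * ((i + j) % 2 if i != j else 0)
            for j in range(n_sites)] for i in range(n_sites)]

r_obs, p_value = mantel_test(genetic, geographic)
print(r_obs, p_value)
```

Permuting rows and columns together preserves the dependence among pairwise distances that share a population, which is why a standard correlation test on the pairwise values would not be appropriate here.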
| Table 1. Geographic location, sample sizes and host-association characteristics of Hylobius abietis sites collected. |