Here we present a comprehensive catalogue of genomic regions that are candidates for being affected by artificial selection in dogs using the densest panel of SNPs to date. We focus on two main types of variant: 1) common variants that affect variation in a trait in many breeds and 2) rare variants that have undergone selective sweeps in one or a few breeds. For the first category, we identify loci where variation correlates with morphological traits such as body size and tail curl, and behavioral traits such as sociability and boldness. We also identify several loci with evidence for a high degree of population differentiation between breeds, for which the connection with phenotypic traits in dogs is not known, but that are known to associate with traits such as pigmentation and body size. To identify loci in the second category, we searched for regions with reduced heterozygosity and high population differentiation, characteristic of selective sweeps. This analysis identified loci known to be associated with breed-defining characteristics such as chondrodysplasia, skin wrinkling, and furnishings. In addition, we identify several extended regions with reduced heterozygosity > 1 Mb consistent with recent selective sweeps in one or more breeds, including striking examples such as a region containing the FNDC3A and CYSLTR2 genes, and a region containing the MSTN (myostatin) gene that both bear the signal of selection in multiple breeds.
The candidate selection loci we identified are strongly enriched for genes involved in developmental and metabolic processes. In general, the GO terms we find to be significantly enriched differ from those reported in analyses of selection in natural populations, in which genes commonly targeted by positive selection include those involved in immunity and defense, olfaction and responses to external stimuli. These results are consistent with the idea that artificial selection in domestic animals targets different functional categories than natural selection. This result contrasts with that of Akey et al., who found genes involved in immunity and defense to be overrepresented among their candidate selection regions.
Artificial selection on dog breeds coincided with breed creation bottlenecks, leading to genetically distinct breeds fixed for novel traits. Hence a large proportion of phenotypic and genetic variation is apportioned between but not within breeds. It is notable that 35% of polymorphic SNPs we analyzed are fixed or almost fixed for alternative alleles in two or more breeds. This is in sharp contrast to the differences between human populations, where only 78 near-fixed differences, all of which are strong candidates for being under selection, were observed between four populations among 15 million SNPs identified using whole-genome resequencing. The strong influence of genetic drift on genetic variation in dog breeds has also led to random fixation of long haplotypes, and it is estimated that on average ~25% of the genome lies within a homozygous block >100 kb in an average breed. This suggests that functional genetic variation has also been affected by genetic drift. This background of fixation of haplotypes by drift makes it extremely difficult to distinguish the signal of a selective sweep from background variation, and the two may often be indistinguishable.
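As an illustration of how the proportion of the genome lying in long homozygous blocks might be measured, the following is a minimal sketch rather than the procedure used in this study. It assumes sorted SNP positions on one chromosome and a per-SNP heterozygosity value for a single breed; the function names, the zero-heterozygosity criterion and the example data are hypothetical.

```python
from typing import List, Sequence, Tuple


def homozygous_blocks(
    positions: Sequence[int],
    het: Sequence[float],
    min_len_bp: int = 100_000,   # 100 kb, the block size quoted in the text
    max_het: float = 0.0,        # assumption: a SNP counts as fixed if het <= max_het
) -> List[Tuple[int, int]]:
    """Return (start, end) of maximal runs of fixed SNPs spanning >= min_len_bp."""
    blocks: List[Tuple[int, int]] = []
    run_start = None
    run_end = None
    for pos, h in zip(positions, het):
        if h <= max_het:
            if run_start is None:
                run_start = pos      # open a new run
            run_end = pos            # extend the current run
        else:
            # A variable SNP closes the run; keep it only if long enough.
            if run_start is not None and run_end - run_start >= min_len_bp:
                blocks.append((run_start, run_end))
            run_start = None
    if run_start is not None and run_end - run_start >= min_len_bp:
        blocks.append((run_start, run_end))
    return blocks


def fraction_in_blocks(positions: Sequence[int], blocks: List[Tuple[int, int]]) -> float:
    """Fraction of the surveyed span covered by the given blocks."""
    span = positions[-1] - positions[0]
    covered = sum(end - start for start, end in blocks)
    return covered / span if span > 0 else 0.0


# Toy data (hypothetical): seven SNPs, one of which is variable in the breed.
positions = [0, 50_000, 120_000, 150_000, 200_000, 260_000, 400_000]
het = [0.0, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0]
blocks = homozygous_blocks(positions, het)
coverage = fraction_in_blocks(positions, blocks)
```

Block length here is measured between the first and last fixed SNP of a run, so the estimate is conservative and depends on marker density.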
We performed coalescent modeling using realistic estimates of recombination and demographic parameters in order to compare the length distributions of genomic segments identified by our analyses with those expected under neutrality. These simulations are by necessity an approximation of the actual evolutionary and demographic forces that shaped patterns of genetic variation in dog breeds. First, we do not model selection, which may reduce effective population size. Second, we assume a simplified demographic model, involving a single domestication bottleneck and simultaneous breed creation. The true history of dog evolution is likely to be more complex than this, with some breeds showing closer relatedness than others. Nevertheless, long segments identified by the Si and di statistics that pass the 5% FDR cutoff are strong candidates for selective sweeps, and contain a number of regions already associated with phenotypic traits.
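The comparison of observed segment lengths with a neutral expectation can be sketched as an empirical false discovery rate. This is one common way to derive such a cutoff and is an assumption here, not necessarily the calculation used in this study; the function names and the toy numbers are hypothetical.

```python
from typing import Optional, Sequence


def empirical_fdr(
    observed: Sequence[float],
    simulated: Sequence[float],
    n_sim_genomes: int,
    threshold: float,
) -> float:
    """Expected neutral segments >= threshold per genome, divided by observed count."""
    n_obs = sum(1 for x in observed if x >= threshold)
    if n_obs == 0:
        return 0.0  # nothing is called, so there are no false discoveries
    expected_false = sum(1 for x in simulated if x >= threshold) / n_sim_genomes
    return min(1.0, expected_false / n_obs)


def length_cutoff(
    observed: Sequence[float],
    simulated: Sequence[float],
    n_sim_genomes: int,
    alpha: float = 0.05,
) -> Optional[float]:
    """Smallest observed length at which the empirical FDR is <= alpha."""
    for t in sorted(set(observed)):
        if empirical_fdr(observed, simulated, n_sim_genomes, t) <= alpha:
            return t
    return None


# Toy data (hypothetical): segment lengths in kb; 10 simulated neutral genomes.
observed = [100, 200, 500, 1200, 1500]
simulated = [100] * 30 + [200] * 10 + [500] * 1
cutoff = length_cutoff(observed, simulated, n_sim_genomes=10)
fdr_at_200 = empirical_fdr(observed, simulated, 10, 200)
```

In this toy example short segments are abundant in the neutral simulations, so only the longer observed segments fall below the 5% threshold.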
Simulations indicate that large segments of reduced heterozygosity and elevated FST are expected under neutrality, but longer segments of reduced heterozygosity, particularly those longer than 1 Mb, are not expected to occur due to drift alone and hence are more likely to reflect selection. In general we expect segments of reduced heterozygosity to contain causative variants under selection; however, in some cases we observe large blocks of reduced heterozygosity that appear to be broken up into adjacent regions separated by more variable regions. This pattern may reflect heterogeneity in ancestral haplotypes, which makes it difficult to pinpoint the focus of selection. Smaller blocks of elevated di often occur within extended regions of reduced heterozygosity. These probably reflect the fixation of variants that are otherwise rare in the dog population due to hitchhiking on the selected haplotype. However, most variants that are fixed by hitchhiking during a selective sweep are likely to be already common in the population, and therefore will not have a big effect on the di value of a region. This leads to stochasticity in the di statistic, which may explain the fact that even the longest di segments still do not pass a 5% FDR. When even denser surveys of SNP variation (e.g. from whole genome sequencing) are available, a more promising approach could be to identify selective sweeps using reductions in heterozygosity, and identify potential causative variants within these sweeps by their elevated FST
(see e.g. 
In addition to aiding in the dissection of the genetic components of phenotypic variation in dog breeds, we anticipate that our fine-scale map of genomic regions of extreme population differentiation and fixation of extended haplotypes will find utility for identification of disease-causing variants. Firstly, regardless of whether they are caused by selection or drift, regions with reduced heterozygosity in a particular breed are problematic to interrogate with GWAS and may harbor disease-causing variants that are not tagged on a SNP array. Secondly, genetic variants responsible for breed characteristics may have pleiotropic effects that increase incidence of disease in that breed. Thirdly, disease-causing mutations may have risen in frequency in regions under selection by genetic hitchhiking on haplotypes bearing variants under artificial selection. These considerations suggest that our candidate selection regions warrant additional scrutiny in disease mapping studies. An example of the second effect has recently been highlighted in the Shar Pei breed, where strong artificial selection for genetic variants that likely affect expression of the HAS2 gene is responsible for both the characteristic wrinkled skin of the breed and an increased predisposition to periodic fever syndrome.
Our analysis of single-SNP FST across breeds identified a number of extended genomic regions of extreme population differentiation between dog breeds, which harbor variants responsible for traits that commonly vary between dog breeds. Genetic variation in some of these regions correlates with multiple traits that vary between dog breeds, in some cases including both morphological and behavioral differences. There are several possible reasons for these multiple associations. One possibility is that these regions harbor multiple variants that each have an effect on different traits. Alternatively, the associations could be the result of single mutations with pleiotropic effects that affect multiple traits. It is also possible that traits may correlate with each other for other reasons. For example, there may have been coordinated selection for more than one trait in a subset of breeds, or a subset of breeds may share a trait simply by chance. We have comprehensively surveyed genetic variation in a region of extreme population differentiation on chromosome 10, where genetic variation correlates with body size, drop ears and boldness. As boldness shows strong correspondence with drop ears, it is unclear whether this trait is affected by an independent variant in this region. A more detailed analysis of the allele frequencies of SNPs associated with body size and drop ears is consistent with the hypothesis that these traits are controlled by two linked SNPs, which in combination produce three observed haplotypes associated with distinct phenotypes. It is therefore possible that additional regions of extreme population differentiation also harbor multiple variants affecting different traits. Careful genetic dissection of each region is necessary to identify all functional variants and the traits they affect. As extensive LD is found in these regions, it is difficult to determine how many functional variants are present and their precise location. Such analysis would therefore be aided by the use of multiple breeds or populations with less extensive LD in order to narrow down the associated intervals.
In its most extreme form, a selective sweep is characterized by the rapid fixation of a new mutation under selection along with linked genetic variants (a hard sweep). However, less extreme selective episodes (soft sweeps), such as incomplete selective sweeps or selection on standing variation, may also be common. It has been argued that polygenic adaptation, where subtle changes in allele frequencies occur at many loci, is the dominant form of phenotypic evolution in natural populations. This type of evolution is likely when variation in a trait of interest is controlled by a large number of loci with small effect, which is now known to be the case with a number of highly heritable quantitative metabolic and morphological traits in humans. A long-term selection experiment in Drosophila melanogaster also uncovered evidence for this kind of adaptation. Artificial selection in dogs appears to have caused genetic variants with much larger phenotypic effects to segregate at high frequencies, resulting in the simplification of the genetic architecture of phenotypic variation. In some cases, breed-defining characteristics such as chondrodysplasia, skin wrinkling and brachycephaly are likely to result from hard sweeps at breed creation. However, many variants with large phenotypic effects appear to show continuous variation between breeds that correlates with particular traits, including genetic variants that associate with body size in the IGF1 locus on chromosome 15 and with drop ear on chromosome 10, suggesting that selection by attenuation of allele frequencies is also common. Hence, although hard sweeps are likely to be a more common form of selection in domestic than in wild species, it is likely that more minor changes in allele frequencies across many loci also contribute to phenotypic evolution.
The huge phenotypic diversity present in dogs raises the question as to whether levels of functional genetic variation in the ancestral dog population were elevated, adding to the raw material that artificial selection could act on. Relatively higher levels of replacement amino acid changes are found in dogs compared with wolves, possibly indicating a relaxation of selective constraint. There are also a large number of loci in the dog genome polymorphic for the active SINEC_Cf elements, which may also contribute to functional genetic variation, although it is not known whether functional variation due to these elements is increased in dogs compared with wolves. It has also been suggested (and disputed) that the dog genome has a high intrinsic mutation rate. There is also great interest in looking for “domestication genes” by identifying loci under selection in domestic species compared to wild ancestors. Investigation of these processes that occurred in the ancestral dog population requires detailed comparisons of patterns of genetic variation in dogs and wolves. As the majority (>98%) of SNPs on the CanineHD array were discovered by comparisons of dog breeds, they are biased against fixed differences between dogs and wolves and wolf-specific SNPs. Additional SNP discovery in wolves is therefore necessary to unravel the evolutionary processes involved in early dog domestication. Whole genome resequencing of both dogs and wolves will be important for a more detailed understanding of these processes.
It is likely that artificial selection in dogs (and other domestic animals) has led to the proliferation of mutations with large effects. This has contributed to the success of the dog as a model for genetic dissection of phenotypic traits. Such variants are likely to be maladaptive in the wild, and may also increase susceptibility to disease. Hence examining regions under selection in breeds may aid in identification of genetic risk factors affecting susceptibility to disease. Studying the extreme variation in forms produced by artificial selection also gives us a window into studying the effects of selection in natural populations, as first realized by Darwin. Understanding the effects of selection on the genomes of domestic animals should give us insight into understanding its effects on nondomestic species, including our own.