1.  Galaxy tools to study genome diversity 
GigaScience  2013;2:17.
Intra-species genetic variation can be used to investigate population structure, selection, and gene flow in non-model vertebrates. Because the cost of genome sequencing has plummeted, small labs can now obtain full-genome variation data from their species of interest. However, those labs may lack easy access to, and familiarity with, the computational tools needed to analyze those data.
We have created a suite of tools for the Galaxy web server aimed at handling nucleotide and amino-acid polymorphisms discovered by full-genome sequencing of several individuals of the same species, or by a SNP genotyping microarray. In addition to providing user-friendly tools, a main goal is to make published analyses reproducible. While most of the examples discussed in this paper deal with nuclear-genome diversity in non-human vertebrates, we also illustrate the application of the tools to fungal genomes, human biomedical data, and mitochondrial sequences.
This project illustrates that a small group can design, implement, test, document, and distribute a Galaxy tool collection to meet the needs of a particular community of biologists.
2.  A Copy Number Variant at the KITLG Locus Likely Confers Risk for Canine Squamous Cell Carcinoma of the Digit 
PLoS Genetics  2013;9(3):e1003409.
The domestic dog is a robust model for studying the genetics of complex disease susceptibility. The strategies used to develop and propagate modern breeds have resulted in an elevated risk for specific diseases in particular breeds. One example is that of Standard Poodles (STPOs), which have increased risk for squamous cell carcinoma of the digit (SCCD), a locally aggressive cancer that causes lytic bone lesions, sometimes with multiple toe recurrence. However, only STPOs of dark coat color are at high risk; light-colored STPOs are almost entirely unaffected, suggesting that interactions between multiple pathways are necessary for oncogenesis. We performed a genome-wide association study (GWAS) on STPOs, comparing 31 SCCD cases to 34 unrelated black STPO controls. The peak SNP on canine chromosome 15 was statistically significant at the genome-wide level (P_raw = 1.60 × 10⁻⁷; P_genome = 0.0066). Additional mapping resolved the region to the KIT Ligand (KITLG) locus. Comparison of STPO cases to other at-risk breeds narrowed the locus to a 144.9-Kb region. Haplotype mapping among 84 STPO cases identified a minimal region of 28.3 Kb. A copy number variant (CNV) containing predicted enhancer elements was found to be strongly associated with SCCD in STPOs (P = 1.72 × 10⁻⁸). Light-colored STPOs carry the CNV risk alleles at the same frequency as black STPOs, but are not susceptible to SCCD. A GWAS comparing 24 black and 24 light-colored STPOs highlighted only the MC1R locus as significantly different between the two datasets, suggesting that a compensatory mutation within the MC1R locus likely protects light-colored STPOs from disease. Our findings highlight a role for KITLG in SCCD susceptibility and demonstrate that interactions between the KITLG and MC1R loci are potentially required for SCCD oncogenesis. These findings highlight how studies of breed-limited diseases are useful for disentangling multigene disorders.
Author Summary
Domesticated dogs offer a unique mechanism for disentangling complex genetic traits, such as cancer. Over 300 breeds exist worldwide, each selected for particular morphologic and behavioral traits. Unfortunately, the breeding programs used to generate such diversity are associated with breed-specific increases in disease. Squamous cell carcinoma of the digit (SCCD) is a locally aggressive cancer that causes lytic bone lesions and, occasionally, death. Among the breeds with the highest risk is the Standard Poodle (STPO), where the disease is found only in dark-coated dogs. We show that the KITLG locus is highly associated with SCCD and that a 5.7-Kb copy number variant is likely causative for the disease when in an expanded form. Interestingly, light-colored STPOs carry the putative causal variant at the same frequency as black STPOs, but are protected from SCCD. We show this is likely due to a compensatory mutation in the well-known coat color locus, MC1R. This work demonstrates the utility of dog breeds for understanding the genetic causes of complex diseases of interest to both human and animal health.
3.  Genome-wide SNP and haplotype analyses reveal a rich history underlying dog domestication 
Nature  2010;464(7290):898-902.
Advances in genome technology have facilitated a new understanding of the historical and genetic processes crucial to rapid phenotypic evolution under domestication [1,2]. To understand the process of dog diversification better, we conducted an extensive genome-wide survey of more than 48,000 single nucleotide polymorphisms in dogs and their wild progenitor, the grey wolf. Here we show that dog breeds share a higher proportion of multi-locus haplotypes unique to grey wolves from the Middle East, indicating that they are a dominant source of genetic diversity for dogs rather than wolves from east Asia, as suggested by mitochondrial DNA sequence data [3]. Furthermore, we find a surprising correspondence between genetic and phenotypic/functional breed groupings but there are exceptions that suggest phenotypic diversification depended in part on the repeated crossing of individuals with novel phenotypes. Our results show that Middle Eastern wolves were a critical source of genome diversity, although interbreeding with local wolf populations clearly occurred elsewhere in the early history of specific lineages. More recently, the evolution of modern dog breeds seems to have been an iterative process that drew on a limited genetic toolkit to create remarkable phenotypic diversity.
4.  Breed-Specific Ancestry Studies and Genome-Wide Association Analysis Highlight an Association Between the MYH9 Gene and Heat Tolerance in Alaskan Sprint Racing Sled Dogs 
Mammalian Genome  2011;23(1-2):178-194.
Alaskan sled dogs are a genetically distinct population shaped by generations of selective interbreeding with purebred dogs to create a group of high-performance athletes. As a result of selective breeding strategies, sled dogs present a unique opportunity to employ admixture-mapping techniques to investigate how breed composition and trait selection impact genomic structure. We used admixture mapping to investigate genetic ancestry across the genomes of two classes of sled dogs, sprint and long-distance racers, and combined that with genome-wide association studies (GWAS) to identify regions correlating with performance-enhancing traits. The sled dog genome is enhanced by differential contributions from four non-admixed breeds (Alaskan Malamute, Siberian Husky, German Shorthaired Pointer, and Borzoi). A principal components analysis (PCA) of 115,000 genome-wide SNPs clearly resolved the sprint and distance populations as distinct genetic groups, with longer blocks of linkage disequilibrium (LD) observed in the distance versus sprint dogs (7.5–10 and 2.5–3.75 kb, respectively). Further, we identified eight regions with the genomic signal either from a selective sweep or an association analysis, corroborated by an excess of ancestry when comparing sprint and distance dogs. A comparison of elite and poor-performing sled dogs identified a single region significantly associated with heat tolerance. Within the region we identified seven SNPs within the myosin heavy chain 9 gene (MYH9) that were significantly associated with heat tolerance in sprint dogs, two of which correspond to conserved promoter and enhancer regions in the human ortholog.
5.  Heterozygosity of the Yellowstone wolves 
Molecular Ecology  2010;19(16):3246-3249.
6.  A Simple Genetic Architecture Underlies Morphological Variation in Dogs 
PLoS Biology  2010;8(8):e1000451.
The largest genetic study to date of morphology in domestic dogs identifies genes controlling nearly 100 morphological traits and identifies important trends in phenotypic variation within this species.
Domestic dogs exhibit tremendous phenotypic diversity, including a greater variation in body size than any other terrestrial mammal. Here, we generate a high density map of canine genetic variation by genotyping 915 dogs from 80 domestic dog breeds, 83 wild canids, and 10 outbred African shelter dogs across 60,968 single-nucleotide polymorphisms (SNPs). Coupling this genomic resource with external measurements from breed standards and individuals as well as skeletal measurements from museum specimens, we identify 51 regions of the dog genome associated with phenotypic variation among breeds in 57 traits. The complex traits include average breed body size and external body dimensions and cranial, dental, and long bone shape and size with and without allometric scaling. In contrast to the results from association mapping of quantitative traits in humans and domesticated plants, we find that across dog breeds, a small number of quantitative trait loci (≤3) explain the majority of phenotypic variation for most of the traits we studied. In addition, many genomic regions show signatures of recent selection, with most of the highly differentiated regions being associated with breed-defining traits such as body size, coat characteristics, and ear floppiness. Our results demonstrate the efficacy of mapping multiple traits in the domestic dog using a database of genotyped individuals and highlight the important role human-directed selection has played in altering the genetic architecture of key traits in this important species.
Author Summary
Dogs offer a unique system for the study of genes controlling morphology. DNA from 915 dogs from 80 domestic breeds, as well as a set of feral dogs, was tested at over 60,000 points of variation and the dataset analyzed using novel methods to find loci regulating body size, head shape, leg length, ear position, and a host of other traits. Because each dog breed has undergone strong selection by breeders to have a particular appearance, there is a strong footprint of selection in regions of the genome that are important for controlling traits that define each breed. These analyses identified new regions of the genome, or loci, that are important in controlling body size and shape. Our results, which feature the largest number of domestic dogs studied at such a high level of genetic detail, demonstrate the power of the dog as a model for finding genes that control the body plan of mammals. Further, we show that the remarkable diversity of form in the dog, in contrast to some other species studied to date, appears to have a simple genetic basis dominated by genes of major effect.
7.  Molecular and Evolutionary History of Melanism in North American Gray Wolves 
Science (New York, N.Y.)  2009;323(5919):1339-1343.
Morphological diversity within closely related species is an essential aspect of evolution and adaptation. Mutations in the Melanocortin 1 receptor (Mc1r) gene contribute to pigmentary diversity in natural populations of fish, birds, and many mammals. However, melanism in the gray wolf, Canis lupus, is caused by a different melanocortin pathway component, the K locus, that encodes a beta-defensin protein that acts as an alternative ligand for Mc1r. We show that the melanistic K locus mutation in North American wolves derives from past hybridization with domestic dogs, has risen to high frequency in forested habitats, and exhibits a molecular signature of positive selection. The same mutation also causes melanism in the coyote, Canis latrans, and in Italian gray wolves, and hence our results demonstrate how traits selected in domesticated species can influence the morphological diversity of their wild relatives.
8.  Coat Variation in the Domestic Dog Is Governed by Variants in Three Genes 
Science (New York, N.Y.)  2009;326(5949):150-153.
Coat color and type are essential characteristics of domestic dog breeds. Although the genetic basis of coat color has been well characterized, relatively little is known about the genes influencing coat growth pattern, length, and curl. We performed genome-wide association studies of more than 1000 dogs from 80 domestic breeds to identify genes associated with canine fur phenotypes. Taking advantage of both inter- and intrabreed variability, we identified distinct mutations in three genes, RSPO2, FGF5, and KRT71 (encoding R-spondin–2, fibroblast growth factor–5, and keratin-71, respectively), that together account for most coat phenotypes in purebred dogs in the United States. Thus, an array of varied and seemingly complex phenotypes can be reduced to the combinatorial effects of only a few genes.
9.  An Expressed Fgf4 Retrogene Is Associated with Breed-Defining Chondrodysplasia in Domestic Dogs 
Science (New York, N.Y.)  2009;325(5943):995-998.
Retrotransposition of processed mRNAs is a frequent source of novel sequence acquired during the evolution of genomes. The vast majority of retroposed gene copies are inactive pseudogenes that rapidly acquire mutations that disrupt the reading frame, while precious few are conserved to become new genes. Utilizing a multi-breed association analysis in the domestic dog, we demonstrate that a recently acquired fgf4 retrogene causes chondrodysplasia, a short-legged phenotype that defines several common dog breeds including the dachshund, corgi and basset hound. The discovery that a single evolutionary event underlies a breed-defining phenotype for 19 diverse dog breeds demonstrates the importance of unique mutational events in constraining and directing phenotypic diversity in the domestic dog.
