Patterns of allelic variation in the genome are shaped by the successes and failures of genes under evolutionary forces acting throughout population history. When a genetic variant becomes adaptive, populations experience changes in allele frequencies that reflect the strength and recurrence of the selective pressure(s). By identifying these residual footprints of genomic evolutionary processes in the form of “signatures of selection,” we hope to gain valuable insight into the evolutionary past of a species. A principal selection signature is the local reduction in variation within the selected gene, as well as in adjacent SNP variants around the selected chromosomal region, known as a “selective sweep”. Further, when two isolated populations are examined, one of which underwent strong selection in the past while the other did not, the frequencies of the selected SNP and adjacent alleles will often differ more between the populations than expected under neutral genetic drift. In addition, selection affects chromosomal segments, not simply individual SNPs, thus creating complex patterns of allele frequencies in the regions immediately surrounding the targeted site. In this study, we explore patterns of reduced heterozygosity and elevated between-population allele differentiation to identify strong selection signatures in the human genome. The method can also be applied to other diploid species when light-coverage SNP allele frequency genome scans of similar magnitude become available.
To address these aspects as well as to explore a new approach, we designed and tested a strategy for revealing footprints of recent selection, first by simulation and then based upon a sample of 183,993 SNP markers genotyped in 45 European Americans and 45 African Americans (www.allsnp.com). This dataset was chosen since it is independent and has never been used to estimate selection signatures. It also represents a modest database size that is smaller than the sizes of current whole-genome genotyping human population studies based on existing genotyping technologies. We used the minimum information that would likely be available from such databases and searched for selective signatures by analysing three parameters observed for each SNP: 1) heterozygosity in European Americans (HEA); 2) heterozygosity in African Americans (HAA); and 3) FST between the corresponding individual SNPs in the two populations (FST). Centered on each available SNP, 31 arrays (or windows) including 5–65 SNPs were sampled along all human chromosomes except Y to evaluate each window for: 1) average SNP heterozygosity in European Americans (EA); 2) average SNP heterozygosity in African Americans (AA); and 3) variance of FST among the adjacent SNPs (S2). We used S2 instead of the mean FST estimator from each group of adjacent SNPs, since the mean FST across an array of loci would be more sensitive to alleles that reached fixation for opposite alleles in the two populations and less sensitive to those fixed in the same direction, while the variance captures this alternation.
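The per-SNP and per-window quantities described above can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the text: heterozygosity is taken as the expected value 2p(1 − p) from a biallelic allele frequency p, FST is computed as (HT − HS)/HT with the two populations weighted equally, the 31 window sizes are assumed to be the odd numbers from 5 to 65, and the window variance is the sample variance. The function names are hypothetical and are not those of the original analysis.

```python
import numpy as np

# Assumption: the 31 windows are the odd sizes 5, 7, ..., 65 centered on a SNP.
WINDOW_SIZES = list(range(5, 66, 2))


def expected_heterozygosity(p):
    """Expected heterozygosity 2p(1 - p) for biallelic allele frequencies p."""
    p = np.asarray(p, dtype=float)
    return 2.0 * p * (1.0 - p)


def fst_per_snp(p_ea, p_aa):
    """Per-SNP FST = (HT - HS) / HT, weighting both populations equally.

    SNPs that are monomorphic in the pooled sample (HT == 0) get FST = 0.
    This simple estimator is an illustrative assumption.
    """
    p_ea = np.asarray(p_ea, dtype=float)
    p_aa = np.asarray(p_aa, dtype=float)
    hs = (expected_heterozygosity(p_ea) + expected_heterozygosity(p_aa)) / 2.0
    ht = expected_heterozygosity((p_ea + p_aa) / 2.0)
    return np.divide(ht - hs, ht, out=np.zeros_like(ht), where=ht > 0)


def window_stats(h_ea, h_aa, fst, center, size):
    """Mean heterozygosity in each population and variance of FST
    for a window of `size` SNPs centered on index `center`.

    Returns None when the window would run past either end of the array.
    """
    half = size // 2
    lo, hi = center - half, center + half + 1
    if lo < 0 or hi > len(fst):
        return None
    mean_h_ea = float(np.mean(h_ea[lo:hi]))
    mean_h_aa = float(np.mean(h_aa[lo:hi]))
    s2_fst = float(np.var(fst[lo:hi], ddof=1))  # sample variance
    return mean_h_ea, mean_h_aa, s2_fst


if __name__ == "__main__":
    # Toy allele frequencies for seven adjacent SNPs (made-up values).
    p_ea = np.array([0.50, 0.95, 0.10, 1.00, 0.05, 0.90, 0.40])
    p_aa = np.array([0.45, 0.20, 0.15, 0.10, 0.10, 0.30, 0.35])
    h_ea = expected_heterozygosity(p_ea)
    h_aa = expected_heterozygosity(p_aa)
    fst = fst_per_snp(p_ea, p_aa)
    print(window_stats(h_ea, h_aa, fst, center=3, size=5))
```

In a full scan, `window_stats` would be evaluated for every SNP and every size in `WINDOW_SIZES`, and windows combining low mean heterozygosity with high FST variance would be the candidates for further inspection.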
Several previous analyses of selective signatures have appeared, based on decreased heterozygosity (H), population differentiation (FST), extended linkage disequilibrium, or even the premise that certain modern hereditary disease alleles were adaptive sometime in the past. Discovered selection candidate regions included genes involved in development, immune defenses, reproduction, nutrition, behavior and other functions. Although several of these regions have been discovered with multiple approaches (e.g., CCR5, FY, LCT, G6PD, FOXP2 and others; Table S1, Notes S1), other provocative regions have not, raising issues around the context of different algorithms and approaches, the strength and mode of selection, the timing of imputed selective events, the influence of study design, and the validity of unreplicated regions. To test the validity of our approach, the results of our scan were applied to previously nominated regions to explore how well this method validated previous discoveries.
Finally, we incorporated nine other genome-wide or chromosome-wide attempts to find signatures of selection. These included searches for high values of local genomic divergence alone or in combination with the allelic frequency spectrum, searches for gene neighborhoods exhibiting extended linkage disequilibrium alone or in combination with local genomic divergence, and examinations of an aberrant frequency spectrum. We assessed the ten studies, including our own, and evaluated our findings in a multiple study comparison.