Although several population genetics studies of Jewish cohorts have been published in the past two years [21], the findings of the present study are novel in several ways. First, prior studies have emphasized commonalities among Jewish sub-populations, as well as their relative proximity to European and Levantine populations. By contrast, the present study took the complementary approach of defining the spectrum of autosomal variation that is AJ-specific. Moreover, using novel pathway analyses, the present study related population genetic variation to patterns of disease propensity in the Ashkenazi population. Second, the present study examined intra-Ashkenazi variation. Finally, we provide a robust yet compact list of AIMs for the Ashkenazi population.
The primary result of the present study is the specification of the allelic content of an autosomal genetic signature that can distinguish the Ashkenazi Jewish population both from its host populations in Europe and from other populations that originate in the same geographic area of the Levant. To our knowledge, ours is the first study of the Ashkenazi population to use cross-validation metrics to identify the optimal solution to the assignment of population ancestry scores. Previous studies using similar approaches have demonstrated the ability of genomic information to differentiate Ashkenazi samples from those drawn from other populations [1]. However, each of these studies has suggested that AJ samples occupy an intermediate position, or represent an admixture, between European and Levantine populations. Although one recent paper suggested 30% to 60% European admixture in Ashkenazi and other Jewish samples [24], the present study found relatively little (≤10%) overlap of AJ genetic ancestry components in non-AJ Levantine populations. In the statistically optimal ADMIXTURE result in our study, European admixture followed a pattern indicative of second-generation admixture rather than deeper mingling with the host populations. Moreover, pairwise genetic distances were not consistent with an intermediate positioning of the AJ population relative to the European and Levantine populations.
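The cross-validation step mentioned above can be illustrated with a minimal sketch. It assumes that one cross-validation error has already been obtained for each candidate number of ancestral components K (for example, from separate ADMIXTURE runs with cross-validation enabled) and simply selects the K with the lowest error. The function name and the error values are hypothetical and are not taken from the study.

```python
def select_optimal_k(cv_errors):
    """Return the K with the lowest cross-validation error.

    cv_errors maps each candidate K (int) to its cross-validation
    error (float). Ties are broken in favour of the smaller K, which
    is an assumption of this sketch rather than a rule from the study.
    """
    if not cv_errors:
        raise ValueError("cv_errors must contain at least one K")
    return min(cv_errors, key=lambda k: (cv_errors[k], k))


# Hypothetical cross-validation errors, for illustration only.
example_errors = {2: 0.55, 3: 0.52, 4: 0.53, 5: 0.56}
best_k = select_optimal_k(example_errors)
print("Selected K:", best_k)
```

In this sketch the "statistically optimal" solution is simply the K that minimizes the held-out prediction error; other selection criteria are possible.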
It should be emphasized that these results do not suggest an independent (for example, Khazar or non-Levantine) lineage for the AJ population, a hypothesis that has generally been ruled out by prior literature [16]. Rather, Table demonstrates relative proximity among several populations with Mediterranean heritage, including the AJ, Palestinians, and Italians, suggestive of an ancient common deme. Additionally, the FST data indicate approximately equal genetic distances between the AJ and western (French), eastern (Adygei), and Middle Eastern (Palestinian) cohorts. This pattern is consistent with the suggestion that founder effects and subsequent drift account for the data more strongly than substantial local admixture with the European host populations in the last 1,000 years.
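As an illustration of how a pairwise FST distance of this kind can be obtained from allele frequencies, the following minimal sketch uses the textbook Wright definition, FST = (HT - HS) / HT, for two equally weighted populations, combined across SNPs as a ratio of sums. This is a simplified estimator chosen for clarity; it does not correct for sample size and is not necessarily the estimator used in the present study. The function name and example frequencies are hypothetical.

```python
def fst_two_populations(freqs_a, freqs_b):
    """Wright-style FST between two populations.

    freqs_a and freqs_b are equal-length sequences holding the frequency
    of the same reference allele at each biallelic SNP in populations A
    and B. Both populations are weighted equally (a simplifying
    assumption), and no sample-size correction is applied.
    """
    if len(freqs_a) != len(freqs_b):
        raise ValueError("frequency lists must have the same length")
    numerator = 0.0
    denominator = 0.0
    for p_a, p_b in zip(freqs_a, freqs_b):
        p_bar = (p_a + p_b) / 2.0
        # Expected heterozygosity in the pooled population (HT).
        h_total = 2.0 * p_bar * (1.0 - p_bar)
        # Mean expected heterozygosity within populations (HS).
        h_within = (2.0 * p_a * (1.0 - p_a) + 2.0 * p_b * (1.0 - p_b)) / 2.0
        numerator += h_total - h_within
        denominator += h_total
    # Monomorphic data carry no information; report zero differentiation.
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical allele frequencies at a single SNP, for illustration only.
print(round(fst_two_populations([0.2], [0.4]), 4))
```

Identical frequencies give a value of 0 and fixed differences give a value of 1, so approximately equal FST values against several reference cohorts indicate approximately equal differentiation from each of them.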
Moreover, the present study is the first to examine residual intra-population variance in AJ samples in comparison to host European populations. Results of our intra-AJ principal components analysis indicated that residual structure was minimal, was not related to geographic origin within Europe, and did not map onto differences in host population. Taken together, these data most likely reflect the unique contributions of the AJ founder population to the genetic make-up of present-day Ashkenazim. At the same time, we acknowledge that our autosomal data may not capture certain components of ancestry that are accessible to mitochondrial DNA and Y-chromosome studies, such as sex differences in origin and number of founders [16].
Having identified this AJ-specific signature, we then sought to characterize its primary allelic content in order to determine its potential relevance to future disease mapping studies. We developed a robust yet compact set of AIMs that can be applied to refine studies of European or European-American cohorts, which are still the most commonly used in disease mapping GWASs. These AIMs will also be useful in future GWASs of AJ cohorts, insofar as they can identify individuals with varying degrees of recent European admixture, thereby reducing residual intra-population structure (Figure ). The lack of significant intra-population structure suggests that the AJ population may be useful for disease-mapping studies, with the possibility of enhanced signal-to-noise for the detection of (at least a subset of) disease-related alleles [15].
Alleles within the MHC were the most substantial contributors to both inter-population and intra-population variance. MHC markers comprised approximately 6% of the 13,841 SNPs that were correlated with the AJ-specific signature, including polymorphisms in both class I and class II genes. Prior research has consistently demonstrated the MHC to be the region most sensitive to population differences [40], typically due to geographic differences in exposure history [41]. These population differences have implications for susceptibility to autoimmune diseases [42], and may account for the increased rate of pemphigus vulgaris in AJ individuals [43]. Recent studies associating SNPs in the MHC with serious drug-induced side effects [44], viral load in HIV [45], and psychiatric illness [46] also indicate the clinical relevance of more extensive elaboration of population differences in MHC alleles.
Characterization of the AJ-specific component also identified several coding variants known to be associated with disease, and detected markers in CFTR that are relevant to the increased prevalence of cystic fibrosis and Crohn's disease in the Ashkenazi population [47]. Perhaps the most surprising result from the present study, however, was the over-representation of GO categories containing disease-bearing genes commonly associated with the AJ population. For example, the AJ cohort differed from other populations not merely in CFTR allele frequencies, but also in allele frequencies in most other genes associated with transepithelial chloride transport. However, it should be noted that these data do not provide specific evidence of causality between the existence of AJ-prevalent disease-causing mutations in these pathways and the over-representation of certain common alleles in related genes. Speculatively, these results suggest the possibility that deleterious recessive alleles may persist at relatively high frequencies in the AJ population due to epistatic effects with other genes in the same biological pathway, which also display altered allele frequencies in the AJ population.
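To make the notion of GO category over-representation concrete, the following minimal sketch computes a one-sided hypergeometric p-value, a common choice for this kind of enrichment test. The text above does not specify the test used in the present study, so this is a generic illustration rather than a reconstruction of the authors' pathway analysis. The function name and all gene counts are hypothetical.

```python
from math import comb


def enrichment_p_value(total_genes, category_genes, signature_genes, overlap):
    """One-sided hypergeometric p-value for over-representation.

    Returns the probability of observing at least `overlap` category
    members when `signature_genes` genes are drawn at random, without
    replacement, from `total_genes` genes, of which `category_genes`
    belong to the GO category of interest.
    """
    max_overlap = min(category_genes, signature_genes)
    denominator = comb(total_genes, signature_genes)
    tail = 0
    for i in range(overlap, max_overlap + 1):
        # comb returns 0 when the second argument exceeds the first,
        # so impossible configurations contribute nothing to the sum.
        tail += comb(category_genes, i) * comb(
            total_genes - category_genes, signature_genes - i
        )
    return tail / denominator


# Hypothetical counts, for illustration only.
p = enrichment_p_value(
    total_genes=20000, category_genes=40, signature_genes=500, overlap=6
)
print(f"p = {p:.3g}")
```

A small p-value indicates that the category contains more signature genes than expected by chance; in practice such values would also need correction for the number of GO categories tested.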