Control region vs. complete mtDNA sequence comparisons
presents genotyping results for all samples. A total of 49 putative founding lineages were identified and further analyzed as outlined above. Once again, here and below we use a narrow definition of “a founder lineage”: a lineage present in the sample of the contemporary community at a frequency equal to or greater than 5%, based upon the complete mtDNA genomic sequence for that lineage. Figure 1 illustrates the overall diversity of the non-Ashkenazi (this study) and Ashkenazi Jewish founding lineages and their distribution within the various Jewish communities. One X2b Moroccan Jewish putative founding lineage was analyzed using 2 complete mtDNA sequences. One putative founding lineage in Hg T2 was shared by Iraqi and Iranian Jews, and was assessed by two complete mtDNA sequences. Two putative founding lineages (one in Hg H and one in Hg X2e) were shared by Libyan and Tunisian Jews, and were assessed by the same complete mtDNA information. Therefore, the current study yielded a total of 49 novel complete mtDNAs. The detailed phylogenetic tree drawn from the complete mtDNA information is shown in Figure S1.
Figure 1. A median joining network representing all founding lineages found among Jewish communities, comprising the 49 novel complete mtDNA sequences from non-Ashkenazi Jews and the four Ashkenazi lineages previously reported by Behar et al.
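The 5% frequency criterion used for a founder lineage can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the function name `putative_founders`, the input format (one lineage label per sampled individual) and the toy labels are assumptions made for the example.

```python
from collections import Counter


def putative_founders(lineages, threshold=0.05):
    """Return {lineage: frequency} for lineages at or above `threshold`.

    `lineages` holds one lineage label per sampled individual.
    """
    counts = Counter(lineages)
    total = len(lineages)
    return {lin: n / total for lin, n in counts.items() if n / total >= threshold}


# Hypothetical community sample of 40 individuals (labels are placeholders,
# not lineages from this study).
sample = ["A"] * 20 + ["B"] * 2 + ["C"] * 1 + [f"singleton{i}" for i in range(17)]

founders = putative_founders(sample)

# Share of the sampled variation accounted for by the flagged lineages.
covered = sum(founders.values())
```

In this toy sample, lineages "A" and "B" meet the threshold while "C" and the singletons do not. Note that in a sample of 40 a doubleton already reaches exactly 5%, which is why small sample sizes call for cautious interpretation.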
For 37 of the 49 putative founding lineages assessed, diagnostic coding region single nucleotide polymorphisms (SNPs) (described in Methods) have been identified (Table S4). The identification of such diagnostic coding region SNPs indicates that the corresponding samples belong to one founding lineage. Often, the inferences obtained from the complete mtDNA sequence paralleled the information from the control region. For example, in Georgian Jews, the presence of the control region haplotype 16067-16355-150-263 suggested the presence of a monophyletic clade HV1a1a1 (the use of italic fonts in a clade name is explained in Material and Methods) within Hg HV1, an assumption that was confirmed by genotyping three private coding region positions, 4227, 4257 and 9554, discerned from the complete mtDNA sequences (Table S1). One additional Georgian Jewish sample belonged to Hg HV1, but did not share the control region substitutions at 16355 and 150, and also lacked the coding region transitions specific for the HV1a1a1 founder. In contrast, two mtDNA genomes found among Azerbaijani Jews exactly matched the Georgian founding lineage and indeed were shown to share the identical coding region variants as well.
Phylogeny networks of the Azerbaijani (a), Georgian (b) and Libyan (c) Jewish case studies.
The remaining 12 putative founding lineages showed a variety of relationships between the control region and complete sequence information within the hierarchy of coding region SNPs (Table S4). For example, in Cochin Jews, 12 samples were ascertained as belonging to Hg M5a1 (Figure S1), all of which were considered to belong to a monophyletic clade, as they all shared the first hypervariable segment (HVS-I) haplotype 16223-16257-16519-73-263. However, the two putative diagnostic coding region positions that were examined (4373 and 10589) clustered these samples into two nested groups. All samples shared position 4373, but only 11 of the 12 samples shared position 10589, the remaining one representing the likely ancestral haplotype or a sister lineage within the sub-clade. The Iraqi Jewish mtDNAs within Hg J1 had the hallmark control region haplotype 16069-16126-16145-16222-16261-73-263-295 and thus were initially considered to belong to Hg J1b. Following complete sequence analysis, it became apparent that the lineage does not share positions 5460 and 13879 with Hg J1b and therefore represents a split on the link from J1 to J1b, labeled herein as J1b'e. Two mutations (1733 and 8269) were examined in all 14 samples with the relevant control region motif (Tables S1 and S4). Position 8269 was shared by all of them, suggesting their descent from the same deep J1 branch, while position 1733 turned out not to be in the derived state in 6 samples, which instead all carried a substitution at position 152 in the second hypervariable segment (HVS-II). Hence, the J1 mtDNA lineages in Iraqi Jews descend from two founding mothers rather than one, with a not yet fully resolved location under J1b'e and J1e. The same pattern repeated itself in the Moroccan Jewish Hg H4a1a, Libyan-Tunisian Jewish Hg H30 and Yemenite Jewish Hg R0a1c putative founding lineages, which were also shown to comprise either two daughter or two sister lineages within the sub-clade (Tables S1 and S4). In yet other cases, mtDNAs with identical control region sequences within a community, generally indicative of very recent common ancestry, showed a rather high level of heterogeneity in their coding regions, consistent with a considerably more remote shared ancestry. Among Moroccan Jews, all 8 samples designated as Hg X2b1 (Tables S1 and S4, Figure S1) shared the identical control region haplotype 16183C-16189-16223-16248-16278-16519-73-153-195-225-226-263. The first randomly chosen complete mtDNA sequence determined among them revealed three substitutions (at nps 1555, 2308 and 8814) which were not present among previously published X2 sequences. Variation at nps 1555 and 2308 was assayed among the rest of the Moroccan Jewish X2b1 sequences, and none of these samples showed the derived state at these positions. A second randomly chosen sample was also fully sequenced, and showed a derived allele at position 8814, as in the previous sample, plus two additional mutations at positions 6335 and 8277. Further genotyping revealed that this single informative variant at position 8814 was shared among all samples, confirming their remote common ancestry within X2b1. An analogous scenario was observed in the Bulgarian Jewish Hg T2f lineage (Tables S1 and S4).
In four putative founding lineages identified among Moroccan (H1e), Bulgarian (H25) and Turkish (H1p) Jews, the number of samples actually found to belong to one monophyletic clade was substantially lower than the suspected number, based solely on the initial inspection of their control region variation (Table S4). Their control region haplotypes contain positions 16519C and 263G (Table S4). We found, however, that even within regional communities, these mutations provide no phylogenetically informative value within Hg H. Finally, in the Hg X2e1a1a Libyan-Tunisian Jewish putative founding lineage, no coding region hierarchy was observed, but the HVS-I based tree clearly deviated from a star phylogeny and suggested the expansion of several sister clades (Table S1).
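The way diagnostic coding region positions sort samples with one control region haplotype into nested or sister groups can be illustrated with a short Python sketch. It is a simplified illustration rather than the procedure actually used in the study: the function names, the data layout (a set of derived positions per sample) and the sample IDs are assumptions. The counts follow the Cochin M5a1 example in the text (12 samples derived at 4373, 11 of them also derived at 10589).

```python
from collections import defaultdict


def group_by_derived_state(genotypes, positions):
    """Group sample IDs by the set of tested positions at which they are derived."""
    groups = defaultdict(list)
    for sample_id, derived in genotypes.items():
        key = frozenset(p for p in positions if p in derived)
        groups[key].append(sample_id)
    return dict(groups)


def is_nested(groups):
    """True if the derived-state sets form a chain, each contained in the next."""
    keys = sorted(groups, key=len)
    return all(a <= b for a, b in zip(keys, keys[1:]))


# Placeholder sample IDs; counts taken from the Cochin M5a1 description.
genotypes = {f"s{i}": {4373, 10589} for i in range(11)}
genotypes["s11"] = {4373}

groups = group_by_derived_state(genotypes, [4373, 10589])
nested = is_nested(groups)
```

Under these assumptions, a chain of derived-state sets corresponds to nested groups within one clade, whereas sets that do not contain one another (for example, one group derived at 1733 and another at 152, as described for the Iraqi Jewish J1 samples) would fail the check and point to separate sub-lineages.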
The founding lineages
Based on the coding region analysis, we extended the initial number of 49 putative lineages (Table S4) to 53 potentially founding lineages, because the coding region analysis suggested that four putative founding lineages (including J1b'e/J1e) each comprised 2 daughter sub-lineages. For 52 of the 53 lineages, confidence intervals covered 2,000 years as a potential coalescence age, and are therefore concordant with the founder event occurring during the last 2,000 years. The only lineage for which the confidence interval did not cover 2,000 years is the Iranian Jewish U7a1. However, two additional lineages, including L0a2c, gave a coalescence estimate larger than 4,000 years, with their confidence intervals greatly exceeding 2,000 years. Thus, we can state that in general, while almost the entire set of founder lineages identified herein is consistent with the constraint of a coalescence age within the last 2,000 years, it is likely that some may have started to expand earlier. It should be recalled that the imposed 2,000 year constraint adds a degree of stringency, and relaxing this to values greater than 2,000 years would be expected to result in the inclusion of more lineages within the time frame of expansion.
The Jewish community of the Caucasus, also known as Mountain Jews, is believed to have been established during the 8th century C.E. in the region corresponding to Dagestan and the current state of Azerbaijan as a result of a movement of Jews from Iran. Indeed, this community shows a striking maternal founding event, with 58.6% of their total mtDNA genetic variation tracing back to only one woman carrying an mtDNA lineage within Hg J2b. This lineage was chosen as one of our three exemplary case studies, presented below. The Georgian Jewish community, considered to have been established in the 6th century C.E., similarly shows a founding event, with 58.1% of its total mtDNA variation tracing back to one woman. This particular mtDNA lineage within Hg HV1 was chosen as an additional case study for further phylogeographic resolution.
Case study: the mtDNA founder of the Azerbaijani Jews
Case study: the mtDNA founder of the Georgian Jews
Multiple theories exist regarding the establishment of the Ethiopian Beta Israel community, of which the most widely cited posits a migration event of Hebrews to Ethiopia in biblical times. Because of the small sample size, even doubletons could meet the >5% threshold for inclusion as a putative founding lineage, and therefore the results should be interpreted with caution. Nevertheless, it was not possible to cumulatively account for 40% of the genetic mtDNA variation with the lineages ascertained, using the criteria applied in this study. The four most frequent lineages belonged to Hgs including R0a1b and M1a1c, all frequent in the region, suggesting East Africa and not the Levant as their likely geographic origin.
According to its oral history, the Indian Jewish community of Mumbai (known as B'nei Israel) descends from Jews who reached the shores of India in the 2nd century C.E. MtDNA analysis for this community shows a strong maternal founding event, with 41.2% of its total mtDNA genetic variation tracing back to one woman and 67.6% tracing back to four women. According to its own origin myth, the Indian Jewish community of Cochin emerged in the times of King Solomon; it has had no documented contact with the B'nei Israel of Mumbai. This community also shows a strong maternal founding event, with 44.4% of its total mtDNA genetic variation tracing back to two women. In both Indian Jewish communities, the mtDNA gene pool is dominated by Hg M sub-branches specific for the subcontinent, and therefore appears to be of clearly local origin. It is important to note that, in agreement with an oral tradition of two independent founding events for the respective communities, the prevailing sub-branches among B'nei Israel Hg M samples belong to Hgs M39a1 and M30c1a1, while the Cochin Hg M sub-branches belong to Hgs including M5a1.
The Jewish communities of Iraq and Iran constitute the oldest non-Ashkenazi Jewish communities outside the Levant and were established during the 6th century B.C.E. For the Iranian (Persian) Jewish community sample set, we found that 41.5% of the mtDNA variation can be attributed to 6 women carrying mtDNA genomes that belong to sub-branches of Hgs including H6a1b1 and J1b1, all known to be present in West Eurasia. In this regard, it is noteworthy that though Hg H is the dominant European mtDNA Hg (40-50%), its sub-Hgs H6 and H14 are largely restricted to the Near East and the South Caucasus. Similarly, we found that about 43% of the Iraqi Jewish community can be traced back to 5 women whose mtDNA belongs to Hgs including T2c1, all frequent in the Near and Middle East. Again, Hg H13 is typically the Near Eastern, not European, variant of Hg H. Consistent with our findings, an independent sample of Iraqi Jews reported in a previous study contained eleven out of 20 individuals who carry mtDNA variants that can be assigned to the five founding lineages identified in the current study.
The presence of Jews in North Africa spans from the Roman domination of this region, through the period of the Arab caliphate of Baghdad, and finally to the arrival of Jews exiled from Spain and Portugal at the end of the 15th century. The Libyan and Tunisian Jewish communities share, as their two most frequent mtDNA variants, lineages in Hgs X2e1a1a and H30. It is important to note that Hg H30 is split by the coding region information into 2 sub-lineages, one restricted to Libyan Jews and one primarily to Tunisian Jews. The maternal founding event in Libyan Jews is evident, as 39.8% of their mtDNAs could be related to one woman carrying the X2e1a1a lineage, supported by an earlier observation in which ten out of twenty Libyan Jews were found to share this haplotype. A well pronounced, though less narrow, founder event characterizes Tunisian Jewry, where 4 maternal lineages (Hgs X2e1a1a, H30, R0a1a and U4a1) contributed 43.2% of the entire mtDNA variation. The shared Libyan-Tunisian X2e1 was chosen as the third case study for further phylogeographic resolution. The Moroccan Jewish community, known to be the largest of the three, showed no evidence for a significant maternal founding event in the sense defined in the current study. The most frequent lineages each accounted for no more than 6.0% of the entire mtDNA genetic variation, and only 12.7% of overall mtDNA variation could be explained by lineages with frequencies greater than 5%.
Case study: The mtDNA founder of the Libyan Jews
It is worth stressing here that all predominant mtDNA lineages found among the North African Jewish communities belong to the general West Eurasian pool.
Jews expelled from Spain and Portugal joined existing communities elsewhere or established new communities in many locations. At some locations, such as Bulgaria and Turkey, the influx was large enough to consider the entire community as a representative subset of the parental Jewish population in Spain. Our data do not support a narrow founding event in the establishment of the Bulgarian or Turkish Jewish communities. Two of the most prevalent mtDNA lineages in Bulgarian Jews were identical to those found among Ashkenazi Jews. It is also worth noting that both communities had a high prevalence of Hg H mtDNA genomes; Hg H, while frequent in the Near East, still has a significantly higher prevalence in the Iberian Peninsula.
The Iberian Jewish community of Belmonte, Portugal, listed under the Iberian exile communities, comprises only 300-400 people. The community survived for hundreds of years by adhering to a crypto-Judaic lifestyle. It is impossible to ascertain that the sampling within this community avoided putative recent maternal introgression events. A total of 93.3% of the mtDNA genomes in the Belmonte samples could be attributed to one mother carrying an mtDNA lineage within Hg HV0b, thereby likely narrowing down the ancestry of these crypto-Jewish “communities” to one endogamously expanding family, at least on the maternal side.
The Yemenite Jewish community is thought to have been established in the second century C.E. Here we found that 42.0% of the mtDNA variation in this community can be attributed to 5 women carrying mtDNAs that belong to sub-branches of Hgs including R0a1c. While these Hgs, except L3x1a, can be considered a part of the general West Asian mtDNA genetic pool, they have higher frequencies in East Africa and Yemen.
To illustrate a straightforward approach for a deeper phylogeographic resolution of a founding lineage, we chose three case studies comprising the dominant mtDNA haplotypes among Azerbaijani, Georgian and Libyan-Tunisian Jews.
1. The Azerbaijani Jewish community is dominated by a J2b1 lineage. We screened a selection of West Eurasian and North African population samples (N = 6076) for Hg J2 genomes that contained the HVS-I motif 16069, 16126, 16193. Significantly, this large mtDNA selection contains a duplicate of the Azerbaijani Jewish community collected in Israel, since the same community was also sampled directly in its Caucasus Diaspora homeland. This community is known in the Caucasus as Jewish Tats. details the genotyping of 65 geographically and ethnically diverse J2b mtDNAs for private coding region mutations, identified by complete sequencing of the Azerbaijani Jewish sample. The positions examined included an indel at 5899 and transitions at 10223, 10914 and 15453. The reconstructed Azerbaijani Jewish J2b founder phylogeny is shown in . The Jewish Tat population showed findings identical to those observed in its Israeli sister community, as evidenced by the fact that the same J2b1 lineage was found in 12 out of the 23 Jewish Tat mtDNA genomes. The same mtDNA lineage was also found in one out of 111 Kumyk samples.
2. The Georgian Jewish mtDNA pool was dominated by the Hg HV1a1a1 lineage. We screened our population samples for Hg HV1 variants that contained the same HVS-I mutation 16355 observed in Georgian Jews. We then chose a selection of geographically widespread HV1 samples containing a transition at position 16355, together with a few samples that did not, and genotyped them for the three coding region mutations 4227, 4257 and 9554, identified in complete sequencing of the particular Georgian Jewish mtDNA founder lineage. details the genotyping information of the samples included, and shows phylogenetic reconstruction of the Georgian Jewish HV1 founder. A substitution at position 4257 was restricted to Georgian Jews, whilst all other mutations were shared by almost all other samples studied here carrying the transition at 16355. Interestingly, the substitution at 4227 was also missing in the Caucasus region, including Georgian HV1a1a samples, suggesting that this particular mutation might have arisen within the Georgian Jewish community.
3. The Libyan and Tunisian Jewish communities share an X2e1a1a lineage as their most frequent. We examined the two Libyan-Tunisian Jewish lineage-specific coding region mutations 9380 and 13789 in relevant samples at hand. Position 13789 appears uninformative, while 9380 was shared among Hg X samples from the Near East and Africa, but not from Europe, suggesting a Near Eastern/North African origin of the particular founder lineage.
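The first step common to the three case studies, screening a large set of population samples for those carrying a given HVS-I motif before genotyping coding region positions, can be sketched as follows. This is an illustrative Python snippet under stated assumptions: the function name `carries_motif`, the representation of a haplotype as a list of variant positions, and the sample IDs and variant lists are all hypothetical. Only the motif 16069, 16126, 16193 is taken from case study 1.

```python
def carries_motif(haplotype, motif):
    """True if every motif position appears among the sample's variant positions."""
    return set(motif) <= set(haplotype)


# HVS-I motif quoted in case study 1 for the Azerbaijani Jewish J2b1 lineage.
J2B_MOTIF = (16069, 16126, 16193)

# Hypothetical HVS-I variant lists keyed by placeholder sample IDs.
samples = {
    "p1": [16069, 16126, 16193, 16278],  # motif plus an extra variant
    "p2": [16069, 16126],                # lacks 16193
    "p3": [16126, 16193, 16069],         # motif, different order
}

# Samples retained for follow-up genotyping of coding region positions.
candidates = sorted(s for s, hap in samples.items() if carries_motif(hap, J2B_MOTIF))
```

A subset test is used so that samples carrying additional private variants on top of the motif are still retained; whether the original screen treated such samples the same way is not stated in the text.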