shows the geographical location of populations included in this study. J1* chromosomes have their maximal frequency in the Taurus and Zagros mountain regions of Eastern Anatolia, Northern Iraq and Western Iran (). It is noted that the J1* chromosomes frequently appear in combination with the 12 or 13 repeat pattern at DYS388, whereas the J1e chromosomes almost always display 15 or more repeats. Therefore, the J1e SNP information supports the previous inference that J1 chromosomes linked with DYS388=13 repeats share a common ancestry.1
Network analysis of J1* chromosomes () shows a bifurcating substructure. One cluster is associated with DYS388=15 and DYS390 >23 repeats and the other cluster with DYS388=13 repeats. J1* frequency is highest in the vicinity of eastern Anatolia (). Both J1* and J1e occur in Sudan and Ethiopia (Supplementary Table 1). Our data show that the YCAII 22-22 allele state is closely associated with J1e (Supplementary Table 2). Interestingly, in Ethiopia, all Cushitic Oromo and ~29% of Semitic Amharic J1 chromosomes are J1*.
Figure 1 (a) Red symbols indicate the geographical locations of 36 populations analyzed. (b) Interpolated spatial contours of annual precipitation (mm) distribution. (c) Interpolated J1* frequency spatial distribution. (d) Interpolated J1e frequency spatial (more ...)
Figure 2 (a) Median-joining network for J1* using the nine-locus Y-STR haplotypes. Networks were weighted according to Qamar et al.22 Loci analyzed included DYS19, DYS388, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393 and DYS439. (b) Median-joining (more ...)
shows the average variance and expansion times of J1e with their linguistic and archeological correlates for those populations with five or more samples; the Assyrians of Syria, Iraq, Turkey and Iran were amalgamated into one group, and the Arab populations of Qatar, UAE and Saudi Arabia were also combined. The mean variance across the 19 populations correlates significantly with latitude (r=0.36, P<0.035, two-tailed Kendall's τ) and nonsignificantly with longitude (r=0.02, NS). This result supports the hypothesis that J1e likely originated among the more northerly populations and spread southward into the Arabian Peninsula (). The high YSTR variance of J1e in Turks and Syrians (, ) supports the inference of an origin of J1e in nearby eastern Anatolia. Moreover, the network analysis of J1e haplotypes () shows that some of the populations with low diversity, such as Bedouins from Israel, Qatar, Sudan and UAE, are tightly clustered near high-frequency haplotypes, suggesting founder effects with starburst expansion in the Arabian Desert.
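The rank correlation between per-population variance and latitude can be illustrated with a minimal sketch of Kendall's τ (the tau-b variant, which adjusts for ties). This is not the authors' code; the function name and any input values are hypothetical, and no significance test is included.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b rank correlation for two equal-length sequences.

    Illustrative helper only (hypothetical name, not from the study).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: excluded from every term
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    if denom == 0:
        raise ValueError("correlation undefined: one variable is constant")
    return (concordant - discordant) / denom
```

Calling `kendall_tau_b(latitudes, variances)` with one latitude and one mean variance per population would give the kind of coefficient quoted above; a positive value means that variance tends to rise toward the north.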
J1e and J1* expansion time, mean YSTR variance based on DYS19, DYS390, DYS391, DYS392, DYS393, DYS389I, DYS389II and DYS439, linguistic and archeological correlates by population
The series of expansion times () is also consistent with a subsequent Neolithic range expansion of J1e from a geographical zone including northeast Syria, northern Iraq and eastern Turkey toward Mediterranean Anatolia, Ismaili from southern Syria, Jordan, Palestine and northern Egypt. Although there is a trend between the mean variances and the expansion time estimates, the latter do not uniformly increase with variance (), as some populations likely have more than one J1e founder. This explanation is supported by populations that carry two distinct varieties of YCAII chromosomes, namely 19-22 and 22-22, whereas populations with low mean diversity typically carry only the 22-22 class (Supplementary Table 2). A network analysis of J1e chromosomes () also reflects situations of multiple founders.
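To make the link between variance and time concrete, the sketch below computes a mean Y-STR variance across loci and converts it to a rough age. It assumes the simple single-founder stepwise-mutation relationship, in which repeat-count variance grows linearly with time; this is an illustrative assumption, not necessarily the estimator used in the study. The function names, the mutation rate and the generation length are hypothetical parameters supplied by the caller.

```python
from statistics import variance


def mean_str_variance(haplotypes):
    """Average the per-locus sample variance of repeat counts.

    `haplotypes` is a list of equal-length tuples, one repeat count per locus.
    """
    if len(haplotypes) < 2:
        raise ValueError("need at least two haplotypes")
    loci = list(zip(*haplotypes))  # transpose: one tuple of counts per locus
    return sum(variance(counts) for counts in loci) / len(loci)


def expansion_time_years(mean_variance, mutation_rate, generation_years):
    """Rough age under the assumption variance = mutation_rate * generations.

    Assumes a single founder; multiple founders inflate the variance and
    therefore the apparent age.
    """
    if mutation_rate <= 0:
        raise ValueError("mutation_rate must be positive")
    return mean_variance / mutation_rate * generation_years
```

Under this assumption a population seeded by two divergent founder haplotypes shows a higher variance than its true expansion age implies, which is one way variance and expansion time estimates can fall out of step.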
Although the haplogroup diversification within J1e remains incomplete, the somewhat rare J1e1-M368 provides an insight into the geographical origin of J1e. It has been reported both in the Black Sea region of Turkey1 and Dagestan in the northeast Caucasus.18 Furthermore, J1e1-M368 displays the YCAII 19-22 pattern. Although the haplogroup relationships of YCAII alleles are unstable, in the context of haplogroup J1 they suggest that the prevalent YCAII 22-22 variety may have evolved from a YCAII 19-22 ancestor.
lists the current languages and the first millennium BCE Iron Age languages spoken in the geographical regions from which the samples were collected. Tracking back to the Iron Age, all the branches of the Central Semitic languages (NW Semitic, Arabic and Old South Arabian) are represented in the Levantine and Yemeni sampling regions. The Assyrian samples and Iraqi Kurdish samples have been drawn from areas in Northern Mesopotamia where East Semitic languages were spoken at the time. The current data suggest an origin of J1e in the general area of eastern Turkey/northern Iraq associated with the Zarzian horizon,23, 24, 25 as they have similar early pre-agricultural expansions (16 kya, ).
The timing and geographical distribution of J1e is representative of a demic expansion of agriculturalists and herder–hunters from the Pre-Pottery Neolithic B to the late Neolithic era.24, 26
The higher variances observed in Oman, Yemen and Ethiopia suggest either sampling variability and/or demographic complexity associated with multiple founders and multiple migrations. The expansion time associated with Yemen is somewhat older (7000 BCE) and may reflect a migration of herders into southern Arabia.27
Finally, the more recent expansion times () observed in Arabs from the Arabian Peninsula, Negev Bedouins and Sunni Arabs from Hama, Syria, are consistent with a subsequent Chalcolithic/Early Bronze Age (3000–5000 BCE) advance of J1e to the Arab populations of Arabia from near the early attested Arabian-speaking area of Tayma in north central Arabia.28, 29
A comparison of the mean annual rainfall and spatial frequency distribution of J1e ( respectively) indicates J1e peaks in the arid regions of the Arabian Peninsula. We performed a nonparametric Mann–Whitney test to address the hypothesis: is the frequency of J1e higher in arid regions (≤300 mm) compared with regions with more rainfall in our sample set of African and Near Eastern populations? We found that the frequency of J1e was significantly greater in the arid than in the non-arid populations (P=0.0035). By combining all the arid populations (Supplementary Table 1) into one sample (n=16), we circumvented the details of the geographic frequency distribution, such that the J1e frequency pattern was examined primarily with regard to precipitation rather than geography, although the two are correlated.
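The comparison described here can be sketched as follows: populations are split at the 300 mm rainfall threshold and their J1e frequencies compared with a Mann–Whitney U statistic. This is a minimal illustration rather than the authors' procedure. The function names and record layout are hypothetical, and the P value uses a two-sided normal approximation without tie or continuity correction, which is an assumption and may differ from the test variant used in the study.

```python
from math import erfc, sqrt

ARID_MAX_MM = 300  # rainfall threshold stated in the text


def split_by_rainfall(records, threshold_mm=ARID_MAX_MM):
    """Split (annual_rainfall_mm, j1e_frequency) pairs into arid / non-arid."""
    arid = [freq for rain, freq in records if rain <= threshold_mm]
    non_arid = [freq for rain, freq in records if rain > threshold_mm]
    return arid, non_arid


def mann_whitney_u(a, b):
    """Return (U for sample a, two-sided P from a normal approximation)."""
    if not a or not b:
        raise ValueError("both samples must be non-empty")
    # U counts pairs in which a's value exceeds b's; ties count one half.
    u = sum(1.0 if x > y else 0.5 if x == y else 0.0 for x in a for y in b)
    n1, n2 = len(a), len(b)
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    return u, erfc(abs(z) / sqrt(2))
```

Because the test uses only the ranks of the frequencies, it makes no assumption that J1e frequencies are normally distributed, which suits a small set of populations with highly skewed values.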
Although most post-Last Glacial Maximum recolonization events have a typically northward signature,30, 31 our J1e results provide an example of a southward spread during the early Holocene. Although J1e is one of the most frequent haplogroups in the region, haplogroup E-M123 also shows its highest frequency and haplotype diversity in regions of the Fertile Crescent, decreasing toward the Arabian Peninsula.1, 2, 6
This co-distribution pattern of Y-chromosome haplogroups J1e and E-M123 resembles the distributions of mtDNA haplogroups J1b and (PreHV)1, which also display low levels of diversity despite their high frequency in Saudi Arabia.32, 33
Although on a broad scale the haplogroup J1e frequency distribution and expansion times are consistent with the model that it tracks a possible expansion of Neolithic agro-pastoralists from the Fertile Crescent into the arid Arabian Peninsula, several caveats must be considered. First, the patchy distribution of J1e frequency in the Levant (Syria, Jordan, Israel and Palestine) may reflect the complex demographic dynamics of religion and ethnicity in the region. Second, even though the highest YSTR variance of J1e lineages is in eastern Anatolia, northern Iraq and northwest Iran, one cannot entirely rule out recent admixture as a contribution to the high variance among ethnic Assyrians.
A recent Bayesian analysis of Semitic languages supports an origin in the Levant 5750 years ago and subsequent arrival in the Horn of Africa from Arabia 2800 years ago,11 thus providing indirect support for our phylogenetic clock estimates. It is important to note that the glottochronological dates estimate the break-up and expansion of the Proto-Semitic language. Proto-Semitic itself may have been spoken in a localized linguistic community for millennia before its bifurcation into the East and West Semitic branches. In summary, haplogroup J1e data suggest an advance of the Neolithic period agriculturalists/pastoralists into the arid regions of Arabia from the Fertile Crescent and support an association with a Semitic linguistic common denominator.14