We typed eight Y-chromosomal microsatellites and 16 binary markers in 246 Muslim men from Andhra Pradesh (south India); these defined 124 different haplotypes and five haplogroups plus four paragroups, respectively (Supplementary Table 1). We then compared our data (excluding DYS389) to published data from 4,204 males (Muslims and non-Muslims) from other parts of India, China, Central Asia, Sri Lanka, Pakistan, Iran, the Middle East, Turkey, Egypt, and Morocco.
For this worldwide comparison, Rst genetic distances were calculated between all pairs of populations, and the pairwise values were used to perform a multidimensional scaling (MDS) analysis. The resulting plots (Figure 2) showed considerable structure. Although a continuum of variation is seen, rather than discrete groups, populations from a particular region or country tend to cluster together; this is in agreement with the expectation that human genetic structure is predominantly geographical and clinal. Thus, for example, most Chinese populations are seen in the left-hand part of each plot rather than dispersed throughout the plot. Interestingly, however, the three Chinese Muslim populations do not lie in this cluster, but are located more towards the centre of each plot, close to populations with geographical origins lying further west. It has previously been reported that the conversion to Islam in China involved the movement of people and, in particular, an influx of genes from the Middle East into China (Wang et al. 2003). Our observation agrees with that report and thus confirms that our analysis readily detects such events. The Indian populations show considerable diversity, and northern and southern populations barely overlap (with the exception of Dravidians and Chenchu) and tend to lie in two distinct clusters. Most importantly, the Muslim and non-Muslim populations are intermingled in these clusters: the Y-chromosomal heritage in India is influenced more by geographical location than by religious practice. It appears that the Muslim genetic contribution in India was less important than in other places such as China.
Figure 2 Multidimensional scaling presentation of population pairwise values of Rst and Φst based on Y-microsatellite haplotypes (A) and Y-biallelic markers (B). Symbol shapes indicate religion (squares for Muslims and circles for non-Muslims). RSQ value ...
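The step from a matrix of pairwise genetic distances to a two-dimensional plot can be illustrated with a minimal sketch. It assumes classical (Torgerson) MDS, which is not necessarily the MDS variant used for Figure 2, and it takes the pairwise distances as already computed. The function name classical_mds, the population labels and the toy distance values are hypothetical and are not data from this study.

```python
import math

def classical_mds(dist, dims=2, iters=500):
    """Embed a symmetric distance matrix in `dims` dimensions (classical MDS)."""
    n = len(dist)
    # Double-centre the squared distances: B = -1/2 * J * D^2 * J.
    sq = [[dist[i][j] ** 2 for j in range(n)] for i in range(n)]
    row_mean = [sum(r) / n for r in sq]
    grand_mean = sum(row_mean) / n
    b = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + grand_mean)
          for j in range(n)] for i in range(n)]

    coords = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        # Power iteration converges to the leading eigenvector of the current B.
        v = [math.sin(i + 1.0) for i in range(n)]  # fixed, non-constant start
        for _ in range(iters):
            w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:
                break
            v = [x / norm for x in w]
        bv = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = sum(v[i] * bv[i] for i in range(n))  # Rayleigh quotient
        scale = math.sqrt(max(lam, 0.0))  # negative eigenvalues are ignored
        for i in range(n):
            coords[i][d] = scale * v[i]
        # Deflate B so the next pass recovers the next eigenvector.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Hypothetical toy input: four populations with made-up pairwise distances.
labels = ["pop_A", "pop_B", "pop_C", "pop_D"]
toy = [[0.00, 0.05, 0.20, 0.22],
       [0.05, 0.00, 0.18, 0.21],
       [0.20, 0.18, 0.00, 0.04],
       [0.22, 0.21, 0.04, 0.00]]
for name, (x, y) in zip(labels, classical_mds(toy)):
    print(name, round(x, 3), round(y, 3))
```

Populations with small pairwise distances end up close together in the plot, which is the property the clustering interpretation relies on.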
To assess the significance of this observation, we next combined the populations from India into two classes, Muslims and non-Muslims, and calculated pairwise genetic distances (1) among Muslims, (2) among non-Muslims, and (3) between Muslims and non-Muslims, and compared their average values after following a jackknife approach within each group. The comparison between Muslims and non-Muslims in India showed the lowest distance (Figure 3). We then restricted the comparisons to Muslim and non-Muslim populations who live in neighboring regions of south India (again under a jackknife approach). We found that this comparison resulted in the lowest average value of genetic distances (Figure 3), which suggests that the close geographic proximity of Muslims and non-Muslims in south India might have facilitated gene flow between those two groups.
Our hypothesis that geography plays a more important role than religion in structuring Y-chromosomal diversity in India was then assessed by means of a Mantel test. This test asks whether genetic distances are correlated with geographic distances (or with religious distances). Genetic distances were based on Rst or Φst, geographic distances were calculated using the approximate latitude and longitude of the sample sites, and religious distances were defined as 0 or 1 according to whether or not the populations belonged to the same religious group. Figure 4 shows that when 19 Indian populations are considered, there is a correlation between genetic distances and geographic distances (r1=0.43, p<0.001 for microsatellites; r2=0.24, p<0.01, for biallelic markers) but not between genetics and religion (r1=0.10, p>0.05 for microsatellites; r2=0.08, p>0.05, for biallelic markers).
The correlation with geography is even stronger when the test is performed in populations from north India versus south India only (r1=0.63, p<0.001, for microsatellites; r2=0.50, p<0.01, for biallelic markers), whereas there is again no correlation with religion (r1=0.03, p>0.05, for microsatellites; r2=0.03, p>0.05, for biallelic markers). The positive correlation between Y diversity and geography remains when the same test is performed among populations from south India only, despite the shorter geographic distances between them (r1=0.16, p<0.05, for microsatellite data only). Thus these results indicate that in India, the processes that cause a positive correlation between the pattern of Y variation and geography are not disrupted by religious affiliation. It is also worth noting that stronger support is obtained with Y-chromosomal microsatellites than with biallelic markers, which may reflect the less biased measure of diversity provided by microsatellites.
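The logic of the Mantel test described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure or software used in this study: geographic distance is taken as the great-circle (haversine) distance, the statistic is the Pearson correlation between the upper triangles of two distance matrices, and the one-sided p-value comes from random permutation of population labels. All function names (haversine_km, religion_distance, mantel) and the default of 999 permutations are hypothetical choices.

```python
import math
import random

def haversine_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Great-circle distance between two points given in decimal degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi, dlam = p2 - p1, math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))

def religion_distance(labels):
    """0 if two populations share a religious group, 1 otherwise."""
    return [[0 if a == b else 1 for b in labels] for a in labels]

def upper_triangle(m):
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def mantel(genetic, other, permutations=999, seed=0):
    """Return (r, one-sided p) for the correlation between two distance matrices."""
    rng = random.Random(seed)
    n = len(genetic)
    gen = upper_triangle(genetic)
    r_obs = pearson(gen, upper_triangle(other))
    hits = 1  # the observed arrangement counts as one permutation
    idx = list(range(n))
    for _ in range(permutations):
        rng.shuffle(idx)
        # Permute rows and columns together, i.e. relabel the populations.
        perm = [[other[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(gen, upper_triangle(perm)) >= r_obs - 1e-12:
            hits += 1
    return r_obs, hits / (permutations + 1)
```

Calling mantel once with a geographic matrix and once with the 0/1 matrix from religion_distance gives the two kinds of correlation compared in the text. Permuting whole rows and columns, rather than individual cells, preserves the dependence among distances that involve the same population.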
Figure 3 The averaged genetic distances (Rst) after following a jackknife approach between groups of populations based on 6 Y-microsatellites among Muslims (pattern filled), among non-Muslims (white), between Muslims and non-Muslims (black) in India and between ...
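The jackknife averaging of genetic distances can be illustrated with a short sketch. The text does not specify which units were left out, so this example assumes a leave-one-population-out scheme: each population is dropped in turn, the mean pairwise distance is recomputed, and the estimates are summarised by their mean and jackknife standard error. The function names mean_distance and jackknife are hypothetical, and populations are referred to by their index in the distance matrix.

```python
import math

def mean_distance(dist, group_a, group_b=None):
    """Mean pairwise distance within group_a, or between group_a and group_b."""
    if group_b is None:
        pairs = [(i, j) for i in group_a for j in group_a if i < j]
    else:
        pairs = [(i, j) for i in group_a for j in group_b]
    return sum(dist[i][j] for i, j in pairs) / len(pairs)

def jackknife(dist, group_a, group_b=None):
    """Leave-one-population-out mean and standard error of the mean distance.

    Assumes at least three populations for a within-group comparison and at
    least two per group for a between-group comparison.
    """
    estimates = []
    for left_out in list(group_a) + list(group_b or []):
        a = [p for p in group_a if p != left_out]
        b = None if group_b is None else [p for p in group_b if p != left_out]
        estimates.append(mean_distance(dist, a, b))
    n = len(estimates)
    mean = sum(estimates) / n
    se = math.sqrt((n - 1) / n * sum((e - mean) ** 2 for e in estimates))
    return mean, se
```

With Muslim populations as group_a and non-Muslim populations as group_b, the three comparisons in the text correspond to jackknife(dist, muslims), jackknife(dist, non_muslims) and jackknife(dist, muslims, non_muslims).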
Figure 4 Correlation coefficient between genetics and geography (white) and genetics and religion (black) in populations from south, north and the remaining regions of India (1), in populations from south and north India (2), and in populations from south India ...
We then performed an analysis of molecular variance (AMOVA) using both microsatellite and biallelic markers in 19 populations from north India, south India and the remaining regions of India, and in 14 populations from north India and south India only. As expected, the highest fraction of variation was within populations when no grouping was defined. We then pooled the populations into two groups according to religion (Muslim or non-Muslim) or geography. With the first grouping, the amount of variation among populations from the same group was always higher than the among-group variation. However, when we grouped the populations according to the geographic regions in India, the fraction of variation among groups was significantly higher than the among-population within-group variation, and ranged from 8.2 to 12.8% depending on the regions and markers considered. These results confirm that the large differences lie between populations that live in south India and those that live in north India, rather than between Muslims and non-Muslims. We also performed AMOVA in nine populations from south India only and pooled them in two groups according to their religion, i.e. Muslims and non-Muslims. The among-population variation in south India only was 5.4%, lower than the value of 9.2% obtained when considering Muslims and non-Muslims from larger geographic regions of India. Overall, the AMOVA analyses emphasize the importance of geography in shaping the Y diversity in India and give further support to our hypothesis of no major contribution of Muslim Y chromosomes into the Hindu paternal gene pool during the Islamization of India.
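The variance partition behind AMOVA can be sketched in its simplest form. The analysis above uses a three-level design (among groups, among populations within groups, within populations); the example below is deliberately reduced to two levels (among populations and within populations) to show how the percentages arise. It assumes haplotypes are tuples of microsatellite repeat counts and uses the sum of squared repeat differences as the squared distance. The names sq_diff, ssd and amova_two_level are hypothetical.

```python
from itertools import combinations

def sq_diff(h1, h2):
    """Sum of squared repeat-count differences between two Y-STR haplotypes."""
    return sum((a - b) ** 2 for a, b in zip(h1, h2))

def ssd(haplotypes, dist2):
    """Sum of squared deviations: pairwise squared distances divided by n."""
    n = len(haplotypes)
    return sum(dist2(x, y) for x, y in combinations(haplotypes, 2)) / n

def amova_two_level(populations, dist2=sq_diff):
    """Partition variation into among-population and within-population parts.

    `populations` is a list of populations, each a list of haplotypes.
    Assumes at least two populations and some variation in the data.
    """
    sizes = [len(p) for p in populations]
    n_total, n_pops = sum(sizes), len(populations)
    pooled = [h for p in populations for h in p]

    ssd_within = sum(ssd(p, dist2) for p in populations)
    ssd_among = ssd(pooled, dist2) - ssd_within
    ms_among = ssd_among / (n_pops - 1)
    ms_within = ssd_within / (n_total - n_pops)

    # Average sample size corrected for unequal population sizes.
    n0 = (n_total - sum(s * s for s in sizes) / n_total) / (n_pops - 1)
    var_among = (ms_among - ms_within) / n0
    var_within = ms_within
    total = var_among + var_within
    return {
        "phi_st": var_among / total,
        "pct_among": 100.0 * var_among / total,
        "pct_within": 100.0 * var_within / total,
    }
```

For biallelic haplogroups, a squared distance of 0 for identical and 1 for different haplogroups can be passed as dist2, for example dist2=lambda a, b: 0 if a == b else 1. Negative variance components can occur when populations are not differentiated and are usually interpreted as zero.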
Finally, we assessed the evidence of gene flow among the different south Indian Muslim isolates and between Muslims and Hindus in south India. For each of these two sets of population pairs, we calculated both the proportion of shared haplotypes and the rho distance (the average number of mutations between a haplotype in one population and its closest counterpart in the second population) (Helgason et al. 2000). These measures are more sensitive to low levels of gene flow, but were not significantly different between Muslims and Hindus, confirming the lack of genetic differentiation according to religious affinity in India.
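The two measures can be illustrated with a minimal sketch that follows the verbal definitions above. It assumes that haplotypes are tuples of repeat counts, that the number of mutations between two haplotypes is the summed absolute difference in repeat counts (a stepwise-mutation assumption), and that both measures are computed directionally from one population to the other. The function names are hypothetical, and the exact formulation in Helgason et al. (2000) may differ in detail.

```python
def mutation_steps(h1, h2):
    """Number of single-repeat steps separating two Y-STR haplotypes."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def shared_fraction(pop_a, pop_b):
    """Fraction of chromosomes in pop_a whose haplotype also occurs in pop_b."""
    in_b = set(pop_b)
    return sum(1 for h in pop_a if h in in_b) / len(pop_a)

def rho(pop_a, pop_b):
    """Mean distance from each haplotype in pop_a to its closest match in pop_b."""
    return sum(min(mutation_steps(h, g) for g in pop_b) for h in pop_a) / len(pop_a)
```

Because each haplotype is matched to its closest counterpart, rho(pop_a, pop_b) and rho(pop_b, pop_a) generally differ, so both directions are worth reporting for a pair of populations.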
Comparison of Y-STR Lineages in South India
Although marriage between Muslim men and Hindu women was important for the spread of Islam in India, it has not been sufficient to replace the Hindu Y-chromosomal heritage built up in prehistoric times. This is in contrast with observations in Muslim groups from other places such as China and Central Asia, where there has been more marked movement of Muslim Y chromosomes into the area. Our conclusion does assume that the Muslim population entering India would have been genetically distinct from the indigenous populations, which seems likely in view of their distinct geographical origin. Moreover, our results are in accordance with previous work on the sharing of Y-chromosomes among different religious communities that live side by side, namely Jewish groups and their non-Jewish neighbors in the Near East (Hammer et al. 2000; Nebel et al. 2000; Thomas et al. 2002).
At least at the Y-chromosomal level, the origin of Muslim isolates in south India is predominantly from local populations rather than from Muslims in other parts of India or outside the country. Some Indian Muslim families can trace their ancestry back to sources outside India >1,000 years ago, and our findings do not conflict with this fact, but do show that the largest minority religious group in India arose mainly from a cultural change among Hindus who started to follow and spread the precepts of Islam. The Y-chromosomal variation among Indian populations reflects geographical and prehistoric factors rather than the practices of Hinduism or Islam.