Arab forces conquered the Indus Delta region in 711 A.D. and, although a Muslim state was established there, their influence was barely felt in the rest of South Asia at that time. By the end of the tenth century, Central Asian Muslims moved into India from the northwest and expanded throughout the subcontinent. Muslims now form the largest religious minority in India, comprising more than 138 million people in a predominantly Hindu population of over one billion. It is unclear whether the Muslim expansion in India was a purely cultural phenomenon or had a genetic impact on the local population. To address this question from a male perspective, we typed eight microsatellite loci and 16 binary markers from the Y chromosome in 246 Muslims from Andhra Pradesh, and compared them to published data on 4,204 males from China, Central Asia, other parts of India, Sri Lanka, Pakistan, Iran, the Middle East, Turkey, Egypt and Morocco. We find that the Muslim populations in general are genetically closer to their non-Muslim geographical neighbors than to other Muslims in India, and that there is a highly significant correlation between genetics and geography (but not religion). Our findings indicate that, despite the documented practice of marriage between Muslim men and Hindu women, Islamization in India did not involve large-scale replacement of Hindu Y chromosomes. The Muslim expansion in India was predominantly a cultural change and, unlike the expansions in other places such as China and Central Asia, was not accompanied by significant gene flow.
Islam is India’s largest minority religion, with Muslims officially comprising ~13 % of the population, or 138 million people (Census 2001). The history of Islam in India began in the year 711 A.D., when it was introduced into Sind by the Arabs (Titus 2005). Soon after, however, Sind was abandoned and for the next two and a half centuries there was little Muslim presence in India. Then, in 1001, the Turks entered India from Afghanistan and started spreading Islam from west to east (Titus 2005). By the beginning of the fourteenth century the Deccan in south India had been invaded, and soon after that the Muslim empire and influence attained its greatest extent and importance in the history of India, remaining dominant up to 1707 A.D. (Titus 2005).
The Muslim conquest of India was undertaken with the purpose of establishing a Muslim government over the people and implementing the Muslim faith. This was accomplished by foreign conquerors, traders, religious devotees and preachers using a wide range of methods, including war, enslavement and conversion (voluntary or compulsory), and through marriage between Muslims and Hindus (Lal 1993; Titus 2005). Such mixed marriages appear to have been part of the policy of absorption and domination by which it was hoped Hinduism would be overthrown (Titus 2005). For that reason, the practice became well established and the resulting progeny contributed extensively to the increase in the Muslim populations in India (Lal 1993; Titus 2005).
The biological contribution to India accompanying these historical events has not been thoroughly investigated and the extensive studies of Indian genetic pre-history (reviewed by McElreavey and Quintana-Murci 2005) have focused on the origin of caste and tribal populations, the birthplace of the Dravidian languages, and the contribution of genes from the Indo-European speakers during their movement out of Central Asia (e.g. Sahoo et al. 2006; Sengupta et al. 2006). The few studies examining the origins of Muslims in India have provided conflicting results. Classical marker studies, for example, have shown that Muslims and Hindus in north and northwestern India are different from each other (Aarzoo and Afzal 2005; Balgir 2003; Balgir and Sharma 1988), whereas a study of the Y chromosome revealed close affinity between Muslims and Indo-European upper-caste groups (Basu et al. 2003). Since the expansion of Islam in other places, such as China and Central Asian countries, involved the movement of people and Y chromosomes (Wang et al. 2003; Zerjal et al. 2002) and left a detectable genetic signature in the current populations, a similar genetic impact from the Middle East on the Hindu gene pool seems plausible, but needs further investigation.
We therefore set out to clarify this aspect of the history of India by studying 24 Y-chromosomal markers in 246 Muslims from south India, and comparing our results to published data on 4,204 Muslim and non-Muslim males from several other countries. By investigating a large set of Indian Muslims and performing a comprehensive analysis of the data, we show that in India the spread of Islam did not have a detectable genetic impact on the local populations and thus differed from its expansion in neighboring countries. In India, the spread of Islam was predominantly a cultural event.
The sample consisted of 246 unrelated males from five different populations from Andhra Pradesh, South India: Yamani, Pathans and Bohra Muslim groups, and two other Sunni and Shia groups here referred to as “Sunni” and “Shia”, respectively. Blood samples were collected with informed consent and DNA was extracted following standard procedures. Eight microsatellite loci (DYS19, DYS388, DYS389I, DYS389II, DYS390, DYS391, DYS392 and DYS393) and 16 biallelic markers (YAP, M9, M89, M52, M45, M173, M172, M17, M11, M15, M40, M70, M147, M95, M103 and M88) were typed as previously described (Ramana et al. 2001) and used to assign haplotypes and Y haplogroups, respectively. In addition, relevant Y-chromosomal data from literature sources were collated and analyzed. In compiling these data, we were unable to reconcile all DYS389 repeat counts from different sources satisfactorily, and so excluded this locus from our analyses. Data from 4,204 males (non-Muslims and Muslims) from other parts of India, China, Central Asia, Sri Lanka, Pakistan, Iran, the Middle East, Turkey, Egypt, and Morocco were included (Fig. 1, Table 1).
Both haplotype and haplogroup frequencies were determined, and combined with their molecular information to compute genetic distances between all the populations depicted in Fig. 1. Pairwise distances based on microsatellite markers (Rst) and on biallelic markers (Φst) were obtained with Arlequin 2.0 (Schneider et al. 2000). The resulting matrices of pairwise distances were then used to perform multidimensional scaling (MDS) analysis with the SPSS 13.0 software package. Negative genetic distances were assigned a value of zero; when we alternatively increased all distances to eliminate the negative values, or used additional software tools (Statistica 6), the results were very similar (not shown). For the Indian samples only, we combined the populations into classes and computed average Rst values 1) among Muslims, 2) among non-Muslims, and 3) between Muslims and non-Muslims using a jackknife approach within each group. Mantel tests to assess the significance of correlations between genetics and religion or geography were carried out in populations from India using Arlequin. Analysis of molecular variance (AMOVA) was also performed with Arlequin using microsatellite and biallelic data in Indian populations, which were either grouped according to religion (Muslims and non-Muslims) or geographical regions, or not grouped at all. The possibility of gene flow among the different Muslim isolates, and between Muslims and Hindus in south India, was investigated by estimating the proportion of lineage sharing and the rho genetic distance (Helgason et al. 2000).
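The class-averaged distances described above can be illustrated with a minimal Python sketch. The exact jackknife scheme is not specified in the text, so the leave-one-population-out procedure used here is an assumption, as are the function names `mean_distance` and `jackknife_mean_distance` and the list-of-lists matrix layout. Negative distances are set to zero, as stated in the methods.

```python
import math


def mean_distance(dist, labels, a, b, skip=None):
    """Mean pairwise distance between populations labelled a and b.

    dist is a symmetric square matrix (list of lists); negative values are
    treated as zero. With a == b this is the within-class average.
    """
    values = []
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if skip in (i, j):
                continue  # population left out in this jackknife replicate
            if {labels[i], labels[j]} == {a, b}:
                values.append(max(dist[i][j], 0.0))
    return sum(values) / len(values)


def jackknife_mean_distance(dist, labels, a, b):
    """Leave-one-population-out mean and standard error (assumed scheme)."""
    members = [i for i, lab in enumerate(labels) if lab in (a, b)]
    estimates = [mean_distance(dist, labels, a, b, skip=i) for i in members]
    n = len(estimates)
    centre = sum(estimates) / n
    variance = (n - 1) / n * sum((e - centre) ** 2 for e in estimates)
    return centre, math.sqrt(variance)
```

Calling `jackknife_mean_distance(dist, labels, "M", "N")` would give the between-class average, and passing the same label twice the within-class average. Each class needs enough populations that at least one pair remains after a population is left out.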
We typed eight Y-chromosomal microsatellites and 16 binary markers in 246 Muslim men from Andhra Pradesh (south India), and defined 124 different haplotypes from the microsatellites, and five haplogroups and four paragroups from the binary markers (Supplementary Table 1). We then compared our data (excluding DYS389) to published data from 4,204 males (Muslims and non-Muslims) from other parts of India, China, Central Asia, Sri Lanka, Pakistan, Iran, the Middle East, Turkey, Egypt, and Morocco (Table 1, Fig. 1).
For this worldwide comparison, Rst and Φst genetic distances were calculated between all the populations and their pairwise values were used to perform an MDS analysis. The resulting plots (Fig. 2) showed considerable structure. Although a continuum of variation is seen, rather than discrete groups, populations from a particular region or country tend to cluster together; this is in agreement with the expectation that human genetic structure is predominantly geographical and clinal. Thus, for example, most Chinese populations are seen in the left-hand part of each plot rather than dispersed throughout the plot. Interestingly, however, the three Chinese Muslim populations do not lie in this cluster, but are located more towards the centre of each plot, close to populations with geographical origins lying further west. It has previously been reported that the conversion to Islam in China involved the movement of people and, in particular, the influx of genes from the Middle East into China (Wang et al. 2003). Figure 2 thus confirms that our analysis readily detects such events. The Indian populations show considerable diversity, and northern and southern populations barely overlap (with the exception of Dravidians and Chenchu) and tend to lie in two distinct clusters (Fig. 2). Most importantly, the Muslim and non-Muslim populations are intermingled in these clusters: the Y-chromosomal heritage in India is influenced more by geographical location than by religious practice. It appears that the Muslim genetic contribution in India was less important than in other places such as China.
In order to assess the significance of this observation, we next combined the populations from India into two classes, Muslims and non-Muslims, and calculated pairwise genetic distances 1) among Muslims, 2) among non-Muslims, and 3) between Muslims and non-Muslims, and compared their average values after following a jackknife approach within each group. The comparison between Muslims and non-Muslims in India showed the lowest distance (Fig. 3). We then restricted the comparisons to Muslim and non-Muslim populations who live in neighboring regions of south India (again under a jackknife approach). We found that this comparison resulted in the lowest average value of genetic distances (Fig. 3), which suggests that the close geographic proximity of Muslims and non-Muslims in south India might have facilitated gene flow between those two groups. Our hypothesis of geography playing a more important role than religion in structuring Y-chromosomal diversity in India was then assessed by means of a Mantel test. This test asks whether there is a correlation between geographic distances (or religion) and genetic distances. Genetic distances were based on Rst or Φst, geographic distances were calculated using the approximate latitude and longitude of the sample sites, and religious distances were defined as 0 or 1 according to whether or not the populations belonged to the same religious group. Figure 4 shows that when 19 Indian populations are considered (Table 1), there is a correlation between genetic distances and geographic distances (r1=0.43, p<0.001 for microsatellites; r2=0.24, p<0.01, for biallelic markers) but not between genetics and religion (r1=0.10, p>0.05 for microsatellites; r2=0.08, p>0.05, for biallelic markers). 
The correlation with geography is even stronger when the test is restricted to populations from north India and south India (r1=0.63, p<0.001, for microsatellites; r2=0.50, p<0.01, for biallelic markers), whereas the correlation with religion remains non-significant (r1=0.03, p>0.05, for microsatellites; r2=0.03, p>0.05, for biallelic markers). The positive correlation between Y diversity and geography remains when the same test is performed among populations from south India only, despite the shorter geographic distances between them (r1=0.16, p<0.05, for microsatellite data only) (Fig. 4). These results indicate that in India, the processes that cause a positive correlation between the pattern of Y variation and geography are not disrupted by religious affiliation. It is also worth noting that stronger support is obtained with Y-chromosomal microsatellites than with biallelic markers, which may reflect the less biased measure of diversity provided by microsatellites.
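The logic of the Mantel test can be sketched in a few lines of Python. This is an illustrative permutation test under stated assumptions, not a reproduction of Arlequin's implementation: the one-sided p-value, the default of 999 permutations, the great-circle (haversine) formula for geographic distance, and all function names are choices made for this example. The religion matrix follows the 0/1 definition given in the text.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def geographic_matrix(sites):
    """Pairwise distances for a list of (latitude, longitude) sample sites."""
    return [[haversine_km(*s, *t) for t in sites] for s in sites]


def religion_matrix(labels):
    """0 if two populations share a religious group, 1 otherwise."""
    return [[0 if x == y else 1 for y in labels] for x in labels]


def upper(m):
    """Flatten the upper triangle (excluding the diagonal) of a square matrix."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(a, b, permutations=999, seed=0):
    """Correlation between two distance matrices with a permutation p-value."""
    n = len(a)
    x = upper(a)
    r_obs = pearson(x, upper(b))
    rng = random.Random(seed)
    idx = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(idx)  # relabel populations in one matrix only
        permuted = [[b[idx[i]][idx[j]] for j in range(n)] for i in range(n)]
        if pearson(x, upper(permuted)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)
```

Here `mantel(genetic, geographic_matrix(sites))` would test the correlation with geography and `mantel(genetic, religion_matrix(labels))` the correlation with religion, where `genetic` stands for a matrix of Rst or Φst values. Rows and columns are permuted together because the entries of a distance matrix are not independent observations.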
We then performed an analysis of molecular variance (AMOVA) using both microsatellite and biallelic markers in 19 populations from north India, south India and from the remaining regions of India (Table 2A); and in 14 populations from north India and south India only (Table 2B). As expected, the highest fraction of variation was within populations when no grouping was defined (Table 2). We then pooled the populations into two groups according to religion (Muslim or non-Muslim) or geography. With the first grouping, the amount of variation among populations from the same group was always higher than the among-group variation (Table 2). However, when we grouped the populations according to the geographic regions in India, the fraction of variation among groups was significantly higher than the among-population within-group variation, and ranged from 8.2 to 12.8% depending on the regions and markers considered (Table 2). These results confirm the large differences between populations that live in south India and those that live in north India, rather than between Muslims and non-Muslims. We also performed AMOVA in nine populations from south India only and pooled them in two groups according to their religion, i.e. Muslims and non-Muslims. The among-population variation in south India only was 5.4%, lower than the value of 9.2% obtained when considering Muslims and non-Muslims from larger geographic regions of India (Table 2). Overall, the AMOVA analyses emphasize the importance of geography in shaping the Y diversity in India and give further support to our hypothesis of no major contribution of Muslim Y chromosomes into the Hindu paternal gene pool during the Islamization of India.
Finally, we assessed the evidence of gene flow among the different south Indian Muslim isolates and between Muslims and Hindus in south India. For each of these two comparisons, we calculated both the proportion of shared haplotypes and the rho distance (the average number of mutations between a haplotype in one population and its closest counterpart in the second population) (Helgason et al. 2000). These measures are more sensitive to low levels of gene flow, but were not significantly different between Muslims and Hindus (Table 3), confirming the lack of genetic differentiation according to religious affinity in India.
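The two measures can be illustrated with a short Python sketch that follows the verbal definition of rho given above. Several details are assumptions made for the example rather than taken from Helgason et al. (2000): haplotypes are represented as tuples of microsatellite repeat counts, the number of mutations between two haplotypes is counted as the sum of absolute repeat differences (a stepwise mutation model), and averages are taken over individual chromosomes in the first population. The function names are hypothetical.

```python
def step_distance(h1, h2):
    """Mutational steps between two haplotypes (sum of repeat differences)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rho(pop_a, pop_b):
    """Average distance from each chromosome in pop_a to its closest
    haplotype in pop_b. Directional: rho(a, b) may differ from rho(b, a)."""
    targets = set(pop_b)
    closest = [min(step_distance(h, g) for g in targets) for h in pop_a]
    return sum(closest) / len(pop_a)


def shared_fraction(pop_a, pop_b):
    """Proportion of chromosomes in pop_a whose haplotype also occurs in pop_b."""
    shared = set(pop_a) & set(pop_b)
    return sum(h in shared for h in pop_a) / len(pop_a)
```

Each population is passed as a list with one haplotype tuple per sampled male, so common haplotypes contribute in proportion to their frequency. A chromosome with an exact match in the other population contributes zero to rho, which is why the measure responds to even a few shared or near-identical lineages.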
Although marriage between Muslim men and Hindu women was important for the spread of Islam in India, it has not been sufficient to replace the Hindu Y-chromosomal heritage built up in prehistoric times. This is in contrast with observations in Muslim groups from other places such as China and Central Asia, where there has been more marked movement of Muslim Y chromosomes into the area. Our conclusion does assume that the Muslim population entering India would have been genetically distinct from the indigenous populations, which seems likely in view of their distinct geographical origin. Moreover, our results are in accordance with previous work on the sharing of Y-chromosomes among different religious communities that live side by side, namely Jewish groups and their non-Jewish neighbors in the Near East (Hammer et al. 2000; Nebel et al. 2000; Thomas et al. 2002).
At least at the Y-chromosomal level, the Muslim isolates in south India derive predominantly from local populations rather than from Muslims in other parts of India or outside the country. Some Indian Muslim families can trace their ancestry back to sources outside India more than 1,000 years ago. Our findings do not conflict with this fact, but they do show that the largest minority religious group in India arose mainly from a cultural change among Hindus who started to follow and spread the precepts of Islam. The Y-chromosomal variation among Indian populations reflects geographical and prehistorical factors rather than the practices of Hinduism or Islam.
We specially thank all the donors for making this work possible; George van Driem for encouragement; Toomas Kivisild, Sarabjit Mastana, Partha P. Majumder, Peter Underhill and Rene Herrera for helpful information; Joan Green and Andrew King for facilitating the access to historical books; S. Qasim Mehdi, Tatiana Zerjal and Oscar Lao for comments and discussions; and three referees for suggesting improvements to the manuscript. DRC-S was supported by funds from the Arts and Humanities Research Board and the EC Sixth Framework Programme under Contract no. ERAS-CT-2003-980409. CT-S was supported by The Wellcome Trust.
Census 2001, http://www.censusindia.net/