India occupies a unique stage in human population evolution because one of the early waves of migration of modern humans was out of Africa, through West Asia, into India (Cann 2001). More recently, about 15 000–10 000 years before present (ybp), when agriculture developed in the Fertile Crescent region that extended from Israel through Northern Syria to Western Iran, there was an eastward wave of human migration (Renfrew 1989; Cavalli-Sforza et al. 1994). It has been postulated that this wave brought the Dravidian language into India (Renfrew 1989). Subsequently, the Indo-European (Aryan) language was introduced into India from the Iranian plateau approximately 4000–3000 ybp, where this language was probably brought by pastoral nomads from the Central Asian steppes (Renfrew 1989). Therefore, linguistic evidence suggests that West Asia and Central Asia were two major geographical sources contributing to the Indian gene pool.
Indian society predominantly revolves around the concept of caste, or the Caste System, a strong socio-cultural conglomerate of traditions that have created and maintained a great number of hierarchically arranged endogamous groups (Bamshad et al. 2001). This unique social system exists only in India. One impact of the system is that a person’s fate, including even the choice of marriage partner, is largely determined at birth. The Hindu caste system plays a major role in the social and economic organization of the Indian population. In this system, society is divided into four broad castes, from low to high: Sudras, Vaishyas, Kshatriyas and Brahmins. The rules that generally prevent marriages between castes may have contributed to population substructure and the pattern of genetic diversity. Another important feature of Indian population history was the occurrence of four distinct waves of migration into the subcontinent (Cordaux et al. 2004): (i) an ancient Palaeolithic migration by modern humans, (ii) an early Neolithic migration, probably of Proto-Dravidian speakers from the eastern horn of the Fertile Crescent, (iii) an influx of Indo-European speakers, and (iv) a migration of East/Southeast Asians, i.e. Tibeto-Burman speakers. In addition to these migrations, India has also experienced colonization by Europeans, which may also have contributed to the ethnic multiplicity. Furthermore, it has been reported (Cordaux et al. 2004) that the Y lineages of Indian castes are more closely related to Central Asians than to Indian tribal populations, suggesting that Indian caste groups are primarily the descendants of Indo-European migrants.
The Y chromosome is one of the most informative loci for investigating genetic diversity and population substructure. The DNA variation found within the non-recombining portion of the Y chromosome (NRY) reflects a simple paternal history revealed by the pattern of alleles at informative loci, i.e. markers, comprising the haplotype (Underhill et al. 2000). At present, the NRY contains approximately 600 binary markers, which define 311 distinct haplogroups in the Y chromosome tree (Karafet et al. 2008). Combining DNA variation information with other population characteristics, such as geographic, archaeological, and linguistic background, may give more power to trace the histories of contemporary populations. Moreover, Y-chromosomal markers are appropriate for the genetic study of populations that have a small effective population size or a recent divergence time. Therefore, markers on the Y chromosome should provide a powerful tool for investigating the origins of Indian populations, as well as genetic substructure within these populations.
There have been a few recent studies of both South Indian and North Indian caste groups, and of South Indian Muslims, using Y-chromosomal markers (Gutala et al. 2006; Sahoo et al. 2006; Zerjal et al. 2007). However, no genetic study has been conducted on Shias and Sunnis from North India. Such knowledge is expected to provide insights into migration into and within India, as well as how Indian populations have evolved in recent history, and how genetic variation is distributed in the modern Indian population.
In this study we examined the genetic composition of three endogamous North Indian upper caste populations (Brahmins and two sub-populations of Brahmins: Bhargavas and Chaturvedis) and two Muslim sects (Shias and Sunnis). Bhargavas and Chaturvedis practice strict surname endogamy (Agrawal et al. 2005), and the Muslim sects (Shias and Sunnis) practice consanguinity. One major aim of the present study is to evaluate the impact of Muslim invasions and the resulting admixture with upper caste Hindus, who otherwise claim to be highly endogamous. We selected Bhargavas, Chaturvedis and Brahmins because they are highly homogeneous groups and follow strict endogamy, which does not apply to the lower caste populations. We examined a set of 32 Y-chromosomal unique event polymorphisms (UEPs), comprising 27 single nucleotide polymorphisms (SNPs), four insertions/deletions (indels) and one Alu repeat, in a total of 560 Y chromosomes from three upper caste and two Muslim populations in North India. Analysis of these markers revealed substantial genetic variation between groups and provided evidence of male-driven gene flow among these populations.