Worldwide, the incidence of oral tongue cancer is on the rise, adding to the existing burden due to prevailing low survival and high recurrence rates. This study uses high-throughput expression profiling to identify candidate markers of resistance/response in patients with oral tongue cancer. Analysis of primary and post-treatment samples (12 tumor and 8 normal) on the Affymetrix platform (HG U133 plus 2) identified 119 genes as differentially regulated in recurrent tumors. The study groups had distinct profiles, with induction of immune response and apoptotic pathways in the non-recurrent group and of metastatic/invasiveness pathways in the recurrent group. Validation was carried out in tissues by Quantitative Real-Time PCR (QPCR) (n=30) and immunohistochemistry (IHC) (n=35) and in saliva by QPCR (n=37). The markers COL5A1, HBB, IGLA and CTSC individually, and COL5A1 and HBB in combination, had the best predictive power for treatment response in the patients. A subset of the markers identified (COL5A1, ABCG1, MMP1, IL8, FN1) could be detected in the saliva of patients with oral cancers, with a combined sensitivity and specificity of 0.65 and 0.87, respectively. The study thus emphasizes the prognostic value of exploring markers of treatment resistance that are expressed in both tissue and saliva.
Tongue cancer; resistance; response; microarray; gene expression; saliva; biomarkers
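The sensitivity and specificity reported for the salivary marker panel follow the standard definitions based on confusion-matrix counts. The sketch below is only an illustration of those definitions: the function name and the counts are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    tp: cases the marker panel flags as positive
    fn: cases the panel misses
    tn: non-cases the panel correctly leaves unflagged
    fp: non-cases the panel wrongly flags
    """
    sensitivity = tp / (tp + fn)  # fraction of true cases detected
    specificity = tn / (tn + fp)  # fraction of non-cases correctly excluded
    return sensitivity, specificity


# Hypothetical counts for illustration only (NOT the study's data).
sens, spec = sensitivity_specificity(tp=8, fn=2, tn=9, fp=1)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A panel that combines several markers would first need a rule for calling a sample positive (for example, any marker above a threshold); that rule is not specified here and is left out of the sketch.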
IL8RA and IL8RB, encoded by CXCR1 and CXCR2, are receptors for interleukin (IL)-8 and other CXC chemokines involved in chemotaxis and activation of polymorphonuclear neutrophils (PMN). Variants at CXCR1 and CXCR2 have been associated with susceptibility to cutaneous and mucocutaneous leishmaniasis in Brazil. Here we investigate the role of CXCR1/CXCR2 in visceral leishmaniasis (VL) in India.
Three single nucleotide polymorphisms (SNPs) (rs4674259, rs2234671, rs3138060) that tag linkage disequilibrium blocks across CXCR1/CXCR2 were genotyped in primary family-based (313 cases; 176 nuclear families; 836 individuals) and replication (941 cases; 992 controls) samples. Family- and population-based analyses were performed to look for association between CXCR1/CXCR2 variants and VL. Quantitative RT/PCR was used to compare CXCR1/CXCR2 expression in mRNA from paired splenic aspirates taken before and after treatment from 19 VL patients.
Family-based analysis using FBAT showed association between VL and SNPs CXCR1_rs2234671 (Z-score = 2.935, P = 0.003) and CXCR1_rs3138060 (Z-score = 2.22, P = 0.026), but not with CXCR2_rs4674259. Logistic regression analysis of the case-control data under an additive model of inheritance showed association between VL and SNPs CXCR2_rs4674259 (OR = 1.15, 95%CI = 1.01-1.31, P = 0.027) and CXCR1_rs3138060 (OR = 1.25, 95%CI = 1.02-1.53, P = 0.028), but not with CXCR1_rs2234671. The 3-locus haplotype T_G_C across these SNPs was shown to be the risk haplotype in both family-based (TRANSMIT; P = 0.014) and population-based (OR = 1.16, P = 0.028) samples (combined P = 0.002). CXCR2, but not CXCR1, expression was downregulated in pre-treatment compared to post-treatment splenic aspirates (P = 0.021).
This well-powered primary and replication genetic study, together with functional analysis of gene expression, implicates CXCR2 in determining the outcome of VL in India.
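The odds ratios and 95% confidence intervals quoted above come from logistic regression under an additive model. As a simpler stand-in, the sketch below computes an odds ratio and a Woolf (log-scale) confidence interval from a single 2x2 table of allele counts. It is not the authors' method, and the function name and counts are hypothetical.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf confidence interval from a 2x2 table.

    a: cases carrying the risk allele      b: cases without it
    c: controls carrying the risk allele   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under Woolf's approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical allele counts for illustration only (NOT the study's data).
or_value, ci_low, ci_high = odds_ratio_with_ci(a=60, b=40, c=45, d=55)
print(f"OR={or_value:.2f}, 95% CI={ci_low:.2f}-{ci_high:.2f}")
```

An interval whose lower bound stays above 1, as in the associations reported for rs4674259 and rs3138060, indicates increased odds of disease for carriers at the chosen confidence level.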
The Warburg Effect is characterized by an irreversible injury to mitochondrial oxidative phosphorylation (OXPHOS) and an increased rate of aerobic glycolysis. In this study, we utilized a breast epithelial cell line lacking mitochondrial DNA (rho0) that exhibits the Warburg Effect associated with breast cancer. We developed a MitoExpress array for rapid analysis of all known nuclear genes encoding the mitochondrial proteome. The gene-expression pattern was compared among a normal breast epithelial cell line, its rho0 derivative, breast cancer cell lines and primary breast tumors. Among several genes, our study revealed that over-expression of mitochondrial uncoupling protein UCP2 in rho0 breast epithelial cells reflects gene expression changes in breast cancer cell lines and in primary breast tumors. Furthermore, over-expression of UCP2 was also found in leukemia, ovarian, bladder, esophagus, testicular, colorectal, kidney, pancreatic, lung and prostate tumors. Ectopic expression of UCP2 in MCF7 breast cancer cells led to a decreased mitochondrial membrane potential and increased tumorigenic properties as measured by cell migration, in vitro invasion and anchorage-independent growth. Consistent with in vitro studies, we demonstrate that UCP2 over-expression leads to development of tumors in vivo in an orthotopic model of breast cancer. Genipin, a plant-derived small molecule, suppressed the UCP2-induced tumorigenic properties, an effect mediated by decreased reactive oxygen species and down-regulation of UCP2. However, UCP1, 3, 4 and 5 gene expression was unaffected. UCP2 transcription was controlled by SMAD4. Together, these studies suggest a tumor-promoting function of UCP2 in breast cancer. In summary, our studies demonstrate that i) the Warburg Effect is mediated by UCP2; ii) UCP2 is over-expressed in breast and many other cancers; iii) UCP2 promotes tumorigenic properties in vitro and in vivo and iv) genipin suppresses the tumor-promoting function of UCP2.
SLC11A1 has pleiotropic effects on macrophage function and remains a strong candidate for infectious disease susceptibility. 5' and/or 3' polymorphisms have been associated with tuberculosis, leprosy, and visceral leishmaniasis (VL). Most studies undertaken to date were under-powered, and none has been replicated within a population. Association with tuberculosis has replicated variably across populations. Here we investigate SLC11A1 and VL in India.
Nine polymorphisms (rs34448891, rs7573065, rs2276631, rs3731865, rs17221959, rs2279015, rs17235409, rs17235416, rs17229009) that tag linkage disequilibrium blocks across SLC11A1 were genotyped in primary family-based (313 cases; 176 families) and replication (941 cases; 992 controls) samples. Family- and population-based analyses were performed to look for association between SLC11A1 variants and VL. Quantitative RT/PCR was used to compare SLC11A1 expression in mRNA from paired splenic aspirates taken before and after treatment from 24 VL patients carrying different genotypes at the functional promoter GTn polymorphism (rs34448891).
No associations were observed between VL and polymorphisms at SLC11A1 that were either robust to correction for multiple testing or replicated across primary and replication samples. No differences in expression of SLC11A1 were observed when comparing pre- and post-treatment samples, or between individuals carrying different genotypes at the GTn repeat.
This is the first well-powered study of SLC11A1 as a candidate for VL, which we conclude does not have a major role in regulating VL susceptibility in India.
SLC11A1; visceral leishmaniasis; genetic susceptibility
Human Y-chromosome haplogroup structure is largely circumscribed by continental boundaries. One notable exception to this general pattern is the young haplogroup R1a that exhibits post-Glacial coalescent times and relates the paternal ancestry of more than 10% of men in a wide geographic area extending from South Asia to Central East Europe and South Siberia. Its origin and dispersal patterns are poorly understood as no marker has yet been described that would distinguish European R1a chromosomes from Asian. Here we present frequency and haplotype diversity estimates for more than 2000 R1a chromosomes assessed for several newly discovered SNP markers that introduce the onset of informative R1a subdivisions by geography. Marker M434 has a low frequency and a late origin in West Asia bearing witness to recent gene flow over the Arabian Sea. Conversely, marker M458 has a significant frequency in Europe, exceeding 30% in its core area in Eastern Europe and comprising up to 70% of all M17 chromosomes present there. The diversity and frequency profiles of M458 suggest its origin during the early Holocene and a subsequent expansion likely related to a number of prehistoric cultural developments in the region. Its primary frequency and diversity distribution correlates well with some of the major Central and East European river basins where settled farming was established before its spread further eastward. Importantly, the virtual absence of M458 chromosomes outside Europe speaks against substantial patrilineal gene flow from East Europe to Asia, including to India, at least since the mid-Holocene.
Y chromosome; haplogroup R1a; human evolution; population genetics
Islam is the second most practiced religion in India, next to Hinduism. It is still unclear whether the spread of Islam in India has been only a cultural transformation or is associated with detectable levels of gene flow. To estimate the contribution of West Asian and Arabian admixture to Indian Muslims, we assessed genetic variation in mtDNA, Y-chromosomal and LCT/MCM6 markers in 472, 431 and 476 samples, respectively, representing six Muslim communities from different geographical regions of India. We found that most of the Indian Muslim populations received their major genetic input from geographically close non-Muslim populations. However, low levels of likely sub-Saharan African, Arabian and West Asian admixture were also observed among Indian Muslims in the form of L0a2a2 mtDNA and E1b1b1a and J*(xJ2) Y-chromosomal lineages. The distinction between Iranian and Arabian sources was difficult to make with mtDNA and the Y chromosome, as the estimates were highly correlated because of similar gene pool compositions in the sources. In contrast, the LCT/MCM6 locus, which shows a clear distinction between the two sources, enabled us to rule out significant gene flow from Arabia. Overall, our results support a model according to which the spread of Islam in India was predominantly cultural conversion associated with minor but still detectable levels of gene flow from outside, primarily from Iran and Central Asia, rather than directly from the Arabian Peninsula.
Indian Muslims; mtDNA; Y chromosome; Middle East; sub-Saharan; gene flow
The geographical position of Maharashtra state makes it rather essential to study the dispersal of modern humans in South Asia. Several hypotheses have been proposed to explain the cultural, linguistic and geographical affinity of the populations living in Maharashtra state with other South Asian populations. The genetic origin of populations living in this state is poorly understood and has hitherto been described only at a low level of molecular resolution.
To address this issue, we have analyzed the mitochondrial DNA (mtDNA) of 185 individuals and NRY (non-recombining region of Y chromosome) of 98 individuals belonging to two major tribal populations of Maharashtra, and compared their molecular variations with those of 54 South Asian contemporary populations of adjacent states. Inter- and intra-population comparisons reveal that the maternal gene pool of Maharashtra state populations is composed mainly of South Asian haplogroups with traces of east and west Eurasian haplogroups, while the paternal haplogroups comprise South Asian haplogroups as well as a signature of the Near Eastern-specific haplogroup J2a.
Our analysis suggests that Indian populations, including those of Maharashtra state, are largely derived from Paleolithic ancient settlers; however, a more recent (∼10 Ky older) detectable paternal gene flow from west Asia is well reflected in the present study. These findings reveal movement of populations to Maharashtra through the western coast rather than the mainland, where the Western Ghats-Vindhya Mountains and Narmada-Tapti rivers might have acted as a natural barrier. Comparing the Maharashtrian populations with other South Asian populations reveals that they have a closer affinity with the South Indian than with the Central Indian populations.
The present study was carried out to assess the role of the androgen receptor CAG repeat polymorphism and X chromosome inactivation (XCI) pattern among Indian PCOS women and controls, which has not hitherto been explored, and to test the hypothesis that shorter CAG alleles would be preferentially activated in PCOS. CAG repeat polymorphism and X chromosome methylation patterns were compared between PCOS and non-PCOS women; 250 PCOS women and 299 controls were included in this study. Androgen receptor CAG repeat sizes, XCI percentages, and clinical and biochemical parameters were measured. The mean CAG repeat number was similar between the cases (18.74±0.13) and controls (18.73±0.12). Obese PCOS women were significantly more frequent in the <18 and >20 CAG repeat categories than lean PCOS women (p = 0.001). Among the women with non-random X-inactivation, alleles with <19 repeats were more frequently activated among cases than controls, although the difference was not statistically significant (p = 0.33). CAG repeat polymorphism by itself cannot be considered a useful marker for discriminating PCOS. We observed a trend of preferential activation of the shorter allele among the PCOS cases with a non-random XCI pattern. In obese PCOS women, this microsatellite variation may account for hyperandrogenicity to a larger extent than in lean PCOS women.
India has been underrepresented in genome-wide surveys of human variation. We analyze 25 diverse groups to provide strong evidence for two ancient populations, genetically divergent, that are ancestral to most Indians today. One, the “Ancestral North Indians” (ANI), is genetically close to Middle Easterners, Central Asians, and Europeans, while the other, the “Ancestral South Indians” (ASI), is as distinct from ANI and East Asians as they are from each other. By introducing methods that can estimate ancestry without accurate ancestral populations, we show that ANI ancestry ranges from 39-71% in India, and is higher in traditionally upper caste and Indo-European speakers. Groups with only ASI ancestry may no longer exist in mainland India. However, the Andamanese are an ASI-related group without ANI ancestry, showing that the peopling of the islands must have occurred before ANI-ASI gene flow on the mainland. Allele frequency differences between groups in India are larger than in Europe, reflecting strong founder effects whose signatures have been maintained for thousands of years due to endogamy. We therefore predict that there will be an excess of recessive diseases in India, different in each group, which should be possible to screen and map genetically.
We attempt to ascertain if the 3 linked single nucleotide polymorphisms (SNPs) of the Progesterone Receptor (PR) gene (exon 1: G 1031 C; S344T, exon 4: G 1978 T; L660V and exon 5: C 2310 T; H770H) and the PROGINS insertion in the intron G, between exons 7 and 8, are associated with Recurrent Spontaneous Abortion (RSA) in the Indian population.
A total of 143 women with RSA and 150 controls were sequenced for all 8 exons to look for the above 3 linked SNPs of the PR gene earlier implicated in RSA, as well as for any new SNPs that might be found in the Indian population. The PROGINS insertion was screened by electrophoresis. We did not find any previously unreported mutations in our population. Further, we did not find a significant role of the *2 allele (representing the mutant allele at the three SNP loci) or the T2 allele (PROGINS insertion) in the manifestation of RSA. We also did not find an LD pattern between each of the 3 SNPs and the PROGINS insertion.
The results suggest that the PR gene mutations may not play any exclusive role in the manifestation of RSA. Instead, given the significantly higher frequency of the *2 allele among the normal women, we surmise that it may in fact confer a protective role among the Indian populations, although further studies are required in the heterogeneous populations of this region before any conclusive statement can be made.
The phylogeny of the indigenous Indian-specific mitochondrial DNA (mtDNA) haplogroups has been determined and refined in previous reports. As for mtDNA superhaplogroups M and N, a profusion of reports is also available for superhaplogroup R. However, there is a dearth of information on South Asian subhaplogroups in particular, including R8. Therefore, we sought to assess the genealogy and pre-historic expansion of haplogroup R8, which is considered one of the autochthonous lineages of South Asia.
Upon screening the mtDNA of 5,836 individuals belonging to 104 distinct ethnic populations of the Indian subcontinent, we found 54 individuals with the HVS-I motif that defines the R8 haplogroup. Complete mtDNA sequencing of these 54 individuals revealed two deep-rooted subclades: R8a and R8b. Furthermore, these subclades split into several fine subclades. An isofrequency contour map detected the highest frequency of R8 in the state of Orissa. Spearman's rank correlation analysis suggests significant correlation of R8 occurrence with geography.
The coalescent ages of the newly characterized subclades of R8, R8a (15.4±7.2 Kya) and R8b (25.7±10.2 Kya), indicate that the initial maternal colonization of this haplogroup occurred during the middle and upper Paleolithic period, roughly around 40 to 45 Kya. These results signify that the southern part of Orissa currently inhabited by Munda speakers is likely the origin of these autochthonous maternal deep-rooted haplogroups. Our high-resolution study on the genesis of the R8 haplogroup provides ample evidence of its deep-rooted ancestry among the Orissa (Austro-Asiatic) tribes.
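The association between R8 occurrence and geography mentioned above was tested with Spearman's rank correlation. The sketch below shows the textbook no-ties formula, rho = 1 - 6*sum(d^2) / (n*(n^2 - 1)), applied to made-up values. The variable names and numbers are hypothetical, and the original analysis may have used different geographic variables and software.

```python
def ranks(values):
    """Rank values from 1 (smallest) upward; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(x, y):
    """Spearman's rank correlation using the no-ties shortcut formula."""
    n = len(x)
    rank_x, rank_y = ranks(x), ranks(y)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


# Hypothetical data for illustration only (NOT the study's data):
# distance of each sampled population from a reference point, and
# the haplogroup frequency observed in that population.
distance_km = [100, 300, 500, 800, 1200]
haplogroup_freq = [0.09, 0.05, 0.06, 0.02, 0.01]

rho = spearman_rho(distance_km, haplogroup_freq)
print(f"Spearman rho = {rho:.2f}")
```

With tied values, average ranks and the Pearson correlation of the ranks would be needed instead of the shortcut formula.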
Heart failure is a leading cause of mortality in South Asians. However, its genetic etiology remains largely unknown. Cardiomyopathies due to sarcomeric mutations are a major monogenic cause for heart failure (MIM600958). Here, we describe a deletion of 25 bp in the gene encoding cardiac myosin binding protein C (MYBPC3) that is associated with heritable cardiomyopathies and an increased risk of heart failure in Indian populations (initial study OR = 5.3 (95% CI = 2.3–13), P = 2 × 10^-6; replication study OR = 8.59 (3.19–25.05), P = 3 × 10^-8; combined OR = 6.99 (3.68–13.57), P = 4 × 10^-11) and that disrupts cardiomyocyte structure in vitro. Its prevalence was found to be high (~4%) in populations of Indian subcontinental ancestry. The finding of a common risk factor implicated in South Asian subjects with cardiomyopathy will help in identifying and counseling individuals predisposed to cardiac diseases in this region.
The caste system has persisted in Indian Hindu society for around 3,500 years. Like the Y chromosome, caste is defined at birth, and males cannot change their caste. In order to investigate the genetic consequences of this system, we have analysed male-lineage variation in a sample of 227 Indian men of known caste, 141 from the Jaunpur district of Uttar Pradesh and 86 from the rest of India. We typed 131 Y-chromosomal binary markers and 16 microsatellites. We find striking evidence for male substructure: in particular, Brahmins and Kshatriyas (but not other castes) from Jaunpur each show low diversity and the predominance of a single distinct cluster of haplotypes. These findings confirm the genetic isolation and drift within the Jaunpur upper castes, which are likely to result from founder effects and social factors. In the other castes, there may be either larger effective population sizes, or less strict isolation, or both.
Y chromosome; haplotype; human population substructure; Indian caste system
We have analyzed 7137 samples from 125 different caste, tribal and religious groups of India and 99 samples from three populations of Nepal for the length variation in the COII/tRNALys region of mtDNA. Samples showing length variation were subjected to detailed phylogenetic analysis based on HVS-I and informative coding region sequence variation. The overall frequencies of the 9-bp deletion and insertion variants in South Asia were 1.8% and 0.5%, respectively. We have also defined a novel deep-rooting haplogroup M43 and identified the rare haplogroup H14 in Indian populations carrying the 9bp-deletion by complete mtDNA sequencing. Moreover, we redefined haplogroup M6 and dissected it into two well-defined subclades. The presence of haplogroups F1 and B5a in Uttar Pradesh suggests minor maternal contribution from Southeast Asia to Northern India. The occurrence of haplogroup F1 in the Nepalese sample implies that Nepal might have served as a bridge for the flow of eastern lineages to India. The presence of R6 in the Nepalese, on the other hand, suggests that the gene flow between India and Nepal has been reciprocal.
South Asia; 9bp indel; mtDNA; Haplogroup
Human genetic diversity observed in the Indian subcontinent is second only to that of Africa. This implies an early settlement and demographic growth soon after the first 'Out-of-Africa' dispersal of anatomically modern humans in the Late Pleistocene. In contrast to this perspective, linguistic diversity in India has been thought to derive from more recent population movements and episodes of contact. With the exception of Dravidian, whose origin and relatedness to other language phyla are obscure, all the language families in India can be linked to language families spoken in different regions of Eurasia. Mitochondrial DNA and Y chromosome evidence has supported largely local evolution of the genetic lineages of the majority of Dravidian and Indo-European speaking populations, but there is no consensus yet on the question of whether the Munda (Austro-Asiatic) speaking populations originated in India or derive from a relatively recent migration from further East.
Here, we report the analysis of 35 novel complete mtDNA sequences from India which refine the structure of Indian-specific varieties of haplogroup R. Detailed analysis of haplogroup R7, coupled with a survey of ~12,000 mtDNAs from caste and tribal groups over the entire Indian subcontinent, reveals that one of its more recently derived branches (R7a1), is particularly frequent among Munda-speaking tribal groups. This branch is nested within diverse R7 lineages found among Dravidian and Indo-European speakers of India. We have inferred from this that a subset of Munda-speaking groups have acquired R7 relatively recently. Furthermore, we find that the distribution of R7a1 within the Munda-speakers is largely restricted to one of the sub-branches (Kherwari) of northern Munda languages. This evidence does not support the hypothesis that the Austro-Asiatic speakers are the primary source of the R7 variation. Statistical analyses suggest a significant correlation between genetic variation and geography, rather than between genes and languages.
Our high-resolution phylogeographic study, involving diverse linguistic groups in India, suggests that the high frequency of mtDNA haplogroup R7 among Munda-speaking populations of India can be explained best by gene flow from linguistically different populations of the Indian subcontinent. The conclusion is based on the observation that among Indo-Europeans, and particularly in Dravidians, the haplogroup is, despite its lower frequency, phylogenetically more divergent, while among the Munda speakers only one sub-clade of R7, i.e. R7a1, can be observed. It is noteworthy that though R7 is autochthonous to India, and arises from the root of hg R, its distribution and phylogeography in India are not uniform. This suggests the more ancient establishment of an autochthonous matrilineal genetic structure, and that isolation in the Pleistocene, lineage loss through drift, and endogamy of prehistoric and historic groups have greatly inhibited genetic homogenization and geographical uniformity.
The domestic goat is one of the important livestock species of India. In the present study we assess genetic diversity of Indian goats using 17 microsatellite markers. Breeds were sampled from their natural habitat, covering different agroclimatic zones.
The mean number of alleles per locus (NA) ranged from 8.1 in Barbari to 9.7 in Jakhrana goats. The mean expected heterozygosity (He) ranged from 0.739 in Barbari to 0.783 in Jakhrana goats. Deviations from Hardy-Weinberg Equilibrium (HWE) were statistically significant (P < 0.05) for 5 locus-breed combinations. The DA measure of genetic distance between pairs of breeds indicated that the lowest distance was between Marwari and Sirohi (0.135). The highest distance was between Pashmina and Black Bengal. An analysis of molecular variance indicated that 6.59% of the variance exists among the Indian goat breeds. Both a phylogenetic tree and Principal Component Analysis showed the distribution of breeds in two major clusters with respect to their geographic distribution.
Our study concludes that Indian goat populations can be classified into distinct genetic groups or breeds based on the microsatellites as well as mtDNA information.
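The diversity statistics reported for the goat breeds, number of alleles (NA) and expected heterozygosity (He), can be computed per locus from allele counts. The sketch below uses the basic definition He = 1 - sum(p_i^2); the allele labels and counts are hypothetical, and the study itself may have used a sample-size-corrected (unbiased) estimator.

```python
def expected_heterozygosity(allele_counts):
    """Expected heterozygosity at one locus: He = 1 - sum(p_i ** 2).

    allele_counts maps each allele label to the number of gene copies
    observed in the sample.
    """
    total = sum(allele_counts.values())
    return 1.0 - sum((count / total) ** 2 for count in allele_counts.values())


# Hypothetical microsatellite locus for illustration only
# (NOT the study's data): three alleles observed in 20 gene copies.
locus = {"A1": 10, "A2": 6, "A3": 4}

n_alleles = len(locus)               # number of alleles at this locus
he = expected_heterozygosity(locus)  # 1 - (0.5**2 + 0.3**2 + 0.2**2)
print(f"NA={n_alleles}, He={he:.2f}")
```

Averaging these per-locus values across all typed loci would give breed-level means comparable in form to those quoted above.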
The Austro-Asiatic linguistic family, which is considered to be the oldest of all the families in India, has a substantial presence in Southeast Asia. However, the possibility of any genetic link among the linguistic sub-families of the Indian Austro-Asiatics on the one hand and between the Indian and the Southeast Asian Austro-Asiatics on the other has not been explored till now. Therefore, to trace the origin and historic expansion of Austro-Asiatic groups of India, we analysed Y-chromosome SNP and STR data of 1222 individuals from 25 Indian populations, covering all three branches of Austro-Asiatic tribes, viz. Mundari, Khasi-Khmuic and Mon-Khmer, along with the previously published data on 214 relevant populations from Asia and Oceania.
Our results suggest a strong paternal genetic link, not only among the subgroups of Indian Austro-Asiatic populations but also with those of Southeast Asia. However, a maternal link based on mtDNA is not evident. The results also indicate that the haplogroup O-M95 originated in the Indian Austro-Asiatic populations ~65,000 yrs BP (95% C.I. 25,442 – 132,230) and that their ancestors carried it further to Southeast Asia via the Northeast Indian corridor. Subsequently, in the process of expansion, the Mon-Khmer populations from Southeast Asia seem to have migrated and colonized the Andaman and Nicobar Islands at a much later point in time.
Our findings are consistent with the linguistic evidence, which suggests that the linguistic ancestors of the Austro-Asiatic populations originated in India and then migrated to Southeast Asia.
India is a country with enormous social and cultural diversity due to its positioning on the crossroads of many historic and pre-historic human migrations. The hierarchical caste system in the Hindu society dominates the social structure of the Indian populations. The origin of the caste system in India is a matter of debate with many linguists and anthropologists suggesting that it began with the arrival of Indo-European speakers from Central Asia about 3500 years ago. Previous genetic studies based on Indian populations failed to achieve a consensus in this regard. We analysed the Y-chromosome and mitochondrial DNA of three tribal populations of southern India, compared the results with available data from the Indian subcontinent and tried to reconstruct the evolutionary history of Indian caste and tribal populations.
No significant difference was observed in the mitochondrial DNA between Indian tribal and caste populations, except for the presence of a higher frequency of west Eurasian-specific haplogroups in the higher castes, mostly in the north western part of India. On the other hand, the study of the Indian Y lineages revealed distinct distribution patterns among caste and tribal populations. The paternal lineages of Indian lower castes showed significantly closer affinity to the tribal populations than to the upper castes. The frequencies of deep-rooted Y haplogroups such as M89, M52, and M95 were higher in the lower castes and tribes, compared to the upper castes.
The present study suggests that the vast majority (>98%) of the Indian maternal gene pool, consisting of Indo-European and Dravidian speakers, is genetically more or less uniform. Invasions after the late Pleistocene settlement might have been mostly male-mediated. However, Y-SNP data provide compelling genetic evidence for a tribal origin of the lower caste populations in the subcontinent. Lower caste groups might have originated with the hierarchical divisions that arose within the tribal groups with the spread of Neolithic agriculturalists, much earlier than the arrival of Aryan speakers. The Indo-Europeans established themselves as upper castes among this already developed caste-like class structure within the tribes.
Macrohaplogroups 'M' and 'N' have evolved almost in parallel from a founder haplogroup L3. Macrohaplogroup N in India has already been defined in previous studies and recently the macrohaplogroup M among the Indian populations has been characterized. In this study, we attempted to reconstruct and re-evaluate the phylogeny of Macrohaplogroup M, which harbors more than 60% of the Indian mtDNA lineages, and to shed light on the origin of its deep-rooting haplogroups.
Using 11 whole mtDNA and 2231 partial coding sequences of Indian M lineages selected from 8670 HVS1 sequences across India, we have reconstructed the tree including the Andamanese-specific lineage M31 and calculated the time depth of all the nodes. We defined one novel haplogroup M41, and revised the classification of haplogroups M3, M18, and M31.
Our results indicate that the Indian mtDNA pool consists of several deep-rooting lineages of macrohaplogroup 'M', suggesting an in-situ origin of these haplogroups in South Asia, most likely in India. These deep-rooting lineages are not language specific and are spread over all the language groups in India. Moreover, our reanalysis of the Andamanese-specific lineage M31 suggests two clear-cut population-specific subclades (M31a1 and M31a2). Onge and Jarwa share the M31a1 branch, while the M31a2 clade is present only in Great Andamanese individuals. Overall, our study supports the one-wave, rapid dispersal theory of modern humans along the Asian coast.
Because of the widespread phenomenon of patrilocality, it is hypothesized that Y-chromosome variants tend to be more localized geographically than those of mitochondrial DNA (mtDNA). Empirical evidence supporting this hypothesis was subsequently provided among certain patrilocal and matrilocal groups of Thailand, which conforms to the isolation by distance mode of gene diffusion. However, we expect intuitively that the patterns of genetic variability may not be consistent with the above hypothesis among populations with different social norms governing the institution of marriage, particularly among those that adhere to strict endogamy rules. We test the universality of this hypothesis by analyzing Y-chromosome and mtDNA data in three different sets of Indian populations that follow endogamy rules to varying degrees. Our analysis of the Indian patrilocal and matrilocal groups does not confirm the sex-specific variation observed among the tribes of Thailand. Our results indicate spatial instability of the impact of different cultural processes on genetic variability, resulting in the lack of universality of the hypothesized pattern of greater Y-chromosome variation when compared to that of mtDNA among the patrilocal populations.
In most human societies, women traditionally move to their husband's home after marriage, and these societies are thus “patrilocal,” but in a few “matrilocal” societies, men move to their wife's home. These social customs are expected to influence the patterns of genetic variation. They should lead to a localization of male-specific Y-chromosomal variants and wide dispersal of female-specific mitochondrial DNA variants in patrilocal societies and vice versa in matrilocal societies. These predicted patterns have indeed been observed in previous studies of populations from Thailand. Indian societies, however, are endogamous, so marriage should always take place within a population, and these different patterns of genetic variation should not build up. The authors have now analyzed ten patrilocal and five matrilocal Indian populations, and find that there is indeed little difference between the patrilocal and matrilocal societies. The authors therefore conclude that patterns of genetic variation in humans are not universal, but depend on local cultural practices.