We have analyzed 7,137 samples from 125 different caste, tribal and religious groups of India and 99 samples from three populations of Nepal for the length variation in the COII/tRNALys region of mtDNA. Samples showing length variation were subjected to detailed phylogenetic analysis based on HVS-I and informative coding region sequence variation. The overall frequencies of the 9-bp deletion and insertion variants in South Asia were 1.9 and 0.6%, respectively. We have also defined a novel deep-rooting haplogroup M43 and identified the rare haplogroup H14 in Indian populations carrying the 9-bp deletion by complete mtDNA sequencing. Moreover, we redefined haplogroup M6 and dissected it into two well-defined subclades. The presence of haplogroups F1 and B5a in Uttar Pradesh suggests minor maternal contribution from Southeast Asia to Northern India. The occurrence of haplogroup F1 in the Nepalese sample implies that Nepal might have served as a bridge for the flow of eastern lineages to India. The presence of R6 in the Nepalese, on the other hand, suggests that the gene flow between India and Nepal has been reciprocal.
Sequence variation of mitochondrial DNA (mtDNA) has been widely studied to assess genetic relatedness at both the species and population levels. In the consensus sequence of human mtDNA there are two copies of the 9-bp motif ccccctcta in the non-coding region V [2,3,4]. Deletion of one copy has been reported in many populations worldwide at different frequencies [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]. Previous studies have demonstrated multiple and independent origins of the 9-bp deletion on different mtDNA haplogroup backgrounds in populations of different continental affiliation, such as Australian Aborigines, South Indians [18,19,20,21,22] and Sub-Saharan Africans. However, in Central and East Asian populations, the 9-bp deletion is predominantly associated with a single haplogroup, B [23,24,25,26,27,28,29].
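The copy-number logic described above can be sketched in a few lines of Python. This is an illustration only: the study typed the variants by PCR and gel electrophoresis, not by scanning sequences, and the function names and the flanking bases in the usage example are hypothetical. The sketch assumes that one copy of the motif corresponds to the deletion, two copies to the consensus state, and three copies to the insertion.

```python
MOTIF = "CCCCCTCTA"  # the 9-bp motif from region V, upper-cased


def max_tandem_copies(seq, motif=MOTIF):
    """Return the longest run of back-to-back motif copies found in seq."""
    seq = seq.upper()
    best = 0
    start = seq.find(motif)
    while start != -1:
        # Count consecutive copies beginning at this occurrence.
        copies = 0
        pos = start
        while seq.startswith(motif, pos):
            copies += 1
            pos += len(motif)
        best = max(best, copies)
        start = seq.find(motif, start + 1)
    return best


def classify(seq):
    """Map a copy number to a label (assumed mapping, see lead-in)."""
    copies = max_tandem_copies(seq)
    labels = {1: "deletion", 2: "reference", 3: "insertion"}
    return labels.get(copies, f"other ({copies} copies)")


if __name__ == "__main__":
    # Made-up flanking bases, for demonstration only.
    for n in (1, 2, 3):
        print(n, classify("AGT" + MOTIF * n + "GCA"))
```

In practice the copy number is read from the fragment size on the gel; a sequence-based check like this would only apply to samples that were also sequenced across region V.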
India is culturally and geographically a highly heterogeneous country. The caste and tribal groups of India are considered socially and culturally the most stratified of all known societies in human history. Indian populations are structured further by their linguistic and religious affiliations. More than 60% of the present day Indian maternal lineages are affiliated with mtDNA haplogroup M. Most of the sub-clades of these Eurasian founder haplogroups are autochthonous to South Asia. South East Asia is conventionally divided into two cultural, linguistic and geographic regions, i.e. Insular Southeast Asia – the island or peninsular countries of Malaysia, Singapore, Brunei, The Philippines and East Timor; and Mainland Southeast Asia – the countries of Thailand, Laos, Burma, Cambodia and Vietnam. Tibeto-Burman, Austronesian and Austro-Asiatic are the major language families of this area.
Recently, we have shown that populations from the Nicobar Islands that speak the Mon-Khmer branch of the Austro-Asiatic language family carry the highest frequency of the 9-bp deletion thus far reported among South Asian populations. In contrast, we did not detect the 9-bp deletion among 302 Mundari samples, another Austro-Asiatic speaking group from mainland India. Detailed molecular characterization of the Nicobarese samples using coding region SNPs classified them under haplogroup B5a1a [18, 31], which is also common in China and other South East Asian populations [23,24,25,26,27,28,29]. In contrast, studies on Mundari populations have not so far detected any lineages with the 9-bp deletion [18, 32, 33]. These findings suggest that the Austro-Asiatic populations of the Indian subcontinent and Nicobar Islands may have distinct genetic sources. Given the limited number of Mundari populations covered in our previous survey, we have now analyzed mtDNA diversity in 3,844 additional individuals from India, including 243 Mundari speakers from six populations not investigated previously, and, for the first time, populations from Nepal, to determine the genetic affinities of the population of mainland India with South East Asian populations, and to infer the origins of the 9-bp deletion/insertion variants in the Indian subcontinent.
Approximately 5–10 ml of blood was collected from 7,137 individuals belonging to 125 endogamous populations of India; this total includes the 3,293 samples reported in our previous work. In addition, 99 samples from three Nepalese populations were analyzed. All samples were collected with the written informed consent of the donors. The non-coding intergenic region V between the genes COII and tRNALys in the mtDNA was amplified and size-fractionated by 8% polyacrylamide gel electrophoresis. In the samples carrying either the deletion or the insertion of the 9-bp motif, HVS-I and coding-region single nucleotide polymorphisms (SNPs) were typed by sequencing and RFLP analysis (supplementary table 1, www.karger.com/doi/10.1159/000114160) [35, 36]. Individuals with ambiguous haplogroup affiliation were subjected to further sequencing of informative coding region stretches. To minimize errors, both strands were double-sequenced. Phylogenetic relationships between the observed haplotypes were reconstructed with the NETWORK program (version 4.1) [www.fluxus-engineering.com].
Of the 7,137 individuals studied (present data and previous work), 139 individuals had the 9-bp deletion (1.94%) whereas 42 individuals had the insertion (0.59%) (supplementary table 1). The highest frequency (~20%) of the 9-bp deletion was observed in the Yanadi, a Dravidian-speaking population from southern districts of Andhra Pradesh. Nine populations showed the presence of the 9-bp insertion, and the highest frequency (8.7%) was also observed in a Dravidian-speaking population from Andhra Pradesh (table 1, supplementary table 1).
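As a quick check on the overall frequencies, the arithmetic can be reproduced from the counts given above. The helper name `percent` is ours, and rounding to one decimal place is an assumption chosen to match the figures quoted in the abstract.

```python
def percent(count, total, digits=1):
    """Express count/total as a percentage rounded to `digits` decimals."""
    return round(100 * count / total, digits)


TOTAL_SAMPLES = 7137   # Indian individuals screened (present and previous work)
DELETION_COUNT = 139   # individuals carrying the 9-bp deletion
INSERTION_COUNT = 42   # individuals carrying the 9-bp insertion

if __name__ == "__main__":
    print("deletion  %:", percent(DELETION_COUNT, TOTAL_SAMPLES))   # 1.9
    print("insertion %:", percent(INSERTION_COUNT, TOTAL_SAMPLES))  # 0.6
```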
Both Mishmar et al. and Ruiz-Pesini et al. have postulated a role for natural selection in shaping the mtDNA distribution at a global level. They classified the South and South East Asian haplogroup B, the African haplogroup L0a and the Australian variants as ‘tropical’ haplogroups, which are associated with the 9-bp deletion. In contrast, none of the haplogroups found in Europe or Northeast Asia had high frequencies of the 9-bp deletion, and none were classified as ‘tropical’ [38, 39]. Surprisingly, the distribution appears different in the Indian sub-continent, where the frequency of the 9-bp deletion is generally low, except in Nicobar Islanders [[18,19,20,21,22, 33]; this study].
Given the generally low frequency we find in our large South Asian sample, it seems unlikely that the 9-bp deletion is advantageous in tropical regions. Its high frequency in certain populations is better explained by random genetic drift (e.g. a founder effect involving haplogroup B5a1 in the Nicobarese), which results in different frequencies of different lineages in different regions; some of these lineages harbor phylogenetically deep deletions, which makes the deletion more frequent where those lineages are common. In contrast to the 9-bp deletion, the insertion polymorphism has not reached high frequency in any of the populations of our present study. Nevertheless, Thangaraj et al. reported that the insertion, like the deletion, has occurred repeatedly on multiple mtDNA lineages. While most of the insertion/deletion events examined in this study are autochthonous to India, i.e. they are found on mtDNA lineages specific to India, a few can still be attested as having evolved elsewhere and only recently been carried into India. Besides the Nicobarese B5a, several other sub-clades of haplogroup B are found in India at detectable but overall low frequency (fig. 1). In India, the insertion is rarer than the deletion (supplementary table 1, fig. 2). We did not find any individual with four or more copies of the 9-bp motif, which may suggest that more copies have a lower fitness or that they are structurally unstable. The frequency distribution of mtDNA haplogroups associated with the 9-bp indel in mainland India is shown in figure 2.
By additional genotyping of informative SNPs in the coding region in the samples reported in our present and previous studies, we classified the samples with the 9-bp ins/del polymorphism according to known mtDNA haplogroups (supplementary table 1). Our results suggest that there have been at least 21 independent occurrences of the 9-bp ins/del polymorphism in macrohaplogroup M (fig. 3) and 15 independent incidents in macrohaplogroup R background (fig. 1). Our survey of published complete mtDNA sequence data of European populations shows a 3% (16/527) 9-bp deletion frequency [40,41,42] on the background of multiple European-specific haplogroups, which is by and large consistent with our findings for India. This comparison further strengthens the view that the elevated frequency of the 9-bp deletion in some previously reported populations from the tropics can be explained by limited sampling and the effect of genetic drift rather than requiring explanations involving adaptive selection to the tropical environment.
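One rough way to see why 16/527 and 139/7,137 can be called broadly consistent is to compare approximate 95% confidence intervals for the two proportions. The sketch below uses the Wilson score interval; this is our choice for illustration and is not an analysis reported in the study. The counts are those quoted in the text, and overlapping intervals are only an informal indication, not a formal test.

```python
from math import sqrt


def wilson_interval(count, total, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = count / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


def intervals_overlap(a, b):
    """True if two (low, high) intervals share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Counts quoted in the text: 9-bp deletion carriers / individuals surveyed.
europe = wilson_interval(16, 527)
india = wilson_interval(139, 7137)

if __name__ == "__main__":
    print("Europe: %.4f to %.4f" % europe)
    print("India:  %.4f to %.4f" % india)
    print("overlap:", intervals_overlap(europe, india))
```

The European interval is much wider because it rests on far fewer samples, which is the same limited-sampling caveat the paragraph raises for earlier tropical surveys.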
We identified here a novel deep-rooting mtDNA clade, M43, based on the complete sequencing of a 9-bp-deleted Tharu individual from Uttaranchal state of India (fig. 4). This sample shared four coding region mutations with individual B177 of Sun et al. We redefined haplogroup M6 by five coding region substitutions and a single control region substitution. The fine dissection of this haplogroup suggests two well-defined subclades, M6a and M6b. Subclade M6b occupies one branch among the two major founder clades of South Indian 9-bp-deleted samples; the other major clade is unclassified M* (fig. 3). Moreover, for the first time we have identified the rare West Eurasia-specific haplogroup H14 [44, 45] in an Indian population (fig. 4). The complete sequencing of the Indian branch suggests a deep split between the Indian and European lineages ~20 KYA.
After screening a total of 545 Mundari-speaking individuals from mainland India, including 302 samples from our previous study, we found the 9-bp deletion polymorphism in 0.5% of the analyzed samples, in contrast to our previous study, where no samples with the deletion were detected and only a single individual with the 9-bp insertion was found. However, while all Nicobarese with the 9-bp deletion belonged to haplogroup B5a, the new Mundari samples with the 9-bp deletion polymorphism were found on different backgrounds within macrohaplogroup M. These findings suggest independent occurrences of the 9-bp deletion event even among the Mundari population. Accordingly, these results support our previous hypothesis that the Mundari (an Austro-Asiatic speaking population) from mainland India had an independent migration/origin compared with the Mon-Khmeric Austro-Asiatic speaking groups from the Nicobar Islands [18, 31]. In addition, the 262 samples (220 from this work (table 1) and 42 Nishi from our previous study) from Tibeto-Burman speaking groups of North Eastern India did not show the presence of haplogroup B either.
The presence of haplogroups B5a and F1 in the Tharu of Eastern Uttar Pradesh (near the Indo-Nepal border) points to past human movements from East Asia to India, perhaps through Nepal. In order to investigate this possibility further, we sequenced the HVS-I in 99 samples from Nepal and searched for the 9-bp polymorphism. We did not find any Nepali sample either belonging to haplogroup B5a or carrying the 9-bp deletion/insertion polymorphism. However, we detected haplogroups R6 and F1, which suggest reciprocal gene flow. The presence of haplogroup F1 in the Nepalese implies that Nepal could have served as a bridge for migrations from East Asia to India. On the other hand, haplogroup R6 in Nepal suggests gene flow from India to Nepal.
In conclusion, we report a novel autochthonous deep-rooting M43 lineage, redefine haplogroup M6 and identify a rare West Eurasia-specific haplogroup H14 in an Indian population (fig. 4). Based on the co-presence of haplogroup F1 in North Indian and Nepali populations, but the lack of B5a in the latter, it seems likely that the migration of Southeast Asian maternal lineages to North India has occurred not only through the corridor of Nepal but has also involved gene flow along the southern slopes of the Himalayas within the Indian sub-continent. The occurrence of R6 in the Nepalese suggests gene flow from India to Nepal. Further analyses of the B5a and F1 lineages in larger samples from Nepal and Eastern Uttar Pradesh should shed further light on the origin and spread of the Tharu and other populations of North India. Studies of mtDNA and Y-chromosomal markers in Nepalese populations would be useful for the investigation of this putative gene flow into India.
We gratefully acknowledge the Council of Scientific and Industrial Research, Govt. of India, for financial support to L.S. and K.T. for analyzing the Indian samples. We are thankful to all anonymous blood donors. Nepalese samples were collected as part of the European Science Foundation EUROCORES Programme OMLL, which was supported by funds from the Arts and Humanities Research Board and the EC Sixth Framework Programme under Contract no. ERAS-CT-2003-980409. C.T.-S. was supported by The Wellcome Trust.
K. Thangaraj, G. Chaubey, T. Kivisild and D. Selvi Rani contributed equally to the paper.