In the present study, we screened 630 samples from different regions of India and found that 2.38% (15/630) of individuals carried Y-HG Q. Notably, 14 of the 15 samples carried none of the known Y-HG Q sub-haplogroups (Q1, Q2, Q3 and Q4), defined by the biallelic markers M120, M25/M143, M3 and M346, respectively (Table ). Only one individual carried the M120 polymorphism, representing the Q1 lineage.
Table 1. Indian populations screened for Y-HG Q and its distribution.
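As a rough illustration of how much uncertainty attaches to a frequency based on 15 carriers, the sketch below computes the observed proportion of Y-HG Q and a Wilson score interval around it. The choice of the Wilson interval and the function name `wilson_interval` are assumptions made for illustration; the study itself reports only the point estimate.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# Counts taken from the text: 15 Y-HG Q carriers among 630 screened samples.
carriers, screened = 15, 630
frequency = carriers / screened
lower, upper = wilson_interval(carriers, screened)

print(f"Y-HG Q frequency: {100 * frequency:.2f}%")
print(f"approx. 95% interval: {100 * lower:.2f}% to {100 * upper:.2f}%")
```

With so few carriers the interval is wide relative to the point estimate, which is worth keeping in mind when comparing frequencies between population groups.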
Further, a novel 4 bp del/ins polymorphism (rs41352448, details provided in Additional file 1) at position 72,314 of the human arylsulfatase D pseudogene (ARSDP), found in 5/15 individuals (33.3% of the Y-HG Q samples observed in the study), indicated the presence of a novel sub-group, Q5, of Y-HG Q. To establish that this polymorphism is exclusive to Y-HG Q, we screened some of our samples already assigned to unrelated haplogroups such as R1a1, R2, L, H1, J and C. The presence of the ancestral allele (without the insertion at position 72,314 of ARSDP) in these samples confirmed that the novel polymorphism is restricted to Y-HG Q. We also screened the sample carrying the derived state at M120 (representing Q1) with the ss4 bp marker and found the polymorphism to be absent. To assign independent status to the designated Q5 and to confirm the placement of M346-derived samples as Q4 [4], it was necessary to test the novel ss4 bp marker (Q5) in M346-derived samples (Table ). Three such samples, provided on request, were screened for the ss4 bp polymorphism. Its absence in all three not only confirmed the authenticity of the Q4 lineage but also validated the independent status of Q5 (Figure ). In addition to the Q5 (5/15) and Q1 (1/15) samples, 9/15 (60%) individuals showed none of the known Q sub-group signatures and were designated Q*.
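The assignment logic described above can be sketched as a simple lookup from derived marker states to sub-group labels. This is a minimal illustration, assuming the sample has already been typed as Y-HG Q and that the sub-group markers are treated as mutually exclusive, as in the text; the function name `assign_q_subgroup`, the key `"ss4bp"` and the example marker calls are hypothetical.

```python
from collections import Counter
from typing import Mapping

# Marker -> sub-group, following the nomenclature used in the text.
SUBGROUP_MARKERS = (
    ("M120", "Q1"),
    ("M25", "Q2"),
    ("M143", "Q2"),
    ("M3", "Q3"),
    ("M346", "Q4"),
    ("ss4bp", "Q5"),  # the novel 4 bp del/ins polymorphism
)


def assign_q_subgroup(derived: Mapping[str, bool]) -> str:
    """Return the Q sub-group for a sample already typed as Y-HG Q.

    `derived` maps marker names to True when the derived allele is present.
    Samples with no derived sub-group marker are labelled Q*.
    """
    hits = {sub for marker, sub in SUBGROUP_MARKERS if derived.get(marker, False)}
    if not hits:
        return "Q*"
    if len(hits) > 1:
        # Assumption: markers are mutually exclusive, so flag conflicts.
        raise ValueError(f"conflicting sub-group markers: {sorted(hits)}")
    return hits.pop()


# Hypothetical marker calls that reproduce the counts reported in the text.
samples = [{"ss4bp": True}] * 5 + [{"M120": True}] + [{}] * 9
tally = Counter(assign_q_subgroup(s) for s in samples)
print(tally)
```

Raising an error on conflicting calls, rather than picking one label silently, keeps genotyping inconsistencies visible.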
To combine our observations with those reported earlier, we pooled data on 1615 Y chromosomes (630 from the present study and 985 from the literature) (Table ) for analysis; 21/1615 (1.3%) samples represented the Q lineage in India. All 15 of our samples and the 3 samples belonging to Y-HG Q4 [4] were genotyped for 12 Y-microsatellite markers (Additional file 2). For uniformity of evaluation, however, only the 10 Y-STR markers shared between the present study and published data on population groups from Central Asia and India were used to construct the median-joining network [5]. For most of the Indian Y-HG Q samples and sub-lineages, three clusters of Y-STR haplotypes were observed (Figure ). One cluster included all three Q4 samples and one Q*, another all the Q5 samples, and the third most of the Q* individuals. Most of the Indian Q (Q4, Q5 and Q*) Y-STR haplotypes were separated from the bulk of the Central Asian Q* haplotypes. Further, the clustering of the Y-STR haplotypes corroborated the findings from the biallelic markers. Whether this pattern reflects population differentiation or the presence of these clusters in the ancestral migration from Central Asia is not clear at the moment. The increased diversity within the Indian population clusters could be interpreted as the overall effect of geographical differentiation, population expansions and severe bottlenecks, resulting in the loss of many intermediate haplotypes and thereby reducing reticulation and increasing branch lengths. It could also result from independent migrations and admixture.
The age estimates are based on a small sample; increasing the sample size is not feasible at present, given the very low frequency of Y-HG Q in India. The age of haplogroup Q in India was estimated from the larger cluster, comprising Q* and Q5, in the median-joining network. The calculated age of 47,101.5 (95% CI 34,210.5 – 75,581.4) years appears to be an overestimate relative to the age of haplogroup Q (15,000–18,000 years before present) reported in the literature [2, 3, 6]. This has probably resulted from enhanced diversity, itself an effect of population expansions and severe bottlenecks, or possibly of later migrations and admixture. Estimating the age of Y-HG Q5 alone, using similar parameters, gave 14,492.7 (95% CI 10,526.3 – 23,255.8) years.
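The text does not spell out the calculation behind these ages. One widely used approach, which may or may not be the one applied here, dates a lineage from the mean variance of Y-STR repeat numbers divided by a per-generation mutation rate. The sketch below assumes the evolutionary effective rate of 6.9 × 10⁻⁴ per locus per 25 years (Zhivotovsky et al. 2004) with bounds of 4.3 × 10⁻⁴ and 9.5 × 10⁻⁴; the bounds, the function names and the input variance of 0.4 are hypothetical values chosen only for illustration.

```python
YEARS_PER_GENERATION = 25.0

# Assumed evolutionary effective mutation rate per locus per generation,
# with assumed lower and upper bounds (not taken from the source text).
RATE = 6.9e-4
RATE_LOW, RATE_HIGH = 4.3e-4, 9.5e-4


def age_from_variance(mean_variance: float, rate: float) -> float:
    """Age in years = generation time * mean STR variance / mutation rate."""
    if rate <= 0:
        raise ValueError("mutation rate must be positive")
    return YEARS_PER_GENERATION * mean_variance / rate


def age_with_interval(mean_variance: float) -> tuple[float, float, float]:
    """Return (point estimate, younger bound, older bound) in years."""
    point = age_from_variance(mean_variance, RATE)
    younger = age_from_variance(mean_variance, RATE_HIGH)  # faster rate -> younger
    older = age_from_variance(mean_variance, RATE_LOW)     # slower rate -> older
    return point, younger, older


# Hypothetical mean variance across loci, for illustration only.
point, younger, older = age_with_interval(0.4)
print(f"age: {point:,.1f} years ({younger:,.1f} to {older:,.1f})")
```

Because the age scales linearly with the variance, any process that inflates haplotype diversity within a cluster, such as admixture from a separate migration, inflates the estimate in the same proportion.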
The compiled distribution of Y-HG Q in Indian populations (Table ), from the present study as well as from the literature, shows that this haplogroup is widely distributed despite its low frequency, ranging from Indo-European castes and tribes to their Dravidian counterparts. These observations could be explained either by an ancestral relationship among different Indian population groups, irrespective of linguistic and social divisions, or by some degree of recent gene flow between these groups; which of the two applies is not clear at the moment.