Related Articles

1.  PanSNPdb: The Pan-Asian SNP Genotyping Database 
PLoS ONE  2011;6(6):e21451.
The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP.
doi:10.1371/journal.pone.0021451
PMCID: PMC3121791  PMID: 21731755
2.  Reconstructing Roma History from Genome-Wide Data 
PLoS ONE  2013;8(3):e58633.
The Roma people, living throughout Europe and West Asia, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1,000–1,500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry–derived from a combination of European and South Asian sources–and that the date of admixture of South Asian and European ancestry was about 850 years before present. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which appears to have been followed by a major demographic expansion after the arrival in Europe.
doi:10.1371/journal.pone.0058633
PMCID: PMC3596272  PMID: 23516520
3.  Insights into the Genetic Structure and Diversity of 38 South Asian Indians from Deep Whole-Genome Sequencing 
PLoS Genetics  2014;10(5):e1004377.
South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language–speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed a higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness of SSIP to two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Author Summary
Indians of South Asia have long been a population of interest to a wide audience, due to their unique diversity. We have deep-sequenced 38 individuals of Indian descent residing in Singapore (SSIP) in an effort to illustrate their diversity from a whole-genome standpoint. Indeed, among Asians in our population panel, SSIP was the most diverse, followed by the Malays in Singapore (SSMP). Their diversity is further observed in the population's Y-chromosome and mitochondrial haplogroup profiles; individuals with European-dominant haplogroups had a greater proportion of European admixture. Among variants (single nucleotide polymorphisms and small insertions/deletions) discovered in SSIP, 21.69% were novel with respect to previous sequencing projects. In addition, some 14 loss-of-function variants (LOFs) were associated with cancer, Type II diabetes, and cholesterol levels. Finally, a D-statistic test with archaic hominins indicated greater gene flow to East Asians than to South Asians.
doi:10.1371/journal.pgen.1004377
PMCID: PMC4022468  PMID: 24832686
4.  Genetic variation in South Indian castes: evidence from Y-chromosome, mitochondrial, and autosomal polymorphisms 
BMC Genetics  2008;9:86.
Background
Major population movements, social structure, and caste endogamy have influenced the genetic structure of Indian populations. An understanding of these influences is increasingly important as gene mapping and case-control studies are initiated in South Indian populations.
Results
We report new data on 155 individuals from four Tamil caste populations of South India and perform comparative analyses with caste populations from the neighboring state of Andhra Pradesh. Genetic differentiation among Tamil castes is low (RST = 0.96% for 45 autosomal short tandem repeat (STR) markers), reflecting a largely common origin. Nonetheless, caste- and continent-specific patterns are evident. For 32 lineage-defining Y-chromosome SNPs, Tamil castes show higher affinity to Europeans than to eastern Asians, and genetic distance estimates to the Europeans are ordered by caste rank. For 32 lineage-defining mitochondrial SNPs and hypervariable sequence (HVS) 1, Tamil castes have higher affinity to eastern Asians than to Europeans. For 45 autosomal STRs, upper and middle rank castes show higher affinity to Europeans than do lower rank castes from either Tamil Nadu or Andhra Pradesh. Local between-caste variation (Tamil Nadu RST = 0.96%, Andhra Pradesh RST = 0.77%) exceeds the estimate of variation between these geographically separated groups (RST = 0.12%). Low, but statistically significant, correlations between caste rank distance and genetic distance are demonstrated for Tamil castes using Y-chromosome, mtDNA, and autosomal data.
Conclusion
Genetic data from Y-chromosome, mtDNA, and autosomal STRs are in accord with historical accounts of northwest to southeast population movements in India. The influence of ancient and historical population movements and caste social structure can be detected and replicated in South Indian caste populations from two different geographic regions.
doi:10.1186/1471-2156-9-86
PMCID: PMC2621241  PMID: 19077280
5.  Autosomal and uniparental portraits of the native populations of Sakha (Yakutia): implications for the peopling of Northeast Eurasia 
Background
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
Results
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Conclusions
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
doi:10.1186/1471-2148-13-127
PMCID: PMC3695835  PMID: 23782551
mtDNA; Y chromosome; Autosomal SNPs; Sakha
6.  Power to Detect Risk Alleles Using Genome-Wide Tag SNP Panels 
PLoS Genetics  2007;3(10):e170.
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ∼ 1.8–2.0). Relative risks as low as λ ∼ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
Author Summary
Advances in high-throughput genotyping technology and the International HapMap Project have enabled genetic association studies at the whole-genome level. Our paper describes two genome-wide SNP panels that contain tag SNPs derived from the International HapMap Project. Tag SNPs are proxies for groups of highly correlated SNPs. Information can be captured for the entire group of correlated SNPs by genotyping only one representative SNP, the tag SNP. These whole-genome SNP panels also contain additional content thought to be overrepresented in disease, such as amino acid–changing nonsynonymous SNPs and mitochondrial SNPs. We show that these panels cover the genome with very high efficiency as measured by coverage of all HapMap SNPs and a set of SNPs derived from completely resequenced genes from the Seattle SNPs database. We also show that these panels have high power to detect disease risk alleles for both HapMap and non-HapMap SNPs. In complex disease where multiple risk alleles are believed to be involved, we show that the ability to detect at least one risk allele with the tag SNP panels is also high.
doi:10.1371/journal.pgen.0030170
PMCID: PMC2000969  PMID: 17922574
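The multi-locus claim in the abstract above can be illustrated with a small calculation. The following is a minimal sketch, not the authors' power model: it assumes the loci are detected independently and that each has the same per-locus power, and the function names are hypothetical. Under those assumptions, the chance of detecting at least one of k loci is 1 - (1 - p)^k, so the per-locus power needed to reach a given overall power drops as k grows.

```python
def power_at_least_one(per_locus_power: float, n_loci: int) -> float:
    """Probability of detecting at least one of n_loci loci.

    Assumes (for illustration only) independent loci with equal power.
    """
    if not 0.0 <= per_locus_power <= 1.0:
        raise ValueError("per_locus_power must be in [0, 1]")
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    return 1.0 - (1.0 - per_locus_power) ** n_loci


def per_locus_power_needed(target_power: float, n_loci: int) -> float:
    """Per-locus power required so that at least one of n_loci
    independent, equally powered loci is detected with target_power."""
    if not 0.0 <= target_power <= 1.0:
        raise ValueError("target_power must be in [0, 1]")
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    return 1.0 - (1.0 - target_power) ** (1.0 / n_loci)


if __name__ == "__main__":
    # Per-locus power needed for 80% overall power with 1 to 4 loci.
    for k in range(1, 5):
        needed = per_locus_power_needed(0.80, k)
        print(f"{k} loci: per-locus power needed = {needed:.3f}")
```

With two loci, roughly 55% per-locus power suffices for 80% overall power, and with four loci roughly 33%. Because a lower per-locus power requirement corresponds to a smaller detectable effect at fixed sample size, this is consistent in direction with the abstract's statement that lower relative risks become detectable when several loci are involved; the exact 20%–35% figure depends on the authors' disease models and is not reproduced here.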
8.  The Influence of Natural Barriers in Shaping the Genetic Structure of Maharashtra Populations 
PLoS ONE  2010;5(12):e15283.
Background
The geographical position of Maharashtra state makes it essential to the study of the dispersal of modern humans in South Asia. Several hypotheses have been proposed to explain the cultural, linguistic and geographical affinity of the populations living in Maharashtra state with other South Asian populations. The genetic origin of populations living in this state is poorly understood and has hitherto been described only at a low level of molecular resolution.
Methodology/Principal Findings
To address this issue, we have analyzed the mitochondrial DNA (mtDNA) of 185 individuals and the NRY (non-recombining region of the Y chromosome) of 98 individuals belonging to two major tribal populations of Maharashtra, and compared their molecular variation with that of 54 contemporary South Asian populations of adjacent states. Inter- and intra-population comparisons reveal that the maternal gene pool of Maharashtra state populations is composed mainly of South Asian haplogroups with traces of east and west Eurasian haplogroups, while the paternal haplogroups comprise South Asian haplogroups as well as a signature of the Near Eastern-specific haplogroup J2a.
Conclusions/Significance
Our analysis suggests that Indian populations, including those of Maharashtra state, are largely derived from ancient Paleolithic settlers; however, a more recent (∼10 Ky older) detectable paternal gene flow from west Asia is well reflected in the present study. These findings reveal movement of populations to Maharashtra along the western coast rather than through the mainland, where the Western Ghats-Vindhya Mountains and the Narmada-Tapti rivers might have acted as natural barriers. Comparing the Maharashtrian populations with other South Asian populations reveals that they have a closer affinity with South Indian than with Central Indian populations.
doi:10.1371/journal.pone.0015283
PMCID: PMC3004917  PMID: 21187967
9.  A Panel of Ancestry Informative Markers for the Complex Five-Way Admixed South African Coloured Population 
PLoS ONE  2013;8(12):e82224.
Admixture is a well-known confounder in genetic association studies. If genome-wide data are not available, as would be the case for candidate gene studies, ancestry informative markers (AIMs) are required in order to adjust for admixture. The predominant population group in the Western Cape, South Africa, is the admixed group known as the South African Coloured (SAC). A small set of AIMs that is optimized to distinguish between the five source populations of this population (African San, African non-San, European, South Asian, and East Asian) will enable researchers to cost-effectively reduce false-positive findings resulting from ignoring admixture in genetic association studies of the population. Using genome-wide data to find SNPs with large allele frequency differences between the source populations of the SAC, as quantified by Rosenberg et al.'s I_n statistic, we developed a panel of AIMs by experimenting with various selection strategies. Subsets of different sizes were evaluated by measuring the correlation between ancestry proportions estimated by each AIM subset and ancestry proportions estimated using genome-wide data. We show that a panel of 96 AIMs can be used to assess ancestry proportions and to adjust for the confounding effect of the complex five-way admixture that occurred in the South African Coloured population.
doi:10.1371/journal.pone.0082224
PMCID: PMC3869660  PMID: 24376522
10.  Examining markers in 8q24 to explain differences in evidence for association with cleft lip with/without cleft palate between Asians and Europeans 
Genetic epidemiology  2012;36(4):392-399.
In a recent genome wide association study (GWAS) from an international consortium, evidence of linkage and association in chr8q24 was much stronger among non-syndromic cleft lip/palate (CL/P) case-parent trios of European ancestry than among trios of Asian ancestry. We examined marker information content and haplotype diversity across 13 recruitment sites (from Europe, USA and Asia) separately, and conducted principal components analysis (PCA) on parents. As expected, PCA revealed large genetic distances between Europeans and Asians, and a north-south cline from Korea to Singapore in Asia, with Filipino parents forming a somewhat distinct Southeast Asian cluster. Hierarchical clustering of SNP heterozygosity revealed two major clades consistent with PCA results. All genotyped SNPs giving p < 10^-6 in the allelic TDT showed higher heterozygosity in Europeans than Asians. On average, European ancestry parents had higher haplotype diversity than Asians. Imputing additional variants across chr8q24 increased the strength of statistical evidence among Europeans and also revealed a significant signal among Asians (although it did not reach genome-wide significance). Tests for SNP-population interaction were negative, indicating the lack of strong signal for 8q24 in families of Asian ancestry was not due to any distinct genetic effect, but could simply reflect low power due to lower allele frequencies in Asians.
doi:10.1002/gepi.21633
PMCID: PMC3615645  PMID: 22508319
cleft lip with/without cleft palate; 8q24; genome wide association; imputation
11.  Principal component analysis reveals the 1000 Genomes Project does not sufficiently cover the human genetic diversity in Asia 
Frontiers in Genetics  2013;4:127.
The 1000 Genomes Project (1KG) aims to provide a comprehensive resource on human genetic variation. By sequencing 2,500 individuals, 1KG is expected to cover the majority of human genetic diversity worldwide. In this study, using analysis of population structure based on genome-wide single nucleotide polymorphism (SNP) data, we examined and evaluated the coverage of genetic diversity by the 1KG samples against the available genome-wide SNP data of 3,831 individuals representing 140 population samples worldwide. We developed a method to quantitatively measure and evaluate the genetic diversity revealed by population structure analysis. Our results showed that 1KG does not have sufficient coverage of human genetic diversity in Asia, especially in Southeast Asia. We suggest that good coverage of Southeast Asian populations be considered in 1KG, or that a regional effort be initiated to provide a more comprehensive characterization of human genetic diversity in Asia, which is important for both evolutionary and medical studies in the future.
doi:10.3389/fgene.2013.00127
PMCID: PMC3701331  PMID: 23847652
human genetic diversity; population structure; 1000 Genomes Project; Pan-Asian SNP Project; Human Genome Diversity Project; single nucleotide polymorphisms; principal component analysis
12.  The Light Skin Allele of SLC24A5 in South Asians and Europeans Shares Identity by Descent 
PLoS Genetics  2013;9(11):e1003912.
Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex, but most importantly, reflects strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22–28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India.
Author Summary
Human skin color is one of the most visible aspects of human diversity. The genetic basis of pigmentation in Europeans has been understood to some extent, but our knowledge about South Asians has been restricted to a handful of studies. It has been suggested that a single nucleotide difference in SLC24A5 accounts for 25–38% European-African pigmentation differences and correlates with lighter skin. This genetic variant has also been associated with skin color variation among South Asians living in the UK. Here, we report a study based on a homogenous cohort of South India. Our results confirm that SLC24A5 plays a key role in pigmentation diversity of South Asians. Country-wide screening of the variant reveals that the light skin associated allele is widespread in the Indian subcontinent and its complex patterning is shaped by a combination of processes involving selection and demographic history of the populations. By studying the variation of SLC24A5 sequences among a diverse set of individuals, we show that the light skin associated allele in South Asians is identical by descent to that found in Europeans. Our study also provides new insights into positive selection acting on the gene and the evolutionary history of light skin in humans.
doi:10.1371/journal.pgen.1003912
PMCID: PMC3820762  PMID: 24244186
13.  Genetic studies of human diversity in East Asia 
East Asia is one of the most important regions for studying the evolution and genetic diversity of human populations. Recognizing the relevance of characterizing the genetic diversity and structure of East Asian populations for understanding their genetic history and for designing and interpreting genetic studies of human diseases, researchers in China have in recent years made substantial efforts to collect samples and generate data, especially for markers on Y chromosomes and mtDNA. The hallmark of these efforts is the discovery and confirmation of a consistent distinction between northern and southern East Asian populations at genetic markers across the genome. With the confirmation of an African origin for East Asian populations and the observation of a dominating impact of the gene flow entering East Asia from the south in early human settlement, interpretation of the north–south division in this context poses a challenge to the field. Other areas of interest that have been studied include the gene flow between East Asia and its neighbouring regions (i.e. Central Asia, the Sub-continent, America and the Pacific Islands), the origin of Sino-Tibetan populations and the expansion of the Chinese.
doi:10.1098/rstb.2007.2028
PMCID: PMC2435565  PMID: 17317646
East Asian populations; genetic structure; origin of modern humans; migrations; gene flow; admixture
14.  Dispersal, Mating Events and Fine-Scale Genetic Structure in the Lesser Flat-Headed Bats 
PLoS ONE  2013;8(1):e54428.
Population genetic structure has important consequences in evolutionary processes and conservation genetics in animals. Fine-scale population genetic structure depends on the pattern of landscape, the permanent movement of individuals, and the dispersal of their genes during temporary mating events. The lesser flat-headed bat (Tylonycteris pachypus) is a nonmigratory Asian bat species that roosts in small groups within the internodes of bamboo stems and the habitats are fragmented. Our previous parentage analyses revealed considerable extra-group mating in this species. To assess the spatial limits and sex-biased nature of gene flow in the same population, we used 20 microsatellite loci and mtDNA sequencing of the ND2 gene to quantify genetic structure among 54 groups of adult flat-headed bats, at nine localities in South China. AMOVA and FST estimates revealed significant genetic differentiation among localities. Alternatively, the pairwise FST values among roosting groups appeared to be related to the incidence of associated extra-group breeding, suggesting the impact of mating events on fine-scale genetic structure. Global spatial autocorrelation analyses showed positive genetic correlation for up to 3 km, indicating the role of fragmented habitat and the specialized social organization as a barrier in the movement of individuals among bamboo forests. The male-biased dispersal pattern resulted in weaker spatial genetic structure between localities among males than among females, and fine-scale analyses supported that relatedness levels within internodes were higher among females than among males. Finally, only females were more related to their same sex roost mates than to individuals from neighbouring roosts, suggestive of natal philopatry in females.
doi:10.1371/journal.pone.0054428
PMCID: PMC3548791  PMID: 23349888
15.  Identification of Close Relatives in the HUGO Pan-Asian SNP Database 
PLoS ONE  2011;6(12):e29502.
The HUGO Pan-Asian SNP Consortium has recently released a genome-wide dataset, which consists of 1,719 DNA samples collected from 71 Asian populations. For studies of human population genetics such as genetic structure and migration history, this provided the most comprehensive large-scale survey of genetic variation to date in East and Southeast Asia. However, although considered in the analysis, close relatives were not clearly reported in the original paper. Here we performed a systematic analysis of genetic relationships among individuals from the Pan-Asian SNP (PASNP) database and identified 3 pairs of monozygotic twins or duplicate samples, 100 pairs of first-degree relatives and 161 pairs of second-degree relatives. Based on the relationships inferred in this study, we suggest three standardized subsets with different levels of unrelated individuals (denoted PASNP1716, PASNP1640 and PASNP1583, respectively) for future applications of the samples in most types of population-genetics studies. In addition, we provide gender information for the PASNP samples, which was not included in the original dataset, based on analysis of X chromosome data.
doi:10.1371/journal.pone.0029502
PMCID: PMC3248454  PMID: 22242128
16.  Evidence of recent natural selection on the Southeast Asian deletion (--SEA) causing α-thalassemia in South China 
Background
The Southeast Asian deletion (--SEA) is the most commonly observed mutation among diverse α-thalassemia alleles in Southeast Asia and South China. It is generally argued that mutation --SEA, like other variants causing hemoglobin disorders, is associated with protection against malaria that is endemic in these regions. However, little evidence has been provided to support this claim.
Results
We first examined the genetic imprint of recent positive selection on the --SEA allele and flanking sequences in the human α-globin cluster, covering a genomic region spanning ~410 kb, by genotyping 28 SNPs in a Chinese population consisting of 76 --SEA heterozygotes and 138 normal individuals. The pattern of linkage disequilibrium (LD) and the long-range haplotype test revealed a signature of positive selection. The network of inferred haplotypes suggested a single origin of the --SEA allele.
Conclusions
Thus, our data support the hypothesis that the --SEA allele has been subjected to recent balancing selection, triggered by malaria.
doi:10.1186/1471-2148-13-63
PMCID: PMC3626844  PMID: 23497175
17.  Fine-scale mapping of meiotic recombination in Asians 
BMC Genetics  2013;14:19.
Background
Meiotic recombination causes a shuffling of homologous chromosomes as they are passed from parents to children. Finding the genomic locations where these crossovers occur is important for genetic association studies, understanding population genetic variation, and predicting disease-causing structural rearrangements. There have been several reports that recombination hotspot usage differs between human populations. But while fine-scale genetic maps exist for European and African populations, none have been constructed for Asians.
Results
Here we present the first Asian genetic map with resolution high enough to reveal hotspot usage. We constructed this map by applying a hidden Markov model to genotype data for over 500,000 single nucleotide polymorphism markers from Korean and Mongolian pedigrees which include 980 meioses. We identified 32,922 crossovers with a precision rate of 99%, 97% sensitivity, and a median resolution of 105,949 bp. For direct comparison of genetic maps between ethnic groups, we also constructed a map for CEPH families using identical methods. We found high levels of concordance with known hotspots, with approximately 72% of recombination occurring in these regions. We investigated the hypothesized contribution of recombination problems to age-related aneuploidy. Our large sample size allowed us to detect a weak but significant negative effect of maternal age on recombination rate.
Conclusions
We have constructed the first fine-scale Asian genetic map. This fills an important gap in the understanding of recombination pattern variation and will be a valuable resource for future research in population genetics. Our map will improve the accuracy of linkage studies and inform the design of genome-wide association studies in the Asian population.
doi:10.1186/1471-2156-14-19
PMCID: PMC3599818  PMID: 23510153
Recombination; Hotspot; Asian; Genetic map
18.  Epidemiological study of ulcerative proctocolitis in Indian migrants and the indigenous population of Leicestershire. 
Gut  1992;33(5):687-693.
A retrospective epidemiological study of ulcerative colitis (UC) and proctitis was performed in Leicestershire from 1972-89. Potential cases were identified from hospital departments of pathology, endoscopy, and medical records and from general practitioners. The county population includes more than 93,000 South Asians. There were 573 cases of UC and 286 of proctitis in Europeans and 115 cases of UC and 29 of proctitis in South Asians. The standardised incidence of UC in Europeans and South Asians was stable, except in Sikhs in whom it had increased rapidly. The relative risk of UC to South Asians was 2.45. The standardised incidences of UC in South Asians during the 1980s were: 10.8/10^5/year in Hindus (95% confidence interval (CI) 7.4-14.1 cases/10^5/year), 16.5/10^5/year in Sikhs (95% CI 7.9-25.2 cases/10^5/year), and 6.2/10^5/year in Muslims (95% CI 1.6-10.9 cases/10^5/year). There was no difference in incidence between Asians from East Africa and India. The standardised incidence of UC in Europeans was 5.3/10^5/year (95% CI 4.3-6.3 cases/10^5/year). The standardised incidences of proctitis were 3.1/10^5/year (95% CI 1.9-2.5 cases/10^5/year) in South Asians and 2.3/10^5/year (95% CI 1.8-2.4 cases/10^5/year) in Europeans. Ethnic groups had a similar disease distribution, except Sikhs in whom it was less extensive. Despite the similar disease distribution, South Asians had fewer operations and complications from UC than Europeans. There was a bimodal age specific incidence in Europeans, but not in other ethnic groups. First and second generation South Asians were at similar risk. Hindus and Sikhs have a significantly higher incidence of UC than Europeans in Leicestershire.
PMCID: PMC1379303  PMID: 1307684
19.  Continent-Wide Decoupling of Y-Chromosomal Genetic Variation from Language and Geography in Native South Americans 
PLoS Genetics  2013;9(4):e1003460.
Numerous studies of human populations in Europe and Asia have revealed a concordance between their extant genetic structure and the prevailing regional pattern of geography and language. For native South Americans, however, such evidence has been lacking so far. Therefore, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other, in the largest study of South American natives to date in terms of sampled individuals and populations. A total of 1,011 individuals, representing 50 tribal populations from 81 settlements, were genotyped for up to 17 short tandem repeat (STR) markers and 16 single nucleotide polymorphisms (Y-SNPs), the latter resolving phylogenetic lineages Q and C. Virtually no structure became apparent for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships. This continent-wide decoupling is consistent with a rapid peopling of the continent followed by long periods of isolation in small groups. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America. Such haplotypes are virtually absent from North and Central America, but occur at high frequency in Asia. Together with the locally confined Y-STR autocorrelation observed in our study as a whole, the available data therefore suggest a late introduction of C3* into South America no more than 6,000 years ago, perhaps via coastal or trans-Pacific routes. Extensive simulations revealed that the observed lack of haplogroup C3* among extant North and Central American natives is only compatible with low levels of migration between the ancestor populations of C3* carriers and non-carriers. 
In summary, our data highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions, most of which are likely not to have been met by the ancestors of native South Americans.
Author Summary
In the largest population genetic study of South Americans to date, we analyzed the Y-chromosomal makeup of more than 1,000 male natives. We found that the male-specific genetic variation of Native Americans lacks any clear structure that could sensibly be related to their geographic and/or linguistic relationships. This finding is consistent with a rapid initial peopling of South America, followed by long periods of isolation in small tribal groups. The observed continent-wide decoupling of geography, spoken language, and genetics contrasts strikingly with previous reports of such correlation from many parts of Europe and Asia. Moreover, we identified a cluster of Native American founding lineages of Y chromosomes, called C-M217 (C3*), within a restricted area of Ecuador in North-Western South America. The same haplogroup occurs at high frequency in Central, East, and North East Asia, but is virtually absent from North (except Alaska) and Central America. Possible scenarios for the introduction of C-M217 (C3*) into Ecuador may thus include a coastal or trans-Pacific route, an idea also supported by occasional archeological evidence and the recent coalescence of the C3* haplotypes, estimated from our data to have occurred some 6,000 years ago.
doi:10.1371/journal.pgen.1003460
PMCID: PMC3623769  PMID: 23593040
20.  Can novel Apo A-I polymorphisms be responsible for low HDL in South Asian immigrants? 
Coronary artery disease (CAD) is the leading cause of death in the world. Even though its rates have decreased worldwide over the past 30 years, event rates are still high in South Asians. South Asians are known to have low high-density lipoprotein (HDL) levels. The objective of this study was to identify polymorphisms in Apolipoprotein A-I (Apo A-I), the main protein component of HDL, and to explore their association with low HDL levels in South Asians. A pilot study of 30 South Asians was conducted; 12-h fasting samples were taken for measurement of C-reactive protein, total cholesterol, HDL, low-density lipoprotein (LDL), triglycerides, lipoprotein (a), insulin and glucose levels, and for DNA extraction and sequencing of the Apo A-I gene. DNA sequencing revealed six novel Apo A-I single nucleotide polymorphisms (SNPs) in South Asians, one of which (rs35293760, C938T) was significantly associated with low (<40 mg/dl) HDL levels (P = 0.004). The association was also seen with total cholesterol (P = 0.026) and LDL levels (P = 0.032). This pilot work has highlighted some of the gene-environment associations that could be responsible for low HDL and possibly for excess CAD in South Asians. Further, larger studies are required to explore and uncover these associations that could be responsible for excess CAD risk in South Asians.
doi:10.4103/0971-6866.42321
PMCID: PMC2840779  PMID: 20300285
Coronary artery diseases; high density lipoprotein; lipids; risk factors; South Asians
21.  Austro-Asiatic Tribes of Northeast India Provide Hitherto Missing Genetic Link between South and Southeast Asia 
PLoS ONE  2007;2(11):e1141.
Northeast India, the only region which currently forms a land bridge between the Indian subcontinent and Southeast Asia, has been proposed as an important corridor for the initial peopling of East Asia. Given that the Austro-Asiatic linguistic family is considered to be the oldest and is spoken by certain tribes in India, Northeast India and the whole of Southeast Asia, we expect that populations of this family from Northeast India should provide the signatures of a genetic link between Indian and Southeast Asian populations. In order to test this hypothesis, we analyzed mtDNA and Y-Chromosome SNP and STR data of the eight groups of the Austro-Asiatic Khasi from Northeast India and the neighboring Garo and compared them with those of other relevant Asian populations. The results suggest that the Austro-Asiatic Khasi tribes of Northeast India represent a genetic continuity between the populations of South and Southeast Asia, thereby supporting the view that Northeast India could have been a major corridor for the movement of populations from India to East/Southeast Asia.
doi:10.1371/journal.pone.0001141
PMCID: PMC2065843  PMID: 17989774
22.  Evaluating the possibility of detecting evidence of positive selection across Asia with sparse genotype data from the HUGO Pan-Asian SNP Consortium 
BMC Genomics  2014;15(1):332.
Background
The HUGO Pan-Asian SNP Consortium (PASNP) has generated a genetic resource of almost 55,000 autosomal single nucleotide polymorphisms (SNPs) across more than 1,800 individuals from 73 urban and indigenous populations in Asia. This has offered valuable insights into the correlation between the genetic ancestry of these populations with major linguistic systems and geography. Here, we attempt to understand whether adaptation to local climate, diet and environment partly explains the genetic variation present in these populations by investigating the genomic signatures of positive selection.
Results
To evaluate the impact on the selection analyses of the considerably lower SNP density compared to other population genetics resources such as the International HapMap Project (HapMap) or the Singapore Genome Variation Project, we evaluated the extent of haplotype phasing switch errors and the consistency of selection signals from three haplotype-based approaches (iHS, XP-EHH, haploPS) when the HapMap data is thinned to a similar density as PASNP. We subsequently applied haploPS to detect and characterize positive selection in the PASNP populations, identifying 59 genomic regions that were selected in at least one PASNP population. A cluster analysis on the basis of these 59 signals showed that indigenous populations such as the Negrito from Malaysia and Philippines, the China Hmong, and the Taiwan Ami and Atayal shared more of these signals. We also reported evidence of a positive selection signal encompassing the beta globin gene in the Taiwan Ami and Atayal that was distinct from the signal in the HapMap Africans, suggesting the possibility of convergent evolution at this locus due to malarial selection.
Conclusions
We established that the lower SNP content of the PASNP data conferred weaker ability to detect signatures of positive selection, but the availability of the new approach haploPS retained modest power. Out of all the populations in PASNP, we identified only 59 signals, suggesting a strong need for high-density population-level genotyping data or sequencing data in order to achieve a comprehensive survey of positive selection in Asian populations.
doi:10.1186/1471-2164-15-332
PMCID: PMC4035063  PMID: 24885517
Haplotype phasing; Positive selection; Population structure; Genetic diversity
23.  Detailed Analysis of Gene Polymorphisms Associated with Ischemic Stroke in South Asians 
PLoS ONE  2013;8(3):e57305.
The burden of stroke is disproportionately high in the South Asian subcontinent with South Asian ethnicity conferring a greater risk of ischemic stroke than European ancestry regardless of country inhabited. While genes associated with stroke in European populations have been investigated, they remain largely unknown in South Asians. We conducted a comprehensive meta-analysis of known genetic polymorphisms associated with South Asian ischemic stroke, and compared effect size of the MTHFR C677T-stroke association with effect sizes predicted from homocysteine-stroke association. Electronic databases were searched up to August 2012 for published case control studies investigating genetic polymorphisms associated with ischemic stroke in South Asians. Pooled odds ratios (OR) for each gene-disease association were calculated using a random-effects model. We identified 26 studies (approximately 2529 stroke cases and 2881 controls) interrogating 33 independent genetic polymorphisms in 22 genes. Ten studies described MTHFR C677T (108 with TT genotype and 2018 with CC genotype) -homocysteine relationship and six studies (735 stroke cases and 713 controls) described homocysteine-ischemic stroke relationship. Risk association ORs were calculated for ACE I/D (OR 5.00; 95% CI, 1.17–21.37; p = 0.03), PDE4D SNP 83 (OR 2.20; 95% CI 1.21–3.99; p = 0.01), PDE4D SNP 32 (OR 1.57; 95% CI 1.01–2.45, p = 0.045) and IL10 G1082A (OR 1.44; 95% CI, 1.09–1.91, p = 0.01). Significant association was observed between elevated plasma homocysteine levels and MTHFR/677 TT genotypes in healthy South Asians (Mean difference (ΔX) 5.18 µmol/L; 95% CI 2.03–8.34: p = 0.001). Our results demonstrate that the genetic etiology of ischemic stroke in South Asians is broadly similar to the risk conferred in Europeans, although the dataset is considerably smaller and warrants the same clinical considerations for risk profiling.
doi:10.1371/journal.pone.0057305
PMCID: PMC3591429  PMID: 23505425
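The abstract above reports pooled odds ratios from a random-effects model without naming the estimator. As a minimal sketch of what such pooling can look like, the Python below uses the DerSimonian-Laird method, which is an assumption here; the function name `pool_random_effects` and the input values are hypothetical and are not data from the study.

```python
import math


def pool_random_effects(log_ors, variances):
    """DerSimonian-Laird pooling of per-study log odds ratios.

    Returns (pooled OR, lower 95% CI, upper 95% CI, tau^2).
    """
    # Fixed-effect (inverse-variance) weights and estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sw

    # Cochran's Q and the method-of-moments between-study variance.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    k = len(log_ors)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return (
        math.exp(pooled),
        math.exp(pooled - 1.96 * se),
        math.exp(pooled + 1.96 * se),
        tau2,
    )


if __name__ == "__main__":
    # Made-up inputs for illustration only: three studies' log ORs and variances.
    example_log_ors = [math.log(1.2), math.log(1.6), math.log(1.5)]
    example_vars = [0.04, 0.09, 0.06]
    print(pool_random_effects(example_log_ors, example_vars))
```

When the studies agree closely, the estimated between-study variance falls to zero and the result matches a fixed-effect inverse-variance pool.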
24.  Detailed Analysis of Japanese Population Substructure with a Focus on the Southwest Islands of Japan 
PLoS ONE  2012;7(4):e35000.
Uncovering population structure is important for properly conducting association studies and for examining the demographic history of a population. Here, we examined the Japanese population substructure using data from the Japan Multi-Institutional Collaborative Cohort (J-MICC), which covers all but the northern region of Japan. Using 222 autosomal loci from 4502 subjects, we investigated population substructure by estimating FST among populations, testing population differentiation, and performing principal component analysis (PCA) and correspondence analysis (CA). All analyses revealed a low but significant differentiation between the Amami Islanders and the mainland Japanese population. Furthermore, we examined the genetic differentiation between the mainland population, Amami Islanders and Okinawa Islanders using six loci included in both the Pan-Asian SNP (PASNP) consortium data and the J-MICC data. This analysis revealed that the Amami and Okinawa Islanders were differentiated from the mainland population. In conclusion, we revealed a low but significant level of genetic differentiation between the mainland population and populations in or to the south of the Amami Islands, although genetic variation between both populations might be clinal. Therefore, the possibility of population stratification must be considered when enrolling the islander population of this area, such as in the J-MICC study.
doi:10.1371/journal.pone.0035000
PMCID: PMC3318002  PMID: 22509376
25.  Is South Asian ethnicity an independent cardiovascular risk factor? 
People of South Asian origin constitute a large, visible minority in Canada and are known to be at heightened risk for premature coronary artery disease. Conventional risk factors clearly confer risk in South Asians but do not adequately explain their excess risk compared with other populations. Rates of smoking, hypertension and levels of low density lipoprotein-cholesterol tend to be similar or lower in South Asians, although diabetes is more prevalent. Recent studies have suggested that the metabolic syndrome and abdominal obesity may play a causative role in both the prevalence of diabetes and the premature atherosclerosis noted in South Asians. It is possible that genetically susceptible individuals develop abdominal obesity and insulin resistance when exposed to a toxic environment of reduced energy expenditure and increased caloric consumption. This pattern is increasingly noted in parallel with urbanization, suggesting that the increased cardiovascular risk in South Asians may be preventable through lifestyle interventions and the judicious use of medicines to attain optimal levels of blood pressure, lipids and glucose.
PMCID: PMC2528919  PMID: 16520847
Coronary artery disease; Ethnicity; Insulin resistance; Metabolic syndrome; Population health
