Related Articles

1.  Insights into the Genetic Structure and Diversity of 38 South Asian Indians from Deep Whole-Genome Sequencing 
PLoS Genetics  2014;10(5):e1004377.
South Asia possesses a significant amount of genetic diversity due to considerable intergroup differences in culture and language. There have been numerous reports on the genetic structure of Asian Indians, although these have mostly relied on genotyping microarrays or targeted sequencing of the mitochondria and Y chromosomes. Asian Indians in Singapore are primarily descendants of immigrants from Dravidian-language–speaking states in south India, and 38 individuals from the general population underwent deep whole-genome sequencing with a target coverage of 30X as part of the Singapore Sequencing Indian Project (SSIP). The genetic structure and diversity of these samples were compared against samples from the Singapore Sequencing Malay Project and populations in Phase 1 of the 1,000 Genomes Project (1 KGP). SSIP samples exhibited greater intra-population genetic diversity and possessed higher heterozygous-to-homozygous genotype ratio than other Asian populations. When compared against a panel of well-defined Asian Indians, the genetic makeup of the SSIP samples was closely related to South Indians. However, even though the SSIP samples clustered distinctly from the Europeans in the global population structure analysis with autosomal SNPs, eight samples were assigned to mitochondrial haplogroups that were predominantly present in Europeans and possessed higher European admixture than the remaining samples. An analysis of the relative relatedness between SSIP with two archaic hominins (Denisovan, Neanderthal) identified higher ancient admixture in East Asian populations than in SSIP. The data resource for these samples is publicly available and is expected to serve as a valuable complement to the South Asian samples in Phase 3 of 1 KGP.
Author Summary
Indians of South Asia have long been a population of interest to a wide audience, due to their unique diversity. We have deep-sequenced 38 individuals of Indian descent residing in Singapore (SSIP) in an effort to illustrate their diversity from a whole-genome standpoint. Indeed, among Asians in our population panel, SSIP was most diverse, followed by the Malays in Singapore (SSMP). Their diversity is further observed in the population's chromosome Y haplogroup and mitochondrial haplogroup profiles; individuals with European-dominant haplogroups had a greater proportion of European admixture. Among variants (single nucleotide polymorphisms and small insertions/deletions) discovered in SSIP, 21.69% were novel with respect to previous sequencing projects. In addition, some 14 loss-of-function variants (LOFs) were associated with cancer, Type II diabetes, and cholesterol levels. Finally, a D-statistic test with ancient hominids concurred that there was greater gene flow to East Asians than to South Asians.
doi:10.1371/journal.pgen.1004377
PMCID: PMC4022468  PMID: 24832686
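The D-statistic test mentioned in the author summary above can be illustrated with a minimal sketch. The abstract does not say how the statistic was computed, so this assumes the commonly used ABBA-BABA form, D = (nABBA − nBABA) / (nABBA + nBABA); the function name and the site counts are hypothetical.

```python
def d_statistic(n_abba: int, n_baba: int) -> float:
    """Return the ABBA-BABA D statistic from two site-pattern counts.

    A value near 0 is consistent with no excess allele sharing; the sign
    shows which of the two compared populations shares more derived
    alleles with the archaic genome. (Assumed standard form, not taken
    from the paper.)
    """
    total = n_abba + n_baba
    if total == 0:
        raise ValueError("no informative sites")
    return (n_abba - n_baba) / total


# Hypothetical counts, for illustration only.
print(d_statistic(60, 40))
```

In practice the significance of D is usually judged with a block-jackknife standard error, which this sketch omits.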
2.  Accuracy of genome-wide imputation of untyped markers and impacts on statistical power for association studies 
BMC Genetics  2009;10:27.
Background
Although high-throughput genotyping arrays have made whole-genome association studies (WGAS) feasible, only a small proportion of SNPs in the human genome are actually surveyed in such studies. In addition, various SNP arrays assay different sets of SNPs, which leads to challenges in comparing results and merging data for meta-analyses. Genome-wide imputation of untyped markers allows us to address these issues in a direct fashion.
Methods
384 Caucasian American liver donors were genotyped using Illumina 650Y (Ilmn650Y) arrays, from which we also derived genotypes from the Ilmn317K array. On these data, we compared two imputation methods: MACH and BEAGLE. We imputed 2.5 million HapMap Release22 SNPs, and conducted GWAS on ~40,000 liver mRNA expression traits (eQTL analysis). In addition, 200 Caucasian American and 200 African American subjects were genotyped using the Affymetrix 500 K array plus a custom 164 K fill-in chip. We then imputed the HapMap SNPs and quantified the accuracy by randomly masking observed SNPs.
Results
MACH and BEAGLE perform similarly with respect to imputation accuracy. The Ilmn650Y results in excellent imputation performance, and it outperforms Affx500K or Ilmn317K sets. For Caucasian Americans, 90% of the HapMap SNPs were imputed at 98% accuracy. As expected, imputation of poorly tagged SNPs (untyped SNPs in weak LD with typed markers) was not as successful. It was more challenging to impute genotypes in the African American population, given (1) shorter LD blocks and (2) admixture with Caucasian populations. To address issue (2), we pooled HapMap CEU and YRI data as an imputation reference set, which greatly improved overall performance. The approximately 40,000 phenotypes scored in these populations provide a path to determine empirically how the power to detect associations is affected by the imputation procedures. That is, at a fixed false discovery rate, the number of cis-eQTL discoveries detected by various methods can be interpreted as their relative statistical power in the GWAS. In this study, we find that imputation offers modest additional power (by 4%) on top of either Ilmn317K or Ilmn650Y, much less than the power gain from Ilmn317K to Ilmn650Y (13%).
Conclusion
Current algorithms can accurately impute genotypes for untyped markers, which enables researchers to pool data between studies conducted using different SNP sets. While genotyping itself results in a small error rate (e.g. 0.5%), imputing genotypes is surprisingly accurate. We found that dense marker sets (e.g. Ilmn650Y) outperform sparser ones (e.g. Ilmn317K) in terms of imputation yield and accuracy. We also noticed it was harder to impute genotypes for African American samples, partially due to population admixture, although using a pooled reference boosts performance. Interestingly, GWAS carried out using imputed genotypes only slightly increased power on top of assayed SNPs. This is likely because adding more markers via imputation yields only a modest gain in genetic coverage but worsens the multiple-testing penalty. Furthermore, cis-eQTL mapping using a dense SNP set derived from imputation achieves great resolution and locates association peaks closer to causal variants than the conventional approach.
doi:10.1186/1471-2156-10-27
PMCID: PMC2709633  PMID: 19531258
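The masking evaluation described in the Methods above (hide observed genotypes, impute them, compare) reduces to a concordance calculation. The following is a minimal sketch under that reading; the function name and the 0/1/2 allele-count coding are assumptions, and the imputation step itself (MACH or BEAGLE in the paper) is not reproduced here.

```python
def concordance(true_calls, imputed_calls):
    """Fraction of masked genotypes that the imputation recovered exactly.

    Both arguments are equal-length sequences of genotype calls for the
    masked positions only, coded here as 0, 1 or 2 copies of the minor
    allele (an assumed coding).
    """
    if len(true_calls) != len(imputed_calls):
        raise ValueError("call lists must have the same length")
    if not true_calls:
        raise ValueError("no masked genotypes to compare")
    matches = sum(t == i for t, i in zip(true_calls, imputed_calls))
    return matches / len(true_calls)


# Hypothetical masked calls: three of four are imputed correctly.
print(concordance([0, 1, 2, 1], [0, 1, 2, 0]))
```

Per-SNP or per-sample breakdowns follow by grouping the masked positions before calling the function.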
3.  The Light Skin Allele of SLC24A5 in South Asians and Europeans Shares Identity by Descent 
PLoS Genetics  2013;9(11):e1003912.
Skin pigmentation is one of the most variable phenotypic traits in humans. A non-synonymous substitution (rs1426654) in the third exon of SLC24A5 accounts for lighter skin in Europeans but not in East Asians. A previous genome-wide association study carried out in a heterogeneous sample of UK immigrants of South Asian descent suggested that this gene also contributes significantly to skin pigmentation variation among South Asians. In the present study, we have quantitatively assessed skin pigmentation for a largely homogeneous cohort of 1228 individuals from the Southern region of the Indian subcontinent. Our data confirm significant association of rs1426654 SNP with skin pigmentation, explaining about 27% of total phenotypic variation in the cohort studied. Our extensive survey of the polymorphism in 1573 individuals from 54 ethnic populations across the Indian subcontinent reveals wide presence of the derived-A allele, although the frequencies vary substantially among populations. We also show that the geospatial pattern of this allele is complex, but most importantly, reflects strong influence of language, geography and demographic history of the populations. Sequencing 11.74 kb of SLC24A5 in 95 individuals worldwide reveals that the rs1426654-A alleles in South Asian and West Eurasian populations are monophyletic and occur on the background of a common haplotype that is characterized by low genetic diversity. We date the coalescence of the light skin associated allele at 22–28 KYA. Both our sequence and genome-wide genotype data confirm that this gene has been a target for positive selection among Europeans. However, the latter also shows additional evidence of selection in populations of the Middle East, Central Asia, Pakistan and North India but not in South India.
Author Summary
Human skin color is one of the most visible aspects of human diversity. The genetic basis of pigmentation in Europeans has been understood to some extent, but our knowledge about South Asians has been restricted to a handful of studies. It has been suggested that a single nucleotide difference in SLC24A5 accounts for 25–38% of European-African pigmentation differences and correlates with lighter skin. This genetic variant has also been associated with skin color variation among South Asians living in the UK. Here, we report a study based on a homogeneous cohort from South India. Our results confirm that SLC24A5 plays a key role in pigmentation diversity of South Asians. Country-wide screening of the variant reveals that the light skin associated allele is widespread in the Indian subcontinent and its complex patterning is shaped by a combination of processes involving selection and demographic history of the populations. By studying the variation of SLC24A5 sequences among a diverse set of individuals, we show that the light skin associated allele in South Asians is identical by descent to that found in Europeans. Our study also provides new insights into positive selection acting on the gene and the evolutionary history of light skin in humans.
doi:10.1371/journal.pgen.1003912
PMCID: PMC3820762  PMID: 24244186
4.  Autosomal and uniparental portraits of the native populations of Sakha (Yakutia): implications for the peopling of Northeast Eurasia 
Background
Sakha – an area connecting South and Northeast Siberia – is significant for understanding the history of peopling of Northeast Eurasia and the Americas. Previous studies have shown a genetic contiguity between Siberia and East Asia and the key role of South Siberia in the colonization of Siberia.
Results
We report the results of a high-resolution phylogenetic analysis of 701 mtDNAs and 318 Y chromosomes from five native populations of Sakha (Yakuts, Evenks, Evens, Yukaghirs and Dolgans) and of the analysis of more than 500,000 autosomal SNPs of 758 individuals from 55 populations, including 40 previously unpublished samples from Siberia. Phylogenetically terminal clades of East Asian mtDNA haplogroups C and D and Y-chromosome haplogroups N1c, N1b and C3, constituting the core of the gene pool of the native populations from Sakha, connect Sakha and South Siberia. Analysis of autosomal SNP data confirms the genetic continuity between Sakha and South Siberia. Maternal lineages D5a2a2, C4a1c, C4a2, C5b1b and the Yakut-specific STR sub-clade of Y-chromosome haplogroup N1c can be linked to a migration of Yakut ancestors, while the paternal lineage C3c was most likely carried to Sakha by the expansion of the Tungusic people. MtDNA haplogroups Z1a1b and Z1a3, present in Yukaghirs, Evens and Dolgans, show traces of different and probably more ancient migration(s). Analysis of both haploid loci and autosomal SNP data revealed only minor genetic components shared between Sakha and the extreme Northeast Siberia. Although the major part of West Eurasian maternal and paternal lineages in Sakha could originate from recent admixture with East Europeans, mtDNA haplogroups H8, H20a and HV1a1a, as well as Y-chromosome haplogroup J, more probably reflect an ancient gene flow from West Eurasia through Central Asia and South Siberia.
Conclusions
Our high-resolution phylogenetic dissection of mtDNA and Y-chromosome haplogroups as well as analysis of autosomal SNP data suggests that Sakha was colonized by repeated expansions from South Siberia with minor gene flow from the Lower Amur/Southern Okhotsk region and/or Kamchatka. The minor West Eurasian component in Sakha attests to both recent and ongoing admixture with East Europeans and an ancient gene flow from West Eurasia.
doi:10.1186/1471-2148-13-127
PMCID: PMC3695835  PMID: 23782551
mtDNA; Y chromosome; Autosomal SNPs; Sakha
5.  Candidate Gene Approach for Parasite Resistance in Sheep – Variation in Immune Pathway Genes and Association with Fecal Egg Count 
PLoS ONE  2014;9(2):e88337.
Sheep chromosome 3 (Oar3) has the largest number of QTLs reported to be significantly associated with resistance to gastro-intestinal nematodes. This study aimed to identify single nucleotide polymorphisms (SNPs) within candidate genes located in sheep chromosome 3 as well as genes involved in major immune pathways. A total of 41 SNPs were identified across 38 candidate genes in a panel of unrelated sheep and genotyped in 713 animals belonging to 22 breeds across Asia, Europe and South America. The variations and evolution of immune pathway genes were assessed in sheep populations across these macro-environmental regions that significantly differ in the diversity and load of pathogens. The mean minor allele frequency (MAF) did not vary between Asian and European sheep reflecting the absence of ascertainment bias. Phylogenetic analysis revealed two major clusters with most of South Asian, South East Asian and South West Asian breeds clustering together while European and South American sheep breeds clustered together distinctly. Analysis of molecular variance revealed strong phylogeographic structure at loci located in immune pathway genes, unlike microsatellite and genome wide SNP markers. To understand the influence of natural selection processes, SNP loci located in chromosome 3 were utilized to reconstruct haplotypes, the diversity of which showed significant deviations from selective neutrality. Reduced Median network of reconstructed haplotypes showed balancing selection in force at these loci. Preliminary association of SNP genotypes with phenotypes recorded 42 days post challenge revealed significant differences (P<0.05) in fecal egg count, body weight change and packed cell volume at two, four and six SNP loci respectively. In conclusion, the present study reports strong phylogeographic structure and balancing selection operating at SNP loci located within immune pathway genes. 
Further, SNP loci identified in the study were found to have potential for future large scale association studies in naturally exposed sheep populations.
doi:10.1371/journal.pone.0088337
PMCID: PMC3922807  PMID: 24533078
6.  Power to Detect Risk Alleles Using Genome-Wide Tag SNP Panels 
PLoS Genetics  2007;3(10):e170.
Advances in high-throughput genotyping and the International HapMap Project have enabled association studies at the whole-genome level. We have constructed whole-genome genotyping panels of over 550,000 (HumanHap550) and 650,000 (HumanHap650Y) SNP loci by choosing tag SNPs from all populations genotyped by the International HapMap Project. These panels also contain additional SNP content in regions that have historically been overrepresented in diseases, such as nonsynonymous sites, the MHC region, copy number variant regions and mitochondrial DNA. We estimate that the tag SNP loci in these panels cover the majority of all common variation in the genome as measured by coverage of both all common HapMap SNPs and an independent set of SNPs derived from complete resequencing of genes obtained from SeattleSNPs. We also estimate that, given a sample size of 1,000 cases and 1,000 controls, these panels have the power to detect single disease loci of moderate risk (λ ∼ 1.8–2.0). Relative risks as low as λ ∼ 1.1–1.3 can be detected using 10,000 cases and 10,000 controls depending on the sample population and disease model. If multiple loci are involved, the power increases significantly to detect at least one locus such that relative risks 20%–35% lower can be detected with 80% power if between two and four independent loci are involved. Although our SNP selection was based on HapMap data, which is a subset of all common SNPs, these panels effectively capture the majority of all common variation and provide high power to detect risk alleles that are not represented in the HapMap data.
Author Summary
Advances in high-throughput genotyping technology and the International HapMap Project have enabled genetic association studies at the whole-genome level. Our paper describes two genome-wide SNP panels that contain tag SNPs derived from the International HapMap Project. Tag SNPs are proxies for groups of highly correlated SNPs. Information can be captured for the entire group of correlated SNPs by genotyping only one representative SNP, the tag SNP. These whole-genome SNP panels also contain additional content thought to be overrepresented in disease, such as amino acid–changing nonsynonymous SNPs and mitochondrial SNPs. We show that these panels cover the genome with very high efficiency as measured by coverage of all HapMap SNPs and a set of SNPs derived from completely resequenced genes from the Seattle SNPs database. We also show that these panels have high power to detect disease risk alleles for both HapMap and non-HapMap SNPs. In complex disease where multiple risk alleles are believed to be involved, we show that the ability to detect at least one risk allele with the tag SNP panels is also high.
doi:10.1371/journal.pgen.0030170
PMCID: PMC2000969  PMID: 17922574
7.  Continent-Wide Decoupling of Y-Chromosomal Genetic Variation from Language and Geography in Native South Americans 
PLoS Genetics  2013;9(4):e1003460.
Numerous studies of human populations in Europe and Asia have revealed a concordance between their extant genetic structure and the prevailing regional pattern of geography and language. For native South Americans, however, such evidence has been lacking so far. Therefore, we examined the relationship between Y-chromosomal genotype on the one hand, and male geographic origin and linguistic affiliation on the other, in the largest study of South American natives to date in terms of sampled individuals and populations. A total of 1,011 individuals, representing 50 tribal populations from 81 settlements, were genotyped for up to 17 short tandem repeat (STR) markers and 16 single nucleotide polymorphisms (Y-SNPs), the latter resolving phylogenetic lineages Q and C. Virtually no structure became apparent for the extant Y-chromosomal genetic variation of South American males that could sensibly be related to their inter-tribal geographic and linguistic relationships. This continent-wide decoupling is consistent with a rapid peopling of the continent followed by long periods of isolation in small groups. Furthermore, for the first time, we identified a distinct geographical cluster of Y-SNP lineages C-M217 (C3*) in South America. Such haplotypes are virtually absent from North and Central America, but occur at high frequency in Asia. Together with the locally confined Y-STR autocorrelation observed in our study as a whole, the available data therefore suggest a late introduction of C3* into South America no more than 6,000 years ago, perhaps via coastal or trans-Pacific routes. Extensive simulations revealed that the observed lack of haplogroup C3* among extant North and Central American natives is only compatible with low levels of migration between the ancestor populations of C3* carriers and non-carriers. 
In summary, our data highlight the fact that a pronounced correlation between genetic and geographic/cultural structure can only be expected under very specific conditions, most of which are likely not to have been met by the ancestors of native South Americans.
Author Summary
In the largest population genetic study of South Americans to date, we analyzed the Y-chromosomal makeup of more than 1,000 male natives. We found that the male-specific genetic variation of Native Americans lacks any clear structure that could sensibly be related to their geographic and/or linguistic relationships. This finding is consistent with a rapid initial peopling of South America, followed by long periods of isolation in small tribal groups. The observed continent-wide decoupling of geography, spoken language, and genetics contrasts strikingly with previous reports of such correlation from many parts of Europe and Asia. Moreover, we identified a cluster of Native American founding lineages of Y chromosomes, called C-M217 (C3*), within a restricted area of Ecuador in North-Western South America. The same haplogroup occurs at high frequency in Central, East, and North East Asia, but is virtually absent from North (except Alaska) and Central America. Possible scenarios for the introduction of C-M217 (C3*) into Ecuador may thus include a coastal or trans-Pacific route, an idea also supported by occasional archeological evidence and the recent coalescence of the C3* haplotypes, estimated from our data to have occurred some 6,000 years ago.
doi:10.1371/journal.pgen.1003460
PMCID: PMC3623769  PMID: 23593040
8.  Gene Flow between the Korean Peninsula and Its Neighboring Countries 
PLoS ONE  2010;5(7):e11855.
SNP markers provide the primary data for population structure analysis. In this study, we employed whole-genome autosomal SNPs as a marker set (54,836 SNP markers) and tested their possible effects on genetic ancestry using 320 subjects covering 24 regional groups including Northern (n = 16) and Southern (n = 3) Asians, Amerindians (n = 1), and four HapMap populations (YRI, CEU, JPT, and CHB). Additionally, we evaluated the effectiveness and robustness of 50K autosomal SNPs with various clustering methods, along with their dependencies on recombination hotspots (RH), linkage disequilibrium (LD), missing calls and regional specific markers. The RH- and LD-free multi-dimensional scaling (MDS) method showed a broad picture of human migration from Africa to North-East Asia on our genome map, supporting results from previous haploid DNA studies. Of the Asian groups, the East Asian group showed greater differentiation than the Northern and Southern Asian groups with respect to Fst statistics. By extension, the analysis of monomorphic markers implied that nine out of ten historical regions in South Korea, and Tokyo in Japan, showed signs of genetic drift caused by the later settlement of East Asia (South Korea, Japan and China), while Gyeongju in South East Korea showed signs of the earliest settlement in East Asia. In the genome map, the gene flow to the Korean Peninsula from its neighboring countries indicated that some genetic signals from Northern populations such as the Siberians and Mongolians still remain in the South East and West regions, while few signals remain from the early Southern lineages.
doi:10.1371/journal.pone.0011855
PMCID: PMC2912326  PMID: 20686617
9.  Calibrating the Performance of SNP Arrays for Whole-Genome Association Studies 
PLoS Genetics  2008;4(6):e1000109.
To facilitate whole-genome association studies (WGAS), several high-density SNP genotyping arrays have been developed. Genetic coverage and statistical power are the primary benchmark metrics in evaluating the performance of SNP arrays. Ideally, such evaluations would be done on a SNP set and a cohort of individuals that are both independently sampled from the original SNPs and individuals used in developing the arrays. Without utilization of an independent test set, previous estimates of genetic coverage and statistical power may be subject to an overfitting bias. Additionally, the SNP arrays' statistical power in WGAS has not been systematically assessed on real traits. One robust setting for doing so is to evaluate statistical power on thousands of traits measured from a single set of individuals. In this study, 359 newly sampled Americans of European descent were genotyped using both Affymetrix 500K (Affx500K) and Illumina 650Y (Ilmn650K) SNP arrays. From these data, we were able to obtain estimates of genetic coverage, which are robust to overfitting, by constructing an independent test set from among these genotypes and individuals. Furthermore, we collected liver tissue RNA from the participants and profiled these samples on a comprehensive gene expression microarray. The RNA levels were used as a large-scale set of quantitative traits to calibrate the relative statistical power of the commercial arrays. Our genetic coverage estimates are lower than previous reports, providing evidence that previous estimates may be inflated due to overfitting. The Ilmn650K platform showed reasonable power (50% or greater) to detect SNPs associated with quantitative traits when the signal-to-noise ratio (SNR) is greater than or equal to 0.5 and the causal SNP's minor allele frequency (MAF) is greater than or equal to 20% (N = 359). 
In testing each of the more than 40,000 gene expression traits for association to each of the SNPs on the Ilmn650K and Affx500K arrays, we found that the Ilmn650K yielded 15% more discoveries than the Affx500K at the same false discovery rate (FDR) level.
Author Summary
Advances in SNP genotyping array technologies have made whole-genome association studies (WGAS) a readily available approach. Genetic coverage and statistical power are two key properties to evaluate on the arrays. In this study, 359 newly sampled individuals were genotyped using Affymetrix 500K and Illumina 650Y SNP arrays. From these data, we obtained new estimates of genetic coverage by constructing a test set from among these genotypes and individuals that is independent from the SNPs and individuals used to construct the arrays. These estimates are notably smaller than previous ones, which we argue is due to an overfitting bias in previous studies. We also collected liver tissue RNA from the participants and profiled these samples on a comprehensive gene expression microarray. The RNA levels were used as a large-scale set of quantitative traits to calibrate the relative statistical power of the commercial arrays. Through this dataset and simulations, we find that the SNP arrays provide adequate power to detect quantitative trait loci when the causal SNP's minor allele frequency is greater than 20%, but low power when it is less than 10%. Importantly, we provide evidence that sample size has a greater impact on the power of WGAS than SNP density or genetic coverage.
doi:10.1371/journal.pgen.1000109
PMCID: PMC2432039  PMID: 18584036
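Comparing arrays by the number of discoveries at a fixed false discovery rate, as the abstract above does, requires some FDR-controlling rule. The abstract does not name the procedure used, so the sketch below assumes the Benjamini-Hochberg step-up procedure as one common choice; the function name and p-values are hypothetical.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return a list of booleans marking which tests are declared discoveries.

    Benjamini-Hochberg step-up rule: sort the p-values, find the largest
    rank k with p_(k) <= k * fdr / m, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * fdr / m:
            max_rank = rank
    rejected = [False] * m
    for idx in order[:max_rank]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values from two platforms tested on the same traits.
array_a = [0.001, 0.04, 0.03, 0.5]
array_b = [0.01, 0.02, 0.03, 0.04]
print(sum(benjamini_hochberg(array_a)), sum(benjamini_hochberg(array_b)))
```

The ratio of the two discovery counts at the same `fdr` is the kind of relative-power comparison the abstract reports.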
10.  Examining markers in 8q24 to explain differences in evidence for association with cleft lip with/without cleft palate between Asians and Europeans 
Genetic epidemiology  2012;36(4):392-399.
In a recent genome wide association study (GWAS) from an international consortium, evidence of linkage and association in chr8q24 was much stronger among non-syndromic cleft lip/palate (CL/P) case-parent trios of European ancestry than among trios of Asian ancestry. We examined marker information content and haplotype diversity across 13 recruitment sites (from Europe, USA and Asia) separately, and conducted principal components analysis (PCA) on parents. As expected, PCA revealed large genetic distances between Europeans and Asians, and a north-south cline from Korea to Singapore in Asia, with Filipino parents forming a somewhat distinct Southeast Asian cluster. Hierarchical clustering of SNP heterozygosity revealed two major clades consistent with PCA results. All genotyped SNPs giving p < 10⁻⁶ in the allelic TDT showed higher heterozygosity in Europeans than Asians. On average, European ancestry parents had higher haplotype diversity than Asians. Imputing additional variants across chr8q24 increased the strength of statistical evidence among Europeans and also revealed a significant signal among Asians (although it did not reach genome-wide significance). Tests for SNP-population interaction were negative, indicating the lack of strong signal for 8q24 in families of Asian ancestry was not due to any distinct genetic effect, but could simply reflect low power due to lower allele frequencies in Asians.
doi:10.1002/gepi.21633
PMCID: PMC3615645  PMID: 22508319
cleft lip with/without cleft palate; 8q24; genome wide association; imputation
11.  Detailed Analysis of Gene Polymorphisms Associated with Ischemic Stroke in South Asians 
PLoS ONE  2013;8(3):e57305.
The burden of stroke is disproportionately high in the South Asian subcontinent with South Asian ethnicity conferring a greater risk of ischemic stroke than European ancestry regardless of country inhabited. While genes associated with stroke in European populations have been investigated, they remain largely unknown in South Asians. We conducted a comprehensive meta-analysis of known genetic polymorphisms associated with South Asian ischemic stroke, and compared effect size of the MTHFR C677T-stroke association with effect sizes predicted from homocysteine-stroke association. Electronic databases were searched up to August 2012 for published case control studies investigating genetic polymorphisms associated with ischemic stroke in South Asians. Pooled odds ratios (OR) for each gene-disease association were calculated using a random-effects model. We identified 26 studies (approximately 2529 stroke cases and 2881 controls) interrogating 33 independent genetic polymorphisms in 22 genes. Ten studies described MTHFR C677T (108 with TT genotype and 2018 with CC genotype) -homocysteine relationship and six studies (735 stroke cases and 713 controls) described homocysteine-ischemic stroke relationship. Risk association ORs were calculated for ACE I/D (OR 5.00; 95% CI, 1.17–21.37; p = 0.03), PDE4D SNP 83 (OR 2.20; 95% CI 1.21–3.99; p = 0.01), PDE4D SNP 32 (OR 1.57; 95% CI 1.01–2.45, p = 0.045) and IL10 G1082A (OR 1.44; 95% CI, 1.09–1.91, p = 0.01). Significant association was observed between elevated plasma homocysteine levels and MTHFR/677 TT genotypes in healthy South Asians (Mean difference (ΔX) 5.18 µmol/L; 95% CI 2.03–8.34: p = 0.001). Our results demonstrate that the genetic etiology of ischemic stroke in South Asians is broadly similar to the risk conferred in Europeans, although the dataset is considerably smaller and warrants the same clinical considerations for risk profiling.
doi:10.1371/journal.pone.0057305
PMCID: PMC3591429  PMID: 23505425
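The pooled odds ratios in the abstract above come from a random-effects model. The abstract does not state which estimator was used, so the following sketch assumes the widely used DerSimonian-Laird method, applied to per-study log odds ratios and their variances; the function name and inputs are hypothetical.

```python
import math


def pool_random_effects(log_ors, variances):
    """Pool log odds ratios with the DerSimonian-Laird estimator.

    Returns (pooled OR, lower 95% limit, upper 95% limit, tau^2).
    """
    if len(log_ors) != len(variances) or len(log_ors) < 2:
        raise ValueError("need at least two studies with matching variances")
    k = len(log_ors)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum_w
    # Cochran's Q and the between-study variance tau^2 (floored at zero).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_ws = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_ors)) / sum_ws
    se = math.sqrt(1.0 / sum_ws)
    z = 1.96  # normal quantile for a 95% interval
    return (math.exp(pooled), math.exp(pooled - z * se),
            math.exp(pooled + z * se), tau2)


# Hypothetical study-level estimates, for illustration only.
print(pool_random_effects([math.log(1.5), math.log(2.5), math.log(1.2)],
                          [0.04, 0.09, 0.06]))
```

When the studies agree closely, tau^2 is estimated as zero and the result coincides with the fixed-effect inverse-variance estimate.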
12.  Genetic affinities between endogamous and inbreeding populations of Uttar Pradesh 
BMC Genetics  2007;8:12.
Background
India has experienced several waves of migration since the Middle Paleolithic. It is believed that the initial demic movement into India was from Africa along the southern coastal route, approximately 60,000–85,000 years before present (ybp). It has also been reported that there were two other major colonizations, which included the eastward diffusion of Neolithic farmers (Elamo Dravidians) from the Middle East sometime between 10,000 and 7,000 ybp and a southern dispersal of Indo Europeans from Central Asia 3,000 ybp. Mongol entry during the thirteenth century A.D. as well as some possible minor incursions from South China 50,000 to 60,000 ybp may have also contributed to cultural, linguistic and genetic diversity in India. Therefore, the genetic affinity and relationship of Indians with other world populations and also within India are often contested. In the present study, we have attempted to offer a fresh and immaculate interpretation of the genetic relationships of different North Indian populations with other Indian and world populations.
Results
We first genotyped 20 tetra-nucleotide STR markers among 1800 north Indian samples of nine endogamous populations belonging to three different socio-cultural strata. Genetic distances (Nei's DA and Reynold's Fst) were calculated among the nine studied populations, Caucasians and East Asians. This analysis was based upon the allelic profile of 20 STR markers to assess the genetic similarity and differences of the north Indian populations. North Indians showed a stronger genetic relationship with the Europeans (DA 0.0341 and Fst 0.0119) as compared to the Asians (DA 0.1694 and Fst – 0.0718). The upper caste Brahmins and Muslims were closest to Caucasians while middle caste populations were closer to Asians. Finally, three phylogenetic assessments based on two different NJ and ML phylogenetic methods and PC plot analysis were carried out using the same panel of 20 STR markers and 20 geo-ethnic populations. The three phylogenetic assessments revealed that north Indians cluster with Caucasians.
Conclusion
The genetic affinities of Indians and of different caste groups towards Caucasians or East Asians are distributed in a cline where geographically north Indians and both upper caste and Muslim populations are genetically closer to the Caucasians.
doi:10.1186/1471-2156-8-12
PMCID: PMC1855350  PMID: 17417972
13.  Suicide by burning in the South Asian origin population in England and Wales: a secondary analysis of a national data set 
BMJ Open  2011;1(2):e000326.
Objectives
A descriptive analysis of suicide by burning in England and Wales in the general population and in people of South Asian origin.
Design
A cross-sectional secondary analysis of a national data set.
Setting
A population study of all those who died by suicide in England and Wales between 1993 and 2003 inclusive.
Participants
All cases of suicide and undetermined intent identified by the Office for National Statistics for England and Wales. A computer algorithm was used to identify people of South Asian origin from their names. There were 55 140 suicides in the UK between 1993 and 2003. The ratio of male to female suicides was 3:1. There were 1455 South Asian suicides identified by the South Asian Name and Group Recognition Algorithm.
Primary and secondary outcome measures
Death by suicide and undetermined intent, as determined by Coroner's Inquest. ICD9 codes E958.1 and E988.1 and ICD10 codes X76 and Y26.
Results
1.77% of suicides in the general population and 8.45% of suicides in the South Asian origin population were by burning. The suicide rate by burning was 0.8/100 000 person-years for England and Wales and 2.9/100 000 person-years for the South Asian origin population. The odds of suicide by burning were increased in the South Asian group as a whole (OR 3.06, 95% CI 2.30 to 4.08). Those born in Asia and Africa were at higher risk than those born in the UK (OR 2.69, 95% CI 2.01 to 3.60 and OR 2.10, 95% CI 1.46 to 3.01, respectively). The increased risk was for those aged 25–64 years.
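As a reading aid for the odds ratios above, here is a minimal sketch of an unadjusted odds ratio with a log-based (Woolf) 95% confidence interval computed from a 2×2 table. The counts are hypothetical and are not the study's data, and the abstract does not state how its odds ratios were derived or adjusted, so this should not be expected to reproduce the reported figures.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Woolf (log-based) confidence interval.

    a: group of interest, outcome present   b: group of interest, outcome absent
    c: reference group, outcome present     d: reference group, outcome absent
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts, chosen only to illustrate the arithmetic.
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 10, 90)
print(f"OR {or_value:.2f}, 95% CI {ci_low:.2f} to {ci_high:.2f}")
```

An interval that excludes 1, as in the reported results, indicates that the difference in odds between the two groups is unlikely to be due to chance alone.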
Conclusion
Suicide by burning remains a significant issue in the South Asian origin working-age population in England and Wales. A prevention strategy could target working-age people of South Asian origin born abroad as they are at the highest risk. More in-depth research on the reasons for using this method may help to identify possible prevention strategies.
Article summary
Article focus
A descriptive analysis of suicide by burning in the UK.
A description of suicide by burning in those of South Asian origin in the UK.
Key messages
Suicide by burning is a significant issue in the South Asian population in England and Wales.
The working-age population is at particular risk with those born abroad, especially those born in South Asian countries, having the highest odds.
Though the risk is more elevated in South Asian women when compared with the rest of the population, there are more male suicides using this method overall.
Strengths and limitations of this study
This study used a contemporary national data set.
Using name recognition software rather than place of birth allowed better identification of the South Asian population of the UK.
The data used did not allow for confounders to be taken into account when analysing differences between ethnic groups.
Because administrative data were used, in-depth information on the reasons for choosing this method of suicide was not available.
doi:10.1136/bmjopen-2011-000326
PMCID: PMC3244662  PMID: 22184588
14.  Population Structure of Hispanics in the United States: The Multi-Ethnic Study of Atherosclerosis 
PLoS Genetics  2012;8(4):e1002640.
Using ∼60,000 SNPs selected for minimal linkage disequilibrium, we perform population structure analysis of 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By projection of principal components (PCs) of ancestry to samples from the HapMap phase III and the Human Genome Diversity Panel (HGDP), we show the first two PCs quantify the Caucasian, African, and Native American origins, while the third and fourth PCs bring out an axis that aligns with known South-to-North geographic location of HGDP Native American samples and further separates MESA Mexican versus Central/South American samples along the same axis. Using k-means clustering computed from the first four PCs, we define four subgroups of the MESA Hispanic cohort that show close agreement with self-identification, labeling the clusters as primarily Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. To demonstrate our recommendations for genetic analysis in the MESA Hispanic cohort, we present pooled and stratified association analysis of triglycerides for selected SNPs in the LPL and TRIB1 gene regions, previously reported in GWAS of triglycerides in Caucasians but as yet unconfirmed in Hispanic populations. We report statistically significant evidence for genetic association in both genes, and we further demonstrate the importance of considering population substructure and genetic heterogeneity in genetic association studies performed in the United States Hispanic population.
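The clustering step described above (k-means on the first four PCs) can be sketched as follows. This is a minimal illustration of plain Lloyd's k-means, not the authors' pipeline: the PC scores, the starting centroids, and the use of two clusters rather than four are made-up assumptions chosen to keep the example small.

```python
def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = [list(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: sum(
                    (p - c) ** 2 for p, c in zip(point, centroids[k])
                ),
            )
            for point in points
        ]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [pt for pt, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical scores on PC1..PC4 for four individuals (not MESA data).
pc_scores = [
    (0.0, 0.1, 0.0, 0.0),
    (0.1, 0.0, 0.0, 0.1),
    (1.0, 0.9, 1.0, 1.0),
    (0.9, 1.0, 1.0, 0.9),
]
labels, centers = kmeans(pc_scores, [pc_scores[0], pc_scores[2]])
print(labels)
```

In practice the resulting cluster labels would then be cross-tabulated against self-identified origin to check agreement, as the abstract describes.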
Author Summary
Using genotype data from about 60,000 distinct genetic markers, we examined population structure in 1,374 unrelated Hispanic individuals from the Multi-Ethnic Study of Atherosclerosis (MESA), with self-identification corresponding to Central America (n = 93), Cuba (n = 50), the Dominican Republic (n = 203), Mexico (n = 708), Puerto Rico (n = 192), and South America (n = 111). By comparing genetic ancestry of MESA Hispanic participants to reference samples representing worldwide diversity, we show major differences in ancestry of MESA Hispanics reflecting their Caucasian, African, and Native American origins, with finer differences corresponding to North-South geographic origins that separate MESA Mexican versus Central/South American samples. Based on our analysis, we define four subgroups of the MESA Hispanic cohort that show close agreement with the following self-identified regions of origin: Dominican/Cuban, Mexican, Central/South American, and Puerto Rican. We examine association of triglycerides with selected genetic markers, and we further demonstrate the importance of considering differences in genetic ancestry (or factors associated with genetic ancestry) when performing genetic studies of the United States Hispanic population.
doi:10.1371/journal.pgen.1002640
PMCID: PMC3325201  PMID: 22511882
15.  Identification of a Functional Genetic Variant at 16q12.1 for Breast Cancer Risk: Results from the Asia Breast Cancer Consortium 
PLoS Genetics  2010;6(6):e1001002.
Genetic factors play an important role in the etiology of breast cancer. We carried out a multi-stage genome-wide association (GWA) study in over 28,000 cases and controls recruited from 12 studies conducted in Asian and European American women to identify genetic susceptibility loci for breast cancer. After analyzing 684,457 SNPs in 2,073 cases and 2,084 controls in Chinese women, we evaluated 53 SNPs for fast-track replication in an independent set of 4,425 cases and 1,915 controls of Chinese origin. Four replicated SNPs were further investigated in an independent set of 6,173 cases and 6,340 controls from seven other studies conducted in Asian women. SNP rs4784227 was consistently associated with breast cancer risk across all studies, with an adjusted odds ratio (95% confidence interval) of 1.25 (1.20−1.31) per allele (P = 3.2×10^−25) in the pooled analysis of all Asian samples. This SNP was also associated with breast cancer risk among European Americans (per allele OR = 1.19, 95% CI = 1.09−1.31, P = 1.3×10^−4, 2,797 cases and 2,662 controls). SNP rs4784227 is located at 16q12.1, a region identified previously for breast cancer risk among Europeans. The association of this SNP with breast cancer risk remained highly statistically significant in Asians after adjusting for previously-reported SNPs in this region. In vitro experiments using both luciferase reporter and electrophoretic mobility shift assays demonstrated functional significance of this SNP. These results provide strong evidence implicating rs4784227 as a functional causal variant for breast cancer in the locus 16q12.1 and demonstrate the utility of conducting genetic association studies in populations with different genetic architectures.
Author Summary
Breast cancer is one of the most common malignancies among women worldwide. Genetic factors play an important role in the etiology of breast cancer. To identify genetic susceptibility loci for breast cancer, we performed a genome-wide association study in 15,468 breast cancer cases and 13,001 controls. A single nucleotide polymorphism (SNP), rs4784227, located on chromosome 16q12.1, a previously-reported region for breast cancer risk, was found to be associated with breast cancer risk. The association of this SNP with breast cancer risk remained highly significant in Asians after adjusting for all previously-reported SNPs in this region. In vitro biochemical experiments using both luciferase reporter and electrophoretic mobility shift assays confirmed the functional importance of this SNP. Our results demonstrate the importance of conducting genetic association studies in populations with different genetic backgrounds to identify functional variants.
doi:10.1371/journal.pgen.1001002
PMCID: PMC2891809  PMID: 20585626
16.  Genetic variation in South Indian castes: evidence from Y-chromosome, mitochondrial, and autosomal polymorphisms 
BMC Genetics  2008;9:86.
Background
Major population movements, social structure, and caste endogamy have influenced the genetic structure of Indian populations. An understanding of these influences is increasingly important as gene mapping and case-control studies are initiated in South Indian populations.
Results
We report new data on 155 individuals from four Tamil caste populations of South India and perform comparative analyses with caste populations from the neighboring state of Andhra Pradesh. Genetic differentiation among Tamil castes is low (RST = 0.96% for 45 autosomal short tandem repeat (STR) markers), reflecting a largely common origin. Nonetheless, caste- and continent-specific patterns are evident. For 32 lineage-defining Y-chromosome SNPs, Tamil castes show higher affinity to Europeans than to eastern Asians, and genetic distance estimates to the Europeans are ordered by caste rank. For 32 lineage-defining mitochondrial SNPs and hypervariable sequence (HVS) 1, Tamil castes have higher affinity to eastern Asians than to Europeans. For 45 autosomal STRs, upper and middle rank castes show higher affinity to Europeans than do lower rank castes from either Tamil Nadu or Andhra Pradesh. Local between-caste variation (Tamil Nadu RST = 0.96%, Andhra Pradesh RST = 0.77%) exceeds the estimate of variation between these geographically separated groups (RST = 0.12%). Low, but statistically significant, correlations between caste rank distance and genetic distance are demonstrated for Tamil castes using Y-chromosome, mtDNA, and autosomal data.
Conclusion
Genetic data from Y-chromosome, mtDNA, and autosomal STRs are in accord with historical accounts of northwest to southeast population movements in India. The influence of ancient and historical population movements and caste social structure can be detected and replicated in South Indian caste populations from two different geographic regions.
doi:10.1186/1471-2156-9-86
PMCID: PMC2621241  PMID: 19077280
17.  A Single Nucleotide Polymorphism within DUSP9 Is Associated with Susceptibility to Type 2 Diabetes in a Japanese Population 
PLoS ONE  2012;7(9):e46263.
Aims
The DUSP9 locus on chromosome X was identified as a susceptibility locus for type 2 diabetes in a meta-analysis of European genome-wide association studies (GWAS), and GWAS in South Asian populations identified 6 additional single nucleotide polymorphism (SNP) loci for type 2 diabetes. However, the association of these loci with type 2 diabetes has not been examined in the Japanese. We performed a replication study to investigate the association of these 7 susceptibility loci with type 2 diabetes in the Japanese population.
Methods
We genotyped 11,319 Japanese participants (8,318 with type 2 diabetes and 3,001 controls) for each of the 7 SNPs–rs5945326 near DUSP9, rs3923113 near GRB14, rs16861329 in ST6GAL1, rs1802295 in VPS26A, rs7178572 in HMG20A, rs2028299 near AP3S2, and rs4812829 in HNF4A–and examined the association of each of these 7 SNPs with type 2 diabetes by using logistic regression analysis.
Results
All SNPs had the same direction of effect (odds ratio [OR]>1.0) as in the original reports. One SNP, rs5945326 near DUSP9, was significantly associated with type 2 diabetes at a genome-wide significance level (p = 2.21×10^−8; OR 1.39, 95% confidence interval [CI]: 1.24−1.56). The 6 SNPs derived from South Asian GWAS were not significantly associated with type 2 diabetes in the Japanese population by themselves (p≥0.007). However, a genetic risk score constructed from the 6 South Asian GWAS-derived SNPs was significantly associated with type 2 diabetes in the Japanese (p = 8.69×10^−4; OR 1.06, 95% CI: 1.03−1.10).
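The two ideas used in these results, a genetic risk score and a genome-wide significance threshold, can be sketched briefly. The abstract does not say whether the score was weighted, so the unweighted risk-allele count below is an assumption, as are the function names, the example genotype, and the use of the conventional 5×10^−8 threshold. The rsIDs are the six South Asian GWAS SNPs named in the Methods.

```python
GENOME_WIDE_ALPHA = 5e-8  # conventional GWAS threshold (assumed here)

def genetic_risk_score(risk_allele_counts):
    """Unweighted score: total number of risk alleles carried across SNPs."""
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: expected 0, 1 or 2 risk alleles")
    return sum(risk_allele_counts.values())

def is_genome_wide_significant(p_value, alpha=GENOME_WIDE_ALPHA):
    """True when a p value falls below the genome-wide threshold."""
    return p_value < alpha

# Hypothetical genotype for one participant (risk-allele counts per SNP).
participant = {
    "rs3923113": 1,
    "rs16861329": 0,
    "rs1802295": 2,
    "rs7178572": 1,
    "rs2028299": 1,
    "rs4812829": 0,
}

print(genetic_risk_score(participant))
print(is_genome_wide_significant(2.21e-8))   # p value reported for rs5945326
print(is_genome_wide_significant(8.69e-4))   # p value reported for the risk score
```

A score like this is typically entered as a single predictor in logistic regression, so the reported OR of 1.06 would be interpreted per additional risk allele under that assumption.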
Conclusions/interpretation
These results indicate that the DUSP9 locus is a common susceptibility locus for type 2 diabetes across different ethnicities, and that the 6 loci identified in South Asian GWAS also have a significant effect on susceptibility to type 2 diabetes in the Japanese.
doi:10.1371/journal.pone.0046263
PMCID: PMC3459833  PMID: 23029454
18.  Reconstructing Roma History from Genome-Wide Data 
PLoS ONE  2013;8(3):e58633.
The Roma people, living throughout Europe and West Asia, are a diverse population linked by the Romani language and culture. Previous linguistic and genetic studies have suggested that the Roma migrated into Europe from South Asia about 1,000–1,500 years ago. Genetic inferences about Roma history have mostly focused on the Y chromosome and mitochondrial DNA. To explore what additional information can be learned from genome-wide data, we analyzed data from six Roma groups that we genotyped at hundreds of thousands of single nucleotide polymorphisms (SNPs). We estimate that the Roma harbor about 80% West Eurasian ancestry–derived from a combination of European and South Asian sources–and that the date of admixture of South Asian and European ancestry was about 850 years before present. We provide evidence for Eastern Europe being a major source of European ancestry, and North-west India being a major source of the South Asian ancestry in the Roma. By computing allele sharing as a measure of linkage disequilibrium, we estimate that the migration of Roma out of the Indian subcontinent was accompanied by a severe founder event, which appears to have been followed by a major demographic expansion after the arrival in Europe.
doi:10.1371/journal.pone.0058633
PMCID: PMC3596272  PMID: 23516520
19.  Upper Palaeolithic Siberian genome reveals dual ancestry of Native Americans 
Nature  2013;505(7481):87-91.
The origins of the First Americans remain contentious. Although Native Americans seem to be genetically most closely related to east Asians [1–3], there is no consensus with regard to which specific Old World populations they are closest to [4–8]. Here we sequence the draft genome of an approximately 24,000-year-old individual (MA-1), from Mal’ta in south-central Siberia [9], to an average depth of 1×. To our knowledge this is the oldest anatomically modern human genome reported to date. The MA-1 mitochondrial genome belongs to haplogroup U, which has also been found at high frequency among Upper Palaeolithic and Mesolithic European hunter-gatherers [10–12], and the Y chromosome of MA-1 is basal to modern-day western Eurasians and near the root of most Native American lineages [5]. Similarly, we find autosomal evidence that MA-1 is basal to modern-day western Eurasians and genetically closely related to modern-day Native Americans, with no close affinity to east Asians. This suggests that populations related to contemporary western Eurasians had a more north-easterly distribution 24,000 years ago than commonly thought. Furthermore, we estimate that 14 to 38% of Native American ancestry may originate through gene flow from this ancient population. This is likely to have occurred after the divergence of Native American ancestors from east Asian ancestors, but before the diversification of Native American populations in the New World. Gene flow from the MA-1 lineage into Native American ancestors could explain why several crania from the First Americans have been reported as bearing morphological characteristics that do not resemble those of east Asians [2,13]. Sequencing of another south-central Siberian, Afontova Gora-2, dating to approximately 17,000 years ago [14], revealed similar autosomal genetic signatures as MA-1, suggesting that the region was continuously occupied by humans throughout the Last Glacial Maximum.
Our findings reveal that western Eurasian genetic signatures in modern-day Native Americans derive not only from post-Columbian admixture, as commonly thought, but also from a mixed ancestry of the First Americans.
doi:10.1038/nature12736
PMCID: PMC4105016  PMID: 24256729
20.  Molecular insight into the genesis of ranked caste populations of western India based upon polymorphisms across non-recombinant and recombinant regions in genome 
Genome Biology  2005;6(8):P10.
To trace admixture and genesis of caste populations of western India, polymorphisms were examined across non-recombining 20 Y-SNPs, 20 Y-STRs, 18 mtDNA diagnostic sites, HVS-1 plus HVS-2 regions; and recombining 15 highly polymorphic autosomal STRs in four predominant caste populations: upper-ranking Desasth-brahmin and Chitpavan-brahmin; a middle-ranking Kshtriya Maratha; and a lower-rank peasant Dhangar.
Background
Large-scale trade and cultural contacts between coastal populations of western India and Western-Eurasians paved the way for extensive immigration and the genesis of a wide spectrum of admixed gene pool. To trace admixture and genesis of caste populations of western India, we have examined polymorphisms across non-recombining 20 Y-SNPs, 20 Y-STRs, 18 mtDNA diagnostic sites, HVS-1 plus HVS-2 regions; and recombining 15 highly polymorphic autosomal STRs in four predominant caste populations: upper-ranking Desasth-brahmin and Chitpavan-brahmin; a middle-ranking Kshtriya Maratha; and a lower-rank peasant Dhangar.
Results
The generated genomic data was compared with putative parental populations (Central Asians, West Asians and Europeans) using AMOVA, PC plot, and admixture estimates. Overall, disparate uniparental ancestries and a 1.1% GST value for biparental markers among the four studied caste populations linked well with their chequered demographic histories. Marathi-speaking ancient Desasth-brahmin shows substantial admixture from Central Asian males, but the Paleolithic maternal component supports their Scytho-Dravidian origin. Chitpavan-brahmin demonstrates a younger maternal component and substantial paternal gene flow from West Asia, thus giving credence to their recent Irano-Scythian ancestry from Mediterranean or Turkey, which correlated well with European-looking features of this caste. This also explains their untraceable ethno-history before 1000 years, brahminization event and later amalgamation by Maratha. The widespread Palaeolithic mtDNA haplogroups in Maratha and Dhangar highlight their shared Proto-Asian ancestries. Maratha males harboured the Anatolian-derived J2 lineage, corroborating the blending of farming communities. Dhangar heterogeneity is ascribable to predominantly South-Asian males and West-Eurasian females.
Conclusions
The genomic data-sets of this study provide ample genomic evidence of diverse origins of the four ranked castes and synchronization of caste stratification with asymmetrical gene flows from Indo-European migration during Upper Paleolithic, Neolithic, and later dates. However, subsequent gene flows among these castes living in geographical proximity have diminished significant genetic differentiation, as indicated by AMOVA and structure.
doi:10.1186/gb-2005-6-8-p10
PMCID: PMC4071276
21.  Polymorphisms in peptidylarginine deiminase associate with rheumatoid arthritis in diverse Asian populations: evidence from MyEIRA study and meta-analysis 
Arthritis Research & Therapy  2012;14(6):R250.
Introduction
The majority of our knowledge regarding disease-related mechanisms of uncontrolled citrullination and anti-citrullinated protein antibody development in rheumatoid arthritis (RA) comes from studies of Caucasian populations. However, peptidylarginine deiminase (PADI) type 4 gene polymorphisms are associated with RA in East Asian populations, whereas weak or no association has been found in Caucasian populations. This study explores the association between the PADI4 polymorphisms and RA risk in a multiethnic population residing in South East Asia with the goal of elucidating generalizability of association in non-Caucasian populations.
Methods
A total of 320 SNPs from the PADI locus (including PADI1, PADI2, PADI3, PADI4 and PADI6 genes) were genotyped in 1,238 RA cases and 1,571 control subjects from the Malaysian Epidemiological Investigation of Rheumatoid Arthritis (MyEIRA) case-control study. Additionally, we conducted meta-analysis of our data together with the previously published studies of RA from East Asian populations.
Results
The overall odds ratio (ORoverall) for the PADI4 (rs2240340) allelic model was 1.11 (95% confidence interval (CI) = 1.00 to 1.23, P = 0.04) and for the genotypic model was 1.20 (95% CI = 1.01 to 1.44, P = 0.04). Haplotype analysis for four selected PADI4 SNPs revealed a significant association of one with susceptibility (P = 0.001) and of another with a protective effect (P = 0.02). The RA susceptibility was further confirmed when combined meta-analysis was performed using these data together with data from five previously published studies from Asia comprising 5,192 RA cases and 4,317 control subjects (ORoverall = 1.23 (95% CI = 1.16 to 1.31, Pheterogeneity = 0.08) and 1.31 (95% CI = 1.20 to 1.44, Pheterogeneity = 0.32) in allele and genotype-based models, respectively). In addition, we also detected a novel association of PADI2 genetic variant rs1005753 with RA (ORoverall = 0.87 (95% CI = 0.77 to 0.99)).
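To show how odds ratios reported with confidence intervals can be combined, the sketch below recovers the standard error of log(OR) from a 95% CI and pools studies with inverse-variance (fixed-effect) weights. The abstract does not state which meta-analysis model was used, so the fixed-effect choice is an assumption. The first input is the MyEIRA allelic-model estimate quoted above; the second study is hypothetical and included only so there is something to pool.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval

def log_or_and_se(odds_ratio, lower, upper):
    """Return (log OR, standard error) recovered from a 95% CI."""
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    return math.log(odds_ratio), se

def fixed_effect_pool(estimates):
    """Inverse-variance pooled OR with 95% CI from (log OR, SE) pairs."""
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return (
        math.exp(pooled),
        math.exp(pooled - Z_95 * pooled_se),
        math.exp(pooled + Z_95 * pooled_se),
    )

myeira = log_or_and_se(1.11, 1.00, 1.23)        # from the abstract
other_study = log_or_and_se(1.30, 1.10, 1.54)   # hypothetical, for illustration

pooled_or, pooled_low, pooled_high = fixed_effect_pool([myeira, other_study])
print(f"pooled OR {pooled_or:.2f} ({pooled_low:.2f} to {pooled_high:.2f})")
```

More precise studies (narrower intervals) receive larger weights, so the pooled estimate sits closer to them. A heterogeneity test such as the reported Pheterogeneity is usually checked before relying on a fixed-effect pool.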
Conclusion
Our study demonstrates an association between PADI4 and RA in the multiethnic population from South East Asia and suggests additional association with a PADI2 gene. The study thus provides further support for the notion that polymorphisms in genes for enzymes responsible for citrullination contribute to RA development in multiple populations of Asian descent.
doi:10.1186/ar4093
PMCID: PMC3674620  PMID: 23164236
22.  High-Throughput Sequencing of a South American Amerindian 
PLoS ONE  2013;8(12):e83340.
The emergence of next-generation sequencing technologies allowed access to the vast amounts of information that are contained in the human genome. This information has contributed to the understanding of individual and population-based variability and improved the understanding of the evolutionary history of different human groups. However, the genome of a representative of the Amerindian populations had not been previously sequenced. Thus, the genome of an individual from a South American tribe was completely sequenced to further the understanding of the genetic variability of Amerindians. A total of 36.8 giga base pairs (Gbp) were sequenced and aligned with the human genome. These Gbp corresponded to 95.92% of the human genome with an estimated miscall rate of 0.0035 per sequenced bp. The data obtained from the alignment were used for SNP (single-nucleotide polymorphism) and INDEL (insertion-deletion) calling, which resulted in the identification of 502,017 polymorphisms, of which 32,275 were potentially new high-confidence SNPs and 33,795 new INDELs, specific to South Native American populations. The authenticity of the sample as a member of the South Native American populations was confirmed through the analysis of the uniparental (maternal and paternal) lineages. The autosomal comparison distinguished the investigated sample from other continental populations and revealed a close relation to the Eastern Asian populations and Aboriginal Australians. Although the findings did not rule out the classical model of the settlement of the Americas, they brought new insights to the understanding of human population history. The present study indicates a remarkable genetic variability in human populations that must still be identified and contributes to the understanding of the genetic variability of South Native American populations and of human population history.
doi:10.1371/journal.pone.0083340
PMCID: PMC3875439  PMID: 24386182
23.  Sequencing and analysis of a South Asian-Indian personal genome 
BMC Genomics  2012;13:440.
Background
With over 1.3 billion people, India is estimated to contain three times more genetic diversity than does Europe. Next-generation sequencing technologies have facilitated the understanding of diversity by enabling whole genome sequencing at greater speed and lower cost. While genomes from people of European and Asian descent have been sequenced, only recently has a single male genome from the Indian subcontinent been published at sufficient depth and coverage. In this study we have sequenced and analyzed the genome of a South Asian Indian female (SAIF) from the Indian state of Kerala.
Results
We identified over 3.4 million SNPs in this genome including over 89,873 private variations. Comparison of the SAIF genome with several published personal genomes revealed that this individual shared ~50% of the SNPs with each of these genomes. Analysis of the SAIF mitochondrial genome showed that it was closely related to the U1 haplogroup which has been previously observed in Kerala. We assessed the SAIF genome for SNPs with health and disease consequences and found that the individual was at a higher risk for multiple sclerosis and a few other diseases. In analyzing SNPs that modulate drug response, we found a variation that predicts a favorable response to metformin, a drug used to treat diabetes. SNPs predictive of adverse reaction to warfarin indicated that the SAIF individual is not at risk for bleeding if treated with typical doses of warfarin. In addition, we report the presence of several additional SNPs of medical relevance.
Conclusions
This is the first study to report the complete whole genome sequence of a female from the state of Kerala in India. The availability of this complete genome and variants will further aid studies aimed at understanding genetic diversity, identifying clinically relevant changes and assessing disease burden in the Indian population.
doi:10.1186/1471-2164-13-440
PMCID: PMC3534380  PMID: 22938532
Indian genome; Personal genomics; Whole genome sequencing
24.  Detailed Analysis of Japanese Population Substructure with a Focus on the Southwest Islands of Japan 
PLoS ONE  2012;7(4):e35000.
Uncovering population structure is important for properly conducting association studies and for examining the demographic history of a population. Here, we examined the Japanese population substructure using data from the Japan Multi-Institutional Collaborative Cohort (J-MICC), which covers all but the northern region of Japan. Using 222 autosomal loci from 4502 subjects, we investigated population substructure by estimating FST among populations, testing population differentiation, and performing principal component analysis (PCA) and correspondence analysis (CA). All analyses revealed a low but significant differentiation between the Amami Islanders and the mainland Japanese population. Furthermore, we examined the genetic differentiation between the mainland population, Amami Islanders and Okinawa Islanders using six loci included in both the Pan-Asian SNP (PASNP) consortium data and the J-MICC data. This analysis revealed that the Amami and Okinawa Islanders were differentiated from the mainland population. In conclusion, we revealed a low but significant level of genetic differentiation between the mainland population and populations in or to the south of the Amami Islands, although genetic variation between both populations might be clinal. Therefore, the possibility of population stratification must be considered when enrolling the islander population of this area, such as in the J-MICC study.
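As an illustration of what an FST value measures, the sketch below computes Wright's FST for a single biallelic locus as (H_T - H_S) / H_T, where H_T is the expected heterozygosity of the pooled population and H_S is the mean expected heterozygosity within subpopulations. This is a simplified textbook form with equal subpopulation weights, not necessarily the estimator used in the study, and the allele frequencies are hypothetical.

```python
def wright_fst(subpop_freqs):
    """Wright's FST for one biallelic locus.

    subpop_freqs holds the frequency of one allele in each
    subpopulation; subpopulations are weighted equally.
    """
    n = len(subpop_freqs)
    p_bar = sum(subpop_freqs) / n
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # locus is fixed everywhere; nothing to partition
    h_within = sum(2 * p * (1 - p) for p in subpop_freqs) / n
    return (h_total - h_within) / h_total

# Hypothetical allele frequencies in two groups (not J-MICC data).
print(wright_fst([0.5, 0.5]))  # identical frequencies
print(wright_fst([0.4, 0.6]))  # modest differentiation
```

Values near zero, such as the low but significant differentiation described above, mean that almost all of the genetic variation is found within groups rather than between them.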
doi:10.1371/journal.pone.0035000
PMCID: PMC3318002  PMID: 22509376
25.  PanSNPdb: The Pan-Asian SNP Genotyping Database 
PLoS ONE  2011;6(6):e21451.
The HUGO Pan-Asian SNP consortium conducted the largest survey to date of human genetic diversity among Asians by sampling 1,719 unrelated individuals among 71 populations from China, India, Indonesia, Japan, Malaysia, the Philippines, Singapore, South Korea, Taiwan, and Thailand. We have constructed a database (PanSNPdb), which contains these data and various new analyses of them. PanSNPdb is a research resource in the analysis of the population structure of Asian peoples, including linkage disequilibrium patterns, haplotype distributions, and copy number variations. Furthermore, PanSNPdb provides an interactive comparison with other SNP and CNV databases, including HapMap3, JSNP, dbSNP and DGV and thus provides a comprehensive resource of human genetic diversity. The information is accessible via a widely accepted graphical interface used in many genetic variation databases. Unrestricted access to PanSNPdb and any associated files is available at: http://www4a.biotec.or.th/PASNP.
doi:10.1371/journal.pone.0021451
PMCID: PMC3121791  PMID: 21731755
