Related Articles

1.  Perspectives on Human Genetic Variation from the HapMap Project 
PLoS Genetics  2005;1(4):e54.
ABSTRACT
The completion of the International HapMap Project marks the start of a new phase in human genetics. The aim of the project was to provide a resource that facilitates the design of efficient genome-wide association studies, through characterising patterns of genetic variation and linkage disequilibrium in a sample of 270 individuals across four geographical populations. In total, over one million SNPs have been typed across these genomes, providing an unprecedented view of human genetic diversity. In this review we focus on what the HapMap project has taught us about the structure of human genetic variation and the fundamental molecular and evolutionary processes that shape it.
doi:10.1371/journal.pgen.0010054
PMCID: PMC1270010  PMID: 16254603
2.  Creating and evaluating genetic tests predictive of drug response 
Nature reviews. Drug discovery  2008;7(7):568-574.
A key goal of pharmacogenetics — the use of genetic variation to elucidate inter-individual variation in drug treatment response — is to aid the development of predictive genetic tests that could maximize drug efficacy and minimize drug toxicity. The completion of the Human Genome Project and the associated HapMap Project, together with advances in technologies for investigating genetic variation, have greatly advanced the potential to develop such tests; however, many challenges remain. With the aim of helping to address some of these challenges, this article discusses the steps that are involved in the development of predictive tests for drug treatment response based on genetic variation, and factors that influence the development and performance of these tests.
doi:10.1038/nrd2520
PMCID: PMC2682785  PMID: 18587383
3.  Haplotype Block Structure Is Conserved across Mammals 
PLoS Genetics  2006;2(7):e121.
Genetic variation in genomes is organized in haplotype blocks, and species-specific block structure is defined by differential contribution of population history effects in combination with mutation and recombination events. Haplotype maps characterize the common patterns of linkage disequilibrium in populations and have important applications in the design and interpretation of genetic experiments. Although evolutionary processes are known to drive the selection of individual polymorphisms, their effect on haplotype block structure dynamics has not been shown. Here, we present a high-resolution haplotype map for a 5-megabase genomic region in the rat and compare it with the orthologous human and mouse segments. Although the size and fine structure of haplotype blocks are species dependent, there is a significant interspecies overlap in structure and a tendency for blocks to encompass complete genes. Extending these findings to the complete human genome using haplotype map phase I data reveals that linkage disequilibrium values are significantly higher for equally spaced positions in genic regions, including promoters, as compared to intergenic regions, indicating that a selective mechanism exists to maintain combinations of alleles within potentially interacting coding and regulatory regions. Although this characteristic may complicate the identification of causal polymorphisms underlying phenotypic traits, conservation of haplotype structure may be employed for the identification and characterization of functionally important genomic regions.
Synopsis
Differences at the DNA level are the major contributor to the phenotypic diversity between individuals in a population. The most common type of this genetic variation is the single nucleotide polymorphism (SNP). Although the majority of SNPs do not have a functional effect, others may affect chromosome organization, gene expression, or protein function. SNPs and their individual states (alleles) are not randomly distributed throughout the genome and within a population. Recombination and mutation events, in combination with selection processes and population history, have resulted in common block-like structures in genomes. These structures are characterized by a common combination of SNP alleles, a so-called haplotype. Selection for specific haplotypes within a population is primarily driven by the advantageous effect of an individual polymorphism in the haplotype block.
By comparing the orthologous rat, mouse, and human haplotype structure of a 5-megabase region from rat Chromosome 1, the authors now show that haplotype block structure is conserved across mammals, most prominently in genic regions, suggesting the existence of an evolutionary selection process that drives the conservation of long-range allele combinations. Indeed, genome-wide gene-centric analysis of human HapMap data revealed that equally spaced polymorphic positions in genic regions and their upstream regulatory regions are genetically more tightly linked than in non-genic regions.
These findings may complicate the identification of causal polymorphisms underlying phenotypic traits, because in regions where haplotype structure is conserved, not a single polymorphism, but rather combinations of tightly linked polymorphisms could contribute to the phenotypic difference. On the other hand, conservation of haplotype structure may be employed for the identification and characterization of functionally important genomic regions.
doi:10.1371/journal.pgen.0020121
PMCID: PMC1523234  PMID: 16895449
4.  Genomic linkage map of the human blood fluke Schistosoma mansoni 
Genome Biology  2009;10(6):R71.
The first genetic linkage map of Schistosoma mansoni reveals insights into higher female recombination, confirms ZW inheritance patterns and recombination hotspots.
Background
Schistosoma mansoni is a blood fluke that infects approximately 90 million people. The complete life cycle of this parasite can be maintained in the laboratory, making this one of the few experimentally tractable human helminth infections, and a rich literature reveals heritable variation in important biomedical traits such as virulence, host-specificity, transmission and drug resistance. However, there is a current lack of tools needed to study S. mansoni's molecular, quantitative, and population genetics. Our goal was to construct a genetic linkage map for S. mansoni, and thus provide a new resource that will help stimulate research on this neglected pathogen.
Results
We genotyped grandparents, parents and 88 progeny to construct a 5.6 cM linkage map containing 243 microsatellites positioned on 203 of the largest scaffolds in the genome sequence. The map allows 70% of the estimated 300 Mb genome to be ordered on chromosomes, and highlights where scaffolds have been incorrectly assembled. The markers fall into eight main linkage groups, consistent with seven pairs of autosomes and one pair of sex chromosomes, and we were able to anchor linkage groups to chromosomes using fluorescent in situ hybridization. The genome measures 1,228.6 cM. Marker segregation reveals higher female recombination, confirms ZW inheritance patterns, and identifies recombination hotspots and regions of segregation distortion.
Conclusions
The genetic linkage map presented here is the first for S. mansoni and the first for a species in the phylum Platyhelminthes. The map provides the critical tool necessary for quantitative genetic analysis, aids genome assembly, and furnishes a framework for comparative flatworm genomics and field-based molecular epidemiological studies.
doi:10.1186/gb-2009-10-6-r71
PMCID: PMC2718505  PMID: 19566921
5.  A haplotype map of the human genome 
Nature  2005;437(7063):1299-1320.
Inherited genetic variation has a critical but as yet largely uncharacterized role in human disease. Here we report a public database of common variation in the human genome: more than one million single nucleotide polymorphisms (SNPs) for which accurate and complete genotypes have been obtained in 269 DNA samples from four populations, including ten 500-kilobase regions in which essentially all information about common DNA variation has been extracted. These data document the generality of recombination hotspots, a block-like structure of linkage disequilibrium and low haplotype diversity, leading to substantial correlations of SNPs with many of their neighbours. We show how the HapMap resource can guide the design and analysis of genetic association studies, shed light on structural variation and recombination, and identify loci that may have been subject to natural selection during human evolution.
doi:10.1038/nature04226
PMCID: PMC1880871  PMID: 16255080
6.  Trait-trait dynamic interaction: 2D-trait eQTL mapping for genetic variation study 
BMC Genomics  2008;9:242.
Background
Many studies have shown that the abundance level of gene expression is heritable. Analogous to the traditional genetic study, most researchers treat the expression of one gene as a quantitative trait and map it to expression quantitative trait loci (eQTL). This is 1D-trait mapping. 1D-trait mapping ignores the trait-trait interaction completely, which is a major shortcoming.
Results
To overcome this limitation, we study the expression of a pair of genes and treat the variation in their co-expression pattern as a two dimensional quantitative trait. We develop a method to find gene pairs, whose co-expression patterns, including both signs and strengths, are mediated by genetic variations and map these 2D-traits to the corresponding genetic loci. We report several applications by combining 1D-trait mapping with 2D-trait mapping, including the contribution of genetic variations to the perturbations in the regulatory mechanisms of yeast metabolic pathways.
Conclusion
Our approach of 2D-trait mapping provides a novel and effective way to connect the genetic variation with higher order biological modules via gene expression profiles.
doi:10.1186/1471-2164-9-242
PMCID: PMC2432080  PMID: 18498664
7.  Genes mirror geography within Europe 
Nature  2008;456(7218):98-101.
Understanding the genetic structure of human populations is of fundamental interest to medical, forensic and anthropological sciences. Advances in high-throughput genotyping technology have markedly improved our understanding of global patterns of human genetic variation and suggest the potential to use large samples to uncover variation among closely spaced populations [1–5]. Here we characterize genetic variation in a sample of 3,000 European individuals genotyped at over half a million variable DNA sites in the human genome. Despite low average levels of genetic differentiation among Europeans, we find a close correspondence between genetic and geographic distances; indeed, a geographical map of Europe arises naturally as an efficient two-dimensional summary of genetic variation in Europeans. The results emphasize that when mapping the genetic basis of a disease phenotype, spurious associations can arise if genetic structure is not properly accounted for. In addition, the results are relevant to the prospects of genetic ancestry testing [6]; an individual’s DNA can be used to infer their geographic origin with surprising accuracy, often to within a few hundred kilometres.
doi:10.1038/nature07331
PMCID: PMC2735096  PMID: 18758442
8.  SNP@Ethnos: a database of ethnically variant single-nucleotide polymorphisms 
Nucleic Acids Research  2006;35(Database issue):D711-D715.
Inherited genetic variation plays a critical but largely uncharacterized role in human differentiation. The completion of the International HapMap Project makes it possible to identify loci that may cause human differentiation. We have devised an approach to find such ethnically variant single-nucleotide polymorphisms (ESNPs) from the genotype profile of the populations included in the International HapMap database. We selected ESNPs using the nearest shrunken centroid method (NSCM), and performed multiple tests for genetic heterogeneity and frequency spectrum on genes having ESNPs. The function and disease association of the selected SNPs were also annotated. This resulted in the identification of 100 736 SNPs that appeared uniquely in each ethnic group. Of these SNPs, 1009 were within disease-associated genes, and 85 were predicted as damaging using the Sorting Intolerant From Tolerant system. This study resulted in the creation of the SNP@Ethnos database, which is designed to make this type of detailed genetic variation approach available to a wider range of researchers. SNP@Ethnos is a public database of ESNPs with annotation information that currently contains 100 736 ESNPs from 10 138 genes and is accessible online.
doi:10.1093/nar/gkl962
PMCID: PMC1747186  PMID: 17135185
9.  Parameters in Dynamic Models of Complex Traits are Containers of Missing Heritability 
PLoS Computational Biology  2012;8(4):e1002459.
Polymorphisms identified in genome-wide association studies of human traits rarely explain more than a small proportion of the heritable variation, and improving this situation within the current paradigm appears daunting. Given a well-validated dynamic model of a complex physiological trait, a substantial part of the underlying genetic variation must manifest as variation in model parameters. These parameters are themselves phenotypic traits. By linking whole-cell phenotypic variation to genetic variation in a computational model of a single heart cell, incorporating genotype-to-parameter maps, we show that genome-wide association studies on parameters reveal much more genetic variation than when using higher-level cellular phenotypes. The results suggest that letting such studies be guided by computational physiology may facilitate a causal understanding of the genotype-to-phenotype map of complex traits, with strong implications for the development of phenomics technology.
Author Summary
Despite an ever-increasing number of genome locations reported to be associated with complex human diseases or quantitative traits, only a small proportion of phenotypic variations in a typical quantitative trait can be explained by the discovered variants. We argue that this problem can partly be resolved by combining the statistical methods of quantitative genetics with computational biology. We demonstrate this for the in silico genotype-to-phenotype map of a model heart cell in conjunction with publicly accessible genomic data. We show that genome-wide association studies (GWAS) on model parameters identify more causal variants and can build better prediction models for the higher-level phenotypes than by performing GWAS on the higher-level phenotypes themselves. Since model parameters are in principle measurable physiological phenotypes, our findings suggest that development of future phenotyping technologies could be guided by mathematical models of the biological systems being targeted.
doi:10.1371/journal.pcbi.1002459
PMCID: PMC3320574  PMID: 22496634
10.  Copy Number Variants and Common Disorders: Filling the Gaps and Exploring Complexity in Genome-Wide Association Studies 
PLoS Genetics  2007;3(10):e190.
Genome-wide association scans (GWASs) using single nucleotide polymorphisms (SNPs) have been completed successfully for several common disorders and have detected over 30 new associations. Considering the large sample sizes and genome-wide SNP coverage of the scans, one might have expected many of the common variants underpinning the genetic component of various disorders to have been identified by now. However, these studies have not evaluated the contribution of other forms of genetic variation, such as structural variation, mainly in the form of copy number variants (CNVs). Known CNVs account for over 15% of the assembled human genome sequence. Since CNVs are not easily tagged by SNPs, might have a wide range of copy number variability, and often fall in genomic regions not well covered by whole-genome arrays or not genotyped by the HapMap project, current GWASs have largely missed the contribution of CNVs to complex disorders. In fact, some CNVs have already been reported to show association with several complex disorders using candidate gene/region approaches, underpinning the importance of regions not investigated in current GWASs. This reveals the need for new generation arrays (some already in the market) and the use of tailored approaches to explore the full dimension of genome variability beyond the single nucleotide scale.
doi:10.1371/journal.pgen.0030190
PMCID: PMC2039766  PMID: 17953491
12.  Comparing Spatial Maps of Human Population-Genetic Variation Using Procrustes Analysis 
Recent applications of principal components analysis (PCA) and multidimensional scaling (MDS) in human population genetics have found that “statistical maps” based on the genotypes in population-genetic samples often resemble geographic maps of the underlying sampling locations. To provide formal tests of these qualitative observations, we describe a Procrustes analysis approach for quantitatively assessing the similarity of population-genetic and geographic maps. We confirm in two scenarios, one using single-nucleotide polymorphism (SNP) data from Europe and one using SNP data worldwide, that a measurably high level of concordance exists between statistical maps of population-genetic variation and geographic maps of sampling locations. Two other examples illustrate the versatility of the Procrustes approach in population-genetic applications, verifying the concordance of SNP analyses using PCA and MDS, and showing that statistical maps of worldwide copy-number variants (CNVs) accord with statistical maps of SNP variation, especially when CNV analysis is limited to samples with the highest-quality data. As statistical maps with PCA and MDS have become increasingly common for use in summarizing population relationships, our examples highlight the potential of Procrustes-based quantitative comparisons for interpreting the results in these maps.
doi:10.2202/1544-6115.1493
PMCID: PMC2861313  PMID: 20196748
multidimensional scaling; population genetics; principal components analysis; Procrustes analysis
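The Procrustes comparison described in this abstract can be illustrated with a small sketch. It is not the authors' code: the function name `procrustes_similarity` and the toy coordinates are hypothetical, and the score used here (the sum of singular values after centering and scaling both maps, which equals sqrt(1 - D) for the minimized, scaled sum of squared distances D) is one common formulation assumed for illustration.

```python
import numpy as np

def procrustes_similarity(X, Y):
    """Similarity in [0, 1] between two n x 2 point sets after the best
    translation, scaling, and rotation/reflection of one onto the other."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Remove translation by centering each map on its centroid.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)
    # Remove scale by normalizing each map to unit Frobenius norm.
    X0 /= np.linalg.norm(X0)
    Y0 /= np.linalg.norm(Y0)
    # The sum of singular values of X0^T Y0 is the largest trace achievable
    # over all orthogonal transformations (rotations and reflections).
    singular_values = np.linalg.svd(X0.T @ Y0, compute_uv=False)
    return float(singular_values.sum())

# Hypothetical "geographic" coordinates for four sampling locations.
geo = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 2.0]])

# A "genetic" map that is a rotated, scaled, shifted copy of the geography.
theta = np.pi / 6
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
genetic = 3.0 * geo @ R + np.array([5.0, -2.0])

# A map with a different shape, for contrast.
other = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [1.0, 0.0]])

score_same = procrustes_similarity(geo, genetic)
score_other = procrustes_similarity(geo, other)
print(score_same, score_other)
```

A score near 1 indicates that the two maps differ only by translation, scale, and rotation or reflection; lower scores indicate weaker concordance. Assessing significance, for example by permuting sample labels, would be a separate step not shown here.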
13.  Bench-to-bedside review: Fulfilling promises of the Human Genome Project 
Critical Care  2002;6(3):212-215.
Since most common diseases have been shown to be influenced by inherited variations in our genes, completion of the Human Genome Project and mapping of the human genome single-nucleotide polymorphisms will have a tremendous impact on our approach to medicine. New developments in genotyping techniques and bioinformatics, enabling detection of single-nucleotide polymorphisms, already provide physicians and scientists with tools that change our understanding of human biology. In the near future, studies will relate genetic polymorphisms to features of critical illnesses, increased susceptibility to common diseases, and altered response to therapy. Novel insights into the contribution of genetic factors to critical illnesses and advances in pharmacogenomics will be used to select the most effective therapeutic agent and the optimal dosage required to elicit the expected drug response for a given individual. Implementation of genetic criteria for patient selection and individual assessment of the risks and benefits of treatment emerges as a major challenge to the pharmaceutical industry.
PMCID: PMC137447  PMID: 12133180
genetics; pharmacogenomics; polymorphism
14.  An integrated 4249 marker FISH/RH map of the canine genome 
BMC Genomics  2004;5:65.
Background
The 156 breeds of dog recognized by the American Kennel Club offer a unique opportunity to map genes important in genetic variation. Each breed features a defining constellation of morphological and behavioral traits, often generated by deliberate crossing of closely related individuals, leading to a high rate of genetic disease in many breeds. Understanding the genetic basis of both phenotypic variation and disease susceptibility in the dog provides new ways in which to dissect the genetics of human health and biology.
Results
To facilitate both genetic mapping and cloning efforts, we have constructed an integrated canine genome map that is both dense and accurate. The resulting resource encompasses 4249 markers, and was constructed using the RHDF5000-2 whole genome radiation hybrid panel. The radiation hybrid (RH) map features a density of one marker every 900 Kb and contains 1760 bacterial artificial chromosome clones (BACs) localized to 1423 unique positions, 851 of which have also been mapped by fluorescence in situ hybridization (FISH). The two data sets show excellent concordance. Excluding the Y chromosome, the map features an RH/FISH mapped BAC every 3.5 Mb and an RH mapped BAC-end, on average, every 2 Mb. For 2233 markers, the orthologous human genes have been established, allowing the identification of 79 conserved segments (CS) between the dog and human genomes, dramatically extending the length of most previously described CS.
Conclusions
These results provide a necessary resource for the canine genome mapping community to undertake positional cloning experiments and provide new insights into the comparative canine-human genome maps.
doi:10.1186/1471-2164-5-65
PMCID: PMC520820  PMID: 15363096
canine; dog; radiation hybrid; microsatellites; ESTs; BAC-ends
15.  Impact of the 1000 Genomes Project on the next wave of pharmacogenomic discovery 
Pharmacogenomics  2010;11(2):249-256.
The 1000 Genomes Project aims to provide detailed genetic variation data on over 1000 genomes from worldwide populations using next-generation sequencing technologies. Some of the samples utilized for the 1000 Genomes Project are the International HapMap samples, which are composed of lymphoblastoid cell lines derived from individuals of different world populations. These same samples have been used in pharmacogenomic discovery and validation. For example, a cell-based, genome-wide approach using the HapMap samples has been used to identify pharmacogenomic loci associated with chemotherapeutic-induced cytotoxicity, with the goal of identifying genetic markers for clinical evaluation. Although the coverage of the current HapMap data is generally high, the detailed map of human genetic variation promised by the 1000 Genomes Project will allow a more in-depth analysis of the contribution of genetic variation to drug response. Future studies utilizing this new resource may greatly enhance our understanding of the genetic basis of drug response and other complex traits (e.g., gene expression), and thereby help advance personalized medicine.
doi:10.2217/pgs.09.173
PMCID: PMC2833269  PMID: 20136363
drug response; genetic variation; HapMap; lymphoblastoid cell lines; pharmacogenomics; SNP
16.  The HapMap Resource is Providing New Insights into Ourselves and its Application to Pharmacogenomics 
The exploration of quantitative variation in complex traits such as gene expression and drug response in human populations has become one of the major priorities for medical genetics. The International HapMap Project provides a key resource of genotypic data on human lymphoblastoid cell lines derived from four major world populations of European, African, Chinese and Japanese ancestry for researchers to associate with various phenotypic data to find genes affecting health, disease and response to drugs. Recent progress in dissecting genetic contribution to natural variation in gene expression within and among human populations and variation in drug response are two examples in which researchers have utilized the HapMap resource. The HapMap Project provides new insights into the human genome and has applicability to pharmacogenomics studies leading to personalized medicine.
PMCID: PMC2288550  PMID: 18392109
HapMap; lymphoblastoid cell lines; genotype; gene expression; population genetics
18.  Demonstration of Loss of Heterozygosity by Single-Nucleotide Polymorphism Microarray Analysis and Alterations in Strain Morphology in Candida albicans Strains during Infection 
Eukaryotic Cell  2005;4(1):156-165.
Candida albicans is a diploid yeast with a predominantly clonal mode of reproduction, and no complete sexual cycle is known. As a commensal organism, it inhabits a variety of niches in humans. It becomes an opportunistic pathogen in immunocompromised patients and can cause both superficial and disseminated infections. It has been demonstrated that genome rearrangement and genetic variation in isolates of C. albicans are quite common. One possible mechanism for generating genome-level variation among individuals of this primarily clonal fungus is mutation and mitotic recombination leading to loss of heterozygosity (LOH). Taking advantage of a recently published genome-wide single-nucleotide polymorphism (SNP) map (A. Forche, P. T. Magee, B. B. Magee, and G. May, Eukaryot. Cell 3:705-714, 2004), an SNP microarray was developed for 23 SNP loci residing on chromosomes 5, 6, and 7. It was used to examine 21 strains previously shown to have undergone mitotic recombination at the GAL1 locus on chromosome 1 during infection in mice. In addition, karyotypes and morphological properties of these strains were evaluated. Our results show that during in vivo passaging, LOH events occur at observable frequencies, that such mitotic recombination events occur independently in different loci across the genome, and that changes in karyotypes and alterations of phenotypic characteristics can be observed alone, in combination, or together with LOH.
doi:10.1128/EC.4.1.156-165.2005
PMCID: PMC544165  PMID: 15643071
19.  Generation of a BAC-based physical map of the melon genome 
BMC Genomics  2010;11:339.
Background
Cucumis melo (melon) belongs to the Cucurbitaceae family, whose economic importance among horticulture crops is second only to Solanaceae. Melon has high intra-specific genetic variation, morphologic diversity and a small genome size (450 Mb), which make this species suitable for a great variety of molecular and genetic studies that can lead to the development of tools for breeding varieties of the species. A number of genetic and genomic resources have already been developed, such as several genetic maps and BAC genomic libraries. These tools are essential for the construction of a physical map, a valuable resource for map-based cloning, comparative genomics and assembly of whole genome sequencing data. However, no physical map of any Cucurbitaceae has yet been developed. A project has recently been started to sequence the complete melon genome following a whole-genome shotgun strategy, which makes use of massive sequencing data. A BAC-based melon physical map will be a useful tool to help assemble and refine the draft genome data that is being produced.
Results
A melon physical map was constructed using a 5.7 × BAC library and a genetic map previously developed in our laboratories. High-information-content fingerprinting (HICF) was carried out on 23,040 BAC clones, digesting with five restriction enzymes and SNaPshot labeling, followed by contig assembly with FPC software. The physical map has 1,355 contigs and 441 singletons, with an estimated physical length of 407 Mb (0.9 × coverage of the genome) and the longest contig being 3.2 Mb. The anchoring of 845 BAC clones to 178 genetic markers (100 RFLPs, 76 SNPs and 2 SSRs) also allowed the genetic positioning of 183 physical map contigs/singletons, representing 55 Mb (12%) of the melon genome, to individual chromosomal loci. The melon FPC database is available for download at http://melonomics.upv.es/static/files/public/physical_map/.
Conclusions
Here we report the construction of the first physical map of a Cucurbitaceae species described so far. The physical map was integrated with the genetic map so that a number of physical contigs, representing 12% of the melon genome, could be anchored to known genetic positions. The data presented is already helping to improve the quality of the melon genomic sequence available as a result of a project currently being carried out in Spain, adopting a whole genome shotgun approach based on 454 sequencing data.
doi:10.1186/1471-2164-11-339
PMCID: PMC2894041  PMID: 20509895
20.  How well do HapMap SNPs capture the untyped SNPs? 
BMC Genomics  2006;7:238.
Background
Recent advances in human genome sequencing and genotyping have revealed millions of single nucleotide polymorphisms (SNPs), which determine the variation among human beings. One particularly important project is the International HapMap Project, which provides a catalogue of human genetic variation for disease association studies. In this paper, we analyzed the genotype data in the HapMap project by using National Institute of Environmental Health Sciences Environmental Genome Project (NIEHS EGP) SNPs. We first determine whether the HapMap data are transferable to the NIEHS data. Then, we study how well the HapMap SNPs capture the untyped SNPs in the region. Finally, we provide general guidelines for determining whether the SNPs chosen from HapMap may be able to capture most of the untyped SNPs.
Results
Our analysis shows that HapMap data are not robust enough to capture the untyped variants for most human genes. The performance of SNPs for the European and Asian samples is marginal, capturing approximately 55% of the untyped variants. As expected, the SNPs from the HapMap YRI panel capture only approximately 30% of the variants. Although the overall performance is low, the SNPs for some genes perform very well and are able to capture most of the variants along the gene. This is observed in the European and Asian panels, but not in the African panel. We conclude that, for a well-covered SNP reference panel, SNP density and the association among reference SNPs are important for estimating the robustness of the chosen SNPs.
Conclusion
We have analyzed the coverage of HapMap SNPs using NIEHS EGP data. The results show that HapMap SNPs are transferable to the NIEHS SNPs. However, HapMap SNPs cannot capture some of the untyped SNPs, and resequencing may therefore be needed to uncover more SNPs in the missing regions.
doi:10.1186/1471-2164-7-238
PMCID: PMC1586200  PMID: 16982009
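The coverage figures quoted in this abstract (roughly 55% and 30%) are fractions of untyped SNPs "captured" by typed SNPs. A minimal sketch of how such a fraction could be computed is given below. It is an illustration only, not the authors' method: the abstract does not state the criterion used, so the sketch assumes that an untyped SNP counts as captured when its best pairwise r² with any typed SNP is at least 0.8 (a commonly used tagging threshold), and it uses the squared correlation of genotype allele counts as a stand-in for haplotype-based r². The function names `r_squared` and `coverage` are hypothetical.

```python
import numpy as np


def r_squared(a, b):
    """Squared Pearson correlation between two genotype vectors (0/1/2 allele counts)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # A monomorphic SNP has no variance, so it cannot be correlated with anything.
    if a.std() == 0 or b.std() == 0:
        return 0.0
    return float(np.corrcoef(a, b)[0, 1] ** 2)


def coverage(typed, untyped, threshold=0.8):
    """Fraction of untyped SNPs whose best r^2 with any typed SNP meets the threshold.

    `threshold=0.8` is an assumed convention, not a value taken from the abstract.
    """
    if len(untyped) == 0:
        raise ValueError("need at least one untyped SNP")
    captured = 0
    for u in untyped:
        best = max((r_squared(u, t) for t in typed), default=0.0)
        if best >= threshold:
            captured += 1
    return captured / len(untyped)


# Toy data: one typed SNP and three untyped SNPs genotyped in six individuals.
typed = [[0, 1, 2, 0, 1, 2]]
untyped = [
    [0, 1, 2, 0, 1, 2],  # identical to the typed SNP
    [2, 1, 0, 2, 1, 0],  # same SNP with the alleles swapped
    [0, 0, 1, 2, 2, 1],  # uncorrelated with the typed SNP
]
print(coverage(typed, untyped))
```

In this toy data the first two untyped SNPs are perfectly correlated with the typed SNP and the third is uncorrelated, so two of the three are captured.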
21.  A pharmacogene database enhanced by the 1000 Genomes Project 
Pharmacogenetics and genomics  2009;19(10):829-832.
Human genetic variation is likely to be responsible for a substantial fraction of the variability in complex traits including drug response. Single nucleotide polymorphisms (SNPs) have been implicated in drug response using genome-wide association studies as well as candidate-gene approaches. A more comprehensive catalogue of human genetic variation should complement the current large-scale genotypic dataset from the International HapMap Project, which focuses on common genetic variants. The 1000 Genomes Project (KGP) is an international research effort that aims to provide the most comprehensive map of human genetic variation using next-generation sequencing platforms. Due to the lack of convenient tools, however, it is a challenge for the pharmacogenetic research community to take advantage of these data. We present here a new database of some pharmacogenes of particular interest to pharmacogenetic researchers. Our database provides a convenient portal for immediate utilization of the newly released KGP data in pharmacogenetic studies.
doi:10.1097/FPC.0b013e3283317bac
PMCID: PMC2935084  PMID: 19745786
pharmacogenetics; pharmacogene; single nucleotide polymorphism; next generation sequencing; database
22.  A Quantitative Comparison of the Similarity between Genes and Geography in Worldwide Human Populations 
PLoS Genetics  2012;8(8):e1002886.
Multivariate statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been widely used to summarize the structure of human genetic variation, often in easily visualized two-dimensional maps. Many recent studies have reported similarity between geographic maps of population locations and MDS or PCA maps of genetic variation inferred from single-nucleotide polymorphisms (SNPs). However, this similarity has been evident primarily in a qualitative sense; and, because different multivariate techniques and marker sets have been used in different studies, it has not been possible to formally compare genetic variation datasets in terms of their levels of similarity with geography. In this study, using genome-wide SNP data from 128 populations worldwide, we perform a systematic analysis to quantitatively evaluate the similarity of genes and geography in different geographic regions. For each of a series of regions, we apply a Procrustes analysis approach to find an optimal transformation that maximizes the similarity between PCA maps of genetic variation and geographic maps of population locations. We consider examples in Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, as well as in a worldwide sample, finding that significant similarity between genes and geography exists in general at different geographic levels. The similarity is highest in our examples for Asia and, once highly distinctive populations have been removed, Sub-Saharan Africa. Our results provide a quantitative assessment of the geographic structure of human genetic variation worldwide, supporting the view that geography plays a strong role in giving rise to human population structure.
Author Summary
The spatial pattern of human genetic variation provides a basis for investigating the history of human migrations. Statistical techniques such as principal components analysis (PCA) and multidimensional scaling (MDS) have been used to summarize spatial patterns of genetic variation, typically by placing individuals on a two-dimensional map in such a way that pairwise Euclidean distances between individuals on the map approximately reflect corresponding genetic relationships. Although similarity between these statistical maps of genetic variation and the geographic maps of sampling locations is often observed, it has not been assessed systematically across different parts of the world. In this study, we combine genome-wide SNP data from more than 100 populations worldwide to perform a formal comparison between genes and geography in different regions. By examining a worldwide sample and samples from Europe, Sub-Saharan Africa, Asia, East Asia, and Central/South Asia, we find that significant similarity between genes and geography exists in general in different geographic regions and at different geographic levels. Surprisingly, the highest similarity is found in Asia, even though the geographic barrier of the Himalaya Mountains has created a discontinuity on the PCA map of genetic variation.
doi:10.1371/journal.pgen.1002886
PMCID: PMC3426559  PMID: 22927824
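The Procrustes step described in this abstract, finding a transformation that maximizes the similarity between a PCA map and a geographic map, can be sketched as follows. This is a hedged illustration rather than the study's implementation: it assumes both maps are given as matched rows of 2-D coordinates, allows translation, uniform scaling, rotation and reflection, and reports the similarity as sqrt(1 - D), where D is the minimized sum of squared distances after both maps are scaled to unit size. The function name `procrustes_similarity` is hypothetical.

```python
import numpy as np


def procrustes_similarity(X, Y):
    """Procrustes similarity between two matched point sets, in [0, 1].

    X, Y: arrays of shape (n_points, n_dims) with rows in the same order.
    Returns sqrt(1 - D), where D is the minimized squared distance after
    centering, unit scaling, and the best rotation/reflection of X onto Y.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if X.shape != Y.shape:
        raise ValueError("X and Y must have the same shape")
    # Remove translation by centering each map on its centroid.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)
    nx, ny = np.linalg.norm(X0), np.linalg.norm(Y0)
    if nx == 0 or ny == 0:
        raise ValueError("each map needs at least two distinct points")
    # Remove overall scale by normalizing to unit Frobenius norm.
    X0 /= nx
    Y0 /= ny
    # The singular values of X0^T Y0 give the optimum over rotations/reflections:
    # the minimized distance is D = 1 - (sum of singular values)^2.
    s = np.linalg.svd(X0.T @ Y0, compute_uv=False)
    d = min(max(1.0 - s.sum() ** 2, 0.0), 1.0)
    return float(np.sqrt(1.0 - d))


# Toy check: a rotated, rescaled, shifted copy of a map is a perfect match.
geo = np.array([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
theta = 0.7
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta), np.cos(theta)]])
pca = 3.0 * geo @ rot + np.array([10.0, -4.0])
print(procrustes_similarity(geo, pca))
```

Assessing whether an observed similarity is statistically significant would need an additional step, such as a permutation test over the row correspondence, which is not shown here.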
23.  Commentary: Trailblazing a Research Agenda at the Interface of Pediatrics and Genomic Discovery—a Commentary on the Psychological Aspects of Genomics and Child Health 
Journal of Pediatric Psychology  2009;34(6):662-664.
Unprecedented advances in human genome science are underway with potential to benefit public health. For example, it is estimated that within a decade, geneticists and epidemiologists will complete a catalog of the majority of genes associated with common chronic diseases. Such rapid advances create possibilities, if not the mandate, for translational research in how best to apply these and other anticipated discoveries for both individual and population health benefit. Driving these discoveries are rapid advances in infrastructure (e.g., the International HapMap Project to catalog human genetic variation; http://www.hapmap.org), analytical methods, and technology. This expansion in capabilities has quickly taken us from a genetics paradigm, where the influence of individual genes on health outcomes is paramount, to a genomics paradigm, where the complex influence of individual genes is considered in concert with each other and with environmental exposures on health outcomes. We discuss these and similar groundbreaking discoveries with an eye toward understanding their importance to child health and human development, and the role of behavioral science research conducted at the interface of pediatrics and genomic discovery.
doi:10.1093/jpepsy/jsn125
PMCID: PMC2722104  PMID: 19129267
24.  Sequencing and analysis of an Irish human genome 
Genome Biology  2010;11(9):R91.
Background
Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.
Results
Using sequence data from a branch of the European ancestral tree as yet unsequenced, we identify variants that may be specific to this population. Through comparisons with HapMap and previous genetic association studies, we identify novel disease-associated variants, including a novel nonsense variant putatively associated with inflammatory bowel disease. We describe a novel method for improving SNP calling accuracy at low genome coverage using haplotype information. This analysis has implications for future re-sequencing studies and validates the imputation of Irish haplotypes using data from the current Human Genome Diversity Cell Line Panel (HGDP-CEPH). Finally, we identify gene duplication events as constituting significant targets of recent positive selection in the human lineage.
Conclusions
Our findings show that there remains utility in generating whole genome sequences, both to illustrate general principles and to reveal specific instances of human biology. With increasing access to low-cost sequencing, we predict that a number of similar initiatives geared towards answering specific biological questions will emerge, even from groups armed only with the resources of a small research team.
doi:10.1186/gb-2010-11-9-r91
PMCID: PMC2965383  PMID: 20822512
25.  A genome-wide analysis of population structure in the Finnish Saami with implications for genetic association studies 
The understanding of patterns of genetic variation within and among human populations is a prerequisite for successful genetic association mapping studies of complex diseases and traits. Some populations are more favorable for association mapping studies than others. The Saami from northern Scandinavia and the Kola Peninsula represent a population isolate that, among European populations, has been less extensively sampled, despite some early interest for association mapping studies. In this paper, we report the results of a first genome-wide SNP-based study of genetic population structure in the Finnish Saami. Using data from the HapMap and the human genome diversity project (HGDP-CEPH) and recently developed statistical methods, we studied individual genetic ancestry. We quantified genetic differentiation between the Saami population and the HGDP-CEPH populations by calculating pair-wise FST statistics and by characterizing identity-by-state sharing for pair-wise population comparisons. This study affirms an east Asian contribution to the predominantly European-derived Saami gene pool. Using model-based individual ancestry analysis, the median estimated percentage of the genome with east Asian ancestry was 6% (first and third quartiles: 5 and 8%, respectively). We found that genetic similarity between population pairs roughly correlated with geographic distance. Among the European HGDP-CEPH populations, FST was smallest for the comparison with the Russians (FST=0.0098), and estimates for the other population comparisons ranged from 0.0129 to 0.0263. Our analysis also revealed fine-scale substructure within the Finnish Saami and warns against the confounding effects of both hidden population structure and undocumented relatedness in genetic association studies of isolated populations.
doi:10.1038/ejhg.2010.179
PMCID: PMC3062008  PMID: 21150888
Saami; genetic association studies; population structure; population isolates
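The pairwise FST values reported in this abstract measure how much allele frequencies differ between two populations. A minimal sketch of one simple way to compute such a value is shown below. It is an assumption-laden illustration, not the estimator used in the study, which the abstract does not name: it uses the classical definition FST = (HT - HS) / HT from per-SNP allele frequencies, weights the two populations equally, ignores sample-size corrections, and combines SNPs as a ratio of sums. The function name `pairwise_fst` is hypothetical.

```python
import numpy as np


def pairwise_fst(p1, p2):
    """Simple two-population FST from per-SNP allele frequencies.

    p1, p2: frequencies of the same allele at each SNP in populations 1 and 2.
    Uses FST = sum(HT - HS) / sum(HT), with equal population weights and no
    sample-size correction (assumptions made for this sketch).
    """
    p1 = np.asarray(p1, dtype=float)
    p2 = np.asarray(p2, dtype=float)
    if p1.shape != p2.shape:
        raise ValueError("p1 and p2 must have the same shape")
    # HS: mean expected heterozygosity within the two populations.
    hs = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    # HT: expected heterozygosity of the pooled population.
    p_bar = (p1 + p2) / 2
    ht = 2 * p_bar * (1 - p_bar)
    total = ht.sum()
    if total == 0:
        raise ValueError("no polymorphic SNPs: FST is undefined")
    return float((ht - hs).sum() / total)


# Toy frequencies for two SNPs in two populations.
print(pairwise_fst([0.6, 0.2], [0.4, 0.4]))
```

Identical frequencies in both populations give 0, and populations fixed for different alleles give 1; values near 0.01, like those quoted in the abstract, correspond to small frequency differences.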
