Results 1-5 (5)
 

1.  Pitfalls of Merging GWAS Data: Lessons Learned in the eMERGE Network and Quality Control Procedures to Maintain High Data Quality 
Genetic epidemiology  2011;35(8):887-898.
Genome-wide association studies (GWAS) are a useful approach in the study of the genetic components of complex phenotypes. Aside from large cohorts, GWAS have generally been limited to the study of one or a few diseases or traits. The emergence of biobanks linked to electronic medical records (EMRs) allows the efficient re-use of genetic data to yield meaningful genotype-phenotype associations for multiple phenotypes or traits. Phase I of the electronic MEdical Records and GEnomics (eMERGE-I) Network is a National Human Genome Research Institute (NHGRI)-supported consortium composed of five sites to perform various genetic association studies using DNA repositories and EMR systems. Each eMERGE site has developed EMR-based algorithms to comprise a core set of fourteen phenotypes for extraction of study samples from each site’s DNA repository. Each eMERGE site selected samples for a specific phenotype, and these samples were genotyped at either the Broad Institute or at the Center for Inherited Disease Research (CIDR) using the Illumina Infinium BeadChip technology. In all, approximately 17,000 samples from across the five sites were genotyped. A unified quality control (QC) pipeline was developed by the eMERGE Genomics Working Group and used to ensure thorough cleaning of the data. This process includes examination of sample quality, marker quality, and various batch effects. Upon completion of the genotyping and QC analyses for each site’s primary study, the eMERGE Coordinating Center merged the datasets from all five sites. This larger merged dataset re-entered the established eMERGE QC pipeline. Based on lessons learned during the process, additional analyses and QC checkpoints were added to the pipeline to ensure proper merging. Here we explore the challenges associated with combining datasets from different genotyping centers and describe the expansion to the eMERGE QC pipeline for merged datasets. 
These additional steps will be useful as the eMERGE project expands to include additional sites in eMERGE-II and also serve as a starting point for investigators merging multiple genotype data sets accessible through the National Center for Biotechnology Information (NCBI) in the database of Genotypes and Phenotypes (dbGaP). Our experience demonstrates that merging multiple datasets after additional QC can be an efficient use of genotype data despite new challenges that appear in the process.
doi:10.1002/gepi.20639
PMCID: PMC3592376  PMID: 22125226
quality control; genome-wide association (GWAS); eMERGE; dbGaP; merging datasets
2.  A Large Candidate Gene Survey Identifies the KCNE1 D85N Polymorphism as a Possible Modulator of Drug-Induced Torsades de Pointes 
Background
Drug-induced long QT syndrome (diLQTS) is an adverse drug effect that has an important impact on drug use, development, and regulation. Here, we tested the hypothesis that common variants in key genes controlling cardiac electrical properties modify the risk of diLQTS.
Methods and Results
In a case-control setting, we included 176 patients of European descent from North America and Europe with diLQTS, defined as documented torsades de pointes during treatment with a QT-prolonging drug. Control samples were obtained from 207 patients of European ancestry who displayed <50 msec QT lengthening during initiation of therapy with a QT-prolonging drug, and from 837 controls in the population-based KORA study. Subjects were successfully genotyped at 1,424 single nucleotide polymorphisms (SNPs) in 18 candidate genes, including 1,386 SNPs tagging common haplotype blocks and 38 non-synonymous ion channel gene SNPs. For validation, we used a set of cases (n=57) and population-based controls of European descent. The SNP KCNE1 D85N (rs1805128), known to modulate an important potassium current in the heart, predicted diLQTS with an odds ratio of 9.0 (95% confidence interval: 3.5–22.9). The variant allele was present in 8.6% of cases, 2.9% of drug-exposed controls, and 1.8% of population controls. In the validation cohort, the variant allele was present in 3.5% of cases and in 1.4% of controls.
Conclusions
This high-density candidate SNP approach identified a key potassium channel susceptibility allele that may be associated with the rare adverse drug reaction torsades de pointes.
doi:10.1161/CIRCGENETICS.111.960930
PMCID: PMC3288202  PMID: 22100668
candidate genes; death, sudden; SNP; torsade de pointes; adverse drug events
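For readers unfamiliar with the statistic reported in the abstract above, the following is a minimal sketch of how an odds ratio and a 95% Wald confidence interval can be computed from a 2x2 carrier-by-status table. The abstract does not say how its odds ratio was estimated, so this is not necessarily the authors' method. The function name `odds_ratio_wald_ci` and the cell counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases carrying the variant      b: cases not carrying it
    c: controls carrying the variant   d: controls not carrying it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper


# Hypothetical counts, NOT taken from the study.
estimate, lower, upper = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR = {estimate:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval is built on the log scale, so it is asymmetric around the point estimate, as is the interval quoted in the abstract (3.5 to 22.9 around 9.0). With sparse cells, as is common for rare alleles, exact or model-based intervals are usually preferred over this approximation.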
3.  Quality Control Procedures for Genome Wide Association Studies 
Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. The recent application of GWAS to clinic-based cohorts has also yielded genetic predictors of clinical outcomes. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. With each new dataset, new realities are discovered about GWAS data and best practices continue to be developed. The Genomics Workgroup of the National Human Genome Research Institute (NHGRI)-funded electronic Medical Records and Genomics (eMERGE) network has invested considerable effort in developing strategies for QC of these data. The lessons learned by this group will be valuable for other investigators dealing with large-scale genomic datasets. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the eMERGE network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. In this protocol we discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.
doi:10.1002/0471142905.hg0119s68
PMCID: PMC3066182  PMID: 21234875
4.  Follow-up examination of linkage and association to chromosome 1q43 in multiple sclerosis 
Genes and immunity  2009;10(7):624-630.
Multiple sclerosis (MS) is a debilitating neuroimmunological and neurodegenerative disease affecting more than 400,000 individuals in the United States. Population and family-based studies have suggested that there is a strong genetic component. Numerous genomic linkage screens have identified regions of interest for MS loci. Our own second-generation genome-wide linkage study identified a handful of non-MHC regions with suggestive linkage. Several of these regions were further examined using single-nucleotide polymorphisms (SNPs) with average spacing between SNPs of approximately 1.0 Mb in a dataset of 173 multiplex families. The results of that study provided further evidence for the involvement of the chromosome 1q43 region. This region is of particular interest given linkage evidence in studies of other autoimmune and inflammatory diseases, including rheumatoid arthritis and systemic lupus erythematosus. In this follow-up study, we saturated the region with ~700 SNPs (average spacing of 10 kb per SNP) in search of disease-associated variation. We found preliminary evidence to suggest that common variation within the RGS7 locus may be involved in disease susceptibility.
doi:10.1038/gene.2009.53
PMCID: PMC2765552  PMID: 19626040
multiple sclerosis; linkage; association; 1q43; RGS7
5.  Examination of NRCAM, LRRN3, KIAA0716, and LAMB1 as autism candidate genes 
BMC Medical Genetics  2004;5:12.
Background
A substantial body of research supports a genetic involvement in autism. Furthermore, results from various genomic screens implicate a region on chromosome 7q31 as harboring an autism susceptibility variant. We previously narrowed this 34 cM region to a 3 cM critical region (located between D7S496 and D7S2418) using the Collaborative Linkage Study of Autism (CLSA) chromosome 7 linked families. This interval encompasses about 4.5 Mb of genomic DNA and encodes over fifty known and predicted genes. Four candidate genes (NRCAM, LRRN3, KIAA0716, and LAMB1) in this region were chosen for examination based on their proximity to the marker most consistently cosegregating with autism in these families (D7S1817), their tissue expression patterns, and likely biological relevance to autism.
Methods
Thirty-six intronic and exonic single nucleotide polymorphisms (SNPs) and one microsatellite marker within and around these four candidate genes were genotyped in 30 chromosome 7q31 linked families. Multiple SNPs were used to provide as complete coverage as possible since linkage disequilibrium can vary dramatically across even very short distances within a gene. Analyses of these data used the Pedigree Disequilibrium Test for single markers and a multilocus likelihood ratio test.
Results
As expected, linkage disequilibrium (LD) occurred within each of these genes, but we did not observe significant LD across genes. None of the polymorphisms in NRCAM, LRRN3, or KIAA0716 gave p < 0.05, suggesting that none of these genes is associated with autism susceptibility in this subset of chromosome 7-linked families. However, with LAMB1, the allelic association analysis revealed suggestive evidence for a positive association, including one individual SNP (p = 0.02) and three separate two-SNP haplotypes across the gene (p = 0.007, 0.012, and 0.012).
Conclusions
NRCAM, LRRN3, and KIAA0716 are unlikely to be involved in autism. There is some evidence that variation in or near the LAMB1 gene may be involved in autism.
doi:10.1186/1471-2350-5-12
PMCID: PMC420465  PMID: 15128462
