Genome-wide association studies (GWAS) are being conducted at an unprecedented rate in population-based cohorts and have increased our understanding of the pathophysiology of complex disease. The recent application of GWAS to clinic-based cohorts has also yielded genetic predictors of clinical outcomes. Regardless of context, the practical utility of this information will ultimately depend upon the quality of the original data. Quality control (QC) procedures for GWAS are computationally intensive, operationally challenging, and constantly evolving. With each new dataset, new realities are discovered about GWAS data and best practices continue to be developed. The Genomics Workgroup of the National Human Genome Research Institute (NHGRI) funded electronic Medical Records and Genomics (eMERGE) network has invested considerable effort in developing strategies for QC of these data. The lessons learned by this group will be valuable for other investigators dealing with large scale genomic datasets. Here we enumerate some of the challenges in QC of GWAS data and describe the approaches that the eMERGE network is using for quality assurance in GWAS data, thereby minimizing potential bias and error in GWAS results. In this protocol we discuss common issues associated with QC of GWAS data, including data file formats, software packages for data manipulation and analysis, sex chromosome anomalies, sample identity, sample relatedness, population substructure, batch effects, and marker quality. We propose best practices and discuss areas of ongoing and future research.
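As one illustration of the marker-quality checks mentioned above, the following minimal sketch computes a per-marker call rate and minor allele frequency and applies a simple filter. The genotype coding (0, 1, 2 copies of an allele, `None` for a missing call), the function names, and the thresholds (98% call rate, 1% minor allele frequency) are assumptions made for illustration; they are not the thresholds or the software used by the eMERGE network.

```python
from typing import Optional, Sequence

# Assumed coding: 0, 1 or 2 copies of the alternate allele; None = missing call.
Genotype = Optional[int]


def call_rate(genotypes: Sequence[Genotype]) -> float:
    """Fraction of samples with a non-missing genotype call at this marker."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)


def minor_allele_frequency(genotypes: Sequence[Genotype]) -> float:
    """Minor allele frequency among called genotypes (0.0 if nothing was called)."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_marker_qc(
    genotypes: Sequence[Genotype],
    min_call_rate: float = 0.98,  # hypothetical threshold
    min_maf: float = 0.01,        # hypothetical threshold
) -> bool:
    """True if the marker meets both the call-rate and the MAF threshold."""
    return (
        call_rate(genotypes) >= min_call_rate
        and minor_allele_frequency(genotypes) >= min_maf
    )
```

In practice such filters are usually applied with dedicated software across all markers and samples; the sketch only shows what the two quantities measure.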
Age-related macular degeneration is the leading cause of blindness among the adult population in the developed world. To further the understanding of this disease, we have studied the genetically isolated Amish population of Ohio and Indiana.
Cumulative genetic risk scores were calculated using the 19 known allelic associations. Exome sequencing was performed in three members of a small Amish family with AMD who lacked the common risk alleles in complement factor H (CFH) and ARMS2/HTRA1. Follow-up genotyping and association analysis was performed in a cohort of 973 Amish individuals, including 95 with self-reported AMD.
The cumulative genetic risk score analysis generated a mean genetic risk score of 1.12 (95% confidence interval [CI]: 1.10, 1.13) in the Amish controls and 1.18 (95% CI: 1.13, 1.22) in the Amish cases. This mean difference in genetic risk scores is statistically significant (P = 0.0042). Exome sequencing identified a rare variant (P503A) in CFH. Association analysis in the remainder of the Amish sample revealed that the P503A variant is significantly associated with AMD (P = 9.27 × 10−13). Variant P503A was absent when evaluated in a cohort of 791 elderly non-Amish controls, and 1456 non-Amish cases.
Data from the cumulative genetic risk score analysis suggest that the variants reported by the AMDGene consortium account for a smaller genetic burden of disease in the Amish compared with the non-Amish Caucasian population. Using exome sequencing data, we identified a novel missense mutation that is shared among a densely affected nuclear Amish family and located in a gene that has been previously implicated in AMD risk.
In this study, we describe the analysis of the genetically isolated Amish population of Ohio and Indiana for AMD.
age-related macular degeneration; linkage analysis; rare variant; exome sequencing; risk score analysis
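The abstract above does not state how the cumulative genetic risk score was computed. A common construction, shown here only as a sketch under that assumption, sums the number of risk alleles carried at each variant weighted by the log odds ratio of that variant. The variant names and odds ratios in the usage note are hypothetical placeholders, not values from the study.

```python
import math
from typing import Mapping


def genetic_risk_score(
    risk_allele_counts: Mapping[str, int],
    odds_ratios: Mapping[str, float],
) -> float:
    """Weighted risk score: sum over variants of (risk allele count * ln(odds ratio)).

    This weighting scheme is an assumption; the source does not give its formula.
    """
    score = 0.0
    for variant, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{variant}: allele count must be 0, 1 or 2, got {count}")
        score += count * math.log(odds_ratios[variant])
    return score
```

For example, `genetic_risk_score({"variant_a": 2, "variant_b": 0}, {"variant_a": 2.0, "variant_b": 1.5})` counts only the two risk alleles at the hypothetical `variant_a`, each weighted by ln 2.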
To identify genetic associations between specific risk genes and bilateral advanced age-related macular degeneration (AMD) in a retrospective, observational case series of 1,003 patients: 173 patients with geographic atrophy in at least 1 eye and 830 patients with choroidal neovascularization in at least 1 eye.
Patients underwent clinical examination and fundus photography. The images were subsequently graded using a modified grading system adapted from the Age-Related Eye Disease Study. Genetic analysis was performed to identify genotypes at 4 AMD-associated variants (ARMS2 A69S, CFH Y402H, C3 R102G, and CFB R32Q) in these patients.
There were no statistically significant relationships between clinical findings and genotypes at CFH, C3, and CFB. The genotype at ARMS2 correlated with bilateral advanced AMD using a variety of comparisons: unilateral geographic atrophy versus bilateral geographic atrophy (P = 0.08), unilateral choroidal neovascularization versus bilateral choroidal neovascularization (P = 9.0 × 10−8), and unilateral late AMD versus bilateral late AMD (P = 5.9 × 10−8).
In this series, in patients with geographic atrophy or choroidal neovascularization in at least 1 eye, the ARMS2 A69S substitution strongly associated with geographic atrophy or choroidal neovascularization in the fellow eye. The ARMS2 A69S substitution may serve as a marker for bilateral advanced AMD.
age-related macular degeneration; ARMS2; choroidal neovascularization; genotypes; geographic atrophy
The clinical course of multiple sclerosis (MS) is highly variable, and research data collection is costly and time consuming. We evaluated natural language processing techniques applied to electronic medical records (EMR) to identify MS patients and the key clinical traits of their disease course.
Materials and methods
We used four algorithms based on ICD-9 codes, text keywords, and medications to identify individuals with MS from a de-identified, research version of the EMR at Vanderbilt University. Using a training dataset of the records of 899 individuals, algorithms were constructed to identify and extract detailed information regarding the clinical course of MS from the text of the medical records, including clinical subtype, presence of oligoclonal bands, year of diagnosis, year and origin of first symptom, Expanded Disability Status Scale (EDSS) scores, timed 25-foot walk scores, and MS medications. Algorithms were evaluated on a test set validated by two independent reviewers.
We identified 5789 individuals with MS. For all clinical traits extracted, precision was at least 87% and specificity was greater than 80%. Recall values for clinical subtype, EDSS scores, and timed 25-foot walk scores were greater than 80%.
Discussion and conclusion
This collection of clinical data represents one of the largest databases of detailed, clinical traits available for research on MS. This work demonstrates that detailed clinical information is recorded in the EMR and can be extracted for research purposes with high reliability.
Multiple sclerosis; electronic health records
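The evaluation metrics reported above (precision, recall, and specificity) can be computed from algorithm output and reviewer labels as in the following sketch. The function name and the boolean encoding of "trait present" are assumptions for illustration and do not reflect the study's actual evaluation code.

```python
from typing import Dict, Sequence


def evaluate_extraction(predicted: Sequence[bool], actual: Sequence[bool]) -> Dict[str, float]:
    """Compare algorithm output with reviewer-validated labels.

    predicted[i] is True if the algorithm flagged record i; actual[i] is the
    gold-standard label. Raises ZeroDivisionError if a denominator is empty.
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    tp = sum(p and a for p, a in zip(predicted, actual))
    fp = sum(p and not a for p, a in zip(predicted, actual))
    fn = sum(not p and a for p, a in zip(predicted, actual))
    tn = sum(not p and not a for p, a in zip(predicted, actual))
    return {
        "precision": tp / (tp + fp),    # flagged records that are truly positive
        "recall": tp / (tp + fn),       # true positives that were flagged
        "specificity": tn / (tn + fp),  # true negatives that were left unflagged
    }
```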
To identify novel late-onset Alzheimer disease (LOAD) risk genes, we have analyzed Amish populations of Ohio and Indiana. We performed genome-wide SNP linkage and association studies on 798 individuals (109 with LOAD). We tested association using the Modified Quasi-Likelihood Score (MQLS) test and also performed two-point and multipoint linkage analyses. We found that LOAD was significantly associated with APOE (P=9.0×10−6) in all our ascertainment regions except for the Adams County, Indiana, community (P=0.55). Genome-wide, the most strongly associated SNP was rs12361953 (P=7.92×10−7). A very strong, genome-wide significant multipoint peak (recessive HLOD=6.14, dominant HLOD=6.05) was detected on 2p12. Three additional loci with multipoint HLOD scores >3 were detected on 3q26, 9q31, and 18p11. Converging the linkage and association results, we found that the most significantly associated SNP under the 2p12 peak was rs2974151 (P=1.29×10−4). This SNP is located in CTNNA2, which encodes catenin alpha 2, a neuronal-specific catenin known to have function in the developing brain. These results identify CTNNA2 as a novel candidate LOAD gene, and implicate three other regions of the genome as novel LOAD loci. These results underscore the utility of using family-based linkage and association analysis in isolated populations to identify novel loci for traits with complex genetic architecture.
GWAS; Linkage; founder population; Amish; Alzheimer
Alzheimer disease (AD) is the most common cause of dementia. As with many complex diseases, the identified variants do not explain the total expected genetic risk that is based on heritability estimates for AD. Isolated founder populations, such as the Amish, are advantageous for genetic studies as they overcome heterogeneity limitations associated with complex population studies. We determined that Amish AD cases harbored a significantly higher burden of the known risk alleles compared to Amish cognitively normal controls, but a significantly lower burden when compared to cases from a dataset of unrelated individuals. Whole-exome sequencing of a selected subset of the overall study population was used as a screening tool to identify variants located in the regions of the genome that are most likely to contribute risk. By then genotyping the top candidate variants from the known AD genes and from linkage regions implicated by previous studies in the full dataset, new associations could be confirmed. The most significant result (p = 0.0012) was for rs73938538, a synonymous variant in LAMA1 within the previously identified linkage peak on chromosome 18. However, this association is specific to the Amish and did not generalize when tested in a dataset of unrelated individuals. These results suggest that additional risk variation in the Amish remains to be identified and likely resides outside of the classical protein coding gene regions.
As APOE locus variants contribute to both risk of late-onset Alzheimer disease and differences in age-at-onset, it is important to know if other established late-onset Alzheimer disease risk loci also affect age-at-onset in cases.
To investigate the effects of known Alzheimer disease risk loci in modifying age-at-onset, and to estimate their cumulative effect on age-at-onset variation, using data from genome-wide association studies in the Alzheimer’s Disease Genetics Consortium (ADGC).
Design, Setting and Participants
The ADGC comprises 14 case-control, prospective, and family-based datasets with data on 9,162 Caucasian participants with Alzheimer’s occurring after age 60 who also had complete age-at-onset information, gathered between 1989 and 2011 at multiple sites by participating studies. Data on genotyped or imputed single nucleotide polymorphisms (SNPs) most significantly associated with risk at ten confirmed LOAD loci were examined in linear modeling of AAO, and individual dataset results were combined using a random effects, inverse variance-weighted meta-analysis approach to determine if they contribute to variation in age-at-onset. Aggregate effects of all risk loci on AAO were examined in a burden analysis using genotype scores weighted by risk effect sizes.
Main Outcomes and Measures
Age at disease onset abstracted from medical records among participants with late-onset Alzheimer disease diagnosed per standard criteria.
Analysis confirmed association of APOE with age-at-onset (rs6857, P=3.30×10−96), with associations in CR1 (rs6701713, P=7.17×10−4), BIN1 (rs7561528, P=4.78×10−4), and PICALM (rs561655, P=2.23×10−3) reaching statistical significance (P<0.005). Risk alleles individually reduced age-at-onset by 3-6 months. Burden analyses demonstrated that APOE contributes to 3.9% of variation in age-at-onset (R2=0.220) over baseline (R2=0.189) whereas the other nine loci together contribute to 1.1% of variation (R2=0.198).
Conclusions and Relevance
We confirmed association of APOE variants with age-at-onset among late-onset Alzheimer disease cases and observed novel associations with age-at-onset in CR1, BIN1, and PICALM. In contrast to earlier hypothetical modeling, we show that the combined effects of Alzheimer disease risk variants on age-at-onset are on the scale of, but do not exceed, the APOE effect. While the aggregate effects of risk loci on age-at-onset may be significant, additional genetic contributions to age-at-onset are individually likely to be small.
Alzheimer Disease; Alzheimer Disease Genetics; Alzheimer’s Disease - Pathophysiology; Genetics of Alzheimer Disease; Aging
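The design above combines per-dataset results with a random-effects, inverse-variance-weighted meta-analysis but does not name the estimator. The sketch below uses the DerSimonian-Laird estimate of between-study variance, which is a common choice and is an assumption here; the function name and inputs (per-dataset effect estimates and standard errors) are likewise illustrative.

```python
import math
from typing import Sequence, Tuple


def random_effects_meta(
    betas: Sequence[float], standard_errors: Sequence[float]
) -> Tuple[float, float, float]:
    """Pool per-dataset effects; returns (pooled_beta, pooled_se, tau_squared).

    Uses the DerSimonian-Laird estimator of between-study variance (assumed).
    """
    if len(betas) != len(standard_errors) or not betas:
        raise ValueError("need one standard error per effect estimate")
    k = len(betas)
    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(w)
    fixed = sum(wi * b for wi, b in zip(w, betas)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (b - fixed) ** 2 for wi, b in zip(w, betas))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add the between-study variance to each variance.
    w_re = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    sum_w_re = sum(w_re)
    pooled = sum(wi * b for wi, b in zip(w_re, betas)) / sum_w_re
    return pooled, math.sqrt(1.0 / sum_w_re), tau2
```

When the datasets agree closely, the between-study variance is estimated as zero and the result reduces to the fixed-effect inverse-variance estimate.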
Age-related macular degeneration (AMD) is the leading cause of irreversible visual loss in developed countries. Its etiology includes genetic and environmental factors. Although VEGFA variants are associated with AMD, the joint action of variants within the VEGF pathway and their interaction with nongenetic factors have not been investigated.
Affymetrix 6.0 chipsets were used to genotype 668,238 single nucleotide polymorphisms (SNPs) in 1207 AMD cases and 686 controls. Environmental exposures were collected by questionnaire. A set-based test was conducted using the χ2 statistic at each SNP derived from Kraft's two degree of freedom (2df) joint test. Pathway- and gene-based test statistics were calculated as the mean of all independent SNP statistics. Phenotype labels were permuted 10,000 times to generate an empirical P value.
While a main effect of the VEGF pathway was not identified, the pathway was associated with neovascular AMD in women when accounting for birth control pill (BCP) use (P = 0.017). Analysis of VEGF's subpathways showed that SNPs in the proliferation subpathway were associated with neovascular AMD (P = 0.029) when accounting for BCP use. Nominally significant genes within this subpathway were also observed. Stratification by BCP use revealed novel significant genetic effects in women who had taken BCPs.
These results illustrate that some AMD genetic risk factors may be revealed only when complex relationships among risk factors are considered. This shows the utility of exploring pathways of previously associated genes to find novel effects. It also demonstrates the importance of incorporating environmental exposures in tests of genetic association at the SNP, gene, or pathway level.
Analysis using a set-based joint test of genetic main effects and environmental interaction found that SNPs in VEGF's proliferation subpathway were associated with neovascular AMD when exogenous estrogen use in women was accounted for.
age-related macular degeneration; case-control study; epidemiology; statistics; candidate genes
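The permutation procedure described in the methods above (a set statistic taken as the mean of per-SNP statistics, with phenotype labels permuted 10,000 times) can be sketched as follows. The per-SNP statistic used here is a deliberately simple stand-in, not Kraft's 2df joint test, and all function names are hypothetical.

```python
import random
from statistics import mean
from typing import Sequence


def snp_statistic(genotypes: Sequence[int], labels: Sequence[int]) -> float:
    """Stand-in per-SNP statistic (NOT Kraft's 2df joint test): squared
    difference in mean allele count between cases (1) and controls (0)."""
    cases = [g for g, y in zip(genotypes, labels) if y == 1]
    controls = [g for g, y in zip(genotypes, labels) if y == 0]
    return (mean(cases) - mean(controls)) ** 2


def set_statistic(snps: Sequence[Sequence[int]], labels: Sequence[int]) -> float:
    """Gene- or pathway-level statistic: mean of the per-SNP statistics."""
    return mean(snp_statistic(snp, labels) for snp in snps)


def empirical_p(
    snps: Sequence[Sequence[int]],
    labels: Sequence[int],
    n_permutations: int = 10_000,
    seed: int = 0,
) -> float:
    """Empirical P value from permuting phenotype labels.

    The +1 in numerator and denominator counts the observed data as one
    permutation, so the estimate is never exactly zero.
    """
    rng = random.Random(seed)
    observed = set_statistic(snps, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if set_statistic(snps, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)
```

Permuting labels rather than genotypes keeps the correlation structure among SNPs intact in every permuted dataset, which is why this style of test is often used for sets of linked markers.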
The electronic MEdical Records and GEnomics (eMERGE) network brings together DNA biobanks linked to electronic health records (EHRs) from multiple institutions. Approximately 51,000 DNA samples from distinct individuals have been genotyped using genome-wide SNP arrays across the nine sites of the network. The eMERGE Coordinating Center and the Genomics Workgroup developed a pipeline to impute and merge genomic data across the different SNP arrays to maximize sample size and power to detect associations with a variety of clinical endpoints. The 1000 Genomes cosmopolitan reference panel was used for imputation. Imputation results were evaluated using the following metrics: accuracy of imputation, allelic R2 (estimated correlation between the imputed and true genotypes), and the relationship between allelic R2 and minor allele frequency. Computation time and memory resources required by two different software packages (BEAGLE and IMPUTE2) were also evaluated. A number of challenges were encountered due to the complexity of using two different imputation software packages, multiple ancestral populations, and many different genotyping platforms. We present lessons learned and describe the pipeline implemented here to impute and merge genomic data sets. The eMERGE imputed dataset will serve as a valuable resource for discovery, leveraging the clinical data that can be mined from the EHR.
imputation; genome-wide association; eMERGE; electronic health records
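The allelic R2 metric is described above as the estimated correlation between imputed and true genotypes. When true genotypes are available (for example, for genotypes masked before imputation), the squared correlation can be computed directly as in this sketch. This is an illustration under that assumption; imputation software typically estimates the quantity from posterior probabilities without access to the truth, and the function name is hypothetical.

```python
from typing import Sequence


def squared_correlation(
    imputed_dosages: Sequence[float], true_genotypes: Sequence[int]
) -> float:
    """Squared Pearson correlation between imputed dosages (0.0 to 2.0) and
    true genotypes (0, 1, 2) at one marker; NaN if either has no variation."""
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 samples")
    mean_x = sum(imputed_dosages) / n
    mean_y = sum(true_genotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in imputed_dosages)
    syy = sum((y - mean_y) ** 2 for y in true_genotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(imputed_dosages, true_genotypes))
    if sxx == 0 or syy == 0:
        return float("nan")  # monomorphic marker: correlation is undefined
    return sxy ** 2 / (sxx * syy)
```

Computing this value per marker and grouping markers by minor allele frequency gives the kind of R2-versus-frequency relationship the abstract refers to.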
Genome Wide Association Studies (GWAS) are a standard approach for large-scale common variation characterization and for identification of single loci predisposing to disease. However, due to issues of moderate sample sizes and particularly multiple testing correction, many variants of smaller effect size are not detected within a single allele analysis framework. Thus, small main effects and potential epistatic effects are not consistently observed in GWAS using standard analytical approaches that consider only single SNP alleles. Here we propose a unique methodology that aggregates variants of interest (for example, genes in a biological pathway) using GWAS results. Multiple testing and type I error concerns are minimized using empirical genomic randomization to estimate significance. Randomization corrects for common pathway-based analysis biases such as SNP coverage and density, linkage disequilibrium, gene size and pathway size. PARIS (Pathway Analysis by Randomization Incorporating Structure) applies this randomization and in doing so directly accounts for linkage disequilibrium effects. PARIS is independent of association analysis method and is thus applicable to GWAS datasets of all study designs. Using the KEGG database as an example, we apply PARIS to the publicly available Autism Genetic Resource Exchange (AGRE) GWA dataset, revealing pathways with a significant enrichment of positive association results.
pathway analysis; genomic randomization; gene set; enrichment
Genome-wide association studies (GWAS) and other genomic technologies have accelerated the discovery of genes and genomic regions contributing to common human ocular disorders with complex inheritance. Age-related macular degeneration (AMD), diabetic retinopathy (DR), glaucoma and myopia account for the majority of visual impairment worldwide. Over 19 genes and/or genomic regions have been associated with AMD. Current investigations are assessing the clinical utility of risk score panels and therapies targeting disease-specific pathways. DR is the leading cause of blindness in the United States and globally is a major cause of vision loss. Genomic investigations have identified molecular pathways associated with DR in animal models which could suggest novel therapeutic targets. Three types of glaucoma, primary-open-angle glaucoma (POAG), angle-closure glaucoma and exfoliation syndrome (XFS) glaucoma, are common age-related conditions. Five genomic regions have been associated with POAG, three with angle-closure glaucoma and one with XFS. Myopia causes substantial ocular morbidity throughout the world. Recent large GWAS have identified >20 associated loci for this condition. In this report, we present a comprehensive overview of the genes and genomic regions contributing to disease susceptibility for these common blinding ocular disorders and discuss the next steps toward translation to effective gene-based screening tests and novel therapies targeting the molecular events contributing to disease.
Alzheimer's disease (AD) and related dementias are a major public health challenge and present a therapeutic imperative for which we need additional insight into molecular pathogenesis. We performed a genome-wide association study and analysis of known genetic risk loci for AD dementia using neuropathologic data from 4,914 brain autopsies. Neuropathologic data were used to define clinico-pathologic AD dementia or controls, assess core neuropathologic features of AD (neuritic plaques, NPs; neurofibrillary tangles, NFTs), and evaluate commonly co-morbid neuropathologic changes: cerebral amyloid angiopathy (CAA), Lewy body disease (LBD), hippocampal sclerosis of the elderly (HS), and vascular brain injury (VBI). Genome-wide significance was observed for clinico-pathologic AD dementia, NPs, NFTs, CAA, and LBD with a number of variants in and around the apolipoprotein E gene (APOE). GalNAc transferase 7 (GALNT7), ATP-Binding Cassette, Sub-Family G (WHITE), Member 1 (ABCG1), and an intergenic region on chromosome 9 were associated with NP score; and Potassium Large Conductance Calcium-Activated Channel, Subfamily M, Beta Member 2 (KCNMB2) was strongly associated with HS. Twelve of the 21 non-APOE genetic risk loci for clinically-defined AD dementia were confirmed in our clinico-pathologic sample: CR1, BIN1, CLU, MS4A6A, PICALM, ABCA7, CD33, PTK2B, SORL1, MEF2C, ZCWPW1, and CASS4 with 9 of these 12 loci showing larger odds ratio in the clinico-pathologic sample. Correlation of effect sizes for risk of AD dementia with effect size for NFTs or NPs showed positive correlation, while those for risk of VBI showed a moderate negative correlation. The other co-morbid neuropathologic features showed only nominal association with the known AD loci. Our results discovered new genetic associations with specific neuropathologic features and aligned known genetic risk for AD dementia with specific neuropathologic changes in the largest brain autopsy study of AD and related dementias.
Alzheimer's disease (AD) and related dementias are a major public health challenge and present a therapeutic imperative for which we need additional insight into molecular pathogenesis. We performed a genome-wide association study (GWAS), as well as an analysis of known genetic risk loci for AD dementia, using data from 4,914 brain autopsies. Genome-wide significance was observed for 7 genes and pathologic features of AD and related diseases. Twelve of the 22 genetic risk loci for clinically-defined AD dementia were confirmed in our pathologic sample. Correlation of effect sizes for risk of AD dementia with effect size for hallmark pathologic features of AD were strongly positive and linear. Our study discovered new genetic associations with specific pathologic features and aligned known genetic risk for AD dementia with specific pathologic changes in a large brain autopsy study of AD and related dementias.
Primary open-angle glaucoma (POAG) is a common disease with complex inheritance. The identification of genes predisposing to POAG is an important step toward the development of novel gene-based methods of diagnosis and treatment. Genome-wide association studies (GWAS) have successfully identified genes contributing to complex traits such as POAG; however, such studies frequently require very large sample sizes, and thus collaborations and consortia have been of critical importance for the GWAS approach. In this report we describe the formation of the NEIGHBOR consortium, the harmonized case-control definitions used for a POAG GWAS, the clinical features of the cases and controls, and the rationale for the GWAS study design.
In the decade that has passed since the initial release of the Human Genome, numerous advancements in science and technology within and beyond genetics and genomics have been encouraged and enhanced by the availability of this vast and remarkable data resource. Progress in understanding three common, complex diseases, age-related macular degeneration (AMD), Alzheimer’s disease (AD), and multiple sclerosis (MS), exemplifies the impact of this resource on the elucidation of the genetic architecture of disease. The approaches used in these diseases have been successfully applied to numerous other complex diseases. For example, the heritability of AMD was confirmed upon the release of the first genome-wide association study (GWAS) along with confirmatory reports that supported the findings of that state-of-the-art method, thus setting the foundation for future GWAS in other heritable diseases. Following this seminal discovery and the application of GWAS to other diseases including AD and MS, the genetic knowledge of AD expanded far beyond the well-known APOE locus and now includes more than 20 loci. MS genetics saw a similar increase beyond the HLA loci and now has more than 100 known risk loci. Ongoing and future efforts will seek to define the remaining heritability of these diseases; the next decade could very well hold the key to attaining this goal.
human genome project; age-related macular degeneration; Alzheimer’s disease; multiple sclerosis; genetics; genomics; genome-wide association study
We set out to determine whether expansions in the C9ORF72 repeat found in Amyotrophic Lateral Sclerosis (ALS) and Frontotemporal Dementia (FTD) families are associated with Parkinson Disease (PD). We determined the repeat size in a total of 889 clinically ascertained patients (including PD and Essential Tremor plus Parkinsonism (ETP)) and 1144 controls using a repeat-primed PCR assay. We found that large C9ORF72 repeat expansions (>30 repeats) were not contributing to PD risk. However, PD and ETP cases had a significant increase in intermediate (>20 to 30+) repeat copies compared to controls. Overall, 14 cases (13 PD, 1 ETP) and 3 controls had >20 repeat copies (Fisher’s exact test p=0.002). Further, seven cases and no controls had >23 repeat copies (p=0.003). Our results suggest that intermediate copy numbers of the C9ORF72 repeat contribute to risk for PD and ETP. This also suggests that PD, ALS and FTD share some pathophysiologic mechanisms of disease. Further studies are needed to elucidate the contribution of the C9ORF72 repeat in the overall PD population and if other common genetic risk factors exist between these neurodegenerative disorders.
Parkinson Disease; C9ORF72 repeat; association; risk factor
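The case-control comparison above relies on Fisher's exact test for a 2×2 table. The sketch below implements a two-sided version from the hypergeometric distribution using only the standard library; the function name is hypothetical, and the two-sided rule (summing all tables no more probable than the observed one) is one common convention, which may differ from the software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows could be cases/controls and columns carrier/non-carrier. Sums the
    hypergeometric probabilities of every table with the same margins that is
    no more probable than the observed table.
    """
    row1 = a + b
    col1 = a + c
    n = a + b + c + d
    denominator = comb(n, row1)

    def probability(x: int) -> float:
        # P(top-left cell == x) with all margins held fixed.
        return comb(col1, x) * comb(n - col1, row1 - x) / denominator

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    observed = probability(a)
    tolerance = 1 + 1e-7  # guards against floating-point ties
    total = sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * tolerance
    )
    return min(1.0, total)
```

Using the counts given in the abstract, the first comparison corresponds to `fisher_exact_two_sided(14, 875, 3, 1141)` (14 of 889 cases and 3 of 1144 controls with more than 20 repeat copies), which can be compared with the reported p = 0.002.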
Successful aging (SA) is a multidimensional phenotype involving living to older age with high physical function, preserved cognition, and continued social engagement. Several domains underlying SA are heritable, and identifying health-promoting polymorphisms and their interactions with the environment could provide important information regarding the health of older adults. In the present study, we examined 263 cognitively intact Amish individuals age 80 and older (74 SA and 189 “normally aged”) all of whom are part of a single 13-generation pedigree. A genome-wide association study of 630,309 autosomal single nucleotide polymorphisms (SNPs) was performed and analyzed for linkage using multipoint analyses and for association using the modified quasi-likelihood score test. There was evidence for linkage on 6q25-27 near the fragile site FRA6E region with a dominant model maximum multipoint heterogeneity LOD score = 3.2. The 1-LOD-down support interval for this linkage contained one SNP for which there was regionally significant evidence of association (rs205990, p = 2.36 × 10−5). This marker survived interval-wide Bonferroni correction for multiple testing and was located between the genes QKI and PDE10A. Other areas of chromosome 6q25-q27 (including the FRA6E region) contained several SNPs associated with SA (minimum p = 2.89 × 10−6). These findings suggest potentially novel genes in the 6q25-q27 region linked and associated with SA in the Amish; however, these findings should be verified in an independent replication cohort.
Electronic supplementary material
The online version of this article (doi:10.1007/s11357-012-9447-1) contains supplementary material, which is available to authorized users.
Genome-wide association; Longevity; Genetic epidemiology; Family-based study
Primary open angle glaucoma (POAG) is a genetically and phenotypically complex disease that is a leading cause of blindness worldwide. Previously we completed a genome-wide scan for early-onset POAG that identified a locus on 9q22 (GLC1J). To identify potential causative variants underlying GLC1J, we used targeted DNA capture followed by high throughput sequencing of individuals from four GLC1J pedigrees, followed by Sanger sequencing to screen candidate variants in additional pedigrees. A mutation likely to cause early-onset glaucoma was not identified; however, COL15A1 variants were found in the youngest affected members of 7 of 15 pedigrees with variable disease onset. In addition, the most common COL15A1 variant, R163H, influenced the age of onset in adult POAG cases. RNA in situ hybridization of mouse eyes shows that Col15a1 is expressed in multiple ocular structures, including the ciliary body, astrocytes of the optic nerve, and cells in the ganglion cell layer. Sanger sequencing of COL18A1, a related multiplexin collagen, identified a rare variant, A1381T, in members of three additional pedigrees with early-onset disease. These results suggest genetic variation in COL15A1 and COL18A1 can modify the age of onset of both early and late onset POAG.
Although autism is one of the most heritable neuropsychiatric disorders, its underlying genetic architecture has largely eluded description. To comprehensively examine the hypothesis that common variation is important in autism, we performed a genome-wide association study (GWAS) using a discovery dataset of 438 autistic Caucasian families and the Illumina Human 1M beadchip. Ninety-six single nucleotide polymorphisms (SNPs) demonstrated strong association with autism risk (p-value < 0.0001). The validation of the top 96 SNPs was performed using an independent dataset of 487 Caucasian autism families genotyped on the 550K Illumina BeadChip. A novel region on chromosome 5p14.1 showed significance in both the discovery and validation datasets. Joint analysis of all SNPs in this region identified 8 SNPs with lower p-values (3.24E-04 to 3.40E-06) than in either dataset alone. Our findings demonstrate that in addition to multiple rare variations, part of the complex genetic architecture of autism involves common variation.
Multiple sclerosis is a debilitating neuroimmunological and neurodegenerative disease affecting more than 400,000 individuals in the United States. Population and family-based studies have suggested that there is a strong genetic component. Numerous genomic linkage screens have identified regions of interest for MS loci. Our own second-generation genome-wide linkage study identified a handful of non-MHC regions with suggestive linkage. Several of these regions were further examined using single-nucleotide polymorphisms (SNPs) with average spacing between SNPs of approximately 1.0 Mb in a dataset of 173 multiplex families. The results of that study provided further evidence for the involvement of the chromosome 1q43 region. This region is of particular interest given linkage evidence in studies of other autoimmune and inflammatory diseases including rheumatoid arthritis and systemic lupus erythematosus. In this follow-up study, we saturated the region with ~700 SNPs (average spacing of 10kb per SNP) in search of disease associated variation within this region. We found preliminary evidence to suggest that common variation within the RGS7 locus may be involved in disease susceptibility.
multiple sclerosis; linkage; association; 1q43; RGS7
The Alzheimer amyloid protein precursor (APP) is subject to proteolysis by ADAM10 and ADAM17, precluding the formation of Aβ. Recently, coding variations in ADAM10 resulting in altered function have been reported in familial Alzheimer disease (AD). We carried out a large-scale (n=576: Controls, 271; AD, 305) resequencing study of ADAM10 in sporadic AD. Our results do not support a significant role for ADAM10 mutations in AD. Our results also make it clear that the careful examination of ancestry required in any case-control comparison is especially important for rare variations, where even a very small number of variations might form the basis of scientific conclusions.
Mutation; rare variation; genetics; association
A broad region of chromosome 10 (chr10) has engendered continued interest in the etiology of late-onset Alzheimer Disease (LOAD) from both linkage and candidate gene studies. However, there is a very extensive heterogeneity on chr10. We converged linkage analysis and gene expression data using the concept of genomic convergence that suggests that genes showing positive results across multiple different data types are more likely to be involved in AD. We identified and examined 28 genes on chr10 for association with AD in a Caucasian case-control dataset of 506 cases and 558 controls with substantial clinical information. The cases were all LOAD (minimum age at onset ≥ 60 years). Both single marker and haplotypic associations were tested in the overall dataset and 8 subsets defined by age, gender, ApoE and clinical status. PTPLA showed allelic, genotypic and haplotypic association in the overall dataset. SORCS1 was significant in the overall data sets (p=0.0025) and most significant in the female subset (allelic association p=0.00002, a 3-locus haplotype had p=0.0005). Odds Ratio of SORCS1 in the female subset was 1.7 (p<0.0001). SORCS1 is an interesting candidate gene involved in the Aβ pathway. Therefore, genetic variations in PTPLA and SORCS1 may be associated and have modest effect to the risk of AD by affecting Aβ pathway. The replication of the effect of these genes in different study populations and search for susceptible variants and functional studies of these genes are necessary to get a better understanding of the roles of the genes in Alzheimer disease.
Alzheimer disease; late-onset Alzheimer disease; LOAD; genomic convergence; association; candidate genes; PTPLA; SORCS1
Variations in a locus at chromosome 10q26 are strongly associated with the risk of age-related macular degeneration (AMD). The most significantly associated haplotype includes a nonsynonymous SNP, rs10490924, in exon 1 of ARMS2 and rs11200638 in the promoter region of HTRA1. It is under debate which gene, ARMS2, HTRA1 or some other gene, is functionally responsible for the genetic association. To verify whether the associated variants correlate with a higher HTRA1 expression level as previously reported, HTRA1 mRNA and protein were measured in a larger set of human retina-RPE-choroid samples (n = 82). The results show no significant difference in HTRA1 mRNA level among genotypes at rs11200638, rs10490924 or an indel variant of ARMS2. Furthermore, two AMD-associated synonymous SNPs in HTRA1 exon 1, rs1049331 and rs2293870, do not change its protein level either. These results suggest that the AMD-associated variants in the chromosome 10q26 locus do not significantly affect the expression of HTRA1.
MAPT encodes tau, the predominant component of neurofibrillary tangles that are neuropathological hallmarks of Alzheimer’s disease (AD). Genetic association of MAPT variants with late-onset AD (LOAD) risk has been inconsistent, although insufficient power and incomplete assessment of MAPT haplotypes may account for this.
We examined the association of MAPT haplotypes with LOAD risk in more than 20,000 subjects (n-cases = 9,814, n-controls = 11,550) from Mayo Clinic (n-cases = 2,052, n-controls = 3,406) and the Alzheimer’s Disease Genetics Consortium (ADGC, n-cases = 7,762, n-controls = 8,144). We also assessed associations with brain MAPT gene expression levels measured in the cerebellum (n = 197) and temporal cortex (n = 202) of LOAD subjects. Six single nucleotide polymorphisms (SNPs) which tag MAPT haplotypes with frequencies greater than 1% were evaluated.
The H2-haplotype-tagging rs8070723-G allele was associated with reduced risk of LOAD (odds ratio, OR = 0.90, 95% confidence interval, CI = 0.85-0.95, p = 5.2E-05), with consistent results in the Mayo (OR = 0.81, p = 7.0E-04) and ADGC (OR = 0.89, p = 1.26E-04) cohorts. The rs3785883-A allele was also nominally significantly associated with LOAD risk (OR = 1.06, 95% CI = 1.01-1.13, p = 0.034). Haplotype analysis revealed significant global association with LOAD risk in the combined cohort (p = 0.033), with significant association of the H2 haplotype with reduced risk of LOAD as expected (p = 1.53E-04) and suggestive association with additional haplotypes. MAPT SNPs and haplotypes were also associated with brain MAPT levels in the cerebellum and temporal cortex of AD subjects, with the strongest associations observed for the H2 haplotype and reduced brain MAPT levels (β = -0.16 to -0.20, p = 1.0E-03 to 3.0E-03).
These results confirm the previously reported MAPT H2 associations with LOAD risk in two large series, show that this haplotype has the strongest effect on brain MAPT expression amongst those tested, and identify additional haplotypes with suggestive associations, which require replication in independent series. These biologically congruent results provide compelling evidence to screen the MAPT region for regulatory variants that confer LOAD risk by influencing its brain gene expression.
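The odds ratios and 95% confidence intervals reported above can be related to a test statistic with a short calculation. The sketch below assumes a Wald-type interval that is symmetric on the log-odds scale, which the abstract does not state; the function name `wald_from_ci` is hypothetical. Because the published OR and CI are rounded to two decimals, the recovered p-values are only rough approximations and are not expected to match the reported ones exactly.

```python
import math

Z_95 = 1.959964  # two-sided 95% critical value of the standard normal


def wald_from_ci(odds_ratio, ci_low, ci_high):
    """Recover the log-scale standard error, Wald z and two-sided p-value.

    Assumes the interval is exp(log(OR) +/- Z_95 * SE), i.e. a Wald
    interval on the log-odds scale (an assumption, not stated in the text).
    """
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    z = math.log(odds_ratio) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p


if __name__ == "__main__":
    # Rounded values taken from the abstract.
    for label, (or_, lo, hi) in {
        "rs8070723-G": (0.90, 0.85, 0.95),
        "rs3785883-A": (1.06, 1.01, 1.13),
    }.items():
        se, z, p = wald_from_ci(or_, lo, hi)
        print(f"{label}: SE(log OR)={se:.4f}, z={z:.2f}, approx p={p:.2g}")
```

An OR below 1 yields a negative z (a protective allele), and an OR above 1 a positive z (a risk allele), which matches the direction of effect described for the two SNPs.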
Genome-wide association studies (GWASs) have proven highly effective, identifying hundreds of associations across numerous complex diseases. These studies typically test hundreds of thousands of variations and identify hundreds of potential associations. However, to date, follow-up attempts have generally concentrated on only the few most significant initial associations, leaving the majority of true associations in any GWAS without replication. Here, we present a substantially more comprehensive follow-up of the first genome-wide association screen performed in multiple sclerosis (MS), a complex genetic disease with central nervous system inflammation. We genotyped approximately 30 000 single-nucleotide polymorphisms (SNPs) that demonstrated mild-to-moderate levels of significance (P ≤ 0.10) in the initial GWAS in an independent set of 1343 MS cases and 1379 controls. We further replicated several of the most significant findings in another independent data set of 2164 MS cases and 2016 controls. We find considerable evidence for a number of novel susceptibility loci including KIF21B [rs12122721, combined P = 6.56 × 10^−10, odds ratio (OR) = 1.22] and TMEM39A (rs1132200, P = 3.09 × 10^−8, OR = 1.24), both of which meet genome-wide significance. Both of these loci were overlooked in the initial replication, despite being among the top 3000 (∼1%) SNP hits in the original screen.
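As a small illustration of what meeting genome-wide significance means here, the sketch below compares the two reported P values against the conventional threshold of 5 × 10^−8, which corresponds to a Bonferroni correction of 0.05 for roughly one million independent tests. The abstract does not say which threshold the authors applied, so this value is an assumption, and the helper names are hypothetical.

```python
import math

# Conventional genome-wide threshold (assumed; not stated in the abstract).
GENOME_WIDE_ALPHA = 5e-8


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level that keeps the family-wise error at alpha."""
    return alpha / n_tests


def is_genome_wide_significant(p_value, threshold=GENOME_WIDE_ALPHA):
    """True when the P value falls below the chosen threshold."""
    return p_value < threshold


# P values as reported in the abstract.
reported = {
    "KIF21B rs12122721": 6.56e-10,
    "TMEM39A rs1132200": 3.09e-8,
}

if __name__ == "__main__":
    # 0.05 spread over one million tests gives the conventional 5e-8.
    assert math.isclose(bonferroni_threshold(0.05, 1_000_000), GENOME_WIDE_ALPHA)
    for locus, p in reported.items():
        verdict = "meets" if is_genome_wide_significant(p) else "does not meet"
        print(f"{locus}: P={p:.2e} {verdict} the {GENOME_WIDE_ALPHA:.0e} threshold")
```

Both loci pass under this assumed threshold, consistent with the statement in the abstract. A stricter or looser threshold can be tried by passing a different `threshold` argument.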